Playlist-MCP

Apache 2.0

Overview InspectNew Endpoints Schema Related Servers Reviews Score

playlist-mcp

transcripts.db•8.45 MB

SQLite format 3@ ��.~Y �&&��##�wtabletranscriptstranscriptsCREATE TABLE transcripts (video_id TEXT PRIMARY KEY, transcript TEXT, created_at TIMESTAMP)5I#indexsqlite_autoindex_transcripts_1transcripts M��~ytoje`[VQLGB=83.)$��zupkfa\WRMHC=71+% � � � � � � � � � � � � � � � � � � � � � } w q k e _ Y S M G A ; 5 / ) # ��{uoic]WQKE?93-'! ��ysmga[UOIC=71+% � � � � � � � � � � � � � � � � � � � � � } w q k e _ Y S M�d#�Aq3uBctLa_Sg[Music]what a wild ride this week has beenCubeCon 2025 is awesome because of thecommunity the swags andtech It really encourages people intheir works It really allows me to youknow see what's the latest things whichare happening in the world of cloudnative and AI The talks the sessions areincredible I really like the lineupPeople are coming together to grow thefutureI would describe CubeCon Cloud NativeCon London as knowledge opportunitiesconnection and friendsWe hope to see you at an upcomingCubeCon Cloud Native Con in 2025 And asalways mind the crash loop[Music]2025-04-15 21:57:23.708013�#�mADGE1P5ynmsQ[Music]cubecon Cloud Native Con Europe 2025 day2 kicked off exploring Kubernetesinnovation end user stories and so much[Music]more this is my first Cu��؃ă��~��}��|��{��z��y��wp�vi�s^�r]�pX�oR�nA�m5�l4�k-�j,�h&�g%�f�e�d�b�]�[�Y��X�V�U�SۂR΂QƂO��N��M��L��K��J��I��H~�Do�Cj�Ah�?g�>f�=e�<`�:^�6Z�5W�1V�0U�/S�.Q�,O�*N�)M�(G�'E�$D�#C�"@�!?�>�<�8�6�5�4�-�&� �� ւρ��~��{��y��x��v��u��tu�rk�pd�n_�mT�l@�j2�g+�e�d�b�`��^�[�Z��YفXҁẂV��U��S��R��P��O��N��M}�Lw�Iq�HW�GK�CC�A>�@.�?(�=�<�:�9�8��6��5�4ځ3ρ0Ɂ/��.��-��,��+��)��(��'��%t�$j�!^�L�C�<�1�'�!�� ہ Ձ΁Ł��~�|{zqygw^uMs;p6n0m#ljgf�d�b�a�`�_�]�\�[�Z�X�W�V�U�T�S�R�Q�P}OwNoMiLaKTJMHEG7F,DB@>=�;�:�9�6�5�3�2�1�0�/�.�-�,�*�(x'k%c#V"O GB:430/-)'&"! �� ' Q� 1�x� �$#Qj ��!� a ��7�} q��aVQ�� q5� F� ��1( ��g �l ��!� ��JA1� q|!�q&�� k A[Q �q��4 �� !�A�Q� �A��{�� 8�� A!A��X AGZ� �� a6��Wq�E� �� q!�� !z a��1H��a� � 1�Qha� 1� Qy� � q� ��9 �a��a�IA� Q�1�Y�%� !�i#-3NyXaVPGvo�#1rgZPi2dTvE�#Kcjh0-hXwWw�#-fGztPUuD8k�#07RnkzSc6Jg�#Du0mPGFd7Fc�#q44WBAGzKhk�#I9GV4N23dvE�#NCkHrvqFMl8�#Fb_3dWJdY9I�#7IA-Vw1K7eg�#G8U141NkrDI�#RwcC44BWDvA�#w1wh9dc6m34�#kYT7KV_Cijs�#joOTwCatd9g�#HEhnch8Wpj8�#Cydz0hadVuQ�#lEXm6k2wpG4�#IvIgsHS5MDk�#0GNjonLfCQA�#WeWQqQM6kjM�#sLFmnCyZ89M�#_3fpZA-DqDU�#M56SHzETAmM�#JXcQcofGzrA�#nclJn1KEjis�#KK0FKiQ7nis�#7-JtDLNT0c8�#7sr1eHJBXKs�#anqWhSnN7sA�#zTLbnstVjHc�#Eb9AweCazi8�#bpsclYlGl2s�#KQBz7nwWxUE�#KS_rGWazTio�#6usWUdJMyHY�#jikiO3CC7Zw�#xGywrHPAMms�#317rLOIKfDQ�#W3f5Ks0j2Q8�#0qNOZpdW870�#y21i3lG2jUM�#Rw4c7lmdyFs�#dDkXFuy45EA�#KtW4HkonQHU�#HrO5KVMQfHs�#QhTlZs4m59w�#TELnK0PrKHU�#rAIcQvKBuZA�#y0JgZ-hQ-Bo�#ahANKkTT-yo�#Zr7y27HpII4�#Zfp94fOMcwE�#zZ7bDPZMCqY�#gycxQT3DHIU�#iCAFXF5ECto�#ufY_JFPpzRI�#6l5zCt5QsdY�#SdLLOcNZN5E�#CK7Il4ZiqTA�#OnqzoBf7dUE�#6hWoA4jEk5M�#4pBhVVrCHyM�#7KCBigZi_Rk�#26p_qvuCy-s�#CvGbwn5ZrFg�#D21yF0E-v2s�#c0dEL_bBRVU�#lQEYxCXVkVU�#oCbJdcy3zzA�#40OmDwTgl1A�#OTzd9eTtLRA�#W5C0O7vk78o�#Tz8IcMSY7jw�#hjbZOBghxYU�#_pgOuaYwvBQ�#Dc6S4vU9GiM�#vrG5tBDsdd0�#9U3WMez9q74�#2_ECK6v_yXc�#Q15XbASxHM0�#dNb1m84Bp4c�#9lPp-6nJ8bI�#Es3DBj2UgIE�#1iWD14xvBQA�#rVz-vIFGT4k�#x8wEo6ZDT1g�#Z_15EyXOnhU�#_xoDbpm-Qks�#rdTPbm9f_fc�#gGP9QdlNr9Y�#aWxuaEFSarU�#poBOYc_EkpA�#Ea5OuNjpi9M#ZManfhV6DZU~#-SFVDr3wQ_w}#VCmp--NcxeE|#lIYXVIPsk_U{#rbVV8WIJYwwz#Cn8xvysLWVgy#SqKqB-q_m8Ex#Zbi46yTlSVow#XtA-NKoJDaIv#m8ZnlZTo1OEu#AHY4IDlBhzEt#1c2va5nATmQs#UVPe-rdxK7wr#bFKls7IvzNEq#YvXCcSjXKEQp#CzdX5qDgQ2Uo#gWgagjHtnlEn#kyLdmGYZ6BQm#Mbk6FY_9FKMl#0p-sZT0LWOgk#aYGGnDDGX-Qj#7U6nAxUxG6ci#L13y_-zLin4h#RdT6P5x_fDMg#WaDSASWA2z4f#3KLsfEyNKrYe#p52nxvo6hXkd#HV3Nb_wUro4c#FC5TAGsBbRQb#fnt3f8sWJLAa#Fnb1a5Kaxgo`#fznzH-gf9h8_#T-nN86wTebM^#mewXGSwDCE4]#ZmcZlDCYDgE\#nvKpg3JgSjs[#h1AyaAIf3HAZ#pPKuJg_6A3kY#_oIoaW5i-xEX#emjrmJZR-ZIW#qH5djJlbodYV#pvTRjsSXMi0U#1rtyQaTfbdMT#VAWw5CujiR8S#DVFQ20OrEFkR#OAb54JRIS6MQ#bIxw1uK0QRQP#QbR908kgk1YO#3oWODC2mdk0N#x5qguW0SF_IM#I9t7qfOjgboL#2r92tTuFYg8K#lj_qgsb4h38J#OJ1WoQjYAJoI#jtGSzIvw9jIH#GgxRHpQIEfgG#V7HbBO_umOUF#VU_vj2r3BgIE#FqUPqroF-RwD#dVM20108SRcC#AYxjk8ZZcloB#eO8szEGNwooA#nXdGXdxmWNQ@#JFS0lSfHtMI?#g8rtqqNTL9Q>#DmfZq70WOxI=#ZOG1J1Niuh0<#NnYtnUeJi7U;#8Q8sFzODEUo:#KPNuLwXNkNQ9#gMDC1zzHabk8#XelZnqurT2s7#xwGDxqI_3Nk6#WuMyfaF0UeM5#bC4xbBJs0CA4#3aUg2qxfoZU3#oLZ2EjjKibw2#KlsxQMfdKLw1#MHfDvUUJ14I0#_rDE1PD5Z5I/#yCyezOTVU_Y.#1FwE0ajODU8-#ZIk_EqI8rVA,#urRefZ0KnU4+#4Pei4LMigQE*#J93U9n_qxSI)#BkQRGsVBhkc(#EBbuyn72jtw'#d_9JNRkT7dg&#LEiFzJnqU-E%#Q40yLLLIW9Q$#-yOKr2DOJ_o##6MrXcbcxnN4"#ksKOPx99rIE!#Ae2-LNHtUr8 #mynQyP2_17E#GyvARSG3_ws#jUChVGvSB5g#UI-b-Odg39A#A1HGYh0Wz9U#qj9q_-S91L8#g7KIuv7KipE#BBqDpqATcI0#IoEe05sPqhk#DWq8UWmcRQg#_47X1eKkiEs#-k1CdrRAGMM#rACTrbTnFqY#W_EF1HnP4tU#lFaSEevdZvU#K3edF36HWYU#7GQRyAxPa9g#JqG1wey7-Ao#IGK7TZPuma4 #tSBfDzStoYE#UfYctUtDDfQ#kQ4X6-mPHqw #d2szUE0jhX4 #j0AqGpC_pp4#VLVSa6xD5tk#R0255efML-I#85MDID9Ju04#q3uBctLa_Sg#DGE1P5ynmsQ#pWBbX6p�#UI-b-Odg39Aies and and meetups andorganizations worldwide so if you'reinterested in hosting uh a C we'recalling it C10CF uh party uh and and anda you know event within your localcommunity please go here and uh requestand we're going to go support uh ally'all over the world to go celebratelater when we actually get to ourofficial 10-year milestone so veryexcited to kind of uh celebrate witheveryone this year on all the amazingstuff we have done uh this this thisdecade so that's a little bit about thepast and the history so if we look atcloud native today CNCF um you knowthere are over 275,000 contributors toover 200 projects coming from 190 uhdifferent countries out there and we aretruly a global uh community um I hadsome friends at GitHub that wereactually doing some interesting uhanalysis uh and you know coming up somegraphics where they were like well youknow let's let's you know heat mapseveryone loves a heat map and so likehere uh you're like here's a heat map ofall the maintainers you know uh acrossuh CNCF and kind of you know where theycome and you know uh and you know youkind of look at this map and you're likeoh wow this is like truly you know allyou know all over the world is kind ofrepresented uh here in terms ofmaintainers you know for the communityobviously ly contributors are aresignificantly wider but uh you know ifyou zoom in we are in the UK so I askedthem to kind of give me a little bit ofa zoom in thing and you kind of see somerepresentation here but you know themaintainers in CNCF there's about 1500of them truly drive uh the community umhere and yesterday we hosted our firstuh maintainer summit uh that broughttogether all of our different projectmaintainers to kind of share lessons uhyou know uh talk about how they solveproblems in their respective projectscommunity and so I just kind of want toyou know thank that literally the 1500you know plus maintainers really drivethis whole ecosystem that everyonedepends on so maybe a thank you to allthe maintainers and uh congrats for yourfirst maintainer summit umyesterday so our first uh announcementis uh we have the latest uh state uh uhsorry cloud native uh annual survey thatwe put together is uh put out and uhdownloaded so you kind of go take a lookat it we have some updated informationuh on project usage number ofcloudnative developers out there and soon so please check out um you know thethe latest uh annual survey that we puttogether so let's do the usual so wehave some new members in CNCF i'm veryexcited to announce that uh Ericson HaraProxy and Morrenus have joined us uhgold members so thank you for yoursupport you're very much appreciated uhto support this uh lovely organizationwe also have a lot of other uh newmembers uh silver members uh academicsnonprofits universities and so on thatjoin so uh welcome to team cloud nativeand glad to have you supporting the theCNCF so another uh new announcement uhthat we're doing so last year on stage Iwas in Paris I announced our new kind ofcloud native education ambassadorprogram to basically help improve thestate of cloud native educationcertifications and help other peoplelearn more about different CNCF projectsso this year we are uh announcing uhkind of an update to this program at anew level called the golden cubstronautuh program and this is essentially folksthat have passed not only the kubernetescertifications but the other CNCFprojects like ISTTO backstage uh Argoand and so on so we have a couple dozenof initial golden cubstronauts in stagesomewhere i don't know where where ally'all are but uh if yeah there you areso uh please stand and uh we'll spot youand thank you uh kudos uh those are uhit's a lot of certifications that youhave to do and uh I'm very happy to tohave you um here so thank you like likeyourhats uh education's important cncf is adiverse uh diverse ecosystem we got alot of projects it's not always easy umthe other thing we announced uh latelast um you know year was uh we were uhyou know expanding into kind of new newareas so if you looked at that kind ofmaintainer map uh Africa was a littlebit light on it and you know what betterway to uh expand uh reach is uh bycreating kind of a new program to servethat market so we partnered with anorganization called Andela to try totrain over 20,000 um you know uh uhfolks in Africa on cloudnative uhtechnology uh particularly starting withthe KCNA and early Kubernetes uhtraining when we launched this thingwe've had 6,400 people apply uh from 46different uh countries uh in Africa soit's been awesome to see this going andI hope next year uh I could come up withan update to show more maintainers andcontributors in that part of the worldbut we're officially kicking things offin uh May 5th 2025 uh as uh you know tokind of really start this program so I'msuper excited that it's finally movinguh alonghere the other thing we've beeninvesting in is uh you're probably nostranger to platform engineering it'skind of a new popular uh topic in thecloudnative world a lot of CNCF projectspower platform engineering teams andproducts out there and one thing that wenoticed was uh a lot of the educationand content out there is very maybeskewed maybe it's single vendor or notvery open source friendly so we wenttogether to work with our community andcreated two new certifications heretoday we are launching that the uh cloudnative platform engineering associate uhexam is available to register and takeso it is available if you go to that QRcode and obviously when you attendCubeCon all y'all have you know uhdiscounts uh and and so there so pleasecheck it out platform engineering isdefinitely here tostay the other thing that I've noticedin kind of my tenure journey uh in inCNCF is we talk to a lot of companiesthat are adopting cloudnative softwareusing it and a lot of times some ofthese organizations are new to opensource right they're hey like we'retrying to set up our policies we'retrying to figure out how to you knowcontribute back so we've partnered withum uh you know the Linux foundation andthe open source initiative to create anew uh training and certification calledcode or certified open source developerit's really meant for enterp enterprisesto try to figure out how to navigateopen source usage some of the crazy uhupcoming regulations that are coming howto deal with licensing issues so this isnow available for everyone to takeadvantage of especially if you're a uh alarge enterprise out there so awesome tokind of see this finally done becauseit's it's it's definitely sorely neededuh out there so um my next announcementis uh you know I I've I've known this uhgentleman for for quite a while andthey've been a very early supporter ofof CNCF and I'm very excited about akind of a new initiative that we've beenuh working on together that very much isrelevant to the European uh economy andand region so uh I'll let him explain ituh a little bit more but let me uhintroduce uh Vasu from SAP on the stageto talk about a little new uh funproject that we're working on so Vasucome on come on[Music][Applause][Music]out thank you Chris hello CubeCon thanksfor having me now while I work for SEP alongtime platinum sponsor and supporterof course of the CNCF I'm actually hereto represent a broad initiative acrossEurope i'm actually excited and I'mtruly am to announce the launch of NeoNephos a new open-source initiativeum yeah that is hosted now in the LinuxFoundation here in Europe so why did westart Neo its mission is to build asovereign multi-provider cloud edgecontinuum for Europe so I probably needto explain this continuum a bit as youcan see here on the map several EUmember states have mobilized a whooping3.5 billion euros of investments with apolicy instrument called the importantproject of common European interest orin short the IPSI now neonos is anon-bureaucratic outcome of thatfounding as founding members here thecompanies here you see of neonos and notall of them are receiving funds by theway we recognize the importance of uhcollaboration in open source yeah forthis mission yeah and it's an urgent andan immediate uh mission that we allsubscribe to now we also see neonafos asour open sourceinfrastructure for all the other EUinitiatives that are currently in motionas we speak now with this we canactually put public and private moneyinto common use and that's dear to myheart into open-source use now but whyam I here to announce this foundation atthe CubeCon well you are all here alsobecause of that because if you look intoall our the digital machine rooms of allour companies we find cloud nativeeverywhere kubernetes open telemetrygithops and all many many more yeah sothe cloudnative ecosystem has createdthe de facto standard around portabilityand versatility and it is clearly thekey to solving the interoperabilitychallenges that we face not only forEurope but for everyone everywhere elseso technically what do we actually dowith these CNC pro CNCF projects now ofcourse we're leveraging all theseprojects to jointly build a referencethat serves this sovereign cloud edgepurpose and I'm especially proud that wecan actually pay for maintainers andcontributors for example for the KCPproject where we can move the edge ofwhat Kubernetes can actually do and inneonos we also have donated projectsbased entirely on the upon thekubernetes principles some of which havesecretly or openly matured uh in ourrespectiveenterprises and um yeah are working atscale actually in short I want to makeit short yeah the idea is to take ourbeloved cube control command or is itcube cuttle yeah and apply kubernetusdown to bare metal across crossplatformproviders into the telco edges and makeit the European standard yeah if youwant to know more I'll be talking aboutneonos and our initiative in the LinuxFoundation Europe birds of a feathersession at 14:30 today together with oursister project Silva thank you Christhank you for uh it's an honor to standupon the shoulders of giants and thankyou CNCF for being such a welcoming andgreat community okay see youlater thank you Vasu i think he's maybeunderelling some of this like they do agreat job funding and and supporting uhCNCF maintainers and upstreamcontributors so very much kudos to himpersonally in in driving a lot of thatwork at SAP and always excited to seeCNCFA projects used in the way they wereintended to kind of build um cloudnative systems that are used acrossdifferent clouds and regions across theworld so super awesome effort so I wantto kind of wrap uh things up and beforeI hand it off to the lovely programchairs to continue uh uh you knowCubeCon here but um you know we are aglobal community we now have uh fiveCubeCons worldwide uh that are happeningum later this year uh we are doingCubeCon in China in Hong Kong and inTokyo CubeCon Japan for the first timewhich is very exciting and then we'll bein India in Hyderabad and later in NorthAmerica and Atlanta this year so thereare five CubeCons that we do to kind oftry to meet and reach our community inspecific regions so uh highly encouragefolks to uh attend uh most of these uhJapan and India for sure will sell outso register uh early so hope to see youthere and for next year to help withplanning since these these events havebecome so large uh this is a big kind ofkeynote stage um we are planning furtherin advance so uh as I mentioned beforenext year for CubeCon we're going to beuh in Amsterdam we're back at the RyeConvention Center over there which is alovely space love that area and thenwe're going to be in LA uh in 2026 forCubeCon North America um however uh forthe year after that we've already bookeduh the next CubeCon Europe and I'm happyto uh announce today that uh we will beuh in Barcelona back so uh book yourcalendars March 15th uh through 18th umone of my favorite cities in Europe andit'll be excited to be uh back there soum you know I want to thank everyone forcoming and and showing up uh I trulyhope you have a great week of learningnerding out meeting some new friends andand and so on so uh with that I'll I'llkind of close uh shop off and have uhCasper come lead the way on uhofficially really kicking off the restof the keynote so uh welcome and trulyenjoy this week so thank you very muchall thanks for attending2025-04-15 21:57:24.257837 =�� =�d#�Aq3uBctLa_Sg[Music]what a wild ride this week has beenCubeCon 2025 is awesome because of thecommunity the swags andtech It really encourages people intheir works It really allows me to youknow see what's the latest things whichare happening in the world of cloudnative and AI The talks the sessions areincredible I really like the lineupPeople are coming together to grow thefutureI would describe CubeCon Cloud NativeCon London as knowledge opportunitiesconnection and friendsWe hope to see you at an upcomingCubeCon Cloud Native Con in 2025 And asalways mind the crash loop[Music]2025-04-15 21:57:23.708013�#�mADGE1P5ynmsQ[Music]cubecon Cloud Native Con Europe 2025 day2 kicked off exploring Kubernetesinnovation end user stories and so much[Music]more this is my first CubeCon thisspecific group in London is veryinteresting because I'm getting to knowso many people from different companiesfrom all around the worldactually now I can participate in moredeep talks around the technologies weare speaking about artificialintelligence environment i'm excited toget more knowledge about thesesustainable workloads[Music]2025-04-15 21:57:23.111115�\#�qApWBbX6pOyPgsayCubeCon Cloud Native Con 2025 keynoteskicked off today highlighting 10 yearsof CNCF platform engineering AIobservability accessibility and[Music]more So I noticed this huge excitementin the community This year we areannouncing kind of an update to thisprogram in a new level called the goldencubes As a cube snot I would like tomeet people and make meaningfulconnectionsBut I'm meeting them in person and itfeels more family and community like anduh I'm very happy to to have you hereThank you Laker hatsThese certifications actually gave me aclear path to learn something I couldorientate myself within the projects I'massigned to much better than withoutdoing this whole journal[Music]2025-04-15 21:57:22.317149�]#�sArP1I6Cegej4[Music]welcome team cloud native to CubeConCloud Native Con Europe 2025 what agreat time it is for us to catch up withold friends meet new ones share ideaslet's get into it and get started heatheatheat heat heat heat[Music]you baby I will be your fool if you letme forget me just another thing you'relost to2025-04-15 21:57:21.484449 ��?#��5A85MDID9Ju04hello everyone to CubeCon Europe i'mvery excited to be in London here uhalso welcome to our largest CubeCon yetum it's very exciting we have over uh12,500 people here so very uh excitingso thank you very muchum be before I I kind of do the usualyou know new members new new type ofannouncements uh I've been feeling alittle bit nostalgic you know lately uhwe had the Kubernetes 10-yearanniversary last year that we had a lotof celebrations of this year is whenCNCF turns 10 and so uh you know I kindof like would like to talk a little bitabout the history uh of the organizationuh and kind of where where we're goingum you know I have a lot of nostalgiabecause the the first CubeCon I everintended was actually in London back in2016 it was the first CubeCon Europe uhat the time and if you kind of look inthis uh room you know this keen thekeynote hall here looks very differentthan it was down here for for sure butuh at that time we had a few hundredpassionate folks coming together to kindof build and work on you know Kubernetesand and eventually what what became umyou know the CNCF uh so if you kind oflook at the history here I think a lotof people aren't aware that um CNCF wasannounced in July 2015 uh we had about20 uh original organizations that wereinvolved uh we hosted our first boardmeeting in December 2015 at the New YorkTimes building in Manhattan thank youNew York Times for the space it wasgreat uh and then in March 2016 weformalized the uh cloud native technicalboard the TOC and formally voted toaccept uh Kubernetes as our first uhproject which was kind of cute it wasnice that the TOC was so gracious to uhvote and accept Kubernetes uh at thattime but if you kind of look at thisoriginal list of companies that wereinvolved we had AT&T Box Cisco CloudFoundry Core OS Cycle Computing DockereBay Goldman uh Google Huawei IBM IntelJoint Kismatic Mesosphere Red Hat SwitchTwitter Univa VMware and and Weave Worksthat was a very small list of oforganizations and you know some of thoseorganizations are no longer here i kindof uh uh in italics I kind of mentionedthe companies that were either acquiredor or dissolved but the key lesson hereis there was a small group oforganizations and folks that cametogether with a vision to make you knowuh infrastructure and cloud native uhbetter and you know and and and they didit you know you know here we are sosmall kernel of idea with passionatefolks can truly uh you know truly changeum the world so uh before I kind of moveon to uh you know other announcements Ijust wanted to have a special thank youfor some of the folks that were there inthe early days so you know uh specialthanks to our early governing board andTUC uh members uh I've bumped into somefolks already uh here i've uh seenKelsey and Alexis and so on so justspecial thank you uh to all these folksthat were there uh in the beginning whenwe had this kernel of a crazy idea ofwhat CNCF became so let's do a littleround of applause and see if you saythem thank you for theirearly crazy uh crazy supportso 10 years of CNCF you know we havetons of projects now so there's a lot ofanniversaries that are happening uhacross the board so we did Kuberneteslast year for 10 years it's 11 yearsthis year um a lot of people aren'taware that some of the projects we havein CNCF are actually older than uh us asan organization so things like Vitess uhand the update framework are 15 and 16years old so we're going to be doing alot of celebrating this year and uhacknowledging the 10 years of amazing uhhard work that this community has doneso I'm happy to announce just like lastyear we're going to be throwing some uhpart very cool to bein the middle of a phase change inprogress when everything is new toeveryone at the same time and most of ushere today have been building softwarebefore this LLM boom and we know thehard part about building softwaresystems isn't just the writing of thecode it's the testing it's themaintenance the tuning and the debuggingthat comes after so while LLMs have aton of implications for everything thatwe're doing I want to spend this timewith you today on this specific part ofit how we make sure that the code thatwe build on top of these magical blackboxes still works the way that we expectafter we've shipped and it's in front ofthe usernow we work with black boxes andsoftware all the time but LLMs take someof the ways that we're used to ensuringconsistent reliable behavior and makethem a little more difficult take forexample trying to make sure your codeoverall is testable mockable anddebugable well unit tests rely on yourbeing able to define a representativeset of inputswith LLMs we're intentionally openingthe door to a very long tale of possibleinputs as formocks LLMs are by naturenondeterministic swapping it out fordeterministic mock doesn't really helpus much here and when it comes todebugging your LLM behavior we don'thave simple logical paths to stepthrough the whole point of incorporatingLLMs into our software is to wrap thefull breadth of human expression intoour code debugging LLM behaviors kind ofjust turns out to trying something andseeing whathappens so this turning upside down ofour worldview is happening on a literalsoftware engineering systems engineeringlevel where these black boxes justaren't testable or debugable the waythat we're used to which means there'sno solid sense of correct to fall backto it's also true at a meta level thereis no environment within which we canconduct our tests and feel confident intheresults even normal product developmentor release practices have deter havehave turned inside out instead ofstarting with an alpha or beta and thenfeeling confident in later broaderrelease all these early access programstend to do is inherently fail to capturethe full range of user behavior and edgecasesthese boxes in the middle column maystill be working for you and may stillbe a good idea they just won't be enoughfor ensuring the correctness of your newLLMbacked functionality when you'reinviting your end users to do a lot ofthings in your system that you may neverhaveexpected so is it time to give up oneverything we've learned and know how todo and just embrace the rise of promptengineering as a specialized skill setof thefuture no obviously not or else Iwouldn't be here todaybecause a lot of the conversations we'vebeen having as an industry over theyears about how to build reliable andperformant available systems in achaotic cloudnative world are stillrelevant probably more relevant with thesort of rapid iteration that buildingwith lens demandsadopting CI/CD let us shift code let usship code much more frequently rapidlyiterating on userexperience talking about testing inproduction helped us all embrace thechaos of real userinput versus artificially sterile testenvironments high cardality metadata hasbecome a must-have if softwareengineering teams are trying tounderstand complex systems whether thatcomplexity comes from your architectureor trying to understand the businessimpact or per user experienceshighdimensionality data being able tobreak down aggregate data by a bunch ofpossible related fields not just a smallnumber of predefined ones has become thedefault way to detangle the interactionof multiple contributing factorsand SLOs's service level objectivesborrowed from our SRE friends have letus leverage existing alerting workflowsbut anchoring them around fuzzy conceptslike user experience or are wedelivering a good service to our endusers these trends have already begundriving a distinct new approach tomaking sense of our software even in aprelimworld and we already have a model forhow to measure debug and move the needleon unpredictable qualitative orquantitative experiencesobservability where it's all aboutcomparing expected behavior against whatwe're actually seeing inproduction in front of our liveusers because these are some truthsabout building on LLM somethingunpredictable will happen user behaviorwill be chaotic one fix will breaksomething else tests won't be enough andearly access programs won't help youthese aren't only properties ofextremely complex systems they impactall of us once we've decided to take innatural language input or unpredictableinput from users and pass it off to AIto make decisions about that pathcarries with it a level of chaos andcomplexity that will force us all tolevel up fastnow observability helps embrace someunpredictability enabling the sort offeedback loops that let you learn fromwhat's really happening with your codethe same way that we've learned to workiteratively with tests observabilityenables us all to ship sooner observethose results in the wild and wrap thoseobservations back in the into thedevelopmentprocess we just said tests wouldn't beenoughright that's where eval come as a quicksidebar for anyone who isn't familiarthese are tools that allow us to codifywhat good looks like in an LLM worldthat allow for a little bit moreflexibility about what success orfailure means for our applications thepattern is that you develop your set ofeval as you develop your application andwe use them to capture intended behavioror flag behavior as you work with yourprompt now eval parallel observabilityin really useful ways whileobservability is all about theunpredictability and chaos of productioneval capture that good and the bad asthey happen although in both caseswhether you're talking about theinstrumentation behind yourobservability or eval themselves theintention is for them to evolve withyour codeand what you find in your observabilitytools about how users are trying to useyour software turns out to be the bestsource of input into defining those newevals and so if you put these two thingstogether the same way that observabilityoffers feedback loops into developmentobservability pairs with your evals toform these feedback loops informing andimproving your prompts as you'reuncovering the behavior of your LLMs inresponse to your prompts you're definingevals releasing quickly watching thatcode in the wild then closing the loopas you learn and pulling those learningsback into your codebase to live onforever aseval let's go one level deeper what doesit look like to ensure that you'resetting yourself and your observabilitytooling up to support this kind ofworkflowwell because LLMs and Gen AI are thesenon-deterministic black boxes withunbounded variations on inputs they canreceive with a rapidly evolving set ofpaths used to refine those inputsgetting good observability into thesesystems is all about systematicallytracking the inputs andoutputs let's take a look at what thislooks like for a standard web app byinstrumenting our application we cancapture what arguments were sent to iton any given HTTP request some metadataabout how the app was running what wasreturned all this lets us reason aboutthe behavior that we expect for a givenuser endpoint or set of parameters andit lets us isolate and debug the issueif the actual behavior deviates fromthatexpectation but what about this paymentservice it's a third-party blackbox outof my control where even if I wanted toI couldn't go in and instrument it orlook at the logs flowing through of whatit'sdoing what I do know is what requests myapp is sending it from where in the codeand on behalf of which user and I knowhow long it took to respond and whetherit was successful and probably someother metadata by capturing all of thatI can start to reason about how theinputs impact the outputs of my blackboxhow my application and my business logicimpacts all of that and ultimately theimpact on the experience the end user ishaving taking that and carrying thatinto an LLM world there are a few moreboxes but the principles remain the sameone tactical note I like using tracesfor this in order to understand therelationships better between the overallend user experience and sort of thesubcomponents if you want to usestructured logs go for it you doyou either way we start with the enduser experience the raw input andeventually their the output that we'rereturning we can also capturemetadata on the context that we'reconstructing along with our prompt andhow long that took we can keep track ofthe prompt itself that we ultimatelypass to the LLM and useful metadatacapture useful metadata like tokenusage as well as any parsing orvalidation of LLM outputs beforereturning to theuser by operating under the generalprinciple that criteria for decision-mshould be captured in a span you canthen go and isolate any of theinteresting behaviors based on how aprompt is generatedand all of that ultimately lets us seeall of the work that we're doing up toand including calling the LLM all in oneplace yes of course we can also get theaggregate graphs like the ones on theleft in the middle we can look atlatency we can reason about usersatisfactionbut in a workflow where we're rapidlyiterating on an LM experience with tonsof potential inputs that can impactwhether your application looks like it'shallucinating you need to be able to getfrom any aggregate graph to inspecting agivenoutlier this blue row at the bottom thisis a span where ultimately we'reactually calling theLLM and being able to ask questions likeokay what actually got passed to the LLMwhat was it responding to relies on allthe spans above it because that's allthe work that we're doing to build thebest prompt that we can hand to thiscommercialLLM and when there are that many thingsthat could result in an unsatisfying LLMresponse we need all of the context wecan get to iterate and investigatetowards investigate and iterate towardsa better promptnow there are a number of specializedtools out there that promise out of thebox answers for LLMobservability i will assert I don't wanta siloed or specialized tool i wantsomething especially not one that triesto tell me what to care about i want mytools to reflect what I care about whatgood looks like for my applications andsomething that aligns with the workflowsthat my engineering teams are using forthe overall applicationlogic because everything I've talkedabout so far isn't some new skill set ormindset so much of this shift towardsembracing observability is alreadyunderwayin the last decade or so we've seen ahuge shift from developers aren't justhere to write a lot of code which AIcode assistants are certainly helpingwith to expanding our responsibilitiesto owning our services being joining oncall rotations and testing in productionultimately being responsible for whatour end users see as a result of ourcode because like it or not whenbuilding for this new Gen AI worldnothing is predictable besides somelevel ofchaos over the years as I've gone aroundtalking about observability I'vecommunicated some version of this bottomstatement that software behaves inunpredictable emergent ways and theimportant part is observing your code asit's running in production while usersare using it this was true before LLMsbut now as we are intentionallyembracing these non-deterministic immerblack boxes with emergentbehaviors this statement is that muchmoretrue so as we enter this age of AI I'mweirdly optimistic because we have manyof the tools and practices at ourdisposal to make sense of this brave newworld and we're just getting started ifor one am excited we gotthis thank you for your time andattention today if you'd like to learnmore I've got an O'Reilly report and aO'Reilly report and a book to recommendyou uh I will be at the Honeycomb boothin the Expo Hall my team and I wouldlove to hear about what you're buildingthank you so much have a great CubeCon2025-04-15 21:57:24.744159 �(#��AR0255efML-Ihello my name is Christine Yen thank youfor having me i'm excited to be herewhile I am the co-founder and CEO ofHoneycomb I will say the statements andassertions I make should apply to anymodern observability workflow let's getin writing software today feels moremagical than it ever has before we'vegot LLMs everywhere we have cheap APIcalls to foundation models not evengoing to touch the whole idea of vibecoding in this talk there's a lot to beexcited about here it's UU�(#� AVLVSa6xD5tk10 years ago Kubernetes revolutionizedour way of running workloads now it'sreinventing itself again to adapt to newchallenges static provisioning was anachievable luxury for traditionalworkloads but it's grossly unaffordablefor emerging AI and ML needs to optimizespending and navigate global resourcecontention more sophisticated deeplydynamic scheduling is necessarylet's travel from here up through allthe layers of the stack where myselfCorentin our Google co-workers and thewhole community have been working tomake dynamic workload scheduling morepowerful than everbefore so let's start at the pod levelit used to be all about CPU and memorywhich have been treated fungeibly in thelast decade but today workload authorsknow that hardware can make a bigdifference on how their workloadperforms when it comes to hardwaredevices there are a multitude of optionsand all of these options come withtrade-offs in the cloud environment likecost availability orperformance dynamic resource allocationis a new set of pod parameters that giveworkload authors the model to expresssophisticated hardware requirements andpreferences it's now possible to give avery specialized pod just a small partof machine that could be shared withother specialized pods or on the flipside you could give it exclusive accessto all the GPUs on a node the breadthand depth of attributes available to younow with DRRA give you a totally newunparalleledflexibility once you have defined apod's resource claims you can leave itto the system to satisfy it on the flyand pack your nodes as efficiently as itcanwith these fancy demands node allocationhas to follow through to provide thedesired resources but for the first timein a long time big clouds don't feel soinfinite anymore everyone is fightingfor the popular card that just gotwrecked just in time capacity tools likeCarpenter or GK custom compute classesare giving you the ability to defineexactly what your priorities are whetheryou can leverage a spot instance preferflexibility mixing reserved and ondemandcompute or as a last resort fall back toa less powerful machine when runninginto capacity or koda issues in computeclasses admins configure theirpreferences in any order mixing machineshapes hardware or availabilityclass the cluster or scaler thendynamically provisions the rightinstance at the right moment keeping thecost low while optimizing for theability of compatible hardwareif you still can't get a node in theregion where your cluster is sometimesreaching a bit further is better than anunscheduled workload we have the abilityto zoom out and look at multiple regionsfor better availability but it's oftenhard to know where specialized capacityis and the static system is unlikely tobe utilized well recent efforts inmulticluster introduce new tools to helpyou target or expand to entirely otherclusters for training batches multiq isone example where a workload would lookfor space in a fleet of clusters we'realso very excited to announce thatGoogle is open sourcing a solutioncalled multicluster orchestrator toachieve dynamic placement especially ofunique workloads like AI model serverswe just released a blog post talkingmore about it with a lot moreinfo we've just scratched the surface ofwhat's available today and thedirections that Kubernetes is goingmyself Corin all of our co-workers atGoogle everybody in the Kubernetescommunity everyone's working hard togive you the tools to level upscheduling for your workloads thisCubeCon is full of talks that will allowyou to explore this new age of dynamicprovisioning and you can always comevisit us at the Google booth and ask usmore we look forward to seeing you thereand thank you for having us thank you2025-04-15 21:57:25.193491re aroundobservability s surite operations and Ishould say it's been an absoluteblessing for me to be with eBay for overuh for around 13 years and I've beenwith eBay straight out of college uh I'ma big open source enthusiast uh startedoff with Drizzle DB shout out to anyoneuh in the audience who has worked onDrizzle so far um since then I worked onworked with multiple communities likeopen telemetry Prometheus and morelet's start off by saying thatobservability is intense out of the 1013 years at eBay uh 10 of them I'vespent uh on either logging or monitoringor now observability as we like to callit and the scale of operations hasexploded over the last uh 5 years orso complexity doesn't cease to slow downwithin eBay we have roughly 4,600microservices that power the actual eBaysite and uh from a scale perspective wegenerate 15 pabytes of logs per day uh10 billion active time series and 10million spans per second that's sampledat roughly uh 2% so it's not a trivialamount of data that we have to dealwithand so uh incidents impact our abilityto provide the highest qualityexperience to our customers and that'swhat we we we tried to dothe fundamental problem with the sitebecoming more and more complex is thatas humans we have limits to how much wecan comprehendum at at any given point in time so ifyou take manual triage as an example thetime it takes to traverse a very longcall chain um is a lot the amount oftime it takes to see through terabytesworth of uh logs is a lot and how manydashboards do I have to look at before Ican arrive at a hypothesis and all ofthese are trial and error based andusually there are a lot of errors beforewe actually land on what isright so at a time where we were uhrelying on static uh threshold-basedalerts or manually eyeballing things asthey fall off a cliff uh there was afirst pass at uh innovation and this wasbasically machine learning um shout outto our uh my partner in crime Huai Janguh in Shanghai who was spearheaded a lotof this we first tried to reduce thetime to detect to under four minutesusing anomaly detection um we built uhsomething called Groot which isavailable as a white paper online uhwhich can attach a root cause to everyuh alert that's uh triggered off of abusiness KPI and we did simple autoremediation say there is a bad partthat's there if you're able to identifythat outlier uh maybe uh bounce thepod it's a good start but it doesn'twork all the time and one of the reasonsbeing that it learns based off of whatit has seen um and if it sees somethingnew it does not know what to do with itbecause there is no reasoning capabilityso we expect the machines to domore and then came the LLMs they camewith a big bang uh they could do thingslike given human input be able tocomprehend it and do something on top ofit um and they can respond like humansuh which was fascinating to all of us wehave we have played with chat GBT when Ifirst got my hands on chatgbt I asked itgo rewrite the Prometheus postings indexto use roaring bit maps i tried triedtried tried tried and eventually Ilearned about what hallucinationactually meant and then I gave up onthatso there are there are problems with AIu and Christine's talk alluded to someof it um but one of the things that whenchatp and other lms came out we werelike this is the silver bullet this isgoing to solve everything just throwengineering out the window it's nolonger needed ai is going to doeverything we we tried to bite off morethan uh we can chew and there was a lotof gu randomness that was guaranteed soeverything that we tried fail fail failsome more and end of the day we need torealize one thing that the probabilitiesneed to work in our favor if we promptin a very deterministic way give very uhcrisp context then it becomes a littlebit more deterministic but if you keeplayering more and more probabilisticthings into more complicated workflowslike triage the site or triage an alertthen the probabilities are going to workagainst you and you're going to get veryrandom responses and do you want to usethose random responses when you're uhtroubleshooting an incident probably notso at that point we realized that maybeit's worth starting off small with thecurrent u capabilities that uh LLM havewhat are simple things that we can knockout of thepark and that's when we came to therealization that we need to build whatwe like to call building blockcapabilities uh capabilities that are ofhigh quality highly deterministic thatwe can confidently rely onthe first one being a trace explainergiven a traceID pull the spans analyze try to findwhat the causal span is and then do asummarization on top of it do a logexplainer given a bunch of log linesanalyze them find if there are any erroror latency patterns that are worth uhinvestigating summarize them given ametric explainer uh given a bunch oftime series analyze them identify ifthere are any particular trends that areuseful are there any anomalies do asummary and then finally a changeexplainer given an application making achange identify what kind of change isbeing made and uh do a summary on top ofit so you can see that summarizing is uha very key theme in all of these uh uhexplainer capabilities that we are weare buildingbut this doesn't necessarily eliminatethe problems withAI uh there are finite context windowsthat we need to work with and uhtypically the data is large uh and ifyou take our checkout API as an exampleit has 3,000 spans and there are usecases that I've seen where we have 8,000spans per request so what would happenif you shove all of it into the LLM youwon't be able to fit within the contextwindow um the more data that you try togive to the LLM the more it willhallucinate over time and you have morepeople screaming into the abyss as aresult so shoving LLM everything intothe LLM is definitely going to be metwith disappointmentand this is when we came to therealization that uh AI and engineeringare in a love relationship and it isimportant for usto use both AI and engineering for theirstrengths and combining their strengthswill achieve help us achieve somethingthat is trulymagical so we went back to the basicsgiven a trace what are things that wecan do uh to make sure that we can makethings a little bit more predictable sothe first thing that we did was to cleanthe trace up how do we clean removeeverything that's not in the criticalpath uh there there is an uh there is awhite paper that Uber did a few yearsago called Crisp uh which effectivelyteaches us a good way of uh generating acritical path we started off using thatand then did some improvisations on topto the point where we had a criticalpath algorithm that worked for us andeliminated all the spans uh that are notin the critical path and after that wedid few short prompting to say that okaythis is how s surres within the companyuh triage active incidents uh an examplewould be uh if it's a 4xx don't considerthat as a hard failure if it's a 5xxthen you need to pay more in attentionto that then we focus more on self timebecause selftime is where um uh youfight you find truly resource intensivespans and then finally we uh leveragedLLMs for what they're actually good atwhich is to uh summarize do simplereasoning and then explain specificallythe the criticalpath another thing that we uh ended updoing is uh we dictionary encodedeverything so it's not going to beservice name equal to checkout it's justgoing to be service name equal to oneand for all practical purposes machinesdon't really need um to know if it ischeckout or if it is fooar it's justgoing to look at things analyze and thenspit out uh responses so once we havedone that we split the trace intoupstream and downstream chunks wegenerate partial explanations for uh allof them and with the partialexplanations we basically u combine themto generate uh a final explanation ofthe full critical path and this helps usto identify if there are any uhperformance more than one performanceissue that we need to be worriedabout then so assuming that we havethese uh capabilities that are of highhigh quality we started to layer themtogether um uh a good example for thatwould be a dashboard explainer givendashboard metadata that conveys theseare all the time series that need to bepainted these are the annotations thatdepict changes these are some of thefaulty traces that uh are worth lookingat pull all those use the correspondingexplainers and then try to generate u anexplanation on top of all ofit and you can take that even a stepfurther to come up with a triageworkflow saying that given an alert lookat a bunch of KPI dashboards standarddashboards um analyze the KPI dashboardanalyze the standard dashboards analyzeif there are any faulty traces and oneof the things that we did was uh startwe we had a pipeline stage in the alertmanager that embeds faulty traces intothe alerts so that the alert has morecontext for having a meaningful triagefinally summarize all of it where did itcome handy so we we we had an issuewhere there was a slowdown in databasequeries with u uh roughly 1,300 uh spanswithin that uh trace the trace explainerbasically came in and said that uh usersegment service uh has a particular spanthat is taking a lot of time this wasenough to tell us that this is theservice in the call chain that hasproblems then the log explainerbasically came in and said that there isa timeout exception that you need to beworried about and this basically savedour s quite a bit of time in identifyingwhat was going on otherwise they wouldhave to manually go to the logs to thetraces find out what's going onthe possibilities are infinite if youreally start thinking about uh buildingthese things as building blockcapabilities there are so many thingsthat you could do uh for one if we hadall the metric metadata available on uha a vector database you could startwriting PromQL saying that tell me howmany search requests failed in the lastuh last 1 hour if you have an ability togenerate PromQL expressions you can puta few of them and then create aPrometheus rule group you can ask thisuh uh the platform to increase your kotoyou can ask why did I get a particularalert why did my SLO violate andeventually you can get to a point whereyou can stack all of these up and saythat help me triage the site for mewe also realized that uhum it's it's better for us to figure outways in how in which we can remove theprobabilistic uh nature of LLMs and thebest way according to us is to back themup by APIs uh and the critical pathalgorithm is a good example if you wereto do the same thing for time series youwould probably do it with the uh anomalydetection under the hood and thenproviding the responses for the LLM tosummarize stick to what the LLM is goodat it's good at simple reasoningsummarization is good at code generationuh things like copilot and uh internalknowledge search through retrievalaugmented generation many things canstill be done with code instead of LLMsand we should continue doing that do notuse the LLM just becauseum what could we do more with uh whileLLMs can comprehend the wild wild westthere is a strong need for us to havestandardization we need standardizationon the inest side and we needstandardization on the query side um weneed widespread adoption of opentelemetry with people following theschema because um that will help us makea lot of assumptions and we need thesame on the query language uh as wellwhich we don't have today and naturallanguage angage is frankly not theanswer u Chris Larson u one of theco-chairs for the query languagestandardization group of theobservability tag um has a talk later incubecon i strongly encourage everyone toparticipate and uh um take part in thatuh in thatconversation are we on the right pathsure um and uh it's a um it's a lot ofwork that was put in by the team uh toget this done um we strongly believethat we are in the right direction uhbut will I change my stance in the nextCubeCon maybe uh the space is so rapidlyevolving that if I walk down the stagesomething new would have come up and itwould all be irrelevant um if you haveany questions uh I'll be at the OpenTelemetry Observatory a few minutes uhafter this talk um happy to have aconversation and thank you very much[Applause]2025-04-15 21:57:25.704021 ��r #�Ad2szUE0jhX4hi everyone um it was really amazinglistening to Chris Anichek's uh openingsession where he talked about how farwe've come from the first CubeCon Londonuh and I enjoyed that cuz I was actuallyat that event and to think we've gonefrom that few hundred enthusiasts tonearly 10 million cloudnative developerstoday is pretty amazing um but I'm hereto talk about not about looking back butlooking forward how do we go from wherewe are today to get the next 10 millioncloudnative users because they're goingto come from all over the world and haveall sorts of different backgrounds andsome of them will look like the nerds umsorry I mean early adopters that havebuilt the community we are todayum but some are going to be students orWindows server admins and just normalfolks that expect stuff to just work outof the box and be easy to use and learnand as they come on board what's theiruser experience going to be usingKubernetestoday well the first thing they're goingto discover is that there's no one thingcalled Kubernetes to go install there'sno uh apt get or brew install or if Ilook on an app store I don't find uh theapp called Kubernetes so it becomespretty clear that one does not simplystart using Kubernetesum and the typical user journey at leasttoday starts with a command lineexperience it quickly dives into thedepths of YAML and for a novice userseeing demos by experts typing thingslike kget getpo it can prettyintimidating so then it's not just aboutgetting Kubernetes itself up and runningright we all know that there arehundreds of add-on projects for storagenetworking security and more each withits own installation process and userexperience so no question right thatthis is a a fantastic this flourishingecosystem is really a strength ofKubernete�#��AAj0AqGpC_pp4hello everyone good to see you all thisis usually when my kids give me a callthrough my wife's phone but the goodthing is they're they're in the audiencetoday i love the three of you a lot weare here to talk about AI enabledexplainers uh for observability um whenI say we actually did something with AIit doesn't mean that we we're the onlyones who have done something but if youif you have listened to my talks in thepast I'm big on storytelling and overthe next 10 to 15 minutes uh I'm goingto tell a story about how uh we came upwith the concept of explainers and howwe are using it right now and where wesee it go in thefuture i keep saying we who are we areeBayebay today is uh present in a over uh inin 190 markets with 2.3 billion livelistings 134 million active buyersworldwide andthe volume of uh money we transact juston mobile devices is uh roughly $13billion who am I uh my name is VijaySamuel i'm a principal MTS architect forthe reliability engineering organizationuh at eBay uh uh my primaryresponsibilities as and what built thecloudnative ecosystem to where it istoday but it comes at a price right weall know this there's not just a steeperlearning curve but switching contextbetween different tools and theinconsistency that you have between thedifferent interfaces um that all impactsdeveloperproductivity and if we're honest thisall adds up to an often confusing andsometimes daunting userexperience so if we're going to reachthat next 10 million we've got to dobetter thanthis now here's a nice analogy for you30 years ago Windows95 I would argue is what catalyzed massadoption of PCs and the internet becauseit turns out normal people preferpoint-and-click usability and everythingworking out of the box over a commandline and a handcrafted configfile right and that leads to the thoughtwell can we as a community build abeautiful easy to use modern KubernetesUX to unlock the n that next level ofadoption and what would thatbe well I think we need three thingsfirst we'll always want an embeddedin-cluster web UI you know building onthe role of dashboard today second weneed a unified tool that makes it easyto manage deployments spanning multipleclusters and third a fast and easy touse out of the box experience forbeginners to get started with and learnKubernetes running locally right ontheir desktopso a nextgen dashboard a multiclustermanagement app and a Kubernetes desktopexperience are you getting it yet theseshouldn't be three apps butone and it turns out that thefoundational technology for buildingthis is in the CNCF already in the formof the headlamp project i'm actuallyexcited to share that it's been acceptedinto the Kubernetes project under SIG UIand in fact the G the GitHub repo movecompleted just yesterday and head Thankyou[Music]headlamp has a lot of the corecapabilities that we need today andwe're working with many vendors and endusers across um the community to enablean incredible Kubernetes web and desktopexperience so let's see what that couldactually look like so I've justinstalled this new Kubernetes desktopapp first thing I want to do is create acluster running locally on my laptop igive it a name it's created using minicube i get a nice overview of thecluster and then can quickly see whatworkloads I have running like pods anddeployments dig into and even edit thosedetails and down the side there you seesettings for storage networking securityand more all nicely organized for easyaccess and cruciallydiscoverability now we can also connectto existing clusters all I have to do isto import a cube config here I've gotone that points to three clusters indifferent clouds i'll go into this onein Azure um it's really nicevisualization we'll go into the map viewshows the relationship between keyobjects at a glance and of course youcan uh click in to see all all thedetails now to install an app or anadd-on component we go to the appcatalog now this is actually just a listof available solutions pulled directlyfrom the CNCF artifact hub so it'sintegrated into the community it's pullsfrom that community repository and withjust a click I can get an add-on likemanager installed and running we alsouse artifact hub for publishing UIplugins here for example we see the fluxplugin install that immediately get aflux specific view it's an extra toplevel item but uh practically anythingin the UI can be customized in this wayso that's our vision of a compellingfuture Kubernetes user experience theheadlamps only a foundation it'll takeall of us here coming together todeliver on that vision developingplugins for all the CL cloudnative toolsadding language translations for greaterglobal reach and um providing that inputand feedback so that it really truly canbe communitydriven and responsive to theneeds of the community you can do thatthere's a GitLub GitHub hub issue linkhere or right here at CubeCon we'll havea the headlamp team at the projectpavilion and there's acontrib so uh please engage and be partof this cuz together I believe we cantake Kubernetes to the next 10 10million users and beyond thank you[Applause]2025-04-15 21:57:26.180876lly abouthalf the size of number CVES per daythan the other operating systems so ifyou count CVEs as being security issuesas it matters I don't um we're stillbetter than the other people and numberof releases 8 to nine weeks likeclockwork for the past 15 years newrelease like clockwork you can justdepend on us i gave a longer talk abouthow all this works you can just Googlethat and look at that if you're curiousuh turns out we run the world um Android4 billion devices everything else is aroundingerror you guys and servers with maybe200 million again rounding errorchromebooks 25 30 million a year for thepast decade still a huge number roundingerror um Wi-Fi all the Apple Wi-Fi orthree 5G modems run Linux so all theiPhones run Linux as well washingmachine all the TVs air traffic controlfinance um satellites and my favoritethe cow automatic cow milking machineruns onLinux so let's talk code i mean I'mgoing to it's a keynote so we have toshow code right um this is a real codein the kernel today um Bluetooth this isan example of a security bug we're goingto talk about security issues the firstline up there we we ask for someparameters back we look at the resultand do something with it looks fine beenthere for a few years turns out that'swrong at our level of the stack if weget something wrong that's a securitybug we forgot to check that it actuallyreturned a proper value uh my internfixed this fast this past summer got aCBE wonderful um and we use go-tos cuses gotos fun but we go to the place weneed to unlock from because we need toremember to unlock from this and thecompiler can't really check for this sowhen you're a reviewer or maintainer youget a patch sent to you all we can seeis oh yes they did look at the returnvalue that we need to unlock we didn'tdo this we can't always remember thatand that's a common common bug here'sanother CVE that happened normally wejust return the error they did check forit properly but they forgot the fact wehad to actually unlock unlock some otherstuff later so wouldn't it be nice if wecould do this automatically c a numberof years ago put in something calledscoped references so we can when youleave the scope of a reference it'llautomatically clean up for you finallyif we look at the code in Rust othercompilers do this automatically we'relike why can't we do this in a kernel wefinally incremented our version of Cthat we support so that we can finallydo this and now we have something calledguards so we can Here's the diff weremove some lines of code and we can sayhere's a guard for this lock let's grabthe lock and then let's do somethingwhen the lock goes out of scope it'll befreed much cleaner we can get aroundgotos we can do lots of stuff this isgood so going forward we're going tostart doing stuff like this we don'twant to modify the existing code justgoing forward is good but we still haveto manually remember to grab the locki'll come back to that in a minute sonot only grabbing locks but allocatingmemory we can do things like oh we wantto grab some allocate some memory at thetop when we return normally we'd have togo and free it manually here we lose thescope and away it goes the compiler justknows to free it up properly but insteadwe also have to say we want to save itso we got to manually save it at the endas well so this is C scoped locks andallocations this is good it makes thecode simpler it makes the reviewingeasier which is very very important wewrite code for people first compilerssecond because we have to maintain thisfor long periods of time people readthis and have to understand it less bugswhich is very important and mostimportantly the maintainers can havemore fun and go do other things insteadof reviewing unlock bugs don't want todo this so let's look back at theoriginal code that we had up here at thetop c so that we were to write that codein Rust the top line adds a a questionmark at the end and that tells thecompiler if there was an error herewe'll return the error so Rust willenforce the fact not only do we catchthe error that we also looked at thereturn value that makes code muchsimpler the compiler catches the bug foryou and that's very very important wewant the compiler to catch the bug evenbefore a maintainer has to look at thisstuff again with locks Rust does somecool stuff rust will force you to grab alock before you can even access themember the data remember locks arealways supposed to be for data not forcode and here rust an example this is inthe kernel today of Rust grabbing a lockbefore we access the data if we try andaccess the data ahead of time you justcan't do it the compiler will not letyou so Rust can prevent a huge majorityof security issues at build time whichis much much more important than reviewtime don't rely on humans rely on codeor tools that we can automate this stuffthat's why some of us are pushing forRust in the kernel this is veryimportant so again Rust can do all thesame things as C can do here very goodbecause we have more developers than wehave reviewers again we might have ahandful of maintainers for the kernelthat do that full-time the majority ofthe maintainers do this as a part-timething part of their paid job or part oftheir own time we need our remaintainers and reviewers to have asmuch time as possible make their job asmuch as e as easy as possible rust canalso do some cool things we can enforcevalidation of untrusted data data comesinto the kernel we need to validate itbefore we can trust it that's a hugecommon security issue rust can enforcethis again memory lifetime rules lockingrules error handling and type safety allvery good things but Rust is not asilver bullet here's some example codethat is not in the kernel thank you umthe last line there very common off byone error boom the kernel will crashrust however when we access memory thatit didn't want to access will just crashthe kernel will crash i'll give you aCVE you'll reboot the box and go on butwith C you would have a memory exploityou could probably take over the machinerust will fail safer it still will failit's fine but it will fail safer soagain you'll still get the CVE but youwon't get the box taken over so that's agood thing that's a very good thingthat's one other reason why we should beusing Rust not that it's going toprevent us from crashing it'll preventus it will crashsafer it's today it's in the kerneltoday we have 34 million lines of C codeyou don't run all those millions oflines of code you run about on a server2 million lines of code but 25,000 linesof Rust one of the new GPU drivers isbeing written in Rust today thedevelopers there are pushing for a lotof this stuff and it makes the codeeasier to understand easier to reviewand hopefully more stable over time butyou've heard a lot of people complainingabout this changing kernel developersminds ishard the main reason this is hard and weare grumpy about this is because itforces us to review our C code and ourcode has evolved over the past 30 yearsin ways that sometimes we don'tunderstand it sometimes we need to goback and look at it and enforce therules rust allows us to force to enforcerules on our C APIs in the kernel thatwe hadn't been able to do before so it'smaking us re-review older code and it'shard developers maintainers don't wantthey want to do new things they don'twant to look at their old stuff at timesbut the change can be good i think it'simportant to do this because mostly mostimportant it'll make us maintain thecode easier again we write code forpeople first compiler second this willmake us last for the next 30 to 40 yearsmake the compiler do the work for us andthis lets us I think have more fun formaintainers because again the compilerdid the work ahead of time we don't haveto worry about us trying to remember allthose intricate things did they grab thelock did they check the proper referenceall that stuff it's just done for us andthat as a benefit makes Linux moresecure for you you can go solve yourproblems you can do what you want to dowith Linux better and that's I thinkvery very important and most importantlyit lets us do world domination thank youvery muchthank you Rich2025-04-15 21:57:26.702524 # #�N #��SAkQ4X6-mPHqwhi I'm Greg kernel maintainer developerone of many thousands of kernel peoplei'm talking about Rust and Linux butfirst it's really about C and Rust andLinux because C is what runs the worldand C is still very important but Rustis doing good things too but I was toldthat many of you don't really know whatLinuxis this is a Linux Foundation conferenceum Linux is that little thing at thebottom that hides the hardware from youit's our job as a kernel an operatingsystem to isolate different processes tomake it so that you have a fast securesystem and to make the hardware lookagnostic you don't care what disccontroller you're using you don't carewhat network controller you're using youdon't care what processor you're usingit just works that's Linux's job we getout of the way we let you go and dowhatever you want to do we're a tool tolet you achieve your task um ourcommunity is big it's really really biguh almost double the size of Kubernetessorry you're number two we're stillnumber one um this was just last year atleast 355 different companies we don'treally count them all we kind of getclose um we also go fast really reallyfast this is the number of changesaccepted it takes on average at leastthree tries to get a change accepted sothat change at 76,000 changes has beenreviewed two other times before it gotaccepted our maintainers do a lot a lotof work we have about three maybe 200maintainers that do the majority of thework 700 total um so the ratio ofdevelopers to maintainers is still quitelarge um we're going really really fastthat's almost eight nine changes an hourfor the past decade 24 hours a day 7days a week small percentage of those gointo the stable trees and 13 CVEes a daysounds like a lot we're actuaethere's uh integration tools being ableto plug and play pulling data in withdifferent connectorsetc so now withcubeflow best practices putting that alltogether for the implementation it'salready got that ML ops builtin so now we'll get into kind of the keycomponents of sign language recognitionand processing with AI ML inference andorchestrationso how does sign language recognitionwork exactly we use computer visionmodel models for real-time video proprocessing pre-processing is happeningfor a lot of that because we have a lotof videos and images with you knowvarious skin tones different backgroundswhat shirt I'm wearing there's a lot ofyou know if I have a loud shirt onthere's a lot of noise introductionthere and if there's a lot of noise inthe background equally the same thingnot necessarily noise to you uh in theway you hear it but I'm talking aboutvisual noise so if I'm spelling the wordhow h o w there's a lot of noise alreadyit's just three letters but there's alot of noise because when you're doingthat transition from the h to the O itkind of looks like a C in the middle sowe're trying to figure out how to cleanthat noiseout it's a technical challenge that we'dfaced here uh we've used an overlayframework so that hand with thelandmarks that I showed youearlier within that image there arevarious points for X Y and Zaxises so interacting with that andtrying to capture the entire grandscheme of my hand and what's happeningtherethe models might look in one directionif I'msigning you know a word and it's thecamera is looking from the front but ifthere's a different perspective oflooking from the back or the side thatcan changeeverything we always want repeatabilityand uh the role of AI and machinelearning here is to be able topreserve not word for word translationbut we want to be able to do thetransformation within the context ofwhat's being signed so moving that tothe next word understanding what that isgoing forward and that's where the LLMcomes into thepicture also I wanted to add that theLLM when we're using that for signingcities for example in the United Statessome signs are exactly the samedepending on the context so we haveAustin Texas and Albany New York andthey have the same sign so the LLM needsto be used to understand which you'retalking about within thecontext so you know we're here atCubeCon so of course we have aKubernetes uh foundationthis is providing the seamlessintegration with the process and thedata handling the preparation everythingthat's repeatable we can add words signsand integrate that uh in the repeatablepattern and optimize for videoprocessing as well so when we'reprocessing that video we want toleverage the latest and greatest CPU andGPU so we don't have to change anythingthat's going on out there leveragewhat's currently new in that space andbe able tothen include CubeFlow to run againstthat also the scalability of CubeFlow isa great opportunity for using that asfar as getting a fasteroutcome if I had a longer amount of timeI would be able to go in very depth intothis i know there's some machinelearning people out here that probablyget it but I just wanted to demo howcomplex this project was there's a lotof complexity we've got the UI clientwe've got the inference service thoseare going back and forth quite a bit andthen behind the scenes we also havemodels that are running on cubeflow andpipelines that are adding the data aswell as using uh q or kserve foroptimizing theinference also we can add the notebookintegration um you can tweak thingsthere to be able to test your models etcso here's the demo so you can see whatI'm doing here with myhands you can see all of those landmarksthat I was speaking about with the X Yand Z those are being passed to themodel and the inference service istrying to figure out what I'm trying tosay so it says "Welcome to CubeConeveryone[Applause]so when we started this project we wereusing regular images ofhands and maybe not you know retrievingthousands of thousands of people's handsreally that task can be impossible forme to prepare for here you know they letme know I think in January so I hadlimited amount of time um we had a wayto shortcut it a little bit um withusing MM Labs and media pipe and beingable to identify those landmarks so wedidn't have to worry about thebackground the colors like I was talkingabout before identifying those vectorpoints and our model improvedtremendously was able to understand it alot better and so these landmarks andshapes here were identifiable andreducing the noise in theenvironment you can see here that wewere talking about scale within ourproject as well if you did something toofar away from the camera it would kindof blend your fingers together and itwouldn't really be able to recognizethose landmarks but then if you backedup a little bit or made it into acertain space then it would workbetter and then we decided to add shiftand scale so that we could normalize thedata from zero to one and with that inmachine learning you know the conceptswith the zeros and ones you know 35whatever that may be so those differentvector points adding that shape and sizewith the shift in scale it didn't matteras much so it wasn't really muchdifference between the two so weexperimented with that and wow it wasdifficult definitely not an easytask so I'll have an interesting story alittle bit about this partso with machine learning and you knowanalyzing the data and putting all theinference and the models and everythingit worked great but now we were to theactualtesting so with P's and Q's you knowthat the phrase you know mind your P'sand Q's right so we had a problem withour P's and our Q's if you're lookingfrom the camera from the front itwouldn't recognize this the P or the Qand I thought hm that's reallyinteresting that it's not being able torecognize that and I got my teamtogether and we had a long discussionabout what that was was what was causingthat and then I thought oh right theimages that it's using is showing the Pfrom the side so of course the camera isnot going to be able to recognize thatso that means now we need to add thedifferent views of the camera from frontside top back etc and so we just wedon't naturally sign it in thisdirection so we had to make the P looklike this from the front with the XYZ soit was a really interesting discoveryand I just wanted to show you why liketheauthentic authenticity matters if it's ahearing person maybe you don'trealize what a deaf person looks likewhen they're signing in naturally and soif a hearing person was doing this withthese images they may not understandthat it would work but if you turn itfacing forward as a deaf person wouldsign then that's the differencethere so the road map for the future isthe infrastructure is of course addingmore words and making it kind of acommonality um partnering with datacollection or working with people thatare all over the world just bringing inmore data that's our nextchallenge so I just wanted to end alittle bit here with a thank you if youwant to learn how to sign thank you thisis what it looks like so you're welcometo find me a little bit more about thati also wanted to shout out to mycolleagues that made this keynote andproject successful and I've got the youknow the team here we were all learningabout the technologycollaborating and really was best of allreally we required a lot of perseveranceso I just wanted to shout out to thesetwo folks Kamal John and Ali so thankyou so much for your contributions here[Applause]and then of course if you have furtherquestions and you want to contribute youwant to get involved please feel free tostop by the deaf and heart of hearingworking group kiosk in the projectpavilion in the expo space i'm happy totalk and you know reach out on LinkedInum so you know where to find me multipleavenues that way and I only I also wantto add before we closeuh there is a new mentorship programthat the TAG implementeduh within CNCF uh contributionstrategies with the mentorship programhere so under reppresented groups if youwant to be part of the mentorshipprogram please sign up thank youeverybody enjoy your CubeCon2025-04-15 21:57:27.185831 ��D#��?AUfYctUtDDfQhey everyone I'm here you know I hadthis great talk planned out for you like30 minutes worth of content and I had itall ready to go and then they said "Heyyou're the keynote." And I was like"Great except I have to cut a lot ofstuff out because I've only got 15minutes." So we'll start here with theML pipeline implementation with CubeFlowhow to streamline machine learningworkflows efficientlyso yeah you know that's a mouth well ahandful I guess I shouldsay so I have a sign languageinterpreter right here who's voicing forme and she's also got the clicker so I'mnot the oneadvancing i am Rob Cotch and I serve asa principal data engineer within Slalomi'm deaf and I use American SignLanguage i work with software dataKubernetes you name it the wholegamut so a picture's worth a thousandwords right i'm sure you've heard thatif you're an English speaker and so withsign language recogni recognition it'squite difficult what we need to succeedwith that is really we need the contextit's very spatial as a language so whenI'm signing here I'm talking about thefuture it'd be out here in front ofme if I'm talking about the present it'sright here closer to my body and if I'mtalking about the past it's behind me soyou can notice this you know machinelearning whether it can recognize thator not is difficult and then the datacomponent of that is sign language andrecognition is quite far behind comparedto voice recognition because they'reabout 20 25 years ahead of us andthey've got programs that can do speechto text fairly easily but sign languageis relatively new with that so we've gota ways to go to catch up and thenwhether we're using LLMs or not we don'tknow right we're a little bit slower toget to that point with the query promptsand waiting for that we can't wait forthat when we're doing sign languagethings so it's a different approach inthatway so this photo here you'll noticethat there are landmarks on mypalm and this is what's important for uswith the machine learning workflowyou can see here it's quite challengingwith finger spelling to be able tocapture the movement so that's why Iwanted to show you this image but I willbe showing you a demo momentarilyso withCubeFlow it's an end-to-end workflowmanagement system and we'd like to beable to implement that for sign languagerecognition and have a unified platformso that anyone could use that if youhired somebody off the street they wouldbe able to just readily available usethatplatform we want to be able to automateall of these things consistently ofcourse and then when the data is comingin we want to make sure that that isautomated um that it comes withversioning and that if something's notworking we can go back to a previousversion and really that's a big thinghere in you know the CubeCon and thecommunity here it's got a greatecosystem and we've got activedevelopment happening all the tim loadmaking troubleshooting far moreefficient and this is really theobservability shift we've been waitingfor but let's take this a step furtherobservability isn't just about betterdata it's also about making that dataaccessible and actionable and that'sreally where platform engineering comesin by applying platform engineeringprinciples we can transform obserabilityfrom an afterthought into a seamlessscalable and developer friendlyexperience developers shouldn't strugglewith complex configuration obserabilityshould be built in with open telemetryenabling all instrumentation foreffortless uh telemetry collection andopen telemetry provides vendor neutralAPIs to ensure consistent telemetrycollection its semantic conventionsuh standardizes naming across tracesmetrics and logs enabling automaticcorrelation and reducing inconsistenciesbetweenservices and rather than leaving teamsto figure out observility on their ownplatform engineers should define bestpractices workflows that make it easy toimplement obserabilitycorrectly and different services ofcourse have different uh obserabilityneeds and the open telemetry collectorum its architecture supports flexiblepipelines allowing teams to uh customizetheir telemetry uh and how they shoulduse it and of course obserility shouldbe treated as a product with supportdocumentation and continuousimprovement and a strong observabilityplatform ensures effortlessly telemetrycollection cross signal coration andintegration into existing work inexisting workflows so by embeddingobservability into developer workflowswe remove friction reduce toil andunlock realvalue but let's focus a little bit onthe automatic instrumentation part andthe self-service part platformengineering teams can give developers astrong observability foundation withoutthem touching a single line of code thisis really where the open telemetryoperator comes in it enables automaticinstrumentation uh for applications inGo Node.js .NET Core Java and Rubybut observability doesn't stop atinstrumentation we of course also needmonitoring as code this is where Percy'sanother CNCF project comes in a sandboxproject in this case and Percy's providean open dashboard uh specification andreally eliminating vendor login in thiscase so now we can use Percy's as thatstandard it enables dashboard as code soteams can define dashboards in YAML aswe all love I guess uh apply GitHubs andversion control them like any otherapplication code and this approach makesobserability an integral part of thedevelopment lifecycle so now let's see it all in actioni built two small applications a springboot application using MySQL and a Goapplication using Postgress SQL i'llwe'll deploy these services anddemonstrate all instrumentation usingthe open telemetry operator andtelemetry will be sent to Prometheus andto for metrics and JGA for tracing andfinally we'll use perses to managedashboards as code and all of this isrunning on a local K cluster so if wecan switch to thedemo so what you see hereis get pods you all know that command Iguess so this is what we have in thecluster we have the open telemetryoperator running it manages a few opentelemetrycollectors and then we have the Percy'soperator running which manages a Percyserver as you see here we have Jergerfor tracing we have Prometheus formonitoring we have the two databases Imentioned before like my SQL andPostgress and then we have the two to-doapplications it's very simpleapplications written in Go and onewritten inJava and the open telemetry operatorprovides uh a custom resource for opentelemetry collectors and I in this caseI have two open telemetry collectorsrunning one deployed as a damon set fornode level telemetry and another onedeployed as a stateful set in this casefor more clusterwide telemetry and ifyou want to learn more about this Iwrote a blog post on the D-Zero blog andyou can read more about how to configurethat um but let's have a look at the uhthe collector quickly um if you're notfamiliar it defines like differentpipelines for the different signals soin this case metrics uh we will receiveuh metrics using the OTLP protocol getsome cublet stats and then we willforward that to Prometheus and fortracing also OTLP and then we'll forwardthat data to uh to Jerger or thesetraces to Jerger we do some processingin between um and this is how we arecollecting the data then I want to showyou the application now it's a springboot application there we have it andwhat you should note here is there's nocustom instrumentation in this file atall um and not much else it's just aspring boot application it's fairlysimple and again this is the restcontroller uh handling everything nocustom instrumentation or anything likethat it's simple spring boot uh functioncalls uh for you know get post updatedeleteand we want to instrument thisapplication without touching the code atall and for that the open telemetryoperator now has this instrumentationresource it's a custom uh it's a CDcalled an instrumentation we give it aname in this case I just give gave itthe name instrumentation i put it in thename space open telemetry and then wedefine where we want to send our data soin this case it will be to the collectorI just showed you before we can do likeconfiguration of the sampler we can addlike we want UIDs for our Kubernetesresources and we can configure for thedifferent languages how we want theoperator to inject um the different uhorder instrumentation so we apply thatto the cluster and nothing reallyhappens u as you can see here it's uhyou need to opt in so we will edit thedeployment for our to-do go applicationwe need to add an an annotation to uh tothe potspec and we do that right here asyou can note I already I cheated alittle bit i put in an a line anotherannotation which is necessary for the goor instrumentation it utilizes evpfunderneath and you need to specify wherethe binary is in in this case it'scalled to-do uh at the root and then wetell the operator to inject go we willdo the same thing for our Javaapplication but without that additionaluh annotation as you can see here so wewill add the annotationhopefully there we go and then insteadof injecting go we are now injectingJava into the we want a Java we want totell the operator that this is Java andwe point to our instrumentation resourcethat I showed you before it's the namespace and the name of the actualresource and now we can see a few thingsis is happening to our applications solet's just have a a deeper dive intowhat it actually did for our Javaapplication it added a few environmentvariables as you can see on what is thethe service name but the thing you needto note here is the Java tool option itit adds this additional Java agent thatis really handling all the uh theinstrumentation for you and that comesin through an init container as you cansee up here at at the top and this thatis really what is handling all the theauto instrumentation um and for for theGo one as you can note that there's asidecar here which is the EbF sidecar ialso wrote a blog post about this uh amonth ago you can read more about thaton the desk blog as well if you'reinterested so let's try and and you knowput in some requests see it all inaction so I will just port forward thisinto my local machine and uh I will do afew requests to the actual service andit's a to-do application so I guess weneed to buy some hotel swag so maybe at-shirt so we'll just put in a few uhto-dos there we can of course see thateverything was created correctly in thedatabase so now we have a few calls forour Java service and we'll do the samething for our Go application so rememberto go by to the cloud native cornerstore or something like that and and buya hotel t-shirts orwhatever and it's the same thing sowe're moving into JGO now for ourtracing just to see that now we actuallyget data out of the box without touchingthe code you can see we'll find thetraces and now we have traces uh for forthe Go application and we'll have adeeper dive into the Javaone yeah here we go because the Javaapplication is a little bit moreinstrumented or the libraries are alittle bit better instrumented um andyou can see we have six spans we clickon a span and this is the theinformation we get out of the boxwithout touching the code we can seedurations we can see the different callsgoing on we can actually see what theapplication is doing towards thedatabase and all of these follow thesemantic conventions of course by opentelemetry so it that's quite nice thatwe can get all of this out of the boxwithout touching thecode cool then the last thing I want toshow you is Percy's um Percy's comeswith this uh data data source so we ofcourse need to define where is our dataum so for that there's this data sourceperuse data source type and in in thiscase we are really just pointing to thelocal Prometheus cluster or not clusterbut the Prometheus server that we arerunning in in in our cluster and that'swhere it can query its data we applythat into the cluster uh I should saythat Percy's is still in alpha so don'tprobably not use it in production but Ijust want to show you where where we areand and and there's a lot of people thatare like forming around this project sothe database or the data source wascreated it's a Prometheus data sourceeverything seems to work pretty well wewill clear again the last thing I wantto show you is of course a dashboard umand this is the resource it's called thePercy's dashboard sorry for all thescrolling but it's a pretty big file umbut you define like again the name it'syou put in the name space you want toput this in and then you define yourdifferent panels and the signer chartsthe the curious you want to do and thenyou can lay it all out in a grid againthat's a lot of scrolling the grid comeshere and I'm really bad at scrollingapparentlyso we lay out the different panels andwe define the grid and and that's reallyhow you you build a dashboard in Percy'sapply that to the cluster and you willsee a dashboard in the Percy's server ina second we'll go and refresh and thenwe have metrics uh which is awesome umand a lot of vendors are now alsoconforming on this as a standard for howthey're doing their dashboarding so ifwe can go back to the slides so what didwe just see we used open telemetryoperator to instrument Java and Goapplications without modifying the codethis this allowed us to capture tracesand metrics we saw how autoinstrumentation makes observabilityfrictionless enabling teams to gaininsights into their applications withoutany additionaleffort we also deployed Percy'sdashboards as code showcasing howobservability can be managed by GitHubsuh or using GitHubs principles and bytreating dashboards and monitoringconfigurations as code we ensureconsistency version control and easy uhintegration with existing workflows sothis demo highlighted that obserabilitydoesn't have to be uh complex or requiremassive engineering effort it can bebuilt into the platform making itaccessible for everyteam so observability is evolving and itis evolving quite fast these days opentelemetry is standardizing telemetrycollection so the days of fragmenteddata are numberedthere we go percy's is standardizingdashboarding and by applying platformengineering uh principles we cantransform obserability from anafterthought into a seamless scalableand developer friendlyexperience and again observability is isa systems problem not a tracing loggingor metrics problem and when we connectsignals together we empower developersto solve problemsfaster here's a few resources um beforewe wrap up uh I really encourage you togo and visit the open telemetry uh obserobservatory booth at S400 um I reallyenjoyed the uh the open telemetry courseand the certification it's a really goodway to learn about these uh technologiesand of course there's some great bookson on this as well and thank you all foryour time if you want to check this outyou can scan this or you can go into tothe GitHub repo and and try it outyourself as it should should be fairlyeasy to to get started and if you wantto continue the conversation I'll be atthe desk zero booth right after this souh come and uh say hello and then yeahenjoy the rest of CubeCon Town[Applause][Music]2025-04-15 21:57:27.728849 zz�s#��AtSBfDzStoYEhi again everyone it's truly an aprivilege to be here today especially asthis marks my final time as co-chair forCubeCon Cloud Native Con it's been anincredible journey and I couldn't thinkof a better way to wrap it up than herein London with all of you so let's talka little bit about observability foryears we relied on the three pillars ofobservability logs traces and metricsbut here's the thing we don't have ametrics problem or tracing problem wehave a systems problem and yet many ofus still treat these as separateentities we have one browser tabs forlogs one for metrics and on a third onefor tracing and we relying on humans tocorrelate signals together it'sinefficient it's errorprone and it's nothow modern obsibility should work somany teams settle for a good enoughapproach and observability is often anafterthought but it shouldn't be andthis fragmentation leads to multiplechallenges we have many differentsystems rely on complex securitylanguages making it difficult to unifyinsights the vendor plug-in restrictsflexibility and causes barriers toswitchingtools metadata inconsistency acrossplatforms results in unreliablecorrelations and due to high complexitymany teams avoid instrumentationaltogether leading to gaps invisibility and with no unified insightstroubleshooting remains slow andinefficient forcing engineers tomanually piece together informationacross disperate systems so a shift ishappening it's a shift toward coalationit's a shift toward standardization it'sa shift toward open telemetry this iswhere the community really comestogether open telemetry or hotel forshort it's a CNCF project and is now thesecond largest project within the CNCFby contributor account it has become thedef facto standard for distributedtracing and now extends to logs andmetrics as well open telemetryeliminates proprietary agents reducesmetadata fragmentation and enables crosssignal correlationand it brings four keyadvantages first instrument once writeyour instrument instrumentation once andit works across any back end withoutbreaking changes second it separatestelemetry generation from analysis opentelemetry then ensures that telemetry isproduced independently from the toolsthat are analyzing it and this allowsteams to switch platforms withoutreinstmenting while vendors focus onanalytics instead of this proprietary uhinstrumentation and third it makesobserv makes software observable bydefault open source libraries can nowship with native instrumentation with noextra effort required from your endwhich is really really nice and finallyit improves how we use telemetry bylinking signals together open telemetrydrastically uh reduces cognitive $$�Y #�kAIGK7TZPuma4well thank you Casper and thank youagain to all of our keynote speakersthis first day of CubeCon CloudNativeCon had an amazing set of sessionsbefore we head to break we have a fewannouncements for you we invite you tojoin us at the DEI community hub adedicated space to connect learn andcelebrate diversity equity inclusion andaccessibility the DEI hub offersopportunities Wednesday through Fridayto engage with community groupsparticipate in allyship and advocacyworkshops check skedd.com for moredetails and tonight don't miss out onthe Cube Crawl Plus Cloud Native Fest ina solutions showcase right after ourlast breakout session at 6:15 p.m andthat's your opportunity to network withyour fellow attendees enjoy some amazingfood and dive into fun activitiesentertainment and games yeah that'sright and whether you're aiming for ahigh score in the arcade zone taking aswing at the crazy golf or enjoying ataste of Britain there's uh somethingfor everyone and while you're theredon't miss the poster pavilion wherecarefully selected poster sessions uhshowcase innovative ideas in theecosystem we invite you to engage withpresenters vote on your favorite postertoday and tomorrow and we'll announcewinners uh on Friday during keynotes ican't wait and please also join us inthanking a few folks because withoutthem none of this would be possiblethank you to our sponsors and our Danconscholarship fund sponsors please showyour support and visit our amazingsponsors in the solution showcasethroughout the event yes and a hugethank you to all our incredible programcommittee and the track chairs who did alot of hard work in choosing all theseamazing sessions for this event thankyou for joining us for today's keynotebreakout sessions will begin at 11:15today including some great tutorialsthat you won't want to miss so pleasedon't forget to leave feedback onsketch.com and we hope you have a greatday and see you all back here tomorrowall right thank you everyone bye2025-04-15 21:57:28.461641our tenants so that doing the rightthing becomes easy and that that thesecurity and availability of ourworkloads becomes even easier to testoptimize andmaintain with all this growth it'sunsurprising that I am hiring and wehave jobs open in our China and Indiatech centers in Guangja and Pune so ifyou're interested in working at a scalethat few other organizations operate atyou can get in touch with me on LinkedIni'd be really happy to hear fromyou and finally I just want to say ahuge thank you to Chris and the team atCNCF for the opportunity to talk to youthis morning and I wish you all the verybest for CubeCon 2025 here in Londonthank youmorning everyone i am Carlo PhysicoChief Technology Officer at Pepton andthis is Fabaldi our head of engineeringuh Pepton is a bout company uh based inin Switzerland we are focused on drugingwhat are called disordered proteinsdisordered proteins are a very peculiarclass of proteins that are um can bedescribed by standard experimentaltechniques and uh in in particular uhtoday I'll try to explain you why wehave pepon focused on these proteins uhin the meantime uh these proteins are uminvolved in a vast majority of cancersand they're involved in many neuro uhdegenerative diseases so they are kindof important for the old mankindand I think we can we can move on um andmain experimental techniques fail at uhdescribing those proteins we createdthat pepon and experimental techniquefor the first time able to resolve thedynamics of those proteins and uh uh wemanaged to do that with a super highresolution uh um technique we callhydrogen dutino exchange massspectrometry and we managed to get thehigh the highest throughput in the fieldsee look at this example we generatedwith a text to video um experiment heremodel u this video so an avocado sittingon an armchair and as you can see theavocado here is empty hopefully you cansee thatum okay you can't it's a very cool videoby theway thank you thank you all[Music]i promise you when it will go up it willbe super cool to see anyways um wecreated um a new model a newfoundational model to generate what iscalled an ensemble of those proteinsi'll try I'll walk in the meantime sothat you we make it moreengaging so um this ensemble are whatthe pharma field is is lacking uhnowadays and this is something uh we arefocused on is like looking at a videoyou can appreciate how uh all thedetails you have in a video only if yousee the whole video not just a singleframe this is the same for proteins welook at the video we simulate the unoundand uh we do that with this foundationalmodel i'd like to talk about we'll seelater and uh everything has been hasbeen uh trained on DJX cloud so inpartnership with MIT Pepton Universityof Copenhagen and Nvidia DJX cloud anduh we call this model Pepron it will beout will be open source everyone will beable to use it fabio as Carlo said weare building our model on U Kuberneteswe had distributed training in mind forthe get-go partly because obviously it'sgoing to become pretty big model and itwon't be able to fit on a single GPUpartly because we want to accelerate thedevelopment cycle uh we chose uh DJcloud from Nvidia as our Kubernetesprovider they provide a very niceplatform with high performancenetworking storage distributed storageand G scheduling which obviously youneed for these kind of things and for usit was really easy to move from oursmaller development environment to um abig fullfeatured cluster basically justmeant uh putting our training script ina Docker container running it as adistributed Python training on thecluster and then just kick back to takea Thank you for the is showing up soI'll wrap it up real quick yeah justshow the avocado video so I go to theavocado indeed because this is kind ofcool um as I said this is a texttovideouh model we have this pronto an avocadositting on an armchair you see the coreis empty this is something you would notunderstand looking only at the firstframe right we at Pepton developed atechnology to do the same with proteinsgenerating what is called an ensemblestarting from an amino acid sequence andhere's the thing um if you think aboutit you don't really care if a pixel inan image uh is misplaced right well withproteins is not like that a single atommisplaced will push the energy toinfinity making the ensemble uselessthat's why we needed to uh change thenew architecture under the hood and comeup with a completely new ensembleevaluation metric and uh before thinkingabout any kind of AIdriven process weneed to see those states the red onesand we need to see them experimentallyand we need to see them for a proteinalone in solution and the presence of adrug there is a technique for doing thatit's called hydrogen dutium exchangemassspectrometry and it doesn't work forthese other proteins uh that's what yousee here everything is red it means wedon't have enough resolution well we atPepon created a new protocol to getultra high resolution and much superiorthroughput and we use this data set tocreate a we use this technique to createa data set to train pattern our ensemblegenerator yeah I mean we kind of alreadywent through this stuff but as Imentioned we uh run our training onKubernetes and well actually all alsothe entire stack that does the analysisfor the mass spectrometers runs onKubernetes as well but I think we canskip over this thank you Fabio andeverything runs on RunAI so inpartnership with as I said MATUniversity of Copenhagen and DJX cloudwe created this model this is astate-of-the-art model ladies andgentlemen this is how we make drugs andwe're pepton thank you thank youhi my name is Tyson Singer uh I'm the uhVP of technology at Platform so before Iget started uh first I want to do a bighappy birthday shout out tobackstage Spotify thankyou spotify opensourced the frameworkfor our homegrown developer portal 5years ago in March of 2020 and theproject took off from day one and it hasaccelerated from there over 3,000companies have adopted backstage tobuild their internal developer platformor IDP within Spotify it's used by over700 R&D squads every day to help us shipand maintain ourproducts before we open sourcedbackstage we'd already been using it foruh quite a few years and it was helpingus solve a whole range of developerexperience problems and issues contextswitchingfragmentation cognitive load thingsyou're all familiar with it had beenbecome invaluable to us and so we wentahead and open sourcedit we thought that other companies wouldfind it valuable too but we also hadanother motive it wasn't just aboutsharing our ambition from the very startwas for backstage to become the standardfor IDPs um but sometimes you have to becareful about what you wishfor now we were no longer building forourselves but we're also building foreveryone else and at times those thingsreally did feel like they were uhcompeting with each other so I'm goingto give you an example we had to rewriteour entire back-end system and that wasnot on our roadmap but we saw that adopters reallyneeded an easier way to build andintegrate plugins so we did the workbuilding with the community andrewriting the whole backstage backendover the course of a year and we when welaunched it not surprisingly it wasgreat for everyone everything about thenew backstage backend is simpler to workwith and that's for that applies to usas well and so now we're in the middleof completely rewriting the frontend so there's been this tension in howwe balance internal priorities withexternal ones but ultimately we believeit's a virtuous cycle if we hadn't opensourced backstage we'd likely still beliving with our previous morecomplicated back-end system so in fiveyears we've gone basically from thisvery thin uh framework for building anIDP to a better backstage than we couldhave built uh on our own and so nowwe've taken that one step further tocreate the best IDP for the futurespotify portal our SAS product built ontop of the OSS backstage all of whichproves out our original idea investingin the community has been an investmentin ourselves as well and because of thatwe expect an even brighter future forBackstage ahead so I want to thankeveryone else who's been an end user anda contributor to Backstage thanks[Music]hello everyone and welcome to CubeConand CloudNative Con London my name isKatie Gamanji and I am a senior engineerat Apple i'm also part of the TOC ortechnical oversight committee withinCNCF last year at Dubdc we introducedApple Intelligence a personalizedintelligence system that brings powerfulgenerative models to iPhone iPad and Macapple Intelligence is designed toprotect your privacy at every step todeliver this advanced securityarchitecture whenever possible AppleIntelligence processes tasks locally onthe device but more sophisticated tasksrequire additional processingpower private cloud compute or PCCallows us to scale our computationalcapacity and draw on even largerserverbased models for these complexrequeststhese models run on servers we haveespecially created using Apple siliconand offer the privacy and security ofyour iPhone from the silicon onup private cloud compute is a greatexample of how we're investing in andadopting open source technologieswe draw from the security properties ofthe Swift programming language to runsoftware with transparency built in andleverage the high performance andresilience of gRPC for thetransportationlayer grpc is a mature CNCF project withmore than eight years of upstreamdevelopment it is a widely knownframework that can efficiently connectservices across distributedinfrastructure and within private cloudcompute we heavily leverage the buildingblocks of gRPC for the transportationlayer birectional streaming is a corepart of gRPC that we use to communicateload information between the PCC gatewayand a single Apple siliconserver since privacy and security is atthe root of our components we ensure theresponse payloads are encrypted by theuser device and rooted using gRPC to thePCC gateway all the way to Apple siliconmachines and finally GRPC has a veryrobust livveness probing that is highlyconfigurable as JRPC controls itsunderlying transport we lean into itsresilience to ensure it is capable ofperforming theseprobes from a technical standpoint someof the libraries we use are gRPC Swiftand Swift Probab which in simple termsare the Swift implementation for gRPCand ProBuffand at Apple we've written all kinds ofservices in Swift from iCloud Keychainto App Store processing pipelines andSharePlay filesharing and when building out a servicethat supports Apple commitments to userprivacy Swift was a natural choice forseveralreasons from a performance standpointSwift has a very low memory footprintthat keeps the system resource usagedown this enables us to make the mostefficient use of our hardware forinference withinPCC we're also able to leverage ourexisting investment in memory safeprotocolimplementations by using Swift librariesthroughout the stack we minimize theamount of unsafe code parsing untrusteddata shrinking our attack surface thesame policy was applied throughout theentire codebaseand finally we spent a number of yearscollaborating with the community tobuild an open-source Swift serverecosystem that has great solutions forprivacy orientedservices private cloud compute is agreat example of how we're bringingtogether two open source ecosystems anda decade of innovation high resilienceand performance of gRPC with thesecurity properties of Swift enables usto bring industry-leading devicesecurity models into the cloud we'relooking forward to continue ourinvestment in these communities and makeSwift a platform for the cloud andserver developmentif you'd like to learn more about ouruse cases in the cloud native space andhow we engage with the wider ecosystemyou can attend one of the 10 other talksdelivered by Apple engineers at thisconference and to explore how Swift canbe a fast modern and safe language foryour server you can checkswift.org/server for more details and ifyou'd like to learn more about privatecloud compute you can go tosecurity.apple.comthis is Katie Gamanji and I look forwardto seeing how you can shape the opensource ecosystem thank you and enjoy therest of the conference[Applause]2025-04-15 21:57:28.939783 �O#��UAJqG1wey7-Aogood morning everyone hs HSBC embarkedon our Kubernetes journey back in 2018and since then we've now run to runninghundreds of clusters supportingworkloads for markets in the UK ChinaHong Kong Mexico and many others as wellas our retail wholesale andinstitutional business running onpremises and in the major hyperscalerstoday I want to talk about the retailbanks microservices API solution theservice hosting platform this startedlife as an experiment in the publiccloud adoption team to see if ourexisting on premises platform could bemigrated to cloud and by 2019 our firstthree clusters were hosting liveworkloads and we were ready to start onour migrationproject we're currently servicing around600 million discrete hits a day to ourplatform which is running at the moment7,000 production services with many morestill to migrate from our legacyplatform all running across only a dozenor so clustersas you can see we scaled up very quicklyand we soon encountered the sorts ofproblems that you only see in smalleryou don't see in smaller clusters ratherthan waiting to become a victim of ourown success we had to shard ourworkloads based across individualmarkets an approach that also helps usto manage our blast radius alongbusiness lines our clusters are verystable thankfully but change can stillbe a bit of a pain problem so tomitigate these issues during upgrades werun our clusters in a blue greenconfiguration rehydrating the newcluster from backups and only cuttingover when we're confident that the newcluster is stable at the scale weoperate at cost is also a huge challengeand we've had to develop ways of holdingteams accountable for their spendinggiving them the tools they need tomanage the financial footprint of eachworkload responsibility is shared soit's our responsibility as a platform tomake sure the core service is optimizedand the onus however remains with thetenant to keep a tight reign on costs byprovisioning elastically based on demandand only using what theyneed the industry trend at the moment isvery much towards a more democratic anddevolved tenency model with smallerclusters allowing engineering teams theflexibility that they demand in our casehowever we work in a highly regulatedorganization with hundreds of ITcontrols that we need to satisfy butwith a high degree of consistency acrossour workloads that we support a morecentralized model still makes sense forus our product roadmap is ambitious andit takes us several years out into thefuture we're looking at availability andresilience because even in commodityclouds we can experience failure we planto move to a hot hot cluster model whichmeans storing our reference data frometc outside of the cluster and adoptinga more declarative githopsbased modelwe're also looking at adopting ambientmode istto with the help of our partnersfrom solo IO with a view to saving maybehundreds of thousands of dollars off ourannual cloud bill and we're looking atimproving our shared responsibilitymodel providing better guardrails for ''�V#�eA7GQRyAxPa9ghello everyone and welcome to cucon andCloud native con London my name is Katieganji and I am a senior engineer atApple I'm also part of the to ortechnical oversight committee withincncf last year at wwc we introducedApple intelligence a personalizedintelligence system that brings powerfulgenerative models to iPhone iPad and macApple intelligence is designed toprotect your privacy at every step todeliver this Advanced securityarchitecture whenever possible Appleintelligence processes tasks locally onthe device but more sophisticated tasksrequire additional processingpower private Cloud compute or PCCallows us to scale our computationalcapacity and draw on even larger surveybased models for these complexrequests these models run on servers wehave especially created using Applesilicon and offer the privacy andsecurity of your iPhone from the Silicononapp private Cloud compute is a greatexample of how we investing gain andadopting open sourceTechnologies we draw from the securityproperties of the Swift programminglanguage to run software withtransparency built in and leverage thehigh performance and resilience of grpcfor the transportationlayer JPC is a mature cncf project withmore than eight years of Upstreamdevelopment ment it is a widely knownframework that can efficiently connectServices across distributedinfrastructure and within private Cloudcompute we heavily leverage the buildingblocks of grpc for the transportationlayer bidirectional streaming is a corepart of grpc that we use to communicateload information between the PCC Gatewayand a single Apple siliconserver since privacy and security is atthe root of our components we ensure theresp response payloads are encrypted bythe user device and rooted using JPC tothe PCC Gateway all the way to Applesiliconmachines and finally JPC has a veryrobust liveness proving that is highlyconfigurable as JPC controls itsunderlying transport we lean into itsresilience to ensure it is capable ofPerforming theseprobes from a technical standpoint someof the libraries we use are grpc Swiftand Swift protuff which in simple termsare the Swift impl M mentation for grpcandprabu and at Apple we've written allkinds of services in SED from iCloudkeychain to up store processingpipelines and shareplay filesharing and when building out a servicethat supports Apple commitments to userprivacy Swift was a natural choice forseveralreasons from a performance standpointSwift has a very low memory footprintthat keeps the system resource usagedown this enables us to make the themost efficient use of our hardware forinference withinPCC we're also able to leverage ourexisting investment in memory safeprotocolimplementations by using Swift librariesthroughout the stack we minimize theamount of unsafe code parsing untrusteddata shrinking our attack surface thesame policy was applied throughout theentire codebase and finally we spent a number ofyears collaborating with the communityto build an open sour Swift serverecosystem that has great solutions forprivacy orientedServices private Cloud compute is agreat example of how we're bringingtogether two open source ecosystems anda decade ofinnovation High resilience andperformance of grpc with the securityproperties of Swift enables us to bringindustry-leading device security modelsinto the cloud we're looking forward tocontinue our investment in thesecommunities and make Swift a platformroom for the cloud and serverdevelopment if you'd like to learn moreabout our use cases in the cloud nativespace and how we engage with the widerecosystem you can attend one of the 10avat talks delivered by Apple engineersat thisconference and to explore how swift canbe a fast modern and safe language foryour server you can check swift. orgsserver for more details and if you'dlike to learn more about private Cloudcompute you can go to security.apple.comthis is Katie ganji and I look forwardto seeing how you can shape the opensource ecosystem thank you and enjoy therest of the conference[Applause]2025-04-15 21:57:29.464832$ow on uh adopting them assessing themand so on so please check it out it'sfilled with a lot of detailed and funfun information um another thing is alot of our end users also come fromuniversities and academic environmentsand uh we have formalized a academicaccredititation program for uh academicinstitutions and universities out thereto basically help work with them toimprove their curriculum because look uhcloudnative technology uh it's it'ssometimes a lot to learn especially whenyou're a university student and it isimportant and critical for us that weensure that educational material uh outthere and the next generationessentially of cloudnative developers isproperly prepared so if you represent auh university we would love to kind ofget you involved you can learn more uhabout that program by scanning that umQR code so the final thing before uh Ibring out our uh lovely uh end user uhawards is um you know we we had ourmaintainer summit earlier this week andwe always kind of work with ourmaintainer community we talk to our enduser members and uh recruiting andfinding people is is always difficult umyou know we essentially have uh talkedto enough maintainers where you knowthey like to work at companies that haveopportunities for them to alsocontribute back to our projects topotentially spend some time working onopen source and you know we had this oldkind of uh job board that we've uh youknow created a while back and we decidedto kind of update that and put basicallysomething out there that our communitycould use both uh developers andcompanies uh and end users that arelooking to hire cloudnative uh talent soum we have this new board in beta whichyou can kind of go look at and if you'reboth a maintainer looking for a job theycould go work on open source for exampleyou go see this job posting uh from CERNwho is hiring a Kubernetes engineerwhere they're going to go spend a 100%of their time working on open sourcetechnology and more importantly 25% ofthe time is going to be allocated tocontribute back uh upstream toKubernetes which is super cool so thisis bit of a beta for us we want to workwith all y'all to improve this so pleasego check this out so with uh theseannouncements uh kind of uh done on theend user side uh I'd like to kind ofbring up some special some specialguests so uh earlier uh you know youknow this year we we've kind of made aspecial announcement about a kind of neworganization joining forces with theLinux Foundation that both the CNCF hasworked on for a long time and a lot ofour end user companies out there havebeen using a mix of technologies so uhwithout me kind of rather you knowspoiling things I rather have uh uh anend user come here talk a little bitabout it and also the organizationitself so um uh I will uh stop now and Iwill go invite uh Ricardo uh from CERNwho many of you know and then JonathanBryce from the open infrastructurefoundation talk a little bit about uhopen infra joining Linux foundation andkind of why this is super cool for bothend users and the wider open sourcecommunity so uh Jonathan and Ricardoplease come on out[Music][Applause]well thank you very much Chris uh thanksfor the welcome uh he he dropped thenews here the Open Infer Foundation isgoing to be joining forces with theLinux Foundation i'm super excited aboutthis for a lot of reasons uh we are at apoint where there's going to be over atrillion dollars building outinfrastructure and I think that this isan amazing opportunity for all of ouropen source communities to takeadvantage of this and make sure thatthis infrastructure is built with opensource tools and I think that it's alsoreally great because it reflects what wehave seen from many many end users whoare already bridging across communitiesand projects to build out greatinfrastructure in financial services intech in AI and also in science so I'mI'm proud to be here today with Ricardofrom CERN cern has been a longtime userof many many open source projects uhthey're a previous open infra super useraward winner a previous CNCF end useraward winner and uh just I thinkrepresent%s some of what uh what whathappens when we really collaborate justin the biggest possible wayyeah so I'm really happy to be here aswell so I'll quickly go through thispicture so this is actually happenedduring the OpenStack summit uh in 2018in Vancouver so CERN has been uh alwaystrying to scale out our infrastructureand do more with the the tools we haveso we had introduced the OpenStack backin 2012 we had a large infrastructureand by 2016 we became uh Kubernetesusers and we had Kubernetes inproduction by 2018 we understood that uhjoining the two would allow us to expandour capacity and be able to do more withthe same budget so in this in this uh uhsession I actually tried my first livedemo at the time mark in the picturethere tried to jinx it but he didn'tmanage so we had a small hiccup butactually the demo then actually workedso this uh allowed us to then uhunderstand what we could do with thesetools and this is was the message wewanted to pass uh in 2018 as well sothis is a picture of the Atlas uh uh uhdetector at CERN where we collide beamsof protons but in reality we produce alot of data so to analyze all this datawe have a large infrastructure you cansee there more than 10,000 physicalservers more than half a million coresand a lot of Kubernetes infrastructureon top of it you can see over almost 600clusters tens of thousands ofapplications and if you count thescientific computing this is even largerso CERN has a long tradition of needingthis kind of resources if we look backin the 80s we actually used to even haveour own networking stack it was calledCERNet i found a plug once uh next to myoffice i was very curious what it whatit was but then we started using morestandards with TCP IP in the 90s we hadour own analysis libraries we alsostarted using other things in the early2000s we end we started talking aboutpabytes of data and this is where thingsstarted getting a bit trickier westarted using uh standard uhdistributions for the OS like uh Linuxwe had a great keynote from Gregyesterday and then but by then we werestill by ourselves uh pabytes of data Idon't know raise your hand if you hadpabytes of data in the early2000s not so many so we started lookingwe we have to find something here andthen we built our own custominfrastructure but then the cloud cameso the early 2010s really opened up theopportunity to start building so muchin-house and participate in the muchlarger communities like OpenStack andthe CNCF later so we really understoodthis was the way to go uh since thenwe've been participating very largely inthis early 2010s we talked abouthundreds of pabytes of data what we knownow is that we have have to scale this10 times so what's coming next meansexabytes of data and we know we cannotdo this by ourselves we know we need allof you and all the communities to gettogether and join forces so really happyto see this happening and really lookingforward to what's coming next so thankyou very much thank youall right it's super cool to hear from anew sister foundation and obviously youknow CERN has been a longtimecontributor to open source and has useda variety of different uh technologiesso I think we're stronger together underunder one roof so to kind of cap um thenext portion of our end user program umone of my more favorite parts of uh youknow our our kind of CubeCon is when wego and uh acknowledge and recognize uhamazing end users who have done reallycool things and kind of shared it withthe wider community so we have thisend-user award program that we've beendoing for a while and um you know beforewe go uh too much into those uh uh youknow ceremonial things um I'd like tointroduce uh a new member of the CNCFstaff who is focused on cultivating andrunning our enduser efforts so umwithout further ado I'll go introduce uhBrian Douglas who is a new uh teammember on CNCF and he'll talk a littlebit more about uh himself and the enduser program so let's uh have Brian comeup on stageherewelcome hello hey folks uh I'm BrianDouglas my friends call me Buggy on theinternet uh I actually know a lot offolks uh from my previous life uh atGitHub and other startups so I am thenew head of ecosystems for the CNCF i Iwork with enduser companies to help gettheir stories and also to help themengage in programs uh so everything youknow need to know about me is at thatURL um but one thing I do want tomention is I host a podcast uh I host apodcast in a little small city outsideof uh San Francisco called Emeryville uhand it's a building the studio isactually across the street from thissmall company inside of a bigger mousecompany called Pixar and uh I find itamazing because this the podcast weactually talk about open sourcemaintainer success stories uh it'scalled the secret sauce and I enjoy itbecause I I get to learn how people gotto where they are and what's interestingabout Pixar being across the street isone of the first early success storiesof open source happened at Pixar in 1987uh engineer by the name of Bruce Perfound a bug in his software decided toshare that bug uh the solution to thatbug on an email message thread and thatwas the first success story of opensource uh and I enjoyed that storybecause Bruce went on to work on this uhfilm this blockbuster called Toy Storyuh and what's great about that is Brucetook a toy of fixing a little bug andnow he gets to share the story publiclyforever in perpetuity so speaking ofstoriesI I have the benefit to share a newstory uh with our top end user award uhso could you guys um humor me by givingme a drum roll can you tap on your yourknees excellent so our top end useraward for this year at CubeCon EUis Richard Bang please comeout hello good morning Kubon how are youall doing today my name is RichardSikben it's my honor here to present mycompany and Group and all the smartengineers behind the scene who actuallymaking this moment a reality so let'shave a raise of hand how many of youhave actually heard aboutEndgroup thank you in fact I mean I'mnot surprised because people actuallyknow us more by knowing our products andservices rather than the company name umit's about Alip pay world first my bankand all the financial businesses uhwhich actually making our company asuccessful one our slogan is to make isto bring small and beautiful changes tothe world and that's what what that'swhat we believe in and that's why thename of ant matters so the technologybehind the scene is actually what makeall of this a successful story and weare firm believer of open technology sofirst we are actually a very seriouslarge scale and cloudnative end user uhwe use Kubernetes and we actually builtone of the largest Kubernetes singlenode clusters on on the planet and we'reactually using Harbor Prometheus ArgoDragonfly EST and all of these cloudnativetechnologies furthermore we're actuallya developer community player all ofthese names we're seeing here are ourdevelopers which are activelycontributing on GitHub to our cloudnative technologies so if you see themin the community say hi and wave and wereally love working with all of you herelast but not least through our work withthe community in the past couple yearswe actually began realizing that thereare certain aspects of our usagescenarios which can be really beneficialthat's how we begin becoming activecontributor and project innovators tooso we have a couple of uh labels here aswe can see Dragonfly and Nidus um areessentially the project we're currentlyworking on the incubation stage lastyear we contributed two projects KCL andcushion stack into our CNCF family andwe're also currently working veryintensively with our CI model formatspecification as well and if you want toconnect more with our companiesapparently there are still a lot of youwho actually doesn't know about thecompany uh feel free to connect withthrough our websites and if you want toactually reach out and have a directconversation here's our ospo email andmy personal link in super lookingforward meeting with all of youexcellent thank you so much Richard takeapicture excellent all right appreciateit thank you so much for and thanksthank you so much very nice to be here[Applause]2025-04-15 21:57:30.133009 ��)#�� AK3edF36HWYUso um you heard uh from a bunch of endusers that were all using uh Kubernetesand cloud native technologies we had auh we we had a bi we had we had a bankfinancial services company we had abiioarma company we had a streaming uhmusic company uh we also had a you knowgenerational company like like Applethat does a bunch of things and theseare all end users that we call in CNCFthat are using cloudnative technology invery interesting ways when we startedCNCF we always wanted to ensure that endusers would have a formal voice in theorganization represent representation onour board representation on ourtechnical board and we have a formalprogram within the CNCF for theseorganizations to come join contributeand learn from each other so uh you goscan that QR code but it is a greatprogram to really you know learn aboutcloud of technology help build out yourteams uh recruit you know great greatfolks from our community so definitelyrecommend checking out we have over 150organizations uh here that arerepresented within the CNCF uh you knowend user community and you heard fromsome of them today um we also have anenduser technical advisory board thatconsists of a lot of these folks fromthese different companies and uh theyhave folks from the New York Times AppleLockheed Martin Boeing JP Morgan in itAdobe Toyota Black Rockck and CERN theseare all companies that are workingtogether with our technical board andmaintainer community kind of helpimprove uh our projects for end users intheir kind of unique industries thatthey uh work in so highly recommend uhparticipating in the end user communitysometimes the end user community getstogether and helps produce reports andresources available and today I'm happyto announce the latest uh tech radarfrom the end user community is releasedthat was been focusing on developerexperience and um uh observability orAli uh for sure for short so if you gocheck out the QR code you can get anin-depth report on kind of what opensource projects and technologies thatour end users have been using theirexperiences and recommendations uh youkn# __�#�uAlFaSEevdZvUso hello everyone uh I'm arop this isGabrielle kenes we are from michancompany leader TImanufacturer and uh we are part of coneras a service uh platform on TT which isresponsible to deliver kubernetesplatform to our internal project teamsand we are here to share with you how wescale cloud and un Prem uhinfrastructure while cutting costsso first these are some uh metrics uhabout kubernetes footprint in michan sowe um manage 62 clusters in 42 differentlocations many due to deployment ofcluster in each of our Factory and wehave 450 business application deployunits uh with um 36,000PS so basically our kis jour startedback in 2018 and we've gone throughseveral iteration of our platform sincethen and back in 203 2023 sorry um wewere using a vendor based solution atthis time uh two stream of eventconverge the first is that uh our vendorum switch its strategy and basically theproduct that we were using up until nowwill no longer be available and it turnit would force us into a clustermigration because the vendor will notprovide us with a migration pass fromone product to this new product thesecond is that Michin matured on itsopen source strategy and around thistime Michin created its open sourceproject office and finally Michin alwayshad an ambition to uh make itSpecialists within the company Thriveand in this regard uh opting for a makeinstead of BU strategy can go a long wayso at the end of 2023 we we did decidedto rebuild our platform from the grounddump only only using only open-sourcesoftware which AR will now introduce youto so this is a global picture theimplemented solution based on opensourceproject uh we can see in the middle ofthe picture the cluster II managementcluster uh which is used to manage workCLclusters uh in this management clusterwe have deed several uh components sothe first one is Aro CD to make um gupsat scale uh we also use cluster andproviders to manage differentinfrastructure for work cluster lifecycle we have a crossplane uh componentto um under prerequisits forinfrastructure and some um add-onsDeploy on these clusters and we have acustomcomponent which is U here to make thelink between cluster RPI cluster objectand uh CD cluster secret object form onthe bottom of the screen we can see thatwe have several repositories so thefirst one there are G repositories thefirst one is uh used to store clusterinventories cluster definitions andtheir addons we also have an AGDconfiguration repository to storeapplication sets application sets areAro CD object used to dynamically createapplication based on cluster labels andwe have also uh some mchart repositoriesto store several addons so gatekeeperfor policies celum for cni and custom shalso and uh finally we have the workloadcluster on severalinfrastructures uh which will be uh builand containing all the servicesnecessary to be de delivered to ourinternal projectteams so the impact were mostly pospositive as you can guessthe first we want to mention is that theengineering cost for our platform wentdown 44% year to year uh and this isincluding the I Engineer uh we had ontop of the already existing team uhanother metric which chose for you isthe what we call the upgrade Lan time itcould take us up to months to integratethe vendor solution back into ourecosystem and it's now a matter of weekto do the same work uh for uh and we canuse um Junior Engineers to do so becausethe project Pro the process is much morestreamlined and finally while do whiledo exact uh our kubernetes footprintstill grew about 100% while we wereactually working on this project andfinally and and not the least of it uhplatform engagement the engineerengagement was uh is now much betterbecause it's very much more interestingfor us to work on uh open sourcesoftware instead of vendor basedsoftware so um basically it's time forme to wrap it up we have uh an articleon the blog post up for you to read anduh time for you to thank you uh for yourattention[Applause]2025-04-15 21:57:30.742921ed to move tosmaller models smaller models meansorchestration and we also want agentsthat are interacting with the variousparts of the system which is distributedcomputing so if you squint this shouldlook really quite familiar this shouldlook a lot like the same kinds ofpatterns that we've already solved withKubernetes which is awesome but there'salways abutt these systems have an awful lot ofstate the model the paradigm that we'vebeen working with for cloud native isthat stateless isgood ai is the opposite ai is all aboutstate so the question is how do weevolve how do we shift that paradigm tomake Kubernetes AInative so let's not forget thatKubernetes already provides very strongprimitives for us to managestate we have persistent volumes uhwhich provides durable storage we've gotstateful sets that provides us with astable identity and access to storagefor pots and we also have demons setthat provide us with a consistent way ofdeploying pod across node and to someextent that allows us to have kind ofnode level statemanagement we can also tap on thegrowing and thriving ecosystem for dataon kubernetesi would mention projects such as VESTwhich provides you know cloud nativedistributed MySQL atscale or projects like Rook whichprovide us for an orchestration forstorage systems such asSE we also have been seeing a growinginterest in event-driven architectureand projects such as K native or streamywhich enable capka and kubernetes havehelped us to build dynamically linkedapplication andmicroservices but AI workload operate ata totally different scale if you look atstate management AI agent just don'tstore state they share they modify andthey react to data in a very dynamic andhigh volume and high throughputmanner so the current construct that wehave they need to evolve and they needto evolve because we are already dealingwith huge issue at large scale we dealwith issues such as data localitysynchronization and performancebottlenecks so going forward we need tostart looking at how to make uhscheduling AI native in Kubernetes andhow to understand state managementbetter ai workload are very intensive interms of memory and compute consumptionso we can leverage Numa aware AIscheduling to deal with the memoryaspects we can also use GPU and topologyaware AI scheduling to make sure that weuse our compute and our access toaccelerators betterwe also have LLM gateway that willprovide us with a efficient way to routeinference request to those very highconsuming large language model that weare all trying to use but the biggestchallenge we have is we need a fully newmodel for fall tolerant management whyisthat ai agents don't fail in atraditional way they may fail byproducing a wrong output and we need tobe able to recover from this type ofissue in a distributed system so goingforward we are going to have to dealwith a better way to manage state atscale for those highly distributedsystems thankfully CNCF projects alreadyprovide us with a strong foundation todo this q is an extremely interestingworkload management capability onKubernetes we have envoy AI gateway kerand vlm that provide us with a scalableinference management for geni model andwe've got ander agent that help us buildhighly distributed intelligentarchitecture and systems finally opentelemetry is a very very big uhdependency for us when we are startingto look at how to correlateobservability across dynamic AI agentsand you can read about the progress thatthe CNCF AI working group has madethere's a really good white paper umshowing that there's lots of progressalready because we we've got such astrong foundation we've got such anamazing ecosystem and community we canlead this shift towards decentralized AIbecause AI is going to change cloudnative and it's going to move tointelligent state aware but we can buildthis together so if you'd like to hearmore Vincent has a talk tomorrow aboutAI on Cube we've also got a live Red Hatdemo this afternoon at 3:30 in the demotheater and then of course do come seeus at the Red Hat booth thank you verymuch thank you2025-04-15 21:57:31.251558 � ��c#�AW_EF1HnP4tUhello Cucon two and a half years ago ourindustry changed fundamentally chat GPTbrought AI into everyorganization next generation modelsingested more and more and more and moreof humanity's knowledge and got biggerandbigger and the way we interacted withthese models got bigger as well we movedfrom just chat to multimodal AIwe also got bigger in how we interactedwith them instead of just having smallsnippets of text we had enormous volumesof contextstate but state has weight all of thisdata in one bigmonolith wasn't sustainable and itwasn't scalable so we had to break upthe monolith again we ne(+confidence and and get over the impostersyndrome to think yeah I could stand forthe TOC and then I ended up being on theTOC for for three years so that was ahuge influence on my career i also justwant to say welcome everyone to Londonthis is my home city so you're verywelcome[Applause]so now really looking back now and kindof like looking at what's happened overthis last 10 years you know there's somehighlights of some key milestones butthere's there's a lot there and I reallykind of want to see that from both yourperspective what what do you considersome of the you know things that wereturning points for the CNCF over thislast decadeso this is something that I'm actuallytalking quite a lot about and one of thepivotal moments within our ecosystem wasthe emergence of interfaces when wecreated these interfaces for runtime andnetworking components now what itactually did is introducedstandardization within our ecosystem butat the same time embraced theinterperability principle which meansthat we embrace multiple solutions forthe same problem space now thesetranslates to innovation for the vendorcommunity because they'll be able tobuild on top of the existingintegrations and plugins as well as forthe end user community this translatesinto extensibility because you havemultiple tools you'll be able to choosefrom and benchmark and actually choosethe tool that fits your pre-existingplatformrequirements i think for me if I lookback to the sort of 10 years of historyif you remember at the beginning of thatit wasn't obvious who was going to winthe orchestrator awards we had severaldifferent options for orchestratingcontainerized applications and I thinkKubernetes came out as the the standardreally because of the foundation andbecause of the the vendor neutralitythat meant this project could be adoptedby lots of different cloud providers andlots of different end users and everyonehaving the sort of confidence that aneutral foundation can give to an opensource project yeah no I I I think thinkas well and even even before the CNCFstarted we already saw like the need forspecification when we talked about thatlike OCI for example and really tryingto drive so that we could have alignmentand you're right Liz I think in thattime between like 2015 2016 I felt likethere was some new scheduler coming outand we were just kind of going back andforth i think even in one role I had Ichose Kubernetes over something else andI lost my job because I chose the wrongscheduling in their opinion but I'm gladI chose this one but amazingly though wewe did see that there was a lot ofthings in that and and I think some ofthe other things that I think wereimpactful is even like the governanceand you know like seeing some of thechanges that happened the TOC played akey part and even when it add membersall of a sudden exponentially we sawproject growth start to happen so manythings that really were helpful here butnot everything was easy there was somechallenges there as well you know somethings whether it's technicalorganizational um you know evencommunity related and so how did theCNCF maybe overcome or is theresomething we can learn from these thingsI think one of the really importantthings that again was intentional rightfrom the beginning was this idea thatyou referred to of not having a singleopinionated solution but havingcomponents that can interoperate witheach other and there is no singlesolution to solve everybody's theseproblems everybody needs this kind ofcomposable architectures so the downsideof that is we now have the landscapethat's pretty difficult to navigatebecause there are a lot of choices butthe good thing is there is probably aproject out there to solve or get veryclose to solving the problems that endusers are encountering as they deploytheir environments yeah and I you knowand I was also thinking of one otherexample like we look at kind of likewhat open telemetry today is startedfrom two projects you know at both we'rehaving some you know like having themboth align on specs and but eventuallygetting them to agree and getting tofor,m like look at how that explanationwent up but it was challenging at thattime took some time for that to developand now we're here again this pointwhere it's about ready to graduate andso we get the benefit end users get thebenefit as well so you know as we lookforward and look at kind of like howthis community is growing you know whatdo you see as the most potential overthis next decade for like the CNCFproject adoption to be more accessibleand inclusivenow this is an area that we have beenexploring quite a lot within the TOC aswell and there we're actually taking adifferent approach instead of lookingfor trends we're actually looking forgaps within our ecosystem that we wouldlike to surface to the community andhopefully see more contributions to orinnovation within that area as well andsome of the I'm going to mention threemain domains that we would like to atleast showcase with uh with ourcommunity now in terms of the potentialfor the growth within our ecosystemwe're looking into multiclustermanagement and observability this isstill a challenge especially if you havea cross provider strategy and if youhave to individually configure and scalethese clusters and also surfaceobservability across a cross providerstrategy the second domain we'd like tohighlight is around cost spending andsustainability as we have an increasedadoption of cloudnative architecture wealso have an increased focus on costspending as well now we are equallyresponsible for managing our cost but atthe same time our carbon footprint andwe're looking to have more collaborationbetween uh these two groups within ourecosystem and the final gap that I'dlike to mention here is around toolingaround infrastructure provisioning andsecret management this has been a gapwithin our ecosystem for a long time andwe'd like to surface it again to seemore innovation and contributions tothis spacei think that the enduser community asit's grown is able to give us so muchfeedback and so much information aboutwhat those gaps are what things areworking what things aren't working notjust to the projects but also to the tothe vendors and to each other tounderstand how to build systems thatwork so I'm really excited aboutincreased involvement from end users andbuilding things like referencearchitectures and uh the end usertechnology radar reports it's reallyimportant as a end user technicaladvisory board member I 100% agree thoseare two great artifacts if you have nottaken a look at them uh Adobe we haveput something up there a few othercompanies as well and we're reallytrying to help you know help kind ofguides and building taking some of thebuilding blocks and bring creatingsolutions that really work for your yourenvironments and so with with that likewhat guidance would you provide projectsthat are applying trying to get intolike that CNCF development stages wellagain within the TUC this is area thatwe've been focusing quite a lot in thelatest years we try to streamline ourprocesses around sandbox inclusion andproject moving levels to incubation andgraduation and as part of this effort weare aiming to be as declarative aspossible and provide clear guidance forthe projects and maintainers to navigatethe ecosystem all the way to the maximummaturity level possible so if you'd liketo apply or be a project within CNCF Idefinitely encourage you to check ourTOC repository it has a wealth ofinformation and look at our backlog it'spublicly available we also have ourongoing work that you can check but moreimportantly you can look at our closedissues which are successful case studiesof how we completed previous due todiligence around incubation andgraduation projects sandbox inclusionhealth assessment issues or any kind oftopic that was brought to us um uh tothe TOC to oversee and another thing Iwould like to mention is if you find anyof the criteria that is not declarativeenough or not clear enough or if youfound a gap that we haven't coveredwithin our evaluation open an issue onthe TSC repository so we'll be able toaddress it accordinglyi would say before you even check outthose processes you need to be reallyintentional about why you're donating aproject in the first placeif you think that just putting a projectinto the CNCF is going to magicallyprovide you with contributors and usersand a adoption community it's not theCNCF can amplify the work that you putin but you have to put in work as aprojecti would also say that a lot of projectscome from vendors and which is thatthat's totally fine that's exactly rightbut as you contribute your project youhave to be sure that the interests ofyour business and the interests of theproject are aligned if they're not it'snot going to be successful so reallythink carefully about the future of yourproject and how that's going to impactyour users how you're going to build thecommunity around that project and howthat's going to hopefully positivelyimpact your business but don't justexpect that magic will happen justbecause you joined the CNCFwell I want to focus on one last thingand that's really I think one of themost impactful things and that's thecommunity and you know the CNCF hasgrown rapidly in memberships you knowand that takes a lot of resource youknow to be able to sustain these thingsyou know this also means that we need tohave maintainers and make sure thatthey're stay in a healthy state so youknow what what do you think to help isneeded to help us to keep thriving heresustainability of our community issomething that I'm focusing quite a lotand speaking quite a lot about becausethe technology ology is thegravitational point of our ecosystemhowever none of it would be possiblewithout the people around it so it'svery important for us to scale as acommunity at the same pace withtechnology evolution as well and as partof this goal within the TOC again wehave multiple initiatives aiming to helpus grow as a community as well and oneof the initiative I'd like to mention iscalled the tag reboot which aims torevise the or restructure the currenttags and set up for success for the nextdecade of cloud native some of the workthat we're doing here is to reduce theamount of tags from 8 to five and we'realso introducing the notion of communitygroups sub projects and initiatives weaim to have a greater focus and actuallymore closer collaboration with the tagsand the projects within the ecosystemand the last thing I'm going to mentionis do check the issue 1527 i know it byheart by now but it's it's a good issuewhich has a lot of discussion aroundthis topic and more importantly afterCubeCon we're going to open theelections for the new tag cheers andleads so check out the issue check allof the discussions around it and if youidentify that any of your expertisealigns with the new tags that we'regoing to bring up do apply doself-nominate and nominate someone elsefrom your peer going back to Dan Kau besomeone who is enabling someone withinyour community so that's some actionitems for you after CubeCon yeah I woulddefinitely encourage people to look forroles where they can bring theirexpertise and learn uh you can gain anawful lot by contributing time intothings like the tags or contributing toprojects so there's tons ofopportunities talk to your managersabout how great an opportunity that'sgoing to be for your career uh I thinkthe other really important thing that weall need to bear in mind as a communityis this this is a successful ecosystembecause it balances what projects needwhat individual maintainers need whatvendors need what end users need and weneed to be kind of conscious of all ofthese different stakeholders bit of ajargony world but uh everyone has a roleto play in this ecosystem and we need tokeep that perfect balance yeah so Ithink this has been uh incredible tohear both of you i I made a referencehere to Sarah Noatne you've not seenthis talk from 2017 she provided anarchitecture for community that is verycritical i really want to just say thankall of you for your all yourcontributions for this great 10 yearslet's keep driving forward innovationand let's have a happy C10 CF have agreat CubeCon CloudNative Con thank you[Music]2025-04-15 21:57:31.805155 ��H#��GArACTrbTnFqYwell by way of introductions I have twofriends with me from the community ihave Liz Rice the chief open sourceofficer of Cisco at IsoValent Cisco correction and KatieJamanji from Apple and field sen fieldsenior field engineer that's right helloeveryone well thank you both for comingwith me today on stage and you know I'mthat individual when like big events ormilestones come up you know I'm alwaysthat one that likes to dig into thatdigital photo album which means theWayback Machine YouTube and I was atthis event the Kubernetes launch and Ithink my brain was so into Kubernetes atthat time I wasn't really payingattention to some of those initial talksbut there was some seeds of foundationthat were happening there that reallywere the foundational principles for theCNCF over this next 10 years and thefirst talk was by Craig McClucky andwhat was interesting at that was he wasreally kind of given the backstory aboutwhy they were giving Kubernetes and whatit needed and he talked about there wasa need for a foundation and the goal wasthat they wanted to move everything tocloudnative computing and that it wouldreally be about not just a few bigplayers but it'd be the whole industryand this far this foundation wouldreally focus on like a harmonized set oftechnologies providing referencearchitectures looking forinteroperability and hoping to fill gapsthat may be in this ecosystem and reallycreate a safe space for the industry tobe able to engage in the following talkthen was when Jim Zlin came out and Ithink here was really like some of thereal guidelines that were needed thatwere going to be helpful to be able tohelp this foundation to grow with thesefoundations there's important thingssuch as governance you know membershipwhat are the ecosystem you knowdevelopment going to look like as wellas like the more pragmatic things oflike intellectual property managementand this is when he introduced thefoundation i I don't remember everseeing this slide at that moment in timebut there it was amazing to see thatthat was like the origin and so when Iwas talking to both of you and we werekind of looking at this I was reallyimpressed you both kind of mentionedsomeone that I think we can't overlookand I want to start there and that's DanKhan many of you have benefited from theDan Con scholarship over 7,000 millionsof dollars have been donated to thisthis diversity scholarship but I want totalk individually about the impact youhad on both of you maybe we'll startwith you Katie absolutely this is awonderful way to start this panel aswell now when I'm looking back into mycareer I can definitely identify some ofthose pivotal moments where I had peopleencouraging me to step up and grow andDan was one of those persons for me aswell i do remember when he approached meduring Kubernetes forums when I stillhad them and it was in Sydney and heapproached me and encouraged me to bepart of the TOC and this was even beforeI knew what the TOC was doing on or howimportant governance is for a successfulopen sourceecosystem fast forward many years laterI'm still part of the community and I'mstill here and talking about thescholarships as well my first CubeConwhich was in 2018 uh I actually wentthere on a scholarship as well and I Iknew about Kubernetes i knew about theproject some of the projects in theecosystem but I didn't know about thiscommunity around it and it's one of themost welcoming communities that is outthere and I definitely would like Dan Iwould definitely like to thank Dan toencouraging me to be part of thiscommunity and taking leadership roles aspart of this community as wellyeah I really relate to theopportunities that Dan gave to you andto me as well uh 2018 was the year thathe had invited me to be a programco-chair the one of the role that you'renow in uh and following that year I feltlike I'd learned quite a lot about theprojects and Dan kind of gave me the*e ingress tier aroundrequest routing prioritization and loadbalancing for fine-tune LLMs running onGPU based infrastructure the communitythe CNCF community recognized this gapand implemented a new spec and APIcalled gateway API inference extensionssolo is proud to be one of the communitymembers that contributed to the spec andalso that we have provided a fullimplementation of the speck and API inthe open source K gateway projectclayton and Josh will be coveringinference extensions in their keynotetomorrow so if you want to learn moredefinitely catch thatkeynote another use case important usecase is related to the workload themselso not only the infrastructure isevolving but the workload themselves theapplication themselves aretransforming from what we call can calltoday a traditionalmicroservices all the way to a genticworkload where the business logic whenthe LLM is actually in charge of thebusiness logic and core component likeyou know core like component like uhtools and and function calling becomingthe core so let's explore what does itmean to the infrastructure and we willstart with thetooling so one of the very very excitinginnovation that happen right now in themarket in the AI market is MCP mcp is aprotocol for entropic that define thecommunication between the agent and thetools and MCP worksfantastic for lowcale envir low scaletools so basically a few tools but whathappen if you have 100 tools or whathappen if you have thousand tools howyou going to address problem likediscovery like security likeobservability so is that problem soundsfamiliar to you because honestly wealready solved it right the same problemdifferent protocol but the same problemand we solve it with API gateway sotoday I'm extremely excited thrilled toannounce MCP gateway mcp gateway isgoing to federate all the tools all thethe to the tool servers to a singleendpoint and then seamlessly going tomultip duplex the traffic to the end tothe to the upstream tool um the greatnews the best news is that of courseit's open source and it's alreadyupstream to K gateway go check itout all right so up to this point we'vefocused on the network connectivitylayer for agentic infrastructure inKubernetes but one problem remains whatabout building deploying and managingagents themselves in Kubernetes we saw agap here and we created the K agentproject to address this gap k agent isbased on a core agent framework layerbased on Microsoft's Autogen but it addsthree important components or extendsthat foundation with three importantcomponents the first is an MCPbasedtooling layer that's extensible andships out of the box in the project withtool server implementations for some ofthose popular CNCF projects the secondlayer is a set of example agents thatleverage these tools to align andsupport traditional applicationdevelopment and platform engineeringworkflows in Kubernetes and finally Kagent defines a declarative API andcontroller implementation to simplifydevelopment of your own agents fordeployment inKubernetes k agent we believe that Kagent will do to platform engineeringwhat cursor did for softwareengineering so we open source K agenttwo weeks ago and the early response areincrediblewe truly believe that K agent is acrucial piece of the futureinfrastructure and as one it belong tothe CNCF so let's make it happen rightnow so I created a pull request and Ihave it here and we're going just topush the button and donate it to theCNCF thank you it's donethat always makes me nervous i expectedto time out or something that's awesomeokay uh all right so thanks very muchwe're out of time i just want to pointout that we made a lot of announcementshere uh this QR code will take you tomore details on everything we announcedbut more importantly if you come by ourbooth all the engineers behind theseprojects are there and they want to talkto you we have hands-on labs foreverything we've discussed today so youcan get hands- on with the technologyand we have some really cool swag soplease come by and check it out and weare hiring yeahthank you bye guys2025-04-15 21:57:32.348115 --�H#�IA-k1CdrRAGMMhi everyone today we're going to makesome boldstatement after years of hard work a fewwrong terms andco-correction the cloudnative networkingis now complete andoptimized so a complete cloudnativenetworking stack begins with ingressglue is an incredibly popular envoybased API gateway that's optimized forscale based on feedback from thousandsof community users but there was oneoptimization remaining and that was anopen governance model for that communityat CubeCon North America we announcedour intent to donate Glue to the CNCF asK gateway and we're excited to sharetoday that K gateway is now an officialCNCF sandboxproject that's rightthank you uh we're very excited aboutthat we're also excited about our ourannouncement at CubeCon North Americaand the ISTTO community's announcementthat ambient mesh went G in the STOcommunity ambient sidecarlessarchitecture optimizes for costefficiency performance and userexperience for service mesh andKubernetes today we're announcing twonew tools at ambient mesh.io that helpexisting sidecar users adopt ambient thefirst is a cost analysis tool that looksat your environment and will predict thecost savings of migrating from sidecarsto ambient the second tool is anautomated zero downtime migration toolthat helps users move from sidecars toambient with a singlecommand and when we say optimize we meanoptimizewe build multicluster functionality intoambient that deliver a jaw-droppingexperience it's lightweight it'sintuitive and it'seffortless and when we say scale we meanscale 100 million pot scale withincredibleperformance so all the component all thebuilding block are now as part of theCNCF so please go try that and continuehelping us enhance it and fine-tune thestock and we completed it just in timejust in time to what just in time forthe biggest transformation of thecentury and of course I'm talking aboutAI ai is going to touch every aspect ofour life and it will be careless not toadopt itso if AI is this hugetransformation we must ask ourself arewe ready forit more important is the infrastructurethat we built together as a communityfor the last decade is ready for it canwe reuseit is there anygaps so let's envision together thefuture infrastructure for AI and agenticworkloadall right so let's start where manycompanies begin in their generative AIjourney integrating applications intheir Kubernetes clusters with hostedLLM providers this integration requiresthose applications to egress calls tohosted models those calls require guardrails semantic analysis cost andend-to-end governancecontrols in this scenario we worked withmany of our existing customers that weredoing this ex these exact use cases andimplemented these features in ourproduct Glue AI gateway we're excited toshare today that we have open-sourcedthese AI gateway features as part of theK gatewayproject the next step for many of theseorganizations is to start hosting modelserving on their own infrastructure butthis places some very specific newrequirements on th. ��5#�#A_47X1eKkiEshelloeverybody so I think we all know thesefolks and they thought that they werewinningAI but then we had the deepseek momentand now we know that open source has theopportunity to winAI just as Kubernetes has theopportunity to win opensource so the challenge is that in aworld with exponentially growingapplications AI powered applicationsKubernetes faces many challenges one ofwhich is this incredible sprawl so wehave to figure out how to manage it atscale so that's why we created Cordantcordant is a super control plane forKubernetes that's designed to deploy runand manage Kubernetes at scale at theedge private clouds public clouds on VMsbare metal containers you name itwherever it is we can run deploy andmanageit so I'm going to show you this demohere this is Cordon running on top of KZour Kubernetes distribution it's beingconfigured to run uh OpenStack it'sdeploying the G-core control planecombined with CNCF ecosystemcomponents and we're going to bring upMicrosoft54 an open- source MIT licensed openmodel i should say open model not opensourceum and then we're going to talk to itand we're going to ask it tell us aboutKubicon so what's interesting here isthat this can be done in a matter ofminutes we can deploy this AI inferenceengine anywhere in the world at anytime there we go we're deploying themodel as we speak it's GPUoptimized okay so as the model's comingup we want to ask ourselves what does itmean for Kubernetes towin we have to contain and manage thesprawl we have to figure out in a worldwhere there's more and more AI poweredapplications whether it's vibe coding orwhether it's existing applications beingempowered further how to manage thatsprawl and so that's why we built Cordonand I hope you can come by the Morantisbooth and ask us more questions thankyou[Applause]2025-04-15 21:57:32.8961102asparticularly cotlin Docker pogress Kafkaand of course kubernetes and nice becamea really strong brand we had madestickers t-shirts socks hoodies cakeseverything to kind of build the brand ofmaking good software at nav and anotherthing we did was we open sourced thewholeplatform and that's really important tomany developers that they can use andmake open source code and this actuallyevolved into nav open sourcing a lot ofthe code that we write to make servicesfor the Norwegian people I just checkedand we have about 3,000 open reposstories onGitHub yeah so no cubec con presentationis ever complete without our favoriteconfiguration language yaml this is thenice application Manifest this is ourinterface to our users thedevelopers we created this back in 2018when crds and are back in kues was justreleased the operator pattern hasallowed us to move to the cloud andswitch out implementation detailsunderneath with few to know changes inhow the application is built or deployedand today we have 10 custom builtoperators all working to together as apart of the niceplatform and as you can probably guessthis is a bare minimum configurationonly exposing a very simple applicationwithout any dependencies or add-ons thatthe platform can provide the applicationmanifest can easily become quite complexbut luckily we have documented all thedifferent parameters and how to use themon our public dock site Link in that QRcode and we take great pride in havinghigh quality and living documentationthat is easily available to ourdevelopers and we believe that this hasbeen crucial for adopting the theplatforminternally and today the nice platformallows our developers easy andself-service access to a golden path offunctionality to build run and operatetheir applications it's built on a solidfoundation of Open Source Technologiesmany of which are part of the cncffoundation and this has allowed us toscale the platform to hundreds ofautonomous product teams buildingthousands of applications andcontinuously rolling out new changes toproduction as we are speaking rightnow and teams own their own cicdpipelines for their applications andcalls on the platform whenever they wantto build sign and deploy their app theplatform Provisions all the dependenciesthe application declare such asdatabases message cues or caches no needfor any other infrastructureprovisioning tool or click Ops and justthis week just in time for cucon wesuccessfully migrated all of our teamsand apps away over tovalky and as a part of the applicationdeployment we also generate softwarebill of materials or as bomb which anwhich is an inventory of all thesoftware components and the versionsthat the application contains it's likethe ingredients list that you find onthe back of food items and we use thisingredients to scan for knownvulnerabilities and are on our way to100% aspam coverage currently at93% and for the longest time theplatform did not have its own graphicaluser interface only apis and clis andbut as a number of teams and applicationgrew we saw the need for giving ourteams a quick and easy way to to get anoverview keeping track of key metricssuch as known vulnerabilities resourceutilization and cost and we have foundthat giving team's access to thisinformation improves the overall qualityand reduces unnecessaryspend and as you can probably see fromthe screenshot we did not go withbackstage but opted for building our owndeveloper portal after a lengthyevaluation it's fully open source andavailable under the nice GitHuborganization a nice console is tailormade to how our application operatorfunction and you will find the sameresources as available through theapplicationmanifest and nice actually works thisgraphs shows the deployment ofproduction average per week per year andwhen we started in 2017 we had like fouror five or maybe 10 deployments per weeknow we're up to almost 3,000 deploymentsinto production everyweek and we've done this whilemodernizing a lot of the services andproducts and it made it possible toautomate improve all the public Ser orthe Public Services ofnow so we had this great thing thisplatform and we had this community wherewe talked about it and a lot of peoplewanted to learn from us and we werereally willing to share as well we metwith lots of people and we presented andwe even madepodcasts at the same time the teamcontinued to make nice even nicer makinga lot of what Tans chrisan just talkedabout but we realized that sharingwasn't really enough we wanted to takethis a step furtherand we decided we wanted to try to makenice available as a service for otherorganizations so we made nice as aservice and we stupidly thought that wasgoing to be quite easy all we had to dowe thought was to add some automationsaround how we provision clusters so wecould set up a new cluster for every neworganization that they wanted to usenice but it turns out the technicalParts would the really easy ones all theother stuff was much more difficult onething for instance was and and wedevelopers remember we had to learn howto sell stuff and selling is verydifficult uh so credits to all peopleactually try managing to sell softwareproducts that's much more difficult thanmaking them and also we are governmentagencies there's loads of red tape forinstance there's rules governing how youpurchase stuff in the public sector andwe had to find a way and we did find away of selling nice without having atender process which everything elseneeds and when we did that we tried hadto figure out stuff like do you need VATon top of the price so what you'reselling I won't bother you with thedetails but if you're interested I havea five-page document written by a teamof lawyers uh explaining why wedon't uh and we had to make loads ofagreements we had to have a data privacyagreement a security agreement and otheragreements handling all the commercialstuff again more difficult than making agood platformand we had to make three differentdepartments in Norway agree on the factthat this was a goodidea also quite difficult we managed toget three different departments to agreeon the fact that it wasn't a bad ideaand that was enough when we had that wecancontinue so today we have two otherorganizations using nice as a servicethe statistic Norway and the directorateof Agriculture several other governmentAdministration and agencies arecurrently evaluating and I know many whoare attending cucon today also wants touseit this is a fully managed Service asear mentioned where the platform team atnav is responsible for operating andmaintaining all of the platformsincluding 24/7 support each instance ofnice is completely isolated from otherinstances sharing no resources networksor accounts and is provisioned fullyautomatically and I asked theatis inNorway for a quote why they chose us andthis is what they said nice takes careof the platform so we can take care ofthe data and I think that's very spot onwhat we are aimingfor so what have we learned from thisprocess first of all treat yourplatforms as products make the platformthe same way you make other softwareit's not about solving your own problemsit's about solving the user's problemsand that's really easy to forget uh whenyou try to make stuff so remember thatalso be open source and make a communityworking in the public sector we reallybelieve that every all the code that wemake should be seen and possible to readby everybody who uses our services andproducts and also in today's especiallyin today's political climate try to useopen source software as much as possiblereduce dependencies on large largecompanies and for us ownership wasessential insource yourdevelopment and as a consequence of thathaving loads of develop will generateloads of good products that can beshared just like we did with nice andfor that to work you have to reduce thered tape and make it really easy toshare good products across the Norwegianpublic sector so try to make everythingnice and of course being nice to eachother thank you yeah and if you areworking with with Platforms in thegovernment sector please find us afterthe keynote we're so happy to talk withmore of you thank you2025-04-15 21:57:33.596863 KK�&#��ADWq8UWmcRQghi keepon we want to share the story ofbuilding a platform and a community andhow those two together are shaping thefuture of the Norwegian public sector myname is adun and this ishrist this story actually starts at thecubec con in Berlin more or less exactly8 years ago it was my first cubec conand the first cubec Con in Europe I waspresenting something about continuousdelivery on the kubernetes and I wasjust about to start my new job at navhow many know you were at cucon thatyear if you were you probably rememberthis this was the year when Kelsey hightower was really was the keynote speakerand at the time he was the Mega star ofthe kubernetes world he got anincredible reception when he came onstage I remember Standing Ovation andpeople screaming and maybe some smokemachines or something that's what Ithink at least he had a great live demoand told some greatstories but our story actually startsoutside the convention centerin the que to get in as I said I wasabout to start working for nav my firstjob in the public sector nav is a largepart of Norwegian government and we payout benefits to the sick the sick andunemployed and we help get people backtowork so in the queue outside I met a guycalled ARarar and he was leading the platformteam for the tax authorities in Norwayso basically he was trying to do thesame job as I was going to start butdifferent part of the public sectorsystem we re realized that there was agreat potential for cooperation andlearning from each other and from thedifferent teams and just in that queuewe met several other people doing thesame thing in the public sector so whenI and I got back from Berlin we startedto work on trying to build thatCommunity we set up a slack and wequickly got loads of members and weplanned our first Meetup and H chrisangot involved quite early as well so thispicture is from the Meetup one monthmonth after cucon we had 80 participantsand presentations and an open space andmany of these people are here at cucontoday so this lucky encounter becamewhat is known today as oftenly pass orpublic pass Norway a Wonder of a Kindgrassroot Community for discussing andcollaborating on all things related toModern applicationplatforms we have managed to gather over2,000 Engineers across 86 organizationacross most of the public sectorand each year we organize severalmeetups where members are sharing fromtheir own experiences on a wide varietyof topics Gathering hundreds ofattendees of all ofNorway at the heart of the community isthe slack workspace that eron mentionedhere members can discuss and collaboratefreely with like-minded individuals andorganizations without someone try tryingto sell them something it had alsobecome a place where it can askquestions about a service or an API fromanother government government agencysince they're all gathered Under OneRoof government efficiency before it wascool so but our job wasn't just to builda community we also had to build aplatform as I talked about earlier andwe had decided to call our platform niceour naming strategy was first to find acool word and then try to retrofit itinto something and nice means navinfrastructure nov applicationinfrastructure system or service it kindof dependsand this was a part of a quite bigtransformation we had we used beforethis we had six or 700 Consultants doingmost of our development work but wewanted to insource and hire owndevelopers I was among the firstdevelopers being hired at now and now wehave hired over 300 developers makingservices for the people ofNorway and actually making a platformlike this is a really good recruitmenttool because it helped us show thedifferent developers that they couldwork at and used the cool newtechnologies and at that time that w1 ��#��CABBqDpqATcI0good morning we're here today to talkabout a new project in the Kubernetesecosystem sponsored by the servingworking group called the gateway APIinferenceextension it takes any Kubernetesgateway and turns it into an inferencegateway and an inference gateway helpslarge platform teams small platformteams self-host large language modelsefficiently in production it's informedby our experiences at Google and BiteDance and we're very excited to talk toyou today aboutit 10 years ago someone told me thatKubernetes wasn't going to be relevantto the majority of users we'd all beusing functions as a service or maybe 12factor managed platform as a service uhin thecloud now obviously as judging by all ofus here they were wrong but there was aseed of truth in that it wasn't yetclear that the majority of largeplatform teams would have a diverserange of workloads and that they wouldneed and demand the flexibility from ofKubernetes as well as depend on a richecosystem of composable automationthere's a similar timeline going on inAI today two years ago it seemed thatall models would be proprietary in thelast year open models have dramaticallyclosed thegap because larger models are moreflexible and smaller models are moreefficient to serve we believe there is ameaningful and durable trade-off betweenvery smart frontier models and smallerpredictable building block open modelsso we expect everyone hopefully willeventually need to serve open models forefficiency at scale while still continueto depend on cutting edge models fortime to 5�n#�AIoEe05sPqhkगुड मॉर्निंग एवरीवन वेलकम टू द फाइनल डेऑफ़ क्यूब कॉर्न क्लाउड नेटिव कॉर्न यूरोप2025 अह दिस इज़ आवर फाइनल डे बट फ़र्स्टबिफोर गोइंग टू द सेट ऑफ़ की नोट्स टुडेलेट अस सी हु क्यूकॉर्न क्लाउड नेटिवकॉर्न हैज़ चोजन फॉर द बेस्ट पोस्टर सेशंसदिस ईयर सो ओवर द पास्ट फ्यू क्यूबककॉर्न्स वी हैव बीन होस्टिंग पोस्टर सेशंससो पोस्टर्स आर लाइक एन अमेजिंग वे ऑफ़डेमोंस्ट्रेटिंग टेक्निकल कंसेप्ट्सइनोवेटिव आइडियाज एंडदेन मेनी थिंग्स इन वेरी विज़ुअल एंड इज़ीटू अंडरस्टैंड वे अह ओवर द पास्ट टू डेज़वी हैव पोस्टर सेशंस इन द प्रोजेक्टपवेलियन सो आय होप यू ऑल गॉट अ चांस टूचेक देम आउट एंड कास्ट योर अवार्ड फॉर दफॉर योर फेवरेट वंस सो नाउ आई एमएक्साइटेड टू अनाउंस द विनर्स फॉर दिसइयर्स पोस्टरसेशन द विनर्स आर डैमियन डैसू एंड माइकलरायन बाफ कांग्रेचुलेशंस ए ह्यूज थैंक यूटू ऑल अदर पार्टिसिपेंट्स फॉर देयर फॉरशेयरिंग देयर इनक्रेडिबल वर्क वि दकम्युनिटी एंड कांग्रेचुलेशंस टू द विनर्स[प्रशंसा]2025-04-15 21:57:34.1484456marketwhat flexibility and composableautomation will we all need when modelsare a fundamental part of ourapplications that's exactly the questionthat we started the serving workinggroup in Kubernetes to answer a year agoatCubeConu and just like in the beginningof Kubernetes we depend on experiencedplatformteams looking to build the nextgeneration of their platforms to guideus we were very fortunate that bitedance was ready to build the nextversion of their platform they chose todo it in the open withus running LLM in Kubernetes soundssimple in theory but in practice isstill challenging let's dive intoseveral production issues that shows LLMinference challenges are truly uniquethe first plot is from a byanceproduction LM service capturing thedaily request distribution so as we cansee the request valuation is spiking theinput prompt size vary widely and theoutput length are unpredictable thisactually highlights one of the biggestdifferences between traditionalmicroservices each request is differenthere's another demonstration we can seesudden strikes in GPU compute activityeven under constant QPS so at that atthat time we figure it out a batch ofrequests with super long uh promptssneak in so for LLMs it's not just theread of the requests that matters butalso their shape the length of theprompts the number of the generatedtokens and the prompt structure theseall significantly affect the GPU loadand make itunpredictable let's see anotherchallenge in binance we serve manymodels in production well their trafficpatterns vary widely some models arecritical production models while othersare experimental with near zerotraffic given how resource intensivethese models are this kind of trafficscrew makes the problem even moreproblematic it also makes the life cyclemanagement and resource allocationextremely challenging and highlight theneed for resource usage awareorchestration rather than just relyingon static deployments another uhchallenge we face is hardwareheterogeneity due to the machinedelivery timelines coder policies andavailability requirements our inferenceclusters commonly end up with a mix ofdifferent GPU types as show in the tablein 15,000 GPU clusters we have overeight GPU types even the heterogeneouspool help us serve a range of differentworkloads we noticed it brings thecomplexity sometimes our model has torun on different GPU types due to thecapacity issues well due to GPU'sdifferent tops and memory bandwidth it'svery hard to abstract the way thosehardware differences which makes themanagement like routing even moredifficult all of these challenges likedifferent request shape model trafficscrew or hardware hydrogenerity directlyimpact one shared infrastructure layerwhich is routing and that's why itbecomes a bottleneck we have torethink to handle the unique challengesof LM inference we need a new class ofrouting solutions that go beyondtraditional load balancers bance andGoogle have been working together overthe last year to pull our experiences inserving to make Kubernetes better forLLM inference what we found is that LLMserving success really depends on threethings denser faster and automated itall comes down to the self-hosted LLMserving that teams need more controlflexibility and speed than ever so let'sget started by digging into the firstpartdenser laura which stands for low rankadaptation is a technique to fine-tunethe large pre-trained model efficientlythe core idea about the Laura is toadapt the large pre-trained model tospecific tasks without needing to changethe entire model weights but just asmall set of the parameters calledadapters so these adapters commonly uhonly add 1% of the storage and memoryoverhead compared to the original modelwaste while at the same time it canmaintain the uh uh efficiencies andaccuracies without anyloss uh during the training phase Laurafrees the uh original model ways andonly fine-tune the two small matrix Aand B and uh in the inference phase itget the output from the traditional uhmodel and uh merge with themultiplications of A and B to get finaluh inference results so 7that's the basicofLaura even Laura gives the resource uhuh efficiencies and model flexibilitiesmanaging it in Kubernetes issurprisingly challenging why becauseLaura has to be loaded alongside thebase model so that means Laura cannot bedeployed in separate containers thosebreaks the Kubernetes principles rightnow every single container may servemultiple models that also bring theproblems to the uh request routing andthe original base models uh servicecannot be used to find the Laura anymoreand load balancer becomes evenchallenging especially multiple lorascontent for the shared GPU resourceswithin the samepot luckily we solve these problems inproduction let's use a concrete examplefrom Bidance to illustrate the benefitsof denser deployment in Bance we havemany database product like to integrateAI capabilities to enhance userproductivity and lower the learningcurve text to SQL is one of the mostpopular AI capabilities that translatethe uh natural language to SQL queriesin by dance we fine-tune the D6 33Bmodel to support our texttosql use casehowever we have many business lines theyprovide the SQL like query scenariossuch as log search or elastic searchwhile the query structure is verysimilar to SQL but the syntax and thesemantics differ significantly so eachscenario needs their own uh fine-tunemodel however if we support all thesescenarios with dedicated GPUs that willbe costly to address this resource issuewe adopt the Laura adapture solution wefine-tune all these models in Laura wayand packing all the SQL like adaptersinto one share deep models by followingthis adapter sharing and routingpractice we can deploy the new adaptersin seconds achieve 1.5 to4.7x GPU cost saving under differenttraffic conditions so in this setup thegateway plays a key role in minimizingthe number of the model servers andintelligently select the least busyinstance bite Dance's experiencemirrored broad feedback from GKE'sgenerative AI customers large models areslow and latency is important modelsgenerate output tokens converted to textat word or subword boundaries a bitbelow human reading speed that's greatfor chatbots but doesn't work so wellfor other use cases to hit a specificlatency objective in online serving youneed to understand your trafficdistribution in terms of input andoutput tokens you need to choose afoundation model sized to haveacceptable quality at the lowest computecost select an accelerator configurationfor both your model and your trafficthat is cost effective and reserveenough accelerators to handle your baseload and hopefully cover your burst loadand sending more requests to anaccelerator at the same time increasesthroughput and the latency of all otherrequests leading to the curve up here soit's your latency tolerance not justyour traffic load that determines thecost to serve we worked with teamstrying to solve this problem in withmultiple variables repeatedly new modelsnew hardware better software andincreasing prompt and output lengths allled to high toil how can we helpoptimize productionserving we start where it hurts the mostwhere the very nature of large modelserving leads to wasted resources and ahuman can't be in the loop loadbalancing we are moving frommicroservices and web apps with verysmall and very predictable requests tovery expensive and highly variable LLMqueries roundrobin load balancingdoesn't cut it anymore some acceleratorswould sit idle and some requests wouldbe stuck waiting longer to get processedwe need to look at each request andestimate where it will fit we also needto know how full the servers are and howsending one more request to that serverwill impact the latency of all otherrequestswe believe we can automate a significantamount of the toil in generative AIserving just based on these two ideas atthe load balancer model the cost of anincoming LLM request and how it'llimpact other requests on that server andbuild a real time snapshot of theperformance of each backend bycontinuously gathering metrics capturingthe complex relationships betweenhardware concurrency of traffic andclient visible latency the extra CPUtime spent gathering metrics andprecisely scheduling requests lets ususe more of the accelerator and expose abetter operational view to the serviceowner even with just the simplestversion of this loop we see good resultsas the number of model servers growsrandom load balancing of non-uniformrequests increases the chance that onemodel server is going to get multiplevery long requests in a row since amodel server needs GPU memory togenerate output tokens when memory fillsup that model server has to stopaccepting new requests which increaseslatency just by steering requests to themodel server with the most unused GPUmemory we can achieve over 30% higherQPS at constantlatency over a uh predictable trafficload getting us closer to the maximumutilization of the accelerator and thisis just a representative chat agentworkload the more workloads you add to ashared set of model servers and the moretraffic patterns that overlap thebenefit of an algorithmic approach toload balancing increases well we focusedon a single dimension for optimizationhere we both see a huge number ofpossible uh optimizations and researchand ecosystem that could be integratedbut how can we bring a highlydistributed ML ecosystem togetherwe're here at CubeCon because what needsto be done is more important thanalgorithms or hardware we need commonground for operationalizing large modelsas just another workload if we're allgoing to be running large models inproduction as a fundamental part of ourapplication infrastructure in a fewyears we need to identify the APIs andcomponents that can be standardized andreused we need common ground to bringthe latest research to production and acommon framework for innovation thateveryone can take to productionif there's a standard dynamic andextensible load balancer it's Envoy wechose to build our architecture aroundEnvoy because we knew it would work bothwith and without Kubernetes and the richecosystem of the gateway API wouldensure our extension could avoidduplicating all of the regular loadbalancing features that LLM serviceowners also need we use the standardEnvoy X proc callout mechanism todecouple our algorithms from the loadbalancer you can deploy themindependently this also gives us thefreedom to have multiple implementationsand to allow for forking andexperimentation when very large platformteams need something that the opensource project doesn't provide yet wealso worked to standardize the metricswe would need in the top model serversso that operators have a consistentexperience across the ecosystem and ourscheduler would need to do less ourfocus is automating the boring parts ofgoing to production with LLM andbringing you the best of ML research weplug into a broad set of gatewaysolutions we don't have any opinion onhow you deploy your model servers webuild in support for Laura forprioritization and fairness and forstandard model rollouts so you cansafely share your model servers betweenmany different workloads for higherutilization we want to be a loadbearingpart of your serving infrastructureorchestrate all of us can depend on anecosystem driving optimization and thecontrol you need over your productionjourneyyeah 2025 makes the year of productionscaling leading inference uh projectslike VM and control plan airs SG longand tensor RT have all prioritize largescale deployment in the road mapsputting efficient scalable inference atthe heart of their strategies so thegateway API inference extension projectwill play a critical role as thefoundation for LLM aware load balancerin Kubernetes unlocking intelligenttraffic control for LM workloads lookingahead we'll focus on enabling a fullsuit of production readiness featuresincluding fairness for multi-tenenciesheterogeneous weighted routing adaptiveSLO driven routing and KV casual wirerouting to support production workflowsat scale and with that we'll turn itback to you we hope that you try the ifyou're thinking about LLM serving inproduction give us a try give usfeedback and help become part of thecommunity thank youhey2025-04-15 21:57:34.6933879��ल्स यू समटाइम्स मैक्स कैन बीरोंग ऑलराइट सो लेटस सी हाउ एक्जेक्टलीदीज़ रियल टाइम इनसाइट्स आर पावर्ड बाय दवर्ल्ड ऑफ़ क्युबर्न हियर वी गो इट्स नॉटजस्ट रेस डे द टर्न असिमुलेशंस दैट हैपनटू फिगरआउट हाउ व्हाट्स द मेक एंड मेकअप ऑफ़ दटायर दैट नीड्स टू बी यूज़्ड ऑन दैटपर्टिकुलर डे इन दैट पर्टिकुलर ट्रैक दवेदर कंडीशंस हाउ द ड्राइवर इज़ परफॉर्मिंगऑन दैट गिवन डे व्हाट इफ कंडीशंस व्हाट इफदेयर वास एन एक्सीडेंट व्हाट इफ वी गॉट समपेनल्टीज हाउ डू यू टेक दोज़ रियल टाइमइनपुट्स एंड एक्चुअली ड्राइव रेस टाइमडिसिशन इंस्ट्रक्टिंग द ड्राइवर टू डू दराइट थिंग टू विन द रेस ऑल ऑफ़ दिस पावरबाय ओपन सोर्स यू कैन सी ऑन द राइटसाइडएफएडी ऑल ऑफ दोज़ थिंग्स दैट पावर रियलटाइम रेस इनसाइड्सक्यूबिनिटी इज यबिकस आई डोंट थिंक आई नीडटू से दिस टूदिस क्लाउड दे यूज दे एस एन ओरेकल रेडबलरेसिंग यूज क्यूबिनिटीज अक्रॉस देयरडिफरेंट एनवायरमेंट्स इन द क्लाउड इन देयरफैक्ट्री इन द ट्रैक साइड पिक्स बिकॉज़ ऑफ़दैट दे आर एबल टू स्पीड अप देयर डेवलपमेंटप्रोसेस एंड प्रोवाइड न्यूअर इनसाइट्सविथिन रेसेस दैट हैपन इन अ कपल वीक्स ऑफ़ईचअदर नॉट ओनली डु दे यूज़ इट फॉर रियल टाइमरेस एंड साइट्स राइट आफ्टर द रेस इन देयरपेटवॉल्स टू बी एबल टू सबमिट देयर केसेसव्हिच विल डिटरमाइन बिटवीन विनिंग लूजिंगएंड इवन बीइंग एक्सक्लुडेड फ्रॉम अ रिज़ल्टदे नीड टू मेक डिसिशनंस विथ 100्स ऑफपेजेस ऑफ एफआईए रेगुलेशंस द रियल टाइमइनसाइट्स फ्रॉम द रेस टू फिगर आउट व्हाटटू कंटेस्ट व्हाट टू लेट गो ऑल इन 30मिनट्स इनकम्स जेएनएआई सॉल्यूशन पावर्डबाइ Oracle रिट्रीवल आर्गुमेंटेड एजेंट्सअलोंग विथ अ लार्ज एलएलएम मॉडल दैट दे यूजटू ट्रेन फॉर देयर ओन डेटा सेट्सप्रोवाइड्स दिस काइंड ऑफ इनसाइट्स विथअगेन ऑल ऑफ दिस टेक्नोलॉजीस स्पेसिफिकलीलामा एंड ओ लामा इन दिस केस अ��ाउविंग देमटू डिटरमाइन व्हाट रूल्स टू कंटेस्ट इन 30मिनट्स आफ्टर द रेस विथ लॉट ऑफ हिस्टोरिकलडाटा ऑल द हिस्टोरिकल रेगुलेशंस देयर ओनहिस्टोरिकल रेस जजमेंट्स दे डू दिस नियररियल टाइम एंड सेविंग अ लॉट ऑफ़ कॉस्ट विथक्यूबनिटी स्केल अप स्केल डाउन सो इजीओनली पे फॉर द 30 मिनट्स दैट यू आर यूज़िंगद स्टफ इन नॉट ऑल दटाइम वी आर ऑलवेज लुकिंग फॉर द नेक्स्टमैक्स अगेन ओरेकल रेड बुल रेसिंग यूसेसडाटा इन एआई टू सर्च फॉर द नेक्स्ट मैक्सफ्रॉम देयर यंग ड्राइवर्स अकैडमी व्हिचप्रोवाइड्स अ लॉट ऑफ अनस्ट्रक्चरर्ड डेटास्ट्रक्चरिंग दैट डेटा एंड प्रोवाइडिंगअगेन इनसाइट्स इनटू ड्राइवरकैरक्टरस्टिक्स हेल्प्स देम फिगर आउट दनेक्स्ट मैक्स टुगेदर विथ क्युबुनिटीज़ओरेकल रेडबल रेसिंग विंसओरेकल रेडबल रेसिंग थैंक्स दिस कम्युनिटीद ओपन सोर्स कम्युनिटी फॉर ऑल दैट इट हैज़प्रोवाइडेड फॉर देम टू विन द रेस लेट्सटॉक अ लिटिल बिट अबाउट ओरेकल ओरेकल हैज़ अलॉन्ग हिस्ट्री विथ ओपन सोर्स फ्रॉम ओरेकलLinux टू ओपन JDK दैट वी कंट्रीब्यूट एट दमैक्स वी आल्सो कंट्रीब्यूट टू अ लॉट ऑफ़अदर प्रोजेक्ट्स नॉट ओनली डू वीकंट्रीब्यूट इन काइंड वी हैवकंट्रीब्यूटेड 3 मिलियन डॉल पर ईयरस्टार्टिंग इन नवंबर 2023 टू सीएनसीएफ टूअलाउ दिस ओपन सोर्स कम्युनिटी टू यूज़ओरेकल क्लाउड फॉर द बेनिफिट ऑफ द लार्जरकम्युनिटी वी आर वेरी प्राउड टू अनाउंस ऑलऑफ द प्रोजेक्ट्स फ्रॉम सैंड बॉक्स टूइनक्यूबेशन टू ग्रेजुएटेड प्रोजेक्ट्स दैटहैव बीन यूजिंग ऑटो क्लाउड इन अ 176ग्लोबल रीजन समथिंग वी आर वेरी प्राउड ऑफइन द ईयर 2025 दिस कैन बी यूज्डएवरीव्हेयर इन द ग्लोब इंक्लूडिंग सोवरनक्लाउड राइट हियर इन दइयू आय होप यू गेट अ चांस टू एक्चुअलीट्राई आउट रेडबल रेसिंग इन आवर बूथ्स टुडेएंड यू मे बी द नेक्स्ट मैच थैंक यू एंडएंजॉय द रेस्ट ऑफ क्यूपकर्नथैंक यू2025-04-15 21:57:35.272804 ��I#��IAqj9q_-S91L8हेलो एवरीवन आई एम एक्चुअली सो एक्साइटेडटू बी हियर इन द कीनोट स्टेज विथ द कीननोट टॉपिक बिकॉज़ दिस टॉपिक इज़ सो स्पेशलटु मी बिकॉज़ इट कनेक्ट्स माय इयर्स ऑफ़इंडस्ट्री एक्सपीरियंस इन द टे�;�p#��Ag7KIuv7KipEगुड मॉर्निंग क्यूपकॉर्न माय नेम इज सुधाएंड आई रिप्रेजेंट ओरेकल एंड रेड बुलरेसिंगटुडे वेलकम टु अ सनी लंदन आई वाज नॉटएक्सपेक्टिंग दैट आई एम फ्रॉम सीएट्रलव्हिच इज़ अपेरेंटली सपोज़ टू बुज़्ज़ली लाइकलंदन एंड वी आर रेनी एंड क्लोउडी एंड दिसवीक इट वाज़ सनी होप यू हड अ ग्रेट टाइमहियर इनसाइड क्यूबकॉर्न एंड आउटसाइडएंजॉयंग द सिटी आई आल्सो रेप्रेसेंट दसीएनसीएफ बोर्ड आई एम वेरी प्राउड हियर टूप्रेजेंट सम ऑफ दीज़ नंबर्स 5.2 बिलियनकमिट्स इन जस्ट 2024 इफ यू थिंक अबाउट इटदैटस टू थर्ड्स ऑफ़ द वर्ल्ड टूइंग वन कमिटइनसाइड ओपन सोर्स इन जस्ट वन ईयर आई नोदैट्स नॉट हाउ बिग आवर कम्युनिटी इज़ बटदैट्स व्हाट वी आर गोइंग दैट्स द लिमिटइट आल्सो रिप्रेजेंट्स अ 25% ग्रोथ इन दओवरऑल प्रोजेक्ट्स एसोसिएटेड विद ओपनसोर्स एंड अ लॉट ऑफ़ देम ऑफकोर्स एसोसिएटेडविथ एआई दिस इज़ हाउ वी विन द न्यूटेक्नोलॉजी ड्रिवन बाय एआई टॉकिंग अबाउटविनिंग लेट्स एक्चुअली टॉक अबाउट विनिंग अरेस ओरेकल रेडबल रेसिंग रेडबल रेसिंग यूजअ लॉट ऑफ़ ओपन सोर्स टेक्नोलॉजीस पावर्डबाय ओरेकल क्लाउड टू विनलेट्स टेक अ लुक अंडर दहुड ऑर्कल रेडबल रेसिंग ऑब्वियसली यूजस अलॉट ऑफ़ रियल टाइम इनसाइट्स फॉर रेसडिसिशन रियल टाइम इनसाइट्स की वर्ड रियलटाइम हाउ रियल इज़ रियल टाइम लेट्स हियरमैक्स टॉक अबाउट द रियल टाइमनेस प्रोसेसवे अराउंड दिस गेम जस्ट लॉक फ्रंट नो यूडोंट लॉक इट्स जस्ट डू लॉक दैट्स दमैसिव डू लॉकदैट्स रियल टाइम यू डू लॉक आई हैव द डाटाहाउ मेनी ऑफ़ यू हियर आर एफ वनफैंस ग्रेट आई होप दैट दैट ट�8<��ीकॉम डोमेनअह विथ द ऑर्गेनाइज़ेशन दैट हैज़ गिवन मी सोमच इन द क्लाउडनेटिव कंप्यूटरिंग फाउंडेशनइट्स लाइक वी आर गोइंग टू एक्सप्लोर हियरद सीएनसीएफ यूसेजेस इन सम ऑफ द टेलको एंडयूज़र्स इज़ सम ऑफ़ द एक्सपर्ट्स फ्रॉम दटेलको एंड यूजर कंपनीज़ फॉर अस आई एम फसीलाअह आई एम अ क्लाउड नेटिव डेव डेवलपर एटएरिक्सन अ मेंबर ऑफ़ द टेक्निकल ओवरसाइडकमेटी ऑफ़ द सीएनसीएफ एंड ऑफ़ कोर्स एज़ यूऑलरेडी नो वन ऑफ़ द कोच चेयर्स फॉर दिसक्यूपकॉर्नअकॉर्डिंग टू द लेटेस्ट टेलीकॉम मोबिलिटीरिपोर्ट्स 5G सब्सक्रिप्शन हैव ऑलरेडीरीच्ड 2.3 बिलियन इन 2024 द टेलीकॉमइंडस्ट्री इज़ ऑलरेडी इन कंटीन्यूअसलर्निंग ऑन हाउ टू अडॉप्ट अ फुल्ली क्लाउडनेटिव ऑपरेशन एंड एज़ वी नो द क्लाउड नेटिवकम्युनिटी हैज़ ऑलरेडी सॉल्व सम ऑफ़ दचैलेंजेस द सीएसपीस आर फेसिंग सो प्लीजवेलकम आवर पैनलिस्ट टू द स्टेज वी हैव गॉटथॉम केवलिन प्रिंसिपल टेलको क्लाउडआर्किटेक्ट एट Vodafone फिलिप एंडग्वथवीपी ऑफ सॉफ्टवेयर इंजीनियरिंग ऑरेंज एंडयल स्टुडलर सिस्टम आर्किटेक्ट एटSisकॉम वी आर गोइंग टू एक्सप्लोर हाउVodafone ऑरेंज एंड SISCOM आर यूजिंग दसीएनसीएफ प्रोजेक्ट्स इन देयर नेटवर्कवर्क ट्रांसफॉर्मेशन जर्नी थैंक यू फॉरजॉइनिंग मी हियर टुडे सो मे बी लेट्स गेटस्टार्टेड विथ अ क्वेश्चन टू योर सो लियोकुड यू प्लीज शेयर सम इनसाइट्स टू स्विसकॉम्स क्लाउड मेडव जर्नी सक्सेस स्टोरीजएंड ऑफ कोर्स सम इनपुट्स ऑन द की चैलेंजेसटू याह थैंक्स फॉर योर क्वेश्चन थैंक्सफॉर हैविंग अस हियर आई विल स्टार्ट विथ दकी चैलेंजेस सो आई थिंक इन टेलको वी आरस्टिल थिंकिंग इन फिजिकल बॉक्सेस एंड ऑफनब्लैकबॉक्सेस वी थिंक दैट वी नीड टू वर्चुअलीकेबल थिंग्स व्हिच इज़ द ऑोजिट टू द मोरओपन एंड मोर सिंप्लिस्टिक अप्रोच सीन एंडक्लाउड नेटिव अनदर एग्जांपल इज़ नेट व्हिचब्रेक्स विथ द डिक्लेरेटिव ऑटोमेशन फ्=लोटू ऑफन वी स्टिल यूज़ एक्सेल एज़ आर टूल ऑफ़चॉइस एंड टू ऑफन वी डिप्लॉय थिंग्समैनुअली सो आई रियली थिंक वी नीड अ कल्चरलशिफ्ट इन द इंडस्ट्री टू बी फास्ट इनटिवअप्शन बिकमिंग अ टेक कंपनी इज़ अबाउटिंगक्लोटिव टूलिंग सो वी यूज़ फ्लक्स फॉर दडिप्लॉयमेंट ऑफ़ आवर सीएनएफ वी यूज़ आरटीसीफॉर क्लोटिव नेट वी यूज़ फॉर इंटीग्रेशनइंटू आवर पीकेआई वी यूज़ एक्सटर्नल डीएनएसफॉर ऑटोमेटिंग डीएनएस एट द एंड वी ग्लूएवरीथिंग टुगेदर विथ दबनेटस रिसोर्स मॉडलव्हिच लीड्स टू हैवी प्रोडक्ट एक्टिविटी गवी कैन गो डाउन फ्रॉम वीक्स एंड डज़ डाउनटू आवर्स एंडमिनट्स टू सम दिस अप वी कंट्रीब्यूट टूओपन सोर्स वी कंट्रीब्यूट टू प्रोजेक्ट्सलाइक सिल्वर एंड वी पब्लिकली स्पीक अबाउटआवर चैलेंजेस एंड आवर सक्सेस बट वी कैनओनली डू सो मच एस स्विस व्हिच इज़ व्हाई वीफाउंड इट द क्लाउड नेटिव टेलको फोरम टूहैव अ कन्वर्सेशन अबाउट मॉडर्नाइजिंगटेलको दिस फोरम इज़ कॉम्प्लीमेंटरी टूएक्सिस्टिंग इनिशिएटिव्स विनफ सीएनसीएफएंड अदर बॉडी एंड वी आर इनवाइटिंग यू आईएसएंड टेल को टू जॉइन अस टू जॉइन फोर्सेस एसएनइंडस्ट्री आई एम कन्विंस दैट एवरीथिंग दैटवी टैकल टुडे इन टर्म्स ऑफ़ क्लाउड नेटिवअडॉप्शन फॉर सीएनएफ विल हेल्प अस टूरिड्यूस एक्सपेंसिव टेक्निकल डेप्थ एंडअलाउज़ टू मूव मच फास्टर इन द फ्यूचर सोफिलिप कैन यू शेयर सम इनसाइट्स ऑन योरक्लोनिटिव एंड ओपन सोर्स जर्नी एट ऑरेंजयस फॉर श्योर फैंस फॉर योर क्वेश्चन सोफर्स्ट आई आई वांट यू टू रिमाइंड दैट ओपनइज़ अ की स्पेस फॉर टेलीकॉम ऑपरेटर्सव्हेयर वी कैन लर्न शेयर एंड कोलैबोरेट ऑनद कमेंट्स एंड देन वर्किंग एंड फोकसिंग ऑनवर्ड्स मेक स्पेशल एट ऑरेंज वी आरएवरेजिंग मल्टीपल ओपन सोर्स इकोसिस्टमजस्ट लाइक आवर पीियर्स आई वांट टू फोकसस्टार्ट फर्स्ट ऑन द ऑन द सिल्वरप्रोजेक्ट दैट इज़ अ लाइन फाउंड�>��शन यूरोपऑस्टिट प्रोजेक्ट एमिंग एट बिल्डिंग एनइंडस्ट्रियल ग्रेट क्लाउडनेटिव टेलकोस्टटू रन आई वुड से नेटवर्क फंक्शन वर्कफ्रॉम कोर टू एज टू रन एंड टुडे फॉर दनेटवर्क फंक्शन डिपार्टमेंट ऑफ़ आवरएफिलिएट्स वी बिल्ड ऑरेंज डेको क्लाउड दैटइज़ 100% बिल्ड ऑन सिल्वर इफ टुडे इज़ ओनलीआई वुड से 10% ऑफ़ आवर वर्कस बिकॉज़ ऑफ़ दरीडसेंट अडप्शन ऑफ़ द क्लाउड एटी नेटवर्कफंक्शन इट विल बी मोर देन 60% बाय 2014 वीआर लेवरेजिंग वेरी इंपॉर्टेंट प्रोजेक्टऑफ द सीएनसीएफ इकोसिस्टम लाइक क्लस्टरईपीआई एंड फ्लेक्सिटी फॉर इंटर्न बेस्डगीट अप्स मॉडल एंड सपोर्ट सेवर ओपन सोर्सफ्लेवर वी आर आल्सो कंट्रीब्यूटिंग टूहेनुकेट एंड इट्स आर्किटेक्चर रेफरेंसइंप्लीमेंटेशन एंड कंफर्म टेस्ट वी आरपार्टिसिपेटिंग टु द सीएनटीआई फॉर अस इट्सअ ट्रू कैटरिस्ट ऑफ द क्लाउड नेटिव टेलकोबेस्ड प्रैक्टिससेस एंड होम फॉर्म टूटेस्ट एंड टॉपिक अबाउट नेटवर्क फंक्शनसर्टिफिकेशन वी आर लिवरेजिंग ओपन एसएसएफप्रोजेक्ट लाइक फॉर इंस्टेंस द प्रोजेक्टदैट इज़ ब्रिंगिंग आई वुड से ऑल दसिक्योरिटी ऑफ़ द इंफ्रास्ट्रक्चर सपेशननेफ्यू फॉरेस्ट इज़ अ काइंड ऑफ़ नो स्टारअबाउट नेटवर्क फंक्शन मैनेजमेंट एंडऑटोमेशन एंड बिकॉज़ टुडे टेलकोस आरअनबर्किंग इंटू अ न्यू मोनेटाइजेशन यूनिटऑफ़ देयर एसेट वी आर सपोर्टिंग स्ट्रोंगलीविथ आवर अपीयर्स द कैमरा प्रोजेक्ट व्हेयरवी आर डूइंग द स्पेसिफिकेशन ऑफ द टेलकोएपीआई सो हियर यू कैन सी कंक्रीट एग्जांपलऑफ हाउ ओपन सोर्स इज़ टुडे सपोर्टिंग आवरमोस्ट इंपॉर्टेंट ट्रांसफॉर्मेशन जोमूविंग फ्रॉम आई वुड से टेररको टू टेरकोसो नाउ आई गॉट द क्वेश्चन फॉर यूअकॉर्डिंग टू यू व्हाट आर द की चैलेंजेसएंड प्रोजेक्ट वेयर क्लोनिटी कुड सपोर्टइन द फ्यूचर थैंक्स फॉर आस्किंग सोमनाफर्स्ट रिफ्लेक्ट बिट ऑन द पास्ट एंड दट्रांसफॉर्मे?शन शेप इंडस्ट्री मोबाइलनेटवर्क ईच जनरेशनल जंप फ्रॉम 2G3 5Gरिक्वायर्ड सब्सटेंशियल इन्वेस्टमेंट बीरेडियो इक्विपमेंट डेट सेंटर हार्डवेयरनेटवर्क कैपेसिटी बट आल्सो मैसिवसॉफ्टवेयर आर्किटेक्चर चेंजेस फॉर एक्समूविंग टू 4G विर्चुअलाइजेशन और टू 5G विकंटेनर्स मैनेज बायक्यूबिनिटी लुकिंग टू द फ्यूचर दे आर मेनीयूज़ केसेस 6G लाइकली द टेक्निकल वर्क फॉरस्टार्ट कपल वीक्स इन साउथ कोरिया एआईमशीन लर्निंग टू प्ले अवेट रोल इनइंप्रूविंग नेटवर्क क्वालिटीसस्टेनेबिलिटी एंड वी गॉट द सीएनसीएफ एआईवर्किंग ग्रुप फ्रॉम प्रोजेक्ट्स लाइककेप्लर व्हिच वी थिंक पिवटल इनअचीव देन गर सपोर्टिंग सिस्टम ऑब्ज़र्वऑकस्ट्रेट मैनेज नेटवर्क अक्रॉस द होल ऑलडोमे दे रई कंटिन्यू टू रिलाई ऑनप्रोजेक्ट्स लाइक ओपनसेट्रीथियस फ्लक्स आर द ऑपरेटर फ्रेमवर्कएंड मेनी मोर एंड देन ऑफ कोर्स एसनेटवर्किंग ऑपरेटर्स देयर आर कीनेटवर्किंग प्रोजेक्ट सच एस म्ट द मल्टीनेटवर्क कम्युनिटी क्यूब प्लेन रोल इनमेकिंग एंड सस्टेनेबल एस पॉसिबल नाउक्रूशियली व्हाटएवर वी डू इन द फ्यूचर वीहैव टू अवॉयड दोज़ मैसिव हार्डवेयरइन्वेस्टमेंट्स इन न्यू इक्विपमेंट एंडदोज़ ह्यूज सॉफ्टवेयर आर्किटेक्चर चेंजेसएंड क्यूब एंड सीएनसीएफ प्रोजेक्ट्स की टूदेम प्रोजेक्ट्स लाइक क्यूब एंड मेटलक्यूब इनेबल क्लाउड नेटिव लाइफ साइकिलमैनेजमेंट ऑफ मेटल एंड वीएम प्रोजेक्ट्सलाइकविल इनेबल अस टू डिप्लॉय पोर्टेबलएप्लीकेशन अक्रॉस आवर नेटवर्क आईटीस्टेटली एंड ए्बिशियसली वी थिंकअपग्रेडिंग फ्रॉम 5G टू 6G कुड बी अ सिंपलएस अटी रोलिंग अपग्रेड और अपडेटिंग एनऑपरेटर सब्सक्रिप्शन चैनल मे सीशियस बट आईथिंक द टेक्नोलॉजी द प्रोजेक्ट्स एंड दकम्युनिटी इन प्लेस टू मेक रियलिटी सोर्डफ्रॉम ऑपरेटर्स आई लाइक टू सीर सीएफकम्युनिटी लीडर व्हाट डू यू थिंक डू एग्रीएंड हाउ कैन कम्युनिटी सपोर्ट इन दिस विज़नऑफ़ कोर्स टॉम इट्स इंपॉसिबल टू कंक्लूडदिस डिस्कशन विदाउट मेंशनिंग हाउ मचसीएनसीएफ इज़ यू नो रिप्रेजेंटिंग द कोरफाउंडेशन ऑफ़ ऑल द थिंग्स वी आर रिलाइंग ऑनअह इन हेल्पिंग द सीएनएफ रनिंग सो अ ह्यूजअ बिग थैंक यू टू ऑल ऑफ़ यू हियर इन दऑडियंस हु आर कंट्रिब्यूटिंग टू ऑल दिसप्रोजेक्ट्स इन हेल्पिंग द टेलको वर्ल्डरनिंग एंड इफ यु टेक वन स्टेप अबोव दसीएनसीएफ एंड सी द ओवरऑल Linux फाउंडेशनअम्ब्रेला इट इज़ अमेजिंग टू सी हाउ द एलएफअह इज़ लाइक एन्वेशनिंग द एंड टू एंड टेलकोनेटवर्किंग लैंडस्केप एंड डोंट गेटस्केर्ड सीइंग अ ह्यूज लैंडस्केप हियर बटऑल ऑफ अस हैव अपॉर्चुनिटीस टू कंट्रीब्यूटसो इफ यू आर वंडरिंग हाउ टू कंट्रीब्यूटफील फ्री टू स्कैन द क्यूआर कोड हियर इटइट विल टेक यू टू द लेटेस्ट सीएनसीएफब्लॉक पोस्ट व्हिच वी हाव क्यूरेटेड विथइनफार्मेशन ऑन हाउ टू कंट्रीब्यूट टू एनीऑफ़ दिस प्रोजेक्ट्स अह थैंक यू सो मच टॉमफिलिप एंड यॉर् फॉर जॉइनिंग मी हियर एंडशेयरिंग युवर इनसाइट्स एंड थैंक्स अ लॉटटू द ऑडियंस इन लिसनिंग टू आवर व्यूज फीलफ्री टू रीच आउट टू अस इवन आफ्टर द सेशनइफ यू वुड लाइक टू चैट मोर अबाउट टेलकोएंड क्लाउड नेटिव यस थैंक यू सो मच बिफोरआई वाइंड अप आई वुड लाइक टू गिव अ फ्यूसेकंड्स टू दिस अमेजिंग इनिशिएटिव दैट हैज़बीन डन बाय द कंट्रीब्यूटेड स्ट्रेटजी सबग्रुप इट्स लाइक समथिंग वेरी इंटरेस्टिंगआई वुड इनकरेज ऑल ऑफ़ यू टू टेक अ लुक एंडसी इफ आर इंटरेस्टेड टू जॉइन मेनी थिंग्सव्हिच मेनी ऑफ़ अस फील सो स्ट्रेट फॉरवर्डएंड इजी इन आवर डे टू डे लाइफ्स मे नॉट बीसो फॉर मेनी अदर पीपल अराउंड अस सो आई एमश्योर यू विल फाइंड दिस एस एन अमेजिंगइनिशिएटिव प्लीज ट्राई टू सपोर्ट थैंक यूसो मच2025-04-15 21:57:35.782419A to follow as well as software as aservice generally software as a servicedoesn't fall under the CRA and there's awhole bunch of specifics out there aboutif you have a uh piece of software thatyou do ship and it can only operate withyour SAS then maybe it does fall underthe CRA but again all of this is beingworked on as we speak in fact as we hadto submit the slides yesterday uh therewere uh meetings among the EU workinggroups to start to really uh clearlydefine some of these things and closethoseloopholes so why Eddie why don't you nowtalk to us about how this impactseverybody here there's a lot of peoplein this room most likely you fall intoone of these three categories ormultiple you might be in multipledifferent categories as an individual uhyour company might be in multiplecategories so we need to spend a littlebit of time on understanding what theroles are before we can reallyunderstand our responsibilities i'mgoing to try and touch on both of thosevery briefly again those of you who havealready dug in and read this whole thinghave a really solid understanding of thecomplexity here we're going to try andgive an overview of this and tell youwhy this isn't scary uh maintainersquick show of hands how many maintainersactually got up this morning and andcame out here maintainers of an opensource project you are all upfront ohthere's so many of us all right so nolegalliability as an individual you theredon't worry about this right we have hadfear and uncertainty and doubt and atone point therewere very reasonable concerns and uhmaybe some negotiation and hagglingaround who could be sued and who couldbe fined maintainers are now completelyout of scope for thislegislation you do not need to worryhowever if your business is to maintainopen source and bring it to market aspart of a commercial activity even ifthat is a free open-source productitself if for you it is part of acommercial pipeline you are amanufacturer you do have legalliabilities hopefully there is amiddleman between the two of us the freeopen-source maintainers that arewilly-nilly wild west we can do whateverwe want the manufacturers who are goingto be tightly regulated there's now thestewards as well the stewards are thefolks who havea type of commercial activity uh whomight be throwing massive conferencesaround open-source but everything isrolled back into open source right anymoney that's made off of the open sourceis brought back into open source thenonprofit type of model is the stewardsthe stewards are subject to a lighttouch customtailored regulatory regimethat will be figured out in the futureand it will be case by case for each ofthe stewards i've spent long enough onthis slide you guys have read thesethree words for way too long mike pleasetell us you talked about products Italked about open source what aboutopen-source productsokayso to start if your product that you'reselling includes open source that couldbe because it's based on open sourceit's a product that includesdependencies that are open source thatis sort of considered an open sourceproduct here right and generally withthe CRA there's three categories rightthere is critical which includes atleast as co as the the currentdefinition certain types of specific uhsecurity uh hardware for security thatis critical you have third-partyuh conformance assessmentrequirements there is important which isa category and of course that has twosubcategories uh there is class 2 whichalso requires a uh third party securityassessment and also includes uh certainproducts that maybe include containerorchestrators uh you have class onewhich for some odd reason at least rightnow includes operating systems which ishas less requirements but uh that'sstill being sorted out that doesn'trequire a third party assessment but youstill need to selfassess that you'redoing various security uh uh practicesand then there is everything else thatis sold in the EUokay bring it home with an example whilewe're at CubeCon let's talk aboutKubernetesokay if youare one of the enterprise flavors ofKubernetes or just have a product thatyou are distributingKubernetes you have requirements you canno longer pass the buck to open sourcemaintainers and expect that they willjust do this on behalf of yourbusiness i heard clap there got someKubernetes maintainers up here clappingall right all rightnow again open sourceKubernetes hey there are norequirements for there sorry there's noliability for you obviously you shouldwe should all be doing good things insecurity butagain contributing maintaining opensource is a thanklessjob the thing I want to highlight hereis that this is really really good newsthat we have this level of clarity inbeing able to distinguish between theconsumers the manufacturers the stewardsthe maintainers this is a level ofclarity that we haven't had in the pastand we are extremely fortunate for thisand this willunlock positive business activity thatwill result in increased cyber securityglobally if we continue to work togetheron this if you are wanting to deployvanilla Kubernetes and bring that tomarket through your product you canchoose knowingly to assume thatliability and you'll be accountable ifit is not within your scope you do notwant to assume that business risk rightlet's say you're you're doing Kubernetesyou're distributing Argo you'redistributing Flux now we can know thatif that is brought to market by you orby somebody else to you if let's say ifI bring it to market I bring an opensource project to market uh by providingenterprise support services and Mike isbuilding a product with digital servicesoff of that when I bring it to market Iam assuming some legal liability thatMike can now offset onto me and so wecan now plan ahead to know that we'regoing to use these support contractswe're going to use these enterpriseflavors to shift some of that risk andliability which is a huge benefit and sowhy don't you tell us how the LinuxFoundation CNCFOpenSSF is helping so the LinuxFoundation is obviously the one that wehave the closest eyes on mike and I areboth very very involved across the LinuxFoundation between the fintech opensource foundation open SSF CNCF we'vegot a lot of visibility on this um theLF is very very very invested in makingthis as easy as possible for absolutelyeverybody uh and this is both from thefoundation such as the the new LFXinsights rollout that's going to becoming out uh this summer it's going tobe really freaking cool um and also justthe things that we're doing asvolunteers around the community thisweek we worked with the open telemetryproject with flux with uh the sandboxprojects measurery and oscal compass tomake tangible impactful securityimprovements that will streamline theadoption of these products later on whenaccountability is put in place for theirprojects we used the baseline both Mikeand I were authors on this because we'vesaw this coming and we want to be ableto find a way to codify the standardsthe minimum necessary security practicesthat every single open project opensource project should do let's writethat down put it in control language andmake it measurable and actionable andthe next thing that we have on thehorizon coming out of open SSF is todefine recommendations for assessmentsso that way when we are looking at ourown open source projects and we want tobe improving the state of the securitythere we can know the best way to dothose assessments for ourselves uh yeahand finally please check out there's anuh the QR code leads to an LinuxFoundation Europe page with a wholebunch of details about what the LinuxFoundation is doing and details aboutthe CRA and it's going to be great andplease please please please as amaintainer of multiple Linux Foundationprojects come join uh the working groupscome join the meetings come you knowopen up issues check out the baseline uhand if you're so inclined help us buildthe tools that we're using to help makecompliance with things like the CRA andin the future other regulations muchbetter grab this QR code right now getmore information we need help actioninga lot of this this is good news if wework together thank you[Applause]2025-04-15 21:57:36.247665 ��L#�QAUI-b-Odg39Ahello everybody my name is Jacob and Iwork at Haroxy Technologies you mightknow of us as the company behind Harroxythe legendary opensource software loadbalancer that a lot of you are probablyusing today I'm going to be piggybackingon the previous keynote aboutperformance and security and LLMsbecause I think there are a lot oftopics we should talk about i talked toa lot of our customers and they all saywe're all in on AI but I think the risksof that are yet to be underC�Y#��iAA1HGYh0Wz9Uallright who in here has read the phraseproducts with digital elements no fewerthan 343 times over the last few monthsi got one hand right here all right youare mypeople you're also probably not going tolearn anything for the next few minuteseverybody else we're going to work onkeeping this simple as we go through andtalk about this legislation my name isEddie Knight i'm the OSO lead atSonotype i am also the co-chair ofCNCF's TAG security and as you may havenoticed I am from the UnitedStates i'm Mike uh CTO of Kusari tagsecurity lead and open source securityfoundation governing board member andI'm also from the United States andwe're standing here in the UnitedKingdom with you to talk aboutlegislation in the European Unionso that speaks to the global impact ofthislegislation but it appears that 62% ofyou according to a recent uh LinuxFoundation survey are not familiar withwhat this legislationis so what is it it is European Unionlegislation intended to protectconsumers and businesses from cybersecurity threats in products withdigital elements it's a good thingthat is a very very very shortdefinition Mike so I'm going to ask youto double click on that zoom in give usa little bit more detail sure it'ssimple seeum our good friends at OpenSSF Fukami uhcreated this map which shows all theroles and responsibilities for the CyberResilienceAct but it's simple you have you knowopen- source maintainers you havemanufacturers and they're cooperatingwith uh regulatory bodies and and Mikethat's not simple that's not simplelet's start at 2027 and let's workbackwards to today okay in 2027 the CRAgoes into full effect that means anybodywho is selling a product with digitalelements in the European Union will havesecurity responsibilities this includesthings like software bill of materialsand just general good practices for thesoftware and the dependencies of thesoftware that they'reusing but that's not all eddie why don'tyou talk to us about 2026so if everything goes into full effectin 2027 we are getting a fortunatestairstep put in place uh starting in2026 middle of 2026 we're going to seethe notification to conformityassessment bodies they're going to startlocking in the details literallyyesterday there is a change in thedetails around this legislation andwe're going to see middle of 2026 thisshould start getting locked in to wherethe the really refined meaty bits willhave more precision and we'll know whatwe're going to be assessed on end of2026 we're going to have the first pointwhere we will have responsibilitiesthe first responsibility starting onDecember 11th 2026 is going to be forreporting requirements of any knownexploitable vulnerabilities in productswith digitalelements all right so I'm the one whohas to define what a product withdigital elementsis again just put simply it's a productwhose intended and foreseeable useincludes direct or indirect dataconnection to a device ornetwork okay okay okay let's keep itsimple pretty much if you ship hardwareor you shippedsoftware it generally falls under thepurview of theCRAgenerally okay okay okayif you're already fall under another EUregulation like those under uh theautomotive industry medical deviceindustry you're already doing a lot ofgreat security activities and you don'tfall under the CRAin addition there is a lot of debateabout certain things like when browsersfall under it when what rules they mighthave@stood i thinkwe're kind of in the 2003 of OASP top 10for web security right now in terms ofLLM security so when they talk about I'mbeing allin on AI security or an AI thefirst step they usually think about iswell we're going to build an AI gatewayand so what's an AI gateway well it's anAPI gateway right we've all built one atthis point or most of the projects inload balancers have built one APIgateway right now an AI gateway so we'regoing to do some authentication we'regoing to do raid limiting we're going todo PII detection or maybe PII extractionand prompt routing before we send thedata to an inference engine which isultimately an HTTP API but I think oneof the things that's missing there orpossibly missing is prompt security andthat's really you know the ignore allprevious instructions that doesn't workanymore or a time bandit attack thatused to work with open AI until recentlyso there are obviously solutions to thatfor example there's a prompt guard modeland a llama guard model from metathere's a shield gemma from Google andin the end many of these are built onsome kind of a variation of dbertaclassification it's a set of models sothey are actually large language modelsthat are classifying you ask is thisprompt safe yes or no and they answeryes or no so it's ultimatelyclassification problem so we're going toadd prompt security to the AI gatewayand basically run the model itself inthe AI gateway um so I did that and Iran these models inside an AI gatewayinside a load balancer and ultimatelywhat I found and this is what I want totalk about is it's pretty slow so if yourun a model like this on a G6X large AWSinstance which is ultimately a prettybig instance at this point if you startapproaching 500 tokens it takessomewhere between 150 to 200milliseconds to process the prompt andmost prompts are bigger and most ofthese models have a context window ofabout 500 tokens so if your prompt is2,000 tokens you have to do this fourtimes and that's a second of time you'reprocessing that's that's lifetimes in aload balancer world and if you startdoing this in parallel like a proper AIgateway it gets worse so here's if youcan see I ran a nonoptimized model sobasic transformers and then an optimizedmodel with an inference engine and onthis instance you cannot almost everreach more than 60 requests a second andonce you start getting to a concurrencyof let's say eight concurrent requestsyou will never reach over 40 basicallyso there's a lot of work to be done inmaking this faster there areoptimization strategies we can run anoptimized inference engines i did thatand you can get about 30% you can enabletoken caching but ultimately tokencaching as we know it right now is meantfor generative AI and we're not doinggenerative AI we're doing classificationwe're classifying the model so it's outthere to know if this is useful longterm there are some advanced techniquesi've tried these as well and it workslike you can filter for some of thebasic prompts and bad words with textfiltering the only problem is if youmake a typo in the filter or in yourtext the text filter will no longer workwhile the LLM will interpret itcorrectly so all you have to do ismangle a few words and it's just goingto work so I think there are a fewlessons learned we need to innovate newtechniques using the tools we have andwe have a few tools two using AI itselfto secure AI in the load balancer worksbut it's still out there if it's viableit's something that we are researchingwe might need to research altogethersome smaller models that can run on aload balancer and can run much quickerand three I think that the AI gatewaysare necessary but the security is isevolving in the end as I said I thinkwe're in the 2003 of OAS top 10 rightnow in AI security everybody's tryingsomething everybody's doing it but wedon't know what's going to work a fewyears from now because it's really hardto keep up so thank you so much if youhave any questions please come to thebooth i would be happy to talk aboutthis a little bit more[Applause]2025-04-15 21:57:36.860077Ee whichare Chad Jeremy Facella and Alex and forthe first time this year we're alsorunning the TOC shadow program andRicardo and Kevin are joining us in thatcapacity most of you are familiar withthe CNCF landscape we have more than 214projects within our ecosystem and theseprojects are dispersed across threedifferent levels of maturity we havesandbox incubated and graduated projectssandbox projects provide a solution fora niche problem space these are greenfield ideas next we move to incubatedprojects and here we already seeadoption in production but moreimportantly we see contributions frommultiple organizations because we wantto ensure the vendor neutrality for thedevelopment of the project and finallywe have the graduated projects and theseprojects are here to stay we also referto them as the ones who cross the chasmso moved from early majority to earlyadoptersnow within CNCF we have 134 projectswhich are in sandbox 36 are incubatedand 31 are graduated and I would alsolike to draw your attention towards thefaint gray line at the top of the graphthis represents our archive projects andI don't think we talk sufficiently aboutarchival sometimes it is natural forsome of the projects not to reach thatmomentum in contributorship oradoptorship i think it's very importantfor the maintainers to take the lessonslearned and redirect their energy eitherto resist existing projects within theecosystem or by creating new initiativesand open sourcing them and within theTOC we have focused more on the healthassessment of the projects and we haveactually some of them resulting witharchival and currently we have 13projects within thisstatus you also can notice from thisgraph that we are growing exponentiallyit took us six years to welcome a 100projects within our ecosystem and threemore years to reach the 200 projectmilestone we are growing at a doublerate however since the technology is thegravitational point of our ecosystemnone of it would be possible without acommunity around it which means we needto scale as a community at the same pacewith technology growth and to do thatwithin the TOC we are revising ourprocesses constantly to ensure we aresetting ourselves for success for thenext 10 years of cloudnative following up on what Katieexplained so we have these differentlevels uh in the CNCF projects uhsandbox incubation graduation there theyhave a lot of meaning for the projectsthemselves as the TOC and the communityhelps them mature through thesedifferent levels but they also have alot of meaning for the end users as anend user I know this is very importantwhen you're selecting your project whenyou're building your stack to understandwhat is what are the expectations youcan have from the different projects sowe wanted to validate it validate thisthis idea and we started using uh aspart of the due diligence process thatKatie mentioned we have adopterinterviews and we started doing thissurvey to try to understand how the endusers themselves are seeing this uh thisdifferent information so we startedasking some basic questions and ifyou're an end user we'll we'll startreaching out more actively during theyear as well and the first question weasked was do you feel you have a goodunderstanding of the meaning of thedifferent levels in in the CNCF projectif you're a longtime end user I canassume that you probably have a goodidea if you're a new end user maybeyou're still being introduced to thecommunity so we we wanted to to to seewhat the replies would be so actually wegot mostly positive reply saying thatYes and they are already relying on thislevels when they are choosing theprojects in the landscape in some casesthe answers were a bit more unclear ithink so but I'm not completely sure canwe discuss a little bit and validatethat I have a goodunderstanding so the second question waswhat can we improve in the TOC in termsof defining this levels and making themmore useful for the end user so we askedis there information you think that ismissing regarding the meaning of of eachlevel and we got really reallyinteresting replies as you saw wFe havethree levels sandbox incubate incubationgraduation the feedback we got is thatit would be uh really useful to havemore information uh about where theproducts projects are in the levelitself sometimes the projects will stayin incubation for a couple of years theend users would like to have informationon what's the progress within incubationtowards the what the ideal should be toto move to graduation they would likemore information on the state and theevolution of the projects in the processalso other another request was toactually link the different maturitylevels to project releases so that ifpeople are using for example an olderrelease for some reason for some reasonor another to understand what was thematurity level at thatmoment and the third question we askedis if people are actually alreadyrelying on these levels internally andhow they are doing it i know as an enduser actually do this so we keep our owninternal matrix but it was reallyinteresting to ask this question and wegot a variety of uh answers some peopleare already using the levels as aninitial data point but they perform alot of due diligence in addition to thatother people have a much more uh complexinternal matrix where the maturity levelis one of the criteria but they built amuch larger criteria internally to maketheir decisionsnow Katie explained that one of theproblems we had uh as a community onscaling out to this number of projectsis very very similar to what we havewith scaling out infrastructure uh asthings grow and grow fast you have tomake sure you can you can accept theload so if you've seen previous uhsessions at at KubeCon in the lastcouple of years one of the struggles wehad in the TOC was scaling out to to uhbe able to help the projects mature andhandle the load that comes in so we hada lot of effort in the last two years totry to improve the the projects we cameup with things like domain technicalreports and general tech technicalreports we do a lot of pre-checks beforewe started due diligence we improved ourprocesses quite a bit now the resultsare out so on February 25th this year wehad a sandbox review project we we get alot of new projects to review each timeand we actually sent out a message tothe community saying we had great newslike we actually emptied the queue forsandbox this was a big achievement so ifyou will go and watch the YouTube videoyou will see a lot of happy faces fromthe TLC members from from achieving thisif we look at today I just took uh thispicture uh yesterday you will see wehave five sandbox projects in the queuewhich is something we can handle in oneuh review period which is reallypositive and even if you look to theother levels because we have this uhevaluation with technical reports uh andwe also do this pre pre-checks for duediligence even the move to incubationand graduation the backlo backlog iswell under control if you see thenumbers here and in two weeks after thecoup coupon coupecon freeze we'll pickthis project so we are really gettingthings moving much fasternow some of the most important workwithin our ecosystem is also done by ourtags or technical advisory groups theseare micro communities within ourcommunity that focus on a broader domainwithin the landscape such asobservability security environmentalsustainability and so on these areformed of SMMES in the area that usuallywork on white papers working groupsproviding guidelines for our communityand providing guidance for projectsduring um projects moving level howeverover the time we have noticed that thetags are not scaling very well and notmeeting the demand required by the TUCas such we kickstarted a new workloadwhich oh going back uh we kickstarted anew workload which is called the tagreboot and we aim to restructure ourtechnical advisory board to scale forthe next 10 years of cloud native wehave a wealth of information on theissue 1527 and I definitely encourageyou to check it out for moredetails in a nutshell some of the workthat we're going to do as part of thisworkload is to reduce the amount of tagsfrom 8 to 5 we're also introducing thenotion of community groups sub projectsand initiatives sub projects are longlived efforts that require continuousstewardship and currently we identifythe project review and the contributorstrategy as existing sub projects withinourecosystem initiatives on the other sideare short-lived efforts that have veryclear objectives and exit criteria andcurrently we identify the artificialintelligence working uh white paper aspart of this group very importantly hereis that free initiatives we aim toincrease collaboration within ourcommunity anyone would be able to openany initiative as long as it hasstewardship and a very clear exitcriteria at the same time we aim to useinitiatives for cross foundationcollaboration as well because we aim tohave a closer partnership outside of ourecosystem as well and initiative providea perfect channel for thata very important point here I would liketo mention is that we will open theelection for the new tag chairs andleadership after CubeCon as such if youidentify any of your expertise alliancewith the new tags do nominate yourselvesand more importantly nominate your peersas well be an endorser within thiscommunity and another goal or aim thatwe aim actually aim to achieve for thetag reboot is to cover some of the gapswe have within our ecosystem andhopefully these gaps will shape thefuture trends within our landscape thereare three main areas that I would liketo mention today the first one is aroundmulticluster management andobservability this is still a challengewithin our ecosystem especially if youhave a cross provider strategy andsurfacing observability within this uhsetup is quite challenging too next wehave cost management and sustainabilitynow we have an increased adoption ofcloudnative architecture but at the sametime we have an increased focus on thecost spending however as adopters we'reequally responsible for our costspending and carbon footprint and we'rehoping to bring groups likeobservability and environmentalsustainability in closer collaborationto cover cover this gap within ourecosystem and finally we have toolingaround infrastructure provisioning andsecret management this has been a gapwithin our ecosystem for a while now andwe're surfacing this back to thecommunity once again hopefully toencourage more contributions andmovement within this arearight and before we close I would alsolike to highlight that some of the tasksthat were traditionally expected fromthe TOC we realized it would they wouldprobably fit better uh among the endusers so recently we created thistechnical advisory board probably heardit last year uh and in here we aretrying to focus on a couple of uhpriorities for for the year and thiswill be focusing on feedback loops toestablish a relationship a closerrelationship between projects and theend users and to have a way for them tocommunicate directly in a in a moreefficient way the second one isreference architectures this is the oneof the main uh requests from end usersto have more guidance when they startadopting new projects making choices butalso help them uh uh extend the usage ofcloudnative infrastructure in otherareas and the last one is ecosystem gapsuh Katie just mentioned a few but tohave a more established process toidentify this gaps by interacting moreclosely with the end users this uh willbe this is a new board that works veryclosely with TOC and will help us outalso scaling out to the to the growth ofthe number of projects we've been seeingnow if you have any questions towardsthe TOC we are all around the room socome and find us also if you want toknow about our latest work you can go toour TOC repository we have our curingworkload and boards open so you'll beable to check what we're doing alsowe're going to have a TUC AMA today at2:30 so if you have any questionsregarding to any of the things wementioned today please come and ask usthis is Katie Gamanji i'm Ricardo andthank you very much and looking forwardto seeing how you can shape thecloudnative ecosystem thank you andenjoy the rest of the conference thankyou2025-04-15 21:57:37.467958 % 3h%�@#�9AmynQyP2_17Eall right before we wrap up today'skeynote we'll be taking the scholarshipphoto at 10:30 so in like 2 minutes Soplease stay clear of the stage and givethem priority before any other stagephotos Um and I also want to remind youthat the solution showcase closes at 2p.m today So be sure to visit oursponsors before it's too late and makesure to grab your cloud nativeuh CubeCon conference t-shirt at theCloud Native Corner Store I'd also liketo extend my heartfelt appreciation toall of you But first I want to take amoment to acknowledge my esteemedco-chair Casper This marks his[Applause][Music][Applause][Music]fin this marks his final time asco-chair for CubeCon Cloud Native ConAnd you already gave him a round ofapplause for his incredible contributionto this event but we could give him onemore[Music]We're really grateful Casper foreverything you've done Thank you Thankyou so much Thank you so much CasperThank you so much for all the hard workand support Yeah thank you so much Ireally enjoyed serving in this role andthanks so much for the opportunity It'sbeen it's been a pleasure being beingco-chair It's been so much fun I hope tosee you all at future CubeCons and cloudcons Yep and I'm thrilled to pass overthe torch to our newest co-chair AbbyBankser[Applause]Thank you so much Casper You've beensuch an amazing influence to thecommunity already and I look forward totrying to build on that and specificallyto work with you Facil and Joseph onfuture events And I know we'll see youthere Speaking of our future events wehave both Hong Kong and Tokyo which arehappening in June followed by Atlanta inNovember Then next year we haveAmsterdam in March followed by LosAngeles in October And finally in 2027we have Barcelona in March Yes we willsee you at all at one of these eventsOnce again we truly appreciate you forjoining us for the keynotes and thankyou once again to all of our speakersand sponsors who have helped make thisevent happen Have a great final day ofthe conference and we look forward toseeing you next time Thank you everyoneThank you[Music]2025-04-15 21:57:38.679265�<#��/AGyvARSG3_wshi everyone i'm here today to talk toyou about science and cuberettes ipresent for you science at light speeedcloud native infrastructure forastronomyworkloads so in this talk I will presenta bit about who am I and a bit about thecollaboration that I'm part of and thenI will present the infrastructure andshow you a quick demo about an astronomyworkflow and at the end someconclusions so who am I my name isCarolina i have a background in highperformance computing and cyiabilityengineering and my first gone was in2020 when I started learning aboutkubernetes and I'm working as part of acollaboration with 12 other teamsand explain to you a bit about how thewhole project is setup so to begin with there is SK thesquare k array and it consists of threesides so there is one site in SouthAfrica which is the on the left handside where there are about 200 dishesbeing built up to 15 meter in diameterand we do observations up to thegigaherz rangeinAustralia is layout is about 13,000small antennas being built and they areorganed in circular stations and it willdo observations in the Maherzrange and as you may see this is aradioon telescope so it will not beobserving visible light but ratherreceivingH�:#��+AjUChVGvSB5ghello everyone and welcome to the lastday of CubeCon and Cloud Native Con inLondon my name is Katie Gamanji and I ama senior engineer at Apple and I'mRicardo i'm a computer engineer at CERNand today would like to share the latestinsights into the TOC work since lastCubeCon in Salt LakeCity the TOC or technical oversightcommittee is a technical body within theCNCF we aim to provide a vision andsteer the overall technical landscapewithin CNCF at the same time we try weaim to provide clear guidance for ourprojects to reach the maximum maturitylevel possible we have 11 members withinthe TOC and Karina is our new TOC chairand we're also very happy to welcomefour new members to our committeDI different radio wavelengths oflightand these received signals can then besynthesized into images andvisualizations but the actual raw datathat is received issignals and in total we are 12 memberstates and spread in the word as Imentioned I will show you a bit in thenext slide and well when this fulldevice is built it will receive up to600 per yearso that's quite a bit the store betweendifferent partnerinstitutes and to drive this huge effortas you see we have 12 member states inthe project from all over the world andin different time zones also and inaddition we have some partner states andobserver states that we're collaboratingwith and here you can see the thedifferentsites and some of the research topicthat the SG will be used for when it'scompleted is for example the cosmic dawnformation stars andgalaxiesreionization puls science solar physicsand so on and at both of these sites thesmaller there can be small groups ofantennas formed and linked together tosmaller subsets which makes the devicequite versatile for different types ofexperiments and observationsand how will this data collection thenbe done so there are direct links fromthe telescope sites into the localprocessing sites so for example at SouthAfrica there will be up to 8.9Australian persecond processing centers thatdata telescopes rather in the samecountry and the output from this localprocessing is called observatory sciencedata products and data products willthen sent out to SRCE via 100 gabitlinks and as data products will thentransfer all over the word andreplicated globally for the scientist touse and what is thissrcnet so as I mentioned before we willcollect data with the SKO theobservatory and then it will be storedand distributed into theSRCE so it's madeshared service layer offer federatedservices scien can use to distribute andcreate new dataproducts and as a reminder these dataproducts are basically data sets withobservation data that the scientist canuse for theirresearch and both the storage transferofs arechallenges partof the distrib nature of the srcet anddata replicating capabilities are meantfor avoiding for example storingexcessive amounts of data at the singlesite but rather distribute it to be moreavailable and I'd also like to point outthat as far as possible we want to usepreexisting technologies and we'retrying to avoid building our ownsolutions as far aspossible so I will explain a bit more indetail what is SRC net site what does itlook like so to give an example I workfor the Swiss SRC and there are 14 otherSRCs globally so in different countriesand the work we do at each SRC isbasically prototyping and testingdifferent technologies and thenselecting and comparing them and sharingthis knowledge with other SRC netsites and we're working towards aunified service layer that will run ontop of heterogeneous infrastructurebecause in each side the underlayinginfrastructure is completely differentat each institute but we're trying tocreate something that is unified on topof thatspecifically for Switzerland Swissregional center part of Scotch which isthe local organization that iscoordinating all the contributions tothe SKO and as you can see sketchincludes the different scientists andpartner institutes from all overSwitzerland and how did different SRCthen connect so we have chosen to have acloud native SRC service layer on topunderneath eachsfrastructure accessed workloadexecution framework will be shared andthere has of course been some challengesto this approach such as the quickinvolving ecosystems around kuberettesand running hpc oncubernetes on the other hand by usingcloud approach we can also the knowledgewith other teams and share theconfigurations between the differentsites so that's verynice and I'll now show how the Swisssite is set up how our infrastructurelooks like and as a reminder it'sdifferent in every site but this isours so how does it look like this isthe underlying Cuberus layer that isprovided to us byCSCS and CSCS is the Swiss nationalsupercuting center cscs deploys developsand manages this carbon structure forus and they have a supercuter calledAlps and we are receiving some of theseAlps worker nodes configured into aVcluster which is a novel technologydeveloped by CSCS for splitting thesupercuter into different tenants foreach customer and the Vuster here isshort for versile clusterlinkerhereinterest toy a new cluster first theworker notes are configured into clusterand the use the system management apithe csm apito create a list of the notes that areincluded in the cluster and this list ispassed oncross andy infrastructurecrossing infra definitionsrepository and then it will deploy theworker nodes the harvester vms thecontrol plane the worker nodes usingterraform and after that an is used toconfigure the alps worker nodes and jointhem to thecluster and this is quite easilyreproducible for them because ofleveraging crossplanwork nodesneed flexible and we very happy to usesetup and what layers have we deployedon top of this then so here I wanted toshow what we have added at the Swiss SRCand it's all hosted on GitLab and I alsowanted to say a hugethank the open source community formaintaining and contributingefforts thanks a lot forthis and our deployments are based onRCD and the up pattern and I added alink to our repostory if you want tohave a look at that and if you seesomething to improve you're free to opena ticket and reach out and I also addeda diagram here to show a little bitabout how the services are connected toeach other for example how themonitoring stack is set up or how thescience applications are communicatingwith eachother but this is also a snapshot aboutwhat technologies we're using at themoment and as I mentioned before we areconstantly trying out new technologiesand this keepsevolving and some of the components arealso great out in this diagramCSy ourselves for ex storage and networkcomponents and now show you a quick demoof the astronomyworkflow first you will see the sciencegateway and here I can search for somecoordinates that I'm interested in forto do some research at this point so Iwill find some data and I can choose oneof them that I'm interested inso for example this onehere and it's stored in the SRC sites onthe left hand side as you see and it canbe moved for further processing toanother SRC site and I'm going to moveit to the Swiss productionenvironment and this data set will startstaging and when the staging hascompleted when the data has beentransferred I can startI will open the scienceportared a Jupiternotebook and here I will run an examplecommand to cut out a small piece of thisdata set so you can imagine cutting outa small piece of the sky in here andthis will create a new outputfile which I then can use anotherapplication called karta to open it andvisualize it to see how it looks likeand this is an example of simulationdata that shows what could be observedwith the SK observatory when it has beenbuilt and now for some quick conclusionsso as I mentioned before both of thesesites are in constructions and we'redeveloping the SRC net to be able tohandle all the data that will come outof the device when it's readyand there is already a first imagegenerated with SKO data in Australiafrom the Australian site so you can readabout that in the press release that wasmade just a few weeksago and we have some small milestonescoming up so we're starting to testscience use cases on the shared SRCETinfrastructure to make sure that it willwork and science verification will beupcoming in 2027 and early operations ofthe telescope will start in2029 and the advice is meant to last for50 years so that's quite a lot of timeespecially in the kuber world whereeverything is changing but it will workin observation cycles of six months eachmeaning that there will be specificrequirements for the software that isused during these periods and regardingcubernet infrastructure we want to keepit up to date as much as possible andduring these years we of course count onthe collaboration with the cloud nativecommunity thank you so much on behalf ofeveryone fromcollaboration[Applåder][Musik]2025-04-15 21:57:37.952934Kneed to be extra um on topof things to be able to exploit thatdata and not just spend money storingstuff that we don't carei think that we all uh can see how thelike the picture that we get in the inour heads like the joke about the umspherical co uh cow in the vacuumchamber chamber you know we think aboutlogs we think about metrics and tracesmaybe and we have this picture where weare going to analyze the data we aregoing to extract value from it and weare going to be happyhowever reality is not a vacuum chamberi guess um I guess that's good but theproblem is that things are usually verychaotic and they are growing even morechaotic each day so we get the logs wegot the metrics we get the traceseveryone has a different standard someof them have their own standards some ofthem come from like the olden days anduh we need to make sense of that to beable to extract value otherwise it's itjust doesn't make any any any any reasonand I think I'm not the only one whoprobably feels overwhelmed when it'swhen one faces this situation it's likewhat am I supposed to do with this rightso we kind of look the other way and letit pile up the problem is that we arestill still paying for it rightand part of these of these uh of thesolution to this isactually making sense of some of thisinformation or processing it in the wayright we want to consume the informationin a way that makes sense we want toprocess it so we're not sendingeverything everywhere and we want to doit at a lower cost cuz we are not likethat's not what we want to pay for whichis why flu bit is important that's whatwhat we are trying to do we're trying tohave this system that you can use thatdoes not in uh induce a substantial costwhen it comes to the the processingrightso as I said before u we come from avery fragmented[Music]um land I'd say back back in the daywhen we only had just a few collectorswhen we were talking aboutlogs all of that or most of that wasvendored so it was it was kind of hardto to get something that would do whatyou want the way you want and that'swhat where fluent bit comes from theidea of flimbit is to have a a systemthat's vendor agnostic that can geteverything you want in everywhere or toeverywhere you want itit is very important for us to have avery wide uh set of uh integrations whenit comes to both inputs and outputs soyou can get data in the format that youhave it and get it to the point that youwant in the format that you that youwant it so you can exploit it umproperly and that's that's somethingthat I feel that we have achieved to behonest we have a very um extensive setof integrations we have support for foreverything ranging from obviously logsmetrics and traces we can get them fromthe local system we can get them from Iwould say probably legacy systems we youcan plug your syslog there you can getit from open telemetry endpoints you canget it fromPrometheus there's a lotthere so as I mentioned uh when Flip Itwas born it was born out of necessity itwas not just trying to reinvent thewheel in a way that we thought was theright way right 10 years ago we the theindustry needed something that was uhlightweight that didn't uh have overheadwhen it comes to CPU to memoryespecially when it came to the IoT areawhere we didn't have a Raspberry Pi inevery single clock right so we we didn'thave gigabytes of RAM we had kilobytesof RAM and we didn't have eight coresmaybe we hadone few megahertz so that was the pointor one of the main drivers of flip backin theday obviously the IoT craze did notreally live much longer but we found thenew home with the Kubernetes um crowdright in in Kubernetes even though youmay have some resources to spare youstill want something that will not raiseyour bill you want something that's fastyou want something that's uh that has avery low memory footprint you wantsomething that that can do what you wantat the minimum costpossible and that's what we try to dowith Flimbit and we tried to do it in away that does not lock you into anyvendors or any platforms that's to me asa maintainer that's one of the the mainobjectives toL cater to the community wewant we want you to be able to useFluent Bit in any of the ecosystems thatyou have we want we want you to be ableto integrate with any of the otherprojects that existi am of the opinion that every projecthas its value and its place and I liketo play along with them i don't thinkit's about you know uh taking theirplace i think it's about integrating ithink it's about the user and the userhas to choose what's better forthem so Fluent Bit is part of the Fluentfamily right fluent bit and fluent Dthey come from the same place and theyare uh under the same umbrella they aregraduated under the well as part of theCNCF i will talk a little bit more aboutthis later but it is important to toknow that we are trying to improve thesestatus so things are clear for the userand uh that's that's something that'simportant to be honest it's not my focusi'm not the right person to talk aboutthat i'm more of a tech guy so I Ithought I would mention it butmaybe that's something that Eduardo willactually expand in his webinar becausehe is the project leaderright so going back to to Fluent we havethese uh very wide uh arrangement ofintegrations you can you can use syslogyou can do uh local files you can dokubernetes like for example what I thinkone of the main use case for thekubernetes crowd is uh setting up flitas a pod or as a sidecar and taking thelogs from their node and shape uhshipping that somewhere all right but wealso have the Kubernetes events plug-inwhich can uh allow you to get some moreum insight on your Kubernetes clusterand we also have the Kubernetes filterwhich allows you to enrich those logsthat you are capturing with informationabout your cluster about your servicesnamespace and all of thaton the other end of course you have theflexibility to store that in SplunkElastic Search uh Posid Data Doganything you can send it to Chronospherewhich is the company that now umsponsors Fluent Bit uh or you can sendit to any other point Ashure AWS StackDriver there's lots there's I'm quiteconfident that there's something forevery need in here and if there is not Iwould like to hear from you becausethat's what we um what what we think ismost important to gather to thecommunity i want to hear from you so wecan improve and extend ourselection as I was saying fluent bit canrun in a in a differentum arrangement or a wide variety ofarrangements right so in this case wehave fluent bit running um in eitherbare metal a container or a virtualmachine we have packages for theoperating systems we have containersum and we we even have a free freeBSDport which is not Linux but it'sinteresting right it surprised me when Isaw it so we have Windows installers aswell soGDU so we have a very interesting andflexible routing system in Fluent Bit ithink it's one of the one of the verystrong points in Fluent Bit you candecide where your data is going in avery dynamic way you can even do it inruntime based on when where where thedata is coming fromand I I think that's one of the verystrong suits of fluent bit uh in this inthis example we have a few um instancesof flu bits set up in a rather simpleway but then we have another one that'syou know a little bit morespicy and in the same way we we can runflit as a um aggregator i would say uhyou can run fluent bit you you do nothave to collect logs metrics or traceslocally you can also receive them fromother systems you can receive them fromother fluent bit instances or you canreceive them from any uh hotelcompatible producers or even aPrometheus endpoint you know maybe youyou have an application that has aPrometheus endpoint that you need toscrape or an application that producesopen telemetry traces or even a legacyFluentd system that's that you cannotreplace right so you can use Fluent Bitto get all of that information in acentral place where you can process itin a very agnostic place it doesn'tmatter where your metrics are comingfrom once they are influent they are allthe same you can do all of the all ofthe same processing for metrics thatcome from Prometheus metrics that comefrom open Mtelemetry or even metrics thatyou have converted yourself using blockstotric plug-in which I think it's verycool uh log tometrics plug-in is aplug-in that you can use to convertmetrics that are presented in a textualformat to a proper metrics context thatyou can act on and deliver to the rightplace and that's that's interesting idon't know how many of you are familiarwith tools like IO stats but that'ssomething that I would see as one of theuse cases for thisas I said interperability is one of ourmain focuses and I won't want to wastemuch more time on this so now I'll tellyou a little bit more about what's newin Fluent before so so you can seewhat's what to look for once theplatforms that you use uh update tobefore or once you update tobefore we have made some improvements inthe hotel processing um area we now havetrace sampling that's what that's a newum plug-in or processor rather that wehave created that allows you to downsample the amount of traces that you getthis new processor has two operationmodes one's called head sampling andthat's probabilistic which means thatwhat you're doing is you are definingthe probability of a trace uh gettingstored or discarded right in thisexample that I have here I don't know ifyou can sorry I don't know if you cansee the the mouse pointer but basicallywe have the spans here and given the 40%probability that we have set up in ourprocessor you can see that a bunch ofthem are getting dropped and this iswhat makes it to the to the end right ithink that this is interesting and thisis something that could be very usefulwhen combined with the other operationmode of the trace samplingprocessor which is the table samplingprocessing what this means is that whatyou're doing is you are setting up atime windowwhat the processor will do when you dothis is it will buffer the spawns thatare ingested until that window expiresand the idea of this is to ensure thatyou have a complete picture of the tracebefore making the decision on if youwant to keep it or not right so let'ssay that you want to that you have a aweb store and you want to keep thetraces when one transaction failsright that's something that you can dousing either the open telemetry uh thestatus field in the trace or maybe someof the attributes in the traceand in that use case I would say thatmaybe you are most interested in storingthe traces that actually give youimportant information when debugging aproblem because you want that actionableinformation as soon as possible to fixyour problems because time is moneyrightbut you have one hopefully smallpercentage of the traces that arerelated to failures and you also have alarger percentage of the traces that aresuccesses right and I would say that inthat arena I would use I would combinethe trace sampling the the tailtrade the tail trace sampling uhmechanism to ensure that we keep 100% ofthe failures with the head samplingprocessorum to ensure that we keep a healthypercentage of the successes because youstill want to keep a poll on how thingsare going right but you don't want tostore all of the successes becausestoring those costsmoney in B4 we have alsoum added a new feature to processorswhich is called conditional processingwhat this allows us to do is todetermine if we want a piece ofinformation to be to be actedon by a processor given a certain set ofconstraintsthat's interesting because umpreviously in the original pipelinemodel you would have the input thefiltering stage and the output and ofcourse you could do that selection ofwhat you wanted to process which witheach filter using the routing systembut the processing umstack is meant to allow you to scale toa much higherdegree by attaching the filters andprocessors directly to the inputbefore or beforeB4 you were not able to determine if oneelement of the processor stack acted ona bit of information and that was kindof a limiting factor when trying toleverage the um benefits of theprocessor stackhowever with this new feature of theconditional processing um stack you canhave the same flexibility that you hadwith filters with proceNssors and thegood part is that this decisionon if a piece of data is supposed to beacted on or not is not just based on theon the tag on the routing informationbut actually on the context of theinformation you are choosing whether youwant to act on the data or not based onthe data itselfanother improvement that we have made isuh we have added some options to our TLSlayer to allow you to set the minimumand maximum TLS versions that you wantto allow Flu and Bit to interact withbecause in some specific um deploymentsyou might want to ensure that you arenot interacting with older versions toprevent and uh downgrade the tax and andsuch or maybe even you have a corporateumrequirement the same goesfor the same goes for the cipher suitmaybe you want to ensure that you arenot allowing uh your the other endpointsto force you to negotiate a weakercipher that can be um that hasvulnerabilities or can be broken in anyway and that's the idea withthis we also have introduced um a systemto ingest environment um to ingest filesthe contents of files from the filesystem in so-called environmentvariables in your configuration thiswill allow you toum not have to hardcode some secrets ofum in in your configuration or configmaps but being be able to actuallydeploy them on files and I think for theLua scripting use case it's interestingbecause um you should be able to loadyour Lua scripts from the file systemusing this as well which should makeyour config maps much much tidier whichIlike we have also introduced um Zigintegration i don't know how many of youare familiar with the Zig language it'sone of those new languages like Rust butI think Rust is much more focused on thesecurity aspect sik is more of a lowerlevel uh system and it's uh focused ononperformance as well as security but moreon performance itself at the moment weonly support zigg for output plugins butit's it's part of our uh road map toextend that to both input pluginsprocessors and filters oh and customplugins but I will speak about that inin asecondso that that's about it for the for thepresent and for the future uh what weare what we intend to do is we want toextend our integrations we want to haveproper idiomatic native integrations forum for all of the current languages forRust for ZIK for Go and we want them tobe fully featured to we want you to beable to write your plugins using uh anyof these um languages in an idiomaticway and we want you to be able to writeinputs processors filters outputs andcustom plugins and I would like to makea note on this because some of you maynot know what custom plugins are thoseare plugins that are not exactly meantto manipulate the data in the pipelinein the pipeline those are plugins thatare meant to do other things like fleetmanagement or maybe TLS certificatemanagement uh there's I would like tojust tell you about something that Ithink is really cool and I would like Iwould really love if someone from thecommunity got involved and created aplugin like this like for example toimprove the way in which TLScertificates are handled like there's uhI'm sure that you're aware that there'sanother project in the CNCF called umsearchmanager and if I'm correct one of thetrendsin in the scene is to to to have lowerlonged uhuh shorter lived certificates so I wouldlove to see a feature in Fluent bit tointegrate it with search manager to getuh short-lived certificates on eachdeployment rather than having to deploythe certificates as part of the theconfig mapso that that's one of the things and wealso want to introduce the possibilityto have parallel pipelines in a singlefluent bit uhinstallation but I think that if youattend the the webinar that the driveris going to give on the 24th you willget a much better picture of the futurethan I can paint here so with that saidum if you have any questions I wouldlove to address them uh and we're donejust[Music]check okay questions pleasemake them easy pleasethank you very much hi thank you um soyou talked about uh having a lot ofnoise and not necessarily wanting toship and keep all of it uh in whataspect do you think uh fluent bit canhelp with that could you please speak alittle bit louder louder yeah pleaseokay i was trying not to speak too loudum in what ways will you help me notship and keep a lot of redundant datawith fluid bit or do you think it willin the future no uh well one way wouldbe with this new field um processor fortraces you would use it to to discardthe data that you uh do not need rightthe the conditional sampling yes theconditional sampling and when it comesto logs you can uh there are manyfilters for the for the actual logs youcan uh discard parts of the logs you canmodify them you can ensure that you arenot shipping anyPII with with them in fact I would Iwould um combine the conditionalprocessing system with the contentmodifier processor to to do that that'sthat would be one one of the waysthere's plenty of ways to discard whatyou know that you do not need and andyou have uh also you have metrics inBloombit as part of the system we havemetrics maybe that can help you defineokay can I can I have variables in theconditions for example count to 100 andthen stop shipping something somethinglike that like thresholdsum I think you can but um I'd like to toknow more about that use case so maybeyou can join the the Fluent Slack serverand we can have a conversation aboutthat because uh if if that's notpossible at the moment I would like toknow so so we can implement it thank youit sounds usefulthank youhi um so with the I guess new featuresin Fluent V4 does that make Fluent D abit more redundantor I think it's not about the newfeatures in V4 i think it's and I alwaysI always preface my uh my my answer tothis question with a honest umdisclaimer i don't like to speak ill ofother projects and I could be wrong okaybut I think that Fluent D is more in amaintenance state and not actuallyinnovating i don't think FluentD canhandle metrics or traces and I don'tthink they have any um plans to uh adoptthe open telemetry model or to you knowkeep innovating in that in that state sothat's in my opinion that's the factorthat drives the deprecation of fluentbit of fl in favor of fluent bitokay thank youhi um correct me if I'm wrong butconsidering the new uh processing uhcapabilities for logs would you say thatthe custom L scripts are now discouragedso to speak because some of thoseoperations you could have done alreadyusing those righti think that there's a place and a timefor everything lua scripts are somethingthat you can add to your configurationwith minimal overhead it won't take youlong all you have to do is write the Luascript and put it in your into yourconfiguration if what if you want towrite a custom or a plug-in regardlessof if it's a processor a filter an inputor an output you will have to write thatin the appropriate language and you willhave to build that and it's I'm notsaying it's a very long pro um very longproject but it's going to take youlonger than just writing your Lua scriptto the configuration file and in myopinion uh those lu scripts are notreally a problem because those are jetcompiled so they are fairly fast um if Iwere to to do that maybe I would startwith the Louis script and then I I wouldtake the time to write a proper pro uhplug-in to do that because yes of courseit's going to be faster i don't know ifuh I I think I didn't mention this uhspecifically but at the moment we have afew integrations right you can writeyour plugins using C which probably noone wants you can write your pluginsusing go you can write your pluginsusing anything that builds to wasamwhich means rust for example and you canbuild your your plugins using zig so andscripts of courseblue I swear I didn't silenceyou wasn't mei can just speak right now yeah surecome herefairly simple right uhhuh so how wouldyou now recommend to do this conditionalconsidering that you could have hasalready done by scripts i think that ifyou can use the like the regular stuffthe the built-in soft the conditionalsthat's going to be much faster becausethat'sYeah of courseand I think we are done because they cutour microphone2025-04-15 21:57:39.347685 � ��( #��AAe2-LNHtUr8thanks for coming to this presentationuh I'm a bit of a stickler when it comesto starting on time so for those who hadto wait I hope you took uh theopportunity to take a picture of the ofthe webinar that Eduardo the projectleader is going to be giving in about 20days that one's going to be pretty goodand it's going to be a bit moreextensive than this oneso this presentation is going to talkabout the the history like where we comefrom where we are right now and wherewe're goingthe idea is to to give a a good completepicture for those of you who are not umaware of what flame bit is or where weare and uh for those who who are and arelooking forward to version 4.0 zero wewhich we released on Monday uh I'm goingto talk a little bit about the newfeatures uh we have just introduced andthe the ones that we would like tointroduce in the new fe in the nearfuture and the longterm as you probably know thetelemetry arena has been uh evolvingthrough the past 10 years well probablymore but but the the trend is thatthings are growing exponentially we'regenerating more and more data uh andmore and more types of data uh a mere 10years ago we were basically just movinglogs around and nowadays we are doingeverything from logs to traces toprofiles so things are only goingto grow and get even more hectic whichmeans that we JQsteering committee has electionsevery year where the entire Kubernetescommunity the active contributors theyelect uh three or four members uh eachcycle to represent the community in thesteering committee and the steeringcommittee um essentially then uh dele wethe community delegates the to thesteering committee the power to uhselect members of the code of conductcommittee uh the code of conductcommittee is a group of five individualsand they also rotate like uh every everyyear with a two-year term per member andwe do have a product security committeeso if you ever got a CV report or yougot a security flawu you might have gotten communicationsfrom the product security committee theyare in charge of making sure uh CVS aretaken care of they are making uh theythey make sure that any kind of incomingum security uh flaws are responsiblyhandled and they have a process to makesure they get in through the embargoprocess and the patching process so thatum the final product is always secure umamongst the SIGs and the working groupsthere are we kind of like logically orvirtually like uh um segregate them intothree kinds like um what uh specialinterest groups which take care of theproject level responsibilities wherethey maintain different aspects of theproject they cut across basically uhmaintaining the project like VR uhcontributor experience then cigar uhrelease docs testing and kis infra thenwe have the horizontal sigs they arebasic basically cutting across thetechnical areas horizontally of thewhole kubernetes project and then thereare vertical sigs for uh feature areasor very tight scoped vertical areas ofthe project and that's how essentiallyway the community is structured thereare uh additions deletions to the groupsbased on requirements if a for examplelike uh working groups serving devicemanagement um they all came up in thevery recent past in the last year whenum the need arised of talking about suchtopics in the Kubernetes projectnow you might wonder how you as a newcontributor get into the whole ladder ofuh becoming a maintainer so you startoff being a non-member contributor youcontribute to any areas of the projectyou don't need to be a GitHub uhorganization member to startcontributing then when you havecontributed let's say a few PRs orcontributed substantially you can askthe reviewers or people who have whohave been working with you in yourcontributions to sponsor you to be amember of the org then you become amember contributor and as you keep uhworking on areas of the project youreview more code you um like add newfeatures or work on other differentareas like it's not just about code it'salso like the project management areasit's also about like maintaining theinfrastructure you grow up the ranks tobecoming a reviewer then approver thenyou become a sub project uh owner thenyou become a sub project lead and at theend if you want to take up like moremanagement responsibilities of thespecial interest group itself you can uhopt to become a SIG chair or a tech leador a working group lead if you are uhleading a working group towards acertain goal and uh this process takessome time as well this is all based onlike how people are contributing howpeople are reviewing code it's based ontheir code review quality or workquality like how how many blog posts areyou writing it's not always likequantitative as well it's veryqualitative process of growing up theladder now where does the contrax fit inwe have talked about the community wehave talked about growing so Sik contribessentially is responsible for improvingthe experience of who contribute to theproject um we do that by um creatingmaintaining programs and processes thatpromote the community and uh the membersby trying to reduce any friction ofcontributing and often like we also tryto retire the programs which have servedtheir purpose and are not serving theirpurpose anymore and whenever like uhthere is a new area coming up if peoplewant like a different kind of uhinitiative to improve uh the contricontributor health of the community wedo take up those programs as well nowRhow do we do those um our work istechnically segregated into uh a few subprojects uh to start with we have thecommunity sub project which owns thekubernetes/ community repo and anydocumentation or groups associated withit then we have the contributor coms soif you uh have been following our socialhandles or have been reading our blogson either the kubernetes.io blog orkubernetes.dev dev blog that's the teamwhich is shipping out all this contentum any sort of announcements any sort ofuh um like reminders all of them gothrough the contributor comps then wehave the contributor documentation theyessentially manage the contributor guidethey manage the kubernetes.dev websitefor Kubernetes contributors then we havethe community management group whichmaintains uh policies for differentcommunication platforms like we have uhdiscuss we have slack we have uh githubalthough github also has a separate subproject to maintain the overallconfiguration and uh the repo healthwe'll come to that later and then wehave the dev stats sub project so if youwant any statistics about the kubernetesproject the contributions secreted bythe geography or the company whocontributes uh there's a def strats uhsub project who maintainsthat elections uh we talked about twospecific electionsum the election sub project they own atool called eleto it's open source uh weused to use a different pro likeproprietary tool earlier where it was alot of friction to create elections runelections then the community decided tostart uh a new initiative to create aproject to actually do elections so ifyou go to elector.dev you can see uhthe documentation of the tool that weused to do elections then there is anevent sub project so uh the kubernetescontributor summit which has beenrunning for like almost a decade uh wasbeing run by the events sub project andnow they're helping CNCF in running themaintainer summits um I talked I talkeda little briefly about the GitHubmanagement sub project they are supposedto uh maintain the org membershipprocess the whole repo management thingwe don't own the tooling as such but weown the configuration and any um likeany discussions around the configurationof the uh whole community then have thementoring sub project so you might haveheard about programs like GSOC LFXmentorship then we have a few mentoringcohorts that we do inside the communityto grow new contributors into reviewersor existing contributors intomaintainers of specific areas um allthree of us or um two of us at least Iknow we came through the whole mentoringprocess um of becoming a lead of sikcontrib so we started on like we weredoing a few things around s contrib likeMario was doing uh contributor comsrelated areas i was doing GitHubmanagement and uh other u tech technicalareas of the project uh Priyanka wasalso working on GitHub management andother areas of the project and we allwere mentored by the previous set ofleads to become u the leads of uhcontrax then at the end we have theslack info project it's very uh um akinto like when you u try to join theKubernetes slack there's an inviter botand then we have some groups managementum this all requires some amount ofeffort in maintaining the tooling aroundit um it fortunately the slack API hasbeen so stable that we did not need tochange even a single line of code forthe five last five years and we don'thave maintainers for that as well uh tothat extent for now okay so I'll give itto Mario to talk about the next partsyeah so basically also if you know SlackAPI you know what todo so uh yeah as Nun already mentionedum some history so for the last 9 yearsand we have now 10 years of Kubernetesuh oh no 11 in June in June we get 11 umso we can soon open our Twitter accountagain with the birthday uh however um sowe have for 9 years we did in the pastuh Kubernetes contributor summitscontributor summits is usually one daybefore CubeCon or sometimes it was atthe colloccated events day where all ofthe contributors of the Kubernetesproject come together and talked sitdown had unconference sessions hadnormal sessions uh that socializedbeScause as you saw we have a hugecommunity of people that want to worktogether and join the Kubernetes projectso the problem was this is alwaysKubernetes focused so this was onlyKubernetes and Kubernetes is it's a bigproject in the in the community but it'snot alone right so we have over 200 newprojects inside of the CNCF and the ideawas that we get to hey we want to have amaintainer summit which means that wehave maintainers from all of theprojects which means that we have peoplefrom sandbox projects that we havepeople from uh incubating and graduatingprojects um so the the higher you are inyour uh in your life cycle of yourproject um the more people you canbasically send to maintainer summit sowe also include some other folks insideof the maintainer summit which meansthat there's a technical oversightcommittee which is part of the CNCF thisis basically the they make technicaldecisions on um on a level acrossprojects and it's always good to havetheir input as well we also haveincluded tech share so technicaladvisory groups so that we have like oneday where each of those different groupscome together and talk about hey what dowe need to do to foster our all overproject environment and work togetherand this is now the goal of themaintainer summit that happened thefirst time in India last year and um thesecond time now in Europe on on Mondayand we had in on Monday we had360 maintainers from across all of theprojects coming together and yeahworking on on stuffso we also want people to understandmore of the ecosystem because a lot offolks just create a project the projectbecomes eventually part of the CNCF anduh we want to they don't know what theCNCF can help you with and they alsodon't have the understanding sometimeswhere they fit in because they arecompletely new to the to the communityand that's also why we basicallyincluded all of those uh technicaladvisor groups and uh TOC's in this sothat we those projects can learn andalso that projects can learn from theother more graduated projects so we hada lot of sessions done by Kubernetesmaintainers to basically explain thingslike hey we have social media channelsin Kubernetes maybe social mediachannels is also interesting for yourproject so like educating or helping thethe smaller and the newer projects tofind their place in inside this wholebigecosystem for it's hard to create anevent for the Kubernetes community daysin the past we had uh we had our ownteam that was the CN the events team inthe uh in the in our world where webasically had like part of Kubernetescontributor experience experience thatjust made the maintainer summit theproblem is you need to m you need tohave a lot of people to run an event andall of those people did it in their freetime and it also took like a lot ofthings away so we basically included umwe now have the help of the CNCF eventsteam which is great because they help uswith all of the logisticsum we also we still have volunteers inuh contrib that do like communication tothe internal uh project and also to helpuh with the program committee becausethe kubernetes is as the biggest projectgets its own track so uh we have threechairs that are overseeing the wholemaintainer summit that are selected outof the community this year it were uh itwere two people from the kubernetesproject and one person from the TOC sothat we also have like from a chairperspective like the spread across thedifferent projects now we are lookingfor new chairs for the next year becausea term is always one year so we startwith the Kubernetes maintainer summit inEurope and we end with the uh maintainersummit in theUS so we have now also a formalized umCFP process so we go the same way likewe do with the with theum events for CNCF and uh the good thingis that we now can also havesponsorships so companies can actuallysponsor the maintainer summit thatpeople can come together so that we canhave more maintainers on thoseevents we also talked briefly about howto get into Kubernetes how to get intoKubernetes contributing and stuff likethis and uh for this we also created anew initiatTive which is cool which iscalled new contributor orientation newcontributor orientation's goal isbasically you saw the slide that Navaruntalked about with the these are the sixthis is the community structure theseare our committees that's a lot that's alot to understand and it's really reallyhard for new people like hey I go toGitHub and look for good first issue andthen I looked at the good first issueand the good first issue is not a goodfirstissue so we basically need to take orthe goal of this new um thing is that wetake the hand of people who want tocontribute and introduce them into thecommunity and show them how it'sstructured and show them how they canprocess so basically we have now ameeting which is every month on thethird Tuesdayum in two time zones so we do themeeting uh America friendly and uh alsoAsia-Pacific and European friendly sothat we have like all of the folks thatreally can join it is roughly one and ahalf hours so we have a 40 40-minutepresentation of content and 20 minuteswhere you can just ask questions we havepeople from the community joining and umanyone can attend there's no judgmentyou don't need have you don't have tohave any previous knowledge about itjust come to the meeting join it andlearn about the Kubernetes project andthere we can also help you to steer youtowards a sik to understand what sikwould be interesting for you tocontribute and stuff likethis presentation is welcome toKubernetes what is Kubernetes communitystructure we help you with a workflowum it's a we have a common contributorpitfalls as I mentioned good firstissues are usually not good first issuesthat's a problem we are aware of thisand uh we basically just help you tonavigate through all ofthis we trackstatistics we we went off with a greatstart and then it declined a little bitbut this is mostly on uh on our faultand I say our fault uh from a from acom's perspective ive because we haven'tfigured out a good way yet to do likeregular constantly communications butyou still can see that we have aconsistent intake of peopleum that that basically join thosemeetings and uh we are happy if you ifyou can join those meetings uh and learnabout the the Kubernetes project and wewill steer you towards it we will we wepromise we will be better incommunications to make it easy foryou we have a repository where you canbasically um have a look at statisticsand all of the stuff uh just go to theuh QR code and there you can basicallyfind all of the information about newcontributor or uh orientation and as Isaid we improve with content or weimprove with uh coms and content and wewant to learn about it but we also havealready a lot of communication that isbeing done and Priyanka will talk moreabout this yeah yeah um we are past ourtime so we this should be our Q&A butI'll try to cover the most importantpart because this is where we will tellwhere do we need help uh within SIGcontracts uh before I go to that umthese are the um kubernetes.io andkubernetes.dev blogs i I'll just I'lljust cover some stats on all the thingsthat we have been doing uh as part ofsome of the sub projects under sigcontrabix and sig contrabix uh as a sighandles uh these two blogs one of theseis handled in collaboration with sigocso on kubernetes.dev dev in 2024 i thinkby the end of the year we have more nowby 2024 we had 14 blog post and out ofuh uh in addition to that we had six sigspotlight blogs so if sig contrax isalso helping to understand about othersigs and you can find out sig spotlightsblog uh for that and on the other uh onekubernetes.io IO in collaboration withSIG dog sig contrib help publishing 45blogs in 2024 we have more we are inApril now i'm I'm pretty sure we we areworking on more and social media reporti'll just quickly run through them theseare our accounts i'll give you a fulllist in one screenshot but um Kubernetesis on LinkedIn as of from the timeKubernetes uh seek contribu contributorcoms took over this handle we had about147% growth in the new followers andabout985% growth in the post so about 141 newposts um Kubernetes Kubernetes io onxtheUse are the stats there and then wealso now have a new handle on blue skywhich um since November 4 2024 have6,000 plusfollowers and what I just talked aboutare our enduser facing channels we alsohave different handles for contributorfacing channels so on X we have handledKate as contributors the stats are herebecause we see the reds because we aremoving to blue sky and so we have areplacement or equivalent account now onblue sky which iskubernetes.dev um again since November4th in the contributor facing area wehave 300 followers and 300 plusfollowers and more we also haveum a handalone masterdon okay so this is this is the list ofall the uh accounts we we as sig contribmaintains us with the help ofcontributor coms project so I'll waitfor like 2 seconds if somebody wants totake a photo or you can find them in ourhandoutnotes we also want to thank all ouramazing contributors we can't thank themall on the screen so these are the fewuh we want to highlight based on therecent very recent contributions umthank you to Arpit arit is joining us asa one of the new YouTube admins arwinand uh Arwin helped us with uhmaintainer summit India frederrico thankyou Frederrico i don't know if you arehere in the conference thank you for allthe work you are doing in helping usmaintain our blogs both thekubernetes.io and kubernetes.dev devhandle Fica thank you for all yourcontributions toward LWKD and Saiak uhSak is one of the people who is helpingus bring our um infrastructure side ofour blogs uh bring bring all that backin shape and thank you to Wendy um forhelping us with the com side ofmaintainer summit London and there aremore people so thank you to a lot ofpeople now just wrapping up super quickthere are a few help wanted areas wehave number one is the Slack automationnow Harun said we do not havemaintainers um we do need maintainersthoughso take a screenshot um lot ofcommunicate uh a lot of actuallyabsolutely everything that we do as uhfor management of the community we wantit to be automated and this is one spacewhere we can usehelp who would be the right people to umtalk to I think you can reach out uh tothe Slack infra admins people in Slackit's mostly me slack infra or you canjust reach out to Nabarun yeah he wasthe one who last wrote those umautomation okay next up we have electtoi think we started receiving some helpon the electto side but roughly what weneed help with is this is a ingrowncommunitybuilt uh software that we runevery single year for two uh runs ofelections this um software is written inPython and Flask and we need help withum hardening it we need to uh rework orextend our testing so if you areinterested into the testing area and ifyou have ski your skill sets are relatedto Python or flask or you want to learnthis is the project where we need a lotof help and the people who you can uhtalk to about this would be you can pingus in sig contrib onslack.getkus.io or there is also anelections channel in slack.getkus.io youcan reach out to ushere next up we have contributor side imentioned Saiak um he is one of thepeople who is helping us maintaining theHugo um static site generator and doxytheme that helps us run our blogs but westill are relying on one person and wecan have we can use a lot of lot of helpso if somebody is interested in helpingus maintaining our contributor site orlearning alongside Saiak uh feel free toum just read what Sak is doing and thenmaybe reach out to us uh you can reachout to us in the any of the uh channelsI talked about SIG contraix onslack.kers.io or we also run bi-weeklymeetings so you can come on any of thoseas well finally mentoring uh mentoringis one sub project where we help otherpeople or other projects in thecommunity uh run mentor mentoringcohorts but we also need a structurethere we uh I think Sylvester is helpingus build a lot of structure already butwe could use a lot of more volunteersthere to help us run those cohorts so ifyou are into um shephering tasks ormanagement tasks this is one of theareas where we can use your help andfinally those are the open help wantedareas but we always have more at hawkstuff that is happening in the communityso please hop on to our any of thebi-weekly meetings one happen in a Apactime zone Apac Mia and another one is aUS friendly time zone uh you can joinany of those and and you will find atleast one of us on those call and manymore contributors yeah so we we do goover our sub projects one by one anddiscuss what is going on so and finallythis is a list of all the resources youcan definitely I my personal favorite isK community bookmark it if you are newto the community even if you are not newto the community you're hoping sigs oryou just want to learn what is what isgoing on K community is the resource youcan open u many of the directories thereabout specific SIGs specific workinggroups and you can see which area isowned by which particular SIG and so onum slack.k.io IO again is our Slackorganization Slack working space um youcan use that URL to join the communityand SIG contrax is the channel forKubernetes SIG contributorexperience kubernetes SIG contrax Ithink there is a a hyperlink there thatwill lead uh point you to the mailinglist for SIG contributor experience andthat mailing list is the way to join toour or get invitations to our bi-weeklymeetings as well and I think that's samefor NCO as well yeah and it's again wewe we truly acknowledge it's always nottoo easy to get started uh and it isfrustrating it was for me and I can sayit for most of my colleagues here so theonly thing is you just stick around youobserve other people and you you followum their footstep or you make your ownby observing them and if you need helpsik contrax is the place to ask for helpif we can't help you ourself we willmake sure we find you resources or wepoint you to the right people and thatis what NCO is for or our regularbi-weekly meetings as as well once againthank you for joining the talk and welook forward to working with you ortalking to you or meeting you in thecommunity open to Q&A yeah thank you[Applause]um if you have feedback for us do fillin the feedback form it will take you toa sket page most likely where you canread the session and then put in somecommentshello helloyep any any questionsyou you mentioned death studs ioccasionally go there just for fun it'sand and to justify my work towards mymanagement because it does help withthat so a good tool I know we also relyon it to track active contributorsuh for elections for example i rememberat some point there was a similar toolfrom theCNCF are there any plans to migrate orkeep or will both keep running for theforeseeable future devstats is also aCNCF project just Kubernetes projectmaintains the KS part of it and like thegraphs for the KS part um so there's devstats there's a global dev statsinstance for all of the projects whereyou can basically get the samestatistics for all of the projects andthere is a beta oh there's a betawebsite with more more stats for alsoproject health and uh project releasesand stuff like this so um both will staybut the focus of the both of both of theplatforms is slightly different um but Ican't remember the name where it is butit's somewhere on the CNCF website umh LFX insights LFX insights yeah thereyou can get more statistics on thedifferent projects and uh yeah dev statsis like more personal more companyrelated stuff for like if you want tohave really nittyritty details um Ithink I can add uh we did onboardKubernetes to uh LFX insights and we didfeedback like this is what we usedevstats right now for one of the usecases you already mentioned is forfinding out who who is eligible forelections uh or casting a vote currentlywe do not have 101 API from APIs fromLFX insights um but that was mentionedlike it was so early and we are tryingto get feedback from all the projectsand we'll work on that so we have tofollow up on that yes where we are um inthe migration or I did hear devstatswill be uh sunset if LFX insights umdoes everything what devs sets does butit is if it does which which is not thecase todayyeahany otherquestions then not thank you all forattending2025-04-15 21:57:39.966428 RCR��"#��;A6MrXcbcxnN4all right welcome everybody welcomewelcome still people filing in uh Ithink there's more room in the backthere but uh you know grateful to seeall your all's faces today so let's getthis thing going this is the crossplanetalk uh my name is Jared this is Nick uhwe are part of the leadership of theCrossplane project we've dedicated agood chunk of our careers now tocrossplane um so we're obviously veryexcited to talk about something we'revery passionate about um as is usuW��!#��qAksKOPx99rIEhey everyone good morning to the thirdand the last day of the conference umthank you all for being here uhlistening tous to talk about how we makecontributing to Kubernetes easy and whatwe have been doing for the past decadeum what challenges have we been facingwhat good things we have been doing andhow do we plan to uh manage theKubernetes contributor community and theprocesses for the next decadeso um I am Nabarun i am one of theco-chairs of Sik contributor experiencei uh maintain a few other areas ofKubernetes and contribute to this otherCNCF project calledKCP yeah hi everyone i'm Mario i am alsoone of the co-chairs of contributorexperience in Kubernetes and uh I'm alsodoingfinetunes projects and some CI/CD workfor all of the CNCF projectsand hello everyone my name is PriyankaSagu I work at Souza and I'm one of thetechnical leads for Kubernet u segcontributor experience and we do nothave two people on the stage today soI'll also introduce them we have Kaslynwho is also one the co-chairs for SIGcontribuum who also is one of my fellowtechnical leads herecool thank you forintroducing so who who do we makeprocesses for so here we have a photofrom Kubernetes Contributor Summit NorthAmerica 2024 where you can see um asmall chunk actually of our wholecontributor base we do have a lot ofcontributors contributing toKubernetes um as of um last week we had90 about 95,000 contributors toKubernetes at any point in time in thelast 11 years of the project um existingthey contributed over 4 and a halfmillion uh contributions includescommits reviews everything it includesall activity across the project's reposand um their contributions have beenheld by around 8,000 reviewers in thehistory of the project um and thesenumbers are like quite uh growing iwould not be surprised if we reach like100k contributors by um start of nextyear or end of next year because thenumbers are like quite rapidlyincreasing having said that how is thecommunity structured so the Kubernetesum is a really huge project and tomaintain such a project we need to havecertain groups um we need to have astructure so we basically have threespecific kinds of groups one is in darkblue which which are the specialinterest groups contributor experienceis one of them uh we have in light blueum working groups they are supposed tobe short-term um groups who have a veryspecific um goal and they have an exitcriteria and they are supposed to wrapup at the end of the uh cycle byreaching that uh success criteria andthe work will be folded into one of theuh sub projects one or more of the sorryone or more of the uh special interestgroups sub projects and then the thirdtype of group that we have arecommittees they are mostly elected umthe PXal forCubeCons we have a very diverse range ofuh like experience levels withcrossplane so some people are running itin production some people have not evenheard of crossplane before and they'recurious about what it is so we do haveto start this talk with 10 minutes ofbasic introductory stuff the people themore crossplane pros might be a littlebored with that but I guarantee you thesecond half uh is going to be quiteinteresting as Nick is talking about thethe future of crossplane a v2 ofcrossplane very exciting stuff youhaven't seen all right let's go so thebasics what is crossplane so we like tothink of uh crossplane as a cloudnativecontrol plane you basically use it toprovision and manage all of yourresources uh you can take thoseresources you compose them into higherlevel abstractions and with thoseabstractions you offer those to yourdevelopers so that they can self-serviceand provision and get the resources theyneed uh Kubernetes is a great controlplane for containers crossplanebasically extends Kubernetes and teachesit how to manage everything else umcontrol planes are not a new conceptright cloud providers have been runningcontrol planes in their backends foryears uh but now Crossplane gives youthe framework and the tooling that youneed to build your own control planeyour own platformso the basic building blocks here wecall these managed resources incrossplane uh so think about all thedifferent cloud providers out there allthe different SAS offerings on premisessoftware all that stuff there'sthousands and thousands of differentresources and services out therecrossplane aims to bring those into theKubernetes control plane and allow youto provision and manage them from thecontrolplane in practice what that looks likehere with a little practical example isimagine an S3 bucket so this S3 buckethere uh crossplane is representing it asan API object in the Kubernetes controlplane so just like any other API objectit's going to have a spec and somedeclarative configuration that youbasically specify your desired state forthis bucket and then crossplane will gotake that desired state and apply it tothe real world ending up with an S3bucket just like any well- behaved umKubernetes API object they're going tohave a status as well that's going togive you like the observed state of thereal resource out there in the realworld the conditions um there there'sgoing to be events that tell the historyof the life cycle the you know what'sgoing on with that resource so they'reall well all these you know thousands ofresources in crossplane that are in theKubernetes control plane are wellbehaved Kubernetes API objects how doesthis work probably like you expectimagine taking uh this bucket manifestas a user and you apply this S3 bucketmanifest to the control plane to the APIserver probably through GitOps orsomething like thatum the or there is a set of controllersin crossplane that are watching andactively reconciling these resourceswith the real world so the S3 controllerit will see get an event from the APIserver it'll see that somebody has adesired state of a bucket it'll use theAmazon API to talk to AWS and provisionthat bucket out there in the real worldand matching the actual state with thedesiredstate so we've got the building blocksthe basic building blocks in crossplaneand these managed resources let's buildthem into an actual platform now so theconcept of composition is hugelyimportant in crossplane basically itallows you to take these granularresources assemble them compose theminto higher level abstractions and thenthose abstractions are what you offer toyour developers so take for example ifyou want to compose together a GKEcluster node poolool network subnetsubnet all that stuff and then you wantto offer that as a very simple clusterabstraction to your developers you givethem some limited configuration optionsand they basically are then able toprovision workload clusters that are youknow in accordance with your golden pathof the platform team uh you know and allthat infrastructure complexity is hiddenaway from them uh so itY's a much easierexperience to you know unblock yourdevelopers and get them going quickly toproduction uh all of this is you knowKubernetes API right so any tool thatknows Kubernetes can work withcrossplane and uh be compatiblethere so let's do a little visual hereum and spoiler alert This will bechanging a bit for V2 when Nick gets onthe stage and shows you all that butlet's think about the middle of thisdiagram here uh as a platform engineeryou need to define your platform API foryour developers so you specify theschema the configuration knobs you wantfor them what is this abstraction thatyou're exposing to your developersunderneath the covers you have to definea composition the logic how whatresources are you composing together andhow do you compose them together inpractice let's look at this example hereagain of uh say you as a platformengineer want to expose a platform APIfor databases to your developers so youwould go through the effort of definingthe schema of that API what are theconfiguration knobs I want to give mydevelopers and then your developer comesalong and she you can see here on theleft side of the screen that she's likeokay I just want a small Postgressthat's all I want the complexityunderneath the covers is you know isabstracted away and hidden in yourplatform there so you're at runtime forthis small Postgress uh you know youhave a GCP composition for it maybe youhave Alibaba maybe you have Azure orwhatever maybe you have uh cheap orexpensive maybe you have silver or goldit doesn't matter you could have anynumber of compositions that define thisdatabase abstraction and then for theGCP case that's going to be cloud SQLSQL user global address connection allthat sort of stuff so that's a bit of apractical look at it uh let's look at alittle bit more details about how to dothis uh so to define the shape of yourAPI define your platform API we havethese what they're called compositeresource definitions you basicallydefine your custom API group you'reextending Kubernetes right you're you'reyou're giving Kubernetes a new API herebasically and so you can define the APIgroup the kind and then the schema whatare the configuration knobs that youwant your developers to have exposedthen you have to write the the logicwhat what resources do you want tocompose together uh you know how do youwant them to to happen and hugelyimportant the way we do that incrossplane is by running a pipeline offunctions that's how you composeresources together in crossplane solet's talk about functions for a bitbecause they're hugely important sobasically in crossplane you're running apipeline of simple functions and that'show you compose resources together uhfor all of those functions you canbasically use your language of choicethere's a whole plethora of uh ofdifferent languages that are supportednow but the key part the reallyimportant part here is that you justneed to focus on the unique logic foryour platform what does your platformneed to do for your team and focus onjust codifying that your golden pathsyour configuration that's important toyou let Crossplane do everything elselike managing the life cycle of thesethings doing all the heavy lifting youknow garbage collection blah blah blahall that sort of stuff but you get todefine a simple pipeline of functionsnow when I say functions you might thinkquickly of like okay I have to writecode that's not true uh there's a wholeecosystem in crossplane of these uhreusable functions that the communityhas built and the key takeaway here isthat all of these functions basicallyoffer you a bunch of new experiences incrossplane of different a variety ofways that you know to co to express yourlogic and build your platform uh sothere's a whole spectrum of you know nocode declarative low code full codewhatever you're most comfortable withuse that you know they're not going toforce you into one particular way ofbuilding your platform use the languagethat you're most comfortable with maybethat's a high level config languagemaybe that's a lower level generalpurpose programming language it doesn'Ztmatter use what you want and then you'renot stuck with just that only languageeach function uh in that pipeline uhcould be whatever language you want soyou can mix them and match build theplatform in the way you're mostcomfortable with so let's look at someexamples uh we're going to go throughthese quickly and not look at every linebut for example with functions you cando templating and define like a variablenumber of access keys you can use Pythonto define this S3 bucket if you likePython you can use KCL to define like anEC2 instance for a variable number ofregions you can use uh Q to define someIM policies for uh a given set of ARSyou can use pickle to make a config mapwhatever there's like a whole spectrumof experiences here or drop down to alower level programming language ageneral purpose programming languagelike go then once you're using if youwant to define your platform and buildyour platform with a general purposeprogramming language now you can startusing like the native tools of thatlanguage and you know unit tests andtest frameworks and linting and you knowuh autocompletes intellisense languageservers all that stuff um help youdefine your your platform as a softwareproject it's up to you basically allright So the last release in crossplanewas the 1.19 I think it was I was goingto say last month but now we're in earlyApril so it was actually February Ithink I almost lied to you I'm sorry butrecently enough one month ago let's sayuh and so what we're focusing on thereis m continuing to mature crossplane thekey uh APIs and features that folks areusing and adopting in crossplanecontinue to uh mature those so usage isnow beta uh the claim server side applyis now beta and then we also learned amistake from one of our mistakes as wellof how we promote code APIs and doingthat in a safe way so you can upgradecrossplane and downgrade crossplane aswell across all the versions uh welearned some mistakes from that and youknow did a couple patch releases andhave a good policy in place so thatworks really well uh and a couple of thethings we worked on is basically makingcrossplane more useful in more scenariosuh so some host network scenarios thatwere in high demand from the communitywork now being able to use privaterepositories with the crossplane CLIjust making it generally more useful forpeople and then last slide for me uhnext release 1.20 20 is going to be inearly May so just about a month away uhso roadmap for that and ongoing as wellso the change logs feature in crossplanewe've had it in runtime for a couplereleases but we haven't actually rolledit out to the providers yet and I thinkthat's a really important featurebecause it basically lets you have likean audit log of everything crossplane'sdoing to all of your resources why it'sdoing it when it's doing it how it'sdoing it where it's doing it all thatstuff that'll be available in all theproviders so I think that's reallyimportant we're obviously going tocontinue maturing the important featuresthe key feature areas and APIs thateveryone's using and adopting we'regoing to continue to mature those goingto work on uh insight observabilitymetrics like more insight into whatcrossplane is doing that'll be a themeand then of course the future ofcrossplane crossplane v2 which Nick willtalkabout thanks Jared so Monday of thisweek we released a preview build ofcrossplane v2 so it's out there it'sreleased and it's something that you cantrynow our goal with crossbane v2 is tomake crossbane more useful for morethings we also want to make it moreintuitive and less opinionated comparedto crossplanev1 there's three major changes incrossbane v2 composite resources and ournamespaced manage resources in ournamespaced and you can use compositeresources to compose whatever you likeany Kubernetes resource as opposed toonly crossplaneresources over the last couple years atKubeCon something that I while I wasmanaging the crossplane booth somethingthat I would frequently hear from peopleis "I love crossplay for myinfrastructure abstractions I builtcluster abstractions databaseabstract[ions etc what should I use formy apps or my microservices or thingsthat actually are using thatinfrastructure?" And it got me thinkingi was like why why not just usecrossplane or why not at least allowpeople to use crossplane so one of thethemes with crossplane v2 is it's bettersuited tobuilding API abstractions for anythingnot just infrastructure and especiallyfor applications ormicroservices so composite resources XRSare now namespaced jarrett touched onXRS before but as a quick recap theseare Kubernetes custom resources that youwhen I say you I'm typically meaning aplatform engineer or something like thatuh is defining so you teach Crossplainthat this type exists and you teachcrossplane what a schema is here I showan app but it could be a billing frontend a web app an Acme database it'sreally entirely up to you so incrossplane v2 all XRS all compositeresources and all managed resources werecluster scoped uh there were reasons forthis uh it was inspired by persistentvolume claims and uh persistent volumesin Kubernetes and it kind of made sensewhen we were thinking about it as ainfrastructure tool arguably but uh butin this modern world I don't think itmakes sense anymore so we've just madethem names based what this means iswe've lost an entire concept incrossplane there are no claims incrossplane v2 claims were a namespaceproxy you would create a claim in yournamespace and then crossplane wouldrespond by creating a cluster scriptedXR that was identical to the claim noneed to do that anymore given that XRScan now be cluster scoped another smallchange that you might see here if you'refamiliar with crossplane is on every XRthere's a couple of things thatconfigure how crossplane works so theexample on this uh uh YAML that you'relooking at now is the compositionreference this is the part that tellscrossplay when someone's create thiscreates this app use this configurationto know what to do to know whatresources to create in response to thisapp being createdum previously all of these sort ofcrossplay machinery fields were just toplevel under spec and this means that itwould be easy for users to confuse stuffthat was crossplay machinery that theymay not need to worry about with stuffthat is actually important like theimage in this example the image that theapp's going to run so we've moved all ofthese crossplay machinery fields underspectrosplane just to make it easy tosay hey you can ignore all thecrossplane stuff under spectrosplane ifit's not relevant to you as auser should say most composite resourcesin our namespace uh when you saw the XRDexample earlier you'll notice it looks alot like cross uh sorry a Kubernetes CRDcustom resource definition uh and we'vetaken the scope field that exists onCRDs and we just put that on XRDs so nowyou could choose whether your XRD isnamespaced or clusterscript well you're choosing whether yourXR is namespace or cluster script usingthe XRD so I think I expect that likeKubernetes resources the vast majorityof XRS are going to be namespaced goingforward but there are some interestinguse cases for cluster scoped XRS so anamespaced XR can compose can createresources in its namespace a clustersscoped XR can create clusters scopedresources or resources in any name spaceso one thing I'm excited about withclusters scripted XRS is you can imaginepackaging up something like Argo CD oran operator using an XR building thatAPI yourself and then using that to rollout to clusters or control planes to uhto deploythat in B2 i mentioned before manageresources are now namespaced this one'spretty straightforward uh right now asof the preview we've only updated theAWS providers to have namespace managedresources uh the next couple of monthswe'll update all of the providers tosupport namespace managed resources thismeans that when you're creating anamespace XR you can compose that XR ofnamespace manageresources and I won't take too much timeto explain the benefits of namespacingin Kubernetes i presume that you're allfamiliar with it it's a tenencyisolation lets you give people access tocreate app XRS and ma\ybe manageresources directly in one namespace butnot others if you uh if you wantto and that gets to another thing that'smore of a philosophical change withcrossplane v2 in crossplane v1 we uh Ithought of it as kind of verticallyintegrated and what I mean by that is wetold people you shouldn't really makemanaged resources directly you shouldn'tput managed resources in a helm chartyou should create managed resourcesusing a composite resource and theopposite was also true we said compositeresources were for crossblade managedresources not for arbitrary Kubernetesresources we no longer hold that opinionwith crossplay and v2 uh with everythingbecoming namespace to it to become itmakes a lot more sense to uh potentiallycreate mandatory resources using anothertool if you want to that's fine i thinkit works best and great with compositionbut you don't have to use compositionand similarly you can use composition tocompose whatever Kubernetes resourcesyou want could be a deployment could bea service uh could be whatever you wantnot just crossplane manage resources solet's uh show that nowuh don't worry this diagram that looksvery complicated and weird is showingyou how it used to work in crossplane v1so technically in crossplane v1 youcould already compose whatever resourceyou liked but it was a little strange sowhat we're showing here is this examplewhere the platform team has defined anew API and that API is an app and sosomeone comes and they say all rightcool I'm creating an app in my namespace and what the platform team hassaid is when someone creates an appcrossplane what I want you to do iscreate a deployment a service and an RDSinstance and you can see the machineryof how that works is the app claim getscreated in the namespace then crossplanejumps out of the namespace to thecluster scope creates an app XR which isjust a mirror of the claim then that appXR is composed of two objects for ourKubernetes provider which is weird in ofitself the Kubernetes provider for aservice that runs onKubernetes those clusters scoped objectmanage resources would then jump backinto the namespace create a deploymentof a service and meanwhile the RDSinstance would be clusters scoped so itgets the job done you your user createsan app and out the other end comes adeployment service and name uh RDSinstance but it's more complicated thanit needs tobe so here's what that looks like incrossplane v2 i think you'll agree thatyou know there's like at least threeboxes missing from this diagram so it'sa lot simpler the user creates anamespace XR and crossplane responds bydirectly creating a deployment a serviceand an RDS instance also namespaced andlike I mentioned you don't have to usemanage resources at all here's the samething user creates an app we create adeployment a service and a cloudnativeuh PG database cluster running Postgresin the cluster as opposed to running RDSor some cloud servicecrossplane v2 is backward compatiblewith crossplane v1 what this means isonce crossplane v2 is g you'll be ableto upgrade from v1 to v2 and most peopleshouldn't have any breaking changesdoing that there's two things we've donethat enable this one is I mentionedbefore that xrds have a namespace uhhave a scope field and it can benamespace or cluster scoped actually itcan be namespace cluster scoped orlegacy clusters scoped and if you set itto legacy cluster script which will bethe default for v1 of the xrd API uhthen it will just create a v1 style kindof legacy uh legacy xr so that xr willsupport claims it will not usespectrosplate it will work just like itworks in crossbane v1 we're also addingnamespace manage resource support to allproviders but we're not removing clusterscope to manage resourcesso in crossbane v2 we'll consider v1style clusters scoped uh xrs and v1style clusters scope manage resources tobe a legacy feature we intend for you tomigrate away from those and start usingthe name features but you don't have todo that as a forcing function to upgradeto v2 you can install v2 and all of yourv1 style stuff will still besupported we did take advantage ]of themajor version update to uh remove somefeatures that have been deprecated for acouple of releases most notablycontroller config which is a way toconfigure out providers it's beendeprecated for like 11 releases neveractually made it out of alpha uh but itwas heavily used uh so we finally pulledthe band-aid there and so that's nowgone in v2 uh and the other one that'sbeen deprecated for two going on threereleases is native patch and transformthis was the precursor to compositionfunctions so if you are using nativepatch and transform you do need toswitch to composition functions beforeupgrading to crossplanev2 all right now I'm going to do a quickdemo of v2 i need to hand thismicrophone to Jared so itdoesn't make a weird noise allright so for this demo I'm just going togo through the get started compositionguide that's in the new V2 docs forcrossplay we actually did quite a lot ofwork revamping the V2 documentation uh Ihope that you can see this okay it'sabout as big as I can make it so we'regoing to stick with the same appcomposite resource example our goal hereis we're going to create an app thatlooks like this you can see it's onlygot one field in spec the image and whenthe app gets created we want crossplayto create a deployment and a service nomanage resources nice and simple justgoing to create a deployment and aservice the uh replicas are going to becopied from the deployment status andthe address in the app status is goingto be copied from the services status sowe need to do three things to uh createthis new type of XR this new customresource in Kubernetes we need to defineit we need to install a function and weneed to configure how crossplane callsthatfunction so first we define it bycreating an XRD defines the schema andteaches crossplane and Kubernetes thatthis app is the thing that exists it'sthe X ID that we've got here i'm goingto jump around a lot now from thesewindows iapologize and I have uh this X ID herealready so I'm going to create that X IDand just just so you know I'm not lyingto you that's the same X ID i promise umnow that I've created this XRDKubernetes knows that app is a thingthat exists and Crossplay knows that appis a thing that exists crossplay knowsit's responsible for app it knows itshould spin up a controller and startreconciling apps but it has no idea whatto do if you create one if you createone of these right now crossplay is justgoing to be like I I don't have animplementation for this i just know Iownit so next we need to install a functionthat teaches Crossplay what to douh for the purposes of this guide it'skind of more complicated as as Jaredtouched on before but let's let's assumethat uh uh a function is basically apluggable configuration language thatlets you pick what configurationlanguage you want to use to configurecrossplane we've got four in this guideuh YAML is a bit boring and I want uhyou to pick what function I use so whydon't you let me know by show of handsi'll ask for a show of hands in a secondwould you like Helmstyle templated YAMLPython or KCL which is a relatively newCNCF configuration language all rightshow of hands for Helmstyle templatedYAMLcool all right show of hands for Pythonit's good i like Python that's myfavorite uh show of hands forKCL oh I think KCL wins though all rightwhat do you think Jared it was prettyeven but go with KCL kcl it is um allright so now we would install thisfunction uh I actually cheated becausewe're in conference Wi-Fi and I willshow you that I have actually installedall four of these functions alreadybecause crossplay would actually have todownload them and I don't want it tofail to download them uh so I what wehave now is we've defined the app andwe've also installed the function thatwe're going to call but now we kind ofneed to connect the two together we needto tell crosswind okay when someonecreates an app call this function withthis input uh so to know what to dolet's jump back over here and then godown to configure the composition andI'll show you what these look likebriefly uh templated YAML looks like aHelmchart pyt^hon looks like Python uh quicknote on Python uh some languagesactually support you you'll notice thatthis is like Python inside of a YAMLdocument which is convenient for smalland simple cases like this which ispretty gross for more complicated thingsyou probably want syntax highlightingand completion and things like that umfor Python you can actually just likewrite a function in Python and use yourIDE so that's that's what I would do fora larger function but this is great forsomething simple like this and thenfinally KCL which is what we're going torun with that's what KCL looks like it'skind of a little bitPythonic so apply compositionKCL going to make sure I don't have anyother compositions in there greatcomposition's there that means Crossplayknows what to do now when you create anapp i've got an app here there's the appsuper simpleall right so I've created that app sonow crossplane is saying okay I've gotthis app i am going to look up whatcomposition I need so it's going to lookup and find that there's only onecomposition it's going to use the KCLone that we just created and then it'sgoing to call function KCL with theinput from that composition so let's seehow that'sgoing all right that's looking promisingoh it's it's actually done already Ithink so uh you can see here in theseevents that it's selected the KCLcomposition that we created before andfor a minute the deployment serviceweren't ready but I can see that it isnow synced and true so sync synced astrue and ready is true so one of thecool things about this all beingnamespaced and just like more of anormal Kubernetes resource compared touh crossplane v1 is I can use the coupecouple cube cuddle tree plugin on theapp and it'll show me that the app istrue and it'll show the app's composedover deployment and service and all thethings that theyentail that's ends the demo this is allnamespaced hey that's the big changethank you can I I want to come back overhere i come back over here um CrossplayV2 is available today jared and I spenta lot of effort on uh revamping the docsfor Crossplane V2 as well uh so if youfollow this URL or open this QR codewhich we will show later once we finishthe talk um that'll take you to the docsthat'll teach you how to get startedwith crossplane v2 as a preview thereason it's the preview is the actualdesign is still open we're still waitingto get more feedback from the communitywe wanted to give you something concretethat you can install and kick the tireson and play around with and let us knowif this is the right direction before wecommit to it as the as the real futureof crossplane v2 i feel pretty goodabout it personally but I'm reallyexcited to hear what you think uh sodefinitely give us feedback oncrossplane v2 and give us feedback onthe new documentation that we've put outnow I'll give you back to Jared to wrapit upall right let's close it up here uhthank you Nick for showing us off allthat awesome V2 stuff um so you knowonce again like the project thecrossplane project is nothing withoutits community right um so morecontributors are always welcome uh wedid recently author like a gettingstarted guide for new contributors so ifyou go to the crossplane uh GitHub repoit's in the contributing folder rightthere easy to find um there's a littleQR code if you want to scan that too uhbut basically it kind of helps youunderstand the lay of the land of wheredo we need contributions in crossplanehow do you get started with your firstcontribution um like uh how do you finda good first issue all that sort ofstuff uh the docs definitely need helpas well so there's a lot of places topitch in and help out and this is kindof your guide to figure out where to getstarted and how to become a crossplanecontributor you know and key takeawayfrom this slide crossplane.io go to thewebsite there all these things arelinked from there uh you know you canfind the docs you can find our reposblog all that sort of stuff socrossplane.io that's where to go to findout other ways to get involved and thenfinal slide here once again V2 this isthe call to action folks uh we showedyou off all this changes that we thinkare really helpful to make crossplaymore useful more like easier to use lotsof great stuff there so try it out we'dlove to hear your feedback and thank youeverybody for being here todayyep i think we got three minutes forquestions uh so we will take anybody whohas aquestion yes I saw your hand first i'llcome running to you oh you're fasterthan I amright on hi uh thanks um does crossplanehas mechanism for fetching data similarto Terraform's datauh in a way it's got primitives you canuse to do that uh it doesn't have uh yetsomething that'suh exactly the same as Terraform datasources we have two things you can usethough we have what we call observe onlyresources so you can create a resourceand crossplane won't actually doanything in the cloud it'll just mirrorit back and then on top of that you canwrite functions that will then sort ofload that up and sort of do interestingstuff with it we're it's it's a commonrequest to have something more like datasources and we think we've got theprimitives and we're thinking about whatwe want to do to build something alittle bit more convenient than sayingyou have to write a function to do it sokind of basicallyokay thank youquestion anyoneelse thathi thanks for the prayers I've got aquestion is there a way to schedule thedeployment of various resources or wantto deploy one resource pair to anotherone in corpse plane to schedule theorder like[Music]um the deployment phasesI I think you were asking basically canyou order the updates or creations rightuh yes you can so because compositionfunctions are pretty open-ended uh whatcrossplane will do every time it calls acomposition function it will say here'sthe observed state of the world so youask me to create a deployment in aservice every time it calls the functionit'll say hey I've made the service butI'm still waiting on the deployment forexample which means it's pretty trivialto add logic to your function tobasically say hey just don't create thisservice until the deployment's createdor whatever sort of thing so yes you canum but you do need to by default itdoesn't by default it'll just createeverything but because we give you a lotof power it's relatively easy to do thatand there are certain uh existingopinionated functions that supportconfiguration language that will kind ofmagically do that for you as well butit's kind of on a function by functionlayer okay thank you oh and then lastquestion was right over here if you'restill hereburden wait oh oh there you areokay so is there any way to uh increaseuh the reconstellation loop uh intervalso we won't get red limited in uh someprovidersyeah to wait to to like to throttle itor to pause it or those sort of thingsexactly yeah yes there's a couple ofconfiguration options that everyprovider exposes that basically allowsyou to specify how often you want to besyncing resources or you know you canmake them go slower basically and thenthere's also the concept of a pause incrossplane where you can apply a pauseannotation to resources and compositeresources and stuff and crossplane willstop reconciling it entirely then untilyou unpausep it exactly what I needthank you so much awesome thank you forthe talk yeah and that's never done justa quick follow up on that as well thereis another feature that is hopefullygoing to come out of alpha soon calledreal-time compositions this is more atthe composition layer but um somethingthat can cause things to be reconciledmore than you'd like is that at thecomposition layer where every 30 or 60seconds polling and just checkingeverything uh we have an implementationthat uses sort of watches or kind oflike pub sub uh so that you don't needto be constantly pulling long storyshort when we get this to beta there'llbe fewer reconciles in the system andthat's a priority for the soon somethingbut even better cool yeah awesome sothat's all the time we have thank youeverybody once again come to the boothand hang out more2025-04-15 21:57:40.673564`ngsum and that's like uh the first uh oftenit's the first um point of contact forproject to with theCNCF all rightgreat uh thanks Danielle so that wasreally the overview of the current tagstructure and then we can dig into alittle bit about what what runtime isspecificallyuh and what we've been working on hereso really runtime is uh you may think uhit's about runtime right so clearly thatis how it got it started um but reallygoes into all kinds of different thingsas well such as uh uh orchestration anddifferent workload typesum uh into uh operating systems andvirtualization as well and then into alot of edgeand devices and you can tell there's alot of kind of breadth to what isruntimetoday and actually a lot of the AIinitiatives as well have fallen under uhcurrently under runtime um so justreally a a broad area and it's it's morethan just container runtimes rightuh so some of the examples and I thinkit's important to note that this is allabout the projects at the end of the dayand the tags are here to help uh the TOCof course but it's really about helpingthe projects and those communitiesbecause that is why we're all here andthat's what makes um the CNCF soimportant so a few callouts um from thisslide we have uh flat car which isrecently incubating cube edge which isrecently graduated as well as WM clouduh just graduated last year um and justjoins a long list of of projects thathave come through uh even Kubernetesitself is technically a runtime projectalthough very uh self- sustaining uhtoday with a lot of its own structureand just to go over some of these scopeareas as I mentioned so workloadorchestration you can see a lot of theprojects there um it does get intoruntimes as well like container runtimeso containerd cryo and friends uh yukiwas a recently uh accepted sandboxproject which is a container runtimewritten in rust uh which is reallyinteresting uh and I didn't mention theweb assembly as well is also a runtimearea of focus so things like Wom cloudWM edge and then uh moving down toserverless so K native and a lot of theuh OSS as well so K3S K0S uh was justaccepted into sandbox uh earlier thisyear as well and then uh special purposeoperating systems so a lot of the umpurpose-built uh Kubernetes distros umis is a good area of focus thereum and then of course we have the AI andthen uh a lot of those related projectsthere a lot of lot of interest um inthat area and there's been historicallya lot of working groups as well underunder runtime and we can getinto sort of the the current structurethat we have uh in those working groupsand and talk a little bit about um someinitiatives that they have been able toaccomplish to this pointso first uh do you want to do you wantto talk about this part yeah uh sureyeah briefly about the AI working groupuh this working group has been aroundfor maybe about two years uh there's alot of there's been a lot of interestaround AI so the community came togetherand tried to create something aroundcloudnative AI on how to run AIworkloads on on top of cloud native butalso how you can use AI to improve cloudnative workloads like AI ops and thingslike uh LLMs connecting to Kubernetes somultiple areas uh so we have uh severalinitiatives going on and uh a lot of itstarted with the cloud native AI whitepaper that was published in Paris lastyear uh and subsequently we we hadinterest in different areas like an AIsecurity white paper uh creating aplayground or there's also a ascheduling challenges white paper sothere's there are a lot of challengesaround GPU management so the communityhas come together to try to create somebest practice ices and and create adocument on that and there's also asustainability AI white paper in theworks and this kind of addresses theissue with energy consumption which isalso kind of related toGPUs uh so there's been a lot of effortaround web assembly as well there's beena really cross company cross uhcommunity uh kind of bringing a place tobring the web assembly communitytogether uh to work on specificinitiatives so uh one outcome has beenuh updates to OCI artifact laayoutspecifically for WOM and then uh reallyworking on code updates through um kindof in collaboration with the bite codealliance as well so those are differentorganizations but this working groupallows uh kind of a CNCF uh uhcollaboration with uh that group as wellum so they have committed code to uhupstream in a lot of cases as well asworking through uh W3C uh standards aswelluh the batch system initiative uh or BSIis another uh effort that has aa community around it and you can see onuh the landscape that we showed there'sa lot of uh batch um uh projects andcommunities there so things you may haveheard of around Q uh which has beenmentioned um and a lot of uh reallyabout how how best to um definespecifications and uh CR types and uhreally resources that are specificaround batch effortsone thing that I'd like to point out onthe batch system initiative workinggroup that there is a lot of overlapwith uh AI type of things because a lotof the AI type of workloads uh requirebatch scheduling and efficient use ofresources you know GPUs CPUs or memoryso this is tightly coupled with um someof the other stuff that's going on inthe cloud native AI working group uh soI'll talk about a little bit on howmaybe some of these initiatives can fallinto the AI working group with the newtax structure uh but it could there'salso an open door policy in terms oflike if the community wants to cometogether and create something that'ssuper specific maybe we can create likea um a short live initiative or it couldalso be a long live sub project but itall depends on who leads the uh thisinitiative and and how the communitycomes togetherso uh next working group is containerorchestrated devices working group likeuh one ofunusual working groups in CNCF ecosystemit started uh in the lunch table likeback in CubeCon San Diego just like fewpeople talking like how we actuallygoing to use the devices in Kubernetesand result of that is what like we areone of the only probably working groupwhich produces the code and owns thecode in uh in CNCF so we are owning thespecification about how devices areexposed inside the container so it's thelowest building block uh which like manyof like more user visible things builton top of it example is DRA or commonline usage in podman in docker and uhalexe and and so on so we have a peoplecontributing from different companiespractically like all of those big uhprojects related to uh containerruntimes are contributing and the goodthing uh what I really want to announceon you this cubecoin is what ourspecification 1.0 is uh released uhseveral days ago uh but it doesn't meanwhat like the work for our working groupis over we still have to do themaintenance because like all of thosebig projects are depending on where uhwhat we are producing um plus there arenew things which people are starting toask more and more and there are somemovements in uh OCI runtime spec so weexpected to get uh support for uhnetwork interfaces which was again uhmerged to OCI spec just in few days agoum if you're interested in devicesplease also join our meeting our smallcommunity of hardware vendors andruntime maintainers are really welcomeum so joinum our neighbor working group inside tagruntime is IoT um we have a bitdifferent structure so if previous groupdoesn't well produce code but doesn'tpro produce the documents IoT wereopposite uh if you interested in Hinterested how IoT devices connected tothe Kubernetes clusters have a look atthe white papers which was published atuh well quite recently well one of thema bit earlier one one one isrecently we have also the specialpurpose OS working group um so we had acouple of uh outcomes for that more likein the advocate ating uh for ourdifferent uh approaches for specialpurpose OS so we had a panel in CubanParis we had also a panel um in the opensource summit in Vienna uh birds offeeder in the last fostam um it's uhmore like we utilize it to have um kindof a round table to to see where are thesimilarities and advocate uh for forwhat is an oper what is a specialpurpose uh operating system so that'sthe stabge that we are in and uh wewelcome uh anyone that is interested inthe field to come and join andparticipate so now I'll talk a littlebit about the tag reboot uh so some ofthe changes coming up u after thecubeconfreeze uh so the new structure of theCNCF will look like this uh we'll havethe TOC uh making decisions aboutprojects about the different levels ofthe projects and about governance in theCNCF and then on top uh behind orunderneath we'll have the the tax andand this is not going to change in termsof like uh what it was before uhbut the changes will come with the subprojects initiatives and the communitygroups so sub projects will be somethinglike a working group uh but uh they'llhave an option to work directly to withthe TOC or with a tag or of of theirchoice if they think they're moreassociated with with a specific tag theycan work directly with the tags but ifthey need to work directly with the TOCthey can do so and this can be includedin the charter or the review process andthe application process and these subprojects can actually be longived sothey can live for 2 years 3 years or aslong as the community is active and andand there's activity going on and thenwe have we'll have initiatives and andthese are meant to be shortlived likesomething specific like a deliverablelike like maybe like a spec or or somepiece of code or that community comestogether and and creates like asomething that solves a specific problemand then finally we'll have a communitygroups and this these this will be morelike umuh meetings where people kind of discussthings uh together like tech technologyand around like a specific topic sowe'll have all the structure uhavailable on how to uh apply for eachone of these or how to create one ofthese in the the CNCF TOC repo and alsoon the CNCF website and the CNCF uh uhstructure in the in the GitHub uh theCNCF GitHuborganization oh next one sorry okay soso this is kind of like what it wouldlook like in terms of like the differenttags so we'll have uh initiatives andfor example we could we could havesomething like artificial intelligenceuh initiatives uh and then sub projectscould be something like uh you knowcontributor strategy and then in thetechnical advisor groups this is whereum this is the current structure of umsorry no this is this is the newstructure of the tax so we'll have fiveand we'll have developer experienceworkload foundations infrastructureoperational resiliency security andcompliance and then so attack runtimewill map more directly to tag workloadsfoundation but uh it doesn't mean likefolks that are working in tag runtimecan actually work with the other tags sofor example there's a lot of overlapwith tack in infrastructure and therewill be maybe topics around operationalresiliency or how do you you keep umservices up and running all the time orhow do you monitor thoseetc and these are more details about thedifferent areas uh that the new tagswill cover so you'll see here the tagworkloads foundation maps more directlywith tag runtime well you have runtimescontainers VMs uh batch schedulers uhdynamic scaling CI/CDetc and you you'll see that also there'soperational resilience and and and thereare things like that t related to tagruntime for example reliabilityperformance and then in taginfrastructure uh there are some overlapwith like compute uh and maybe edgeSo more details about the scope ofworkloads foundation which is morerelated to tag runtime so you wementioned before so there's web assemblyvirtual machines runtimes schedulers uhand some of the initiatives thatexamples that may fall under tag runtimecould be something like uh a bestpractice on batch scheduling or orsomething on how to use githops u in amore efficient way uh and then for theleadership uh the suggested uh leadersuh could come from these different taskslike runtime app delivery orenvironmental sustainability uh but ifyou're not in a current leadership rolein in one of these tags I encourage youto look at you know the the scope andthe different areas and and please applyapply and and I mean there's a way to tobe a tax chair or or a tech lead in inthe specific area that you're working inuh so these are the current workinggroups in tack runtime in the suggesteduh changes uh so for batch working groupwe are suggesting either archivingdepending on how much interest there isor uh applying for a new initiative uhsame with uh wasn't working group uh thecontainer orchestrated devices workinggroup uh COD working group uh will alsothe the recommended thing is to reapplyand IoT edge uh there hasn't been a lotof activity in IoT edge so therecommendation is to archive but ifsomebody wants to work on it you knowfeel free to step forward specialoperating uh purpose operating system uhapply for initiatives or or use acommunity group in the cloud native AIworking group uh they suggest to ucreate a TOC soft project so just justwanted to mention that it's important tonote that all of those communities itdoesn't mean that the work is stoppingor that that community is not importantit's just in the uh in the relation toCNCF it's important that there's aninitiative or something that that is adeliverable out of that so the thecommunity's work continues and then thatinitiative can be uh part of thedeliverable andoutcomeyep so this is just an example ofartificial intelligence initiatives uhso recently we had the AI uh or we havethe scheduling white paper in the worksand the I security white paper we alsohad the white paper published last yearand the leadership is usually led by theTOC uh but then you have you could haveteam members from all these differenttags uh and you know anything related tocloud cloud native artificialintelligence is applicable to thisinitiativeso uh even with those big tag rebootsand uh changes in the leadership and theway how we are working uh there isnothing really change in the way how youcan participate or how you can actuallyget yourself into one of the tag groupsagain regardless how it will be name itregardless will it be like initiativetag or something else uh pleaseparticipate please join our meetingsplease join the discussions please helpus with review of new projectsum any any feedback from fields anyfeedback from the end users will be muchappreciated by all of the tag members byall of projects and and so onso we have currently working groupswhere will be restructuredsomehow how we don't yet know we don'tknow who will be leading it maybe likeit's it's maybe last time when we are inthis group standing on the scene andtalking about the same subject we don'tknow but still uh we are continueworking on our particular topicsum while we until the new uh structurewill be announced please contact us withusual places uh we usual like our Slackchannel it still exist still workingwe're still going to have the nextseveral meetings on unusual time slotssame again with working groups so untilthe new structure will be establishedthe work is continued as usual and likeour leadsso please reach out yeah please reachout anytime uh we're available so whilethis happens and yeah the work continuesit's just a a matter of like you knowwhere where it's going to fit in thedifferent groups just want to add uhshortly that the whole idea of thisrestructure is to enable the communityand uh help them to grow faster and justlike to present and be able to spininitiative uh more easily so that'sthat's just like if you look at thelandscape now we mentioned that it'slike 200 project that it grow and stillgrow exponentially so we just thinkabout the next 10 years and where wewould go with that so we need to be moreuh had to say it but agile and to umsupport the growth of the community sothat's all facilitating the communitygrowth that's that's the entire idea wetalked about the structure but not aboutthepurpose so I think we don't have anymore uh that's it for the slides sowe'll open it up open it up forquestions so anybody has any thoughts orthings that they want to talk abouthappy to answerany questions anyhands well I hope you enjoyed yourCubeCon and thank you for coming for thetalk and um have a nice Friday folks[Applause]2025-04-15 21:57:41.238056 ��B##��;A-yOKr2DOJ_ohello uh welcome to the tag runtime uhtalk so uh we're gonna talk today aboutum the tag tag runtime it's uh currentstructure um in principle we have fivespeakers uh Rajas couldn't join us todayand Ricardo might uh join in a bit so hiI'm Danielle uh we got here AlexanderandStefan so that's us um should we sayshort something or Sure okay so I'm a PMfor Microsoft and also co-chairing techruntimehi I'm Steve Rust i work at Okabi i'm aprincipal architect working on uh all ofour cloudnative initiatives for theOkami cloud uh akaLenode and I'm Alexander Kaneki i workfor Intel uh doing all different stuffabout the Kubernetes and CNCF stack uhmostly at runtimes resource optimizationand so on and particular here it's moreinterest of container orchestrateddevices working group and CDI specby the way I'm Danielle I forgot to saymy name before uh and unfortunatelyRogers was called home uh so he couldn'tbe with us today but uh we also haveRicardohi Ricardo Aravina I'm one of theco-chairs for tag runtime I've been aco-chair for a long time like about fouryearscool so that's us um we're gonna talkabout the intro about the tag time uhoverview of the tag itself and theworking groups um then uh kind of goingto talk about the tag reboot um andrestructure um and then some ways to getinvolved if you'reinterested so uh what's a tag so we gota nice definition here the CNCFtechnical advisory group uh scales tocontribute by the CNCF technical anduser committee while um um retainingintegrity and increasing quality insupport of the CNCF mission so basicallywhat um the tag the goal is to help theTOC to um just do whatever is needed ifit's a technical reviewsor just uh getting involved with thecommunity facilitate discussions and soon we'll get into it in asecond the landscape is quite wide uh wehave a lot of projects so that's why weneed to kind of restructure tags inorder to facilitate all of these uhdifferent areas uh that we have uh atthe CNCF so now there is uh209 I think including uh archivedprojects and uhabout200 like five six years ago I thinkthere were like 50 yeah so it's growingand in order to facilitate all of thatthen the TOC uh needs some help andthat's where we get into thepicture so what does a tag do um it'sit's a it's a way to get basically tohelp the TOC to be involved with thecommunity so we facilitate um talks andand um areas of expertise that uh needcoordination between the TOC and thecommunity um currently there are eightdifferent tags u but that's about tochange we'll get to that in a bituh the tags work with the TLC to help tospread the work across the differentdomain of expertise um some of theprojects might be um under one tags ormore because it depends which areathey're touching um so depends of thescope of theproject here are some uh um the tagsthemselves so in the currentstructure so you could see the littletiger that's ushere but we have observer uh networkingstorage cloud native security deliveryapp and yeah okay you could read therest what are the tag responsibility asit's structured today so we have thereach out to project when to engage withthe community uh project come and do umpresentations S uh we have also helpedthem um to to do um project reviews uhstart help to boot um bootstrap ininitiatives uh white paper andfacilitate different working group andum we meet with the community byfacilitating uhbi-weekly fortnightly meeti_eh both of our experiences and it'sconfirmed by research is that it's oneof the important factors um when makingdecisions uh concerning the adoption ofof a project within uh an enterprise ora company right um but uh next slidepleasebut um given the context of today uhtoday's presentation where we're bothKubernetes co-chairs um we want to throwout this fun fact wherein Kubernetes uhseems to be uh the second largest opensource project in the world justtrailing Linux um and is also theprimary container orchestration tool forum roughly 71% of the Fortune 100companies next slide please um and totie this board together um uh we we'vereceived a lot of compliments and a lotof um not so good compliments as wellabout um the quality of ourdocumentation um so we thought we'dshare the knowledge with a lot of folksum who might be interested in um sort ofrevamping or improving theirdocumentation efforts um in open sourceand otherwise because it's not just opensource where documentation is clearly uhhaving worked in an enterprise I do knowthat we um have proprietarydocumentation that we regularly need toupdate and improve um based on thereleases um so we are going to talk youthrough how we achieve this feat and forthat I think uh Natalie take over thankyou so um we have a um a decent agendatoday we've got another 25 minutes withyou and if we've got time at the endwe're happy to take some questions andI'll run around with the microphone tohelp out with that um but a couplethings we're going to cover today sowe're going to talk about the proximityproblem when it comes to kind of teamsand how we all work together we'rewalking in the users shoes and how weembrace different voices when it comesto docs and contributions uh we're goingto um mention this idea around the Mentopyramid principle which is quite cool umlooking at how we do a lot ofcommunitydriven development when itcomes to improvements to documentationand then the things that we would liketo improve which is where we talk aboutthe call outs of please help us pleasecontribute to docs please this is theway that you can actually do so so withthat um so let's go into the proximityproblem um the closer that you are to agroup I'm just going to read this outloud um the more you interact with thatgroup this is the idea of the proximityproblem but you can't be everywhere atonce and so you have to choose um thegroup that gets priority um given thesedynamics is it better to be embeddedwithin project teams or within atechnical writing team um so myself andDivia we have uh diverse backgroundswhen it comes to tech and our work inKubernetes i'm a web dev by design umand then I went into engineeringmanagement and now I work in theKubernetes space in terms of open sourceum and so uh the technical uh writingteam is actually more my my pace in myarea unless we're working on the websitenecessarily um and I believe Divas youknow you have a CIS admin backgroundyeah I have I have a CIS adminbackground uh was in the trenches foraround I think a decade before I movedinto project management and then foundmy way into uh technical writing andwhere I am today as a developer advocatebut um given that um I believe what wejust said like you um we are not theexperts or the technical people rightnow so to say when it comes to thedocumentation for Kubernetes we'veworked with Kubernetes um individuallyin uh different capacities in our careerbut currently um there are several morepeople within different SIGs morequalified and um that definitely createsa problem of whether we should be thestakeholders when we create when wecreate the documentation or should it bethe technical experts that actuallycreate the documentation and get itbettered by us and that's something thatNatalie is going to decipher for us yeahyeah definitely we are not the engineersanymore necessarily so we always have tomake sure that we're thinking in thespace of whose side are we on in termsof representation is it the thetechnical owners or is it the users andI think you can probably um guess whatour answer is going to be so in terms ofany technifcal documentation butspecifically what we aim for in theKubernetes docs we need it to be notonly readable and usable for the folkswho are using it um but we also need itto be technically accurate and those arethose two lines that we are drawing herehow do we um make sure that something isreadable and usable in one space butalso technically accurate we can't be inboth places at once well I think TimBanister in the in the audience here ourformer tech lead probably can sometimesum but at the same time um it's one ofthose things where we really do in orderto not only help our workloads andprobably our work life balance but wereally do have to maintain that divideand so in the Kubernetes documentationwhile we in seek docs we say that we ownthe docs we don't own the content of thedocs and that's something that's reallyreally important we own necessarily howto documentation um but then we talkabout this idea of what and so our docsrequirements in the project we talkabout how we have you know differentfeatures that come up with differentstages of implementation alpha betagraduated um and we talk about this ideaof ownership in the area of ke owners Umso for folks who don't know a ke standsfor a Kubernetes enhancement proposal umand this is a way that we actually getfeatures introduced and movingthroughout the Kubernetes project andit's a really specific way of how wedivide up technical ownership ofownership of work that is being shippedin the project um this happens threetimes a year if you're interested injoining the release team to help outwith a Kubernetes release you can alsocome and talk to us i've been on therelease team before it was awesome umbut basically this um these requirementsmean that these stages of implementationnot only are handled by technical ownersbut a lot of those features require docsand those docs and how you actuallywrite them and implement them that'swhere we come in to help out um so wework with the ke owners to do exactlythat we need to shape how docs arestructured and the reason that we needto do that is because we're representingthe users we want users to be able toapproach our docs in a way that'sreadable legible and has a flow so thatthey're getting what they need out of itwhich is you know at the maybe thebeginner level versus all the way up toan ad administrator and app developerlevel we need that flow to be consistentand usable and that's why we in docs ownthat specific area and workflow um andthen the what as I mentioned beforethat's the technical content that'sactually owned by kept owners and we arealways getting technical LGTMs in termsof docs to make sure that even though wehave a lot of knowledge aroundKubernetes as well getting that actualtechnical overview is super importantbecause the actual content of thatdocumentation is owned by a differentarea a different special interest groupand that representation of them needs tobe accurate as wellso we represent in SIG docs the users ofKubernetes we set the standards to makeour docs usable and ke owners ensurethat docs are technically accurate sothose are the two groups that we'retalking about in terms of proximity andwe are always the ones knocking on theke owner door saying hey you got tochange this hey the style is like thishey please use site relative links asmuch as possible etcetc um so what we wanted to do in termsof helping to encapsulate that messagethat information we're going to put outum I think every speaker is late withslides usually on shed so we're verysorry about that but we will after thistalk put our slides up and we wanted toum link here from last year um uh how wenavigated how we navigate collaborationacross docs across SIGs with docs in theproject um that uh Divia and Ry andXander were able to give last year umXander is um and Savvita sorry should Isay and Xander is one of our tech leadssevita also involved in stick securityum and how we enable that thatcollaboration um this is a really cooltalk to understand how we do that twoteam proximity problem work um and feelfree to watch it on your own timeoh uh it'll start plgaying so that isprobably also better to start there wegoI fixed it all right I'll give it overto you right so uh Natalie talked a lotabout how we at SIGDOCS are more um youknow representing you as the users andhow we want to make it more userfriendly by representing y'all uh whenit comes to actually creating talks souh what we do is um we basicallyclassify the dogs um based on whatKubernetes does for our various usersand who actually uses them now uh when Istarted out um obviously I did not startout as a co-chair or even as an expert iwouldn't call myself an expert still butum I did not start out knowing a lot sothere are various levels ofdocumentation that we have um uh we haveconcepts and howtos um for you knowpeople who are just getting started soif you go to kubernetes.io/talksio/talks you'll see that uh there'sbeginner level in information with aglossery with a um you know page ofconcepts or rather different pages ofconcepts and tasks and um how-tos of uhgoing from you know scratch to having aKubernetes cluster of your own based onthe operating system you're using basedon uh the kind of tooling that you'reusing there's a lot of uh material thereon getting you started and getting youhands-on with Kubernetes we previouslyalso had Katakoda tutorials um which wehad to remove um because of uh someissues but uh those also were likereally good beginner friendly resourcesthat we had to actually go from zero tomaybe uh a cluster of your own and nowwhen it comes to uh Kubernetes um themain c the main people who are usingKubernetes those are the developerssystem admins architects etc we havedocumentation for them as well so wehave like API reference documentation abunch of um you know advanced conceptualdocumentation that may be required by uhfolks who are working on it um in a moredeeper capacity we have those as wellnext slide please so with this we focuson catering to different skill levelsnot um uh and it could be someone as newto Kubernetes as you know gettingstarted with the cloud native ecosystemin general um and we want to ensure thatit is um you know consistent it is uhrecognizable so the language that we usewhich we shall get to in a bit is alsovery consistent and we make sure thatwhen we are evaluating the documentationyou get that consistent voice from usrather than it being inconsistentlystructured and inconsistently uh drivenacross the board uh the reason being umone study actually did show that thevoice and the um tone of thedocumentation that you write um actuallydoes have an impact it's one of thosereally small things that you do notreally notice uh but it does have a hugeimpact on a consumer and how well theyare able to grasp the information thatyou are conveying in your documentationwhich brings me to the voice bit becauseum uh we if you've noticed ourdocumentation any of it there is nofirsterson reference it's not um umthere's no act there's no um you knowpassive voice usage if you you if younotice that please obviously do flag itbecause that's a part of our style guidethat we consciously do not use all ofthose but this is how we actually ensurethat the voice is consistent across ourdocumentation and um within SIGD docs weactually use um active uh voice to makedocumentation easier so that the userwhen they complete a particular task itis them that it is the user that we areaddressing it's not a passive uh youknow service that we're providing so itbasically uh gives you or empowers youto take responsibility and it also umyou know positions you as a subjectrather than a passive consumer ofdocumentationum again throwing back to the uh styleguide that we have here you all canvisit this link and it's a constantly umupdated style guide uh we have madeedits to it so this is not a set instone sort of a thing uh this definitelygets updated as and when we find um uhyou know things that would be beneficialfor our end users to consume um uh andwe make edits to it i think we recentlyhad a couple of edits a couple of monthsback so um we address um the user or theconsumer of the documentation as you anduse active voiche all across thedocumentation and we've always stressedon the usage of direct language we donot want you to rise uh write umShakespeare and pros that's not the endgoal of documentation like I'm prettysure um nobody would like readingKubernetes documentation in Shakespeareand pros uh it's already very difficultuh let's not complicate it it's a coolidea for a talk translating Kubernetesdocumentation to and it would be and andit would even be worse if we were to usethis um use flowery language to actuallytranslate across different um languagesright so we ensure that um the languageis simple it's easily consumable and itis addressed to the user that's actuallyconsuming these talks um and with that Ithink uh I'll hand it over to Natalie tocover the next few aspects yeah so um wetalk about how we're addressing usersand how it's really important to bedirect and communicable in that way umbut on top of that in terms of like howyou communicate in the docs we alsothink about in terms of the personasthat we're looking at and skill sets ofhow our docs are consumed and how peopleare going to be reading what we'reactually putting out there and there'sthis really really cool concept calledthe mento pyramid that we want to talkabout here the mento pyramid principleum that helps us actually structure howthose how docs should be consumed for usuh specifically in the Kubernetesproject so the mento pyramid principlefor folks who don't know it's aframework for documentation specificallyin the technical writing world um and itstructures information like a pyramid sothe idea is that at the very top is themost important takeaway and then you godown in that structure to the most andleast in well I don't want to say leastimportant but let's say you're gettingmore and more granular in terms of theinformation that you need um and thisconcept is really interesting for quitea lot of factors because we talked aboutpersonas before and we need tounderstand and we constantly are askingand engaging with users and contributorshow you're using the docs why are youusing the docs people may just say "Ohpeople want to learn how to useKubernetes." Okay great um what are youusing it for are you um using it foryour for your work are you doing hobbyprojects at home is this something thatyou had to do because your boss told youare you moving uh careers into S sur orsomething for example or are you takingan exam all of these different kinds ofuse cases is something that we like totry and focus on in terms of doc's umconsumability and the mento pyramidprinciple helps us do exactly that whatwe want to be doing with thatspecifically if I go to the uh nextslide as you can see 79% of onlinereaders strictly skim content when itcomes to docs especially docs that areso vast and so detailed like theKubernetes documentation and so we wantto make sure that we're using somethingthat we call scannable text principlesin the documentation style that weemploy there so what's a scannable textprinciple for example subheadings uhhaving the main main key takeaways atthe top bullet points is a very goodscannable text principle things likethis that seem a little bit sometimesobvious but you're not sure why it'sreally grouped into these kinds ofdocumentation principles to make sureinformation that you need is beinghighlighted in the places that you needit and so our doc style guide again umis very very much uh um influenced byhow general users read on the web i'vegot um information about that um studyin this slide as well where we want tobe making sure that any use case or anypossible use case that any of ourpersonas have can actually be uh be belooked at and be uh um good with thedocumentation that we provide and so Imentioned exams before and this is areally great use case that a lot ofpeople don't kind of think about thatoften you know when people are takingthe CK and CKAD um they're often kind ofstudy cramming these folks know usuallyhow to use Kubernetes but that thoseexams are hard and they're hard on sortof on purpose but we make sure that ourdocumentation is kind of geared intiohelping folks with these exams as wellso they can kind of you know study forthose last three or four hours beforebefore they're kind of taking thosetests i know a lot of people who do itand it's and and and it's the docsstructured in the way that we do aroundthis menty principle is actually reallyreally great for this kind of use caseso it may not seem as an obvious thingor like why would SIGD docs care aboutpeople taking the docs exams well it'sagain one of the different personas anddifferent reasons why people would beusing our docs and why we want toimprove the readability of that so umthe um the menty pyramid principle issomething that um it's it's a very nerdytech writing thing but if you ever wantto hear more about it please come speakto Divian and myself because it's areally cool way to structure yourinformation for projectsall right and now we're going to go intosome communitydriven development we'vegot 10 minutes Divia right um so uhagain we talk about docs um anotherreally great um and and and contributingto them and another really great way togenerally contribute to docs and we saythis to folks all the time is actuallyusing them if you need to follow atutorial and it turns out that maybe astep that you've replicated is wrong butyou know the right one put in that PR tofix the docs for example this issomething that I can't stress enough thebest way to contribute is to use thedocs and possibly find those things thatcould be fixed in the first place andthis is part of the communitydrivendevelopment work that we want to kind ofchat about today um the Kubernetesdocumentation it may come as a surprisebut it's never perfect or complete umand this is something that we stressconsistently we have three leases a yeara year I mentioned and again because wehave different features graduating todifferent stages we have to um updatedifferent feature flags differentfeatures that are new need documentationbecause they're userfacing um justhaving docs um and having a docs team aspart of the release team three times ayear you're constantly going to getthese updates on top of that we have 16localizations including English in termsof what our do the languages our docsare available in and there is no waythey are always up to date either infact the amount of edits and and changesand improvements that come into Englishdocs there is a natural lag that willhappen for our localization teams andthey're always trying to make sure thatthey're catching up to give their usersthe best experience on the doc side tooand then finally on the versiondocumentation side in terms of supportthat that's another reason why docsaren't always going to be perfect thereare going to be different versions ofdocs when different features come aboutum and that's something that um adds tothe inconsistency of completeness whenwe come when it comes to Kubernetesdocumentation so we need continuouscontributions um and that comes from avery dedicated community i like to saywe are all pro driveby contributors ifyou just want to do one or two thingscall definitely get in contact with usthat's that's something that weabsolutely happy to to accept but withthat we also have this idea in mind ofmaking sure that we're keeping thoseusers in mind in terms of um readabilityum and we're also keeping in mind acouple of things where I put myengineering manager hat on for examplefor a second in terms of whatcontributions that we want what wouldthe docs be improved with a certainchange this is a question especially inSIG's elite SIG docs leadership we askin PRs and issues and in meetings allthe time is this is this an improvementare the docs better after this merge umand the majority of the time it is yesand that is why we're leaning towardswanting to get that contribution throughis it perfect doesn't need to be but arethe docs improved and better because ofthis great let's try and push forwardwith this change um and so that's why wetalk about that we encourage PRs thatare good enough um it is really reallyhard to as as a contributor going maybesomewhat quite new into the projectj andyou're um putting a um a PR through andyou're doing a lot of back and forth umcomments and reviews and it's a longdrawn out kind of maybe possibly painfulprocess because there is maybe areviewer or two who wants to get thethings that you're submitting perfectand exactly right that experience sucksum and we want to make sure that there'sa bit less of that that there is a bitless of that are these docs good enoughis this change already an improvementand can we make another PR to make theimprovement better again this issomething that is a like I think in thelast couple of years for docs is a bitof a new leaf where we're really tryingto just get the changes out there andget the experience to be betterand with that each contribution has anengineering cost this is something thatespecially when I used to manageengineers I used to I used to talk aboutlike how long is this going to take howmuch of your time and context switchingis this going to take and so on and soforth there is an engineering cost toevery contribution and so when we talkabout good enough we do still have acertain uh uh baseline and a certainstandard and what we talk about there isthis idea of trivial edits again a driveby contrib contributor might just see ohthere's a small typo in this equivalentof three pages of documentation i'm justgoing to submit this small fix puttingthat through getting the review tohappen getting the merge to happengetting the PR wrangler that we have onthe week to actually see thenotification i'm so guilty of this it'sit's something that comes at a cost andso what we want to do is encourage folksto ask themselves can this dock beimproved further maybe you find a secondthird spelling mistake maybe you seethat actually this big chunk ofinformation would be better suited intoa bullet point format for betterreadability and you giving yourreasoning as to why because it helps youmaybe read something this way those arethe kind of contributions driveby or notthat are really great for us to receivebecause it shows us that you're thinkingabout how the docs are going to beimproved for you the user in terms ofhow you then consume the docs and that'sthe absolute golden golden standard forme and I think Divia and a lot of otherdocs maintainers when it comes to PRsthat are coming through so yeah everyevery um every contribution has anengineering cost um uh um I I can waxlyrical about this topic for a long timeso I'm going to I'm going to move on andthen with that on top of that we're alsoalways looking for feedback with ourdocs we um these are two quickscreenshots from the majority of thedocumentation that you see up on theKubernetes.io IO website where at thebottom when you scroll to the verybottom of a docs page in the bottom lefthand side of this slide you see thislittle feedback option where it saysfeedback was this page helpful two clickbuttons of yes no we get a lot of nos umand that's okay too because that givesus a great signal um unfortunately thesignal is just it was not helpful but wedon't know what part wasn't helpful orwe don't know if we don't know what themeaning of helpful is to that specificuser because of what they're using thedocs for and again it we need to get abit more granular with that too andwe're looking for ways to improve thatum but then on the right side the rightimage that you see on the screen thereum in the top right corner of um all ofour docs pages we have these optionswhere we you can edit um the page createan issue um you can print the entiresection which uh Matt Fina from SUSAjust mentioned that uh if you wanted top print the whole API referencedocumentation it would be over athousand pages so don't do that um butbut um we also add this here so thatfolks can see and have a bit of a callto action that wow I can contribute tothis thing that I think is maybe thatmaybe could be better and that'ssomething that we're always hoping tohighlight to users as wellso now I'm going to touch upon the factthat um although we've given you allthese tips and tricks we're no we're notthe standard uh we are not perfect andthis has been drilled consistently Iknow that but uh we are also looking umon ways to improve the documentation oneof the ways uh one of the things thatwe're looking to improve is our APIreference documentation and the way wegenerate it um in fact uh if you werenot able to catch this talk um at thiscubecon I think it was just yesterday itwas yesterday yeah it was just yesterdayand probably in the rooms beside um youknow uh in the to the side of ours butum this is something that we areactively trying to revamp uh as aproject we have and we have a subproject dedicated to referencedocumentation but again it's um it'sit's it's it has a lot of issues becauseit's it's go it has a lot of code thatis basically um you know needs to berevamped and we are looking for help umso if yall are you know proficient withgo uh and would like to contribute or orPython because it's just a bunch ofbatch scripts that call a Python scriptplease help us yeah we we are lookinglike we've we've iterated this requesteverywhere um so now uh now it's I thinkthis is the best avenue to sort of likeissue a call to help so um if this issomething of interest please do reachout um and the next bit um uh I thinkI'm going to put Tim on the spot here uhthis is uh one of the things uh about umour blog sub project and we have beenlacking blog reviewers and approvers forquite a while right now we've beentrying to improve the situation uh anduh fortunately or unfortunately we'venot been able to do much although therehas been little you know spurts ofprogress here so this is something againthat we're looking to improve becauseblogs um as you might um you knowenvision they come out every release andevery almost we have like feature blogswe have a release blog we have blogsthat just do not fit anywhere in arelease but are just like use cases andcase studies and stuff like that sothere's a lot of work in the block subproject as well with respect to approvaland reviewing and we need people likeall of this is still very much done bypeople with actual personal liveswhenever we find time and you have apersonal life yeah I do I unfortunatelylike I get four hours of like time whereI sleep so yeah so uh so uh we haveactual personal lives and a day job sowe are all doing this um in a voluntarycapacity and you can help um you knowjoin the kubernetes slack um we hang outon sigoc slack channel but there arealso a couple that I've not mentionedhere dedicated to localization and theblog sub project but if that's ofinterest please let us know but do jointhe slack.k.io IO um Kubernetes Slacklink and um we urge you to read ourcontributing guide if you are sointerested in contributing to thedocumentation and whatever Natalie and Ihave said actually does reflect in mostof our uh contributing and style guidesum there's a PR process and now sincewe're in CubeCon I can uh you knowassume safely that a lot of you all knowuh GitHub so we have a PR process thatwe go through when we uh accept acontribution so please be familiar withthat um and um if you have burningissues um feedback um you know we haveuh mailing list but we'd rather chat faface to face on a call during meetingsum uh attend our SIG meetings you willget the um invites to those by joiningour mailing list and last but not theleast um there have been some fantastictalks in at previous CubeCons uh they'reavailable freely on YouTube you don'thave to pay us so please go ahead andwatch them and um Tim here has actuallygiven a good talk with um you knowCeleste and Brad um about uh theintroduction to Kubernetes dog so pleasego ahead and watch it and if you're anew contributor want to get started umattend the monthly new contributor meetand greet that we have for sick dogs umit's first Tuesday of the month um 10:30UTC i do not know what that translatesto in other time zones but you can getthe invite in your inbox by signing upfor our mailing list and I don't knowhow much time we have left but thank youso much for being a patient audiencewith us yes thank you so much if youhave questions come and find us thankyou2025-04-15 21:57:41.747132 �`��[%#��mALEiFzJnqU-Ethank you for coming out this talk wasoriginally aimed atbeing sharing our end user story as endusers of the project and contributors toSpiffy Inspire now after chattingthroughout the conference with differentfolks throughout the week we feelthere's still a lot of skepticism andlack of understanding possibly even feardoubt and certainty around quantumcomputers and quantum resistantalgorithms so we shifted things a littlebit last minute to explain some more ofthe case in the background was theimpetus behindit i will preface with I am not acryptographer but over the past yearworking with Hugo and closer to thecryptography community I have come tolearn the most dreaded situation for acryptographer to be in is to have toexplain the math of a cris crypto systemto a lawyer and why is that mathsecure i recently found myself only acouple weeks ago having to explain it toa lawyer myself and to my surprise hewas able to grasp it and he hit me backwith a great analogyof cy cyber attacks used to be very muchlike a bank heist against the clockquick frantic loud they only had as muchtime to like detection meant failurel��|$#��/AQ40yLLLIW9Qall right so uh thank you so much toeveryone that actually showed up forthis talk right at the end of CubeCon umwe did not expect this audience so thankyou very much for showing up this hasbeen the largest crowd we've spoken toso last year we had a lot in um uh inParis I believe yeah that was that wasjust like a smidge larger but thank youso much for showing up um I'm Da Moanand uh I'm joined by Natalie here uh andwe're going to talk about how we atKubernetes are um trying to bridge thegap to adoption with the help of docs soto give you some context we are some ofthe people who chair documentation atthe Kubern on the Kubernetes project umour co-chair unfortunately could not behere but uh we have some of our stellartech leads and if you've probablyencountered them um or you know attendedone of their sessions yesterday that'sgreat but um yeah I'm at my day job I'ma principal technology advocate at Souzauh one of the co-chairs as I mentionedbefore and I also serve on theKubernetes code of conduct committeewhat about you Natalie uh yes hieveryone I'm Natalie Vlatco my pronounsare she her and um I am an open sourcearchitect and OSO lead at Cisco a littlesmall company that you maybe have heardof um I'm also one of the co-chairs ofSIG docs i've been involved in thecommunities project for quite a fewyears um and I have recently become umappointed to the to-do group steeringcommittee and the to-do group is a groupthat's part of the Linux Foundation thatgets together folks who are Ospo expertsinterested Ospo curious folks who wantto um um share best practices andinformation about how to help yourcompany your area your um domains getbetter in the open source program officespaceand um so uh starting right out uh whywe're here today uh basically we're hereto talk about documentation in thecontext of open source adoption becauseum one of the things that we've normallynoticed um in the open source space umwitdm nowthese attacks have turned a lot likeforklift jobs the attacker lifts theentire safe the data the credentials thesystems and takes it somewhere quiet noalarms no rush just a slow methodicaldecryption andexploitation a blast open safe does beara lot of resemblance to unscrambled datafrom a Kubernetescontainer forward secrecy protects pastand future data even if long-term keysare compromised this is especiallycritical in systems whereconfidentiality needs to last fordecades like in healthcare finance ordefense while current cryptographicsystems are still holding strong therise of quantum computers could changethat dramatically even though we don'thave large enough quantum computers yetthe threat isthere a quantum computer operates overcubits a single ideal cubit exists in asuperposition of the basis states of K0and K one which correspond to theclassical binary bytes zero and one butinstead of these being just values thesebasis states are actually orthogonalvectors and they form the basis forlinear vector space what that means is asingle cubit can express any linearcombination of the vectors K0 and K 1 solet's look at the composition of a cubitwhich we call phi we will write this asKFI equals alpha K0 plus beta K 1although by definition alpha and betaare called a probability amplitudemeaning their squares have to add up toone the cubit allows for states like getzero plus get one divided by the squareroot of two which means it's in a statewhere both it's zero and one will equalprobability and more importantly thatpropertyscales most traditional encryption todaysuch as RSA elliptic curve cryptographyand diffy helman rely on the difficultyof factoring large numbers of solvingdiscrete log logarithms however quantumcomputers can run short algorithms whichcan factor large numbers in polomialtime making RSA analytic curveencryption obsolete shore remains thebiggest threat shore algorithm uh wasproposed by shore in 1994 and defendingagainst it requires a new set ofmathematical algorithms we have todiversify the set of mathematicalproblems that's why the nationalinstitute of standards has alreadystarted standardizing quantum safealternatives of the quantum safealgorithms latisbased cryptography makesfor the most versatile foundation forquantum resistant crypto enablingu multiple primitives like fullyhomorphic encryption signatures and keyexchange mechanisms they offer strongsecurity reductions to problems likelearning with errors and are built inlinear algebra over np heart latticeproblems structured as n dimensionalgrids as you see on the slide this isthe backbone of schemes like Kyber andthe lithium in round three of the NISTpostquantum cryptography standardizationprocess which culminated last year NISTselected Kyber MLEM and MLDDSA asfederal information processing standardsFIPS 203 and FIPS204 now breaking classical crypto mightfeel decades off right now we would needthousands of physical cubits just to getone one reliable logical cubit and IBMthinks we might not reach that leveluntil the late 2030s that said asurprise breakthrough a black swan eventcould speed up things much faster thanexpected right now there's at least a10x capability gap between hardware andwhat they need to break what they wouldneed to do to break modern cryptographyso we're not quite there yet but of whatmakes it not quite there yet aretechnical hurdles like cubits notstaying stable for long enough there's alot of errors during computation andchallenges in scaling up this quantumcomputer systems we haven't reallycracked quantum error correction whichis essential for making quantumcomputers reliable in recent weeks wesaw Google announce Willow uh at 105cubits and that's not quitecryptographically relevant yet we wouldrequire millions of cubits and errorcorrections for such a task and thecurrent focus is quantum error correcorrection and scalable bits at the sametime the expert opinion has started toshift this is a graph from the globalrisk institute where the surveyedexperts put the likelihood at around 30%of there being a breakthrough in thenext decande that's quite seriousconsidering attackers can alreadyharvest data and decrypt it later soeven without a quantum computer yetthere is a threat window the moment datais encrypted today and for the timeperiod it needs to stay secure and themoment in the future where quantumcomputers might decrypt itretroactively so if X + Y is greaterthan the quantum thread because the dataencrypted today like consider healthcaredata requiring long long-termconfidentiality must stay secure forabout 50 years it will take 10 yearsshould they commence work today tomigrate systems to quantum safealternatives if a quantum computercapable of breaking RSA and ellipticcurves arrive within the next 60 yearsthat organization's already at risktoday butuh very well framed by Ryan Hurst in thecryptography community if you handlelife critical data you should already betaking action if your organization dealswith sensitive data that requireslong-term confidentiality you shouldstart thinking about it today but if youdon't deal with either chances are yourorganization may not face this in the inthe time of yourcareer but quantum computers aside if aquantum computer never is realized whatis the value of these algorithms giventhat there's enough attention to theproblem classiccryptography has subtle implementationflaws that can lead to significantvulnerabilitiesyou can look at the documentation for uhPython scriptography the hassmat moduleand it has the warning of you shouldonly use it if you know 100% what you'redoing uh things like timing side channelattacks weak subgroup attacks oroffcurve inputs these crypto librariesoften expose very low-level hazardousAPIs that are very easy to misuse it'slike shooting yourself in the foot anddespite strong standards there's manyflawed impairment implementations thatremain persistent to these problems sothe merit of this newer set ofalgorithms lies in the diversificationof problem particularly constant timingimplementations deterministic functionslike shake which reduce the chance offailure due to poor randomness definingsampling stops for attackers forexploding bias and deterministicbehavior uh many of these algorithms areformally specified and formally verifiedand fast key generation isn't reallyabout performance it also enablesforward secrecy at scale and vettedparameters prevent uh likedo-it-yourselfconfigurations secrets can be longivedparticularly in national defensegovernments tend to work with secretswith life cycles for about 25 years ontop of that with the grounds of storenow decrypt later attacks meaning anadversary might store encrypted messagestoday that have been exfiltrated anddecrypt them once they have a largeenough quantum computer later uh manygovernments like the UK and the US hasstarted to legislate a plan to migratesystems this is from GCHQ uh only lastweek NSA uh the last year releasedupdates on CNSA 2.0 O with the advisoryto account for quantum resistantcryptography we know cryptographicmigrations tend to take a lot of time wecan look back at the SHA 2 migrationtimeline and the sequencing of phasesfor cryptographic algorithm transitionfrom selection and developmentstandardization implementationdeployment and usage where we're attoday is like very early steps ofimplementation but the majority of theindustry is is not quite there certainlynot the enterprise there are uh realworld adoption examples of uh keydigital infrastructure this is not anexclusive list there many more opensljust announced in the in the last weekuh near-term roadmap for it so how canthe kubernetes community and cloudnativecommunity take action to prepareIt's important to limit the utility ofkeys to an attacker and the key lifespanis not just about the algorithm itlargely depends on how the keys aremanaged using keys for shorted periodswill reduce exposure even if those keysare compromised regular rotation willlimit risk but it also will help youstay prepared for transitionshistorically as an industry we haven'tdone a great job if we look athigh-profile compromise like digoterheartbleleed storm a lot of the keymanagement owe have in place todayfocuses on reducing sprawl but not intrue protection even if you're usinghardware security modules keys are mostexposed when they're in use or whenthey're distributed and that's exactlywhat where the fences are theweakest cryptographic agility is aboutstaying ahead of the evolving risk thatmeans being able to swap algorithmsciphers and protocols easily managingkeys securely throughout their lifecycle and adapting policies as standardchange and staying prepared for newcryptographic threats on the horizon asthe cryptographic right answers evolveand uh the use of postquantumcryptographyincreases commercially the anxiety inthe enterprise will grow the fear isoften driven by lack of clarity on whatapplications use cryptography and theunknown outcomes of changing algorithmsand rolling keys and that hindersadoption now to overcome that anxietythere's steps you can take you caneliminate manual processes and removethe human from the loop you can make useof better visibility and observabilityto help trust the safety of changeswhile standardizing the practices thatmake everything easier to understand anddebug and by regularly exercising thosechanges you will build the confidencethat teams need in order to move fasteradam Langley's quote is a great metaphorfor crypto agility the idea is fairlysimple instead of spreading complexityacross the system focus on a well-definfined maintainable point one jointwelloiled that way when crypto needsneeds to change you will have a singleplace to update making your system bothadaptable and securekey management is often treated byauditors or compliance teams as acheckbox and there's rigid basedpolicies but also ensuring you overcomethe disconnect between policy teams andoperations is critical there are anumber of operational challenges uh thatI'm not going to cover at length buthaving them on screen for for you andfor posterity if you want to look at itafter so let's bring it back into beinga spiffy maintainer talk uh I have usedthe Spiffy throughout the last decade ofmy career throughout many differentorganizations and there's a number ofreasons why we chose to build around itit gives us dynamic verifiableidentities for every service there's nomore static API keys or long livecertificates the credentials areshort-lived and automatically rotatedreducing the risk and identities aretied to trusted runtime signals and withspire the reference implementation weget centralized control over theissuance renewal revocation all governedby policy when authenticating workloadsservice account level trust is notenough you need workload level atastation you measure the code verify theruntime behavior possibly even tie it totrusted execution environments or policybased context so with that I'm going topass it on to Hugo and talk about uh thework and the approach of fitting PQCinto Spiffy Inspirehi so Spiffy and Spy make use ofcryptography in ways which requireadaptation to obtain PQC security soit's based on TLS mutual TLSspecifically which uses asymmetriccryptography it uses key exchangemechanisms and it uses certificateverifiers which isa piece of data which verifies that thecounterparty controls thatcertificate these are vulnerable tocryptographically relevant quantumcomputers so these need to beretrofitted with new quantum resistantalgorithms in order to make mutual TLSsecure in the postquantum landscapespiffy is based on TLS it is based onX509 certificates it can also use uh JWTtokens so we need to retrofit theseformats with new algorithmsso the real highest threat that is inthis landscape is that of thecryptographically relevant quantumcomputer poses to key exchange becausethis allows attackers to capture datathat is encrypted now and store it onthe speculation that at someday in thefuture they may have a cryptographicallyrelevant quantum computer and then theycan just retroactively decrypt all datathat they have captured today whichmeans that you actually need to bethinking about this threat right now sothe absolute highest priority is totackle the key exchange piece in the TLSstapck and replace the existing keyexchange algorithms such as ellipticcurved deb helman with a postquantumsecure key exchange algorithm of courseyou also ultimately want to retrofit thesignature algorithms in X59 certificatesand in JWTs with postquantum securealgorithms but because that requires anactive attack it's lessurgent so we integrated postquantumcryptography algorithms to Spy we addedKyber X259 hybrid key exchange which ispostquantum secure to the TLS stack thatis used by Spy we augmented X59certificates with dialium 3 signatureswhich are now the basis ofMLDDSA that has been adopted by NIST asthe basis for authentication so we havetackled both of these threats both thehigher priority threat and the lowerpriority threatdesigning for quantum security we havenow TLS with postquantum vetographywhich can be used alongside with thefiltering policies that selium allows toexpress filtering policy at layer 7including HTTP request fragments thisenables um secure crosscostcommunicationso the building blocks Aspire Psylliumand Envoy to actually support the umservice mesh approach where HTTPrequests being made over MTLS canactually be audited and inspected interms of the URL structure to allow anddeny requests at an application levelbased on what is in thatURL so now we have a short demowe did didn't really plan not to bemiked up and having to handhold themicrophone so I might put it like righthere right here for Hugo thanksso we we have fitted this for uhtactical edge equipment uh we have aXR4000 and a and couple gateways so whatwe have here is a three node Kubernetesstack running psyllium as you can see ifI run on selium status we have a lot ofdifferent things running and it's all ingoodstatus so we have a short payload[Music]ah there wegookay so if we take a lookat our namespaces we can see we have adefault namespace and we have a simple testapplication running in the defaultnamespace so if we now take acommand such as thisSo we've now made an HTTP request fromour demo podworker to our Temo echo server and wecan see that we just have a simplediagnostic result over this HTTP requestbut if we try a different HTTP requestwe can see that our access is denied whyis this well it's because we'veconfigured a psyllium networkpolicynamely this which as you can seerestricts if we take a look at the YAMLthat defines this we can see thatthis restricts the HTTP requests that weallowed to make so that we are onlyallowed to make requests by this URL andnot to anything else so we are allowedto impose fine grain policy where thisis not allowed but this is uh the bigdeal is of course that this is goingover envoy service mesh so that from onenode to another this is all being uhsecured over MTLS which is postquantumsecure so that your application is notrequired to know anything aboutpostquantum algorithms the node to nodecommunication is secured and you canactually add policy on top of that so wecan also take a look at some things thatshow us how this isworking so forexample if I just paste in a commandhere we can paste a command in hereand we can see that there is a wholeprocess going on behind the scenes inwhich selium is interacting with spireto get authentication certificates thatcan be used node to authenticatecommunicationswe can also see the identities that aregoingon eachnode as you can see there's a numericalidentity assigned to each workload andthis is then retrieved as an X509certificate from Spy which is used forMTLS communication[Music]so we can also look at an X509 SV IDspecifically so for example if I takethiscommand so what is this going to do wellwhat this is going to do is this isgoing to query the spire server for anX509 SV ID which is a kind of X509certificate and it is going to requestthe identity for our Echos server demoand it is then going to decode that X509certificateso what do we have here what we havehereis a new certificate which is signed wecan see that there is a spiffy identityon it for our echo workload but we canalso see that the signature algorithm isactually something new it's actuallysomething new enough qthat the version ofOpenSL installed on this machine doesn'tunderstand so it's just telling us thatit doesn't understand it and showing usthe raw data but um I can tell you thatit is a DIY freesignature so there you haveit and that pretty much sums itup okay yeah it's uh noteworthy that uhthe support for Kyber and the lithiumare part of uh Spire server Spire agentconfigs i don't know if you want toshowcase that real quick or[Music]Could could you do a cube codle edit forthe spy agent inspire serverconfig okaywe'll share the the code examplesafterwards uh as well with the changesthat were upstream to spire but I'mgoing to pass it back to Hugo for someuh keytakeaways here you can grab[Music]this so basically cryptographicallyrelevant quantum computers threatencurrent asymmetric crypto especiallytheir stor later attacks where an attacknow just recording traffic that youcan't currently decrypt could bedecrypted later in the future soadoption of at least postquantum securekey exchange mechanisms is critical nowto future proofing systems like spiffyinspire and other PItechnologies so that that is the highestrisk um hence you want to adoptpostquantum secure key exchangemechanisms particularly hybridmechanisms such as x2559 kyber 768 whereyou are actually hedging against thepossibility that either algorithmbecomes insecure in the future likewiseyou would ultimately but this is of lessof a priority want to retrofit X59certificates and JWT tokens and otherthings that use signature algorithmswith postquantum secure signaturesuh there are some gnarly parts of theTLS protocol which can result in apeculiar outcome such as where bothsides of a connection support apostquantum secure chem but due toslight implementation differences theyactually don't negotiate the chem thisis something to watch out for and inthis case they end up falling back to anon-quantum secure key exchangemechanism um so you actually need tohave a way to to measure what you wereactually negotiating in the field andverify that this is beingused so with the Go ecosystem and ofcourse this is highly relevant toKubernetes very large num amounts ofsoftware implemented in Go go has nowshipped mainline hybrid key exchangemechanisms for TLS with Go 1.23 so inmany cases you can actually deploy thisin your existing Go code just byupgrading to Go1.23 uh for things like uh postquantumsignatures we actually made an adoptionbased on Cloudflare's circle which is alibrary that Cloudfare pioneered forthese Dithium fee signatures uh the NITstandards have now been published so youshould expect a lot of churnthank you forthat security is never done so weencourage you to start planning andconsidering uh the sharp edges thetrade-offs the considerations to embarkon migrating to quantum resistantalgorithms and uh working with the opensource community and sharing thoselearningswe would like to thank uh the spiffyinspired projects as a whole thesteering committee the maintainers andthe community of the respective CNCFprojects that make part of the data pathwith which we showcased it's literallytaken a village but we're veryappreciative for for the support andenabling us to carry out this work atthe same time uh as a small businessworking in the front of national defensewe want to share our customers uh well abig sentiment of gratitude to the UnitedStates Department of Defense and ourpartners at the Dell TechnologiesFederal team for allowing us andenabling us to carry out this work anddemonstrate a feasible path that canlight out uh an approach again like onejoint welloiled in order to bringquantum resistant sooner than would haveotherwise been possibly realized uh hadwe been doing this individually and andnot in theopen thankyou i think it's launch time but happyto take any questionsi I thank you um I wonder if you're ableto comment on the delta in overheads inin running uh in the in the target stateas opposed to the the old crypto i canbarely hear you with the survey runningby my side can you ask the questionagain please louder louder um I Iwondered if you were able to comment onthe delta between the overheads inrunning in the in the target state andthe state so I believe what you'retalking about is the larger key sizesand traffic sizes of postquantumalgorithms correct exactly yeah yeah soyou are going to be paying a a highercomputational cost for the for this kindof um postquantum security um everyonewould like an algorithm that exists thatis postquantum secure which is as fastand lightweight as the classicalalgorithms but um we are waiting forGodo on that oneum so you're going to be paying a highercost but I don't think that you're goingto be really having it as a concern inpractice because the cost of the TLShandshake in a large variety ofdeployments is not the bottleneck inyour workload even before or after soyeah you are going to be seeing uh moreTLS traffic overheads but I think thatmainly shows up let me trade places withyou real quick no no go on but I thinkthat mainly showsup in[Music]um environments where the primary thingyou're doing is terminating TLSworkloads so for example if you have aTLS load balancer that is acceptingrequests from the internet and it doesonly this well for example you'rerelying on that load balancer to be ableto handle a million requests per secondif you adopt post quantum cryptographythat may go down but in an environmentsuch as Kubernetes where the overhead ofa TLS connection is lost in the noise ofcountless other computational activitiesthat are going on I don't think thatyou're going to see a significant impactdoes that answer your question yeahthank youhey guys great talk thank you very muchfor that um I was wondering if you couldhelp me unpack a little bit what is itthat we could do like today to prepareourselves for these new breed ofalgorithms like if NIST is standardizingand ratifying latisbased stuff rightyour open SSL version was so you knowthat's so bleeding edge that even yourOpenSSL version on your laptop couldn'treally recognize the algorithm is it tooworldly do you think like all thosebuilding blocks that you've mentioned uhthe spire the psyllium theum I don't recall all of them apologiesbut it seems that the stack is fairlyexperimental right so would you say thatwe are like as cryptographicum you know non non-experts oncryptography but someone with a vestedinterest on making our system secure isit are we ready to even start adoptingthe lithium and all these Latis basedstuff is our stack going to break if wedo that could you just shed a little alittle bit of light on that so I thinkas an end user or as a systemadministrator or operator deploying inthe field um the main thing you canreally do is listen to the advice ofyour vendors and upgrade when they havea postquantum story available and ifthey aren't promising a postquantumstory in the future then you should beasking them questions on that um interms of what's available Now we'reactually finally seeing starting to seethe practical viability of postconquumalgorithms in terms of existing softwarestacks um there's now compatibilitybeing shipped by OpenSSL but we'vealready seen actually for a number ofyears um Cloudflare and Google shippingexperimental versions of thesealgorithms in Chromium and so on and soforth so we're now getting to the pointwhere this sort of thing is being turnedon by default um as I said in Go 1.23 23um you now automatically get the benefitif you want it of a postquantum securekey exchange mechanism so again it'sjust a case of really upgrading yoursoftware when new versions becomeavailable and these capabilities will beturned on by default in most cases um ifyou're building your own software umthen you need to be looking at the SSLlibrary you're using and really enablingthat so in Go for example it's literallyjust a case of upgrading to Go 1.23 twofree and then suddenly this is in thebox it's automatically on so I would sayfrom an end user perspective it's mainlya question of upgrade when the patch isavailable and really track the softwarethat you're deploying and see if avendor has offered you an upgradeyet i believe no more questions thankyou very much2025-04-15 21:57:42.409189se stages that are alpha theexpectation of alpha is okay I have a afeature a proposal I know more or lesshow this is going to work i discuss itwith the six and all the stakeholdersand it's I wouldn't say a prototypebecause we want alpha to be usable rightpeople can try and should work it shouldwork no it must work but we understandthat alpha has uh it's impossible toknow everything beforehand right we needto to allow innovation so we cannot beenforever trying new things or prototypingor getting feedback so the the qualitybar for alpha is a bit lower then wemove to beta in beta we have um a morehow can I say a stronger requirement forstability right we if we have a featurethat is better and doesn't depend onAPIs this is going to be enabled bydefault that means that every Kubernetesuser is going to run their cluster andit's going to have your new featurerunning so when something is running itmeans that there is already a contractwith this end user right we cannot breakthem but we they still know that we arenot fullying getting the end details ofthe fetus so we may have some room forimprovement so they need to be awarethat some things can change and thenafter beta we move to GA in GA we have astrong strong commitment with with theend user and with implementation andthis is one of the key points of ofKubernetes right we have a very highquality bar and this means that we canhave an ecosystem that depend on ourfeature right you develop something inin one version against a J version and Imean we are not perfect we fail thereare bugs and there are things but Ithink that we are doing well on on ourstability of thefeatures and we talked about featuresbut you see one of the strongest pointsof Kubernetes are the APIs right most ofthe features depend on the APIs the thekey of the APIs is that allowto provide contrast with all thesethousand project that depend on Kubernetrightso you don't need to absorb all theproblems in Kubernetes you can createthe right interfaces define the rightsemantics and allow all the ecosystem togrow based on your interfaces and that'swhy it's important and you see that ifyou go to any API review or or you havea new feature there is a group of peoplethat is specifically attending to thisAPIs are very sensible and very criticaland we invest a lot of that but mostimportant that I want to highlight APIsare not just for defining the syntaxesright I'm going to put this field orwhatever thing it also define theinteractions with with the othercomponent right and it defines what thesystem should do it's not only to definethe fields you need to define how it'sgoing to behave because people expectthat when you add some field or when youset your dam manifest in a certain waythe system is going to react react incertain way and this is what providesinteroperability and portability that'sthe one of the biggest success ofkubernetes is that you get your handchart your jaml you apply in one cloudand runs in the other cloud too rightbecause we have a strong APIs with cleardefinedbehaviors but here is when when theproblematic start we talked aboutfeatures and now we talk at APIs theproblem with APIs is that if the APIsare better and are better forever it'shard to graduate them right so since uhI don't remember now which version 121or1222 all these uh beta APIs are disabledby default and as a consequence all thefeatures that depend on beta API aredisabled by default so we start to see abit more on the problematic of featuredevelopment right so now you havefeatures but you have also APIs and weneed to be able to graduate both at thesame time to offer the stabilitybut we also talkedabout Benjamin is going to dig a bitdeeper later on on these topics but wetalked about kept we talked about thefeature gates and we talked about theAPIs and in the ke process some yearsago we started to to find gaps in ourstability right we were building a lotof features and then the system wasunstable i don't remember in12123 we have a lot of flakes in the inthe CI right things didn't work whybecause we had a lot of features but thesystem was not sttable we have kind ofthis is a joke but most of you that workin software development I'm sure thatyou work in projects like this right sowe have now the production businessgroup that's the one that when you gowith the ke and you want to move to betaand to G they start to to ask these hardquestions what happen If how the enduser is going to control this justicerunning 10,000 nodes what are yourscalability requirements so this is animportant process an important groupthat the project put in place to controlthat when we release something isproduction ready and production readymeans you can run it and have yourbusiness on top of yourfeatures and well all these steps areneeded to guarantee a quality bar wewant uh high quality bar for the projectbut we know that the quality bar has tobe adapted to the development because weneed to innovate and we need to keep thecode stable and for achieving boththings we need to to face this rightthat's why we have this different stagesand different criterias and then how weenforce this well I think that's in thenext slideWe we enfor this with testing so we havea different test suit right we we runkind of a pyramid but it's a metaphor wewe don't want to d to get intotechnicalities but we have a strong fitintegration and it test and we have thisthe suite of test that is calledconformancethe the whole point of this test is thatuh integrations and third party projectscan can build application depending onthe APIs and the behaviors that's whyit's important we I said this severaltimes and I will not stop saying this weneed to guarantee portability we need toguarantee that application works indifferent Kubernetes versions anddifferent Kubernetes installations andthat's why it's so critical this thistest this is the minimum necessary thatwe consider for for a Kubernetes testcluster to be to run portableapplicationsokay how we enforce this well I touchedon this beforewe have a lot oferh we we spend a lot of money inCI and we run a lot of test and weinvest heavily in automation and we relyheavily on six if you think in in thetraditional software development thereis no developer QA and other people thisisan how can I say there isa a share responsibility thank you ashare responsibility right it's not thatthis is my feature this is your featurethis is our project and we need to taketake care of that right and it's ourproject and I'm responsible of thisfeature or of this area I need to beaccountable of of thisarea what is how we enfor is we reallydon't enforce it there are differentgroups of people that everyone adds thevalue there is a a group on C release CIsignal that monitors some specific testthat that rise that when something failsjust warn the the responsible of the jobthere is people in the six that areresponsible of fixing these jobs and wework as a community and I think that oneof the most proud things that that I'mof this project is that we have a zeroflake policy so we don't allow test toflake why because if a test flakes youdon't know if it's working or if it'snot working if it's your code or what itis and we have a strong commitment withthis and we had also this written in ourpolicies we don't allowflakes this over toyou so the SIG testing special interestgroup um is dedicated to Kubernetestesting you've heard about this a bit weto be very clear don't own the teststhat's the shared responsibility that'snot scalable there's so much complexityin all the different parts of theproject to have one central group ownall of those it doesn't make sense butwhat we do own is the testinginfrastructure the standards theframeworks the best practices andworking with all the SIGs onthat so why are we giving this talk nowwhy are we making some changes toimprove this pretty much DRA i'm I'mlooking at Patrick here who's one of ouralso sik testing TLS who's been workingon DRA dra makes this problem uh a lotmore interesting because we havemultiple alpha features that depend onbeta features and there's a huge demandto graduate quickly and get thisfunctionality available for all thesepeople that are running alul these hugeclusters with accelerators and need moreflexible scheduling and it is a lot morecomplicated to understand how are wetesting this correctly where can we runthese when it maybe depends on variousthings so we'll talk a little bit moreaboutthat so when we're testing uh a featurethat you've been developing ke the unittests are like fairly self-explanatoryyou have the scoped piece of code that'spart of implementing it and you'recontrolling running that code and youwrite test cases for it pretty standardsame thing for the integration tests area little weird ours are a little weirdbut the integration tests themselves arestanding up the environment so you knowexactly what the API server looks likebecause you just started it when we getto in to tests uh it's a lot morecomplicated we mentioned earlier thatthere a conformance suite well that'spart of it we have a test binary thatneeds to be able to point at differentclusters it's not really feasible tototally couple the end toend tests witheach of the cluster environmentsdirectly so when we get to end toendtests it is totally separate that we'rebringing up some kind of cluster andthen we're trying to run tests on it andwe need to know wait can I run this DRAtest case or not is it depend on analpha feature or beta feature and so onso in the past uh the main way that youwould tag one of these tests with aframework is you would just say well ithas a feature and feature is a supernebulous thing um and it would getinserted as sort of like a pseudo tag inthe generated test name with the rest ofyour like handwritten test name and itwould be in braces and say feature andthen the name of the feature you gaveand this is just an arbitrary string andit could mean that your test depends onhaving a load balancer installed um itcould mean that your test depends onhaving a feature gate turned on or itcould mean that the test depends on thecluster being configured in a certainway like we're going to test the dualstack logic so the cluster needs to havedual stack enabled not be a single stackcluster um so this is really heavilyoverloaded so what you would see is mostof the CI jobs would just skip anythingwith any feature tag like I didn't doany special setup i'm not dealing withthis and if you're building a featurethen you're going to go make sure thatyou have something that instead says I'mgoing to run things that match thisfeature and it just doesn't scale verywell um you're pretty much on your ownto figure this out and you can talk tous but there's no shared um systemreally patrick here did some work awhile back to make us uh onto GKO v2with some other members of the communitywhich also got us access to GKO labelsum we hadn't adopted them very much yetgenko labels with the tests they allowus to move metadata about the tests intoselectable labels that aren't crammedinto the test name you can actuallyquery them as we'll come back to andyou're not just doing a regularexpression that's include and exclude soyou can actually say you know I want torun certain things so we've uh in 133towards the end of the release we wereable to finish we tweaked uh this withfeature gate method so now if your testjust is testing a feature gated featureyou pass with feature gate and you passthe actual standard feature gatedefinition and I wrote here you may notbe familiar with this in Kubernetesthere is a centralized registry offeatures that can be enabled across allthe components and so there is like apackage that has all the featuresthey're named they have a bunch ofmetadata about them like who owns themum and and centrally approved they havethe state alpha beta GA if they're on bydefault and so on so you can nowactually pass to the ED tests this is mycanonical feature metadata and it canannotate the tests with the informationabout the feature stability levelwhether it's turned on by default andand so on the feature name and no morehorrible broken redexes trying to selectthe right testsso you now say my test depends on thisfeature gate and that information isthere you say my test depends on thisfeature vlike special setup installing acontroller or in the case of DRA itdepends on both we need to have a DRAdriver installed because there isn't astandard driver it's for whatever deviceyou want whatever you're allocatingdynamically so you need to have thedynamic allocation features turned onand you need to have the dynamicallocation test feature which issignaling that there's some extra setuprequired before it can run on a clusterso you tag it with both um and so thisis the this is this was talking aboutthis is the metadata we we pass throughwhat we um are doing as part of this issetting up some standard CI jobs uh thatwill run all of the tests that just needto turn on that feature gate and skipthe ones that require extra things sofor a lot of features in the projectthey're doing something like declarativevalidation that is a totally server sidefeature in the API server it doesn'thave any external dependencies it justneeds to be turned on because it's not aGA yet and to test that all they'll needto do is annotate this and so we will bereally pushing the feature approvers tosay you should not be promoting featuresthat are have no tests or beinghandested uh as may have happened in thepast there will be an expectation thatyou definitely have CI all the way downto an alpha feature even if it's brandnew um we need to know that it's workingif it's totally broken we shouldn't beshipping it to users um and itdefinitely shouldn't be promoting tobeta if it doesn't have reliable testsyet so when we're doing this we'rebasically saying okay this cluster we'returning on all the alpha features we'returning off evented because we that'sone of the features that had promotedfurther turned out it wasn't stableenough needs quite a bit of work buthopefully going forward we're justturning on all the alphas we're turningon all the alpha and GA APIs so that youdon't have to worry about thatdistinction in CI we're standardizing onwe turn on alpha and uh APIs andfeatures at the same level beta and soon makes it a little bit less confusingto think about and then we can writethis query that basically says if it hasa feature that it's an off by defaultfeature gate or it doesn't have anyother feature information we'll run thatif it's not a beta off by defaultbecause we're doing alpha tests and ifit's not a deprecated feature becausethat's another feature level that wedidn't talk about some extra complexitythere for basically here's a way totoggle something that we're removing forone more release you can turn it back onbut it it's going away um and then thisslow disruptive flaky is sometimes whenwe're working on tests we've kind ofkicked them out or or or said this isslower we're we're going to iterate somemore on that but basically there's somecategories of like these tests are knownbad um and we're not able to fix themquickly let's make sure that they're notblocking um pull requests until we getthat sorted outum if you have a feature like DRA youstill got to work with us to set up someCI for that but that will be adiminishingly smaller set of featuresand we will expect uh to keep up andhave that for everyone um alsouh let me come back to thisone where did that so one other thingthat happened here as we were working onthis is we actually found that while theKubernetes feature policy says alphasmust be off by default uh this wasn'tenforced which made this problem a bitmore complicated and also means we'renot communicating things clearly tousers there was in fact a feature thatwas on by default since 126 that was inalpha quality and it's going away in 133hopefully it's fine most of you probablywon't run into it it relates to hostnetwork on Windows um but now we havesome tooling in place uh that you can'tdo that it will block the pull requestif you try to set on by default andalpha that's an inconsistent statementto the user um so that simplifies how wecommunicate on the test to make surewe're shipping the right quality bar toproduction and this is the kind of thingthat we're working on to make sure thatgoing forward um you know we're 10 yearsin the prowject and still having featuresthat are partway through the life cyclebeing hand tested locally just isn'tgoodenough and so this is an example whenyou have a feature that you do need todo extra setup so we're saying it'sallowed to be tagged as it requires thisextra dynamic resource allocation setupum or an off by default feature becausewe're going to turn all them on to runthis because this has alpha and betafeatures um we've turned on uh the alphaand beta APIs that we need um and we'reignoring the flaky tests until we'vegotten um that sorted out and uh there'ssome extra bit here that isn't inlinebecause it's some fun bash that sets upactually having a a mock DRA driver umand this runs on a kind cluster which isanother sig testing subproject and these are pretty likecopyable so if you have a relativelysimple one um where you need somethinglike a load balancer or a driver uh fora dynamic resource we have like littlelocal implementations you just need tomake sure that if you're testing thatthat you know we've taken like this DRAjob made a version that installs theload balancer instead and runs your loadbalancer tests and we're doingthat so we'll be putting out some morecoms on 134 early in the release cyclekubernetes is about to ship 133 uhshortly after CubeCon and we'll bemaking sure that everyone who'scontributing in the community knowsabout these changes knows theexpectations to adopt them and theincreased expectations on not just thewritten quality bar that we've had butactually making sure that we've providedum the tooling and the shared resourcesto execute on it more easily and thatincludes getting more of this signalinto release blocking and pull requestblocking um we actually want to go asfar as to say even just turning on analpha needs to be blocking releases umnot the tests we know people are justgetting started iterating on those butwe'll have that signal running somewhereso it can block promoting them to betaum and pull requests should startblocking on beta APIs consistently uhthat's not actually guaranteedtoday make it a lot harder to haveregressions with these earlier in theprocess kubernetes used to lean a lotmore heavily on stability in GA we'rebringing more of that back down to betaand even down to alpha that even analpha feature shouldn't make GA featuresunstable thanks let us know if you haveany questions come join us in siktesting in the KubernetesSlack yeah[Applause]do wehave how long does a typical featuretake to run all the test suites in CIfor give it like one feature that'sreally hard to say they're all over theplace for some features we need to runscale testing you know uh we need tomake sure that if you make somescheduler changes that you haven'tregressed the performance on largeclusters and some of those test runstake 12 14 hours uh 5,000 node clusterswow so the the there is when we talkabout this there are two stages rightpresummit that is we don't want thisthing to merge and we have kind of aline of one hour to to run so you getsignal right you're developing you'resending something and you want to knowI'm breaking something or not so we havekind of a hard line in one hour it's notexactly but we try to keep that then wehave a lot of of test that running inperiodic jobs right and then there arelike a scalability test that run everytwo or three days and that's I don'tknow 1,000 5,000 that can take longerbut usually for developers that want tosubmit something we we tend to get a barof one hour for for getting feedbackwe also have a two-hour bar for releaseblocking signal and then we have somerelease informing signal that's like therelease team should look at this but itit's not guarantee blocker also to beclear um we're talking about like awhole suite we generally don't want tohave like a CI run that we have to dofor like a specific feature there's someoverhead to just like compilingKubernetes and bringing up a clustereven when we're running with kind we wehave to throw a decent number of coresat just building Kubernetes in areasonable amount of time there'sthere's a lot of code uh so typicallythese are shared suites and individualtest cases are are expected to be prettyfast normally but Kubernetes hasthousands and thousands of test cases tocover all of the everything so um itreally depends on the scope of thefeature okay thank you very muchhey um can you expand a bit about thezero flakes policyokay yeah uh uh we did a talk about thatlast time you want to Okay um so youknow if anybody spots a flake we arefiling a bug we are assigning peoplethat are working on that feature thecommunity is uh pushing to to fix thatum it takes an initial push to get to aplace where things are pretty reliableso people aren't used to flakiness and Ithink we've achieved that so now thatpeople aren't used to flakiness thecommunity is pretty good about why isthis thing failing it doesn't seemrelated to my change we also have a lotof tooling for this um we have thissystem called triagego.kates.iotriage um and we basicallyrun a pipeline that does KN&N clusteringof failure messages across all of the CIjobs uh so you can say hm I have a graphof over time this error message startedcropping up more so we have some failuremode with like starting up a pod orsomething and you can prove that maybeit only failed once or twice in a PR jobbut there are 10 other periodic teststhat are also hitting it and that graphhas spiked and there's a button tocollect that information and you canjust like file an issue with all thatand um the release CI signal team alsoworks really hard on this for uh all ofthe release blocking and informingsignal to make sure that the moment oneof those crops up we're tracking itwe're fixing it we're removing it thethe flaky tag you saw earlier was moreof how we handled in the past we wouldjust go okay we got to kick this testout we still have some but we've alsobeen eliminating thosethere is also a cultural thing right sobefore we had these uh gingo flakeattempts or something no so ifeventually pass it will we remove thatthe test pass or doesn't pass then yeahbut that that is the easy one like whenwhen you say zero it's zero it meansthat if it flakes one then you you haveto chase it yeah and then that's whatsaying is then we have a a a strongcommunity position on the right so youhave the releasing when they see thatthis periodic graph you know flake theyopen the bag when when there is aninvestigation can happen several thingsthat the test get fix it or that thetest get removed right so that's thething if you keep the flakes going atthe end you are going to have a lot offlakes so you need to take a decision Ithink that in my experience in all theotherprojects this is one project that takesthis decision right or you can get yourfeed out you have a feature that isflaking and it happened it was flakingit was better and nobody did anything soit got demoted to alpha so when when youstart when the project has this youposition and this strong commitment toquality then everybody start to absorbthis this behavior right and then iswhen you in the late run are able to toto have a more stable CI because theperson that develop the feature is alsoaware right and it's not that I'm goingto draw me a feature and leave it's I'mgoing to draw my feature i want myfeature and I'm going to take care thatmy feature is just stablei think to get there though we had tomake a pretty big push around this andlook at like what are sources of flakesget rid of the systemic sources offlakes like noisy neighbor problems inCI get that cleaned up so there's noexcuse it's like oh the CI is making mytest unreliable no no your test isunreliable and then burn those down andthen get people used to okay we're nolonger doing flake attempts and we havea bot that auto posts retest the CI whenit fails none of that's going on itpasses or it doesn't get people used tothat uh so that it actually feels likean outlier and then it's a lot easier toget buy in from all the developers thatyeah we should address these when theysee lots of flakes it's really hard toconvince people to work on iti think that's it i think we're timethanks all okay[Applause]2025-04-15 21:57:43.122632 ��.'#��AEBbuyn72jtwthree days it's long man give yourselvesa round of applause you madeit heck yeah all right I'm going tostart the timer i said I'd forget it andI didn't nice hi everybody my name isDan i work at a company calledSerbos this is CubeCon in London in 2025that is my title slide thanks forjoining me todayuh I get to work at Serbos and I get towork at Serbos on open source and beforethis I worked at lots of neat placeslike Ubisoft and Mozilla and data dogand I say this only as context to letyou understand where I'm coming from forthe remainder of this talk i've gottento dabble a lot in infrastructure uh anddevelopment and also in security and athing that I've learned uh along thispath I've learned a y��@&#��7Ad_9JNRkT7dghelloeverybody thank you for coming we aregoing to talkabout how to manage the quality inKubernetes and explaining how thegraduation process go from alpha to G myname is Antonio i work at Google hi I'mBenjamin Elder i also work at Googleokay we are both of the sicktestingand members ofsteering and for all of you that thatknow Kubernetes you may know this howKubernetes work kubernetes is animportant open source project with alarge ecosystem right it's not onlyKubernetes you've been here you go tothe booth you see I don't know howthousands of projects depend onKubernetes rightand to handle the development of of thisproject the way we are organized istechnically we have a specialinteresting groups right so each groupcan be horizontal and handle things likeAPI machinery or CLI or can be verticallike networking or notbut what is more important for for theproject and for the users is the projectneed to keep growing need to keepevolving right and this means adding newfeatures so I don't remember when thisstarted h we are very large and we needto get organized right so the the theway we organize the new featuredevelopment is via caps rightso the cat process is it canbe a bit heavy for newcomers or forusers but it has a purpose right has thepurpose of communicating across it soeverybody can look at the place oh I'mproposing this because it's impossibleto understand beforehand all the radiusthat your f is going to have so it'simportant that we have a center plane acenter point to communicate with withall the fus and also to be transparentwe are an open source project this isabout community right if I want to addsomething I need to get fib from theother six because you don't work inisolation we try to break all the silosso there is no network there is no nodethere is people that is interested innode that may work on network things andpeople that is interested in networkthat may work on on allthingsso let's let's d a bit uh a bit more ininto the care process right so when youwant to add a new feature then you needto to consider what is going to be thelife cycle of this feature right we havethrerzlot of things butone of those things is thatauthorization different fromauthentication thank you very much isn'tjust a security problem although it'soften relegated to that it's actually adeveloper experience problem as well andthat's really what I want to talk abouttoday um let's start with what I calland I'm not the only one who calls itthis is the authorization paradox okayevery single request in every singleapplication especially in today's modernwhisbang cloudnative environment rightneeds to answer the question at everysingle encounter is this user allowed todo this thing under what conditionsright it's unavoidable it's importantfor security but authorization isactually treated as an afterthought uhin our architectures and our workflowsespecially as developers especially evenas infrastructure people and we'veactually seen this pattern before uh I'ma man of a certain age and I rememberwhen deploying was just a collection ofbash scripts or like setting andgrepping and stuff like this right umyou probably remember when observabilitymeant grepping through log files asopposed to like a single pain you gotyour graphana and your data dog andstuff right now uh our industrybasically recognized these pain pointsand was like we don't want to feel thispain anymore what can we dobut authorization hasn't gone throughthis elegant developer transformationyet right it remains stuck in this sortof a weird strange place it's somethingthat every developer has to implementbut few have good tools or or patternsreally for building it right so itcreates friction rather than flow andwhen something creates friction we findways around it and sometimes not verygood ways right uh this leads to thisdisconnect and that's kind of what Iwant to talk about today okay but let'sstep back and look at how uh how we gothere right so how has it evolved overthe past decade um remember networking iremember networking i remember pluggingin stuff right went from manuallyconfigured load balancers and firewallrules to declarative service meshes CNIplugins right so we we did that leap uhinfrastructure concerns have now becomedeveloper concerns with clean primitivesand clean abstractions storage followeda similar path right moved from managingvolumes and mount points to justdeclaring persistent volume claims etcetc right uh CSI standards deploymentsyou know we transferred from errorproneshell scripts and stuff being done byhand and I'm just going to super quickSCP this over to prod we don't do thatanymore you don't do that anymore don'tdo that anymore okay and in this casetransformation made developers moreproductive whilst simultaneouslyimproving security improving reliabilityright that was the idea um so thepattern is clear like you have goodabstractions and those good abstractionsdon't just hide the complexity they theyreally change how we work and and that'sa good thing so why has authorizationremained kind of stuck um well it hasn'treally followed the same evolution pathand it's part of the problem is thatit's really often deeply coupled withapplication logic which I'm going toaddress in a moment um it tends to likelive inside business logic it's notoutside of it like networking or storageright these are just things we need todo the business authorization is muchmore of a business level problem so itmakes it harder to extract it makes itharder to standardize it makes it hardto rationalize about especially like ourlike weird developer brains we don'toften times think about the business anduh second authorization requirementstend to be domain specific likefinancial services tend to havedifferent controls than a SAS product ordifferent controls than like Uber fordogs right there's no single model thatworks for everyone and and that's roughuh third modern applications have likeincredibly complex relationship needsand like objects and resourceconstraints and stuff like this youthink about like I don't know GoogleDrive permissions or Slack permissionsright it actually gets really reallytangled up really fast and these aren'tsimple matrices of users and actions{right you think oh like user A canaccess document B and has CRUD access onit like that doesn't exist anymore ithasn't existed since the '90s right it'sit's much more complex than that there'sno one-izefits-all solution it doesn'texist a lot of people will try to tellyou it exists those people are trying tosell you something i'm trying to sellyou something that's beside the pointthe reality is there is no one you knowfits all solution it doesn't exist so asa result authorization as I said remainsin this weird space betweeninfrastructure and application andbusiness it's too complex to bestandardized too common to be solvedaresh every time right in principle atleast so this complexity has real costsuh it can't just be ignored it actuallycosts money which is the issue rightdevelopers constantly context switchbetween implementing features enforcingpermissions you can look like controllercode business logic intertwined withauthorization checks and this tangledrat's nest of if then else statementsit's freaking disgusting and it's superhard to maintain especially you've gotlike multiple mobile apps and likemultiple this multiple that anddeploying that it sucks these scatteredimplementations they create securitygaps it's hard to verify permissions arebeing enforced consistently uh whenthey're spread throughout the codebaseit slows velocity no one wants a slowvelocity we got to go fast we got tobreak things except when the thingsbreak and then then whatright so if making changes to thesepermission models requires extensivetesting oftentimes like regressiontesting because it's difficult tounderstand the full impact of thesechanges and it creates this massivefriction especially for onboarding newteam members right are you a new teammember waiting into this codebase forthe first time and you're spending thefirst three months you're there justtrying to figure out permissionstructures before you actually touch anyfeatures that sucks right so it's not asecurity problem it's not aninfrastructure problem it's it's adeveloper experience problem and poordeveloper experience makes everybodyfeel bad but what if we approach thisdifferently right as like a workflowissue um what if authorization could bea creative tool instead of a constraintyeah developer experience isfundamentally about enabling flow rightthat flow state that we all love thatstate where you're fully immersed insolving the problem right super supergood authorization tends to break thatflow but it doesn't have to okay um itcan be declarative rather thanimperative that's really important itcan live in well-defined places insteadof just scattershot across yourapplication's codebase and with theright abstractions it becomes intuitiveit becomes maintainable that's the ideaand so the the key insight really if youif you take nothing else away from thistalk and I'm going to give it to youright now is that you got to focus onthe workflow first right focus onsolving that how developers interactwith authorization during development isa big big big element and we can createbetter security outcomes betterdeveloper experiences and that's whatwe're going to explore next so when wearchitect uh let's go ahead and saycloudnative applications because lookwhere we are right we typically give uhauthentication the architecturalattention that it deserves rightauthentications usernames passwordspassphrases identities oh yeah we getthat we get putting in a password we getputting in a security token and that hasa big architecture around it right butauthorization often degrades into thesescattered if then else statements like Isaid earlier and and separating outbusiness logic from database access andso on and so forth it's important tomake these clear boundaries right weseparate business logic from databasesources uh database access rather wecreate clear boundaries between businesslogic and permission logic and thismeans authorization becomes a serviceright a service rather than just a bunchof lines of code it could have its ownAPIs its own contracts its own testingstrategy its own development life c|ycleif we wantedto and these are all sorts of thingswe're comfortable with and familiar withas developers these are things thatallow us to have flow right um Netflixactually famously took this approach onreally really early uh with theirmicroservices ecosystem and they foundthat centralizing authorization improvedsecurity and actually acceleratedfeature development they have somereally good white papers on this thatyou should go read uh when you have amoment if you don't believe me um theirwhole thing was focusing on corebusiness logic without getting likebogged down in permission checks it's areally good paper so definitely go readit up and so there's a really crucialmindset shift here right uh we frameauthorization as a constraint what can'tusers do and this negative framingpushes authorization to the realm ofsecurity restrictions and the betterquestion really is how do we enable theright access that's the right questionwe should be asking right it's apositive framing it turns permissionsinto a product feature whoa whoa rightwhen we make this shift permissionsbecome part of product discussionsthey're not just relegated to securityreviews and stuff you don't want totouch they shape user experience throughprogressive disclosure of functionalitybased on you know uh context andconstructs look at how Slack approachesworkspace permissions for example rightthey're not just security controlsthey're collaborative features thatenhance how teams work together that'spowerful that's the difference betweenauthorization as a constraint andauthorization as a product capabilityright so to be clear about what we'retrying to avoid we're trying to avoidauthorization that's added as anafterthought we're trying to avoidcontext switching that kills that thatdeveloper flow uh authorization logicthat's tightly coupled with businesslogic becomes resistant to change so wehave to you know we have to deal withthat permission models need to evolveover time they always do and you end uprefactoring core functionality and Iactually did a talk in Denver a coupleweeks ago at an OASP talk about howFacebook refactored a UI model anddestroyed their permissions and made 15million private posts public one timethat was a fun white paper too avoidthat right um the idea is thatauthorization shouldn't be a secondarythought you should treat it as a firstclass architectural concern so let'stalk about core design principles forworkflow first authorization the firstis domain driven authorization thismeans modeling permissions aroundbusiness domains not technicalconstructs right and we think about thebusiness i know as developers we don'twant to but we got it don't think interms of CRUD operations on databasetables i know that feels familiar to youwe're done with that we have moved pastthat now okay think in the language ofyour product permission should speak thelanguage of that product not thecodebase necessarily right this createsa ubiquitous language between productand development and security teams andeveryone understands what editor meansfor example and a document system butlike update documents is a technicalimplementation detail so which one ofthose two should you actually use rightum a document collaboration system mightmodel permissions like editor reviewerand viewer rather than read write anddelete because that's what the productactually expects um these map directlyhow users think and ultimately how thethe the business and the product thinksabout itself and this is an alignmentwhen you start to feel this way um itmakes authorization more intuitive foryou the developer for other people inyour organization and it makes theproduct more maintainable as it evolveswhich is really really nice the nextprinciple is using declarativedeclarative policies rather thanimperative checks okay uh traditionalcode bases are filled with imperativepermission checks they're scattered allthe way like if user has permission thenallow this to happen right um thesechecks are they're hard to audit they'rehard to test uh they're hard to maintainnotably uh declarative policies in }turndescribe intent what should be allowedrather than how to check permissionsright they become human readableartifacts that can be version controlledthey can be reviewed uh they can betested independently of your applicationcode this is critical your policy shouldclearly express who can do what who cando what based on businessrules it should live outside of yourapplication code it could be testedindependently to be updated withoutchanging your application you don't wantto actually have to reroll your app andredeploy your app when permission modelschange this is an antiattern rememberthat declarative approach scales withcomplexity as your permission modelgrows more sophisticated the policiesremain readable they maintainable whileimperative checks would create a tangledmess chopped that earlier thirdprinciple use external decision pointsfor authorization externalize yourauthorization super important decoupleauthorization from application logic andI've been hinting at this now I'm sayingit out loudright instead of embedding permissionchecks in your code and your applicationasks an external service can user dothing this architectural pattern enablesconsistent enforcement across all ofyour software that's what it does that'sthe magic of it right that's what's socool about it all your microservicescall the same authorization service it'suniform it's easy to reason about Okayyour application grows and authorizationdecisions then don't become a bottleneckthat have to be updated okay you canverify your permission model withoutspinning up your entire stack that'spretty sweet right this doesn't meanauthorization becomes a black box itshouldn't be a black box right it shouldbe treated just the same way the rest ofyour code and the rest of everything youwork with works it's not a black boxit's a service okay it's a service it'sprobably got an API and we're developerswe love APIs right fourth principle isabout making authorization context awareaware simple rule-based access controlor arbback uh isn't enough for modernapplications it was enough when it thepol you know our backb 96 right that'sthat's when it was ratified uh we'vedone a lot since 1996 so we need toconsider attributes like time like uhlocation resource propertiesrelationship contexts all these sorts ofthings so contextware authorizationadapts to this situation a healtharesystem might allow access to patientrecords during business hours butrequire additional approval after hourssomething as simple as that implementedas code especially if you have a lot ofexceptions especially you have a lot ofdoctors and a lot of sites and a lot ofpatients right a collaboration toolmight check document status beforeallowing certain adjectics uh actions sothe idea is TF might like fine grainaccess control uh so moving from arbbackto something like dynamic arbback oreven Aback attribute-based accesscontrol which I can talk about for hoursafterwards if you'reinterestedum let's talk about Aback for a littlebit actually you know what let's do itlet's talk about some CNCF projects alittle bit later on that do some Aback ithink that's going to be good so let'slook at how these principles integratethroughout sort of this development lifecycle right so during requirementsgathering um you need to look at yourpermission models and those permissionmodels should be defined alongside yourfeatures remember we're not talkingabout tacking on permissions at the endanymore your these permissions and thisauthorization now becomes part of yourproduct thinking what access patternsdoes this feature need uh how dopermissions integrate with thoseexisting models right and we're talkingabout API design uh authorizationrequirements should be explicitlydocumented in the API specs themselvesright these aren't separate things theseare integrated in who can call theendpoint what context is required forthat call to be successful or not rightduring implementation those clearinterfaces for authorization checks makethe integrationconsistent and this helps with thatdeveloper workflow that we alluded toearlier developers ~should never have toinvent new ways to check permissions ifyou're inventing new ways to checkpermissions something has gone wrong youhave gone off therails okay testing you need dedicatedtesting for your authorization just likeyou have dedicated testing for yourtests we're all highle super good testdriven developers here uh we have 100%code coverage at all times becausethat's how we roll here in this room ibelieve it i believe in you yourpolicies and your authorization shouldnot be any differentum and you know in operations where Ispent a lot of time monitoringobservability for authorizationdecisions helps you to identify problemsright so actually integrating policychecks integrating authorization checksinto your logs into your audit trailsright it's not just about did libraryfail or wasn't I I not be able toconnect to this database like whatpermission checks are being made and howare they affecting the behavior of thesystem right monitor that it's it's notan afterthought right it's woven in inevery stageAnd the payoff here from this approachis substantial it's substantial okay wetalked about how this costs money thisis how it stops costing you as muchmoney maybe even starts making you moneyit's developer experience uh reducedcontext switching uh the permissionlogic lives in well- definfined spacesit's easier to reason about you haveclear contracts between components thatmakes integration simpler services knowexactly how to request authorizationdecisions so the actual code to developthose services is more straightforwardself-documenting permission models heckwhy not right uh team members can nowgrasp those permission models instead ofhaving to spend weeks mired in the swampright this leads to faster onboardingthis leads to faster feature developmentthese benefits are not theoretical to beclear okay i'm not making this upthey're practical improvements toeveryday development work that we see inour customer phase okay they'reachievable with these tools and thesepatterns so let's talk a little bitabout the authorization ecosystemum let's look at some of the tools rightover the past few years there's been aincreasingly rich ecosystem of toolsthat's become available uh in the opensource world i'm only going to talkabout open source stuff because I'm ahardcore open source nerd um many havefound homes actually in the CNCF whichis really super uh so that is excellentso we're going to talk about a couple ofthose and this has helped drivestandardization it's helped to driveadoption uh they take differentapproaches to authorization challengesjust remember I said at the outset thatthere's no such thing as aone-sizefits-all solution so differentapproaches different ways of thinkingabout it right some prioritizescalability some prioritizerelationships and so on and so forththey all share a common goal of makingauthorization more manageable okay umlet's see how this fits in first I wantto talk about is OPA OPA or open policyagent which you may have encountered uhif you're on the sponsor floor over thepast few days uh a number of people heretalking about hey let's do OPA rightit's a graduated CNCF project it's seenwideadoption some adoption not as much Iimagine as we'd like across the industryuh it's a general purpose policy engineit's a general purpose policy engineright um it can enforce policies forsystems uh I don't know Kubernetesadmission you know admission controlmicroser API authorization and at itscore something called RIO rio is apurpose-built policy language forexpressing rules it's uh it's a reallanguage right so you're likeprogramming your policies it's quitepowerful it's super interesting um thearchitecture completely decouples policyfrom code we talked aboutexternalization right so it aligns withthat principle that we talked aboutearlier um it is used at some pretty bigcompanies i alluded to Netflix earlieruh last I checked Netflix is based onOPA although I'm sure that my head ofproduct will correct me on that afterthis talk if I am wrongum it's really good at what it does ifit is what you need but like a lot ofgeneral purpose tools it's notparticularly excellent at any one thingand it can be difficult to struggle alittle bit with Rio which is probablywhy OPA hasn't seen the level ofadoption uh maybe that the project insome ways deserves so keep that in mindalso we have Open FGA uh Open FGA is aCNCF sandbox project which is reallycool it's focused on relationship basedauthorizations at at fairly large scalesopen FGA is based on Google's Zanzibarpaper uh super super interestingcomputer science stuff if you're intothat uh which describes uh the systemthat powers their permissions like forGoogle Drive and like YouTube and all oftheir services right the sort of planetscalerelationshipbased architecture right umOffzero created an open sourceimplementation that's part of the CNCFuh it's specifically optimized for likedoing relatively high performance uh atscale but again it's all about modelingcomplex arbitrary relationships betweenobjects and if that sounds like a littlebit abstract it's because it is you knowit it's for example not very good atarbback right it's it's only marginallygood at AVAC so if your problem is yourGoogle this is probably what you want ifyour problem is we're anything otherthanGoogle it might not be what you need butit's really really interesting and Ihighly recommend taking a look at it umI do want to mention and we're going tostep away from the CNCF here for amoment and talk about some other openstandards which are important is theopen uh ID foundation uh the open ideauh foundation has established the offsenworking group to uh standardizeauthorization APIs and interfaceseffectively right so taking this wholesmorgesborg of possibilities that existout there i'm going like hey like can wecan we get together as an industry andcome to some conclusions about how wewant to work together and this group'sbuilding on the success of oath 2 on thesuccess of open ID connect notably ifyou've ever encountered these thingsbefore which has solved similar standardand challenges for authenticationuh the focus is on interoperabilitybetween authorization systems creatingsort of common interfaces that differentimplementations can support this work isaround creating an open dialogue rightuh for authorization systems acrossplatforms across vendors and it's superimportant work it's slowgoing becausehow do you get enemies to talk to eachother it takes a little while enemiescompetitors right to talk to each otherthat's how it isum it's less about specificimplementations to be clear althoughthey do have reference models and moreabout ensuring that different systemscan work together coherently and Ozenwhich you see up here Ozen represents animportant sort of effort for thelong-term evolution of the authorizationecosystem so if this is a world you'regetting into this is a working groupthat you need to get familiar with rightum Serbos takes a different approach uhSerbos is involved in that open ID uhworking group is involved in Ozen useshuman readable YAML policies thatproduct teams can understand review uhthese policies sort of clearly expresswho can do what to which resources underwhat conditions and the thing I want topoint out here is that I think itactually has pretty good developerexperience which is maybe a littledifferent from the other model are theother tools we mentioned earlier um youknow playground for testing policies asan in-built testing framework I thinkwhich is pretty good um simulatingrequests examining traces it it feelsmore like a more like a developer toolthan some of the other ones which feelmore like a bit of an obscure endpointor serviceuh if you know if your priority ismaking authorization accessible I thinkSerbos is probably worth taking a lookso how do you choose among these optionsi just gave you a bunch right uh there'sno oneizefits-all as I said earlier yourchoice should really depend on yourarchitecture it should depend on yourneed okay uh consider OPA if you needbroad policy enforcement beyond justsort of authorization so you know wemight choose OPA for example if you needconfiguration validation or Kubernete�sadmission control or uh something likethat right there a broad service okay uhyou can look at open FGA if you havecomplexrelationshipbased permissions atplanetary scales uh particularly forthose user-to-user sharing scenarios umyou should look to seros if you likethose developer workflow integrationsand and a certain level of simplicitybelying that complexity is is whatyou're interested in um many teamsactually end up using multiple toolsthis isn't necessarily weird youprobably have multiple tools to do moreor less the same thing in yourenvironment pick the one that makessense and if two makes sense pick twoit's okay right the important thingagain is is is to settle on thoseprinciples we talked about earlier solet's see how to implement these tools alittle bit more effectively in the fiveminutes we've gotleft good luck Daniel now that weunderstand the principles and the logicLet's talk about some overviews theseare battle tested approachesarchitectural patterns developmentpatterns testing patterns eachpattern addresses a different aspect ofthe authorization challenge so first wehave an architectural pattern so policyas a service this pattern encapsulatesall of the authorization logic in adedicated service that your applicationscan call to make authorization requestsyour applications can make explicitauthorization requests can user Xperform action Y on resource zed or Zthis policy service returns a simple yesor no right you just get that yes or noback done right it enables consistentenforcement across services and the samepolicies apply everywhere uh simplifiespolicy updates when the rules change youupdate the policy service withouttouching the application right um youwant to have like maybe a REST API orsomething you make gRPC requests againstthis is what we're looking at here whenwe're thinking about policy as a servicethis is what we're aiming at right apattern works really really well forexample in microser architectures oranywhere where centralized governance isimportant um and and policy changes endup being frequentokay we also sidecar uh because we're atCNZF everybody likes Kubernetes rightthis takes a different approach soinstead of calling like a a centralizedservice that exists somewhere in yourwhatever um each application instancehas its own authorization sort of enginethat's applied along with it or it'sdeployed along with it right uhtypically as like a container sidecar ormaybe it's part of a damon set I meanyou do you right uh this gives you verylow latency for those authorizationchecks uh since there's no network hopor minimal network hop uh depending onhow you're doing things sidecar runs inthe same pod it makes those local callsreally fast and especially if you'reputting authorization in the criticalpath and you should be putting in yourcritical path this is a really goodmodel for dealing with network latencyfor dealing with environments wheremaybe you don't necessarily have thetime or the luxury to wait for a roundtrip right uh obviously it shines inKubernetes environments uh you canimplement it you know with Envoy if youfelt like it it works super nice rightthe idea is to eliminate those networkdependencies instead of having a centralservice whilst still getting theadvantages of having an API or anendpoint that you can just callprogrammaticallyuh you can also look at multi-layerauthorization right so patterns thatrecognize different types ofauthorization belonging in differentlayers of your stack uh your API gatewayyour ingress layer was maybe we're goingto have some coarse grain checks uhwhere you might be even just validatingthe authentication to a certain degreeright you also do API check verificationuh basic ro checks could occur at thislayer for example and then you can moveto the service layer right so where youcan enforce the business logicauthorizationAnd you can use this to perform uh userchecks right can user perform thisaction at their current level of accessand these current conditions time of dayetc etc um and then you get into thedata layer and the data layer you mightbe implementing like rowle checksauthorization checks object levelauthorization checks right security forthat really fine grained access controlat a fundamental primitive level uh eachlayer handles what it's best at and thiscreates among other things and I know wewant to talk about security but I'mgoing to talk about security for amoment because I really like opsec it itactually helps creating this defense indepth model if you've ever read aboutthat or heard about that um so this isagain what you want to kind of be aimingat but I'm getting off track sopractical implementation of this wouldsort of integrate all of these and Irealize I'm going super fast but h don'thave a lot of time so moving todevelopment patterns policydriven designflips the traditional implementationsequence right Instead of writing codefirst and adding authorization later youstart with authorization requirementsthis is the O Z equivalent of testdrivendevelopment right so write your teststest them and if your policies arespitting back to you what you expectthem to then write the code that usesthem right this works really really wellbecause it means you know that yourauthorization is going to do what youexpect and you can bundle that into yourfeature set um you can even use thesepolicies to like drive API design ifyour policy says managers can approvedispense expenses for example then yourAPI needs endpoints that expose thatcapability appropriately right you canuse this to drive your API design in anintelligent way um like shout out sirboss we're really good at this okay youalso have request context enrichment umthis pattern addresses an important butlike really practical challenge which isthat authorization decisions tend toneed rich context right right you need alot of context to make a goodauthorization decision these includeuser attributes resource metadataenvironmental factors relationshipinformation right this is where our boxstarts to fall apart you need to be ableto bundle all of that together and go"Hey we have got these complexdistributed emergent systems with a tonof rich contextual information use thatcontextual information to design goodauthorization policies implement that ina request middleware or in integratethat in an interceptor somewhere rightum this rich context makes yourauthorization rules a lot more powerfula lot moreflexible uh I'm running out of time sovery very quickly i have so much I wantto say i have so much I want to sharewith you because this is such a hugething to mine and I love itokay policy testing you have to testeverything you have to test yourpolicies if your authorization solutiondoesn't have tests built into it yeetyou need something better than that allright you got to have the testing superimportanttest fixtures oh my god practicalworkflow integrations ide plugins ahit's all so goodokay these transformations aren't justtheoretical okay they're inorganizations around the world today i'mnot talking out of my butt here aboutstuff that doesn'texist this is how we move from frictionto flow in the authorization processthis is how authorization becomes partof your creative workflow rather than anobstacle developers spend more timebuilding features this way right lesstime wrestling with permissions thatsounds good we shift from security as anadd-on to security by design defense anddepth it becomes integrated into thearchitecture right from the beginningit's inherent in how you build that'sthe goal we create a situation where wecan empower developers to make bettersecurity decisions we can allowdevelopers to make better productdecisions that's huge teams understandthe why in a way that they wouldn't havebefore we create friction we eliminatefriction to create flow we get thatsecurity by design we get thatempowering of the developers so where doyou go from here where you go from hereis Discord and Slack and Docs becausethis is the start of your journeythanks i appreciate y'all hanging outwith me i'm a little over time uh I willhang out over here if you have anyquestions otherwise safe travels be welleveryone2025-04-15 21:57:43.687642�ve AI you can ask your best LLMuh model to predict a fraud detectionand it will probably write you know uhhundreds of lines but it will never dothe job so the whole thing was uh bornfrom the idea we were building anopen-source platform to predict let'ssay it's a navigation system for cyclistif you cycle in big cities in urbanplaces you know there is no such thingthere is no like a navigation system totell you yeah there is a road closurethere or there is a fire there or thereis an accident so or this dedicated roadis closed so the whole the idea thegenesis of the the work was born aroundbuilding this platform make itcompletely open source and give it awayto all cities from Helsinki to Madridyou know and say like just plug it toyourdata give it to your customer use uh bythe way there's a uh also like I want toshout out give a kudos to the openstreet map folks and you have like a lotof if you are there cyclists in the roomit's a really amazing tools out there sowe had the need to get something like acustomuler we built the whole stackaround Kubernetes we had Apache Sparkthen you can argue why you choose thisversus this one this was like an MVP ora minimal viable product that we wantedto kind of provide as a blueprint tobuild this kind of uh navigation systemfor cities that have cycling um for um Imean you know people using cyclingroutes and navigation system okay uh sothat's was the stock uh spark uh wasborn in the Hadoop era then you knowneeded a cluster manager we've seen alot of talks and even like you knowtestimonies from Apple I guess frommigrating from Hadoop to Kubernetes sowe had in the past a standalone MSO uhyarn which is still widely used yarn isimportant because it'swe at least got uh inspiration fromthere and then finally uh Kuberneteswhich is you know the the most popularone and it's and I think objectively isthe the right way of deploying spark souh how do we run quickly spark onkubernetes so uh using an operatoractually there are two operators one ingo one in java but anyway the more wehave better it is uh and then so we havea spark operator that runs there and yousee like you know the rest is is prettysimilar and what matters here is thattheuler is still aroundokay so uh the benefits are there are amyriad of benefits there is this is notthe point here i hope you are allconvinced but you know uh besides all Ican list all of this but one of the keythings here uh I didn't hear this a lotthis is my personal view we want to doefficient infrastructure we want to doefficient utilization we want to reduceour uh energy utilization our carbonfootprint so thinking that cloud nativemeans we have infiniteresources it doesn't mean anything imean it doesn't mean anything like youknow even if you are big company and youhave billions to burn you would not burnthis into infrastructure so the wholeidea was to how can I have a finite setof resources how can I use themefficiently that's the kind of key thingand hopefully like you know we can moreand more talk about this how can I useefficiently my infrastructure and theresult will be cost reduction and alsowill be you know energy reductionfootprint uh carbon footprint reductionetc but there is one limitation not uh alot it's that the default scheduleuleris not adaptingokay but saying that it seems like thedefault scheduleuler is not uh goodenough sorry i I want to be like youknow we we keep bashing about the thethe default scheduleuler but kudos tothat SIG or people working the the thedefault schedule is really amazing bythe way if you look at it and Danielledid this amazing kind of just uh I meanif you have an hour to kill if you aretaking the Euro Star or if you areflying back home just go ahead and lookat how amazing is the defaultscheduleuler it's an amazing piece oftechnology by the way so what we sayingit's not adapted it's because becauseour need is not there so we have to bereally conscious that the work was doneand it's still evolving by the way it'sreally uh amazing it's not just pod tonods it's more complex than this and andunder the hood you may just submit you�rworkload your pods but under the hoodit's doing a a piece of amazing you knowstuff to do this and we can take take iteven further you know this is like thewhole kind of process where you have theprefiltering you have differentalgorithms and concept just to do justto do that uh schedulingpart okay so hopefully this is was thefirst kind of you know message I want toconvince you the default in Kubernetesis amazing our needs from the batch wordthe batch pro processing were missingokay that was uh the the the firstmessage so now what are the limitationif I like expand a little bit if I uhkind of elaborate whether you are comingfrom one product or another well I meanone project open source project oranother you you will hear this and thisis really like you know this makes sensethe lack of uh application concept itdoesn't the default scheduleer doesn'tuh kind of you know distinguish betweena application spark application or rayapplication or trino application orflank application the the awarenessscheduling fifo fair etc etc to tosimplify and maybe it's like moreoversimplification is just one largequeue uh you know only pod level queueand limiting SC uh scheduling policiesobviously we're talking about gamescheduling what is gang scheduling Ineed all my pods to run together just tosimplify it okay I have one job I needall my pods I cannot just run half of mypods and then wait for another half gangscheduling in a nutshell is just to putall my pods together okay then uhlimited multi-tenency support and uhfurnace priority uh handling againcoming depending on what is yourbackground and what is your favoritetool uh you will see more or less uhlimitation this happens a lot and youmay think this is like really kind ofthis should be kind of uh sorted alreadythis happens a lot we have a lot of thisdeadlock like you know I need a app thatI need four parts for my sparkapplication A to run same thing for aspark application B the only problem islike each of them they were assigned twopods and this like you know it willremains like this without any way to saythis is completely insane let's just youknow free the pods let's spark ofapplication a complete and then we'llgive it the resource to uhapp okay uh that's for the thelimitation and then uh it feels likeagesago who who here knows aboutCubot anyone no i feel like a dinosaurokay this was 2018 it was not like youknow my my son was born 2018 it was notbut this was the first initiative aroundhow can we bring batch processing tokubernetes and I call it the ancestorbecause it's laid the foundation toeverything we are doing and I will talkabout volcano and volcano was kind offork about this and by the way you seeapache spark there it was not even g onkubernetes when it started so spark wasjust experimental on on kubernetes wehad a cafe and we had a tensorflowstuff like this but it laid thefoundation and uh I don't know if peopleum really like kudos to the people whothought about this uh they had this uhthey were visionary at that time socubebot it's archived now but it reallygave the it shaped the the future of thecustom schedulers and the first one isvolcanouh I'm just giving you here I mean youcan read all about it but it's uh CNCFbacked uh when I when I mention this uhCNCF backed or Apati softwarefoundationbacked it's really importantat least from our perspective the reasonis really there are some tools if theyare backed by John do in his uh in hisbackyard in Nebraska we cannot rely onit that's the thing so we really needtools that they are kind of you knowbacked adopted by the community so wecan run them in production that's thekey thing but still there there are someframeworks and Python stuff that rely onone person and it's it's really kind ofscary but the here thing I I startedwith this is it really matters that ifwe want to trust a solution a project itshould be somehow uh widely adopted andbacked and there is a vibrant communitybehind it soum what's what else I can say it wasreally adopted for uh batch schedulingsystem it has a lot of uh tight controlon on resources kotaas uh handling�extensive resource allocation andsuitable for uh mass massivemulti-tenant clusters again I will showyou this but multi-tenants in a nutshellagain for for those who are wonderingwhy I mean just think about that uhstock I showed you that cluster I showedyou I had data engineers I had MLengineers I had data scientists I hadPhD student doing that research and Ikind of you know split split theresources into different business teamsor different groups using the sameresources again finite set ofresources uh one on the top of you knowI will add it's adapted for GPUs orspecific specialized hardwares which isreally important we had a lot ofdiscussion around this and by the waythis is personally my next nirvana howcan we mutualize the usage and slice theGPUs because it costs a lot of money andI will mention thisuh when to use uh batch jobs wheneveryou know whenever you have batch jobs uhsophisticated uh scheduling it's reallyadapted AI and ML uh batch processingHPC I I just put here you know I'm notfamiliar with HPC let's be honest withyou but I think you know I've seen a lotof talks around this a lot oftestimonies that it's uh well suitedalso for HPC and multi-tenant clustersthe best practices this is just tosummarize again so leveraging the queuemanagement aspect so organization youmay asking me you may asking me ask mesorry like how do you think about thisqueue how do you manage this queue howdo you organize all this and I will uhtouch a point on this at the end we haveto specify the resource requirements soat least think about it and yeah manageall these jump job dependencies anyonefamiliar withairflow yes so the spaghetti that youhave at the end uh this dog this crazydog that the data scientists give youand say this is my amazing schedule runit so you really have to rethink thiskind of stuff that you say no I I don'tI like spaghetti spaghetti in the lifebut not to run on kubernetes this isreally important when you need torethink the way we have job dependencieswhether it's batch or a and m okay nextso this is volcano and as I said it's ait's a fork from uh cube batch then wehad unicore so the name is kind of crazyanyone had an idea about the name wherethe name comes from i'll pay youcoffee yes say again yarn yeah okay i'llpay you coffee or tea so yeah so yarn isthere uh yarn so you remember that yarnwas the cluster manager so yarn unifiedwith Kubernetes or something like thatin some orders anyway so there's yarnunified and kubernetes so we took allfrom what was good gang scheduling andall this concept and we brought them tounicorn and then you know it fits ouragain ourh needs in the batch wordApache software foundation I kind ofgave you the the reason and the rationalwhy it matters to trust a project opensource project uh but scheduling systemit enable enables sorry yarn likefeatures Q gang scheduling think supporthierarchical hierarchical make it cuesand guarantees uh resources even thoughwe see like in real life this maydiffer yeah where was really importantthat we didn't have to kind of you knowsorry for my French but mess around withthe existing Kubernetes gives us thiscapacity to run it smoothly like youknow abstract away all the complex youdidn't have to do anything else andideal again similarly to multi-tenantclustersuh priority based scheduling I mean ifyou're interested you can read all aboutit again uh when to use big dataworkloads multi-tenant and dynamicresource needsIt has uh you know like some bestpractices similarly we have to thinkabout ahead about the cues how do wemade this hierarchy we have to configurethe resource kotaas what do I have togive the the priority to my dataengineering team mad data science teamand then it it uh I mean I heard thismorning from the volcano folks that theyalso have a in the latest version a UIbut so far only unicorn has this nice UIto kind of figure out uh a visual aspectthere are some benchmarking andperformance testing out there i this isjust my my personal view again uh youknow uh I spent some time doingaeronautics and in another life I had along hair and if you want the pl�ane tofly it's not on a simulation so you canspend 10 years doing simulation but onceyou put it on you know the on theairport that's when you know exactly soI'm really happy with this stuff but youknow they lag this realistic aspect ofbringing something like Spark which iscomplex to something like Kuberneteswhich is complex bring them together andtest them in real life scenario and thisis a call out there for the communityand I have been trying to preach forthis we need to standardize we need a aset of I don't know data set let's thinkabout for those who are familiar withTPCDS for analytics we need a a data setto test these scheduulers to see howthey really kind of you know act in uhrealistic uhsituations uh okay so uh we've seen alot of at least I've seen a couple oftalks talking about uhkw which is kubernetes without cublet soyou can deploy as much as uh hundreds ofnodes but there is nothing there uhrunning like you know there's no cubletso going back to our platform what wasthe need think about it that we arecalculating ahead we have a model justto simplify that say these are the rootsfrom point A to B we already kind of hadour modelnow I have a new information sometimesyou know the sophisticated word big wordevent driven architecture stuff likethis okay this is really simple okay Ihave this information road closure inParis okay there was the Olympics i havethis information how do I deal with thisokay I have my data pipeline i definedexactly in my graph you know on it'sit's based on graph theory where whatshould I do and this information here orthis pipeline has to take the priorityor whatever over sorry whatever isrunning in my Kubernetes cluster is thisclear this is like the most importantthing that I want you to go with likethe priority and wherever it's runningit's important but I have a new kind ofyou know higher priority workload i havesomething that matters more thananything else that should take thepriority but at the same time I need myplatform to continue serving okaycontinue working at least you know insomecapacity so this was the whole kind ofgenesis and the idea about why we neededto add um custom schedulers and why it'sreally relevant comparing to the defaultscheduleuler so in our setup in ourexperiment we didn't use that you knowkindof not at least not uh relevant in thereal life scenarios but the setup is asI said we have a finite set of resourcesit's not infinite so and this is the uhthe kind of setup so we had amulti-tenant as I said we had a dataengineering team specialized incrunchingdata dealing with streaming ingestion wehad an M ML team we had a data scienceteam and we have to split the resourcesand make sure this platform is up andrunning the in differentsituation this uh kind of jobs this iswhat we evaluated as being like you knowthe average uses so we had you know 100events coming per hour about you knowdifferent things happening on the uh onthe city level that impacted ourcalculation of the best route forcyclist uh and it depends on the uhinformation we got depends on the eventwe got it's either like purely CPU it'spurely it's just like a set of joinsit's just any information we need towrite in a database or it's just uhimpact or it's impacting I think sorrythe whole model that we need to retrainfromscratch um if you look at it here againyou can uh legitimately ask why did wereach no more than thousand jobs uh Ithink I can just say that at some pointthe API server and etc will start to bethe bottleneck and this is not the pointof the talk so there are other impactsthere are other ramification I would saywe have to consider but this was likeyou know for this kind of capacity thisis the maximum we could do and yeah soin different scenarios and the onlything that we simulated completely isthe node failures I was part of acompany that was built around uhbuilding a whole stack around spotinstances or you unused instances so itit really matters for us that I I losethe spot instance because a spot killbecause it's taken back from the thecloud provider how do I make sure my uhjob is resilient so we had to simulatesometimes by hand node failures just tomake sure this now this is thesubjective part and please don't shoutat me uh it's for rational and technicalpeople the scale is really bad threestars is it like a review for arestaurant yes but actually it was areally good consensus actually among ushow do we evaluate this is it likeperception do we calculate the uh Idon't know the integral of the secondderivative of no no way just just thinkabout like you know rational basic stuffso the scale is three stars is the bestthing we can get get you know and onestar if it's really okay and for thecomplexity is three sad face and I knowhow to make sad face and one if it'sokay all right so if you think about itwe took this again from a pragmaticpoint of view and I see Tom really isrunning so the support for differentworkloads definitely here volcano and bythe way yeah I forgot to say that thewhole uh I mean idea here is not to sayone vs another i'm completely againstthis um really the again I said this themore we have the better it is and thenlet the community let the developeradopt and grow the tool that they seebetter fit for our needs so it's not onevs another it's just to say like youknow our experience how did we kind ofyou know uh get into this result so thesupport for different type of workloadsthe complexity of configuration thereason in uh volcano's defense herevolcano is more complex is more completecovers a lot of uh use cases so uh as aresult as a consequence it's has a lotof configuration in place that's thereason why because it covers a lot ofuse cases i said scopes the resourceutilization you know it's quite similarbut we see at end volcano kind of youknow uh uses well the the wholeresources we had in hand fall toleranceand recovery I would say it's the samewhat does it mean either I inject uhnode failure port failure or just likeas as simple as preemption preemption isI take a workload that is rugging and Igive these resources to a higherpriority and sometimes it will never berescheduled back again that's kind ofweird situation or bizarre situation thescheduling latency this is uh somehowwhat we we got as a result again threetwo stars not really far and the loaddistribution pretty kind of similar andjob completion time especially that youhave jobs that run um I mean again thisis really important when we talk aboutsome batch jobs or some training fullmotor training we're talking about daysso it doesn't matter like you know 30seconds or even 5 minute when my jobsrun for 48 hours really what matters isthe completion the number of jobs thatwill complete concurrent job that we'llcomplete that's a really important thingand then we have this weird behavior andname them weird because we couldn'treproduce them and share them with thecommunity time to time uh I mean youknow it's not a perfect match for interinteractive workloads if you arefamiliar like you know you can connectyour notebook like a Jupiter notebook toyour Spark cluster and data scientistsjust run code interactive code actuallyboth of them are not really well suitedand then some weird starvation happenedand you know uh in in in this is involcano and in for unicorn we uh that isreally uh bad situation the spark driverkind of dies sometimes uh because it'saffected by the preeemption as aconclusion uh I would say I'm reallyrunning out of time they are really wellsuited both of them are really wellsuited for batch workloads comparing toum the default scheduleuler the clusterutilization is really by a factor of ttwo or three like that's really provensay it's not scientific it's not hardscience but this is something that isthere is a consensus it enhance theresource management the schedulingpolicies and the seamless integrationwith kubernetes is really important weare not getting out we don't need a kindof weird stuff outside and the againkudos to the amazing community whetherit's volcano unicorn and othersunfortunately that I couldn't reallytest for all the amazing work they havebeen doing thank you very much forlistening to me2025-04-15 21:57:44.227689 22��2(#��ABkQRGsVBhkchello everyone uh thank you for joiningthis maybe last session i heard thatthey kept the best for the last soreally excited and happy uh I will justI always dream had a dream to do thiscan I take a selfie with all of you guysokay just wave i can show my wife thatyou know I have an important jobokay are you there guysthank you very much thank you that's afirst to-do list my dream now okay workhopefully I will not bore you to deathuh there were a lot of subjects um thistopic is really trending and it's reallyhot i have one explanation here so mybackground is data engineering machinelearning engineer i'm sorry for all theinfra guys but no one is perfect so dataengineering machine learningengineering we come from another wordHadoop word and it's probably deadalready and we came to Kubernetes and wefound a new home was it the perfect homei'm not sure about this we struggled atthe beginning a lot especially that Imean Kubernetes was builtfor stateless yes 9 years ago I mean Iheard this morning that there was no wayno one was thinking that Kubernetes willbe the target platform for any type ofworkload and this is kind of you know Iwill give you like here my feedback froma from the field as a end user as a dataengineer as a practitioner okay socustom scheduulers there is an s it'splural but it's only two that we uh kindof managed to test and uh so faruh yeah hopefully I would not you knowspend a lot of time and I just want youto go you know take take away here withtwo things that uh actually now we canrun batch workloads which is ETL ELT andML engineering and uh like training andserving at scale with Kubernetes likeany other framework we did in the pastso the the Apache Spark case um sowhatever I'm talking about Apache Sparkhopefully oh sorry I need to put this uhI took already maybe twominutes okay so um whatever I'm sayingabout Apache Spark hopefully will applyat least for Ray who is familiar withRay who prefers Ray to Sparkokay so no really like here the the goalis to get anything that was you know toserve the purpose to work on Kubernetesokay so uh this is the plan hopefullyI'll try to go fast and then the keepthe rest for the rest and this lookslike obsolete already i've seen in thethree days I was really depressedbecause it's Gen AI everywhere and thiswas like you know oldfashioned machinelearning predictive machine learning wepredict I don't know like you knowtraffic jam we predict fraud detectionbut it seems like it's already yesterdaylike you know uh I was so depressedhopefully it's still something relevantfor you guys but you know it's not we'restill doing and you know businesses andour customers and what we do is stillbased on predictive uh ML orold-fashioned ML not everything is uhgenerati��mean so let's take a step backand think about what uh correctnessactually is so let's say we have twooperations we have put a equals to twoand we have a get a out of the key valuestore and we execute them sequentiallyso intuitively we would expect that theget result would be the value that wedid the put with the last one that weputin so that's how intuitively we think ofhow the value will be getting out of thekey value store right but if the thingif the requests are actually overlappinglike when they're being executedconcurrently how would you define thecorrectness so before that let's discussthe representation that we'll be usingthroughout the the site deck um on howwe represent the operation history so onthe left you can see that we haveclients zero and one so we have twoclients in this in this graph and eachof the client they can only execute onething at a time so if there's a requestin flight we cannot uh send anotherrequest for that very client in the sametime and for each of the rectangular boxon the left edge it's the time that theclient sent out the request on the rightedge it's actually the time that aresponse is received for that client andthe blue line is the instance that theoperation isapplied so coming back to the questionso what if the requests happenconcurrently for example consider abovecase where we have one put request andtwo get request running concurrentlyuh as you can see interestingly umclient one get a actually returnedresult one get client two actually hadthe result two so is this a valid casedepending on the consistent consistencymodel uh we can have different answersbut in ETS's case because we're usingstrictizability this means that theoperation appear to have occurred insome order consistent with the real-timeordering of those operation so in asense coming back to the example beforeit means that if we can find a set oflinearization points for all theoperations then we can say that theoperation are correct under theconsistency model that we're using so inthis case if we actually do it uh if weassign the points that we have clientone's get a goes through first thenclient zero's put goes through andclient two's get goes through then thisis a valid history so let's hand it backto Ara and see how actually ATCD workunder this uh modelthanks Henry uh so here we have adiagram that illustrates anc clusterconsists of three members three clientsmaking concurrent requests to each ofthose members uh as we see client zerohas made a put request and even beforethe response uh is received client onehas made a uh get request as Henrymentioned before this is a typical caseof overlapping requests and we can alsorefer to the timeline graph on the uhtop right handside um so what what we see is that theclient the request of client oneoverlaps with uh the request of clientzero and the client two overlaps withthe request of client one but at anypoint of time we see that the responseof the request from client zero uh hasbeen completed before the requests fromclient one and two are executed which islike demarcated by the blue linerepresenting the logical instance oftime when it was actually applied to thesystem uh and thus we see cl both clientone and client two get the correct valuewith the updated value of a and the mostupdated uhrevision in an alternate scenario we cansay uh see like see the client zero makethe same put request and it getscompleted almost instantly so theresponse of okay revision equal to 5 isreceived at that instant in thisscenario if client one makes a requestuh get request uh and it has beencompleted and uh and already client zerohas seen the value of a equal to two uhclient one cannot see the value of uh aas one and that's what is happening hereuh which is what is breaking the systemand that is what we want to avoid as ageneral rule of thumb uh the revisionshould only increment and at any pointof time the current revision shouldnever be less than the previous revisionuh and in this case the client one is uhseeing a previous revision or a stalerevision which means it's traveling backin time a�nd that's not alloweduh with the problem of the revisiongoing back in time there can be realworld problems like uh networkconnection uh errors the network betweenthe nodes can go down and even in thatscenario the system should not respondwith the stale data and instead itshould catch up with the latest datathat is available on the other nodeswhen the connection is reestablishedlike that there can be disk problemseither slowness or corruption of thedisk but as long as the majority of thenodes are active the consistency of thesystem needs to bemaintained and there are a whole bunchof other failures like this like packetloss clock drift upgrade downgradeerrors and so on and these uh can happenintermittently at any point of timewhich is not in our control and thenature of these failures being randomand intermittent uh makes it moredifficult to test let alone testing itagain and again every time with everycommit so in order to have a frameworkwhich can reliably reproduce suchscenarios such intermittent scenariosover and over again with each and everycommit that goes into a release therobustness test framework was introducedwith this framework we ensure that eachof these failures uh that has happenedin the past can be reproduced reliablyand it is fixed properly and ensure thatin the coming releases it doesn't causeanyregression so let's quickly see how therobustness test differ from umtraditional tests like unit test and uhintegration test or fuzzingso the input for the unit andintegration test is fixed we know whatthe input should be and for fuzzing itis random and it can be garbage valuesjust to see the um whether system canhold under such scenarios but forrobustness test it is random but thoseinputs arevalid and uh while unit and integrationtest and fuzzy tests uh target thefunction the um robustness test actuallytargets the binary and the entireinfrastructurestack and the output is basically uh forunit and integration test it'spredictable it is easy to assert whetherit is matching or not matching forfuzzing we know whether at any givenrandom value the function is failing ornot so that's what we uh need to checkbut for robustnesses it's a little bitdifferent because we need to validatethe random input that was given uh andthat's what the challengebecomes and for the uh environment uhbasically unit tests first tests andintegration tests are all run asprocesses but since for robustness testwe need to have uh the scenariorecreated reproduciblewe actually use as a process containeras well as virtualmachines and uh using these uhrobustness test framework we have beenable to uncover these issues over aperiod of last two years and if we seeuh these discovered by column uh we weget to see these have been reported byuh users and robustess framework itselfand based on the necessity of theseissues uh they have been reliablyreproduced tested against each and everycommit that goes into a release and thatbecomes a very big part of the uh CDdata inconsistency effort that has thatthe team the entire team has uh put intofor each of the release so that it canbestable uh over to Henry for the designprinciples thanks Ara so let's dive uhinto the robustnesstest so here are the goals that we havein mind when we design the test we wantit to be able to explore so it's kind oflike cover the edge cases that requiresrace condition to manifest and we wouldlike to explore some code path that'sare that's not covered by the existingunit test and end to end test yet andreproduction as mentioned that we wantto have an easy way uh to debug issuesand ensuring historic issues that's notsurfacing again and validation becausenow we're giving uh random valid inputwe need to somehow validate the outputwe want to also ensure not just thefinal state is correct we want to ensurethe intermediate state are correctbecause we have the strict cializabilitypromised in the in ourdocumentation so there are two mainareas that we'll be focusing on forchecking the correctness the one of themis the key value API it should be strictserializable it should be atomic durableand� for watches it's the order uniquereliable so it's like you don't youcannot afford to have a missing watcheventand one of the main direction of ourtest is to make sure that we are asreliable as Kubernetes would expect usto be so we set foot by understandinghow CD is used by Kubernetes which isdocumented in this contract we codifythe interface we test it continuously sowhen we know that for the new bug fixesor new features when we introduce it wewon't break Kubernetes byaccident so let's talk about theexecution stages of this test so thereare main three uh three main stages uhthere are setup execution validation sofor setup we will cover theconfiguration and we will be configuringthe scenarios for execution we'llgenerate traffic from client we'llinject error and we will collect theoutput from both client and server umfor validation we'll run it through thechecker and see if we detect any anomalyso let's go one by one for the setup uhwe need to set up a cluster and to setup a cluster we start it from a cleanstate so it's empty database we startthe cityd binary and we will configure anumber of nodes no versions and theaddress so they can communicateetc and for scenario we have asmentioned exploratory or reproductionfor exploratory ones we will configurethings like traffic profile different uhleader election timeouts or differentsnapshot counts and for productionscenarios remember this table that wementioned um we would like to reproducesome of these scenarios that requiresrest condition or and failure injectionto to actually occur so let's take thisexample where it requires this ref panicfail point to reproduce the arrow so wewill have to specify what fail pointunder what profile and traffic it willbe a very efficient way for us toreproduce so we can cover this in thefuture uh in the future CIruns so having done the setup part wehave to execute so to execute we havethe client that will generate requestsand send it to the the server and therequests are defined in the scenario andthe uh so one example of the definedconfiguration might be the traffic andfor traffic you can see this is one caseof the traffic where we have differentcombination of the requests that will besent and during execution as we alsohave mentioned we will have to injectfailures Um the bolded ones are the onesthat we currently have uh the the testis able to inject and how do we actuallydo that we will have to rely on threedifferent frameworks well not frameworksbut tools so we have gail we have lazyfs and we have the network proxy solet's go one byone uh the first one is goofail it's aproject under the at cityd organizationso it enable us during runtime to injectfailures into certain co uh code pathsand it's very simple to use so as youcan see in the function uh you cancomment with go fail and then with thecertain format that you follow then youcan use go fail to generate the go codefrom the comment for you and then youcan compile the this code with thisgoofel uh code in contain inside andduring execution you can use uh HTTPendpoint to enable it or useenvironmental variable to enable it onstartum in our city codebase for example wehave this panic code uh here tointerrupt the uh um leader to send outmessages in the raft part of thecodebase and if we enable this duringthe the test then the node will crashand we want to check in the end thecluster is still in in the consistencyorder that we are expecting it tobe uh the second tool that we mentionedis the lazy FS which is is what we usefor testing the storage where it cansimulate data loss or unsynced rightsand lastly for the network we had thereverse proxy basically sits there andblocks off traffic on when we ask it toso it can simulate some sort of networkpartition or partial um uh networkdisconnectionso we have talked about generating therequests we have talked about injectingfailures now we talked about umcollecting the output since it's aserver a client server architecture wehave to collect from both sides serverside it's basically we collect the viaheadlog from the server uh and also thesnapshots the� client side we will recordthe requests and responses and we alsorecord the watch responses that we havestreamed uhcontinuously um and what do we actuallyrecord on the client side it will looksomething like this let's break it downso basically we we record the client IDinput which is the request payload therequest time response response payloadand response time and in case of arrowwe record that too and as we have seenin the previous um slides before this ishow we can represent it so for eachclient you can lay out the events uh inthis visualization and we will have torun the validation basically onthis so recall that we said we generaterandom valid input it's not like unittest where you already have the inputand output hardcoded so now we need tofigure out a way to to verify what thecorrect output the output is correct ornot and to do this we have to use astate machine so just a quick refresherstate machine you will start with somestate in this case a equals to one andeach of the operation basically is atransition from state to state so if youput b equals to two into the system itwill then go to the state where you havea equals to 1 and b equals to two and soon and so forth so using state machineyou can verify all the intermediatesteps and you can also verify the finalstate uh if if the uh cluster is in thecorrect uh working orderum so when we are modeling at CD usingstate machine the idea is that we have asimplified implementation we don't havewav we don't write to disk we just inmemory um hashmap basically becausethat's basically what key value store isso now having modeled this given ahistory we can then walk through thestates and verify if the SCD is actuallyin the correctorder so having built the city model wethen leverage porcupine to help doesit's a fast linearizabilitychecker so coming back to thevisualization before what point will dois it will assign linearization pointsthroughout the history and if you followthis visualization you can actuallyfollow the lines and you can see yoursystem state evolves over time and inthis case it's correct because itdoesn't have any red line so we're allgoodthere are other type of validation thatwe also built into uh this test whenit's running because it's targeted tofor std so it's like we can check someinternal consistency like at the end ofthe test we can check for the hash KV orwe can also check if the successfuloperation observed on the client side isactually also presented on the um uhwrite ahead log sideso in the unfortunate event that if weactually get a report like this whereyou can see we put a key and we have are revision 165 somehow the future getactually have a revision 164 this timechuming backwards not good this is theexample that we have showed earlier inthe site too um what can we do aboutthis so hopefully we don't need to dothis a lot of times but uh yeah if everwe have to do have to run the reportlocallywe have to first actually make sure ifthis is actually a real bug on the ATCDside because as we mentioned we arewriting a model uh to model the behaviorat CD so there might also be a casewhere there's a bug in the frameworkthere's a bug in the model so we need toactually first distinguish this firstand then if we actually know this is abug once side try to reproduce it overthere fix the bug write end to end testand for those that actually require racecondition or error injection to work weadd it to the uh robustness test suiteso we can consistently reproduce it overSo let's do a smalldemo this is basically demonstrating theissue that we have as we mentioned theuh watch uh the the revision travelsback in time so since all the test casesare written uh in the make file you canjust run make and that uh issue uhnumber and it can actually reproduce thethe bug and what we're looking at iswhat we have recorded on the client sidehas operation and watch and we also havethe the server side we have the snapshotand the rightheadlog so we can quickly take a look atwhat we have stored for the watchesit's basically the events that'sobserved on the client side the put anddeleteoperations that's basically just structwe have it loggedstructurally and then we can take a lookat the operation side what we log overthere it's basically also following thestructure that I presented in the slideearlier every client will if they haveum it can have both you can also justhave the watch and also just have theoperationthen we will show basically that's whatwe recorded and the robustness test willalways generate a report if it fails andyou can that's where usually when whenwe're debugging it we will look at thatreport and trying to figure out at whichpoint the history start to violate theproperty the consistency model propertythat we're trying to see so this is whenyou open the report this is what youwould see you can click jump to thefirst arrow to go straight to it or youcan enjoy scrolling and try to findwhere the red lineis and yeah this is where we showed inthe slide revision 165 goes backward inthe future of uh in in the future itgoes backward somehow so yeah this isthe bug that we discovered with thistest suiteso some future work starting I thinknext week we will have this anti-thesisintegration thing we will uh explore newways to reproduce our bucks consistentlywe try to we will try to see how anticiscan help us um uh improve our robusttest suite there are a lot of GitHubissues that's open targeting the robusttest something like adding more internalconsistency checks or deflate CI runslike we have the long-standing issue ofQPS um and also we run bi-weeklyrobustness test meeting where we look atthis beautiful dashboard this is allgreen so that's good if there's a red orpurple that means there's something thatfailed we need to look into it and canjoin us for a ride to bi-weekly surprisebecause we really don't know what wewill see every twoweeks so yeah thank you for listeningand uh hope you enjoy this talk and uhyeah have a nice afternoonhey thanks for the talk uh so it seemslike in the robustness test you're kindof relying a little bit onnon-determinism right like it seems likethe requests that you're simulating arerandom weights and probably running inuh random order do you um like do yourun these tests numerous times everytime you run them and how do you makesure yes um so for reproducing the issueif you actually check the main file weusually have count for like 100 or 200depending on how hard it is to actuallyreproduce it so it's not like one shotyou get it like the example we show isactually kind of lucky for some testsyou actually need to run several hundredtimes and because it's race conditionsometimes to trigger it right so thiskind of why it's hard to like debuggingthis criticism hard because of this hasthere been any interest in like using uhmore formal um proof verification toolsor uh like I'm not going to say TA plusyes so on the ref side there's actuallywork on the toa plus uh that was there'sa work on that but on the at city side Ithink there were a pull request open orthere was someone trying to work on itbut it hasn't really been worked on likewithin yeah I think maybe like less umless formal proof but more like you knowframeworks like JSON mastrom that thosekind of tools probably more pragmaticright um we did had gson test report ifyou actually look at it I think on 34before they actually ran it once um andthe problem with that um on Merrick'stalk they actually he actually mentionedit so the way the test is designed isnot really able to run on CI youactually have to set up a cluster ofservers and also um you need to reallyhave some very deep knowledge into howto write them because it's it's not likeby magic it runs you still have to writea model and whatnot so for us ourexpertise is in Golang so we decided torun write this test ourselves we do thisframework we do the same check and wecan do more internal consistency checkand whatnot to to to run this on CI andalso for us easier to track and debugyeah makes sense thank you you can talkto Merrick if you have more questionsif no question then thank you everyonefor attending and have a nice weekendthank you2025-04-15 21:57:44.764339 $ $�B*#�=A4Pei4LMigQEwelcome everyone today we will exploreinto the concept of the data gravityunderstanding its implementation ofkubernet environments and the discussingefficiency and stability of the managinglarge scale of data injectionsandminimum latenciessomyself myself aris I'm a devopsyengineer last three years as gamingindustry and the healthcare industry I'ma tech entrepreneur building theSASbased applications rightnow so what is data gravity data gravityis refer to the occupancy ��P)#��WAJ93U9n_qxSIhello everyone uh thank you for comingto our talk uh don't let your cubernetescluster go wild ensuring at cityreliability so if you're an end user ora direct customer of ATC or indirectlyusing ATC as one of your Kubernetesdistribution this is the right place foryou uh so a little about ourselves uhI'm Oro Shaha a software engineer VMwareby Broadcom currently contributing toATCD and also responsible for downstreamreleases of ATC and Kubernetes i'm Henryi'm a software engineer at Google and Ialso contribute to ATCDso before we start we would like toextend our sincere thanks to BenjaminMarik uh for laying the foundation andmaintaining the CD and its uh testingframework as well as supporting thistalk with their invaluable knowledgeuh so in this session we will introducewhat a distributed key value store isthe design principles of the robustnesstest and the deep dive of it and finallywe are going to demo on how to run arobustnesstest a good reference of handling thedata cons inconsistencies in ATCD uh viarobustness test framework has beencovered by Marik around two years agowhen it was designed and implemented uhyou can refer to that also if you arelooking for a beginner friendly primerof what robustness test is and how youcan use it too you can refer to our talkfrom last year in OSS Japan uh with thatover to Henry for more about HCD and itsconstraintsso let's have a quick intro ondistributed key valuestore so Etsy ensures strictserializability but what does thatactually �there where wehave a data accumulate it's enhancingthe more applications and the servicesresult in the containerize thiscontaineriz can be lead to increase thelatency and the challenges in the datamobility make it crucial to theaddressing the disturbance of systemthat we havekubernates as the data accumulating inthe attaching the applications andservices leading to the centralizing theincreasing thelatency the data accumulated basicallythe centralized latency is improvementand impactful theirapplications the challenge in thecubernet during the data latency isbasically the data latency is lead tothe higher latency and the increasingthe data transferring cost is thecubernet clusters in the cubinetclusters the cubinet environment thedata gravity can be significant impactperformance it can be cause the increasethe latency and it's raise the datatransferring cost their change isscalability and the efficiency thedemand damaging the efficiency and thereducingthe reducing thest for the data injection efficiencybasically the to reduce the data gravityoptimizations data injection in thepipeline in ensuring the technologiessuch as the par processing and thecontainerize have we been showing to cutthe data processing times andsuitability leading the more efficiencyand the scalability and the datahandling in the cubernateclusters so reducing the latency and theedge computations the processing the theprocessing the data edge computationreducing the latency and reduces thebandwidth of theuses edge computation is basicallyinvolved to the processing the datacloser to the source which have whichcan be significant reduce the latencyand the decrease the bandwidth as byintegrations edge cases of the kubernateorganization can be enhanced theperformance and the source of basicallythe g uh datagravityimprovement improve the networkingperformance Basically utilization isadvanced CNI and the service mesh as canbe enhanced the data transferring ratesand reducing the latency advancednetworking reducing the including thecontainer uh CNI and like like servicemesh can be improve the datatransferring rates and the reducing thelatency their tools help help the uhmajorly complex networking uhcommunications and enhancing theperformance and in cubernetenvironment storage challenges andsource solutions basically theimplementing the presence of volume andCSIdriven CSI uh drives improve the storageand scalability and the reducing latencystorage challenges in the cubernetessuch as the enhancing the datascalability and and stability can beaddressing by using PV volumes and CSIstorage for driven their solutionenhancing the storage implementationleading the reducing the latency and theimprovement ment of improvement andenhancing the performance in theKubernetenvironment there is some case study uhsorrystrategy strategy ofComprehensivestrategy for the security data basicallythe implementation RBCA and networkingpolicy and reducing the securityenhancing the significantly enhancingthe data security is like major risk forthe associate with the data gravityimplementation implementing the CBCAbasically the roll back accesscontroller defined in the uh structnetworking policy have been showing toreduce the security introducingenhancing the data injections and thecompcomplexities comprehensive uh strategiesthe comprehensive set is optimizer inenhance computation and the networksolution its data gravity efficiency andthe scalability in[Music]in it's combined the storageoptimization enhanced computing and theadvanced networking solution is migrateduh efficiency as a data uh data uhgravity significant leading the enhancethe performance and scalability in thecubernetenvironment what the strategyimplementations the strategyimplementation is a uhuh we can uh reduce the optimization ofthe cost implementation of the latencyand enhance the scalability leadingoverall the uh performance of theKubernetes environment and enhance theperformance uh the uh rate uh rate ofthe costbasically yeah thank you so muchfor that's all2025-04-15 21:57:45.296774�shot featuresupports crashconsistency Taking snapshots of all thevolumes at the same point in time alsois more efficient than taking onesnapshot at a time It provides betterperformance We started to design thevolume group snapshot feature shortlyafter volume snapshot moved to GA in1.20 release but it took a while beforewe finalized the design and weintroduced it as alpha feature in 1.29release Then we moved it to beta in1.32 This feature introduces three newAPIs We have a volume group snapshotclass that is created by a admin todescribe how volume group snapshotsshould becreated We have a volume group snapshotthat represents a user's request tocreate a volume group snapshot formultiple volumesWe have a volume group snapshot contentthat represents a physical volume groupsnapshot resource on the Sod system Thecontent is created by the snapshotcontroller in a dynamic provisioning andcreated by the admin for thepre-provisioningcase Leonardo will explain the two typesof provisioning laterIn a volume group snapshot class we hadthe CSI driver name The parameters thatcontain information that is opaque toKubernetes and is understood only by CSIdrivers It contains the deletion policyThis works the same way as the detectionpolicy in a volume snapshot class It canbe either delete orretainHere's an example of a volume groupsnapshot class The deletion policy isdeletehere Here is the volume group snapshotAPI In the spec you need to specify thesource The source can be either a labelselector or a warning group snapshotcontent name depending on the type ofprovisioningWarning group snapshot class name may beleft new to indicate that the defaultclass will beused After warning group snapshot iscreated you can see the bounding groupsnapshot content name and the creationtimestamp in the status You'll also seethe ready to use parameter in the statusthat indicates whether this volume groupsnapshot is ready to be used to restoreyour PBCsHere is an example See the label hereYou need to specify the label on all thePVCs you want to be snapshottedtogether In the volume group snapshotcontents back you have the deletionpolicy CSI driver name volume groupsnapshot class name and the source Thesource can be either a list of volumehandles for the volumes on the storagesystem or group snapshot handles whichincludes the volume group snapshothandle and a list of individual snapshothandles on the storage system The volumegroup snapshot and volume group snapshotcontent have a one to one mapping toeach otherIn the volume group snapshot contentexample here we see volume handles inthe source In the status we see volumegroup snapshot handle and the list ofindividual volume handle and snapshothandlepairs To support this feature we alsoadded volume group snapshot definitionin the CSI spec This feature in CSS backmoved to GA in the CSS back 1.11 releaseCSI spec does not have a betaphase We added a new group controllerservice and a new capability We addedthree newRPCs Create delete get warning groupsnapshotIn order to support this feature in aCSI driver a sort vendor will need toimplement this new controller serviceand the newRPCs In this diagram we show how CSIdriver is deployed and how variouscomponents that support volume groupsnapshots uh are working together Weadded the volume group snapshot supportthrough the snapshot controller and theCSI snapshot set car Snapshot controllerdoes the heavy lifting It creates thevolume group snapshot content and uh inthe in case of dynamic provisioning andit is responsible for binding the volumegroup snapshot and the volume groupsnapshot shotcontent The CSI snapshot or site car isdeployed together with the CSI driver Itwatches the volume group snapshotcontent API objects It calls CSI driverto create or delete volume groupsnapshot on the storagesystem We introduce feature gate in bothsnapshot controller and the CSI snapshotset car Since this is the beta API thefeature gate is disabled bydefault Now let me hand it over toLeonardo He will explain how the twotypes of provisioning worksOkay So there are three main things youcan do w�ith volume group snapshots Thefirst one is called dynamic provisioningand it is the workflow you use when youneed to take a backup and given a volumegroup snapshot class you need to createa volume group snapshot object and thesystem will provision everything for youSo a volume group snapshot content a setof volume snapshots and a set of volumesnapshot contents Isn't it beautifulit's magic From just one object you'regoing to get everything The secondworkflow is called pre-provisioning Andthis is what you use when you want aKubernetes cluster to take control of anexisting group snapshot existing alreadyin your storage For that workflow towork you need to create all the objectyourself So the volume group snapshotcontent the volume group snapshot thevolume snapshot contents and the volumesnapshots Then the third and most themost important workflow is the restoreAnd this is really easy to do becauseit's just designed to be used as youwould use a plain volume snapshot We seethatlater And from now on we are going toexplain how the dynamic provisioningworks And you know since uh I am anItalian guy and I love classical musicat opera I will use an opera metaphor Sowe split this process in four acts andeach act as scenes and this is an operafor four personas You have a kubernetesadministrator you have a CSI driver youhave the snapshot sidecar and the commonsnapshot controller So let's start fromact oneSo what happens now you are a Kubernetesadministrator working on your deskEverything is good Everything is fineAnd suddenly a developer enters askingyou for adatabase Do you really need it yes I doUh but it's just temporary data Don'tworry You can destroy everything whenyou want and recreate itback I don't trust temporary data Let'screate some storage for it And I knowthat is always best to have thetransaction log in a separate volume SoI'm going to create two PVCs One is fordata This example use CMPG because it'sthe operator I'm most familiar with butconcepts are generic The second one isfor the transaction loged in posgressThis is named wall right ahead log Themost important bit here is the commonlabel between the persistent volumeframe of data and the other one of thewalls is named in this example instancename Then I configure immediately avolume group snapshot class because it'sbetter to feel safe than worry you knowI'm going to back up everything and ifit istemporary okay acttwo and now we are inside the cubecontroller manager you know everythingis good someone created a chrome job andthe chrome job starts handling a yldefinition of a job to the jobcontroller the job controller is onealways complains because he want to beused by in isolation and everyone usejust the chrome jobs But anyway let'swork I'm going to create a pod for youIt's fine And this pod is triggeringyour database operator to take abackup And the database operator it is aquiet place That's back going on Back isperfection Beautiful What is doing thedatabase operator is creating a volumegroup snapshot object and look at thatwe have the labels the instance namelabel we have seen beforeh isn't itbeautiful and we have the reference forthe volume group snapshot class namethis comes from actone okay we have a new object thesnapshot controller is waking up oh lookI have a new object let's create acontent for it I'm going to get all thedetails about the volumes you need tosnapshotSo I feel the volume handles everythingbeautifulOkay And now we are inside the pod ofthe CSI driver And there's the snapshotsidecarRealizing that there is a new volumegroup snapshot content object Okay But Idon't know how to take group snapshot bymyself I need to ask my CSI driverfriend which is sitting next to me Andso I scream in a gRPC format a creategroup snapshot request But that poor guyis sitting just next to me Look I heardyou Okay these are the UID I'm going todo your work Good Okay let's take agroup snapshots These are the UIDs ofthe snapshot you requested me And thisis the UID of the groupsnapshot And you know why groupsnapshots are so fast it's because thesnapshot is continuous scr�eaming at thatpoor guy sitting next to him until it isready Ohgood Okay let's do that So I'm going tofill all the details It's everythingthere these magic numbers all theUIDs and it is ready to use immediatelyBeautiful In thisexample the snapshot controller wakes upagain because the objectchanged So I now have a volume groupsnapshot object I have a volume groupsnapshot content objects I need tocreate a set of volume snapshots and aset of volume snapshot contents becausethis is what you need when you need torehydrate your PVCI do that just using the data uh theywere sent to me in the stepbefore and I'm going to createeverything This is how the magic worksBut you know again since I created newobjects the snapshot sidecard wakes itup Oh there's new object Let's ask thepoor CSI driver the status of thisobject I already told you they are readyOkay let's do I'm going to feel thestatus Good I have the status now Theyare ready to use They have a volumegroup snapshothandle This classifies them as a memberof a group snapshot not just anindividualsnapshots Actthree Okay we go back to our officeEverything is good and you know thedeveloper comes asking for a restoreIt's always happening But it wastemporary data Yeah but you know it'staking so long to recreate from scratchCan you restore as it was 2 days agookay Okay let's doit Okay it is enough to recreate asystem volume claim and use the datasource stats inside thespecification referring to the volumesnapshot objects that were created a fewsteps before This is just like you woulduse a plain volume snapshotWe want restore to be easy because whenyou need to restore something we usuallyare having troubles It's better to avoidcomplexity This is so important This iswhy works that way Actfour The world is quiet and beautifulAnd wefinished And that's it So this is howdynamic provisioning worksAnd as you can see it is teamwork It notit is not just one software componentdoing the magic It's collaboration Sothe most important here is thecommunication between all these softwarecomponents This is more important thanthe actual implementationAnd if something doesn't work okay weneed to know that this is a beta fishernow and we need to check the fishes gateto be enabled There are feature gates inthe common snapshot controllers thereare there is a fissure gate in the CSIsnapshot cycle and then we needobviously the CRD to be installed and weneed to check that the PVCs need can bereally snapshot together at the sametime at least they need to beprovisioned by the same CSI driver andthat CMS CSI driver is the one thatshould be referred to in the volumegroup snapshotclass okay I checked everything but itstill doesn'twork well so You know the history Youknow which actor is supposed to play Nowthe logs are your friend and you need tocheck which step you stopped and whathappened and then we need to iron outsome bugs But anyway uhokay so you know I'm a posgress personso I need to share a few words aboutposgress but this is true for manysoftware systems working like a DBMS Soyou get basically two data store at theprice of one The one on the right iscalled the transaction log and everytime you change something uh someone isadding a note there and when you committhis is flushed down to permanentstorage and periodically the system willperform a checkpoint Checkpoint meansit's going to flush down all the hash toyour data store which is the one on theleft and can be split in multiplevolumes like in this exampleSo to behap a system like that you needcollaboration between the primitivesjust like volume snapshots in thisexample and the databaseoperator So someone need to start the PGbackup uh PG backup start function whichis going to turn a few knobs inside theDBMS and immediately perform acheckpoint The system is consistent atthat point and then you can take yoursnapshots When you did that you need torun pg backup and this stops the backupprocess When you need to restore thissituation it is inconsistent becausesnapshots are taken at different pointintime The database will just look for thecheckpoint immediately� before the firstsnapshot So the one where PG backupstart was performed and then apply backall the transaction log until it isconsistent This is going to take a bitof time because they they are notconsistent Meanwhile with groupsnapshots this is easier becauseeverything is consistent and it's goingto be really quicker This is why groupsnapshots are really loved by databasepeople like me It's such a great featureforus So let's do a little sacrifice forthe demo godsAnd uh uh uh uh uh uhuh I hope it is bigenough Can you see it is it big enoughgood Good Okay So this is a clusterobject I'm using CMPG but everything isgenerichere I just created a cluster objectwith just one instance That means youare going to get just one port But thisdatabase is split across multiplevolumes You want it bigger That'sgood Huh better now You multiple volumesthere There are boot seven YeahOkay To back up this system I'm runningthe CMPG backup command It is no magicAbsolutely It's just creating a backupresource that is triggering the databaseoperator And you know what wow this isnice It's going to trigger even anupdate of item Whoops Okay it's fineAnyway our backup started finished andcompleted This is what happens when youturn on your Wi-Fi Umokay How did it do that it just createda volume group snapshot object which isready to useLook at the definition of this The mostimportant bit is thisselector Okay And then you geteverything else because CNPG isannotating it with the database statusBut this is something for CMPG Then youare going to get a volume group snapshotcontent object as we were saying beforewhich is ready touse And then you are going to get assetOh this is the definition of the volumegroup snapshot content with all themapping between the UID of the volumeand the UID of thesnapshot And then weget poof Okay a list of volume snapshotsone perPVC This is nice So you get a referenceof the name of the PVC that was notsnapshot and a list of volume snapshotcontents Here itis So it's really easy to use them andit'steamwork So great Uh I'm going to startmy presentationagain Poof poof There wego Back to ShingThanks Leonardo Now that the feature isbeta we have been working on fixing bugsand trying to bring this feature to GAWe'd like to see implementation frommore storage vendors and we'd like tosee integration for more Kubernetesapplications like the backup restore usecase that Leonardo described for thepostgress SQL operatorHere are some resources for yourreferenceThat's all we have Thank you[Applause]Yeah I have a question about the demoLike in the demo you showed a volumethat was 2 GBuh how it works for how quick it worksif we have some larger volumes attachedOkay Uh the answer is it depends on howyour storage is implemented This is justaninfrastructure So yeah sorry I don'thave a clear answer for you Check yourCSI driver You will find a lot ofdocumentation about this worksinternally It may be immediate or it maytake some time depending on if your CSIdriver implements secondary storage ornot tooYeah definitely depends on how how yourstorage works Okay Okay Thank youUm hello Uh I have a question about thedemo also about the restore with CNPG Uhbecause in the CNPG documentation youneed to delete the cluster and recreatefrom a backup Uh does it mean that withvolume group you need just to restorethe volume group right and withouttouching the cluster object yes When thevolume group snapshot feature will bemerged inside CPG you will just be ableto reference the volume group snapshotobject and then you are done Okay Soit's not available yet right but if youneed to change yeah it will be merged Ihope for the next release Okay thank youOkay thanks for the presentation I Iwant if I can I would like to answer thealso the question before about therecovery time Mhm because I I did a Idid a presentation in CubeCon in I thinkChicago a year and a half ago where withMichelle from Google we showed how torestore a posgus database of 4.5terabytes in uh two minutes as Leonardoexplained volume snapshots are still atthe the foundation on this but I wouldlike to ask you uh if you can brieflycover uh the differences betweensnapshotingin VMs and the benefits that Kuberneteshas brought to this technology that isunprecedented I meanokay so uh you're talking okay the VMsright so yeah VMs or outside KubernetesI mean the advantage of Kubernetes rightso uh with this feature if your VMsusing multiple volumes you can take ayou know group group snapshot you canthat will be very helpful for your VMsright because your VMs typically havemultiple volumesyeah and to add a bit of something Thisis a foundation work that is common toevery storage implementation So you justhave an an API and with using that APIyou are independent on the actualimplementationYeah Sorry I probably rephrased it wrongbut mine was uh I've seen from a from auser's point of view the advantages ofhaving a standard interface thatKubernetes brings when taking volumessnapshots that it's not available in inVMs where every vendor has their ownSorry I should So you if your VMs haveall the data stored in the PVC you canuse this Yeah you can use this uhfeature to take a group snapshots OkayYeah Thank youThank you for presentation I have aquestion Uh in your selector using onlyhost name is it possible to use adifferent u attributes in a selectorabsolutely Absolutely It's just a plainselector Okay So this is foundation workIt's not about a particularimplementation It's for everyone to useSo potentially we can make likemultiport uh snapshotAbsolutely Okay Thank youOne clarification question Are youcalling this CSI provider with a listand delegating consistency to them orare you are you taking responsibilityfor consistency and handling callingthem parallelso CSI driver needs to implement thisone and need to call their vendorspecific APIs right if they supportconsistent group then they can use thisthey can implement this but if they donot have that yet then it's not going towork right so this depends on whether avendor supports this technology or noton their need you need adoption fromeach vendor that's providing a CSIdriver to handle yes consistency andthey can implement that however theywant in the back end Yes they can Yeahthey can implement this one if they havethat feature in their storage systemYeah Do you know out of the majorvendors who supports it yet so right nowuh it's it's a bit early Uh last timewhen we asked there are like a couple ofcompanies saying they they haveimplement this So that's why I'm callingfor implementation for most vendors OkayThank youany tentative timelines uh when thisfeature will become GAUh so we just moved to beta in 1.32 Soright now we are fixing some issuesProbably need some soak time Um this isjust tentative right maybe 1.35 releaseSo we'll seeThanksHello Bonjouro Uh I would like to ask uhwhen you take the snapshot and uhuh the snapshot operator cooperate withthe CMPGuh operator uh of and you have to issuethe PG start backup combat it has toflash the wall to the disk to be able totake a consistent snapshot How does thisaffect a affectperformance and if this affectperformance because I imagine that Iwonder that in that case uh right willbe locked for a moment to the databaseYeah So this is actually posgressspecific and not about but yeah okay uhwith posgress you can spread yourcheckpoints across time to avoid thespike of a workload just when youperform acheckpoint So it depends on how youconfigure your engine but you candefinitely avoid that spike at least agood part of it Okay thank youOh so sorry I I just need one moreminute or two minutes of your time Iwould like to thank Shink for all thehelp she gave me when I startedcontributing to the Kubernetes CSIecosystem She set an example both at thetechnical level and the personal side ascould to be a really good leader on thetech storage initiative A big round ofapplause for[Applause]her Thanks Leonardo uh you have beencontributing a lot to this feature andalso there are many other communitymembers who are contributing to thisfeature They are not in thispresentation but I want to thank all ofthem for contributing[Applause]2025-04-15 21:57:45.870874 2�2��+,#�� AZIk_EqI8rVAhi everyone uh here's the session sixscheduling intro andupdates i'm Kens from Tate working forservice me stuff and and I am M fromGoogle i am working in AI training teamall right so let's get startedso s scheduling we are maintaining thecomponents that are related to the youknow part p part placement and kubuleris the main component that we aremaintaining in the kubernest uh upstreamrepository and others are the subprojects uh that helps usersadditionally in this area in many youknow different ways in this sessionwe'll introduce the you know briefintroduction about the commandscheduleuler itself and discuss thelatest uh enhancements around thescheduleuler and lastly we will sh��k+#�� AurRefZ0KnU4helloeveryone Thank you for coming to oursession on waring groupsnapshots My name is Shining I work atVML by Bortcon I'm also co-chair ofKubernetes six storage And my name isLeonardo I'm a principal softwareengineer working in EDB I'm a posgressuser since a lot of time and I I'm acontributor to the Kubernetes CSIproject and a maintainer of the cloudnative PGproject Here's today'sagenda We will discuss why we needwarning group snapshots explain how itworks and how to use it At the end wewill do a demoWhen I was preparing for thispresentation I asked Chad GBT to show mewhy disaster recovery is importantHere's what I got It looks chaotic anddangerous all around but in the middleof this picture it looks calm andsafe This is definitely better than whatI candraw here We have a data center beingprotected from disasters such as fireflood cyber attacks A shield thatsafeguards your critical data A contrastbetween chaos andsecurity Disaster can happen at any timeHow do we protect our precious data froma potential loss here I listed whatcould cause a disaster Actually humanerror is one of the leadingfactors as shown in this picture If thisdata center is burned down your datastored there will be lost for sureHowever if you have your data backed upin a different location you can alwaysrestore your data backSo data protection and disaster recoveryare very important for your missioncriticalapplications Volume snapshot API wasintroduced back in Kubernetes 1.12release and moved to G in Kubernetes1.20 release It allows you to take acrash consistent snapshot of apersistent volume and use that torestore your data back at a later timeif a disasterstrikes It provides a basic buildingblock to protect your applicationsrunning inKubernetes We already have a volumesnapshot API for inter video volumes Sowhy do we need volume group snapshotsnow let's take a look at anexample Suppose you have an applicationrunning that uses multiple volumes tostore its data logs and so on You wantto protect yourapplication To ensure applicationconsistency you need to quiet yourapplication before taking a snapshot andunquest afterwards But quas takes a longtime and it is very expensive So you maynot want to do it frequently But youstill want to be able to back up yourdata morefrequently Without application quas youtake a snapshot of the first volume attimeone Then you take a snapshot of thesecond volume at timetwo Then you take a snapshot of thethird volume at time threeNow you try to clone the sameapplication to a different name space ormaybe your original application iscorrupted somehow You are trying torestore your application back from thosesnapshots Now after restore you may runinto problems because the snapshots weretaken at different times and you getinconsistentdata How do we solve this problemfortunately we have consistent groupsnapshots come to therescue Consistent group snapshot allowsa snapshot to be taken from multiplevolumes at the same point in time toensure right orderconsistency This is important forapplications that have multiple volumesAfter taking the consistent groupsnapshot you can create a volume fromthe snapshot and get your applicationback Note that if your application is adatabase you will still need to do crashrecovery Leonardo will give an examplelater Volume group snap��owsome other uh updates in the major subprojects so command scheduleuler is thecomponent that uh decides the pot partplacement uh so it it implements thefeatures like you know uh resourcerequirement port affinity node affinityand you know part spread or some othersas well and each feature is actuallyimplemented as a program in generallyand so for example we have a resourcefit program that handles resourcerequirement and we have an uh interotinterfat affinity and yeah each feature isimplemented likethat so each plugin can work at severalinterfaces which is called um extensionpoint so those uh futures and scores aretwo major extension points that involvesthe scheduling decision uh at the futureextension point uh we like uh programscan reject node that shouldn't run theport so for example uh like resourcefeed programing rejects nodes thatdoesn't have enough resources for theport and you know node affinity plugingrejects nodes that don't have enough youknow labels and stuff and then afterthat at uh the score extension point uhprograms can score nodes based on theirpreferencepreference and yeah for example theyThere is a plugin called image localityplugin that gives high higher score tonodes that have uh container image atthe cashalready so we will see how it works uhthe scheduleuler schedules partsbasically one by one and each timeevaluate all nodes so that it candetermine the best node for each part inthis example let's say we have fournodes and also we have two futureplugins and two scoreplugins first all nodes are evaluated bythe future plugins uh in this examplelooks like uh future A plugin rejectsnode for so for some you know somereason and feature B uh plugin rejectsnode three for another reason so lookslike only node one and two are going forthe scoring phase and then you know eachscore plugin scores each node based ontheir perspective and let's say theyscore them likethis the total scores will be like thisso in this case node two uh got thehighest you know total score so thispart will eventually go to the node toget theresult i only introduced about two majoryou know extension points feature inschool but actually there are moreextension points so this is the actualyou know like full picture of it andthis entire architecture is calledschedulingframework this green area is calledscheduling cycle it is uh responsiblefor you know making a decision of potplacement like I described and yeahfuture and score will be the major oneshere and if the scheduleuler decided todecides to schedule parts for some nodeuh then this bot will be uh proceedingto the next step which is called bindingcyclethis yellow area shows the binding cycleactually yeah so this uh binding cycleis responsible for applying the decisionon command API sub actually so likespecifically it updates the parts likethere's a field called node name uh inthe part souh at the binding cycle the scheduleulerupdates the the node name field with thescheduling result and outside thescheduleuler uh pro uh cublet willnotice the port update with the nodename and starts the portaccordingly so one note is that uh thescheduleuler runs the scheduling cycleone by one but runs this binding cycleas synchronously so this is for betteryou know this is for better efficiencyuh as I mentioned the binding cyclemakes an API call so you know that is akind of expensive uh operation right sothis architecture like uh schedulingcycle and binding cycle allows allows usto decouple making a decision andactually applying it via API callsuh also we have a scheduling queue uhthat holds all pending parts and decideswhich part to schedule next so it takesa like a crucial role for you knowdetermining which part to retire firstand which is not like based on priorityor based on some you know cross updateand etc uh we'll see the detail in thelatersection all right so we saw the overviewof commander scheduleuler and from herewe will discuss about the recent updatein the commandscheduleuler so we will take a look ateach one by one we basically have thesefour major updatesrecently the first one is quing h�ints soin command the scheduleuler theperformance really matters like uh it isyou know really important to uh likebecause the scheduling cycle schedulesports one by one right as I mentioned sowhen your cluster gets bigger and biggerthe amount of created parts could gobeyond the scheduling at some point andit could cause the those kind of uhthose ports uh being pending for a longtimeso that is the you know worst casescenario that we want to prevent and wehave a you know certain goal for ourscheduling throughput and try to keepthe throughput with the goal in everyscenario we actually have a like ascheduling performance test in the upupstream scheduleuler so that we cankeep the goal with everychange and yes so therefore uh in therecent VV cycles uh we've been basicallyyou know focusing uh on the performanceimprovement more than adding newfeaturesso the recent enhancements that we'llintroduce today are mostly about theperformance i mean the OA is only theexception I guess uh like yes so queinghint is one of that uh when your part isunscheduableuh like what could make your partscheduleable again uh so thinking aboutit like what makes port schedulable issome change in the crust some resourcechange in thecluster so we go search any change inthe cluster like uh the cluster eventsin the scheduleulerum for example like uh node is updatednode is created pot isupdated piv is you know deleted etc allthose uh changes are called we call themcrust events so let's see how uh youknow does Q handles those events andmake the decision of retryingparts so this is the previous state uhso if the part is rejected at thescheduling cycle it comes back to thequeue with the note about which pluginsrejects rejected this part so in thisexample looks like the part is rejectedby the resource feed program the worldis very small and the resource fitplugin is the plug-in that checks eachnode's you know resource requirement uhresources resource capacity andcalculates whether the part can go tothere or not so in this example probablylike there's no part that that hasenough resource to accommodate this uhschedulingpart so each plugin basically registerswhich kind of uh cross the events thatcould uh resolve uh their schedulingfailure so in this case resource fe uhscheduling failure could be resolved bynew node is created or pot is up uhsorry node is updated to have moreallocatable capacity or some you knowexisting parts are uhremoved so this resource feed programregisters those eventsand what we are doing is like uhbasically we keep checking the eventsand when the queue observes node eventnode addition event uh which the pluginregisters it requires the it triggersthe retire of the scheduling of thispart so this is how previously thescheduling you know ritual worksbut thinking more about it uh likebasically like the ideais not all node addition event couldmake this part scheduleable actuallybecause for example what if the new nodeis very small and cannot run the pendingpart uh then here is the motivation ofcuring hint so queuing hint is thefeature that allows progress to futureout across the events so that you knowthe queue can determine when to reach Ymorewisely so in this example the programhas the Q hint uh to check uh whether anew node is big enough for this part'srequest and if yes then retry this partand if no then ignore this event so thisQ hint helps in reducing unnecessaryscheduling rituals which allows us youknow eventually increase the schedulingbook all right so next one is async prepreeemption so if your parts uh getunscheduable uh they may go through theprocess called preeemptionso it's like uhum this is the feature in commandschedule uh that allows higher priorityparts to delete some lower priorityparts so that it can make the price forthis higher priority part instead of youknow delet uh those uh lower priorityparts so in this case uh okay this guythis high priority part wants to go tonode one by deleting some parts thereand the scheduleuler deletes those partsso that this part could likely go tothere in the next scheduling cycle butwe have �a problem here uh like when thepreeemption happens uh scheduling cycletakes time to complete becauseuh the preeemption process has to makeAPI calls to delete thoseparts so given the preeemption happenswithin the scheduling cycle itnegatively affects our schedulingthroughputsso basically you know the scheduleuleruh waits for all the the API calls todelete those boards to finish and thenstarts the next scheduling cycle sothat's you knowexpensive so this ex uh this enhancementliterally just tries to run those APIcalls asynchronously to decouple themfrom the scheduling cycle it's like athe idea similar to you know bindingcycles so when the scheduleuler decidesto uh delete some parts on node one itit just reserves the praise for thispart on uh node one and it just startsthe next scheduling cycle withoutwaiting for you know preeemption APIcalls to be done so given we we we wemade a reservation before starting nextscheduling cycle uh next schedulingcycles take uh this uh ongoingpreeemption process into considerationwhen calculating the scheduling uhschedulingresult yes so the next feature that wewant to mention today is pop bots fromback of Q1 active is emptyuh why we made this feature when theactive queue is when the active queue isnot empty and we want to schedule a portwe just take the first part from theactive queue and schedule it but whenthe active Q is empty the cube scheduleidles which means that it does nothingfor some period of time even if the potscould be in another queue like bag of Qwaiting for the bag of penalty to finishso we decided that we could pop the potsfrom the bag of Q if the activity isempty so we could utilize thescheduleuler resourcesappropriately so let's see how it looklike previously we have three cues inthe scheduleuler unscheduable portswhich is the factor a map bag of Q andactive Q and we want to schedule oneport that was rejected by schedulingcycle for example uh some node hasn'tenough capacity so pot has to wait for anew node to appear and when there is aevent node add to the cluster the Q hintcould decide to retry the pot and thenpot is moved to the back of Q waitingfor its back pen penalty to finish andthis this backoff is scale with thenumber of attempts so it could be a fewseconds for example and each second theperiodic flash happens that pops popsall the pots from the peg of Q to theactive queue but before moving the portto the active queue uh the pre andqplugins are called for that port and ifthey fail the pot is moved to the ankoslevel pots again but if they succeed thepot is in the active queue now waitingto be popped by the scheduling cyclewhen it'sfirst so how looks the proposal for thisfeature first of all as I said we wantto pop the port from the bag of Q andactive Q is empty then PNQ plugins hadto be moved before adding the port tothe bag of Q because we want the popoperation to be as performant aspossible the third thing is that we hadto change the bag of Q's auditing uhfunction to order the pots but by activecues a ordering so order them bypriority because if we pop the potearlier we want to pop the highestpriority pots if possible to reduce thenumber of preemptions and the last onepops that are in the bag of queuebecause of some errors for example APIerrors during the scheduling we want tokeep those spots still in the bag ofqueue because it's some kind of deadlimiting for the API server to not beexhausted so how the scheduling queuelooks like with the new feature first ofall we can see that P andQ plugins weremoved left and that for Q and a newgreen line was added to the schema soagain we have a port in unscable portsthen now the PNQ plugins are executedand if they pass the pot is moved to theback of Q but now if active Q is emptywe can pop this pot at any time from theback of Q to the scheduling cycle but ifthe active Q is empty it works the sameas before so pop waits to beperiodically moved to the active queueso now we have a pot in the schedulingcyclethe other thing that we want to mentiontoday is the array so how can we expressour resource needs in the port we canspecif�y CPU or memory as well as thestorage or some hardware like number ofGPUs but it's not enough becausesometimes we want to uh get a fractionof some resources so we could requestfor a storage or use some CRDsalways but if we know what we need butnot like exactly some we want some kindof device and some number of memory wecould use the array where we can expresswhat we need in resource climb like inthis example and nodes have someresource slices populated that sketchthat could match those claims to theslices so now what were what were thethe features sponsored by six schedulingin1.33 first of all partitionable deviceswhich mean that port can apply for thepartition of a larger device likely amultihost device that spans multiplenodes the prioritize alternatives evendevice requests which means that if wecan't get a specific device we can takeanother one but it's less preferred sowe can make some preferred order listfor the devices and device times andtolerations that are similar to the notetimes if we apply a time on a device wecan aict all the parts that use thisdevice yes and what are the futurechallenges of the DA that are related tothe cube scheduleuler support forcompassible disagregated infrastructureso we could like to attach some devicesdynamically as well as some cross nodedependencies where pot bypots schedulingof the cubeuler might be not enough likefor the multi-host feature in thepartitionabledevices and there were there were othercube concessions unfortunately beforeours that told more about the new DAfeatureso you can always watch those sessionafterwards and what were the other majorupdates in 133 related to cubescheduleuler it was graduation of matchlabel keys input affinity andanti-affffinity to GI as well as nodeinclusion policy input topology spreadto GI and we improved schedulingthroughput of ports that use interotaffinity and port topology spreadfiltering as well as some preeemptionscenarios by around 20% in largeclusters so that's a quite good uhimprovement but you know it's not alwaysuh it's not like working in all the usecases so you could check how it works inyour cluster so now let's move onquickly to the sub projects updates sofirst sub sub project that we want tomention is Q so what is Q q is acomponent that manages quotas and howjobs consume them so it decides when ajob should start should wait should beadmitted to start and when it should bepreempted so what's new in the queuethere is a building integration forupper upper and leader worker set aswell as multiq supports now both cubeflow jobs and ray cluster fair sheddingwas added which allows to split theborrowability sources in some fmanertopology scheduling was added as well asranging forit so if you know again more about thequeue there were other cube concessionsand we want to encourage you to watchthem if you didn't jointhem now now we can move on to the thescheduleuler so the scheduleuler thecubeuler only schedules spots so itlooks only on the pending pots in thecluster and it doesn't care much aboutthose pots after they are put on thenodes so for example some uh constraintsthat that pot head for example some pottopology spreading that means that wewant to spread pot equally in thecluster or some anti-affinity couldchange over time in the cluster meaningthat those rules are not are no longermet by such pot and you can use thescheduleuler to evict such ports thatdoesn't meet the policies that werequire to enforce so what's new therethe cube schedule the the scheduleulerthis matrix now so we can use promuseand kubernetes matrix to guide the thescheduleuler if we should have the potfor example that is consuming too manyresources on a node and we want toutilizeless and the last sub project that wewant to mention today is cube schedulesimulator so for example if you want touh test some custom plugins or tweakconfiguration on of cubeuler and youdon't want to run in it in your realcluster you can always use cubescheduleuler simulator it uses quark tocreate a fake cluster and it has a niceUI that allows you to show eachscheduling step especially for examplewhat uh plugin rejected what note andwhat were the scores of each plugin foryour pots yeah and what's new there thethis the cube scul simulator can beconnected to your real cluster it meansthat you don't need to manually createall the resources of your cluster in thesimulator but you can connect your cubeconfig to the scheduleuler simulator andit will download all the node spots atCira and you have you can try thetweaked scheduleuler on a real clusterlikely so how to get involved in the sixscheduling first of all you can write inthe six schedulings channel on Slack onKubernetes Slack as well as we have twoperiodical community meetings first isthe Asia and Europe which was recentlycreated is in the friendly time forthose continents on Tuesdays as well asthewell-known America and Europe meeting onThursdays of course you can all alwayscreate some issues in the Kubernetesrepository as well as pull request andwe are happy to review yourneeds and you can scan a QR code if youwant to take the sessions and we arewaiting for your questions now if youhave any the microphone ishere thank youum I went to the Q session yesterday andit from what I understood they'rebuilding a lot of theuling logic into Qso it's kind of pre-cheduled by Q andthen it goes to the Kubernetesscheduleuler to actually schedule on uhonto nodes uh it seems like with there'sa bit of like overlap there and if isthere a way of handling that better sothat more Q logic is integrated withinuh the scheduleuler or the scheduleulerprovides some kind ofuling service orsomething like that so there's not thislike kind of double upyes that's a correct issue butyeah as far as I know the Q has thetopology of scheduling and it plans toutilize the scheduling framework workthere of course it's not great becausewe have like separate schedulers intoplaces but now it's the the best we cando because you know the scheduleuleritself the scheduleuler schedule spot bypot and changing this mechanism is along story but yeah in the futureprobably this could change so okay coolthank you thank youhello I have question can we like livemonitoring this uh active queue andbackend queue like it's could be morecomplicated in huge cluster how it'swhat's going on in this queue and orthere is only like we can monitor theevents forthem it's like the monitoring the queuei think there are metrics exposed thattell you how many ports are each in eachof the cues so you could check them aswell as you can check some metrics forthe events so you can see how thosespots move between the killsinclud autoscaling and I got a questionin my session that I hope you might beable to help answer uh which is um in asimplified case imagine that uh you havelike a logs collection agent and aworkload that are on the same node uhand the log collection agent ramps upwith the workload uh so with uhautoscaling like the two of them exceedthe capacity of the node but like justthe agent can scale down when theworkload's not there and then createsspace for the workload again if thatmakes sense like you end up in a cyclewhere theautoscaling forces you to exceed thecapacity of the node and then you wouldexpect it to scale back down oncesomething's evicted uh do you have anyideas about like how the scheduler mightbe able to handle a situation like thati don't know to be honestI didn't either so that's fair yeah canwe discuss after this session yeah yeahyeah if you could like join the scheduleyou can just discuss it with us thankyoui do have a a separate question you talkabout Q you didn't talk about B groupslike I feel like I'd be curious to hearlike what is what are the talks likeabout supporting better supporting costscheduling maybe introducing like somenative construct to group BSlike you want to yeah you can speaklike probably in the future we willextend the support for such cases inulerbut now you know we can use skew to todo that up as a native solution ofKubernetes awesome thank you thank youokay so if there are no more questionswe can end today so thank you forjoining everyone[Applause]2025-04-15 21:57:46.468026�ams so in the build stages actuallyyou build your application imagesnormally you will build on top of thebase images that you acquire or you mayhave uh to use some other utility imagesso in this stages so before you use uhthe external images stored in the youruh private registry you can validate thesignature to ensure they are really uhapproved uh by yourcompany and uh in this stages after youbuild your own uh application images andyou probably will also produce someother uh um security related image manmetadata such as spawn for complcompliance purpose and alsovulnerability reports so for uh yourapplication images and those uh uhsecurity uh security related metadatayou may also use not project tooling tosign so that later on you can validatethe signature to ensure they aretrust now you have your applicationimages you have uh supply chain relatedmetadata you have everything signed andit's it is ready to uh for publishingthose images so if you uh like Bnami asa image publisher you can publish to thepublic repository or any otherrepository that your customer can accessthen they can validate the signaturebefore use that or uh you are serviceteam you um uh you release this imagesfor production so that your uh platformteam or service team can deploy thoseimages to uh for your own services soduring deployment you can uh validatethe notary project signatures and otheruh and signatures of other uh umsecurity metadata to ensure that theimages can betrusted okay so with this approachum basically not project can help you toensure the authentuh end to end now I will hand over to Jto talk about uh his company's practicethank you so first of all to give itmore some context I would introduce alittle bit orange logic because probablymost of you doesn't know this companyit's not a tech company or not for youit's a leading company in the damos andthe dynamic system is the way to storeassets uh for a company to make themsearchable to delivermore potential on the assets and u wework for the major company in the worldbehind the hood we help them to evolveand our company provides a SAS solutionto what is a single tenant solution tohelp them to search all these assetstranscode them store them and we speakabout pabytes of data around multipleregions in multiple cloud providers sowe have a lot of challenges coming um asyou can see we are multicloud we try tobe flexible all over the world and thatends up with a lot of deployments in thenature etc so we try to modernize ourinfrastructure by being more cloudnative and moving away from traditionalinstallation to something based onKubernetes with in the meantime moresecurityso some numbers uh with an example ofcustomers we have a lot of regents andclients using the application and a lotof data to process and we also need forall of that a lot of compliance andcertificationsso that'swhy my making the migration toKubernetes and to this large ecosystemof tools what is also a kind of jungleyou probably know it so that's the besttime to leverage a better security onsupplychain to do that we asked some questionsinternally so from where to start firstthing we need to check uh and trust thedependencies of our applicationobviously so for that there is a lot ofstandards like asbomb and things likethat to list your dependencies to knowthem but after that you have to trustand ensure they are coming from trustedsources we also need to use asymmetricsignature because obviously we all knowitto certify something the best waycurrently is to use the asymmetriccryptography because you can verify asignature with a public key you can signon your side with some any mechanism sothat was part of the needs we also needto prevent the signature being unsafebecause if anyone can sign and if thesignature is shared over workstation orwhatever in the end it defeat the systemso it's not acceptable and we need alsothe signature beingumsorry being ableto to work with a tool flexibledepending on the technology you use ifyou want to use a cloud provider an HSMdevice or whatever else and to finish wewant having multiple level of sign�atureit's a little bit related to our ownneed because of our release workflow etcbut something we considered is the factyou can sign an artifact going out fromyour supply chain but if it's certifiedat this time it doesn't means it's supercertified maybe it's unstable maybe it'sunsecure to put it in production so it'scoming from you it's safe but maybe it'snot safe to go in production so we wantto leverage also multiple level ofattestation so what Nar brings to usuh we'll focus on three parts about whathe presented before so the build deployand run on the build part uh there isthe first rule nothing can betrustable if uh it can be attested safeand what means safe safe doesn't meansonly noone in input something bad on your codeor in your dependencies it can be alsoalso some saying what is unstable anapplication unstable is unsafe so unsafeis large second rule we need to be surethe mechanism is able tobe played with with an audit to be sureno one can sign transparently fromanywhere and we can't know who has madethe signature so the key must be securedin the right place and the only way toaccess the key is to be loggedthe third the third rule is to ensure wecan easily verify the key so as anexample traditionally when you developsoftware on desktop and things like thatyou sign them usually and you buy acertificate to an authority what isproviding a high level of trust intheory so you embed your certificate inthe final artifact the executable theDLL the kernel driver whatever and thenanyone can verify the signature becausethe master CA is public so that's anexample of mechanism to verify thesource and it's something what we wouldlike to rely on because we already makesome desktop application so our firstguess was to say why not use the samemechanism forcontainers and the last rule is thefact some other mechanism are working ontransparencing is like uh let's encryptlet'sencrypt anyone can create a certificatefor it until you are owner of the domainyou can create let's encrypt certificateit's trusted that means It's somethingwhat attest you are this domain but ifyou buy a domain Google instead ofGoogle nothing prove that you are a goodactorso it's not enough for us to have aglobal trusted certificate we prefertrust exactly what we want and also theother mechanism of transparencylogs is making you mandatory to uploadall your build pipeline information oryour build pipeline artifact and resultspublicly on the internet to make themverifiable and we don't want as a SAScompany publishing all of our pipelinestuff on the internet makes no senseso another big interest of notary andnotation CLA especially is the fact it'sa tool what is very light independencies we'll see that later butalso something very extensible thatmeans anyone can create easily anextension for his need and installing anapplic extension for notation is sosimple as making notation plug-ininstallYou provide the URL with a ZboardTGZ eventually or recommended you put ashot to verify the artifact and thennotation will install the binary in afolder and then when you use notationkey add what is the way to register anew key for future signature you specifythis plug-in and what happened under thehood is just the fact that executes thatplugin passing some subcomands and onlythree subcomands are describe keygenerate signature and get plug-inmetadata so that means anyone can buildits own plug-in for its own need on anylanguage can be in Golang can be innetcan be in whatever so that makes thisproject reallyextensible and not close to somespecific use case if you want to use anHSM device on prem you can use acommunity plug-in or build your own ifyou want to use Asia key volt there iscommunity plugins if you want to use avault there is community plugins so thatmakes the tool very flexible and againbecause the dependencies are low therisk islo so next step once you sign yourartifact you needto ensure it's going where you expect itto be going so what we decidedinternally is to put in place apromotion mechanismwhere for an artifact so a docker imagesbasically you can sign it multi�ple timeswith multiple keys and multiple level oftrust so if the artifact for example iscoming from your build chain directlyfrom your source code we sign it inlevel one so it's trusted it's comingfrom your pipelines it's coming fromyour infrastructure it's safe in theoryeverything has been verified before butyou are not sure if it's ready forproduction in terms of stability andmaybe in terms of additional CV orwhatever so the idea is to sign it levelz row before going in internal QA thenonce the internal QA is okay and someonea human or some other asynchronouspipelines are saying okay I am happywith this image it pass some new gateslike antiverse scanning vulnerabilityscanning human testing in that casesomeone can trigger a new workflow whatis logged audited to sign it level twoand in that case the idea is to say okaythis image is ready for for testing onclient side that means any client shouldsafely install it on a UATenvironment and the last step what canbe shortened depending on the use caseis to sign it for production and withthe same mechanism someone will justclick to launch a new workflow to finishum the key is to verify all thatsignatures so that means once the imageis going to an environment theenvironment must reject an image what isnot signed with theexpected signature chain for thisenvironment so you could have a CAspecific for production a CA specificfor test and a specific specific for Qenvironment and everything should beautomated to ensure nothing unexpectedhappens in a wrong environment so I letyou continue on the other partyeah thanksJ so let's uh talk about why you shouldchoose notproject so first thing first secure sowe are security tools right so ourselvesshould be secure so not project weregularly apply the security bestpractice by going through the securityaudit by third parties as you can see inthe table that's our very first majorrelease we have done two security audituh in 2022 2023 and uh uh last year 20242025 we also did another security auditon two uh important feature one is thetimestamping another is the revocationchecking so we are our quality is thebest in class and we plan to do anothersecurity audit probably uh later thisyear early nextyear another reason that Jon alsotouched a bit is uh not a project we arebuilding our solutions tools based onstandards and uh we also have theextendability by providing the pluglingframework and also we focus on um umenterprise scenarios so as you can seeon the bottom we have a notary projectspecifications they are building on topof uh uh standards um the OCI standard oopen container initiative which allowsyou to manage your uh artifacts acrossregistries across multicloud platform wealso support IETF standard uh signatureformat one is the JSON web signaturewhich is uh popular uh another is coysignature it is a binary uh encodedwhich is very concise and efficientlyespecially um useful for uh edge or IoTcases for example for the low powerdevices from the plug-in side we have aum plug-in from major uh cloud providersuch as a key volt uh AWS signer andalso Alibaba cloud and we also have someother uh plugins uh by some vendors andwe have a hutchob world plug-incurrently it is in arva state so welcomeanyone to contribute into this plug-inand we are also uh working uh on a newplug-in and with this new plug-in we uhenable the uh informal case support orwe call it the shortlived certificateand also zero uh touch uh certificatemanagement and we also uh integrate withthe whole ecosystem so currently we haveuh the CSD uh pipeline arrow devopsuh uh flux cd and also github actionsintegrated with the notation and we planto integrate with zago cd and thegitlabs so also if someone is interestedfeel free to reach out and help tocontribute another reason why uh youshould consider not project you can seeuh in the screen it is the tools and thelibraries that um built by notaryproject so as a security tools we uhminimize uh our dependencies so thewhole problem of the software supplychain is due to the fast growing of theopen source project right you can justuh acquire and use but as a securitytool we should minimize our dependenciesotherwise we will have our own securesupply chain issues so as you can seethe three major repositories uh we havevery few dependencies uh especially forthe first one time stamping client so ifyou work in the security area orinteresting in this area you want tobuild some timestamping related clientfor interactive with uh uh timestampingserver for your uh shortlive certificateor timestamping related scenarios youwill not find any good uh libraries uhin the industry so timestamping clientis uh another uh major contribution thatnotary project provide to the wholeecosystem and we have a zero uhdependencies and we have uh currently wehave a popular adopters uh from cloudproviders and uhregistries so actually any registrysupport uh OCS standard uh is uh notproject can be used for and also uhadmission control engine such as kibodoalso support uh notary projectsignatures something about road map souh we basically complete all the corecapabilities for uh sign and verify OCIartifacts so now we are working onextending the sign capabilities to theuh other artifacts uh that are notstored and distributed as OC artifactsso such artifacts could be uh forexample spawn for your binaries or uh AImodel files or uh web assembly uh modelfiles and uh as OCI getting popular sowe will um um change the defaultbehavior to by default support OCI 1.1referrals API which is more elegant uhway to manage yoursignatures um and we will also make uhso currently working on making ourexperimental feature sign and verify uhOCI image layout which is a kind ofimages you can put on your file systemyou don't have to store in the OCIregistry first you can put it on yourlocal file system you sign it then youput to the registry which is more secureand in the future we plan to support uhattestations probably we will work within total community to support in totalattestations and we will also working onthe transparent lock uh and uh we willalso b uh build it based on the IETFstandard we still have five minutes so Iwill give a short demo on our uh newfeature which is about assigning uh blobfiles on your file systemokay so taking a spawn file asexample so you build a spawn file onyour file system in your trusted domainand before you publishing it you want touh sign it to ensure the integrity andthe uh authenticity so I'm using thenotation the alphaversion uh I already have my keyprovisioned uh currently I'm using atestkey then after that I can use the newcommand set uh blob notation blob sign ichoose the format cozy a binary formatfor the signature and the med uh mediatype is spdxyeah so after that I signed the spawnfile i have everything on my file systemit is not leaving my uh trust domain yetso now I publish in um anywhere usingany file transfer or method to publishthe spawn file and the signaturetogether so that later on any consumerwant to consume the spawn they canvalidate the signature firstokay so I have where everythingdownloaded on my file system as aconsumer and this is to check I have theroot say certificate config that is thetrust anglethen for not project we have the trustpolicy which uh you can find and tuneyour policy according to yourrequirements so I initialize this trustpolicy with the trust store as justshown and the trust identity which isthe subject field of the sciencecertificate provided by the uh publisherthen I show the trust policy thatconfigured it is as expected then I canstart to verify thesignature yeah it's easy i just specifythe policy name the spawn file and thesignature okay so after signatureverification successfully uh I can startto uh analyze the spawnfileoops yeah so with that I think wecomplete all thecontent so anyquestions yeah there are two QR codehere uh one is not project websites feelfree to explore and bookmark and anotheris the stack channel under the CNFuh community so you can find the notaryproject stack channel ask any issuesquestions thereso if you have a questions you can standuh in front of this microphone toask okay thank you2025-04-15 21:57:46.926487 ��I-#��IA1FwE0ajODU8so hello everyone welcome to the notproject maintenance track so today isthe final day of CubeCon EU London uhthis year so hope everyone havediscovered something interested umlearned something new and made some newconnections and especially having somefun my name is E i'm not a projectmaintainer i'm also senior productmanager at Microsoft so currently I'mfocusing on the cloud native securityand the redtries and I'm Gil so the technical leadarchitect at onlogic and I'm leading themigration of the company to more cloudnative solutions to modernize all theinfrastructure and all the things sincesome years nowokay so uh it's very great to have Jtoday as a co-speaker he will share umhis company's journey in software supplychain practice and also how not projectcan help to support thiseffort so what is not project so anotherproject is uh incubating sab incubatingproject and it is in the cloud nativesecurity domain so maybe some of youattended the uh uh the keynote today uhthere are some talk about the cyberattack resilience act this initiative sonot project definitely can provide thesupport from tuning perspectiveuh not a project mainly answer twoquestions the first one is as you see inthe screen it is how I can trust thecontainer images or other artifacts Iused how do I know they are from trustedsource this is about theauthenticity another question is how canI make sure those artifacts are notmodified by malicious users duringpublishing or distribution so this ismore about theintegrity so not project our mission isto ensure the authenticity and theintegrity of cloud native uh artifactsby providing standard based solutionsandtools so a bit deep dive into the majorscenario that not project that supportsnowadays uh you may see this uh diagramin the lighting talk so this is actuallya framework help you to understand howto address the uh supply uh uh supplychain issues in a systematic way sostarting from the left it is the acralstages so in this stages you normallyacross so in this example uh there uhBietnami is one of the major publisherin dog harp so maybe you already knowthat so Bidnami they uh sign they buildsign and publish uh container images andhammer charts in dog harp and they signwith notary projectsignatures and if you go to doc harborand did a search you can find uh as apublisher they have a guide for consumerthat it mentioned you can and usenotation to validate the signature toensure that the images ham charts arereally coming from uhVietnami so at the acrystages you can use notary project toolsto validate the not project signaturesfrom any external uh images that youwant to acquire for internal useso after that you normally you will uhcopy those external images in your uhinternal registries and uh uh as therewill be um uh vulnerabilities detectedover time so probably you will um usesome back best practice to scan theimages regularly then you will probablygenerate the vulnerability report forlater analysis so for the containerimages and vulnerability report you mayalso want to sign with your enterprisekey use uh not project tools so that itcan show it as a proof of use for uhinternalte��ke thisyear uh a lot of people are working ondistributed inference right so we needto kind of find a a way easier tomapping this deployment uh pattern ofthe workloads to better use theunderlying uh infrastructure basicallybetter align with the underlying uhnetwork topologySo uh let's deep dive uh little bit moreabout uh volcano uh key features Soactually we have been working on thisarea foruh quite a long time uh before the nameof uh this project volcano was usedactually we were uh a sub project calledthe cuba batch in uh uh kubernetesscheduling sik and uh uh during thattime we found that that actually youknow uh workload awareness of schedulingis very important it's not not just thepart by part when you do like GANscheduling and also like a lot ofuh batch workloads We found that uh wecan really reuse a lot of middle stateof the uh scheduling uh algorithms andalso uh we need to kind of cue a lot ofuh workloads and also you know deal withthe fair sharing uh between uh differentusersThat's why uh we kind of extended thethe scope and turn it to an standaloneum CNCF project So currently volcano isin the incubation level in CNCF and wealso uh have been adopt adopted by a lotof users and uh like uh this year lastyear a lot of more and more uh use casewere are working on the training andalso uh inference So for a volcano webasically support uh batch API uh likeuh volcano job pod group and also uh jobflow like Q and the new one weintroduced is uh hyper node uh hypernode Hyper node is a an uh abstractionfor the underlying network topology Wewill dive into uh deep later on and alsouh we have provided a lot of uhecosystem support Uh on the top you cansee actually the mainstream AI frameworkas well as the uh big data and also uh alittle bit HP HPC uh framework computingframeworks Uh we all uh provide verygood uhintegration Okay Uh so the one of themost important feature I would like tointroduce today is about the uh hypernode obstruction and also the networktopology whereschedulingUh so uh we know that uh especially fortheir hyperscalers and a lot of uh largeend users uh they are kind of uhdesigning their own AI uh cluster tomake more uh powerful uh uh helping thetraining workloads and also uh more uhlike this year for distributed uhinferenceSo uh like like I said before uh thereare a lot of uh research and investmenton the uh innode uh topology right andthis year uh last year a lot ofrequirements are focusing on the internode uh topology network topology forexample So on the left you can seeactually uh like Nvidia DJX uh they havethe concept called a super pot rightbasically uh group of nodes that areconnected with very high uh performanceuh GPU network we call it uh uh MVL linkuh MV switch and also infiniband and oreven uh roi and we found that actuallythere are also for some of the uh uh AIcluster vendor hardware solution vendorthey are trying to design uh similarconcept you know group uh a set of nodetogether define it as a high performancedomain and uh uh the underlying AInetwork GPU network might be a littlebit different in implementation but froman abstraction layer uh abstractionperspective there there are a lot ofcommon uh ideas So that's why we arethinking about uh we really should makethis abstraction layer uh clear andeasier for users for schedulers for theother part of the system to uh to bettereasierintegrate Uh that's why we startedworking on the uh the API called uhhyper node So basically it's uh itdefines a a group of node that has uhsimilar uh performance uh uh in the uhespecially from the network space uhperspective actually uh in in real worlduh use caseuh the nodes inside the one of the uhhyper node can be just in same flavor Sowe also uh from the API uh perspectivewe also actually designed this datastructure to to be able to be nested Souh like uh I'm not sure if my cursor ishas them Okay Uh so on the bottom layeryou can see these are actual uh nodesbasically uh Kubernetes nodes and we canyou know uh design all nodes uhconnected to the same uh uh le uh likethe switch zero this one level uh as uhone of the u�h hyper node right so solike this node seven node six they havejust the one hopyou know uh connecting with each otherAnd we also make it possible to definelike uh the nodes connected in the uh bythe switch one the second layer as a asa kind of nested uh hyper node make iteasier uh for users to mapping theirworkload in in the different layer anduh uh in the result it will be turn turnout to be uh multiple hyper nodes andactually multiple treesuh in your cluster you know you canactually define one of the tree from thedata center network perspective as atree and also like uh the GPU networkthe the infinity band uh perspective uhwe don't have the limitation you know ithas to be one tree if some of the nodesare not not connected into the same GPUnetwork that's fine you can just definemultiple one And when you're doing uhconfiguring your uh kind of schedulingpreference you can actually you know usethis field that we are adding to the uhu uh part group uh the net uh the thefailed network topology and you can alsodefine like the higher tier the highesttier allowed That means uh for examplewe marking it as two uh that means weonly accept accept uh scheduling resultswith notes in uh you know uh like S5 oruh S4 like this So it's easier for usersto easy map a group of pods that beinghighly uh heavily communicated with eachotherSo this is for the hyper node uhperspective So I'm I'm giving just theone of the example how a workloadperspective uh requirements uh or can uhthe configuration can be souh so actually like you know for uhinference like LWS leader work set theyalso have the concept of groups right uhto group a set of uh parts together anduh for for training uh we already have ajob or a volcano job that's kind of wethink that it's a a group of similar uhpod that can work together Uh we canactually map this group to um tier onenodes right Basically all the nodesconnected with just one hoop uh one hopAnd you know uh for training we can douh tensor parallelism uh in in in thisamong this uh uh group and also we cando uh data parallelisms uh uh acrossthese different groups So that makes youa um good clear expectationuh how the kind of latency between uhyour parts would be right when you arereally doing training and uh it stillgives the flexibility because theunderlying the actual network uhconfiguration I mean also the setup theimplementation can vary according to theuh hardware provider youchoose So uh uh a bit more about the uhschedulinguh over workflow or or the overwork proprocess So uh we are also currentlycollaborating with a lot of uh hardwaresolution provider to uh provide autodiscovery functionalityum for the different uh hardwaresaccelerators and also the even the um uhAIclusters right so uh users would able touh use the hyper node controller uhtogether with the uh provider plugin touh make it auto uh you know discovery uhfetching the underlying informationabout the the real uh network setup turnit to a bunch of hyper node definitionand become you know the uh basically thetree trees in inside the uh cluster andalso uh we are relying this mechanism toto collect the uh status information uhespecially health check between you knownodes uh through the uh switch like inthe large scale training we always metuh issue some of the switch or some ofthe cable uh connection So that givesyou uh you know a flexibility to easiermonitor the status and when uh userscreate uh workloads uh it's still justthe normal process The only thing we addis the a new field called networktopology And uh basically the volcanoerwill uh find the best way to uh map thisgroup of parts to the underlying hypernodes Okay So um actually for thenetwork topology aware scheduling wehave been working on uh for a long timeIn the very beginning we were just usinguh labels on node and also like in 2020we started to working on uh uhimplementing the scheduling algorithmsUh however there are some limitations ofthe uh labels on node that's why uhstarting from last year we we areworking on making it really a standaloneAPI So as you can see uh there are a lotof um advant�ages uh uh if we choose theuh the API uh mechanism for example uhit gives more clear semantics becauseit's a a clear um API with very clearyou know uh field uh however like nodesit basically there's no kind of bestpractice Um the node key uh the labelkey label value can vary uh you know bythe different uh provider right and thatalso turns out when people areconfiguring their scheduling uhconstraint it can be complicated whileuh with the standard API it's quite easyuh you know you can always reuse thesame topologyconstraint configuration for yourworkloads And uh uh because of the uhdesign of the hyper node structure datastructure it provides the scalability ofuh flexibility of the granularity uhgranularity You know you can uh uhconstraint basically limit the uh theworkload scheduling like spread acrossthe different uh hyper nodes that is inum tier one or tier two connectionthat's fine while node labels basicallyuh everything you need to do it byyourself and with the management likelife cycle or status management uh withthe simple unified API it's quite easyand clear you can check out like whichhyper node consists of uh which set setof nodes and also uh we can easilymonitor the healthy status while nodelabels you need to kind of always take alook update the node labels and alsothere's no where to you know uh trackthe status and uh uh represent uhhealthy healthy uh information uhproblem to thesystem All right So uh that's a littlebit about the uh hyper node thing andfor the rest of the part I'm handling uhmy colleague Shuen to give theintroductionOkay Next I will introduce some otherfeatures of cano and the first one is fof life cycle management and forrecoveryuh in distributed air training and highperformance computing environments uhport failures caused by hardware or orsoftware issues uh can disrupt thecompletion of a job and uh we can knowthe job life cyclemanagement enables user to define eventsand action to handle uh these failuressuch as restarting the entire job andrecent updates have further enhancedthis capability with multi-layeredresultpolicies and instead of restarting theinter job users can now choose to uhreset only the field port or one taskuh improving the job executionefficiency and additionally a timeoutsemantics is also supported if a portrecovers with a specified time timewindow uh the predefined action uh areskipped uh next I will discuss thelatest updates on GPU virtualizationuh given the high cost of GPU sourcesand low usage particular in AI inferenceWano offers GPU functionality to uhenhance efficiency supporting both WODAand makemode and WNO also provides a unified APIfor requesting fractional GPU sourceswhich is called W GPU memory and WGPUnumber uh allowing multiple ports toshare a single CPU C G C G C G C G C G CG C G C G C G C GPUcard Uh then I will talk the the edge ofscheduling in multiple clusterenvironment Uh more and more users areusing multiple cluster to manage theirworkloads while they uh use Cano as ascheduleuler in a single cluster uh toalso use volcano's schedulingcapabilities in multicluster environmentVolcano has incubated the volcano globalsub project for our multiclusterscheduling and uh including the cub Qpriority scheduling in multi-tenant andthe fair share and job priorityschedulinguh beyond AI scenario scheduling andresource management also offersadditional functionalities for uhunified workloadscheduling Uh Q is a key concept in uhresource management Uh a Q can beconsidered as a basic unique unit ofresourceallocation and often corresponding todifferent teams or departments Uh sincedepartments typically need to share orreclaim resourcesuh flat cues are not sufficient foreffectively managinguh resource sharing in hierarchicalstructures Uh therefore a more finegrained and nonflat structure isnecessary to uh uh uh to handle theresultation between differentdepartments And this approach can comeseven more critical when migrating bigdata uh from yarn to uh cloud nativeplatforms And a Q has three importantfields Capability which is a hardquarter limit well deserved which meansan elastic kota that can be uh reclaimedby other cues and a guaranteeing whichrefers to uh reserved resource thatcannot beshared and uh in the latest versionintroduces a a resource dashboard whereyou can view jobs and p groups and cuesand you can also check uh resource usageand the key fields withinHQ And in the upcoming version the Canodashboard will support the creationdeletion and updating and uh uh and allthese resources and pro providing evenmore control and uh flexibilityUh volcano natively supports batch jobscheduling and is fully compatible withdefault scheduledalgorithms And this allows you to uhschedule both batch jobs andmicroservices microservices in a unifiedmanner And additionally by colllocatingonline and offline jobs and also dynamicresource oversubscription can optimize resourceutilization while ensuring that theservice level objectives for uh onlinejobs aremet So next let's discuss the futuredevelopments of volcanouh distributed inference is a keyscenario and cano is integrating withthe later work set API to uh to supportthe gun scheduling andadditionally will support elasticreplica settings for workloads likedeployment andsite and uh enabling better gunscheduling for microservicesuh and for multi- tennis narrow becausewe are working on supporting uhdifferent scheduling polic policies fordifferent cues Uh we are also makingimprove improvements to uh discardingfeatures and the support for DR iscurrently in uhprogress and if you have any futurerequests or preferences regarding uhfacialpriorities uh please feel free to shareyour comments on these issuesUh since it opens source release Volcanohas attracted a large number of uh uhdevelopers and uh and users and it's nowbeing used in production by over 60organizations We like to thank all thecontributors and theusers and feel free to share your usecases on GitHub and our community isopen and welcome to and welcome to anyrelated questions orrequests and you can also uh contributefollowing uh our contributionguidelines and finally you can uhconnect with panel through our uh facialwebsite GitHub and uh Slackchannel Okay that's all Thankyou Uh so we still have a bit of time Uhif you have any questions please use themicrophone in in the center YeahHello Uh I have a question about thescheduling part I saw for the networkside using the network tire toscheduleuler between the cross switchkind of things But uh I'm running likeif goes to one uh let's let's takeNvidia GPU as an example It's 100 I saylike uh if you're using the one node80GPU M link situation like do you haveany special setup for for this this kindof a topologyUm so uh actually today we only uhdefine the uh is actually it's more fromyou know just the uh resourceperspective and the and the uh statusmonitoring perspective for theunderlying like uh setup It's actuallyfree for you to you know use and uh uhwe have been discussing with some of theadopters So they have basically some ofthe assumption or uh design pr uhprinciple like uh making the nodeconnected to uh like uh let's say uhtier one hyper node um all theperformance are the same uh I mean alsothe like the latency and bandwidth theyare the sameOkay I see Yeah because for it it Yeahit's still a little bit different likeif uh the the nodes if they connect withany link there they they are they're ina much faster path there and also ifthere's a real optimization there youmay not just schedule one one GPU out ofthis node and goes to the other it needsa special kind of configuration to makeit work I'm just curious like how thescheduleuler side do this kind of workOkay Yeah Yeah Yeah So currently uh weare more like uh because we we are juststarting this work So uh currently it'suh you know uh the resource uhallocation perspective like when you aredoing you know um for the workloadconfigur communicationuh we are still thinking about uh whatwe can help to do Yeah Okay Thank youGood pointThanks All right I think the time's overSo uh uh you're still welcome to reachout us or uh just join the community touh share your feedback Thank you2025-04-15 21:57:47.628708 ��Y.#��iAyCyezOTVU_Yhello everyone uh thanks for joining ourtalk So uh we are going to share uh theuh uh work we have done in the recent uhtimes Uh so basically it's uh also atalk about the volcano project So wewill uh focus more about the uh the newuh features we are working on to uhaccelerate the high performance AI uhmachine learning trending Uh my name isKevin Juan Uh unfortunately uh Williamdidn't make it here so I'm helping uhhim to give the this talk Uh personallyuh my background is all about uhscheduling stuff I started contributingto upstream kubernetes uh back to 2015and so also a lot of uh sub projects andcurrently I'm also working on the uh TOCto help uh the uh whole uh community Sotoday's talk is more uh based on mypersonal expertise and also as the roleof uh maintainerHello everyone My name is Shu Jun andyou can also call me Zir and I'm amaintainer of the candleAll right Uh so basically we we willjust uh cover uh the following part uhparts uh of the talk today uh a littlebit background and deep dive our uh uhkey features especially the new uh newones for AI workloads Yeah and also uh alittle bit more about the uh schedulingstuff and the uh the futureplan Okay Um so we all know that uh inthe recent years uh the rapid growth ofAI uh workloads become more and moreespecially uh for the LMS uh you knowthat uh and from scale efficiency or uhperformance perspective and more andmore uh advanced setup and thedeployment requirements have uh uh rahave been raised However for uh usersespecially like uh the data scientistsuh that they don't have much backgroundabout the um uh the infrastructure uhsort of things uh uh we think thatsimplicity is always very importantthing So it's kind of uh you know uh weneed to always consideruh exposing more the like the topologylike the underlying hardware thing aswell as uh providing a a more simple wayfor uh the users to useSo uh based on our observation and theresearch we think that there are uh twokey trends very uh closely relevant tothe cloud native AI infrastructureespecially uh also very relevant to thework we are uh working on So um from theresources layer uh uh previously we havedone a lot of research and also uhimplementation try out in the uh in nodetopology like new awareness and alsolike the the feature uh discovery thingand uh uh now and the focus uh and alsothe kind of attention extends to moreinter node topology thing like thenetwork topology stuff Uh with that wealso need to kind of think about uh howto better support you know heterogeneousuh hardware hoggeneous AI cluster fromdifferent vendor right uh from theinfrastructure management layer uh weknow that uh the workload have havebecome more and more uh complicated likeuh for the training more and more aredoingdistributed training and also li��d qualFourthly uh the distributed catchcapability uh can be used in theacceleration scenarios Uh fifthly uh theintelligence intelligent detering helpsto migrate uh data from the uhhyperformance media to the lower mediato help the business to uh cut thebudget Finally uh you can use cubs onkubernetes new kubernetesvs pluginNext I will introduce several end useruh cases where cubefs applied Uh thefirst use case is building AI storage uhbased oncubefs In recent years uh artificialintelligence has uh brought ustremendous changes In fact every uhindustrial upgrade brings bothchallenges and opportunities to storageuh from the parsers uh in internet areato current AI area Each time it hasbrought about improvements for storageuh such as the architecture uhscalability and the performance So inthe area of AI what specific challengesare wefacing at OPPO uh the file storage teamhas been supporting AI related businessfor over five years Uh the businessinclude uh computer version uh largelanguage model and model modality and soon From the architecture uh diagram wecan see uh some of the upstream and thedownstream components and sourcesrelated to AI in Kubernetes environmentsThe blue parts in the picture are thesections related to the storage The toplayer is the service layer for specificof AI products Uh the model layer is AImain workflows in the cloud nativeenvironment There are several uh storagerelated parts such as AI codes and theironline editing and sharing compcapabilities uh data integration AItraining workflows and uh modu modelstorage and distribution Cubfs has alsouh developed a feature that enablesacceleration framework The featureaccelerate uh data reading by profilechaining through adapting a storageplugin in pytorch Uh the bottom layer isthe inf infrastructure layer This layerincludes storage computing power as wellas some networkcapabilities Okay Next uh let's startmain process of artificial intelligenceand uh explain in detail is requirementsto storage Uh just as introduced beforeuh artificial intelligence can be uhdivided into uh three stages includingthe process of data processing uhtraining and inference Uh dataprocessing actually means that datasources uh we write data into storagesystem and at the same time uh a lot alarge amount of data filtering cleaningand uh uh will be carried out Uh herecubifas has uh the capability to uhhandle different data sources reducingthe uh business pressure uh of movingdata from a different uh storage systemIt can also handle a tremendous amountof this storage based on its scalablecapability The second stage is modeltraining Uh the most training scenariofor most uh training scenario thetrading is a repetitive process a largelanguage model may be an exception uhbecause he it can load all the data atone time but for a training process likethe uh computer version data readingneeds to be done multiple times with theI with a high uh probability uh herecubif has a capability uh to handle highthroughput with a low latencySee the third part is the infra stage Inthis stage the model generated in thesecond stage be uh used for AI platformIt is uh necessary to distribute themodels uh to the endpoint uhquickly The distribution prociserequires high uh throughput and shouldbe uh completed with a short timeNext let's talk about some of thechallenges faced by the uh AIstorage especially for the largelanguage model Uh to uh achieve a bettermodel there needs to be a large numberof tokens parameters and deps From thediagram we can see that the parametersand deps has been have been growing foruh many years uh this will generatelarge data sites checkpoints and modelsAlso the upward trend of these factorshas start to slow down uh to uh to thecurrent number is still quitelarge Just as the last page introducedto improve the uh just as the lastuh page introduced to improve the modelperformance a large number of parametersare used and it is it also requires alot to computational uh resources Asthis picture shows the computationconsumption of the model has been risingrapidly in recent years especiallydurin�g the training process or largemodels In the past few years most AIdevelopers has experienced the shortageof computition resources For atechnology company like OPPO we have toproduce a lot of uh computing power onthe public cloud This is not onlyconsideration for the budget but also uhfor uh elastic capability of computingpower Of course uh there emerged a lotof uh optimized algorithm to reduce itto a certain extent but the demand stillexists Using the uh computing resourceson the public cloud is a good choice butit's not an easy thing either Thecomputation transfer from private cloudto public cloud can be achieved uhespecially for for kubernetesenvironments However the storage istherefore there are several challengesin the geographic distribution of datathat is uh storage cost transfer cost uhand technical difficultiesFirst of all the cost of storage israther significant Multi-copies of dataof different regions means extra costwhich will put pressure on budget But isif there is a central storage and onlythe whole data transfers to thedifferent kind of pawns in differentregions the cost can be reducedsignificantly Secondly we need toconsider the transfer cost The first oneis transfer all the data and thentransfer the incremental data Uh howeverthe data transfer may not be needed bythe application in different regions Thesecond one is just transferring the uhhall data Despite this difference thereare still uh some common difficulties toaddress One of the main difficulties isensuring this consistencyConsistency is an important aspect thatneed to be considered for for multicopies It needs to be realized inspecific D transformation system Howeverwhen uh it comes to how did uhsynchronization ensuring consensifieduh the data just need to be updatedincrementally by the user's request andthat's all it takesAnother important factor is timelinesswhether it is incrementalsynchronization or the hotsynchronization Time lenis is a criticalfactor in data transfer in OPPO to meetthe uh training requirement to for AI Weuh solve it by combining public cloudcatching with self-built uh cloudstorage That is the full data is storedin the uh self-built cloud and the wholedata is catch in the public cloud Alsothe latency of the forced pool isrelatively increased The performance ofthe repeated R will be better uh if thedistance between the cloud center andthe self-built cloud center is shortthat the impact can be reducedfurther Nextuh uh let's talk about the uh modeldistribution Model distribution is a bigchallenge for uh bounty regions in OPPOuh the efficient distribution of modelsis also built on catch system So thisrequires the uh catch system to haveseveral features Uh as a chart showsuh firstly the system should have theability to distribute uh models promptlyThe cache system or cubs support passconfiguration When a file write uh rightwithin the configured path it can besynchronized to the endpoint uh of thedistributed catch system achieving aprivate warming uh during the rightprocess Secondly uh there needs to be aworking control system prevent theclient from accessing the steel datadata Certainly uh supports a priorwarming capability to enable better uhintegration with business operations Thepriming is a service the directory uh ofthe request to be uh performed is takenas a specific task to d and driven bythe measurement systemThe cache system uh also needs to haveuh se several technicaluh several technical points capabilitiesUh the first is high performance andhigh throughput Uh and the second isload balancing to avoid the situationwhere hot catches uh can't stand themassive requests as shown in the diagramuh up uh after the air training model uhis stored in cubifi it will be uhdistributed to terminals uh of theinference services through the cachesystem The dial cache uh system isdesigned based on um consistent harshingThe file is divided into uh multiplesegment and each segment is onemegabytes in size Each segment mapped toa slot partition It's harsher uh andmanyuh and many[Music]uh and many uh slot partitions aremapped to one uh ca�tch node togetherThis is the routing rule The distributedcatch uh supports uh both memory anddisk storage forms Uh in the benchmarktest each catch node within uh fourterabytes capacity of me disk can uh proprovise uh 50 GB network throughput Thebottleneck of the distributed cache liesin the band wise Uh the cubs distributedcatch uh uh takes multiple aspects intoconsideration in it design Here is someuh main featuresElastic cache cap replicas can beconfigured according to businessrequirement There are two main scenariosto use uh multi-relicas First thescenario is to massive request to holddata uh in the regular operation of theuh cache system It usually use a singlecopy Secondly it refers to uh multipleregions distribution of of cache datacan be spread globally based onmulti-relicaform Distance awareness means that in ascenario of multi- catch replica cachesystem allows the clean client to selectthe replica cache node with the lowestnetwork latency Additionally theperforming capability is currentlycurrently under development Therequirement runs as a task which can bescheduled systematicallyuh large modules large models and otherbusiness can be preheated before realoperation Finally in order to integratewith a cloud ecosystem to adapt to moreusage scenarios a community is seekingcooperation with other communities likefluid uh to provide uh uh servicesolution that combat backend uh datapersistent and the cloud uh data sketchaccelerationIn OPPO we uh used to use highperformance disk to support highthroughput in AI scenario But uh theannual cost of storing 10 pabytes ofdata inme is quite significant throughdata analyze of AIdriven business It hasbeen found that for most AI business 80%of the data will not be read withinthree months and typically uh how dataaccounts for less than 10% of the totaldata How to balance computitionalperformance and storage cost the bestway is to uh utilize the highperformance media to the writingpressure and support the hot supportreading and migrate the code data to thelower cost storage media that isintelligent uh data ting There areseveral uh tech technical points Thefirst one is the life cycle system whichis fulfilled by the uh module LC node ofcubis The life cycle system is built torun tan task and realize in uh thecontrollable and maintable tearingprocess as shown in the picture Thedirectory of the file system can beconfigured as uh as the tearing pathMoreover during the migration processclient leases are used to ensure thenormal read and write operations are notaffected For date security the originaldate will uh be retained for a certainuh period after the transfer There therea check for both sides to guarantee theuh dataconsistency Uh next uh let me introduceanother end user case that is computinguh compute storage separation incleanhouse Uh computing and storage hhas have different resourcesrequirements So the separation ofcomputation computing and uh storage isa very reasonable solution uh before theapplication of compute storageseparation clear house have uh severalpinpoints in storage as introduced inthe chart First uh the storage capity ofsingle node is limited Uh the inspectorshould pay attention to disk space andfinish the operation timely Uh secondthe s sur have to use highest capacitydisc for each node to ensure the systemruns smoothly the cost is significantThird on the other hand similar tosituation with capsity the diskperformance can't fully utilize becauseit can be accessed by it can only beaccessed by uh specific partitions Foronce the space is balanced betweenpartitions or nodes the data migrationand balancing will be quite complex andmay affect online sourcesAll the factors can contribute to stpoor stability and the factors relateduh to storage So it's better to leavethe professional things to professionalteams This is the first addition ofarchitecture to maintain uh clear houseoriginal structure The third layer use asingle replica model of of cubs to makesure that the volume distribution areseparate from each other CL clearhouseuse uh two uh cubs clusters to storedifferentreplicas This approach uh offers nearlyunlimited storage space and enables datasignificantly cutting cost At the sametime it adjusts to cleanhouse originalarchitecture While however it hasdrawbacks because we use two differentsingle replica volumes in cubifi storagethe consistency can't be guaranteed bythe storage because the two volumescan't sense each other Uh the storageplatform doesn't know uh therelationships of the two volumes Thecheck module needs to uh do theinspection periodically to guarantee theconsistency once there's a bad of Kubabecause of a single replica and can berepaired automatically Uh the repairdepends on the tracker module of clearhouse At the same time to enhance the uhrepair repair speed the bad disk shouldbe detected uh permanently back to thehouse What's more the inspectionsuh D traversal and repair needs a longtime The drawback of this systemarchitecture influence the onlineservices uh to a certain degree causingdelays in processing requestsFinally the clear house team updates thewhole architecture and uses the cubs asa shared storage Clear house doesn'tneed to care about data repairs comparedto the design on the lastpage The reliability and stability aremuch better than those old design withonly one replica And since the serverfully depend on cubast the pro to repairthis disk failures can be achievedMoreover the operation and maintenanceare simple A cubas can restore it on itsown with sr intervention leaving uhbusinessunaware The last uh end user case iscompute storage separation with SDK uhthe computer and storage separation inclay house is based on the fields clientand the storage is accessed by the mountpoint The fields is a standard tool toaccess storage However in reality fusecan affect the whole performance asshown in the slide The SDK has thefollowingfeatures has high throughput performanceand stability This is because keytotally run in the user mode and bypassthe kernel mode At the same time it wasmany uh constraints of fuse itself suchas the boot block size Uh moreover itcan also adapt adapt to some uh specialrequirements in the usermode What is shown in uh this in thefigure is a typical computing servicebased on tree Uh the this service storesboth the war logs and SST files uh incubif It's a typical append only writeapplication forstorage This is a firmware architectureof K value storage and it usually usedfor online services for massive keyvalue storage such as radius and rosbThe requirement of the uh stability ofdata storage is extremely highespecially for P99 requirement uh shouldbe kept within onemillisecond based on the uh requirementswe also have actually made a lot ofoptimization on the background uh Iwon't uh introduce it in detail hereuh last year the community uh releaseduh several wens including four officialuh versions and three uh beta versionsThis year Cubs already released the datatouring version and also uh and we willofficially uh release another threeversions Uh the first one is uh release3.1uh which mainly focus on distributedcache and is mentioned a lot in today'sspeech uh but this is not the finalversion Uh there will be an enhancedversion released 3.5.2 to as a secondphase of the distribute patch At thesame time the warden will be a stabilityenhancedversion including features such as uhmitic migrationuh as mentioned in the speech uh that wewill put uh the uh manos memory uh intoB storage to reduce the memory costs uhwhich is also a basis for hybrid clouduhduring the energy there are some aspectthat we want to schedule to do Uh thefirst aspect is uh the performanceimprovement and the uh second aspect isis hybrid cloud uh where we want tobuild an a complete hybrid cloud systemuh support the S3 outside storage and tomake the uh data flowfreely Uh thank you all for listeningpatiently If you have any questionsyou're welcome to raise them or contactor contact us through the Slack Uh Iwilluh do my best to answer and at the sametime I hope uh you can pay attention toCuba and it communitydevelopment Okay Thank you Thank you all[Applause]2025-04-15 21:57:48.191920 \\��/#��OA_rDE1PD5Z5Iuh hello everyone Uh I'm Li Chong fromOPPO Uh and I appreciate the opportunityto attend this conference Uh this is myfirst English speech Uh so for uh excuseme if I I have any grammar issues or badaccentuh uh I'm the maintainer of open sourceprojectKubas and uh I also in charge of thefile system of OPPOuh well let's get startednow today my topic is Kuba in actionempowering users through casestudies my speech will include uh threeparts first I will introduce thearchitecture of cubef Uh second I willintroduce several end user cases Uhfinally uh let's review the futureprospect ofcubef is a next generation cloud nativeuh storage systemuh uh kubas joined the saf in uh 2019 uhafter about uh f five years ofdevelopment Uh we released around adozen versions in the past few yearsFinally at the end of the last year wegraduated from CNCF The picture is fromthe CCFwebsite Uh in terms of the uharchitectureuh cubs is a independent and uhself-governing storage system Uh as canbe seen from the diagram it includesseveral parts Uh the part uh in the topleft is the client subsystem The clientis actually a very complex componentuh it provides capabilities fordifferent interfaces such as S3 ADFS andposics Meanwhile the client sideinteracts with the back end frequentlyuh and even gets the whole process ofour system The middle part on the leftis the catch subsystem Uh the catchsystem is a new designed system uh foradapting to different uh accelerationscenariosuh currently to support uh AI uh thecatch system is becoming more and moreimportant This is a main point in mytopic The lower left corner is ametadata uh metadata subsystem It'scarefully designed It has strongconsistency and be can be scaled easilyuh many system level functions like thetrash audit and autonomic posicsinterfaces uh need support from themeditativesubsystem On the top right is the objectaccess zone uh different from thetraditional object storage architectureCubeFS achieves a capability based onthe file system engine and this systemcan be considered as as a city proxysystem The system in the bottom rightstorage subsystem uh is a data subsystemThe storage subsystem uh uh contains twoengines One is a multi-replica uh engineand another is the usual codingsubsystem which is independent storagesubsystem using the coding algorithm Itit has its own metad data management uhincluding the inspection and managementsystem uh percent layer uh and so onNow uh let me highlight the features ofcubifi Uh firstly cubifi supportsmultiple protocols Secondlyuh has two engine storage enginesincluding multi-replica and coding uhstorage engine Thirdly uh all the maincomponent components achieved a strongconsistency based on raft an��e uh uh fields so a bit of adifferent perspective I came into NATSdifferently than mostum but uh nonetheless you get here andit it we find that at Sanadia it'spretty much a unified vision uh nomatter what field we come from soMaurice is going to talk to you a bitand then I'll show you some demosall right so uh first off some NATS 101who's already familiar with NAT or likeheard about it and okay quite a fewpeople um also like using it inproduction or maybe also not yet inproduction okay cool quite some handspretty pretty cool um so first prettymuch kind of the boring stuff like whatis the project what is it all about sonext slideso this is kind of a bold statement so Ikind of I'll try to get kind of wrapstory around around thisso kind of thinking back like it mightmight sound strange like talking aboutthis at a conference like this but likereally long time ago you used to be justyou rent a server somewhere you'd havelike one application run one databaseand that was it that like that was yourcompany and it was like so simple beforeand well in a way because like SSH intoFTP things have gotten a lot bettersince but in terms of applications arenot that simple anymore like I don'tknow kind of the use cases that we we'rehearing and probably also have variouslike unique cases within within yourbusiness um it's just really amazinglike how well relatively quickly likeall of these applications have just haveto be distributed in a way or like maybethey can can't always have an internetconnection if you're at the edge um butbut but but still like from this thiskind of initial perspective that thingsdon't necessarily need to be complex andif uyeah so kind of what what does like aconcurrent or what might the currentstack look like so you have varioustechnologies for well reaching out tolike from one server to another youmight need some some like maybe you needsome other other patterns so you kind oflook look around like what's what othercomponents are available you search fora different component you add it intoyour stack but then still you're you'remiss you're still missing somethingmaybe maybe it's observability orpersistence in one way or another andyou're kind of adding multiplecomponents and and to be honest it's notlike like any like this thing on any anyproject or something it's like becauseand I think I don't need to tell youthis but all of these projects are justgreat but it's kind of the thing thatmakes me think is like kind of all allthis this general pattern of I need toadd and just keep on adding more andmore components and things started tobecoming complex and kind of going backto that initial simple example of justone server or maybe multiple but likeyour application a database things weresimple and the things you were workingon were actually like maybe over 90%your own business thing your own usecase a thing that you'd like to work onand not necessarily like other likemaintaining other things or And it'skind of starting to look like the mostof the things that you kind of need towork on is like having lots of otherother components in there and also youthe thing that you would like to do i Ihope at least that kind of also dependswhat what you're working on but I if I'mmostly coming as as a developer andengineer I like to fix things for forbusiness and what what they're doing umso I'd like to spend my time on that notnecessarily on like any any other systemand keeping that up i'd like to I thinkyou get thepoint so kind of where where where do wecome from in terms of like what do wehave available right nowit's well we think it's kind of thesethese inherent limitations that you haveso things like the the onetoonecommunication the um just adding all theall these layers in terms of like thecomponents not necessarily specificcomponents but like in general adding uhmore layers of I have an applicationsomewhere I need to deploy it somewhereI need to have it highly available maybein a cluster I need load balancersproxies and the list goes on andon thingsare centralized and location dependentin a way even though we don'tnecessa�rily need them to be and if we'redoing like an edge use case or likelike maybe a car is driving around thatI mean it is location dependent in termsof like it's physically somewhere but interms of if you relate it to where thedata center is it's not static at all solike how do you then still uh figurethatout and it's just generally likemultiple technology that you need tolearn and not only as a developer butalso in operating it and yeahso kind of not necessarily going back tothat initial example of like one servera database which was probably SQL at thepoint I don't know uh and like anapplication and thinking about aboutlike what do we need nowadays fordistributed systems we're building thatright now and how do we simplify thatand we're I think we've learned a lotand also with the whole likecontainerization etc so things like andI'll need to check those slides becauseI can actually read it from there so uhlike services that are inherently justdiscoverable giving you the flexibilityfor premers to use any type ofcommunication pattern do you need toreach one service okay fine you can dothat if you need to have like onerequest and you have multiple servicesof different types that need to answersure then you do a like a request manypattern you don't need to send everyapplication a different message andthings well how it relates to NATS forexample decentralized doesn't need to bedecentralized but in a way where I can Icould be the one hosting some some someinfrastructure and I could give you thekeys to actually an account somewhereand I could give you somesome I don't want to equate that to likebeing an AWS in a way I'm not sure butjust in general I can give you some keysto an account and you can make users Idon't need to care about who needs toaccess your system and I give you thepretty much the ability to do anythingyou want in a way that I don't even needneed to know who your users I just giveyou some limits in terms of how muchinfra you can useum and things like intelligent routingand still having like again kind of thatedge use case of you can you have dataat the edge for example bringing yourservices not sending your data up to thecloud but bringing your services down tothe edge to do local inferencing or likelocal oh I kind of mentioned AIsorryum and of course like the internetconnectivity issue even with the likeI'm not sure how your experience waswith the conference Wi-Fi but especiallyon Wednesday I was really strugglinglike even loading the Google slides soanyway I think those things are changinga lot so I don't think we'll bedeploying on the cloud well I think wellwe will still no worries but in generalthings will be shifting more to the edgeas well and things like internetconnectivity is not a given in in such aplace and you you probably and it alsoagain depends like it depends on youruse case and what you're solving for fora business as a businessbut yeah and again nets provides asingle technology tolearn yeahso I'm not a sales person so I'm notgoing to tell you that net is the thingthat you need to use and otherwiseyou're you're I don'tknow if this talk just helps you kind ofrethinking like where can I simplify myarchitecture if you and I don't really Ipersonally don't mind if you would beusing nets or you wouldn't but I kind ofmake you think like where can I simplifymy architecture in a way can I and maybenets can play a role in it maybe maybenot but that that that's that's yourdecision and again I'm not a sales guybutyep so kind of uh why do teams love netsand like kind of there are thesedifferent kind of roles and in practiceand it also depends like in whichcompany you work for and like theseroles might be fluid in practicebut might be more fluid in practice butkind of for giving developers the toolsto use all these kind of differentcommunication styles using uhpersistence through key value storesopic storage all with just a single APIjust one technology to learn similarthing for architects like what's how doyou map the technology and theinfrastructure that you need onto youryour your business and allowing�architect uh architects to pretty muchdeploy infra or have infrastructure inmultiple clouds multiple geos at theextending to the edge and again whereinternet might not be available at someat some point and also kind of give itgiving the operators kind of the handlesthey need for like things likemulti-tenency like the thing I mentionedlike I can give you the keys to anaccount and you can make users um and Ican also give some keys to someone elseand they you cannot see what what whatothers are doing and also of coursesimilar important things like securitymonitoring and etcso what is NAT kind of going towardslike this this simplicity theme um soit's a really small one it's one staticgo binary that you need it containseverything if you want to cluster in thecloud you have a really like largeworkloads you can use it you can clusterit you have access to persistencethrough streams KVS etc but again thesame binary at the edge the samecapabilities that you can use for theconnectivity for the persistence all ofthat is just available and you will havethe ability to actually choose how touse it and how it fits to your businessuh yeah and of course it is so smallbecause like no external dependencies itsupports many OSS official clientlibraries also community maintainedclients and uh well the Slack hasalready reached over the 10K Slackmembers so that's that's prettycool so going through this prettyquickly um we finally done the NAS 211release who has been waiting for thisfor more than ayear well I was we're finally we finallyare there i'll go over it a bit quicklywe also actually on Wednesday wereleased a podcast that we did on itit's way way more details and I'm justnot going to be able to like fit it intothis time slot so if you're interestedin in these things and more kind of thedesign and like what went into testingand reliability uh watch that later butin essence distributed messages messagetracing your topology can be very largeand where does your message actually gowell tracing is of course like importantthing uh per message TTLs is a new thingso previously with streams you wouldonly have like one max age if you have aa max age of an hour a message would goaway after after an hour if you wantanything different then no sorry youwould need a different stream with a newmax age but now you can actually haveper message details and you can expirebased on whenever you need them to umpretty obvious one consumer pause impossuh jetream consumers you can pause andimpose them now instead of like needingto shut down your application or havingto change your application to actuallypause consumption you can actually do itas as a kind of a management ormaintenance point of view uh and youdon't really need to like adjust yourapplication uh consumer priority groupsis kind of this thing that we're goingto build on more in the future as wellone thing that we're we're not sure howlike where it's available right now interms of it will be in orbit but one ofthe things that it will have is likeuh if you have multiple clusters uhacross for example various geos oravailability zones and you have serviceslocal to uh also processing you need todo but those services are overloaded youhave a way to overflow pretty much uhsome requests to maybe a differentavailability zone or uh different cloudor whateveruh to ensure the load is still still soso balanced properappropriately um and batch direct getpreviously you could pretty much onlyget multiple messages if you actually uhcreated a consumer um or you could likeget a single message by with one APIcall but you never could get multiplemessages in a lightweight way withoutcreating a consumer and now you now youcan so you can send one request to getmany responses uh of like messages thatare stored in your stream or in youryour KV forexample and I kind of alreadymanaged uh mentioned this with orbitit's kind of this this you might haveheard or maybe not but kind of the thingis with the clients we the officialclients we provide we kind of wantfeature parity across all of them umwhich is hard if you get like acontr�ibution to any uh any one of theclients and we'd really like to say yesput it in but then there's so much workalso to get it in all the other clientsand even one obvious thing is like allthe languages are different and we'rekind of trying to be ideatic in in allthe languages so there might be somethings that work in Go or in Rust orwhichever language uh but might not workin another and we didn't really have aplace to to put that contribution wherewe do would like it to have it but alsofrom our end kind of being able toiterate over APIs was pretty pretty hardfor us and our solution is pretty muchorbit so every uh pretty much everylanguage can have an extension to the tothe official client you can pretty muchdo anything you want if anyonecontributes we can just accept it uhversioning and iterating over APIs is isway easier and eventually once like anAPI is finalized we can actually pull itinto the client and then we can actuallybe sure like it works properly it'suseful for all the clients and you canproduce move it into the client and onceit is in the client you actually can canguarantee stability you don't still needto change the APIum final things uh with the knackrewrite so knack is uh Kubernetes CRDsif you want to manage streams uh keyvalue stores etc that way did a rewriteuh has now proper control loop alsorelated to some changes that were in the211 um and also support for key valuestores and object stores if you want tocreate them i believe uh we also did aJavaScript client rewrite this is now uhpretty much the different transports ofdowo bun browser repockets etc are nowall in one repository instead ofseparate ones um and also if you onlyneed core messaging you only need thatpackage and if you only needing you onlyget that package so more bit moremodular as well um yeah and with thatI'm going to give it to Jordan I thinkyes I remember the slides i think youcan talk as wellhey so uh thanks so for the rest of theslides this is going to be less aboutour execution engine and more about howwe developed the product dog foodingeverything that server had i'm justusing the execution engine as a as a umas a reference if you have morequestions about that we can talkafterwards first off we have threeslides on different ways to likeconfigure NATS um as a cluster one ofthe most interesting things I'velistened to Maurice answer this questionabout a hundred times in our booth islike what does our referencearchitecture look like and the questionis that it's really hard because it'sit's different for every customer rightyou have retail that wants um to be ableto support a connection loss in theirstores you have things all the way outto the edge that need to be able tooffload at certain times whenconnectivity is there so the NAT serveritself runs in two modes one is a NATserver and one is a leaf node and withutilizing it's the same binary it's notlike you have to deploy multiple thingsutilizing those and and putting a littlethought into your architecture reallyextends the concept of a clusteranywhere uh we call I think um a clusterof nodes we just call that a cluster andthen we can do clusters of clusterswhich you call a supercluster so it'sreally whatever you can imagine we canusually find an architecture for it sothis is just a simple um demonstrationof you can see uh on prim and connectingthree clouds and that eventually reachesthe customer via web modal so web ormobile so you can imagine GCP Azure andAWS there als also supporting onprimwith uh resiliency between all theclusters another one we see is the edgedevice being the customer and uh likestore so we can have storefronts runningsmall clusters that provide that p youknow that uh disaster recovery and thatum uh you know backup to each otherwhile also uh reaching up into biggerclouds in the back end so you can havelike large clusters up in the cloud andall your stores reaching in that can bea typical hub and spoke type thing whereyou have a single thing and hundreds ofstores reaching in uh and the big onethe the one we were demoing in our boothif you came and saw it was this� is kindof like AI at the edge so we recentlydid a demo where we were able to doinference model generation datageneration in three different clouds allover a same uh a NAT subject busand because the underlying layer wasNATS the messaging system was able tomove between clouds real time supportmodel generation real time support datagathering via camera or web app realtime uh all with the you knowum without downtime moving you know whenwhen when nodes and clusters would go upand down so it really does show that uhkind of like NATS can go wherever youwant and that's one of the things thatwe don't do is like we don't try anddefine what your edge is you tell uswhat your edge is because we've havepeople that mainframes the edge we havepeople where an ESP32 is the edge and wehave people where a phone is the edgeand our goal is to meet you where you'reatso I'm going to give you a brief reviewof what Nex is so that we can talk aboutin and uh as an application but this iskind of how NAT the NATS ecosystem wasbuilt we had the connectivity layeryou've all if you're familiar with NATserver you've been using it it's it'spub sub it's load balancing it's servicediscovery it's it's all these things uhthat make our connectivity easy and thenwe lay we overlay uh jetream on top ofit that gave us KV that gave us streamsthat gave us object store we couldpersist data we could replicate datathroughout our cluster these all thesethings came for free and then we startedbuilding applications on top of it andwhat we found when we were buildingspecifically the execution engine is wedidn't go we didn't have to reach intoour old bag of tricks we didn't need adatabase we didn't need a KV store wedidn't have to pull in everything werealized oh it's already sitting therewaiting on us even from a securityperspective where you would normallyhave to do things like u ooth well wehave you know off call out which is uhthe net server supporting um ooth stuffwe also have uh subject level uhsecurity which I'll show you here in asecond which essentially mean uhenhances that multi-tendency uh conceptAnd you know it'll feel a lot likenamespacing here in a minute and theimportant thing to say is no code waswritten for that we leaned on NAT serverconfiguration for all of it and you'llsee in a minute it's prettypowerful um so going back with what uhMarie said a second ago Orbit so Orbitwas even new to our team it it onlydropped maybe a few months ago and wecame in and we're like we we started tolook to see what all was there andthings like requestmini we already had the capability toput out a request listen for a bunch ofuh responses and then use some contextand some timers and kind of kind of dothis ourselves but what we found is alot of people wanted to do that a lot ofpeople want to requestuh you know do scatter gather request auh a message from a bunch of serverswait some you know expected amount oftime and then stoplistening so this simple request miniconcept in theorbit removed I think a few hundredlines of code and fixed like 14 raceconditions that we had because we hadwritten our own loop and and we were notdoing and we were not uh closing ourcontext correctly this solved all thatfor us so we found very quickly that theuh the many is better than the few whenit comes to things like you know commonpatterns like request a scatter gatherin uhnext or in nats and then yeah so that'sthe whole trusted code it it comes froma community and and many people areusing it so these bugs tend to bubble upa lotfaster and this was an exciting one weonly very recently since 211 landed wewewe turned ended on in the uh executioncode and we started using it to like tryand follow our request so one of thevery first things we do when we deploy aworkload is we said we auction an entirecluster and we're like who can take thisworkload and it was kind of you canimagine it it goes out the serversrespond if they can but what was prettycool is we turn this on and all of asudden we can start to see our requeststravel through the cluster so you cantell I live pretty close to AWS West sothat'�s where I ingress into the into theour cluster but then I have nodes in AWSor in Azure and GCP you can see ittraversing the gateways between all ofthem going out asking those nodes thesame question and returning back so thishas actually helped us do a lot of uh uhcommunication based troubleshooting fromon on our endand let's try ademo talk while I set thisup allright so what I have right here is Ihave I have NATS running on the left Nexrunning on the right nats is runningpretty much as bare as you can get itall the defaultsettings and what we're going to seeThank you man is I'm going to dosomething as simple as deploying aworkload i don't know if I'm doing thelogs but you can trust me so whathappened was I I deployed a simpleworkload it's a counter and I'm going todo to list the workload foryou and this all kind of works exactlyhow you would expect uh you deploy aworkload you list your workload and andit's it's there but the problemwith a default server is if I go over tothe other user and I requestYou can also see my workload somulti-tenency is not working everyonecan see everything there are admin typeuh commands you can also run uh that ifyou were to run those aswell that's not how you spellnode i do have one bit of security inthere all right so those are admin typecredentials that normal user or admintype functions that normal usersshouldn't be able to do so now we wantto lock that down but what we don't wantto go do is add a whole bunch of customlogic to our application so let'saddress it in a NATSconfiguration and what I'll do is I'lljust show you kind of what I'mdoing so we're going to create threeaccounts here an admin account and twouser accounts in the default versionit's pretty much one one plane no nocredentiing but when you look down hereyou'll start to see that on our controlsubjects we're going torequire that the account that they'rethat you're using to do the requestmatches the account in your credentialsand this is what empowers all thesecurity over like a NATS based subjectmapping so with a little bit ofarchitecture and making sure thesubjects make sense you start to get umsecurity for free so if wedo that we'll turn next backon all right Mr we're going to start theexact sameworkload that's the wrong commandso I think I actually I actually havelogs dumping this time so you can seethe workload running right there and nowwhen user two goes to well when user onegoes to look at their ownstuff they have permissions to ask whatworkloads does this account ID have butnolonger does user 2 have that and you'llnotice it happens so fast because thatrequest isn't actually making it intoum into NATS the export that it has isnot even allowed to do a request on thatsubject so the server shuts it downimmediately um and you'll you'll seeeven when wego so we'll even try and do the um theadmin type functionality it shuts downimmediately because the users aren'tallowed to do that so what we prettymuch demonstrated is configurationbasedsecurity that no longer has to be rolledinto your application the NAT server cando it all uh which we found to beextremelypowerful then I think we're justhere so it's allyou so yeah that uh thank thank you forfor coming out like here are a bunch oflinks we we recently did a uh an onlineconference where we had a bunch of greatdemonstrations it's on um on our websiteit's on our website so if you go tosanedia.comuh you should see a link up top to watchthat where we had something like fivehours of demonstrations of everythingfrom 211 features to the executionengine running to another new thing wehave which is connectors which isrunning on top of the execution engineand pretty much all of the new NAT uh211 features that you could hope to seeso if you have any questions we're happyto have them uh feel free to send usyour feedback and thank you for beinghere[Applause]um one question what is uh a workloadcould a workload something be that I canreact on a event which is coming I cansubscribe and then I can react can I howa pattern matching which events arecoming combiningand is that what you mean with theworkloadssothe so pretty much what workloads allowsis multiple types so you can run uhpretty much any type in terms of likeyou can run some containers you can alsouse you run for example JavaScript we'rekind of working on that so you canactually uh kind of as a as a functionas a service as a I think that's whatyour your question was about or ohum I think when it's integrated in nutsit's something which react on events orcan react or it's triggered when anevent is coming and these things so yeahso the workload engine itself does emitevents so we are in the process ofmaking it event sourced and making it alot so you saw a little bit of delay inthe thing it is pubs it is request replyat the momentBut it is using streams to bring thewhole event sourcing type ecosystem intoit as well to be a lot more reactive sousing that is kind of where we'll get tothe more declarative type workloadseventually right now it's veryimperative i hope I'm understanding yourquestion correctlyyeah I think so yeah so I want to call aworkload or lambda serverless functionfor example if some event is coming or aspecific event so you have event filterif that event is coming with that yeahso you said if there's an event on a natsubject can the workload be triggeredyeah exactly yeah so that is actuallyhow our functions as a service will workuh we call them trigger functions you'llyou'll run them and they listen topretty much a trigger subject and you ifas soon as you send some sort of payloaddown that it will ex it'll execute thefunction and return it kind of like apretty much exactly like a lambda and aweb hook but all overnats okay thanks alot absolutelyhi one quick question um did you work onor any optimization of key value stostorage with nuts when it comes towasting itemsi I was trying to use nuts and um in myuse case it would be very handy but Ium find out that uh for example once youhaveuhone 100,000 items in in key value it'sreally slow to list or compact storageon the file yeah so the we've beenhearing this a lot and kind ofthe the thing that makes our KV uniqueis also kind of where this performanceissue comes in so having the ability tolike the delete and purge markers uhthat you have which allows you to towatch on a KV but also introduced likethis so per message is one thing thatthat that will hopefully solve that soyou will not be needing to do any manualcompactions anymore so when you purge akey you can give it a TTL and thenactually the data will just be beremoved which pretty muchwell let's let's see but it should solyou should not need to do manualcompaction anymore and it will be solvedin that way uh so you could just sayprobably you're using like a purgealready because you're using the manualcompaction so you give it a DTL and itwill be removed automatically and thatshould ensure that this the stream iskept at the at a reasonable size and itit it stays snappy but yeah so permessage DTL is like a feature that we'reuh one use case was also being able tooptimize this okay thank youallright you have a had a question or Yeahjust a quick one um so the workloadthing you demoed what's the differencebetween that and Nats microthe so that's actually all built on topof NATS micro um it's uh the the conceptis to run NATS micro you you build yourthing it has the micro and then you haveto run it somewhere we're trying toreplace that run it somewhere for you soif you build if you build your own NATSmicro instead of putting it in an EC2 orECS or wherever you can tell theworkload engine like deploy this for meand it'll go find a place to put it foryou uh what we're not is another runtimewe don't need another runtime we havetoo many but what we are trying to be isa a control plane for artifact type soif you have an OCI we know 50 ways todeploy anOCI we can do it in the cheapest way foryou based on some analysis or we can putit in Kubernetes for you or we can putit in ECS for you that's that's the holewe're trying to fill we're Yeah we'renot trying to be another runtime okaythank you absolutelythank you everyonethanks2025-04-15 21:57:49.031088 ��_0#��uAMHfDvUUJ14Ihello everyone welcome to the NASmaintainer track talk about the NATstack first of all it has been a very atleast like for personally speaking hasbeen a pretty long conference i've beenstanding on my legs for all day i hopeyou're you're feeling a bit better uh Ididn't actually know that my legs wereable to like stand for that long uhanyway um yeah talking about again nexttalk let's first do some some intros andwe'll get right into itso uh for some intros I am Maurice Faini am from the Netherlands maybe anyDutch people in here or look at thatthat's great it's it's it's it's prettyfunny like we have so like pretty muchlike I'm senior software engineer Snadiawe are a remote company we have prettymuch people all around the world prettymuch so it's pretty funny when like ourour French colleague can can just talkFrench and like there are some Dutch orBelgian people anyway that's fun stuffum yeah and so pretty much I'm a NASproject maintainer i've been working alot on the server the last few well I'veactually joined Senia relativelyrecently like a half year ago a littleover like nine months but I've beencontributing prior to that for yearsalready uh and pretty exciting as wellsince uh prior to last CubeCon Iactually became the NATS ambassador ofthe project which is still kind ofexciting to me but I don't know uhpreviously I've been a software engineerum yeah fun fact also being pretty muchat the same time uh teacher at theuniversity so I uh hope I kind of kindof do a good job at presenting and ifnot let me know[Music]awesome anyone from Colorado here nopedidn't think so um so my name is JordanRash i am from the States i'm a softwareengineer at Cenadia uh my primary rolethere is I'm uh one of the maintainerson the NATS execution engine how manypeople knew that Cenadia was getting inthe business of an executionengine it's new so uh we're going totalk about it a little bit today uh notas a product but as what uh features andlike perks it gets on top of the NATserver for free when we starteddeveloping it we foundum building a product on top of ofNATSuh gave us a lot of really cool featuresthat normally as an app developer we hadto think of that we didn't have to thinkof anymore so we'll show some of thoseand and before this with cyber securityand defens��ormation about what thespecial interest group of uh of ours diduh you can find the links in the in thepresentation that is also linked in thein theschedule so like i mentioned over thepast 12 months or even before i'll jumpuh quickly in the annual reports ifyou're also interested in what the sigapps was doing over the past 10 years alittle bit of in-depth history of ourspecial interest groups as well as theentire kubernetes project in the annualreports there are two presentation thatwe did last year during cubecon in parisand in salt lake city which was going indepth into the history and evolution ofvarious apis various resources and theproblems that we encountered along theway so if you're interested into that uhi will also welcome you to to have alook at the annual reports or just lookfor the presentationuh from past cubecons but now focusingon what we are currently working on andwhat we've been in the in the progressfor the past 12 more or less months someof those features there are in theparenthesis information about theversions maybe you've already had anopportunity to play with them if youhave any feedback let usknow okay i'malive the poster not necessarily um soquickly i'll go through them try toexplain if you have any questions orsuggestions let me know we will have ihope a couple of minutes after thispresentation i'll be also hanging out uharound the the hallway for a couple moreminutes if you have any questions solet's quickly go over the uh the stablefeatures that we released over the pastyear the first one uh was an interestingone so pot disruption budgets the waythat they were originally uh created thegoal of the resource is to protect yourapplication through labels and uhselector labels and number ofapplications it will always ensure thatthe number of specified replicas isavailable and during eventual drainingof a note it will stop or preventdraining from continuing until we have anew pod uh replacing there was a tinylittle bit and we had uh multiplearguments back and forth with regards tohow the pod replacement uh poddisruption budget was counting the podsoriginally it was counting both readyand not ready pods so assuming and i dohope that each and every single one ofyou have proper healthy and readinesspots defined in your pod once a pot isready as in past the readiness there isa difference between a ready and notready pod but before this change we werecounting both equally normally only theready pots should be counted towardsyour uh pot disruption budgetand but because kubernetes one of theimportant goals is making sure that thebackwards compatibility is not uh is notbeing broken we needed to introduceanother field in the pod disruptionbudget which explicitly allow you topick one solution or the other and theaddition basically there was that i'monly interested in the ready pots in mypdbs to uh to count towards thedisruption budget that means if your potis not ready it can be evicted uh whichis important in some cases and allowsyou to save extraresources um i i know that we found ituh very useful in a couple of uhinteresting cases another one wasactually long overdueum a lot of users came to us with theirrequest to be able to randomly pick podswithin a replica set when we weredownscaling normally there's a half thealgorithm that that basically tries topicks the newest pods created andreplaces those but that's not always thebest solution especially if you'rerolling your application and you'reactually trying to replace older withnew ones it would actually pick thenewest one first currently we'veimplemented a little bit more uhrandom randomization in the algorithmand uh that allows us to have a littlebit more[Music]um control over how the uh the rolloutis progressing in the replica set movingon that was an interesting uh use casefor stateful sets if you are aware eachand every single stateful set definesbasically pods and each pod has aparticular number in a stateful set umand it's always starting from zero up tothe number ofreplicas for the cases where you'retrying to migrate your stateful set fromone c�luster over to another you actuallywant to maintain the numbers consistentwhen moving from one to another but inparallel to that you want to stillmaintain the entire uh stateful set as asingle unit um watch from outside of thecluster so this uh future specificallyallows you to say oh my replica setshould start numbering from let's saythree not from zero but three becausethe zero and one already move to thenext cluster and this way i can slowlymove one by one pod from one cluster toanother or for storage or for uh forvarious migration primarily uh themigration was the the main use case thatwe've been uh investigstigating uh the last one in ga ibelieve that's the last ngaum of course it decided to blink down sothe last one uh six storage actually thepeople within stick storage reach out tous normally when we are creating astateful set originally when we createdthe stateful set we decided that wedon't want to uh break users and wedon't want to touch the pvc's created bythe stateful set this was to ensure thatthe data that is being managed by thestateful set is secure for the entirelifetime so we never do anything and ifyou were either removing your statefulset or migrating whatever you were doingyou are responsible as the owner of thestateful set to remove the pvcs afterthe stateful set was removed oreventually uh migrated there were someuse cases and with the help from thestick storage we've actually implementedan explicit policy so you have to beconfiguring your stateful set andclearly point that during either a scaledown or entirely during a removal it'scompletely fine to um to remove the pvcsalongside the stateful set so this is ayour explicit decision to touch the pvcor leave it as is and if i remembercorrectly there are we always we evenwent as far as there are separateconfiguration for scale down andseparate configuration for delete so youcan specifically say that it's okay todo it only when removing or also uhduring the scale down so you have uhthat kind of flexibility oh actually wehave a lot more related uh ga featuresand a lot of the if younotice a lot of the work around the sigapps currently and especially within thepast i would say two three years hasbeen driven heavily by the working groupbatch so those are people that helped usensure that any kind of workload whetherthat's a iml or any kind of uh high uhuh high performance computing that theyare feasible in a kubernetes cluster soa lot of work around the batch areaprimarily the jobs introduction of indexjobs or improvements in the performanceof handling jobs is coming from uh fromthe batch working group so a couple ofexamples elastic index job so index jobuh if you've never heard about them it'sbasically the ability to specify it's acombination of that that will beprobably the best analogy it's acombination of a job and a stateful setmeaning that each and every single podin an elast in an index job has aspecific index if that index fails a newpod will replace the same pod with uhthe same number so it will have a stableuh dns also associated with it if iremember correctly we're also injectinguh environment variable with informationabout your uh your index in a uh in ajob the elastic part in this particularfeature allowed us to scale or modify uhthe the index job but only when we aremodifying both the completions andparallelism because if you remember uhproperly the job allows you to modify aparallelism which is how many pots areexecuting particular job at any givenpoint in time and that is basically toallow us to dynamically scale up anddown the job depending on how busy ourcluster is or what's going on with ourclustersso and completions are usually fixed inplace at the at the moment when the jobis created for elastic index jobs andfor some use cases uh when using indexjobs primarily again in the area of ummpi or hpc computing it was uhreasonable for us to say oh i want tocreate this many this many pods but oncewe reach a completion that is not fullydefined at the moment when we arestarting the job or that is definedoutside of the kubernetes cluster within�the job itself we are able to modifycompletions and parallelism together butit has to be the same number at andmodified at the same time um aninteresting thing and that's actually avery simple feature but a lot of userhave asked uh for this thing for quite awhile when we are creating jobs fromwithin a cron job a lot of users wantedto have a clear information when thisparticular job was meant to be createdwithin a cron job there is an ability todecide that oh if something happens inmy cluster for whatever reason i don'tknow whether a cube controller managergoes down or you turn off the clusterfor the night i'm not asking i'm notjudging whatever you want to do do do umand the crown job controller kicks inagain you can tell the chron job to ohif i'm delayed with my uh my schedulefor this long it's still okay to createthe job for me that's an explicitseparate feature and a lot of peoplewanted to know still within the jobcreated how big the delay is what is theactual time why my job should have beencreated so we're currently injecting anannotation into a job so you can inspectthat information and eventually dowhatever you're you're pleased with thatuh with that uhdate um the next one is also verysimilar to to the previous one uh thistime around and this is what i wastalking about we're um injecting anannotation with information about yourpot index in a both stateful set and inan index job which was a request frommultiple users again similarly fromprimarily coming or gathered around thebatch workinggroup because they want to know that ohyes i'm i'm index five and eventuallywhat they want to do with that theyfrequently well i had an example that ibuilt like multiple years ago when i wasinitially introducing jobs uh you canchunk your major for example um imageprocessing pipeline into chunks and youactually want to know that your index ina pod is number five because the numberfive bucket is the one that i have topick up so you don't have to manually uminject that information into the job butjust read it from theannotation and a couple more uh gafeatures and something actually that islanding in in this new upcoming uhkubernetes release uh one is a backofflimit so in an within a job and thatapplies to every single job uh we allowyou to say oh i can retry a particularjob a couple of times but that limit isspecified as a global limit for theentire job so if i'm running i don'tknow a couple hundred or a couplethousand and my back off is and thedefault is six if i remember correctlythat might not necessarily be enough andthere will be only it'll always be uh ona at a global scale of the entire job insome cases especially for the index jobyou want to be a little bit moredescriptive and you want to have thatkind of a limit applied only for aparticular pod this allows you to ifyou're running um your job againstseveral clusters and one of the the podswill land in a faulty node for examplethat one faulty node will only hit thisuh kill this particular pot and get itreplaced rather than killing the entirejob because you reach your back offlimit so it's always currentlycalculated per particular index of ajob um the next one job success andcompletion policy uh before that wealways said your job has to complete andit has to finish everything uh meaningall the pods in a job have to be uh haveto reach a completion status with someexceptions currently we will actuallyallow you to define an exit criteriawhich will uh be which will allow you totell that a job is finished sooner thanactually all the predefined completionsnumber is reached so that's also in somecases useful for the batch use cases soan interesting uh featureum another batch related uh topicsprimarily this is coming from peoplethat are working on q does anyone of youheard about the q project the q as inwritten by k k u e u uh it's a projectone of the uh sig app sponsor projectwhich helps with queuing the workprimarily for the batch workloads againum their use case that they have andthis is not the first time where we havea similar mechanism in kubernetescluster uh endp point slices �and ipaddresses is a similar we basicallyintroducing a an annotation which whichsays this particular job is handled byan external controller usually in theannotation you'll put the name of thecontroller but that is assigned to thebuilt-in job controller within thekubernetes cluster not to run the job atalli know it looks super simple from theoutside because from outside it's ohi'll say just this and the controllerdoesn't do anything but actually uh whenwe allowed third parties or external jobcontrollers what we wanted to ensureisthe the correctness of the statusesreported by external uh controllers inthe job status so we a lot of the effortthat we put into implementing thisparticular feature was around ensuringthat the status validation for a jobobject is properly defined and encodedin the kubernetes api server to makesure that not only the built-in jobcontroller is behaving in a specific andand consistent way but also for to forceother users or authors of an externaljob controller to also make sure thatthe the controllers abide by the samerules and are following the same uhlimitation this way users of the jobobject can ensure that whatever they'reusing the built-in or an externalcontroller they get the the sameconsistency with regards to this uh tothe job status they primarily theirtheir main use case how they are usingis in a case when you're running in amulticluster environment you can have aa primary job defined in your hubcluster which serves as a central pointfor holding the information about thestatus but the actual execution is beingum is being mirror or placed insubclusters or even can be divided inmult into multipleuh different clusters so that's the kindof approach that they have in which caseobviously in the in the hub cluster orin the central cluster they only want tomirror the information about the statusand the progress of the job but theexecution is happening somewhereelse okayso what's in our p uh what's in our pipefor for the next couple ofreleases the job pot failure policy isan interesting development that has beenin the works for i would say more than ayear it defines entire sub languagewithin a job which allows users todefine when my job or when specificallymy pot should be retrieded or notcurrently we have mechanism describingexit codesconditions andum yeah conditions and exit codes comingfrom a pod which allow you to say ohyeah this is actually an expectedfailure um and it'll be fine if it fafinishes or this is not something that iwant to uh or this is some a an errorcard that i did not expect and i shouldprobably just retry this particular uhpod to bererun another interesting uh topic thatwe're we have in the books for for acouple of months and we've been goingback and forth is uh normally when youare rep uh when you are uh rolling yourdeployment it will try to allow umassuming that you specify the correctsearch and max unavailable it will tryto fill in and replace your pots asquickly as possiblelike i saiduh to match your search and and max andavailable uh configuration for yourdeployment but there are some cases forexample if you're running at your quotaor very close to it where that kind ofreplacement is not possible so we wantto be a little bit more mindful withregards to not pushing the limits as faras we want and explicitly request thatonly bring new pods when the previousone isuh reaches a termination phase so thisway we will allow you to replace one byone or like multiples in parallel in inwhich case if you're at quota let's sayyou have pi five pots and this is yourquota i will be only um allowed tocreate the new one when one of theexisting five will go down as in it willreach a terminating state so that that'sa we have a couple of use casesdescribed in the in the document particfor that feature so that's an um that'salso if you have any thoughts or or orideas around implementation of theseum we're all we we are very open to thefeedbackuh the third uh on the list is somethingthat is actually in the uh that has beendiscussed this week numerous times butalso in various um special interestgroups within the kubernetes project andwhich includes architecture nodeautoscaling um sig apps i can't rememberwhat else but there were like multiplesix involved the idea is currently uhnowhere in kubernetes node resource orum anywhere in the built-in resources wehave the ability to expressuh that we are in the process ofdraining a note how long it should handhow long it should take care when itshould be able to just forcefully removeuh start forcefully removed which potscan and which cannot be forcefullyremoved yes we have pdbs but pdb thesehave a very limited uh surface becausethey will basically prevent the evictionfrom happening but they will not allowyou to either um delay the eviction oreventually if your eviction processrequires a specific amount of time formoving your workload from one one pod toanother that information is not encodedanywhere and we are thinking anddiscussing how we could better in inwhat better way we could describe theentire draining process and eventuallyintroduce external actors which couldhelp us with draining the node andeventually moving application from onenode over to the next onethe last one on this slide is somethingthat um i'm personally very ashamed ofbecause that's that's kind of like atopic that that still has uh actually athree-digit enhancement number whichmeans it goes way in the back i forgotto check the actual date um but theenhancement is probably more than likefour or five years old the idea is uhintroducing max available for statefulset if you've ever used the maxavailable for deployment or replica setwhich basically allows you to set ohit's okay for my uh my my application toreach this number and this is either anumber or a percentageuh of unavailable pots during a roll outthis basically allows you to to speed upyour roll out to a newer version wewanted to introduce and we startedintroducing this functionality into astateful set but the problem is that umeach and every single enhancement in thekubernetes each and every single futurein kubernetes when it's being introducedrequires a very tight informationpresented to the cluster administratorthat the f that the future is workingcorrectly or is not and for the past 2or 3 years we are all struggling to comeup with a reasonable metric that willexpress the roll out of a stateful setbecause for deployment or replica setthat's pretty easy a lot of thosecontrollers basically will spin up asmany as many pots as you allow and it'sdone there's no um there's no externalparties affecting the roll out processin in as in from the controller point ofview in case of a stateful set at leastthe one where you're rolling one afteranother i'm not talking about theparallel uh code uh pod mode roll outbecause that's obviously that will allowto roll out multiplepots but in a case where you're rollingone after another the actual timerequire required to start theapplication will significantly affecthow quickly you can roll the entireapplication so with that in mind wecannot easily express the informationhow long the roll out for thatparticular or other application uh wouldtake because those numbers will differfrom application to application and thatbasically means that the metric is notreliable and will not allow us toquickly say oh this actually the numberif your your deviation is more than 10%from the normthat's bad or good and we are not ableto express yes the feature is workingcorrectly or no it's not and you shouldprobably roll out to the previousversion or basically turn the featureoff so this is where we're currentlycurrently dealing uh there are also uhproblems around like you see in thepicture we lost the main contributorthat was working on this feature uh theywere moved to to work on differenttopics so we're looking for newvolunteers to help moving thisparticular uh futureforward okay i'm literally right on timeum those are the information how to bestreach us i will be in the hallway for acouple more minutes if you have anyquestionsuh we are going i'll be able to answeruh them afterwards thank you very much2025-04-15 21:57:49.574395 SS��1#��YAKlsxQMfdKLwwelcomeeveryone uh thank you very much forshowing up and learning about sigaps i'mfully aware it's friday afternoon andthe one thing that is on everyone's headis i want to go home i'm exhausted theconference was awesome i hope and um butyou're all probably ready to just finishit off so i do hope that thepresentation will be a strong finish uhso let's start so who we areum my name is mache as i already saidand sadly janet and ken were not able tojoin us today at the stage uh the threeof us are in charge of the sig apps andfor the rest of the presentation i'lltry to walk you through what the sigapplication group is what we do and whatare the plans for the upcoming weeks andmonths um so how to reach us that'sprobably the the simplest and thestarting point we are meeting everyothermonday and the times depending on yourtime zone are listed on the web page thebest option also is to reach out onkubernetes slack six uh sig apps or amailing list if you remember from thepast we used to have a google groupsmailing list we migrated this over tokubernetes managed it's also easier toremember it's just sigaps atkubernetes.io so what does the sig appsdoesi will admit that the sigops has apretty broad area of impact into thekubernetes project theoretically we arealways welcoming and inviting a lot ofthe people to show up and present to ourgroup what you are doing even thoughprimarily uh the sig or special interestgroup is evolving around what thekubernetes core controllers and i'llcover a couple of them in a moment butif you're thinking about it that's likedeployments that's job cron jobs damonsets stateful sets even though we areprimarily responsible for the followingcontrollers we are always open tohearing from other people in thecommunity using kuberneteswhat kind of issues that they arestruggling what kind of problems thatthey are solving how they are solvingthem also if you are uh seeinglimitation in the built-in resourcesthat the sick is responsible for we aremore than happy to hear from your proabout your problems how we can solvebecause there's a lot of discussion thathas been happening over the years maybewe'll be also able to point you in theright direction or connect you with adifferent group different team differentperson that was already struggling witha similar problem and how they how theyresolved it in the past u aside fromthat um there's also something that wedo every single year we're doing like asummary inf��experiment in this one sig and then tellus what you learned and then we'll applyit across as well right yes that that aswell lessons learned and try and try andmove them from one one lessons learnedfrom one sigum to give an overview of the project umwe kind of structure the project in inis you want to do this one yeah allright i caught my breath so uh justbefore we go into this one um if youview it from the top we have the CNCFcncf well before that it's there is LFand then LF CNCF is one of thefoundations in uh LF and then withinCNCF there is so many different projectsKubernetes was the first one butcertainly you saw in the keynote howmany projects we have and withinum within Kubernetes uh essentially howdoes kubernetus report to the rest ofthe organization right like that's theway to think about it so uh kubernetusworks with uh the other CNCF uh you knowtags and working groups and uh with theTOC on different things but thenKubernetes itselfum you know we have a steering committeethe steering committee is not in chargeof any of the technical architecturedecision they are about the health ofthe project they are about likesustainability of the project uh theyworry about like the long-termconsequences of doing certain things ornot doing some certain things you youknow they serve as you know they go siton the CNC of GB uh governing board andadvocate for our project they go figureout like hey how much cloud credits dowe need and how who do we have to houndto get so they do evangelization they douh community curation and like okay issomebody being a pest in the communitythat you know on technical matters wealso have a COCCC uh kind of uh you knowbody as well but that is what steeringcommittee does and any anything that isarchitecture related um ends up in theSIG architecture right and the SIGarchitectureends up uh asking everybody to hey youwant to start a new group write acharter for yourself make sure that thecharter doesn't overlap with the othercharters that are already there rightand that is what we call sigs specialinterest groups so as you normallyexpect there is a compute there is anetworking there is a storage there is aUI there is scalability like those areall natural attributes of the thingsthat you use or you know naturalyou know outcomes of how we ourorganized oursel so one funny thing thatyou would notice is HCD is me uhmentioned here as a SIG how many of youknow HCD as a separate projectbefore right so why is it here right sothat was an outcome because HCD was nothealthy over a long period of time itlacked enough contributors and we werelike okay let's absorb lot of thegovernance and security and those kindsof things that you don't really need tostaff by yourself right uh and uh youknow the few of you who are taking careof C uh HCD please spendtime on the code please spend time onthe CI/CD stuff make sure it is healthyand make sure it is usable by uhKubernetes and it it is still standaloneyou it can still do all the things thatit used to do before but do it under ourumbrella so we provide you the cover ofthe rest of the things that you need todo follow our um you know processes forenhancements and things like that rightso that is one example of somethingwhere we provide an umbrella we you knowwe we kind of like so this is what sigarchitecture does right right so we youknow whenever there's a problem I'llanother example is like hey um when weall decided to do uh batch related stuffokay start a batch working group rightand then give them some room to likewhat does the API look like what do thecontrollers need to do right and uh youwhat does what what are the newresources types that will be useful foreverybody right so the working groupwill start working on those things andthen propose some changes and then soessentially that'sover these 10 years of cuminus we'veevolved to this point where we havethese six um that doesn't mean that weprune them we also prune things that areout ofd and uh you know that are uh waylong in the tooth and we wrap them upand say okay we are moving theseresponsibilities to either other� sigs orwe are winding the work down we arecleaning up the code so we do those kindof curation tasks as well we also havewe we also as a as an a project we haveworking groups which are actuallyintentionally temporary so if you if yougo back that to that slide you'll seesome some working groups on there andbecause we know that sometimesorganizations just become about prep youknow keeping themselves alive so we wehave these working groups and they haveexit criteria and um they tend to begroups of SIGs working together and thenthe code whatever they produce ends upback in a SIG so this this working groupis a temporarylike working group right rightso as a group under Kubernetes we alsohave to write a charter um to tell thesteering committee what we want to doright so you know it's checks andbalances responsibilities you know justlike we are asking other six to writecharters we have to write one too andthis is what we ended up writing in oursum as the our scope so uh how many ofyou know about the conformance programuh Kubernetesconformance right like it basicallyassures people that you go from onevendor to one another vendor distro toanother distro you end up gettingexactly the same thing right so youdevelop a workload on one place then youknow the conformance program is the onethat essentially like guarantees thatthese are the APIs that are availablethese are the things that you can expectfrom Kubernetes whether it is running uhon prem or in one of the major cloudsand things like that so we also like uhtake care of like uh imagine everybodycoming up with their own API conventionslike it'll look so disjoint becausepeople are working on these things overa long period of time and like you knowthey might do different things atdifferent time and like somebody mightfollow certain patterns and other peoplemight not you know know those patternsor like you know uh they are they don'tagree with those patterns and then theywrite their own pattern like imagine howthe APIs would look like so sigarchitecture says here is our APIconventions it doesn't mean that it isset in stone it just means that it's aliving document we curate it we takecare of it over a period of time as weevolve as our code evolves as ourcommunity evolves we we keep it up todate uh similarly for deprecation policyand you know u John's favorite uhproduction readiness he'll talk a littlebit about it at and uh you know one ofmy favorites which is the enhancementproposal uh at one point in timeeverybody was pushing code right youknow opening up PRs asking asking otherpeople in the community to review andapprove and like boom you know codelanded but then what was happening wasokay how are we going to take care ofthe code you know uh how are we going tolike make sure that there is some amountof like you know if something goes wrongis there a way to switch it off is thereu you know things like that uh you haveto think through what is its effect onsome of the existing features uh youknow how does it interfere with uhexisting workloads uh that are expectingsome things right so that's why we weare like telling people please write itdown please write down what you're goingto do how you want to dowhat milestones and like so we areessentially trying to do this uh andit's the cap is a living document tooright so you know you might say hey thislatest milestone I'm going to do theseuh four things and then next milestoneI'm going to do these next three thingsbut you couldn't do it because you knowwe are a calendar based schedule uhright like if your things are ready itgoes in if it's not ready it needs towait for the next one So uh we go backand fix the cap and say okay this iswhat we are going to do in the nextmilestone and also like if you learnsomething in alpha stage you go and fixthe cap and say okay we are changingwhat we're going to do maybe we'll do analpha 2 you know and so on so it's aliving document and everybodyessentially gathers around it people whowant to know what is happening go lookat it what are the PRs linked to itright and uh you Um that's how we end updoing en�hancements so uh we talked alittle bit about all these things and Iwanted to put up the links for each oneof them uh so conformance is importantAPI reviews are important uh weliterally have a set of people who whoare in in charge of like okay does thisAPI contract look good you know um canwe promote it from v1 alpha 1 to v1 betauh one or like do we have to do a v1alpha 2 because we found something uhyou know and you know things are notworking and like is there an upgrade uhprocess for this is there a downgrade uhsituation where people will want to goback because something was not workingso API reviews that's where it happensright like if you uh adding a field iseasy dropping a field is not easy rightlike and changing the meaning of a fieldis different as well right so we wetalked through those things in the APIreview process um and we already talkedabout the uh some of these rest of thethings um how many of you had to u dealwith a deprecated feature gate or youknow or a thing that got switched stuffum that you were relying on in some ofsome of these uh you know workloads okayat least 1 2 3 uh 4 5 6 7 so we do dodeprecation like how many of you had toswitch over from docker tocontainerd see that's an example rightthere's more hands uh now you know whatwe ended up doing so we went through aprocess and said we have to telleverybody that we are doing this we arewe have to tell everybody that uh youknow we need to clean this up and ourcode base is unmaintainable because wehave to support something in tree uh fordocker which we are not doing for any ofthe other runtimes so Docker seems to bespecial we need to make it the same ascontainerd and uh uh cryo so you knowdocker needs to move move out of treeand you know we won't be supportingdocker the way we used to support dockerin the beginning right so we wentthrough this process uh you know andthis is the same process that we followfor every other thing as well so thereis a you know slowly over a period oftime we try to take people off umexisting features um especially becauseyou know we are not going from v1 v1 ofcuminus to v2 of cubet you might noticethat you know we are doing we are still1.33 is the next one right uh in the1.34 is the next one uh we haven'ttalked about 2.0 like we always uh referto it as jokingly right like if we everget to 2.2 too but until then these arethe uh you know uh toggles we have righthow do we deprecate how do we design sohow do we move it up or down and thingslike that so uh wealso are like you know the last resorthere of uh um you know if there aredisagreements between different uh teamsdifferent SIGs who are working ondifferent things or like they don't knowwhat to do or like they need uh uh uhsomebody else to like uh look look atthe things that they are working on theycome to us you know we have a meetingevery two weeks and you know they put usput something on the agenda and thencome come and talk to us we have variousmechanisms for talking to each other butyou know that is one of the ones that uhwe tend to use uh we do a lot of asyncas well we have mailing lists we haveslack channels and things like that sousually by the time it comes to ameeting we've already talked about it inmany different forums over a period oftime between many different stakeholdersacross um different um you knowdifferent SIGs and working groups so umthat's basically how we conduct businessum in terms of sub projects we what aresub projects right so sub projects isbasically sig architecture is a bigcommunity but we need people to focus ondifferent aspects of the work that isbeing done in sig architecture so wekind of like divide them up into subprojects so we talked about confirconfirance a few minutes ago conformanceis one so we uh the people in thisessentially look at like okay thecurrent set of conformance tests is itgood are we missing any tests uh whennew tests get added um you know is it uhhas itbeen flakeless in the last 2 weeks uh oryou know 2 months or whatever right andthen u we end up like having somecriteria around that and uh if you neednew uh conformanc�e test then we tell thesigs and hey um you're thinking aboutthis new thing there's a new feature ora new API and like you know u you knowwe need to add a performance testbecause we want everybody to get thesame experience regardless of where theyare doing it right um code organizationum who here h um have Golang projectswith like vast vendordependencies lot of vendored code fromall over the place uh quite a few rightuh so this is something that you don'tsee but it is there right like when youuh when you uh use Kubernetes you'reessentially seeing um things that wereuse from across the Golang ecosystemso the code organization ensures that wewe are using as little as possible youknow just because you know there is uhyou know security bugs in in there andyou know there's duplication in there uhsometimes uh adding a new dependencyincreases the size of the binary morememory more CPU maybe uh one of thevendor libraries has in it that startssome threads and it is doing some youknow unnatural stuff so There is a setof people who go around like fixingthese things and some of the things thatwe work on in code organization takesages i have some charts I'll show you ina little bit so uh that's the example oflike the different teams uh John runsthe production readiness reviews um youknow Jordan uh runs uh some of the APIreviews uh I help out with the codeorganization so and so we have somespecific people doing specific thingsand you know we have uh some folks whohelp us um so that's the organizationmodel so API reviews we talked a littlebit about it already uh we do have aproject board uh if you're interested umyou know it is uh you have you know ifyou go through one of the issues and seeexactly what ended up happening uh thenyou will figure out like oh this is whythey did it right like so the ahamoments are endless um you know I hangout with the API review folks I don'twork on it directly but um if you shouldlook at the API conventions you shouldlook at the API changes um markdown filetoo because uh if you are writing codeif you're writing controllers if you'redoing uh work in the cubernetescommunity then you know you should knowthis uh by heart because you know thatthat's what we look for and that's whatwe whenever you have a PR open andthings like that you know we kind oflike make sure that you are followingall these things well and I would say Iwould say even if you don't want tocontribute upstream to Kubernetes ifyou're building anything on top ofKubernetes so you're buildingcontrollers for CRDs or anything likethat I would highly recommend you go andget familiar with the API conventions weuse they're all there for a reason uh sothis is kind of the institutionalknowledge of 10 years of buildingKubernetes-based APIs and if yourcontrollers follow the same conventionsit's likely that there will be ecosystemlike adv tooling and things will be ableto handle your uh your CRDs and youryour CRS and things better than theywould be otherwise and essentially don'tmake the mistakes we already made learnfrom what we what what we did already souh this is an example of the conformancetest uh we have made so when theconformance uh when we started doingconformance tests we did have a lot ofum uh gaps in there but we've closed thegaps uh over a long period of time uh itliterally took us years to make surethat all APIs that we publish are testedand in conformance um so uh happy to saythat um you know I think in 131 or 132uh we were able to close that gap andnow what we are doing is whenever weland new features new kepts um all APIshave to come and by the time theygraduate u from alpha to beta to GA theywill end up having conformance test sowe don't have to go back and catch uplateruh code organization this is my specialproject i love doing this it'smaintenance grunt work each thing thatwe end up doing um it takes a very longtime because uh it's spread acrossvarious u different things uh you knowsimple example from um that I can giveyou is like we some of the PRs and someof the work that we ended up doing incode organization spreads across likethe open containers runcy um containerdproject and you know hcd project and uhseveral other projects so um you knowthis is the convoluted graph that wehave and uh this is just part of thegraph um so this is not even the fullgraph um so I talked about cleaning updependencies so if you look at thebottom you will see all the versions youcan see the number of vended file goingcrazy and then the number of vendoredline of code you know it was going totouch 2 million lines imagine thatright what an attack vector right uhanybody who you know has code that endsup in Kubernetes um you know there's anattack vector right there so we've beenable to bring it down significantly bycleaning up pruning negotiating you knowdduplicating and things like that andlike if you look at the dependencycounts at the bottom you can see themaster uh you know extremely happy withthat result right now but doesn't meanthat it's the end of the road there'smore work to do uh and always um youknow we always look for help so withthat I will hand it over to John alittle bit more and I'll take a breakuh all right great so uh I think we havefive minutes or so left is that aboutright um yeah okay so um uh I'll try andbe quick because I we'll try and leave alittle bit of time for questions herebut um as Dim was saying we have thesedifferent sub projects each sub projecthas different responsibilitiesenhancements looks at uh how we actuallyadd features to Kubernetes and make surethat the work that people are doing umadheres to certain you know designprinciples and things like that umproduction readiness review is uh as weadd features to Kubernetes we want tomake sure that they evolve through aprocess they start in alpha they go tobeta they go to G but what does thatmean what does it mean to go from alphawhat does it mean to create a newfeature we have certain requirements tomake sure that we don't break yourclusters that you can turn things offthat you can downgrade that you candowngrade and then upgrade again um sothat's part of production readiness foryou and that you can monitor uh youryour yourclusters um so one of the big thingsthat as a project we're working on andtherefore sig architecture is involvedin um while we continue to try and buildreliability and trust that includesproduction readiness reviews as well asupgrades so you may have heard somethingabout this compatibility version um sothat's part ofthe SIG architecture is one of the SIGsinvolved in that but that's an idea forhow we can um decouple binary upgradesfrom API upgrades uh which makes itsafer to kind of you can do multi-stageupgrades more easily and then of coursewe're trying to work uh on howKubernetes relates to a IML workloadshow it relates to hardware how we canmake it most useful in those situationsum I'm going to skip all this and go toQ&A so any questions herelots more to talk about but we wouldlike to take questionsthere are no questionsall right well what I would like to sayis uh you know Oh here we have one allright microphone oh yes it's on solooking back at the beginnings ofKubernetes let's say the version thatwas still written inPython how much of that would make itthrough the current process if it wereto be validated the same wayuh yeah it wouldn't but but we are notthe same project right we didn't havemillions and millions of clusters andusers depending on us then right andthat that predates me in Kubernetes i'vebeen eight eight eight years I've beeninvolved and so even that you know butbut um yeah no so your point is I thinkwell taken that one it meansthat the the bar has been raisedsubstantially for what can get in thedoor and it has to be because we have tonot break you know you want your bankingapp to keep working and it's probablyrunning on Kubernetesthank you for your question yeah in inessence we've come very far Yes yesabsolutely absolutelyanyone elseall right well thank you and please comejoin us uh we'll put this deck up on thesketch and uh you'll see links for howto get involved um we always need moreuh more help here in Kubernetes thankyou2025-04-15 21:57:50.186127 ��h2#��AoLZ2EjjKibwuh hello everybody um welcome this isthe um SIG architecture discussionum so uh I seem to be one speaker shortbut we'll make the best of it um so myname is John Bellame um uh and one ofthe co-chairs of SIG architecture um andI'll talk to you a little bit todayabout um kind of what what SIGarchitecture is what role it plays inthe Kubernetes community uh and theKubernetes project and um the differenttypes of um issues we address and kindof things we we concern ourselves withand then of course I will give you apitch for coming and getting involvedand and helping us out because this isan open source community and uh wereally really need the help and can canuse um everybody'scontribution so generally ah here heis hi everyone sorry apologies I'm lateum so my nickname is Dims uh I've beenpart of the Kubernetes community forquite some time now anduh I work for AWS and uh what else i'mpart a partner in crime uh in variousthings with John uh and including thesick architecture yeah can you do acouple of slides and then so catch mybreath yeah absolutely all right thanksDims yeah um yes so uh yeah Dims and Ihave been working here in SIGarchitecture together for probably likesix or seven years um and uh soessentially we uh you know theKubernetes project as a whole has uh acertain set of goals and principles andit's kind of one of the roles of SIGarchitecture to help facilitate thosethose roles i mean there's other likeyou know other other SIGs and thingsthat also do this but we sort of focuson the technical architectural goals ofthe project so portability uh generalpurpose although these days we're alsoextremely uh uh focused on helpingKubernetes be more effective for IMLworkloads um but uh essentially thatcomes kind of in the flexibility andextensibility aspects of Kubernetes umso these are kind of the some of thegoals um there's also a set of communityvalues that we adhere touh and that goes into not only like howwe treat each other which is isobviously very important but it's alsoabout how we structure the project andit's it's a little funny to talk aboutthat in the architecture because youknow we're talking there about kind ofthe organizational structure of theproject but for those of you who maybehave been around software engineeringfor a long time you know there's a closerelationship between the architecture ofyour organization and the architectureof the software you produce so it isactually super important and we we werun into these problems right we runinto problems where a sig because it'swhat they understand and know will solvea problem there when really it probablyshould have been solved across three orfour different sigs right so part of sigarchitectur's role is a place wherethose sigs can come together and we cantry and help tell them you know go��ice does and thenum the ingesttor will send those uh uhcompacted blocks to uh some objectstorage of your choice Um so that wayyou could have long-term storage Um soin this case it could be S3 or GoogleCloud Storage Um and we'll talk a bitabout more about what are thesesupported uh object stoages Um and thenon the read path there's dashboards likeGrafana that will query Cortex so thatyou can query all the metrics that yousent it And um when you query Cortex youpass in a a a header which signifies uhwhich tenant you want to query themetrics for And there's a lot of cachingfeatures as well so that it doesn't haveto go through the whole query path toget the metrics that you want for yourtenant So how's it different from blankthat does something similar so withcortex all of the components that yousaw earlier are horizontally scalable Umso if you wanted to scale up some somecomponent there you could And there areno unexplained metric gaps So anytimeyou restart something there's alwayssome sort of availability or highavailability setup where something elsewill take on that load Query speed isalso a very important feature for CortexSo we try to optimize aggressively onthat And there are also strong stabilityguarantees So whenever there is a newfeature uh we always mark it asexperimental so that um users of Cortexaren'tsurprised Uh as and the CNCF project isalso backed by um many many developersfrom multiple companies So you can tellthat the community is quite vibrant andhealthy Um and here's just a quick tableof what Cortex is about Uh it it's uhhas object storage uh scalabilityperformance data resilience and dataavail data flexibility and it'scommunitydriven So this is our gettingstarted Um we have a page on our uhdocumentation that shows you how to getquickly started with uh either docker orkind uh whichever you want The dockerone has a um a single binary mode So youcan just have Cortex running as a singlebinary and docker and then play aroundwith that There's also the microservicesmode uh where you can have cortex brokenup into the microservices that it ismade up of and then see how thosecomponents interact with eachother But um I wanted to show you alittle demo of cortex in action So onthe right here I haveuh just an empty Kubernetes cluster Uhit's just running on my machine here andthere These three pods are just part ofthe Kubernetes cluster Um but what if Ilet me just double check that all thesevalues are correct here Sorry I wastesting this earlier So I need to resettheseback So I'm going to apply the this Helmfile which has a um a couple of chartsthat it's going to install U cortexitself in microservicesmode So Cortex like I said earlier ismade up of multiple uh components hereSo there's the ingesttor which storesthe metrics Um there's the distributorwhich is the uh um the front uh frontdoor for all metrics that get ingestedin Um I just have one replica here andthree replicas of the ingesttor Engine Xis like basically our um our HTTP routerhereUm there's the query which uh executesthe queries and all of these componentsare scalable here So um it it'll take asecond to load these all up Um I alsohave some Prometheus uh pods here thatwill be scraping metrics in the clusteruh and then sending them into Cortexwith their own tenants So imagine youhave a dev tenant and a staging tenantand you want to have isolation betweenthe two tenants you can have that uh andit's possible here Um and I'm going toshow you how Um so yeah I just havethese two Prometheus instances that aresending metrics to Cortex So let me portforward into Graphfana so I can show youUh there's also Graphana running in thisuh cluster So let me um uh create somedata sources and some dashboards in thatgraphana instance so that I can umvisualize some of the metrics that arecoming inSo um in this dashboard I'm showing uhthe ingestion rates uh for samplesbetween the two tenants So I have a devand a staging tenant Both of them shouldbe sending the samemetrics And I also have some uh ratelimits over here So each tenantcurrently has 25,000 samples per secondas thei�r limits And on the bottom I havewhat each ingestion or each ingesttorpod is currently ingesting per secondfor samplesUm the first thing I wanted to show washow I can dynamically rate limit atenant uh without restarting anything umon Cortex So what I'm going to do is gointo my Helm file where I have someoverrides for these tenants here So letme just change this 25,000 to2,000 Apply that againAnd you'll see that that is the diffhere is this change for the dev tenantfrom 25,000 to 2,000 If you notice herenone of these pods got restarted becausethe the these values that I'm settingare runtime configuration values So theydon't require a restart And each ofthese services are um uh programmed toreload these types of values um uh everyso often I think it's 10 seconds bydefault but I've changed it to 1 secondso that we can try to see it faster Soif you notice hereum the ingestion rate for the dev tenantis actually going down to 2,000 like weexpect And then over here you can seethe limit as well got set to 2,000 Andthen for further proof you can see theingest pods actually are seeing lessmetrics coming in which is which isworking So the the demo gods arepraising or are happy today I guess Sothere's one more thing I wanted to showand that is how do you automaticallyscale up ingesters So if you want tostore more metrics and to do that I'mgoing to set the ingesttor pod replicacount to four And you can have this bedone automatically you know based onscale or some other metric So let me goahead and apply thatAnd uh we should see one more injust podcoming upuh pretty soon here Cool And I'll loadthe logs here So you can see it comingup But I have the the logs here for thisingesttor pod coming up And then I alsohave uh this you know panel here on thebottom that is showing what is theingestion rate for each of those podsAnd if everything is working correctlywe should see those three ingesttor podsgiving those metrics that it wasingesting before into the fourth ingestpod that's coming up So we should seethe fourth ingest pod increasing in umsamples uh uh ingestion and then theother three decreasing as it's being umload balanced across the the four Sothat seems to be working as um asexpected here So you can see the the thenew one coming up and then the otherthree kind of releasing those ingest umthose those um samples that are comingin So that is uh what I wanted to showfor um Cortex and and how you canquickly kind of change things and seethings working in a multi-tenant wayAnd I'll hand it back over to Daniel totalk about um what's um coming uh or thesorry the team updates for the communityin Cortex[Music]Thank youCharlie Since the last CubeCon in NorthAmerica 6 months ago we've had onerelease nine new contributors three newtriagers 381 commits We've recentlyadded this new triagger team which is agroup of contributors that contributejust like external contributors do butwith more uh persistence and they have alittle bit more permissions to managePRs and issues to help alleviate theadmin overhead on the existingmaintainer team The current team of umtriagers is made up of Anan Raja GopalSunjin Lee and myselfIn this last release version 119 we havea bunch of newfeatures The first is that OpenStackSwift as an object storage backend is nolonger experimental Which means that thefour officially supported backends areAmazon S3 Google Cloud Storage MicrosoftAzure Storage and OpenStack Swift Thereare two new features for the CortexrulerThis component is the component ofCortex that evaluates Prometheus formatrecording rules and alerting rules We'vehad this component that runs as amicroser inside Cortex for quite sometime but prior to release 119 it has notbeen properly highly available Thanks tothe work of and uh apologies formisprononunciations Alan Pagio Yiji ChinWilbert Wool Sununping AnanRajop and Emanuel Loviche We now haveexperimental support for highavailability Unlike the highavailability that we use in distributorsand jesters instead of using purereplication we use a primary versus non-primary approach In this approach eachrule group is �assigned to a set of rulerinstances but is only evaluated by oneinstance in that group of rulers At eachrule sync which happens uh by defaultevery one minute the non- primary willcheck the health of the primary and ifthe primary is unhealthy it will thenon- primary will take ownership ofthose rules begin evaluating them So yousee two scenarios here Um the thirdscenario not shown is that if the bothruler one and ruler two are down thenext non- primary in the chain there isan ordering um imposed Ruler three inthis case will take over ownership andwhen the other rulers come back alivethey will relinquishownership This approach helps usminimize gaps when rulers restart andfail but also avoids redundant ruleevaluation So at any one time a givenrule is only evaluated by one ruler Thishigh availability is also availabilityzone awareThe second feature that we've added tothe ruler is the option to use theexisting query path This means that umrules evaluated by the ruler now takeextra network hops but allows you totake advantage of the caching and querysplitting performance enhancements thatexist in the query front end Thisfeature was contributed by Sunung JinLee and BinYi And now I'll hand it to also Danielto talk about more features in version119 and upcomingThanksDaniel Um so uh the next um feature thatwe released on monitoring team it's thepartition compactor Uh we have beentalking about it for a while We did uhdo a talk on last Kubiccom like it itwent very deep dive on it So if you areinterested please take a look at thetalk Uh the partition compactor mostlyhelp us with the limitations from castDB index which is 64 GB and it helpsalso if you your case has more blocks orbigger blocks to speed up compaction Uhthat was problems for us that we addedAnother uh feature that we are workingon recently not recently but we alreadyhave for a while is OTLP uhcompatibility uh we already had it forcortex but we are still working inimproving in adding more uhfunctionalities One of them is the maxuh request size limit uh that was addedto prevent for example out of memorymetric and out of memory uhissues Um we are also trying to be asconsistent as possible with PrometheusOTLP uh compatibility So we also addedthe target infometrics by default And weadded u a new proposal that was uhpassed on Prometheus and was alreadyimplemented Prometheus which is called afun a configuration called promoteresource attributes That configurationallows you to add custom labels for yourmetrics That helps a lot in OTLP casesbecause sometimes you want to add adifferent label for querying or for uhaggregating the data For example you canuse the labels from target info to dothat Um the other part that we arecurrently looking a lot on it isinjection optimizations focus on CPU andmemory Um one thing that we added asexperimental for 19 is something that wecall push workers uh credits for gRPCthey actually were looking at to thatand we are getting the idea implementingcortex basically the idea for pushworkers if you look at the p profitthere we have a place called new stacksand what's happening there is like whenwe al when we create new go routines wehave this new stack and that burdens sowhat we added we added a workers poolfor managing go routines so instead ofcreating new go routines every time ifwe have in the working we reuse themavoiding these new stack to be creatingevery time Uh in our in some use casesnot all but we are seeing improvementsup to 20% of CPU only gestures which ispretty good Uh so if you guys haveproblems uh with CPU sometimes take alook try to try to enable this and seehow it goes Let usknow Another is another another uhimprovement uh enhancement that we didit for ingestion uhoptimizations focus on CPU again isexpanding posting cache that's somethingthat was already added in star gateway Idon't know if you guys are familiar butthe idea is instead of caching theresult of our query we catch thepostings of the query so instead ofreceiving all the response from thequery we basically get the query and seewhich postings are going to be relatedto tha�t query the idea here is becausesome of the queries are very complex andvery burdening on CPU when they areloading a lot of postings or whenthere's a rejects that are want to matchfor a lot of postings Uh for storyatedthat was easier just because we don'thave the head block for inesture thatwas a little bit more complex because wehad to add some invalid uh cache forthat If we if we just added that to theinesture we could have wrong data Sowhat we did is every time that wereceive a new sample that matches aposting we invalid the cache for thatposting So it's not as good as storegateway but it also helps a lot Uhthere's a lot of graphs here but if youlook at the first one on the right forleft for guys uh you can see thedifference on CPU usage We also seeimprovement around 20% when we enablethis Uh so it's another thing that youcan try and see how it goes It reallydepends on the type of queries that youhave If you have a lot of reject queriesthat can help alot Um another part of the code that weare looking a lot uh is scaling cortexand a lot of focus on in gesture itselfAs you look at for the demo from tallyscaling up in gestures is pretty easylike we just enable it goes to the ringand that's it It works Scaling down is alittle bit more complex just because thein gesture for us is the hard uh datalayer So if we just scale down ingestures we are basically haveopportunities not opportunities we canhave issues about losing datatemporarily or permanently depends onhow we configure your context We do haveflags on configuration to avoid that Theproblem is that those flags can alsocause um query performance issues forexample high latency So what we did itwas we added a new state for in gesturesfor the ring itself it's called readonly but basically in gestures use it Umthe idea is uh if you have a new readonly gesture it's still active on thering is still visible for everybody butit's not receiving any information So ifyou look at the first graph there I havesome red lines The first one is settingthe inest only and you can look in thesecond um the second matrix uh below itwhich is basically shows the pushrequests for those in gestures going tozero So the idea is the data is notgoing to inest anymore but the data isbeing still being queried from ingesture So if you look at the the metricon the side you can still see that thequery request is maintains the same butthe push request drops So with that andadditionally we also add a new API on ingestures which is basically to getstatistics from in gestures The APIbasically tells you if there is anyblocks loaded on gestures So if with ifthese were the only in the API we cansafely assume when we stop receivingdata We can safely assume when we don'thave any more blocks on in gestures andthen we can scale down gestures becauseon that time we don't have any moreinformation to query on on those ingesturesUm another thing that we improved uh onthe release is 1.19 for those that useHA uh on Cortex which is a feature forCortexuh previously um we only accepted one HApair by request So if we have a requestwith a different samples but uhdifferent HA samples HA pairs webasically would get the first one andignore the rest So it doesn't work andhow you prevent that is basicallycreating different requests which is notideal So right now we have experimentialflag that you can enable which allows usto have mixed HA pairs Uh so the idea iswe look in the request for at each hapair for each sample and then you don'tneed to create a bunch of requestsThat's was something that was annoyingfor for userssometimes So now what's we are currentlyworking on and what we are trying torelease the next next versions Um we arestill working on the OTLP compatibilityWe already have OTP but we got mergedthe OTLP metadata conversion So now weconvert LTLP metadata also and that wasalready emerged so it's probably goingto come in the next version ofCortex Uh continue with the improvementswe are also working on anotherimprovement for inest CPU Uh the idea inthis one is instead of creatingconnections every time we open a streamconnection to between distributed ingestures and we leave it open we have akeep alive for that stream connectionand then we don't have the burden tocreate connections to to manageconnections that have time um from ourtests again we are seeing a 10% CPUimprovement this one is still open weare still having discussion about it uhso if you are interested go take a lookat at thePR uh another big topic for us is nativehistograms we already support nativehistograms in gestures in in cortex uhbut we were missing the custom bucketsnative histogram and that's somethingthat we are working right now uh andit's going to be coming in the nextfeatures in the next versions and latelyuh the big topic which is Prometheus 3.0so how we are doing on that we areactually talking a lot internally aboutit we are trying to support more andmore Profus 3.0 So we have a PR open forthe remote write v2 part of it per itbrings the headers the new headers thatprom the remote v2 has and it also bringthe string internally which also isgoing to help with the CPU performancefor in gestures Um so if something thatyou are interested we are havingdiscussions discussing how to implementthat incortex Um that's it for me I'm going topass for Frederick for the road mapUm thank you Daniels for the amazingupdates Um it's been uh it's been anhonor for me to uh work uh for Cortexfor a number of years I started as auser um then became contributor um andthen became maintainer and well now I'mhere talking to you about the project Umso well and what I talk want to talk toyou about is we have a number of users anumber of companies um pretty largeusing um and we need to organize to getthe project um in a road map that is uhcohesive right and this is uh what we'vedecided is that we we have this uhfeature of GitHub that we're going to beusing to handle the the projects lookingforward and we have organized all thefeatures in this um in in these bucketsUm I'm going to talk to you about onespecific one uh that has been long uhcoming and is the one about thegraduationUm the graduation um of Cortex means forus um uh a milestone It's not theendline Uh we have even more things todo but we we think the graduation iswhat we need to do be doing next Umthere's many adopters of Cortex publicand private Um and uh so we think we uhwill fulfill um all the requirements Umthe things that we're going to beworking on that that we are missing overthe course of this year is the thirdparty um security review and also uhdocument the road map uh change processum so that uh we have uh a moreinclusive governance um also in therelease management we need to finalizesome details um and um that's all I haveum we're moving moving into questions Umum thank you for being here Thank youfor listening Um thank you forcontributing to Cortex Um thank you foryour next questions[Applause]I think we have a few minutesYeah I think let me check Couple ofminutes Yes we have like Yeah seveneight minutes Two minutes Okay Okay Twominutes Okay Anybody has questionsanybody hasquestions this the mic if not okay Allright All right HeyUm is there any discussion in cortex toum I say the the storage layers thatcost storage is like S3 or or GCS and soon Um is there any discussion toconsider as inestion to analytic storageum no At this point no we don't have anissue for that No nobody has Um that'sthe first time I hear that but uh you'rewelcome to create an issue What'stypically happen is you don't know thatis other people in the community wantingone you want So the first thing I wouldsay I would create the issue Don't don'tjust ask the questions if you if you'reinterested Yeah Yeah Go ahead YeahAny other questions daniel do you wantto talk about the issues that exist thelast thing that I'll add is that if youare eager to contribute you are allinvited to check out our GitHub projectgithub.com cortexproject/cortex Um there are a lot ofissues and you can check out the goodfirst issue label You can filter bythose check those out Um we're happy tohave your contributionsThanks again for beinghere Thank you2025-04-15 21:57:50.792201 ��33#��A3aUg2qxfoZUso thank you for joining in I know it'sthe last day of CubeCon but you knowlots of folks I think are also streamingin from uh the keynotes So um welcomeeveryone Um today we are covering inthis session lots of cool things aboutCortex How many of you use Cortex todayawesome Very cool Uh and those of youwho are not you should definitelyconsider itUh so today we're going to talk aboutthe project uh provide some updates uhon you know the new features that haverolled out already this past year andalso give you an update on the road mapthat's coming forward for the thiscoming year I'm Alolita Sharma fromApple Uh we also have our awesomemaintainers on the project here Some ofthem joining in today Charlie Lee fromApple uh Daniel Blando from AWS DanielSabsay from Adobe and last but not leastFrederick Gonzalez from Adobe So uhagain you know all the maintainers havebeen working on this project for a whileand really excited to have them all heretoday So uh in case you don't recognizethem on stage here they are again Andand uh they will be covering differentparts of the agenda We'll be startingoff with an introduction to what Cortexis but also diving into the architecturea bit uh and kind of talking about thedifferent moving parts Charlie will bealso then diving in into a super cooldemo uh to show you how to get startedwith Cortex if you don't know so alreadyand uh kind of doing you know walkingyou through a couple of scenarios Uhthen the maintainer team will beactually providing an update on thedifferent features that have been rolledout for uh performance or you knowscalability reliability and others Andthen we'll talk about you know what'swhat are some of the cool features thatare coming in in terms ofinteroperability compatibility with newformats the and upgrades uh in on theprotocols Uh and last but not leastwe'll talk about the road map to uhgraduation Uh again the project islooking to graduate you know again uhit's super exciting we're at that timeyou know where we are just just kind ofgetting ready so Frederick will bediving into that and then if we have youknow a little bit of time please holdyour questions till then uh and if we dorun out of time please you know do findus hang out you know we are happy tohang out and talk more about things Soas many of you know Cortex is a uh veryhighly scalable uh specificallyhorizontally scalable highly availablemulti-tenant long-term storage forPrometheus Again it is you know a verypopular multi-tenant solution for mostlarge and small organizations that arelooking to isolate data and Charlie willbe diving a bit more into the featuresSo Charlie over to you SorryThanks Alita Uh so I'm going to betalking a little bit about Cortex whatit's about Um like Alita said it's ahorizontally scalable uh version ofPrometheus Um it tries to have thelowest response times possible uh forqueries and writes to it Um there are alot of strong stability guarantees whichI'll talk a little bit more about andit's community managed like you see hereIt's backed by multiple companies andthere are lots of diverse contributorsall around the world that are helping tobuild it Um this is the architecturediagram that is on our documentation Umif you've read it it's probably you'veprobably seen it before Um but at a highlevel there's the right path which isthe orange line here uh where somethinglike open telemetry or Prometheus sendsmetrics to Cortex and then it lands intothe distributor which is a serviceinside of Cortex which handles the ratelimiting and fanning out of thoserequests to an ingesttor which is thesecond part of that right path Um andthe ingesttor is responsible for um theingesttor is responsible for um storingthe metrics so that you can query themuh in like uh in memory and then whenthe ingesttor uh has enough of yourmetrics for um I think two or so hoursit will compact them just like thePrometheus um uh serv��cess APIsTherefore if an attacker can steal anaccess token from the client applicationthen this attacker can use this accesstoken to access APIsillegally So in order to prevent suchsecurity features the depot uh might canbe usedThe depot enable us to make an accesstoken the sender constrainingtoken which means that only a legitimateclient application can use an accesstoken So therefore if an attacker cansteal an access token from thelegitimate client application then thisattacker cannot use the access token toaccessAPIs The second future is OID forVCI H before erh introducing to you OID4BCI I would like to tell you briefly thebackground of OID4 BCI and why OID4 VCIspecification wasdeveloped The as you might know the EUcitizen will be able to use EU digitalidentity wallet called EUDI walletThis EUDI wallet can securely manage andstore the data called verifiablecredentialsVC The roughly speaking the BC is adigitalized version of some kind ofcertificate in the real world forexample in national ID the driverlicense passport or the universitydegree certificate and soon So the major characteristic of the BCis that the everyone can verify theauthenticity and integrity of the BCthis uh entity called issuer orcredential issuer issue this w to walletSo thereforeuh some communication protocol wasneeded to convey this VC from issuer towallet So the open ID foundationdeveloped the OID4 BCI specification assuch a communication protocol The byfollowing BCI the wallet can requestsome type of VC to the issuer and issuercan issue this uh VC and send back tothewallet The Trog experimentally supportedthis OD4 VCI but only experimentallysupport and it also cover only the partof the world for VCSspecification and the Tro the currentVCS support plays the two role tokenissuer as authorization server andcredential issuer as resource serverSo therefore the wallet firstly receivedan access token from key crog as tokenissuer and then the wallet request BCwith this access token to also trog ascredential issuerSo the finally I would like to brieflyuh introducing to you the currentO6activity Erh the O6 now try to supportthe followinguh new the security uh specificationsupportuh for workload identity the transactiontoken andspiffy and to for party applicationcalledFEA and shares signal framework calledSSF defined by open foundation and openfederation also defined by openfoundation And please note that thethose working items is not the key thedevelopment teams the working items thebecause the O6 is the purely thecommunityactivities today I'd like to pick up thetwo the specification the FIPA and openindividual hip allows us to providebrowser resource to flow for nativeapplication As you may know uh by uhfollowing the authorization code flow uhor authorization code grant ofO2 then the end user uh needed to inputthe uh their credential for userauthentication onto abrowser the while by following the FIFAflow and end user can input theircredentials for user authentication touh native application and not no need touse the browser So therefore this flowis suitable for first party nativeapplication use case For example uh Iheard that some bank some banksdeveloped their own first party nativeapplication for their end userauthentication and uh they provide theirend users with this first partyapplication So therefore in this casethe FEA that might beapplicable and the open federation uhenables us to build the trust betweenIDPS and RPS on the fly No need toestablish inadvanceUh the open federation allows us to dothe similar thing done by TLS or MTS buthowever on an application layer and thepromising use case for openfederellation is dynamic clientregistrationErh please consider the situation thatyou learn uh the key clock as IDP andyour key clock accept uh dynamic clientregistration request from RPS or clientsbut however yeah your key only want toaccept the request from the legitimateand trustworthy client application orRPS In this situation now privatefederation can be used The actualcurrent adopters of open federation isItaly's digital identity system the SPI�D So uh that's all for the first partof this talk Thank you very much and Iwould like to pass the presentation onto the lion that he will describe thetro recent update on observability andgive some demonstration about that Thankyou very much for listening Thankyou Can you hear meokay Is the sound on yep we're goodOkay So yep My name is Ryan Emerson I'ma principal software engineer at Red Hatand I work as part of the Keycloak S surteam Today I'll just quickly go throughsome of the enhancements to ourobservability story we've made in thelast uh 6 months to a year or so andthen hopefully we'll have a live demothatworks So the thing that we'reparticularly proud of is we've spentquite a bit of time improving ourdocumentationUm from the screenshot I've got here youcan see that we have I think sevendifferent guides A lot of them are kindof u operational in the nature how toenable tracing metrics But the twoguides that I would like to highlighthere are the first one which is ourguide on keycloak service levelindicators In this guide we go throughthe concept of ser service level servicelevel objectives and service levelindicators Um the take-home if you'renot familiar with these concept is theSLO the objective is basically yourtarget and the service level indicatorsare your measurements to you know whatyou can use to make sure you're meetingthat objective and reaching that targetSo in this guide we go through andexplain these concepts in a lot moredetail than I just did and we providesome example service level objectivesthat you might want to have with yourkeycloak deployments So I've just pickedone here Um this screenshot is of thetable we provide and it's a SLO relatedto latency and we would say that we want95% of all authentication relatedrequests to be in the sub 250millisecond range Now the indicator thatyou would uh use to ensure thatobjective is being met could be theresponse time for all HTTP requests asmeasured by the server So then in thistable we then provide a a couple ofmetrics that could be used as thefoundation for thatindicator So then building upon this isthe second guide that I want tohighlight which is our new in-depthguide on all of the metrics thatkeycloakprovides Within this guide we have manysubcategories as li listed on the lefthand screenshot here um metrics relateto the core keycloak functionality umJVM metrics database HTTP requests aswell as very in-depth uh metrics forclustering whether that's a localcluster or clusters um acrossavailability zones if you have a more HAenvironment but all of these guides havethe same kind of format um on thescreenshot on the right here we have anexample from the clustering guide and webasically provide the raw metric namesfollowed by a highle description of whatthose metrics are and what it actuallymeans because raw metrics aren't verynice But what I particularly like aboutthese guides is that we have a bit ofcontext around the metrics So we havethe light bulb here and we're sayingactually on a healthy cluster theaverage replication time will be stableum or with little variance So you as akeycloak user can read that and thenthat can inform your Prometheus alertsSo if you know the response time doesstart to fluctuate you page somebody andhopefully they catch things before youknow it's too late and the whole systemgoes down Not like keycloak ever goesdown So there's a lot of metrics coveredin those guides and metrics can be youknow maybe a bit too much data focusedsometimes So we now provide uh graphfanadashboards out of the box from keycloak26.2 26.2 two will be available nextweek all being well Um the twodashboards we provide one relates totroubleshooting which I'll show shortlyand the other relates relates tocapacity planning which um is basicallyto ensure that your cluster and yourdeployment is right sized based upon agiven amount of uh the current load thatthe deployment'ssatisfying Um these dashboards are allavailable on the URL on the slide Um youcan use them as is or you can use it asthe foundation for some kind ofcustomization Uh these are new so pleasetry t�hem raise GitHub issues the usualcommunity process We would love to seeuh what you think aboutthem So then next in our story istracing Um tracing was introduced inKeycloak 26.0 as a preview Um it's fullysupported since26.1 and we provide spans for allincoming and outgoing HTTP requestsincluding identity provider brokerage aswell as spans for outgoing database andLDAPcalls What's nice about this is that umwe expose it via the tracing providerSPI Um we use this internally within allof Keycloak's providers but if you havea key keycloak extension you canleverage this so you can create your owncustomspans And finally the last thing islogging Um a lot of people have askedfor support for the elastic commonschema format Uh this is now supportedby all loghandlers So we'll jump into the demo nowHopefully my mini cube cluster is goingto be doing its thing Um thearchitecture is as follows We have alocal mini cube cluster Within that wehave a single keycloak pod which istalking to a postgressdatabase We're then using Prometheus toscrape the metrics We're pushing Aspansuh to Jera and uh we're consuming logsvia Promptale which is um storing thingsin Loki We then have Grafana consumingall of this as our single pane ofglass So before I touched on servicelevel objectives I'm going to reuse theexample from our documentation In thisdeployment we want 95% of allauthentication related requests to befaster than 250 milliseconds within agiven 5m minuterange So I'll just jump to the browserhere Um we can see the Keycloak UI UmI'll log in here And for those who arealready familiar with the UI this mightlook slightly different I am using the26.2 nightly build So I'm going to go inand I'm going to select our realm whichis realmzero Within this realm I have created100 users And in the background I haveGatling logging a user in and out every1 second just to lightly load the systemfor the purpose of the demoSo now I'm going to jump to Graphana andI'm going to refresh this And we can seethat we have this SLO metrics P panelIt's saying that all of our pods areavailable So that's good And then wehave this second graphic here which isresponses below 250 milliseconds Umwe're currently at99.44 I was hoping that would be 100%but never mindUm on this panel as well we can see thatwe have uh zero errorresponses So things are going well froman SLO perspective at the minute I'mgoing to quickly just jump to my commandline here and you know introduce someproblems because that's way moreinteresting Um and then I'll just now gothrough the rest of our troubleshootingdashboard while we wait for someproblems to propagate through oursystem So I'll just minimize the SLOmetrics Um the next panel up is relatesto JVM metrics Um we can see thedifferent pods we have here just one inthis case Um and then we have uhvisualizations for the usual kind ofthings you would be interested inaverage memory usage uh CPU usage and umgarbage collection time that kind ofthing Next up we have a panel whichrelates to database metrics This is youknow the utilization of the connectionpool available connections that kind ofthing But for the purpose of this demothe most interesting panel is this HTTPmetrics I'm going to there's a wholebunch of virtual uh visualizations herebut I'm going to focus on two today Oneof them is the total number of requestsper keycloak URI and their outcome Sohopefully you can read the text in theaudience there Um but on the right handside here we have a list of thedifferent endpoints and they're orderedbased on the URI that's been hit themost In this case we can see that thetwo top ones are the orth and theauthenticate endpoint which we wouldexpect And then we're followed up by theusual readiness and livveness probes Umso that's one panel that's kind of niceand useful so you can get a quickvisualization on what URIs are actuallybeing used on your system Um but thenext one that I really like is this heatmap So this heat map um we have umrequest latencies uh grouped intobuckets So we can quickly um have asense of how what requests are fallinginto what time ranges And um from they-axis here until the last few secondswe can see that most requests were below100 milliseconds Um the delay that I'veintroduced has happened a bit too soonfor me here They were all meant to bebelow 100 milliseconds but here we go Sowhat's really nice though is now that wehave all of our tracing capabilities andeverything that we've introducedrecently is we now have exemplers So ifI zoom out here and click on one oftheseum this pink dot is an exampler we cansee the metric that it relates to and wecan see this is associated with theauthenticate uh URI So what's reallynice here is we can click this querywith Jerger button and we will get thefull uh trace spans associated with thatrequest Um we can see from the graphhere or hopefully you can see in theaudience um the total request time was170milliseconds and as we go down and seekind of the subspans that are uh in thegraph here we can see that 78milliseconds of this is actually fromthe iggon hashing and then as we go downwe can see we have some database callsthis select one here we're going topostgress so we can see some attributesaround that requestUm we can see at the minute this requesttook 74 milliseconds So we're within ourSLO Things are going well Um and as Iscroll down here we can see there's abit more database activity We have somecommits which were a lot faster again SoI'm not sure what happened with thatselect statement But when I go on theselect statement because it's anexampler the really nice thing is we canalso see the logs associated with themetric Um if I click this uh button herewe have a couple of debug logs Um Iassure you they're not that interestedinteresting I just enabled this to showthe fact that we do link thelogs Um but that's how this kind ofscenario looks when things are goingwell Um what we can now do though ishopefully when I click back here we'regoing to be massively failing our SLOYep we are now only 56% of our requestsuh are within the 250 millisecond rangeSo what I would do is the idea is nowI've seen what a normal trace looks likeUm I go back to the HTTP metrics I'll goback to this same visualizationuh the heat map and we can see there's anice solid thick green number ofrequests here with 24 requests that arealmost at the second kind of rangeSo what examplers allow us to do is Ican now dive into this and hopefullyfigure outwhere the root of the issue lies Soagain same screen as before The totalrequest now in this case is 739milliseconds We go through the spans Wecan see the argon hashing is in the samekind of ballpark as before So I thinkit's fair safe to assume the issue isnot there Um but as we go down we seethis select statement has increased to100 millisecond and so is anotherdatabase call and actually the commitsat the end uh in hundreds ofmilliseconds So I'm sure you can guesswhere I'm going to go with this Thisimplies that the issue is with thedatabase level and there's some kind ofproblem between keycloak andcommunicating with the database Um sohopefully you'll agree that gives a verynice high level easy to consume um wayof figuring out the ballpark of whereyour problems lie You then you knowthat's when the S sur engineers maketheir money they go in and figure outwhat the actual c the actual problem isBut this is a nice head start and ismuch better than looking at logs in myopinion Um that's everything I wanted toshow today in the the demoUm if you're interested um the rootcause of the issue was I used chaos meshjust to add an arbitrary 100 milliseconddelay to the Postgress service which iswhy it was kind of uniform in the thetrace span there Um but yeah I thinkthat was everything So if you have anyquestions feel free to ask them now orI'll be hanging around after the talkand so willTeeshi Thank youOkay So okay Uh so so sorry Uh as youmight know the cubicon Japan will beheld on this June in Japan and there isthe key rock uh maintenance session Sotherefore if you will visit Japan toattend uh Kubon Japan uh I would beappreciate if you could join again thekey maintainer session Thank you verymuch2025-04-15 21:57:51.275247 3\3��5#��YAWuMyfaF0UeMhello everyone uh welcome to join ourtalk today yeah today we'll talk aboutQBH QBH dive deep dive about thearchitecture use cases at projectgraduation updates okay first uh let'shave a brief introduction for ourspeakersum okay good afternoon i'm I'm uh Juanfrom from H Cloud uh I'm I'm doing thethe work on the uh uh on the productsdesigning and and the developing ofthe edge computing and also I providesolutions for our and customers whoyeah okay thank you uh I'm fish fromhigh cloud now I'm a full-timemaintainer of cube which graduatedproject okay let's start now okay firstlet's See uh this is the uh QH projectjourney uh from scratch you can see uhfirst we uh create this project thendonate to the uh CCF uh as a set boxyeah then we released the Wii one uhreleases uh then we have uh the uh largescale use cases first yeah uh then uh in2020 we uh became a CNCF incubitionproject yeah then we uh create the SATAsub project which is a uh hi uh projectin cubage yeah uh then we have uhseveral use cases like the vehiclesatellite yeah theǁ�4#��OAbC4xbBJs0CAso hello everyone and we are very happyto have today that talked about the keycro and thank you for joining this thepresentation H this uh presentationconsists of the two part The first partof this talk is uh about introducing toyou the recent tick clock update aboutsecurity future support While the secondhalf of this talk we would like tointroduce to you the key recent updateson observability and give somedemonstration about thatSo in the part of this talk I would liketo talk about the tro recent update onsecurityfutures Uh before my talk let meintroduce myself to you briefly My nameis Takashin Matsu and working forHitachi limited Japan and I am also thekeyro maintainerIn this talk in this talk I would liketo introduce to you the two trog specialinterest group the community activityuh trog oc and tro st reliabilityengineers s the objective of of oc is tosupport o thec and it related certspecification support to kroto make trog moresecure while the objective ofs is to improve the lives of the peoplelike you learning grog and operatinggrog So the both hik has its own uhgithub repository and alsocncf channel Therefore if you areinterested in the the activity of Zoplease access to the CNCF swag channelis an appropriate place foryou So in the past of this talk I justmentioned before I'd like to talk aboutthe key recent security at a futureupdates And uh the this slide shows therecent contributions by the communityOC And uh futures marked with the testtube means that uh those futures arerated asexperimental or preview future notofficial futureuh while the items marked with the checkboxuh means that those futures are treatedas official officially supportedfuture Then so the OC not only do thecontribution to the newly supportingsecurity specification to tro but alsodo the contribution to the define thetro already supported security futuresAnd this slide shows an example of suchthe contribution for refinementIn this talk the I would like to pick uptwo security specification that therelatively the recent Tro thesupported RFC9449 O2 demonstrating proof of positionuh called DOP and open ID for verifiablequerial issuance called OD for VCIAnd the depot enables you to prevent themisuse of a stolen access tokenErh as you might know the O2 uhspecification RFC6950 states that an access to an accesstoken is treated as bearer token whichmeans that everyone holding an accesstoken can use this access tok forexample to ac��n we do some largescale test yeah and in the uh last yearwe have became the uh since graduationproject yeah q age is also the uh firstgraduation uh project of in the agescenario in CSF okay this is a briefproject journey uh forQBH okay uh this is the uh cube agegraduation celebration in the last yearyeah we have do some uh discussion uhfor the future of the project okay or wealso uh thanks to uh every member in ourcommunity okay uh next I will introducesome project uh backgrounds for thecubage yeah uh from this diagram you cansee uh the from the right side is the uhcloud then the uh regional age then CTthen near side age and now many devicesrun in the near side age and many uhdata also generated in the uh near sideedge so how to manage the uh node andapplication in age uh is what we want todo okay uh this is auh overall introduction for the kubageproject yeah you can see kubage is theuh first cloud native age cloudcomput yeah we have the open governanceyeah we also have many stars at Fox orGitHub yeah we also have manycontributors uh from uh all over theworld yeah from many organizationsokay okay next I will have uh have aintroduction for theproject okay you can see this is a uhcube edge architecture yeah from thediagram you can see uh cube age includethree part the code part uh age part andthe IoT devices part yeah from the codewe use the Kubernetes master uh we don'tmake any modification uh for theKubernetes master yeah the right side isthe code call yeah this component is uhwe developed in uh cube age yeah uhbecause we think the uh network betweenthe code at age is always uh unstableyeah so we have do some enhancementsbetween the uh cloud at age for networkissue yeah in the age part we have theuh component age core yeah h core weintegrate the uh light cublet yeah letkublet is uh we remove some unusedfeature in the cublet then we integrateit in the uh h core component yeah rightside we also uh develop some uh functionfor the IoT devices management yeah youcan see we can use the uh componcomponent called mapper uh to uh connectthe out devices to the kubagecluster okay yeah this is the uhoverview of the kubage architectureyeah okay next I will uh introduce theuh latest uh release of cube age yes wereleased many new features uh like uhsupport batch node process like uh eachnode batch drawn in yeah like IPv6 forcloud edge communication and also mountlogage me framework support yeah this isfor the devices management yeah also weupgrade Kubernetes dependency to thelatest version okay also have the newrelease of cubage dashboard you can seethis is the overview of the uhdashboardyeah okay uh next I will uh introducethe detail of the cubage architectureyou can see this is a uh agearchitecture mainly for the uh ageapplication age pod how how we managethe age pod in the age node yeah you cansee uh above is the cloud part yeah wesend the uh port the metadata from thewebsocket protocol yeah in the H node weuse the Hub uh receive the metadata thenthey uh we uh store the metadata in theage store then uh send the data to theHD Kite Kat light uh then run thecontainer in each nodeyeah okay uh next is the uh IoT devicesmanagement yeah we define a interfacecalled the device management interfacein the age yeah it can uh manage the IoTdevices connected to the age nodes yeahuh from the Kubernetes master uh we usethe CRD to define several uh APIs fordevices uh like device model and deviceinstance uh to manage the IoT devices inthe uh edge node yeah we can uh uhmanage the uh connect edge devicesconnect or uh collection of data of theuh devicesyeah okay next one is uh H mesh yeah hmesh is the uh network uh sub project incubage it can solve the uh ageapplication communication in the agescenario yeah because you know uh ineach scenarios the age node can't uhconnect to uh each other directly uh sowe built the uh edge mesh it can uh helpthe uh each application connect to eachother in uh in different siteyeah okay this is Hmesh uh it include uh like the uh uh uhDNS function or otheryeah okay next is uh uh security uh incube edge yeah you can see� uh cubage isone of the first CCF project reaching L3of supply chain level yeah uh and wealso have the full audit report yeah weal also uh cubage is one of the firstCNCF project integrating with fuzzingand we also released the uh threat modeland the security protection analysisyeah yeah this is um uh securityupdates okay uh next I will introducethe HAI yeah maybe uh many developed uhwant to learn yeah first you can see uhthis is the Sedna yeah sedna is a subproject in cube age yeah it's a age codestandardy AI framework yeah from thediagram you can see this is the overarchitecture yeah it also include cloudpart and age part yeah in cloud part wehave uh component called uh globalmanage it has all the uh hai taskmanagement task coordination and modeldata set management yeah this is the uhcloud uh component or we also haveanother component called localcontroller it runs in the clone node atage node it's the bridge between thecode and ageyeah the right side you can see we havea worker yeah worker is the uh AI taskrunning the node it can it can run incode and age yeah we have another calledlib in the worker uh through the lib uhthe application of AI workload betweenage and cloud can do some collaborationuh for uh like joint inference atfederated learning okay uh this is abrief introduction uh forseda okay next I will introduce the codeedge joint in France you can see whatsee what is code edge joint in Franceyeah you can from the diagram you cansee we we have the uh devices like thecamera then three H node H1 H2 H3 andone clone node yeah we can deploy theshadow model in the H node and the deepmodel uh to the clone node uh thenuh developers can do some um users cando some inference uh from the uh eachnode yeah when the confidence level isunmet the uh request will forward to thecode yeah if the uh confidence level ismeet uh the h node will uh return theresult yeah this is the uh code edgejointinference okay next is um case studiesyes one pleaseoh okay now let me uh show some uh casesthat can highlight the power of uhkubash uh first we have we have uhKubach in the commercial vehicles and aswe know uh uh commercial vehicles liketrucks often operate in the in theremote areas where the uh signal loss isuh uh common issue and however for theum for the better fleet management andthe um the maintenance weuh the uh the reliable networkconnection and the uh the the better uhvehicles management is is very verycrucialand uh this is why uh Kubage uh this iswhere where where Kubage comesin and soby by enabling uh vehicles to run thethe AI models locally andlocally and Kubage can mix them to toidentify potential uh issues uh beforeuh the problems occuruh so in this way and the benefits is uhthat even if there's no call access theuh vehicles can can can run the modelsuh smoothly and and uh this can help tohelp help us help our our customers touh reduce the maintenance maintenancecosts and the um and to do the betterfleet freight changemanagement and next we have the Kubachin in in the offshore oil oilfields and oil leers are often um oftenuh situated far from land uh uh so sothe so the network between um betweenthe the oil leak and the clock is is isoften very veryweak um uh so so in in this in this inso um machine failure in such anenvironments can be very can be not onlyuh dangerous but but also but but alsocostlycoup edge addresses this thesechallenges by and handling data on onsite uh souh so the uh so the benefits areum and our customers can work in uhsafer safer environment and the the uhoilfield oil field systems can uh remainstable even if there is just uh nonetwork network connectivity between theoil leak and the cloudand finally we have we we have a case umand about cool badge on the uh CDN andas we know uh uh contentdelivery and contentcontent delivery networksare and are working on theum on the delivering of the uh contentfrom the um from the cloud but sometimesthey will they will suffer the problemis that the connect is is very very weakand and so the the the downloading is isvery very slowum so Kubaji can help help uh to uh to�uh provide u um more efficient solutionuhby uh by some some AI models we can umwe can do theuh do the predict of the uh trafficthrough so so we can ensure that onlyonly the necessary content will befetched from from cloud uh so so withwith cool badges the and outcomes are uhfaster loading times and uh and uhuh uh fewerum fewer problems and the moreum andthe and improved um performance of theuh CDNservers and also we have we have Manyother cases in the uh differentdifferent industries uh like the in theintelligenttransparation and the smart energy andand andindustrial in intelligence and also alsoand others uh uh uh we can you can trackthis these cases in in our uh commutecommunity so and I would deliver to Okayokay next one yeah you can see this isthe uh use cases in many industries yeahespecially in many traditionalindustries uh for Q age yeah like thesmall energy yeah smart CDN uh and smartcomfort at uh smart logistics yeah andso yeah you can have a lookokay okay next let's uh see the uhcubage community because uh cubage isthe open governness community yeah firstlet's see uh this is um uh governnessmodel in cubage yeah from the uh diagramyou can see we have the uh TSC yeah yeahabout uh below the TSC we have some subcommi committees and special teams andwe also have many uh six uh in Q yeahevery six have many uh sub project uhlike seda in sig AI yeah and edge meshin uh SIG network yeah we also have manyworking group like the MEC working groupyeah and so yeah uh this is our uh TSCmembers and uhstero and working groups yeah this is auh open governance model for cube edgeyeah next you can see this is some uhpartners uh for kage community yeah wehave many partners uh from uh acrossindustries yeah make andresearch okay next you can see we havemany uh activities in the uh communityyeah like the uh linen foundation LFXyeah we have many mentees uh from allover the world yeah this isactivities okay uh in our website wealso have the uh partners yeah the ouruh partners provide like the solutionsfor the uh consumers yeah if you want tofind the uh providers you can check thepartner part in our website yeah we alsohave many case studies uh in our websiteyou can also uh have acheck okay okay that's all uh what uhwhat's what what we introduced thank youuh do you have any questions[Applause]i have one question relating onespecific use case uh in our companytoday we have sometimes thousands ofworkstation connected to uh uh the cloudnative back end have you alreadydeployed cube edge on all workstationuh to take benefits of the computingpower provided by all workstation forinstance during the night where when theemployee are basically sleepingin order to spot uh boats andexploit and use sorry uh the power ofthis susant of sleeping workstationyeah kub is mainly uh manage the uh edgenode located in many places from thecentral cloud yeah yeah we have the uhkubernet kubernetes uh master yeah thisis our uh control player in the cloudpart yeah we can manage uhthe edge node in many places that wethink the uh network between CL edgesunstable yeah this is theuh main uh scenarios yeah could be edgefocusokay but this edge could be theworkstation themsself yeah yes okay okaythanks aHi i just want to ask I mean you havethis case study on uh offshore uh oilrigs right and uh I mean normally Ithink there internet connectivity andall is very constrained there how do youmanage your dayto operations after it isdeployed in production how do you managethe security updates uh new versionupdates and all because they alsoprobably have a lot of regulations orlow I would say the short life longerlife cycle expectations that they can'tjust allow you to window have updatecoming every 6 months or 18 months how'syour experience with uh the day twooperations on on after deploying ituh you mean uh the uh port after deployin the age node yeah in the productionright so that's a project deployed onoil rig facility mhm a case study youhave uh so do you mean the the case uhregarding the and the offshore oil leaksyes that one and I mean I what Iassuming that you have deployed this inproduction it is running there but yeahafter let's say 6 months you have to putan update or do you have internetconnectivity constraints does customersays uh that you only have this windowwhen you can update it you cannot do itlike every 6 months or three months uhso kub is designed to address thesechallenges and especially on the web asthe weak net weak internet community uhbetween the edge and the uh cloud okayand as we know the audio audio leaks areoften uh are situated far from land soum uh so uh but the machine in the inthe in the in the oil fields are arevery very important we need to keep thisthis machine run uh running smoothly andwe need to identify any any potentialrisks before the the the uh the problemoccur so uh uh in this case we andactually we and we we uh developed somesome AI models to do the uh predict ofthe uh problem so um and and u and uh uhbecause badge has a a very importantfeature we called autonomy and and thismeans even if there's no cloud accessthe the cool badge components will willremain very stable and keep keep runningso so um all the uh edge applications Imean edge parts they they're running inthe edge will will uh will keep runningeven if the the edge nodes even if the Imean the OS is restarted then the allthe all the parts running on this onthis node will will become normal inseconds uh so in this way uh we can wecan uh predict the the potential issuesbefore any problem can then we can wecan do the um more effective um oil riguh operation and maintenanceokay and what about the cyber securitycompliance you have regulations and allyou need to certify for deploying in oiland gas industryor do you get kind of cube edge need tobe certified by this cyber securityauthority or something like that todeployuh do anyyeah I'm just cyber security any anykind of compliance requirements in oiland gas uh you mean the the uh containerisolationlike the no I mean like gettingcertification saying that yeah thissystem complies what you deployed basedon this kind of industry uh uhcertification on cyber security andthings like that is that kind ofrequirement and often often the uh inthe in the oil field they have their ownuh privity private networksokay yeah so uh this canum can ensure the the uh network and andalso we have we have the uh security uhuh comm community the the the verysecure uh commun communication uh mechanbetween the the agent and the cloud wehave theTLS TLS connection TLS connectionokay thank youhello uh I have qu two question onequestion is like you tell about likerobotic what kind of robotic you meaningit's like the for example ROS like therobotic operating system or it'ssomething else and second question isabout this uh device managementintegration uh do you have like usecases someone use it with the automationlike the PLC stuff or in manufacturersomething like that maybeyeah yeah the second question for the uhdevice management or we only um uhdefine some uh APIs through the C fromcloud yeah we can we uh use API calleddevice instance and device model yeahthis these two APIs can control the uhuh H IoT devices to connect to theKubage cluster yeah uh then uh for theuh collect data uh it will depend on thethird pass application uh deployed inthe age yeah okay yeah yeah yes yeah uhfor the first question can you repeatagain like the what kind of robotic doyou think like there was likeinformation about extending cube patchto the robotics mhm and for example weuse robot our robots with the ROS systemit's like the robotic operating systemsystem it's likethe use for the manufacturer robots sobasically I asking if robotic you meanby some manufacturer robots or like thehuman new kind of new new age roboticstuff yeah actually we have someprojects for the robot in our communityyeah they manage the robot from thecentral cloud yeah they run the uh edgecore component uh in the robot then thecontrol player in the cloud then we canuh manage the application in the robotokay good thank youwelcome okay any questionsokay okay it's Thanks for your timethankyou yeah you bet2025-04-15 21:57:51.787414�igs that buy in onthis to align on uh the design that isin question um and to have a long-termsupport not only from the authors of thedesign but also from the reviewers andapproverswhat we also want to do is we want tohave like a uh like an alignment on thedecision- making itself um so that wecan uh make the sigs umlike responsible for looking at allthese things that concern them um and atthe last point um from all these uhdesign proposals that we have we canthen at some point like construct like aroad map so that we this is uh most ofthe time asked by many people um howdoes the future of Cubid look like andum this uh will enable us to um uhcommunicate to people better uh what iscoming in the future at which point atfor example in which release will uh acertain design proposallikeBG8 so as I said before I think normallywe aim to require this process by end of2025um yeah what we also want to work on inthe future is uh we want to automatefurther in our release process like umautomate the milestones um have betterrelease signals on the release branchesum so that people see better feedbackwhenever they are working on things umalso what we want to do is we want to Imean we did that already in the 1.5unconference we were starting to discussthe first design proposals at theunconference itself which had I think agreat result right or a lot of greatresults actually and yeah we want tohave um like a better commitment to allthe features that are comingin so u to give you an overview over thecurrent state so at the moment we are inthe release 1.5 which has been releasedon the 13th of March this year it'scalled by li powered by libert.10 andQMO 9.1 and it is targeting uhKubernetes 1.32 but it's also it'stested against the latest threeKubernetes releases that we haveso the major changes for the release 1.5is there is a breaking change that youneed to be aware of because uh there wasa certain arbuck change um that uh wouldbe targeting um like um migrations andthat would be very critical to havebecause we want to give like themigrations in case of eviction apriority over the normal usermigrations um also there is a couple ofbug fixes the major bug fix here is likeuh we had a case where the recovery ofthe migration would not be happening andthat has been fixed um which is good forusers obviously and we have adeprecation about the word cuttle localSSHcommand so um looking at the sik uhspecial features the sik compute um hashad a couple of improvements which isimplementing the virtual machine resetso that you don't need the uh the pod togo away and be created again but you cannow reset a virtual machine without umchanging the pod itself uh what you alsocan do we have the graduation of theauto resource limits uh which then umautomatically sets limits in case thename space will contain resource quotaum yeah also verts has been uh been uhis now supported for live migration uhwhich will increase the usability of uhverts and um yeah we are finally uh notrunning uh the pods that contain thevirtual machines with a custom uh SELinux policy anymore but uh it is usingthe standard SE Linux policy with thisrelease and finally like we have uh anincrease or or usage by usage of multifduh we are uh able to speed up themigrations so when it comes to storagethere is only two changes or two majorthings that we want to mention herewhich is first is the graduation of thevolume migration itselfum and the new IO thread policy um thathas given us great performanceimprovements which I think you will talkabout later right no okay no okay yeahthen I'm wrong okay um so uh when itcomes to network um we can announce thatwe have like finally implemented thenetwork interfaces link state um andthat we have graduated the networkbinding plugin so to have custom uhnetwork plugins inside cubert which youcan use um yeah I wanted to point outand shout out to the people who wereactually like bringing um caps upstreamto uh enable further work inside Cubertwhich is first of all for example theswap cap um and the volume source OCIartifacts cap and the imp place verticalscaling� cap which is also useful whichare all useful features for the upstreamcommunity alsoum yeah then also the DRA adoption whichI think you will talk about hopefully atsome point laterokay great yeah next CubeCon we willhear about that and yeah I wanted toadvertise that we have still a couple ofopportunities for people who are forexample participating in GSOCum like topics would be for exampleseabore BMC dynamic dattachment or vertsto name just a few I think there is manymore opportunities so if you'reinterested in participating you'rereally welcome to come in and um help usoutokaythanksDaniel um so just uh so from here onI'll be talking about um what six scaledoes uh in cubeboard but before we getstarted just a quick show of hands howmany people here run clusters greaterthan 100 nodessome greater than thousandnodes not many okay so um we runclusters that are in in the range of um500 600 uh nodes we are hoping to getthat uh further um you know scale itfurther out um so it makes sense for usto you know um get involved with some ofthe scaling things in upstreamcommunities and since we use cubeword alot uh cubeword 6 scale makes perfectsense for us you know to come in andcollaborate so um in the next set ofslides I'll be talking about thisjourney uh cubeword 6 scale has gonethrough uh and hopefully at the end talka little bit about some of the newupdates uh with simulation uh andspecifically uh withquark so the first thing uh we startedthe six scale uh with was an objectiveright so we wanted uh cubeword codebaseto be scalable uh to perform well uh andthat was kind of our our objective toyou know get uh get this group starteduh once we had that you know we westarted having um some end toend teststhat would uh create load uh on on thecluster uh and we quickly realized thatyou know as we put end to end tests inwe need to put certain kind ofmonitoring in place so we have some umunique monitoring where uh we do umphase transition times on our objects soit can help us get exact um you knowvisibility into where time is being uhspent on um when we run this uh uh perfand scale tests so those monitoringchanges were um the the you know the thesecond thing that went hand inhand withthe um load generating um feature andthen over time we realized that westarted to see um metrics and a bunch ofresults uh but there was no easy way forus to um you know keep track of whatmetric is improving what metric isdegrading so we came up with our own wayof plotting plotting these metrics overum days so now you know we have umuh scale tests that get run each day inCI and you know we can plot the uhperformance and scale metrics of ofthose um over time and run maybe likeweekly averages and see how the codechanges are affecting scaleso then we realized that okay we have agood enough system in place but thescale tests require a lot of computeresources so now you know very recentlywe came to a point where you know wewanted to make it little bit um costeffective we started exploring thingslike simulation and see if the if somekind of simulation can help us you knowget uh perfect and scale metrics sothat's what we're going to talk abouttoday but before we get into that aquick um summary of what we have rightnow this is the benchmarking stack thatCube has uh currently the first part isthe load generator so we have the endtoend tests that get run daily we havethree uh instances of this uh testrunning daily um the second part is oncethose tests are executed and we collectmetrics those metrics are dumped into anS3 bucket um and then from there we getsome kind of persistence so we have uhCI jobs that can go scrape those metricsand plot um the results over timeum we actually did a much more deep diveon deep uh benchmarking stack at uhCubeCon Paris last year there's a QRcode uh for anyone who is interested irecommend uh checking out that umtalk so here is a a quick output of whatthe benchmarking stack gives usuh the graphs you see is on the x-axisthere is time uh on y-axis are resultsof these runs the blue dots are dailyruns uh that we do as I said we havethree run�s each day and then the orangeline is weekly weekly aggregate of theseruns so what we could do with this graphis say that overtime how is cubeword codebase uh helpingperf and scale are we regressing or arewe getting things better and you canactually see that there is a point inthis graph when we regressed a littlebit when I say regress these graphs showhow long it takes for a VM uh to be or aVMI once it is created how long it takesfor it to go into running state and youcan see that at some point uh it startstaking longer so what we have is asystem in place that as soon as we see aregression here we can go look at theGitHub pull requests that got merged onthat day or on that week and try tofigure out which change impact um theperformance metrics here we've been ableto successfully catch three or four umregressions one um monitoring change sothere were some um success stories onyou know successfully catching um thingsfrom thisgraph okay so after running these graphswe realized that this the stack isperfectly set up for us to you know takeadvantage of running large workloads onit but we we have to do it in a costeffective way so that you know we don'tblow up the bills of of CI system sothat's when we started looking at Quarkum it's abbreviated as Kuberneteswithout cublet what it is is alightweight tool to measure performanceand scale of the controlplane underneath the hoods it's really alife cycle management controller um whatit doesis depending on the uh configurationthat user provides it changes the lifecycle of certain objects like nodes orpods which means you can get a lot ofnodes without actually running a cubletand without having the hardware for itso here is a sample architecture um justan example of what a cluster with quarkwould look like so on the left bottomyou can see there is a control planejust the normal kubernetes control planeum at on the top you see one worker nodewith cublet um that is the cublet ismanaging that node this is the real nodeon it you see a quark controller runninguh and some workload pods running rightso that's all standard Kubernetes butfrom here we since we have quark uhinstalled a user can come in and sayokay I want four fake nodes without thehardware for it and the quark controllerwill instead of cublet quark controllerwill go and reconcile those node objectsso for instance in our case we cancreate thousands of nodes withoutactually having hardware for it um thatgenerates control plane pressure andthat helps us uh find scalabilitybugs similarly the quark controller alsohas mechanisms to um reconcile on fakepods that could be running on these fakenodes so you might be wondering at thispoint okay this all sounds reallycomplex how does this actually workthere is a custom resource in Quarkcalled stage CR uh stage is really aconfiguration of what object you wantQuark to be looking at and how do youwant Quark to transition it intodifferentstates then Quark controller will getevents for that object just like anyother controller and depending on theconfiguration it would either decide tochange its state or maintain the currentstate so to summarize this is alldeclarative um it depends on what kindof configuration is created in thatstate CR quickly looking at that stateCR um on the left you can see a stage CRfor a nodethe CR has a section calledum resource reference resource ref forshort you can see that we have definednode object there on the right we havedefined the pod object so this is theinput to quark saying that it needs towork on the node or on the pod objectthen there is a selector field uh thisis helpful to to exactly pinpoint quarkat what kind of states you want yourobjects to be in for quark to change itright so for example if if the node isjust uh gettingcreated it is not in ready state youwant quark to set send send it to readystate because cublet is not doing thatfor that fake node so this is where auser can specify what states quark umneeds to look at and then there is asection called next and that determinesuh where to send this object uh to likein what state um should Quark send thisobject to and then at the very end youcan see that the phase is running sowhat this configuration is saying is Iwant a node which is not ready to gointo a readystate just for reference a similar uh CRexists for pod and what we did withcubeword is added support for VMI so forpeople who want to do a scale testingwith quark um there there did not existan ability to do it with VMI um inrecent releases we have added supportfor it so on the left you can see wellokay uh this slide is actually just aninstantiation of fake node and fake uhpod the more interesting one is this umslide here you can see a stage CR for afake VMIand what you can see here is that we arelooking at a VMI which is in scheduledstate we want to send it into running orready state and at the very end there isa specificuh spec change that was neededexclusively for cubeword as to whichservice account does this change need tocome from so in cubeword there was achallenge that anytime an update is madeto the VMI it needs to come from acubeword owned service account otherwisecubeword web hook will reject it so whatwe added in quark here is an ability tomake quark impersonate one of thecubeword um service accounts so even ifit is the quark controller the impersonuh theimpersonification um says to the APIserver that I am one of the cubewordservice accounts it is configured inthis state here uh and then because itis cubeword service account the web hookwill allow the change to go through sothis was actually uh a change wecontributed back any workloads apartfrom pod andvmi um if you have your owncustom workloads that are wrapped aroundpod um you could potentially use thispersonification umimpersonification u feature to you knowum take advantage of it so with thisability now we could have a virtualmachine instance um with a node selectorthat can make it run on a fakenode so with all of this plumbing we'renow at a point that six scale cubs scalehas integrated quark testing in its CI iwant to give a huge shout out to uhShria um she's one of our teammates umhelping out a lot with uh six scale workand and maintaining all of these umtests andbenchmark just a quick mention hereright now uh what we have with cubeword6 scale is an ability to create thousandVMIS in a really really small uhcluster since it is small cluster uh weare saving CI resources but at the sametime all of that control plane pressurewhich is the control plane scalabilitybugs we are able to have visibility onhow the control plane canscale okay so now that gets us to someof the future work that is planned uh onit the simulation that we have right nowis helpful but it's not perfect uh wewant the simulation to be as close to umreal workloads as u realisticallypossible so one of the um uh the goalfor six scale is to explore other waysof making this um you know simulation tobe closer to uh realworkloads we also haveum pending work to add more higher levelworkloads for example a VMI isunderlying um resource that could bemanaged by a VM controller and we don'thave uh tests for uh VM or uh instancetypes for uh simulationtesting there is also an alternative wayof uh simulating uh nodes uh some of youmight be familiar with this project it'scalled cube mark so the way it isdifferent there is cublet code isactually running in that case but it'snot running or reconciling a physicalmachine it's just running as a pod inthe cluster uh that is reconciling theuh the node object so it's analternative uh architecture ofsimulation uh we have plans to exploreum that kind of simulation for enhancingsome of the u simulationtests and then finally um I would liketo say that we really really need helpwith cubeword 6 scale so if these kindsof uh power scale uh testing interestsyou um I would recommend you come joinus there is a link um in this slide witha charter that has all the necessarydetails to you know find our weeklymeetings and and ways to reach out to usso please come join us there's tons ofexciting work uh to be done in in thisarea um that's the end of our talk wecan now take some questions2025-04-15 21:57:52.550569 ��S6#��]AxwGDxqI_3Nkhello everyone um welcome to the talk umhow to tackle cubits growth andscalability um I am not lubis pir i'mfilling in for him my name is Danny Hilli'm from uh Red Hat um this is Ryan umsorry Ali sorry for that do you want tointroduce yourself hi everyone i am AlaiPatel i'm from um Nvidia um we runcubeird at scale so I'm going to betalking about it um after Danielyeah and I'm part of the openvirtualization team um my main concernis the Cubert upstream CIsystem soum how many people uh in this room arealready usingCubert okay a couple how many have heardofthat that's nice great yeah it's a quitean interesting project I think but I'mbiased obviously so um yeah let me startum to talk a bit about the communityevolution that we have seen and that wehave um uh so where we have tried toimprove things so that it works betterfor the communityum so first of all in uh 2024 we had thesituation where we only had like rootapprovers who could um were responsiblefor everything to go in somehow uh thisobviously didn't scale because therewere very few approvers so basically youknow um you most of you are probablyfamiliar with the system like LGTM and aproof um from the Kubernetes ecosystemright or do I have to explain that okayseems to be okay um so um as I said uhbasically we only had root approvers andthis didn't scale so we were looking athow we can improve that situation so wewere going with the idea from theupstream so from the kubernetes as theecosystem to uh ecosystem to look intouh SIGs and we basically came up with umlike uh how many six do we have five orsix I think so like compute um networkstorage um also scaling um andobservability did I forget anythingno yeah okay um these are like the uhthe sigs that look at the specialaspects that are relevant for that SIGbut we have also like the uh concept ofSIG chairs which actually need tointeract with other SIGs and tocoordinate with them but basically um westill have those root approvers um thatuh do uh take a look at um at specialthings but yeah this helps us to scaleuh more into um uh yeah um spreading theload overeveryone so uh what we did next was wewanted to have like a um special processto look at designs that would beintroduced we had initially we had likethe evolution of like a like a designproposal process inside the communityrepository um but that didn't scale asgood because um this was inside thecommunity repository we had only veryfew approvers who could look at that sowe wanted to spread the load again umespecially we wanted to involve the sigsum so we came up with the with thisdesign proposal process that wouldcalled be umVAP um that uh will be mandatory umafter I think at the end of the yearshould becomemandatory so to expand a bit on that umwe have this uh lightweight process thatis called virtualization enhancementprocess which can be compared to the CAPprocess but it's much more uh slim so itshould not get in the way and feel muchmore natural than um and intuitive umthan the cap uh which is normally theupstream because we are like a smallercommunity and we don't have like the uhlike the bandwidth of people that cansomehow look at some very um like timeconsuming things i mean everyone is busyright so yeah we need to look at thatso um basically the goals for the uhvirtualization enhancement process arethat we want to commit not only thepeople that are actually um creating thedesign but also the s��i have some quoteshere from three of the different uhprojects that use us uh and this issomething we're we're immensely proud ofto be able to offer um because we've asa company and as individuals me thefounding team and everyone else has uhbenefited immensely from from theinfrastructure so we serve at peak todayabout a million requests per second ofopen source uh software packages uharound the world and uh it's uh so whenyou when your build systems are fastbecause they're pulling content reallyfast like that's usuh also when you watch lots of livestreams uh vod surf the web buy stuffonline that's also us but for thisaudience and for me i think uh this partof the business is really importantuh so we we started uh2011 and we started with uh five popsfive locations couple of serversuh and uh a fairly small network now thenetwork peaksat the 40 million requests per secondprobably uh and uh total networkcapacity is about 400 uh a little over400 terabs per second uhworldwide and uh that's been a a longuh journey of learning uh in how to tooperate at scale but alsolearningthat i i wasn't you know if you ask me14 years agothat becoming a platform that ourcustomers rely on uh is a is a differentwas a separate uh parallel journeybecause it's one thing to just build youknow a scalable system uh it's adifferent thing to build a scalablesystem that other people are buildingscalable systems on and so i wanted totalk about platform and this talk willdiverge a little from i think atechnical a normal technical talkso i have kind of always thought aboutplatforms but it wasn't until about ayear ago that it kind of clicked for mehow to really describe what a platformis because there's a lot of differentways to describe it and i kind of cameto it more from a it's kind of a more anemotional description of what theplatform is so i don't know if i'm ineurope so more people probably know whatfestol is than in the us um but uhthat's my daughter uh and uh i got allinto festival so i addressed her asfesto as well festol is a uh germanpower tool brand that uh has like helpedme define what i think of platforms sobear with me festol is uh so 100y oldand they're famous for something calleddust extractors dust extractors is whatnormal human beings would call vacuumcleaners um but they are certified todeal with lead blah blah blah blah theseare very very good vacuum cleaners andamazingly enough cost about as much as adyson so i think either these are likevastly underpriced or the dyson arepretty overpriced um but it's just avacuum they also invented it's kind offunny this was invented in 1982 thestraight rail before that straight railsdid not exist uh apparently it turns outthat making like a 2 m rail straight andshipping it is a really hard likephysical engineering problem but theyinventedthese and they invented this which iskind of looks like a boring containerstorage containeruh they also invented a circular sawwhich a table saw is a saw you slidematerial over a circular saw is a sawyou slide material on the reason idiscovered festival is because i live inby us standards a relatively old house100 i need to do some work and i askedmy co-founder tyler what saw to get andhe said get the festival one it's thebest one and then i talked to someoneelse who said you should not get it itis the best but it's the most expensiveturns out it is not the most expensiveat all if you really go into that worldthere's more but and of course i'm anerd so i was like "this looks cool i'mgoing to get it." kind of a mistakeended up being a lot more than just onesaw so how do these things fit togetherwell these containers stack on thevacuum cleaner which until you have todrag stuff around you're like "why isthat a big deal?" it is a big deal youonly have two handsthis rail obviously the track the sawruns on the rail right makessense why wouldn't it the rail also hasa hole in it so that you can put it ontop of the sustainer so you can carrythe rail and the sustainer with onehand it's like these small details uhthis is a router not like the router weknow uh there is some ov�erlap in likethese terms router bits etc even thoughthey're completely different worlds therouter runs onrails in fact i think as far as i cantell if they sell something that couldpossibly run on one of theserails it does they have a chainsaw thatruns on therail then like stack so you can nowstack your sustainers high and eachpiece of equipment you buy comes comesin one of these sustainers so you likeadd on more and what you see here is avacuum cleaner in the shape of one ofthese sustainers so you can now stackthem even higher and buy more of them uhthey have a battery system this is thestandard base platform that most toolcompanies will talk about they'll talkabout a battery system which is justthat they have proprietary connectorsthey actually added bluetooth to theirbatteries which initially i was likereally bluetooth works even worse thanhdmi uh but their bluetooth has uh talksto the dust extractor aka fancy vacuumso that it turns off and on the vacuumwhen you turn the power tool on and offit also notifies you when you leave yourbatteries behind which by friends i havefriends in the trade who tells me that'sa important feature because they lose alot of money by batteries wandering offand if you haven't ever bought any ofthese tools the batteries cost generallymore than the tools or at leastequal um they also innovated quitesignificantly this is the f the festivaltool domino which wasreleased about 20 years ago patentsabout to expire people are very excitedbecause now otherwise people may do itit basically takes mortising which ismaking holes in wood to put it togethermuch easier um i don't use it very oftenuh but if you're a you know furnituremaker you're going to use it a lot andit saved a lot of time and they patentedit they integrated with their dustextraction uh and the rest of the worldhas been complaining they're not allowedto do it they also so they so they havea culture of innovating patents and thenbuilding that into aplatform so you end up with these youknow stacks and the power of this isthat once you have one of these toolsright you want to buy more and they usethat to such effect that in the us theyreleased this independence day sustainerum which is just one of these boxes witha branded iron for your steak that saysfestival a salt shaker a pepper shakerand an apron for$189 uh and it sold out immediately soto me it's like okay that's what germansthink americans will buy and then itturns out they're right that is exactlywhat americans will buyuh they also have like some ridiculousthings like this which is a hand sandingblock connected to the dust extractorthey think like at some point like youcan't they can't sell you can't get tomake a product there unless youintegrate with dustextractor um this is a spirit level it'sthe best spirit level i ever used butthe really cool part is that it slidesinto the handle of the sustainer and ireally wonder like did some engineer golike there's a hole in the sustainer wecould slip the thing in or did they golike if we make it this way we can thenmake a spirit lever to slide in i reallywant to ask them this question actuallyum when you buy the tools they come withlike all the insets and little displaysthat tell you where everything goes alsohelpfully reminds you of the things youdidn't buy because they were too cheapso that you can then go and buy them andlike fill out yourcollection but very useful becauseeverything has its you know its placeand so this is a hundred-y old powertool company and if you think about liketrue platform plays uh to me therearen't that many and and now i'm goingto define like how i i view platform isif i am a user of a platform and thatplatform provides a new product orservice or function that i need will ijust use it because i already know theplatform right like if apple releasesanother headphone i'll probably just buyit because i know it'll work with myiphone do they best do they make thebest headphones probably probably not ihaven't even looked the last 10 years ijust know that i won't have the likebluetooth kit if i if i buy someoneelse'�s right um in the real world ithink lego is a great platform playtwice they started off having like athousand different products they wentdown to lego twice they went away fromthe brick twice they nearly wentbankrupt and in fact my my 11-year-oldwas just complaining because he bought ithink like lego play and the charactersare like larger so they don't fit withnormal lego things and he was very madbecause he couldn't put it in the otherslike so they broke the perception thatall lego works together and so when wethink of this as how we design like ourproducts or platforms and we provide youknow envy level is this the notion isyour platform provides a promise rightwhatever that promise uh is and when youadd new features new functionality toyour platform products it has to livewithin that promise and if you breakthat promise your users will get sadconfuseduh annoyed angry uh and you know so foryou know fastly has these kind ofcorresponding to to those i compare dustextraction to network right like when itworks no one cares when it doesn'tthere's everywhere uh same withyour network uh so there's some thingsthat like silently are part of it someof more you know there but if if yourelease if you do something that doesn'tmatch what your customers wantum they uh they get upset other way oflooking at it is like if i need a toolthat they make i buy it if i they don'tmake it and they don't make that many ispend like three hours on youtubewatching reviews because no one actuallywrites anymore uh and then uh i have tobuy the tool and hope it works and theend result of that is that like if youare a platform customer or a platformuser platform does it why do you go forthat thing that already exists wellbecause you get value quicker right likeyou already know the ecosystem you knowthe tooling you know the lingo you knowall of it so you just the cognitiveoverhead of the act of adopting a newthing is much lower if it fits into theplatformright and this is kind of a i think thisis more of a maybe a philosophicaldefinition of platform but over the lastuh two years i've really shifted my wayto thinking about it and i think itapplies you know from very low level allthe way up which is core of it is thispromise right promise of what youdeliver tell you like they sell a anglegrinder that doesn't do dust extractioni bought it and man i was pissed forlike a weekend when i realized that itdoesn't and even worse they licensed itfrom another manufacturer and thatmanufacturer hasuh dust extraction on their anglegrinder so some product manager atfestival was like screw it we're goingto ignore all of it and just do it easyuh and and this is hard right becauseyou have to think about like globalmaxima versus local maxima right likeit's easy to optimize in the moment andforget that then you break this greatpromise that was the detour hope it wasuseful that's the set ofplatform um another big importantuh lesson of fastly for me isoutliers um we spend a lot of time onoutliers um we spend a lot of timelooking at at data and performance umthis one is pretty cool this is thelatency to our pops uh around the worldso this chart this started with an emptymap and the dots are just drawn based onlatency um and you can see no one livesin the us uhmidwest uh for example no one lives incentral uh in uh central south americauh north korea does not exist southkorea is an island uh there also someother pretty cool things like here youcan see the nile there's a lot of peoplethat live next to the nile so we we diginto data uh a lotand we we spend a lot of time looking atmetrics and like where do you start withmetrics and i'll just cut the chase likeyou should never look at medians rightlet's just always look at 99 percentileor higher uh at at scale and this is anexample of a thing that we were we weredebugging and uh if you look i don't youcan see that really closely but that isthe uh 6ix9ine p which at like 40million requests per second happens 40times a second so it's like yes not thatoften but pretty often still um and whatwe founduh is that by looking for these you� canactually usually improve the the entiresystem so we spend a lot of time on thisso we in this case there wassome some things that shouldn't be doneinside an event loop let's say that wasbeing done uh that unnecessarilysometimes slowed down the event loopuh by accidentally doing file io uh wefixed it and you can see the latencyhistogram but the latency histogramactually dropped even for the lowerninesuh so four 9ths uh 5 nice and six ninesand then you know you do a control andyou roll back and you see if it fixes itand then you quickly roll it out toeveryone so they uh the top number thereis about 80 milliseconds and the uh the49 one i think is about 5 millisecondsum so for most users like they want toknow this but uh as we've we'vedeveloped a focus a culture of reallylooking at outliers like you findoutliers they're not something you don'tignore them because they're outlierslike you go after them because they'reoutliers because the outliers is alsowhat your more mostumsophisticated and uh technical savvycustomers will find or users will findand then complain about and then they'llyou know you have to fix them anyway souh is it's it's just um it's just becomereally part of our culture and uh ithink i rarely look at anything below uhp999 but really rarely on anything it'sprobably some some damage of the scalebut p9939s are running a little over because ofthe delays soum the other thing to that reallylearned from from the fastly years is tofocus on outcome like keep in mind likewhat what is important like what are whyare we doingthis uh what is your user trying toachieve what is your customer trying toachieve and are we doing the thing thatwill help them do that better or are weneedlessly innovating on something thatwe shouldn't and there's a lot ofneedless unnecessary innovation um inthis world i'm staying at a hotel wherethe room key is like half sized andevery time i lose it in my pocket and mywallet i go like some person spent timeinnovating on this room key to thedetriment of all hotel guests they couldhave spent it on innovating on anythingelse um there's a lot of that going onum and uh that leads me to this wordwhich is a terrible word and and some uhgeneral i think uh career advice toanyone which is never say somethingshould be fixed because of technicaldepth because it can at best mean uhwell in the best case it means what itmeans which is the cost of maintainingthis code is higher than the uh benefitof maintaining it is most cases it meansi don't like thiscode or sometimes it means i really wantto rewrite in this new shiny language orthis new shiny thing because that's whati want to do and so i kind of bannedanyone from telling me something shouldbe fixed just because this actually haveto specify uh is it credit card debt oris it like a good mortgage like one ofthem is bad one of them is not baddoesn't automatically make it bad andthat is in context of outcome right it'sthe operational value of code um isbased on the time in prod code so ifyour if your prod if your code has beenrunning in prod for a year and your newcode has never been running inprod doesn't mean that the new codewon't replace that value pretty quicklybut at that moment in time the value isthe opposite and the longer the new codeis not inprod until the longer until it realizesany value and therefore will probablynever realize value um and so then thisgoes to uh another thing that we focus alot on at fastly which is uh how toprevent global outages which you thenreframe as how do you cause a globaloutage so how do you cause a globaloutage it requires a global state changeand what is a state change well it's asoftware release or a config release ora workload now workload is a little bitout of as a platform provider right likeworkload is a little bit out of our handlike customers could choose to dosomething different shift vastquantities of traffic to us differentforms but code and config those aredefinitely things that we control and ithink uh one of the lessons we havelearned is thatuh there really shouldn't there reallyisn't a difference between code andconfig if the code is out there for ayear but hasn't been run because theconfig is off and and you flip theconfig and it's really the same thing aspushing the code and so you should justthink of them uh as the same as the sameand so you know if your blast radius orif your change is global you have tostop you have to stop and think thereare very few uh things uh that actuallyhave to be a global immediate statechange uh and from a from a networkingpoint of view there is one that's likeunavoidable which is anything to do withbgp routing is instantly global nearlyinstantly uh but most othersuh yeah also at scale there's no lab ordev in the world that will help so theseare some of our like rules they're notsuper hardum but the the no fast rules but fastrecovery and then every known instanceof crash in a canary production musttrigger investigation immediately uhlike you can't let any of it slideum the the one that i think is the mostimportant is the the rapid roll backwhether that is config or or code andideally um automated rapid um rollback maintain situational awareness whentiming production advantages we had anoutage once in a pop where so our popshave four networks in them there's fourswitches all machines are connected toall four networks all internet isconnected to all four switches middle ofthe night someone is doing a softwareupdate on the switches the first switchgoes offline second switch goes offlinethe first switch goes offline and thenthey do the full switch and the pop goesoffline because they never checked thatthe other switches have come back and uhand so when you're doing any of thesecritical changes like maintainingsituational awareness is importantuh and then to end um i wanted to sharesomething i had a little more but youknow um we've been uh i' you know i'mold i i view this lm stuff somewhatdubiously but i have found we found ihave and we have found a lot of value infeeding uh all of our incidents throughhistory into lms and training them andit's uh so if you haven't if you haven'tplayed of that i can highly recommendthat we now have the ability to uhclassify get much better much fasterrecollection of pre previous incidentsuh and also getting summaries so this isa we put we sent this out uh tocustomers uh but this is a pretty gooduh summary of a very very very longuh investigation covering many zoomcalls and many chats and many log filesand many uh things and it's uh it's beena very u significant help as the networkhas grown larger than any person'sability to keep it in their head um so ican uh very much recommend this liketrying to use these tools on your owndata to uh analyze uh any form ofincidents and especially uh incidentsthat didn't lead to like full-on outagesbecause you're notreally so in the in the air safety worldright you treat near incidents asfailures and i think in our world wetend to deep you know operations we tendto treat near failures as successes uhwe try really hard not to but you knowit's like human nature but you don'thave as much time to debug them talkabout them and these tools have beenvery very helpful in allowing us to uhinvestigate these near failures morelike we would uh any other like afailure that actually occurred uhso and uh the final kind of rule i haveis that the abstractions are key um youhave to understand you know little bitunderneath you a little bit above youyou need to get the abstractions rightand do people here know what awinchester mystery houseis i see some hands okay it's in sanjose south san francisco if you have anarchitecture team and they are in thatarea send them on a group event to thewinchester mystery house this is animage from this woman built it over like50 years without any plan and there aredoors going into walls going intowindows it is like amazing and it's anexample of what happens when you don'tplan ahead when you don't haveabstractions and uh it's it's a greatteam building and exercise for anyenterprise uh architecture team for surei highly recommend that um but with thatum thank you so much[Applause]2025-04-15 21:57:53.140975 & K&��Q9#��YAKPNuLwXNkNQall right we are good to go uh welcomeeveryone to my session failure is not anoption drupal execution plus dapper isrocket emoji i think we can translatethat as amazing uh yeah thank you forspending this half an hour after lunchwith me i think it's my job to keep youawake hopefully I can manage that uhlet's get going um I think we are prettymuch familiar with u failure in it rightfailure is inevitable i think we alreadyeveryone experiences probably firsthanduh maybe we caused even some failures idon't know but definitely as end userswe we probably are familiar with a lotof it failures uh usually it's alwaysDNS that's causing it i don't know whyuh and maybe some of you are as old as Iam and are familiar with this Windows XPmessage information message uh that saystask failed successfully which is ofcourse a very weird thing to show toyour users uh but that's actually whatthis talk is about failure is inevitablethings will fa will fail uh so why notmake sure uh things will failsuccessfully h so maybe we can evenautomatically recover or maybe we canmake the impact on the user as small aspossible so I'm Mark Der i'm a developeradvocate at Digrit a company founded bythe co-creators of Dapper open sourcei'm one of the Dapper community managersuh big fan of pixel art as you can seeuh also a big fan of VS Code i'm runningmy entire presentation in in VS Code iwill share a QR code with the GitHublink later so you can definitely see allof the slides and all the demo codeyourself uhlater so I did a bit of research aboutum IT failures and I came across thisvery interesting blog post it's alreadyquite old it's from 2015 it's on the LEE spectrum um blog post um but itdescribes lessons learned from a decadeof IT failure so it it's quite a longہ�8#��_AgMDC1zzHabkwelcome i hope everybody's doing welland I hope you're not too exhaustedbeing Friday day five well day five forme because I've been here sincemaintainer summit onMonday but welcome to the session and wewe want to talk about the tag networkand the the cloud native networklandscape just as a kind of a quickpreface how many of you folks are awareof the the the network reboot or the thetag reboot that's happening inside ofthe CNCF at the moment a few people areaware of what's going on okay so one oneof the things about that is that tagnetwork is going to get merged into taginfrastructure we're going to dig intothat in a little bit but everythingwe're going to talk about is stillrelevant uh regardless of the fact thatit might be tag infrastructure that yousee at the next KubeCon and wiց�7#��iAXelZnqurT2shieveryone uh i'm uh arur bergman uhfounder and uh cto of pasti and uh it'suh 2025 and uh connectors are still uhnot workingwell uh so i founded fastly in 2011 umit's uh it's a honor to be here i havebeen an open source developer for a veryvery long timeum i started off in open source of thepl5 core team uh in the previous uhcentury uh and uh has ever since beeninvolved in a in a lot of different uhopen source projects products and uhevents and uh 2011 i started fastlywhere we make the internet a betterplace all experiences are fast safe andengaging and so what is fastlywe are a uh network around the worldwith servers that sit between users andthe central cloud locations providing uha whole bunch of services and you'relike arer this is a really boringmarketing slide and i'll be like that'strue justwait uh what we actually are right is wehave about a 100 locations around theworld these locations all have physicalservers um i understand today a lot ofuh software developers have actuallynever been in a data center uh whichlucky you uh but we still have uh plentyof themand as a you know someone who'sbenefited immensely from open source uhand i think to really kind of explainmore what we do to this audience we havea program called uh fast forward wherewe sponsoruh all these wonderful projects withhosting uh so essentially whenever youdownload a software packageuh on the internet it probably comesfrom a faster service ��th thatI'm going to hand up yeah thank you Nickuh my name is Jun Hushi i'm from Chinayou know an open source softwareengineer from Huaweiyeah andyou okay uh today let's talk about uhthe several parts uh first one is whatis tech t tech network so it will beunder the newtag framework yeah but we want to uh letyou know what we did last year yeah andthen the second part we will talk aboutthe overview of tech network projectsand we will cover several projects inthe sense land scopes for the networkscope yeah then we will uh talk aboutsome works we did in the last 12 yearsuh sorry last 12 months yeah and thereare several projects in the last yearthat uh maybe evolve from the sandbox toincubation and also some new projectsthat on board uh sandbox then we have aquick demo for uh new project Kim meshyeah then the last uh last two partwe'll talk about uh how tech networkwill do after the tech reboot and thenuh maybe uh before the new tech rebootis happen we want to let let all of usknow how to participate in the networknetwork workyeah so what is a tech yeah attack is atechnical advisory group that is a longlived group that reports to thesync yeah uh the tech responsibilitiesis uh so many we can see see here it isto link the uh end users and the projectcontributors and also the TC membersyeah we need to identify the gapsbetween the uh sens project and uh helpthem evolve to uh fill the gaps betweenthe uh enduser yeah then we also have to educateand inform users with unbiased effectiveand project uh practically usefulinformation yeah we need also need tofocus on attention and resources on onhelping uh foster projectmaturity yeah then we also need toclarify relationships between projectsyou know there are many uh projectsfocus on networking uh especially uhwhen we uh look at the keynote or listento some sessions there's many projectsfocusing on focus on AI gateway rightuh we have to also engage morecommunities and create an on ramp toeffective Tuesday contribution andrecognition and reduce some project workaround on Tuesday uh you know Tuesday isvery busy Okay for kind of work we havewe need to help themto help them to do some part of the workyeah and avoid creating a platform forpolitics between vendors yeah provide aletter for the community members uh toget involved with the uh technicaloversight of s projects yeah as a partof this tags are expected toactively nurture diverseparticipation so so our mission is ispredominantly looking at that so we'relooking at trying to enable widespreadwell this is actually from from themanifesto but we're just trying toeducate users in order that they can useprojects inside of tag networksuccessfully so to to kind of foster thethe community the communication and allof those things which help us to buildthe the resilient systems that we needtoday and obviously we're concentratingall on cloudnative uh there there are four of us inin in sort of the what would you call itnot I wouldn't say leadership butprobably sort of heading up what what'sgoing on in Tag Network we have Zach Leeum Jon Hu and myself and what we do isis try to kind of to run through that sothere's a number of responsibilitiesthat we we kind of carry out the runningthe weekly meetings doing sort of duediligence on new projects and and thingslike that but the I think the key thingis about community and and we want thepeople to be having the conversation soit's you know it's trying to kind oflookat that side of things i'mgood i think when when it comes to kindof looking at some of the things thatwe're we're doing on a daily basis umthe the important thing is that whenwhen you're looking at the the sort ofthe tag network or the the networkingsort of arena what you're trying to dois trying to look for areas that arekind of missing inside of the currencyand safe project list and we want to beable to to encourage people who arebuilding those projects or people whoare interested in those areas tocontribute their knowledge but but alsothe the pol the the size of things andthen we work with the projects where thebest we can to to help them� to growtheir projects once they're they'reobviously submitted is a a CNCF sandboxor to help them get into the CNCFsandboxso if we look at a few of the projectswhich are currently um under tag networkthere's quite a few and it's quitediverse as well so you you look atthings like we we obviously havepsyllium linkad so kind of your servicemeshes core DNS those are the graduatedprojects and then if we look atincubation we've gotnat so event bus orlike event sort of um management systememissary ingress which is the the theAPI gateway project and and the uh envoygateway which is now emissary and thenyou have things like GN uh gRPC and CNInow sort of gRPC C is is is notnecessarily a kind of a a networkingproject as such but when when you sortof think well it's not a service meshit's not a an overlay network or asoftware defined network or a kind of agateway tool but gRPC also falls inthere and it's it's it's incubating uhin instead of graduating so we I hopethat we'll see that move up tograduating um because it is a verymature project and then the sandbox areathe the sandbox is that that kind ofthat entry point and I think this iswhere there's some really reallyinteresting and new projects which arekind of coming into the fold and and oneof the things that we've we've seen whenwe look at some of the changes inside ofTAG network over the last 12 months andthese are the projects which have umbeen admitted to to sandbox with thewith the exception of K8GB which iscurrentlyum staging forincubation but the core thing thatyou're kind of we're seeing through hereis that we're we're finding that peopleare submitting tools which are managinglarge complexity so you you have thingslike lockb which is looking at how youcan do multiple kind of cluster loadbalancing how you can kind of deal withum the the redundancy and then quadrantwhich is kind of looking at theconnectivity and and adding networkingcapabilities um such as being able toallocate IP addresses to on a on yourown sort of name space inside the podand then cube slice which allows you tokind of manage the traffic flow acrossmultiple clusters so for example youmight have an application which needs totalk to another application which couldbe on a different cluster coupe sliceallows you to do that securely you candefine all of the policy to do that italso enables you to do things likemulticluster failover so if you if youdo run a kind of a regional setup onyour Kubernetes clusters coupube slicecan actually help you to to route to adifferent cluster in the event that youyou need that and then I think you startgetting in some of the reallyinteresting things so you start lookingat like semant and k mesh and and what'sgoing on here is you're getting achallenge on some of the traditionalpatterns that we would use in in servicemesh so a lot of the kind of the theestablished pattern or that that sidecarpattern which is currently progressing imean obviously now has the ambientpattern but semant kind of looks at thatfurther as well and that they're kind ofsaying well you know we want servicemesh but we want to be able to deliverservice mesh w with a kind of a lowlatency and with a low overhead soSamant kind of proposes that hey we'regoing to take a different approach we'renot using the sidecar we're going todeliver uh an integrated service meshthrough using the um Java well for forJava applications and it's kind of usingthe sort of Java capabilities to be ableto modify the bite code in your systemso it's you know a really smart solutionto be able to provide extreme lowlatency and extreme low overhead butretaining a lot of the service meshcapabilities connect i mean um isanybody aware of connect but it's it'sit's a I've used gRPC for for many yearsand I'm a great big fan of of gRPC forbuilding connectivity with for my APIsconnect I absolutely love and I've I'vebeen using a lot of connect recentlybecause I think what connect does whilstit's compat retains compatibility withgRPC it enables some of the the niceenhancements so I don't necessarily haveto use um like gRPC curl or somethinglike t�hat in order to to to kind ofexpose my APIs connect will give me aum a sort of a JSONbased HTML sorry HTMLa JSON based plain HTTP API but also Ican retain the the gRPC and I don't haveto do any extra code for that so it's aa new project which has just hit sandboxbut um if you're using gRPC I definitelyrecommend checking that out because itis um an incredibly good project um Kmesh you're going to hear about a littlebit more in in depth but then you knowthings like OVN Kubernetes we're lookingat how the sort of the networking worksand of course K gateway which is the thelatest submission which um formally isum glue uh from from the the good folksat Solo um and and all of those as I sayare kind of challenging the problemsthat we have today so I definitelyrecommend having a look through some ofthose projects because I think you mightfind them very usefulyeah now sorry let's do a quick demo ofkimsh and first let me introducesomething about kimsh ksh is a uh ebpfbased uh s class of mesh data plan itmakes use of yeah uh the first demo isdynamic header routting yeah let's seehere we first need to installthese and also book info uh applicationhad been ininstalled yeah we we don't need thesetwo data plan here it just installed butbut nomatter yeah then we test the book infoworks as expected and then we'll deploya way we point for the reveals servicewe can see here we need to uh replacethe we point image with ksh we pointimage yeah after that we can say thereviews we point isrunning yeah then we uh also need tolabel theuh first we need to label the name spaceof the default to uh make it in inmanaged by ksh then we can can create av service that when match the header wewill send to thev1 yeah we see if we logging in as JSONthe traffic key will be redirected totheuh reviewsone uh if we log as other user it willroute to another uhversion this is a basic example uh thatsupported by most search mesh but we useEPFbased yeahokay clean up then next I will showabout how to do uh traffic split splitby weight in this example also we needto install the bookingapplication let'sforward we can see all the applicationhas been in uh running and here we alsoneed to test the book info runninghealthy then we the same part deploy away point for the re reviewservice thennext check the way point is upyeah then we need to config theroutting routting the traffic of 90% tothe to the v1 and then the left 10% tothe v2 version ofreal yeah after configure itsuccessfully we need to test test itwith curl we cansee yeah we can see about 90% of thetraffic is sent to the reviews V1 andthen the last left left one left part issent to review uhV2 yeah oh nextpart okayso I think what's what's reallyinteresting about the likes of K meshand the likes of semant is that I I beta lot of you folks are probably usingtoI mean how many is it maybe easier tosay how many isn't how many people arenot using in their clusters you have onetwo people threepeople you you probably should likecheck out umto I think uh wellto or anyof the service mesh offerings I think itit offers an incredible utility to toyour um to your platforms But the thenice thing is that it's not a case ofsaying well hey if you're already usingISO then the likes of K mesh and thelikes of of cement you you sort of ripout your when you put these things inall of the new kind of software we'reseeing is kind of being built to to workwith STTO so for example with withsemant semant will work with yourexisting ISTTO workloads but if you dohave that requirement that you want thissuper low latency and super low likeresource overhead for your Java then youyou can kind of choose whichapplications should be managed with asidecar or an ambient pattern inside ofand which should be inside of semant sothat that's I think really really reallynice to see it it's it's more of a kindof an augmentation of the patterns andum some some great work from the thedevelopers to putting the effort inthereso the question now is that well tagnetwork so tag network will not be athing uh kind of in in a sort of a acouple of months time the all o�f thetags inside of the CNCF are being beingrebooted and the the rationale behindthis is that it's it's it as a as kindof speaking as somebody who is aco-chair it can be quite time consumingin terms of what you're kind of doing ummostly everybody who is a chair or atech lead inside of the any of the tagsare kind of doing it in addition totheir day jobs and it's found that likeover the the time that some of thepeople have just left so we've we've hada loss of domain experts inside of thethe tags um some of the leadership youknow people just get really sort ofoverworked overt tired have differentwork commitments and things like thatand and also the work that the tags aredoing was was kind of in some waysbeyond the scope of what they wereoriginally intended to do so you'll seea lot of the working groups inside ofthe tags is anybody a um contribute orpart of any of the the tag workinggroupsno it's a great opportunity we'll get tothat in asecond um yeah and then so you know someof the tags are kind of just focusing onthings like working through the thesandbox application process and notnecessarily being able to kind of dowhat the initial requirement or theinitial desire was and which is to kindof foster and grow the the communityknowledge so what's happening is thetags which were currently eight is beingreduced down to five so three of thetags are getting merged into the theothers and the working groups whichcurrently exist and there's there's somequite big working groups like tag um thethe infrastructure working group whichis under tag app delivery I think it isum is is actually a very sort of biginitiative so there's it it's kind ofseen that well should that be a tag ofits own and we'll we'll kind of see thatbut what we're we're kind of going tomove to is we're going to kind of get tothat situation where you have thisrationale where the tags are providingmore information and more utility foryou as a community as opposed to justkind of performing an administrativefunction for the the project and ofcourse being able to to work with theTOC and the TOC who are you know they'reincredibly busy as well and probablymore so than your tag chairs and beingable to kind of help and share thatworkload because that affects everybodyas well so from applying to a sandbox ortrying to move up to incubation thesmoother that runs the easier and moreefficient it is then the better it isfor for the communityso we're going to have this this setuphere and there's there's going to be aninitiative so an initiative is is reallya kind of a a shortterm project and whenI say short-term it's more somethingwhere you're thinking about research soagain if if you have an idea for aninitiative the intention is that thecommunity will spin these up so forexample we say well you knowmulticluster is really really complexand there's no um there isn't anysoftware or or there isn't any definedpatterns which is really good to workout how to do multicluster umconnectivity or how do we do kind ofredundancy or or something in aneffective way so the initiatives will bespun up with the intention of kind ofresearching into that and potentiallyproducing things like white papers thatcan be then put out to the community ordeveloping the patterns which we canthen share out with the community insome cases you might find that aninitiative is is kind of required to bemore of a long term so I mean AIinitiatives I think like I I think everysecond word in in most of the the kindof the keynotes has been AI right like Imean it's it's it's it's without ashadow of a doubt that it's it's goingto replace all of us one day and uh wellI don't know whether I'll be happy orsad about that but we'll we'll kind oflike uh we'll reserve that and whetherI'm I'm old enough to retire when ithappens but uh parking the the the thedystopianvisions something like artificialintelligence is is going to be alongunning thing it's here to stay so weneed to kind of have the formulation inthe community around that to kind ofunderstand how it it best serves you allso I I I would imagine that whilst it'sthe first formal initiative it willbecome a sub project if not a a tag inits own right in in a given timeso the tags uh as they will stand youwe're going to have developer experienceworkflow uh workloads foundationinfrastructure which is where tagnetwork will be um placed underneath uhoperational resilience and security andcompliance and they're the the kind ofthe I'm not going to read through all ofthese but the the responsibilities foreach of thetags why I I kind of think this isreally important is that I would lovefor every single one of you in this roomtoday if you have and I'm I'm positiveyou have specialist knowledge in one ofthose areas that that you kind of gothrough and and and help contribute itdoesn't have to be a an administrativerole or a formal role but justcontributing your knowledge andcontributing your your kind of yourskills helps us all to to growand in the time being whilst the thetags are being rebooted then we've gotthis information here of how you can getinvolvedyeah before the tag infra is establishedwe can join the you can join the the thetech network herev if you are interested in any s stuffwe can join theslack yeah and if you are interested innet mostly you can first join our GitHuband leave a message there and then wealso have a meeting list you can alsosend usemail and the most important part isthat we have a by weekly meetingwe welcome you to come and contributeany ideasyeah and on the bi-weekly meeting thethe the the meeting is is always at 9PST we we have the first um one afterKubeCon i think it's next Thursdaysometimes things get shifted aroundbecause of KubeCon but if if networkingis something that you're interested andpassionate about regardless of thereboot it would be lovely to be able tohave that discourse but to also in orderto get your opinions and your knowledgeabout what you want from the a sort of aa CNCF formalized structure within TAGinfrastructure and how it can best serveyour networking needs so starting thatdiscourse now allows us to help reshapethat tag rebooton a a last note the something which isincredibly important to the CNCF and toto all of us I believe is that we wewant people to become mentors we wantpeople to be able to help specificallywith underrepresented groups in in opensource and and as a call to action thereplease please please sign up ummentoring is a is a wonderfulopportunity inside of here it helps youto grow yourself as well as to helpteach and grow others and so you're kindof like paying paying back some of thethings you've learned so please umplease please please help us out therebecause we we genuinely would love lovethat help and with that with sort of afew minutes ago I would like to ask ifanybody has anyquestions you don't have any questionsdid I explain everything perfectly orwere you notlistening i explained everythingperfectly i'm going to take thathonestly if you do have any questions umif you if there's anything that youwould like to know about any of the sortof the networking ecosystem inside ofthe CNCF please uh reach out to us it'san absolute pleasure to to chat and andwe're also kind of well it's it's dayfive of the day three of the formalconference day day five of for for manypeople but I'll certainly be hangingaround so feel free to ping me on Slackif you'd like to sit down have a cup ofcoffee and I can help you with anythinghow you can get involved um all of thetag chairs and all of the tech leads inall of the tags are um going to gothrough a revote so there's no formalsmall movement of kind of people who arein existing roles that they will keeptheir roles which I think is a wonderfulopportunity for everybody who wants toget involved you can actually proposethat you uh or submit yourself to be achair a co-chair um a tech lead or onceall of the kind of the working groupsstart get spun up the sorry the subprojects and initiatives get spun up tobe able to help and and contributethere but for me I I just want to saythank you so much I appreciate all ofyour time listening to us todayyeah thank you very much[Applause]2025-04-15 21:57:53.781264�blog post i just took one screenshot outof it um but there there's this legendand the biggest circle is like 10billion of monetary costs of of um yeahof IT failures and here in this diagramyou see across like 2005 and 2015 acrossthe whole globe like very big circles soyou can imagine the the monetary cost ofIT failures is running in the hundredsof billions and this was like 10 yearsago but I'm I'm pretty sure that maybethe cost is even higher these days sothe link is all in here you'll get theQR code later so definitely check outthis link yourself it's it's very niceand it's also an interactive blog postso that's that that'sgreat um yeah one of the reasons uh andmaybe it failures are a bit moreprominent is because also ourapplications are becoming more complexyeah so this is even not a very complexthing but yeah we are making more andmore distributed applications and theservices need to communicate with eachother either synchronously orasynchronously yeah there's lots ofmatches brokers involved lots ofdifferent state stores involved and yeahthere's a lot of moving parts so therehad a lot of things that can go wrongand of course that will gowrong and that's nothing new right sowho who heard about the fallacies ofdistributedcomputing okay so some some hands sothis this list of false assumptions isis compiled in 1994 so quite somedecades ago so it's pretty much wellknown that distri distributed computingis is is hard and that we don'tunderstand everything when we startworking with it uh I'm not going intodetail all of this because this is notreally a theoretical session about thisi can definitely recommend you some ofthese links that I've included here andone of them is like a playlist whereUran from particular software describesall of these fallacies in in very shortvideos so definitely have a look at thatnow about five years ago some some smartpeople got together and created uh thisopen source project deer the distributedapplication runtime and over the yearshas been used by really hundreds of ofcompanies across all kinds of verticalsuh yeah to really help developers uhreally run and build their distributedapplications so it really makes it muchmuch easier to build but also run it atscale um so last November it became agraduated project within CNCF so ofcourse the the deer project is reallyreally proud to to achieve that and onthe CNCF website also on the deerwebsite there's lots of case studiesmentioned on how um companies are usinggapper and how they use it successfullyso yeah if you have any doubts aboutdeer definitely have a read uh in theseusecases so um anyone familiar here withwithdeer couple of people who who's using itlike in production right now okay welldefinitely stop by me later um so Deerruns in a sidecar next to yourapplication and you can use any languagesince it runs in a separate process andDeer offers a lot of API that you canuse as a developer to quickly build yourmicroservices now I'm not going to talkabout all of these APIs because this isreally a session about durable executionand workflow um definitely I recommendyou if you're new to Deer I canrecommend you to try out Deer Universityi'm working on some educational contentabout Dapper now I will also share a QRcode at the end of the session where youcan access this it basically runs like asandbox environment in the browser soyou don't need to install anything totry outDapper so getting back to the problemthat we're solving yeah so we know thatsystems fail and we need to recover fromthose failures ideally automatically andbut also limit the impact of these ofthese failuresso and and yeah one of these solutionsis durable execution so durableexecution is means like running code ina stateful way which sounds a bit bitweird what we've actually doing withthat for for quite a while so in casethe process that runs the code if thatum process fails another process can cancome up um it can uh read back the statebecause that state is persisted on ondisk somewhere and it can then run thecode to completionnow like I mentioned and may maybe thisjob execution u�m term is maybe a bit newbut maybe you're familiar with workflowengines and workflow engines have beenimplementing this uh this dup executionuh stuff for for many years so who'vebeen using like workflow engines ingeneral to automate their businessprocesses yeah I see see quite quitesome hands right yeah so if you're ifyou're new to like yeah workflow enginesand workflows in general yeah so aworkflow consists of many tasks oractivities and all of these activitiesare like small small units of work maybecalling out to another API uh the savingsome state to a database or retrievingsome state uh publishing a message andso on so your workflow typicallyconsists of all kinds of activities thatmake other calls and some business logicand that the way that yeah workflowengines and durexion work is because allof these state changes they are storedto disk to a state store so I've createdan animation just to give you anindication how this works in generalbecause there's actually a lot going onso as soon as you start a workflow umthe input of that uh workflow isactually uh stored in uh in the statestore so this workflow ID and inputthat's stored then the first activitygets uh scheduled well that firstactivity is actually then being executedand the input and output is also storedin that database so then we don'tcontinue immediately to the secondactivity but the workflow will startfrom top again so it will read the stagefrom activity one it doesn't execute itagain because it's known that it alreadyis there then it goes to the secondactivity and again everything ispersisted to disk and again we replayfrom the top before we going to activitythree so as you can see there's a lot ofIO going on between the workflow engineand and the state store and yeah this isnecessary because yeah a workflow u cancan stop can can break any time but allof your u your status workflow ispersisted to disk so this is in generalhow these these these workflow enginesand durable execution worksnow there are different ways how you canauthor workflows right so when typicallyhad these systems sort of with withvisual designer so you can more clickand drag all of these um these theseworkflows they're also like um um stepfunctions which is more like JSON basedand but they're also like real workflowas code solutions and I'm a big fan ofthese workflows code solutions becauseit's part of your source control and sowhen you use a bit of PR which with yourchanges it gets reviewed by someone elseyou can unit test your workflows whichis ideal because workflows shouldcontain like business logic which isquite valuable if if that's correct soI'm I'm a big fan so has some examplesof workflows code solutions like includelike temporal maybe people are familiarwith Azure dribble functions if youdoing doing serverless now and deerworkflow um is also like a workflow ascode solution so uh the deer workflowAPI has been stable in since the lastdeer release so that happened in 115 inFebruary uh and it's been like underdevelopment for for like over two yearsum if you want to know more about thesedup execution engines I also included alist to a an awesome GitHub list thatcontains like dozens of these differentum workflow engines because there thereare much more outthere so when you start using theseworkflows code solutions you can you canum use different workflow patterns withthese with these things so the mostbasic pattern that you can use is calledtask chaining and which is basically youone run certain activities in a specificorder and you need the order becausemaybe the output of activity A is usedas an input for activity B and so on sothe order isimportant another pattern is the fan outfan in pattern and here there's nodependency at all between the activitiesyou can run them uh in parallel or or inlarge uh batches the nice thing aboutthis though is that your uh workflow canactually wait until all of theseactivities are completed and then youcan aggregate over the results of alltheseactivities another pattern is themonitor pattern and this is great whenyou want to like a reoccurring tas�k somaybe you want to run a night nightlycleanup job in the cloud and and removesome cloud resources so the first thingthat you can do is you can uh implementlike a call to your cloud to clean upsome activities then you can instructyour workflow to create a timer wait 24hours the the workflow will um will willunload it will not do anything and thenafter these 24 hours it will becomeactive again and then you instruct it touh restart as a new fresh instance sothis is not the same as the replaybehavior I showed earlier because whenyou do replay it will um remember all ofits history executions but if you saycontinuous new it's a new fresh instancefinal pattern I want to mention is theexternal system interaction um so thisis great when want to implement a anapproval workflow for instance um sohere you can instruct your workflow towait until an external event comes inand then you can use the payload of thatexternal event to decide um how you wantto continue in yourworkflow and normally you wouldn't usejust one of these patterns in your uh inyour workflows you would actuallycombine several of these patterns sothis is an example where we use a bit oftask chaining we are waiting for anevent we also do a fan in fan out one uhand the nice thing of course if you useuh dapper in general you can use anddepper workflow but also all of theother deer APIs in thoseactivities then a bit about the deerworkflow engine so the deer workflowengine is part of the deer sidecar so assoon as your application starts up and adeer sidecar starts up there's a um GPCstream that's established between yourapplication and the deer sidecar becausethere a lot of communication needs to behappening for for scheduling yourworkflow and your application itselfwill still run the workflow but the deersidecar is responsible for forscheduling it and in case it it all goesdown the deer sidecar will actuallystart up your workflow again and willstart executing all of these things andof course the workflow engine isresponsible for persisting all of thatworkflow state to the state store andgetting that back into yourprocess all right almost time for ourdemo so I have a demo with twoapplications a workflow application uhthat talks to uh an inventory and I havea shipping application and we'll do somekind of a simulation of like of an orderprocess where we uh sort of validate anorder and interact with this with theshipping servicenow this is what the um what what whatthe workflow is doing so the firstactivity of that workflow is an updateinventory where we actually check do wehave enough inventory and if we do weactually uh mutate that that inventoryuh if we don't we immediately exit if wedo have enough inventory then we haveanother activity that um gets back theshipping providers for this product thatthat we're ordering this gets back a anarray of different shipping providersand then for each of these shippingproviders we will um retrieve by makingan API call to another service okay giveus the cost for this uh for for thisshipment so and this gives back likeseveral cost indications and then in ourworkflow we have a bit of code thatactually chooses okay what is then thecheapest shipper for for this productand then for the cheapest shipper wewill register this shipment if this isall okay then we end if this is not okayuh that means we can actually do acompensation action so we can actuallyundo our update inventory again and acompensation action is not somethingthat is built in into um deer workflowit just means you write an activity toundo something that happened in anearlier activity so we actually undoingthis update inventory uh when we callthisactivity okay let's have a look at thecode in this code I am uh in this caseI'm I'm showing C but you can uh writeyour deer workflows in also in JavaJavaScript Python or inGo so in this case we just have a aclass uh we inherit from a workflowclass that's given by the deer SDK andwe are uh running this method called runasync uh so the deer SDK gives us aworkflow context and we'll be using thiscontext to to call out and to schedule�all our our activities uh we give it anan order as an input so that's our ourown object that we provide uh and theoutput of this workflow is an ordervalidation result again that's just acustom object that we uh defineourselves so we are using this thiscontext to call uh to an activity so weinstruct the the deer runtime okay nowuh run this activity and as soon as umthe deer workflow engine will actuallyschedule this activity uh this code willstop executing it it will be unloadedfrom memory and once this uh activitycode has been completed uh then thiscode will actually start running againso so that's the replay nature I showedin the animationearlier um so we call an activity andyou always refer to activities by nameso in this case we're calling the updateinventory activity we pass it a apayload and we get back an inventoryresult and then we have just some if uhsome if statements which is our ourbusiness logic so if there's sufficientuh stock in our inventory uh then wecontinue um the next activity that'scalled is getting the uh the shippingproviders and here you see I'm adding inan additional argument that's called aworkflow task options uh so you canactually provide um a retry policy whenyou call these activities i didn't do itin the first example here um but you canactually instruct the workflow in casethis activity fails uh retry a couple oftimes and in this case it's just asimple constant retry policy but you canalso instruct it to exponential back offso that's that's totally up toyou um so this gives us back an array ofof shipping providers so what we'regoing to do next is we want for each ofthese shipping providers we want to uhask them okay well what's the cost ofthis um of of your shipment so that'swhat's happening next so we areiterating over the the shippingproviders in this in this for each loopum and again we are calling an activityget shipping cost uh but notice that uhwe are not uh awaiting this this callhere yeah so that means um the deerworkflow engine is not actuallyscheduling these calls yet we are purelyadding these tasks here to a list sothis is a list of task of shipping costresult uh and at this point we areactually scheduling all of these uhtasks and also instructing the workflowhere to wait until they have all beencompleted so this is the fan out fan inpatternhere so once all these activities havebeen done uh we just uh ask by hey givegive me the the cheapest shippingservice so I just uh just gives me thethe minimum the minimum cost here so atthis moment I have a shipping serviceidentified which is of the lowest costand then the next part is registeringthe shipment for this shipping servicenow I I've wrapped this in a a try catchclass to just to indicate that you canuse some other control logic inside aworkflow so um I'm calling a registershipment activity here but what happensif an exception is thrown in this inthis activity well we can then um catchthe u exception um exceptions arewrapped in a workflow task field uhabstraction in this case you can ofcourse always inspect what what theinner exception is of this uh of thisone because this is a really workflowspecific exception um but this is how uhfor instance you can do a um determineif you need to execute a composationaction all right so if this call thisactivity fails uh then we are in thiscatch block and then we do the un undoupdate inventory activity yeah so thisis our composationaction right just to show you an exampleof an activity there are multipleactivities but I'm just going to showyou one because they all have the sameuh the same structure again an activityis just inheriting from an workflowactivity that's based on the on the DeerSDK uh in this case the inventory isactually checking against a state storeusing the Deer API so that's why we'reusing the Deer client here uh we have arun async method again which is beingcalled and we're using the the Deerclient to retrieve some some state um sowe get back a a product inventory and wejust check if do we have enough of thisproduct uh in stock uh if not we ofcourse give back a result that we� don'thave enough in stock uh but if we do uhthen we make a mutation to this to thisstate and we save this state again usingthe deerclient allright now this is the this is the thestartup um part of of of the program umso what's important for for for deerworkflow and what might be differentwith other other workflow as codesystems you need to uh register yourworkflows and activities with deerbecause otherwise if you don't do thisdeapper has no clue that there areworkflows or activities present in yourcode so in this case we need to use theadd deer workflow method on the servicecollection where we need to register ourworkflow and ouractivities now when I run the demo I uhI will run uh I will execute thisendpoint there's a validate orderendpoint and uh what happens here we areretrieving a deer workflow client and weare executing the schedule new workflowasync method so here we are instructingthe workflow engine go please uh goahead and schedule this this newworkflow uh again you do it by name sowe pass in this work um validate orderworkflow uh we provide an instance IDbecause yeah each workflow needs an aninstance ID if you don't provide one itwill just generate a good for you but inthis case I'm just using the order ID asmy instance ID for this um workflow andwe provide the order as an input for theworkflow so notice that that we don'tget back the the result of this validateorder and and uh for this for thisworkflow right because everything isasynchronous so what we actually getback here is the instance ID and theinstance ID in this case is exactly thesame as the order ID so yeah um I couldhave just returned the order ID here butif you don't provide this order ID thenyou need to get back this instance IDotherwise you don't have any anything tocorrelate to what what correlates to myorder so we don't get back the resultbecause yeah we are triggering apotentially very long running process inthis case it's it's very quick but thiscould also take hours or days or ormonths right so that's why you don't getback uh the result of your workflowbecause everything isasynchronous have some other endpointstoo to update the inventory but Um Ithink it's almost time to to run thesample now so um as I mentioned uhdurable execution workflow engines uhstates needs to be persisted somewhereuh and you define what kind of state youuse via component files if you'refamiliar with with deer and you definethem uh in this YAML structure so inthis case uh I am using a state storethat's based on reddis because if youuse the deer cli you always get a localreddish container running um so Reddisis where where our state is stored andyou're not directly interacting with thestate we're using the the deer uh APIsto interact with this you're notinteracting with the state directlyyourself um deer workflow is built ontop of deer actors again you're notusing the actor API yourself when youare working with workflows it's justsomething that is useful for you to knowum and that's why you need to make surethat the actor state store flag needs isset to true so this is like aprerequisite if you want to useworkflow right just one more yl file Ipromise and then we're done um so uh I'musing uh deer run to start up myservices locally and in this case I haveconfigured which applications to run inthis yl file uh I'm starting up aworkflow application and I'm starting ashippingapplication all right so what I'm goingto do now is I'm going to use uh deerrun i'm going to use this command butit's it's all scripted so you won't seeme typing i just click a button and thiswill run so this will run bothapplications for me it will run the deersidecar for me as well uh so once thisis running uh I will use a rest clientfor VS code to make a call to thisvalidate order endpoint that will startmy workflow then I will stop allprocessing to indicate something severehas happened and our applications arecrashing and burning and not aliveanymore and then I will restart theworkflow with this dapper run commandand then we'll see what happens rightbecause we were actually aiming for likeautomatic� um failure recovery rightbecause that's what dur durableexecution is offeringus all right so I'm running the app nowso now our uh workflow application isstarting up now our shipping applicationis starting up also the deer sidecardare starting up so everything is runningnow now let me make sure that we firsthave some inventory so so this is thisis working so let Let me check if thisworks yeah okay so I have like fiverubber ducks in my inventory so that'sallworking okay so I'm now calling the uhvalidate order endpoint here and this ismy uh payload that I'm providing i wantto order uh two rubberduckies okay so Iam sending the request and then I'llwait very briefly and then I will stopeverything yeah I will stop right nowall right so the last thing in our logsthat we've seen I mean there's a lot ofthere's a lot of debug debug logginghere um but the last thing that washappening is um our workflow applicationwas um uh making a a service call to ourshipping application and that that's thelast thing what's happening here so thisis a uh a log from our shipping serviceuh that is going to retrieve theshipping cost for this order but beforeit could return this information to ourworkflow I have stopped all applicationsright to to indicate something somethingbad happened so everything is downworkflow is down shipping app is down uhso let me let me restart everythingagain so now I'm running deer run againuh which just starts up everything uhand let's see whathappens okay and again we see that heythe shipping services are contactedagain all three of themnow and let's wait a bitmore and soon and now the cheaping thecheapest one is selected that's uhservice B in this case and now we see alog statement called orchestrationstatus completed right so this isactually the deer workflow enginecompleting our work after a failure weonly need to start up the process againwhich if you're running in KubernetesKubernetes will make sure hey you'll geta new path and we'll start up everythingagain but I didn't need to make a callto this validate order endpoint againright because Deer realized okay I havea workflow that is not completed yet solet's let's run it again and I canverify it its state by making a uh getrequest and with this instance ID to getback the state of this uh this workflowso this is the workflow this was theexact instance ID that we we did run andthis is the final state this was theinput and this was theoutput all right yeah so this is howDeer uses the Turbo executionconcept um okay let me stop executingthis all right um I will skip the nextdemo due to time but I do have someother important parts tomention all right um so workflow s codeis great or workflows in general uh aregreat but there are still somechallenges when you're working withworkflows and so there are four points Iwant to discuss these are all describedin great detail in a blog post by ChrisGillum i definitely recommend you toread this blog post if you're reallyreally seriously considering to use touse workflows one of them is that yourworkflows should contain deterministiccode let me zoom in maybe bit more moreso um why does your code need to bedeterministic well your workflow isbeing replayed uh several times right soum if uh if during this um this replayum uh your behavior changes withinwithin the workflow this can cause amismatch between the state that isstored and what's happening duringruntime right because uh after eachactivity call all of the inputs andoutputs of this activity call are storedinto into this uh into the state storeand uh when during a replay when some ofthese arguments change there there's amismatch between between the states andthen you yeah you you have like a veryweird mismatch between state andworkflow so um what you you should notdo for instance is this you should notuse like um good new gooid which createsa new random good inside your workflowor you should not generate a a new umtime stamp in your workflow um have forthese like common use cases there areactually some uh some some helpermethods uh on the workflow context uhcalled new gooid and also contextcurrent UTC daytime so uh there are somehelper methods to make that a bit easieruh if you have other code that isnondeterministic just wrap that codeinside an activity call yeah becauseyour activity codes can be non uh can benondeterministic that's totally fine butmake sure your workflow is deterministicnow uh your activities um deer workflowguarantees an um um at least onceexecution of your activities meaningyour activities can be executed morethan once right so yeah what happenswhen you write your your activity codeyeah where there's like a not aguarantee thatthere's different behavior in youractivities so for instance what if youuh do an insert in in in a SQL store andwhen you provide a primary key so whathappens if you do an exact same insertwith that primary key for a second timewell that probably would not work rightso you you probably then better off todo like an upsert or you first do a readbefore you do a write uh so yeah youneed to check what what kind of codeyou're using um other APIs yeah maybethe APIs that you're using are itempotent maybe not so you you have to makesure uh that the code that you're usingis itempotent another tricky thing uh with thewith with workflows is is versioning umbasically whenever you make a yeahsignificant change in the workflow andmaybe you change the order of theworkflow or you you change some inputarguments to some other um argumentsthat is a breaking change in theworkflow and has again has to do withthe difference in between state and whenyou what you're using at runtime so forinstance I have this first version of aworkflow and that has the order itemwhich I'm getting as an input for mystate i'm using that as an input forboth my activities uh but then I changemy mind and the second iteration of thisworkflow is I'm using the order item asan input for my first activity but I'musing the result as my first activity asan input for my second activity now whenI deploy this um and there are stillsome workflows in flight and meaningsome uncompleted workflows um they havethe state that belongs to this versionmeaning they have the state thatcontains this order item as an input umbut now my new code expects the result awhich is actually a completely differenttype so that's that's a breaking changeso I think yeah there are severalworkarounds to this um probably theeasiest is to just version yourworkflows right so and here we see theworkflow name is really has a numberversion one version two yeah so that'sso never change a workflow just alwaysadd a new one it does mean that yourclient that's calling the workflow thatalso needs to be updated yeah becauseyour new version should now startcalling this one instead of this one butthat's that that's the easiestthing okay final one uh just make surethat when you are passing argumentsbetween activities make sure they are assmall as possible and because all ofthese arguments they get stored to thisstate store so in this case um um wepass in an ID which is small which isgreat and we get back a large documentfrom this activity and we're passingthis large document as an input foranother activity right here so thatmeans that this document is saved twicein the state store it's first it's savedas an output argument for this activityand it's saved as an input argument forthis activity so that it's veryinefficient so in in that case don'tmake your activities very small andgranular and give a bit moreresponsibility to your activities sojust make sure in this case do both aget and an update and a save in oneactivity instead of like making verymicroactivities all right my time is up um Ihope I've shown you a bit about aboutthe workflow u I've got two R codes foryou to to to share so the left one isthis GitHub repo where all of the slidesare and all of the source code is uh theright one is a link to Deer Universityyou can use that if you're completelynew to Dapper but in a couple of weeksI'll release a new lesson that'scompletely about Dapperworkflow if you have questions just comesee me afterwards thank you very much2025-04-15 21:57:54.394261�these days i don'tknow if you do but best not to uh youcan work in practice so we're definitelygoing to have the idea is everyone whowants to can follow along on localmachines if you don't want to do that oryou at any point kind of get lost withit do not worry we will be going throughevery single step up on screen so you'llbe able to get everything of what'sgoing on just by following along ifyou're not feeling like hands- onkeyboard this afternoon we probablydon't have a lot of time for questionsand also it's kind of tricky uh with abigger room um but we will have contactlinks at the end so if you want to knowmore there's a question that I didn'tget round to feel free to drop me anemail or a DM in some social messagingplatform as wego so technical requirements hopefullyif you looked in the schedule you willhave noticed that the idea is all youshould need for this tutorial to work isDocker or something else that runscontainers and kind so kind being a wayof starting Kubernetes clusters uh ifyou haven't got Kind already uh it'spretty easy to get just Google KindKubernetes and follow their quick startit's a single binary download so it'snot super bad if you haven't come acrosskind before it's amazingly useful forspinning up test clusters because itworks anywhere that you have Dockerinstalled and you don't need any otherprerequisites uh and then the idea iswe're going to clone a workshoprepository from GitHub um the QR codewhich is there on every single slidewill lead you to this repository uh andthen this is the repository URL uh andif you go to that if you run that orclick the QR code which will take you tothe repository that will get us the codeum now what I cando is we do we go in here when you getthe code down you should see a number ofdirectories the main one here is there'sa directory called commands and a filecalled tutorial commands and that hasevery single command we're going to typeso it's well worth having that file openbecause you can copy paste things out ofit i will be copy pasting at variouspoints in this because I'm not going totry typing all this stuff live becauseit will go wrong so there's also all ofthe slides are there as a PDF and allthe other files we're going to use solet me go back to this thing just incase people are typing away uh or if youwant to get get the QR code that's theother way of doing it and that's firststep so I'm going to get a couple ofseconds on this slide just so people canget that um and that will have all thefiles we need and it's going to have allthe manifests we're going to run thereare some manifests which we're going todeploy into a kind cluster feel free tolook at them i promise there's nothingdeliberately malicious in there they areall just standard manifests there'snothing really clever goingon so people go that you can always takea picture of the of the screen as wekeep going on slides know that that'sthe QR code to where we need to go sosetting the scene what do we want to dotoday what we're going to do isthroughout the tutorial we are going toplay the part of a developer who iswriting some software which is going torun in a Kubernetes cluster it's aFriday afternoon and they want to getthis working this feature working forthis sprint by the end of the afternoonbecause they want to go away for theweekend mark that as done and get out ofthe way and unfortunately they're goingto encounter some problems and we'llneed to get creative and it's that actof getting creative in which we're goingto find some hopefully interestingKubernetes security facts that we'regoing to workaround so set up a kind cluster settingup kind clusters is super simple all youneed to do is first go into thedirectory that we from the that's justthe GitHub co the repository that wecloned just go in that directory thereand then we run this command here whichjust starts a kind cluster so I'm goingto do thatwhere is my so you should be able topaste that in as long as you're in thedirectory from the repository it willhappily create a kind cluster i'm goingto go a little bit slow at this onebecause I'm con�scious i think theconference Wi-Fi can handle this i wasin another tutorial earlier on this weekand the conference Wi-Fi seemed to beable to handle kind cluster creationi've already got the node image down soit was pretty fast for me but it shouldlook a bit like that so we're justrunning the command to create a clientcluster and we're giving a passing out acustom config there's nothing supermajor in the in the custom config butit'll give us some stuff to gowithokay has anyone got the kind clusterworking or is it heading in the rightdirection thumbs up yeah we got a couplepeople workingokay that gives us the environment thatwe're going to workon and everything we do will be in thekind cluster so we shouldn't be goinganywhere else on your machines otherthan the kind clusterso with that done we can just check tomake sure it's working because as anyoneknows who does a lot of Kubernetes stuffuh it's pretty easy to end up pointingat the wrongcluster so I'll just check and yeah I'mI'm running against the local clusterbecause we can see our nice 1277 127001address so I'm definitely hitting a nicelocal kind clusterso if you've run up the K cluster andalso if you're running something likeStarship you'll see that we've got it inour prompt aswell and then what we're going to do iswe're going to apply any everything inthe manifest directory so this is justall the manifest we're going to need forthe tutorial all of the different podsand namespaces and things like that sowe're just going to copy pastethat and then just put that in so let'sjust apply all the manifests in thedirectory called manifestsuh and you should see something likethat when you run it uh you will get awarning which might be familiar to somepeople uh don't worry we will talk aboutthat warning later on but that is partof the scenario that's intended don'tworry that's not broken nothing's notgone horribly wrong it is actually whatwe wanted to doso uh if I do thisget things should hopefully be thinkingaboutit how is itrunning okay things are starting toimagei've got one on image book back offhopefully that's going to fixitself and the if the update log oneisn't working don't worry we're going toget to that one and why it's notworkingokay so what I'm going to do is before Ido that slide I'm going to talk a littlebit about what we did there and some ofthe things we actually did behind thescenes and then we'll come back andwe'll make sure people's clusters areworking because if they're going funnywe want to give a little bit of timebecause I think it might be a fun on theconference Wi-Fi so how did that work wehave just spun up a cluster and we'veapplied manifests from securitystandpoint what did we do well we wentthrough three gates with every manifestwe created there first we had toauthenticate the cluster so we had toactually make sure that we had validcredentials we then had to go throughauthorization are we allowed to do thething we're trying to do so createthings update manifests createnamespaces and then we also had to passadmission control and admission controlis a very important step because ithandles additional policy requirementsso even if you've got authorization wecan have admission control in place tosay actually can this person do thisspecific operation is this manifestvalid is it badly formed does it haveall the things it needs to have and isit trying to do something it shouldn'tso admission control is a third veryimportant security gate authenticationusually if you're working on productionclusters I would expect that you areusing OIDC so you're using your cloudproviders authentication service andyou're authenticating to a clusterhowever in Kubernetes there are alwaysother ways of authenticating to acluster not just the OIDC setup serviceaccount tokens are used by workloads toauthenticate to the API server butdevelopers can also use them and I'veseen clusters quite a few times whereactually they're used for human users aswell that has got some drawbacks uh andalso some system components use clientcertificates for authentication uh userscan also use clie�nt certificates but I Ididn't actually write this slide but Iwas meant to talk to it it comes withproblems hopefully no one is usingclient certificates for userauthentication in their clusters becauseit's not really a very good idea forproduction mainly because you can'trevoke user credentials that are issuedthat way there is no certificaterevocation in Kubernetes and if you everuse client certificates you find youcan't actually get rid of the accessyou've given people if that person movesteam or leaves the organization or thecertificate file gets stolen sotypically it should be OIDC but thereare other optionsavailable authorization Kubernetestypically you might not think a lotabout authorization and how it works butKubernetes actually has a flexibleauthorization model with lots ofdifferent models sort of different waysof doing it possible this is a standardum API I server flag from a KubernetesAPI server in Cubadm and there's twomodes defined node is a specialist orauthorization model which is onlyapplicable to nodes so when the nodeitself authorizes itself to the APIserver it uses that plugin and arbbackif you're using cloud Kubernetes you'reprobably going to have a third one inthere not that you can look at itbecause you can't look at the API serverflags but the cloud providers typicallyhave their own authorization model thereas wellum you can always look at what rightsyou have from an arbback perspectivethese are very very useful commandswe'll be using these later on cubecuttle off can I list gives you a listof all the permissions you have in acluster as far as arbback knows aboutthem uh and if you ever want to knowexactly what's going on you can alwaysdo minus v9 and that will tell youeverything about all the different areasof permissions that you have got andwe'll be running some of those commandslateron um right so let's go back and let'shave a quick look andsee what'shappening so if I justdo Oh my workstation pod is not having agood day is anyone's workstation podalso facing that or is it just me yeahimage pull back off that is that'sunfortunate cuz we need that one one ofthem were meant to be seeing the updatelog file[Music]uh yeah I I'm I'm logged into Docker iwas hoping it wasn't going to do that tome cuz logged in shouldn't complainokay always going to be funso because I'm a little bitparanoid I and I was worried about thefact that it might decide to beannoying i have a backup planso it's always wise to have a backupplan for live demos especially tutorialsum I kind of surprised DockerHub'scomplaining about that because I didactually ask them about rate limits thisweek and they said we're not putting thenew rate limits in place this week andtherefore it shouldn't be a problem uhbut maybe they have to change their mindand put in ratelimits i also haven't pulled that imagevery much this weeksookayokay did it work for anybody or dideveryone it couldn't shouldn't haveblocked work for some people okay so weobviously hit some kind of fun ratelimiti may just not have been fast enoughwhich is surprising as I was doing theuh but let's just what I'm doing hereactually just to explain what I'm doinguh I promise this is not a paid advertin any way shape or form this is just aservice I use which I happen to like i'mcurrently running in a ephemeralenvironment that I spun up in a placecalled Ixie Muis Labs uh it's reallyreally really fun because they've got anice CLI and you can just run upephemeral labs anytime you want with anykind of configuration so this is just alab I've got with all of my containertools set up ready to go and I thoughtjust on the off chance something goeswrong with the conference Wi-Fi let'shave this available and as you can seeit looks exactly the same as if you wererunning yeah and this should be comingfrom a separate environment so Ishouldn't have anyproblems so let's check and see if thisone goes better than the other onedid i gradingwhat you should see if it's working isyou should see all the other onesworking apart from the update logfileokay is going to complain oh not updatelog file sorry t�he other one where is itoh it won't bethere okay we've got fun with thingsdeciding to be that's fineso what we're going to do once you'vegot it up and running is we're going torun this command here and this is goinginto our If anyone's got problems thinkabout it the other thing you do istether to your phone and that should getaround Docker Hub beingannoyingoops we're going to exec into ourworkstations here we're going to becomethe developeruh when it's decided that it workingyeah it's happy okay so what you weregoing to do here is we're execing intoone of the workloads in the clusterright so we're saying cattle exec minusdndev which is our developer workspaceand we're going to go into ourworkstation and just have a shell so nowwe are the developer now we've got therights that our developers got andthat's the start of the setup hopefullythat is now working for at least somepeople and I I'll show up here anywayso what are we trying to do what's theactual goal of our developer right wenow become our developer we've set upour test environment we set up ourcluster our developer is just trying todeploy a log reader application so afairly simple application it's designedfor the critical task of reading logsfrom a s from a pod in the cube systemname space so it's a very simple thingit's trying to do and it's using a hostpath volume so host path volume is justmounting a volume from the underlyingnode and then reading files off it thisis a fairly simple task it looks a bitlike this we have got our pod insidecube system which is updating a log filewe then want to read the log file in ourdev namespace right and that's our task if wecan get that done by the end of Fridaywe get to go and have drinks in thebar so let's see what'srunning in our dev so this is make sureyou're inside the the workstation podwhen you're doing thiswe have got one pod running we don'thave our log file reader though it's notthere that's a problem this one howeveris expected this isn't Docker i'm beingannoying so let's look atdeployments okay we have got a read logfile deployment but it has zero of oneready something has gone wrong it's notworking and we can look at our replicasets as welland yep we've got one desired zerocurrently ready okay something has gonewrongour deployment which is our thing wehave to get done today is not workinglet's see why so there's our problemthat can't get created something isgoing wrong it can't read the logs let'sfind out what's gone wrong why is thisbroken well we have a feeling it's goingto be admission control admissioncontrol is getting in our wayit's because it's trying to use hostresources and our cluster administratorlike hopefully every production clusteradministrator out there doesn't wantjust anyone creating pods that canaccess node resources because that'sdangerous from a security standpoint itcan lead to cluster compromise um inmodern clusters there's lots ofdifferent options you've got forblocking this kind of thing you coulduse pod security admission which is whatwe're going to look at today you coulduse external admission controllers soprojects like Kyiverno and OPA and CubeWarden who have all you've probably seenif you've been around the booths earlieron at the conference or you can usevalidating admission policy that's arelatively new feature if you're onKubernetes 1.30 or higher that's G nowso you can have that as an optioninternally there's lots of ways to stopthis but the general idea is there aredangerous things you can do withcontainers and we don't want everyonedoing them and obviously our clusteradministrator here has said hey we don'twant anyone deploying stuff that can dothese things let's actually confirmthat's what the problem is thoughbecause we're just thinking that's theproblem right now let's run a command toactually check it so we're going torun Yeah that's the problem i will saythe mesh at least is nice and easy tounderstand it's not like super badbasically what it's saying is hey you'retrying to create something whichviolates the baseline pod securitystandard we're not going to let you doth�at so the baseline pod securitystandard is the basic policy you can putin place with pod security admission itblocks all the really obvious privilegeescalation attacks so the idea when wedesigned PSS was we said we want to notlet people do things that wouldinstantly lead to privilege escalationso we blocked them so it's not going todo it okay that's not great for ourability to get out and have adrink so how do we going to fix thiswell we could work in a differentnamespace right that's one option weknow that this is blocking our dev namespace so if we had permissions maybe wecan deploy somewhere else and that'llfix the problem we could ask for therestrictions on the dev name space to beliftedunfortunately we have got a bureaucraticprocess in place as it turns out that'sgoing to take weeks to actually get thatchange made so we can't do that or wecould see if we can find a way to makeit work maybe there's some way ofgetting additional access that let usget get this done and get it knocked offthe checklist that's what we're going todo let's see if we can dothat so before I get into the stuffwhere we do things that people probablyshouldn't inside clusters I am I alwaysdo this we always do this when we dothese kind of uh tutorials and workshopswhat we're showing you here istechnically possible please don't doanything like this on clusters that youdo not have explicit permission to do iton so don't go like go back to yourproduction clusters and think haha I cando the sort of things on a productionplease don't do that um because it's nota good idea and it can legitimately getyou in trouble with either yourcompany's security team or in extremecases legal law enforcement um so don'tdo anything illegal nocrimes right let's look and see whatpermissions we've got right what can wedo here let's go and see what we can doso this is where we're going to use thatcommand we talked about before o can Ilist it's a very very useful commandso let's quickly look and make that alittle bitsmaller so hopefully it will stay on thescreen there we go we have got somepermissions in the developer name spacewe can do thingsto pods and we can do thingsto pod logs we can do things to pods wecan do things to name spaces and serviceaccounts and deployments so we've gotsome rights but these are really onlyread only rights the only thing we canreally do here is we can create podswhich is kind ofniceso let's also just check because one ofthe ideas we had right was maybe we canjust deploy in a different namespace andthat may be one thing you'd think of sothe cube system name space is almostnever restricted it just doesn't getrestricted because the pods in cubesystem need to be able to do privilegedthings so let's just check see there seewhat's going on therehere unfortunately it's sad face time wehave very little permissions those ifyou don't recognize them are the bogstandard permissions that you getthrough system authenticated so everyauthenticated user in the cluster hasthose rights but those are absolutelynothing useful all that basically saysis I can check my own permissions and Ican get certain um nonresource-basedpaths so things like I can tell whatversion the cluster software is runningand I can check that things are healthynone of this is very useful to us asattackers or as people who want to getsomething done that they maybe shouldn'tbe able tookayso we can't get any good rights thisisn't going very well forus but there wassomething we did we were able to getservice accounts that was an option wehad let's have a quick look and see whatservice accounts we gotso service accounts are used to givepermissions to workloads and we've got acouple ones there we have got ourdeveloper service account that's the onewe're currently using and then we've gotone called arbback manager well thatlooks interesting arback manager soundslike the kind of thing that might beuseful to us because it sounds like it'ssomething that might have permissions somaybe we could use that somehowit might have rights to service accounttokens with those rights let's see whatwe cando s�o service account tokens are kind ofinteresting there's three main ways toget one you can either run the commandcube cuttle create token which will letyou create service account tokens wedon't have the rights for that one youcan create a secret with the right kindof type and it will create a serviceaccount token but you'll notice wedidn't have create secret so that's nouse to us either the third one which isvery useful to know because it's thesort of thing that you will almostalways have is we can create pods rightbecause obviously we can createworkloads in the in this name space wecan create pods um we only havepermissions for that one of thosethree any pod in a namespace can use anyservice account in the same namespace soanytime you're creating a pod you cansay what service account you want to useand because that service account is inthis namespace you can use that serviceaccountokay that sounds useful we have a routeto get the permissions of arbbackmanager which sounds like it's at leastworth exploring it might be useful sowell let's create a pod now here is oneof those places where if you're doing ifyou have are following along and you'redoing the commands in the command file Ithoroughly recommend copy pasting don'ttry and write this yourself i'm notgoing to try and write this myselfwriting YAML on a Friday afternoon andget the indentation right does not soundlike a good day so let's do that and itwill create a new pod so all we've donethere fairly simple pod manifest webasically just said hey create as a newpod and mount in the arbback managerservice account token so that then givesus a new pod running and one of theother rights we had was we had pod execso I can execute commands inside thatpod right so we now have a new pod ithas the token we can get access to thetoken hopefullyokay cool so that tells that we'remaking progress let's just check andmake sure that pod came up it shouldhavedone if you manage to get the first oneworking that one should work as wellbecause it's the sameimage so what we can do here we can dosomething like this cube cuddle exectoken read which is the name of our podand we're going to exec the file catthat long string that is the location ofthe service account token in a pod it'salmost always in the same location ithink you theoretically can change itbut it's almost always there and we'regoing to pipe it to a JWT code just tosee what it lookslike so let's dothat and what you get back is some niceJWTdecoded service account token that tellsus we have a token for arbback managergreat that's a valid credential for thearbback manager service account it cando anything the arbback manager serviceaccount can do that sounds promising uhone little bit of trivia that you may ormay not know if you've ever looked at aJT token and it looks like it's B 64 andthen you try and decode it with a B 64decoder and it doesn't work properly thereason is it's not it's URL safe B 64which doesn't decode with a B 64 decoderyou have to use a JWTdecoder found that out the hard wayafter many times getting it wrong sowhat we're going to do now is we'regoing to export and create a newenvironment variable which just has thattoken in it right so we're just givingit's just going to be easier than copypasting a token into our shell everysingle time when we do the next coupleofcommands oh cool minus I on the basicsofdecoder uh okay minus I will basicallyword it as well that's useful toknow so we're just going to do thiswe're going to export a token let's justset an environment variable because wewant to use that token in various othercommands so we just have to set that andthat just gives us that environmentvariable set up so now we can see whatwe've got right we've got Roback Managerbut what what can they do is this goingto solve our problem is it going to getus finished ontime let's find out what we can do so wedothis andthat so we have got a lot of these areall the boring rights but there is oneinteresting line right at the toparbback manager kind of unsurprisinglybased on the name has got rights toeverything in the arbb�ack area ofKubernetes API and it has got star whichis great because it means any objectinside the arbback schema we have gotrights to and we've got rights to do allverbsso this schema controls anything thingslike cluster role bindings ro bindingsuh and at this time we've only actuallygot it in one name space but it lets soit lets us do ro bindings and roles butnot cluster role bindings and clusterroles it includes the usual CRUD verbsso when we say star that means any verband one of the things that you may notknow this may be the first time peoplefind something they didn't know aboutKubernetes security is there are otherverbs in Kubernetes not just create readupdate delete there are other onesspecifically there is an arbback verbcalled escalate escalate only applies oncertain objects in Kubernetes um but itis a verb that is valid and you canspecify it if someone gets star rightsto arbback it includes the escalate verbthe escalate verbs amazingly usefulbecause usually if I try to escalate mypermissions say I've got rights tocreate roles in a cluster and I try andcreate a role that has permissions thatI don't currently have Kubernetes says"No go away you're not allowed to dothat you're attempting privilegeescalation i'm not going to let you doit unless you have the escalate verb ifyou have the escalate verb it says "Ofcourse you have the escalate verb you'reallowed to privilege escalate that'swhat this verb is meant for." So this isbasically the privilege escalation verband because we were granted star sowhoever created the rback manager maybethey just meant you to be able to createroles and role bindings but they didn'tknow the fact that star includesescalate so we can escalate ourprivileges that sounds like fun becausethat's what we're trying to do today getmorerights so because we've got escalate wecan basically run a command like thisand what we're doing here is we're justusing cube cuttle and we are saying okaywe're going to use our rbback managertoken which we set as an environmentvariable create us a new role which hasgot star rights on the resource star.ststar and we'll call it ns admin so thisis a role remember only writes in onenamespace not a cluster role which wewrite in allnamespaces so we'll create thatrole and then we will give a rolebinding and we'll bind it to ourdeveloper user right because we bind itto our developer user that's just goingto make it easier for us because then wegot the permissions so all we're doinghere is we're saying create a new rolebinding of role of NS admin and theservice account we're binding to is ourdeveloper service account so that'sgoing to give NS admin rights to ourdeveloper and it'll say yes now I saidthose two commands with the escalateverb if you just had create read updatedelete that would have failed it wouldhave said nope that's not going to letyou do that even though you've got rightto create roles you can't create thatrole because it has more permissionsthan the arbback manager account doesbut because we have escalate it said ofcourse you can dothat and then we can check ourpermissionsagainand it looks the same apart from whatyou always like to see when you'retrying to escalate your rights lots ofstars so we've got stars across theboard now inside our namespace thismeans that we've got all the rights todo anything to any object in thisnamespace that sounds great that soundslike exactly what we wanted to get thejob done because now we have morepermissions maybe we can make theannoying warning goaway so our next object we've now gotnamespace admin essentially and this iswhere you come across another thingwhich again this might be one of thething you learn in this tutorial that'snew to you um arbback rights objects areeither global or they're namespacedright that probably know that part rightyour object either exists in a namespaceor is a global object across the clusterum we are only rights of one namespaceuh we can do what we want inside ournamespace but we can't do anythingglobally right if I try and do somethingto the cube system namespace it'll saynope you don't� have any rights in thisname space go away namespace objectsthemselves though this is where it getsa bit funny because I'm going to overusethe word namespaced namespace objectsare sometimes global and sometimesnamespacedspecifically requests in a namespace forthat name i didn't write this butrequests in a namespace for that samenamespace arenamespaced and I'll explain what thatmeans in the next slide because thatsounds likegibberish one of the things you can donow you have rights inside the onenamespace is you can change labels onthe namespace because labels aresomething you can change on a namespaceobject once you have full rights to itnow that's where you get to a funimplementation detail of pod securityadmission which is that pod securityadmission is enforced as a label on anamespace so if I can remove the label Ican remove pod security admission thisis true for any security control thatyou put in place that has got is doneimplemented via labels on a namespace ifyou do that and you then give someonefull rights to the namespace they canjust turn the label off they can saydelete the label and I've got all thepermissions now so why don't we do thatlet's remove that annoying pesky thingby saying no we want this name space tobe privileged please Mr clusterAdministrator and it says dev namespacelabeled awesome that soundspromising so let's just do QC cuttle getpodsagain and our token reader workspaceobject is now working look we have got apod token read and it's ready and it'srunning so it did what we wanted to andyou see it came up and it's now doingits stuff so we now have what we wantedwe've achieved our goal yay we can getto the pub um but maybe we're not quitefinishedyet so let's just recap what we didthere when we started this workshop wewere developers we had got rights to dothe kind of things developers would beable to do but not a lot more and thenbecause of a service account that beenleft in our namespace we were able toget our permissions a bit higher andthen because of maybe a misunderstandingof exactly how arbback works we wereable to get even more permissions andactually the point where we own thewhole name space and then lastly becauseof the fact that our security controlwas a label in a namespace we were ableto bypass it these are all relativelyplausible mistakes to make these aren'tlike you know wild you know thingsthey're not all necessarily that wellknown and now our workload runs butlet's we're not quite finished yet whatelse could we do well one of the bigproblems about Kubernetes security isthe problem that pod security admissionand things like we're meant to stopwhich is if you have unrestricted createpod rights which you have to give toanyone who you want to have workloadsrunning in your cluster if you're goingto let them do it directly it means rooton the node right because Kubernetes isdistributed remote command execution asa service that's what it does for aliving uh and if you don't lock peopledown they can do interesting things andthis can lead to privilege escalationnow we're thinking it's very useful thatwe've been able to bypass this controland get you know what we need to getdone done before we leave for the daymaybe we could do a couple other thingsto give ourselves some more access incase this problem ever came up again wedon't really want to bother the clusteradmins with all this you know annoyingthings where things don't work let's seeif we can do something else and makethis a bit better forourselves so let's compromise thecluster why not we're here already let'slet's let's let's just compromise thewhole cluster we're going to apply amanifest and I'll cat this manifest outonce I've runit we're going to apply a manifestcalled node root and you can probablyguess what it is by thename uh if Icat this is a fun manifest i used to usethis manifest all the time in pentestand it's a very fun manifest what itdoes is it basically says turn off allthe security that you usually put on apod it says give me the hosts networkgive me the hosts P ID which is a namespaces that usually be used to const�rictcontainers uh and then it has the magicflag in the security context setting itsaid privileged true uh the privilegedflag is a is a really well-n named uhflag originally I talked to someone fromDocker who said they wanted to call itinsecure uh as a flag but managementdidn't like that as an idea so it'scalled privileged but it basically meansturn off all the security so we said butlet's just turn off all the security sothat's what that does it runs a pod intoa node and it turns off all of thesecurity you would usually use torestrict apod once we've done that we can justexec intoit and run the command chute host andwhat it does is makes us root so we'rerooting the node now we are now the fullroot user on in this case control planenode if you had a cluster with 100 nodesyou could run the same thing as a damonset and you could run one of these ontoevery single node in the cluster and youget to be root on every single node inthe cluster which is fun because itmeans you can do whatever you want andthen you don't have to worry the adminwith any pesky problems you might haveum you can just get access and this isfull access root access to where you areum you have everything you ever wantedyou have all of the processes this is acontrol plane node so it's got the APIserver and everything else running on itso that's nice we've got our access butthat's only access right now what ifsomeone comes back in on Monday morningand realizes that pod security admissionhas been turned off accidentally andthey turn it back on again then we'regoing to have this problem again so whatelse could we do well now we're here wemight as well see if there's anythinguseful in the node for us to use maybewe want to come back later and you knowdo some more stuff it would be reallyhandy if we had a way to fix problems wehad maybe need some better credentialsif we had a credential that wouldn'texpire that would be even more usefulwouldn't it because that's one of theother annoying things about creds theytend toexpire so this is where we come to adminconfadmin conf um most Kubernetesdistributions when you install them willcreate a credential with the distri thatthat you use at the startup in anythingbased onQADM there the files admin conf well itused to be admin and super adminuh and these files essentially haveclient certificate based credentialsthat don't expire and have cluster adminlevel rights to the cluster so if youcan ever get access to a control planenode where those files might exist youcan just get the files and copy them andthen you have access to thosecredentials uh if if anyone's doing thison a a kind cluster that is older than129 you will only have admin conf superadmin came in in 129 and I'll I'll I'llbriefly explain the difference butbasically if we do that if anyone'sfollowing along you can do that andyou'll see this is that's actually atthe cluster level so we got star starstar at the cluster level using thatcredential which isfantastic if we use super adminconfst star at the cluster level and atthis point you might say well what isthe difference you've just told methere's two files and they seem to havethe exact same rights and the differenceis that super admin is in a group calledsystem masters and system masters is aspecial group in Kubernetes if you havea credential that is in system mastersit doesn't touch arbback you can't youcan't remove its rights in arbback youcan take every single line out ofarbback that there is and thatcredential will still work and willstill give you cluster admin if you lookat the Kubernetes API server source codeyou'll find why there's actually ahard-coded bypass for all authorizationmodules if you use a user in the groupsystem masters so typically youshouldn't be having credentials that arein system masters unless you want to beable to do something like coming back ina year's time and still having clusteradmin no matter what's happened inarbback uh and the reason is you used tohave one file that only had systemmasters and they split it up so this onereally really shouldn't be given toanyone the other on�e at least you canrevoke its rights by taking away thebinding to cluster admin group so wecould have that credential and thatwould be super super useful because thenwe could come back anytime we liked andfix any other problems we happen to havewith ourcluster so that sounds really useful allof therights so how would you avoid theproblems how would you avoid thishappening infuture because really if you're acluster admin we shouldn't have letsomeone do that that wasn't reallysomething that should have happened umlet's take our frustrated developer hatoff namespace segregation is importantwe that arbback manager service accountshould have been in its own namespace itshouldn't have been in a namespace thatother people use because it has got lotsof privileges it should have its owndedicated area where you deployed it toand in general if you're going to createservice accounts with privileges and youdon't want everyone else who can deployworkloads to that name space to havethose rights then you don't actually youknow you want to split it up care withservice accounts it's very important notto let people create service accounttokens and knowing that pod creationallows for service account creation inthe name space is very important andthen arbback lease privilege um don'tgive anyone rights they don't need tostar is incredibly dangerous inKubernetes arbback because star ismatches anything that could exist thatexists now or could exist in the futurein that area because it's just stringarbback kubernetes arbback is juststring matching and so star alwaysmatches a string so realisticallyspeaking if you get someone star atanything so star means literallyanything that exists now or could everexist in this cluster because it is juststring matching so all of those thingswould have helped us avoid the problemnow that was kind of fun but we've donepretty well for time so let me just goon a little bit because I thought wemight have time and I wanted to I wantedto demonstrate one more thing which Ithought is a kind of fun bit of hackingum we're going to exploit an unpatchableKubernetes CVE to finish off theafternoon so if you don't know there areum and it's from 2020 so this CVE wasdiscovered in 2020 uh and there is nopatch for it it exists in every singlecluster that there is and you can'tpatch it there are actually fouractually five i wrote this slide and Iforgot to update it there are currentlyfive unpatchable Kubernetes CVEes fourof them relate to networking and one ofthem relates to git repo volumes it'sfairly new though which is why I hadn'tmentioned that as a five but we're goingto do one of them and they exist prettymuch in every cluster you can getbecause there's no patch so we're goingto we're going to exploit somethingcalled CVE 20228554 and it's one of myfavorite CVEs to exploit because it'sreally fairly simple to demo he saysbefore running the demo script um itshould work fine uh and it's quiteinteresting because it illustratessomething about how Kubernetes works theway it goes is this say I've got twodifferent namespaces owned by twodifferent teams one namespace the one ingreen wants to make a connection to anexternal website and and get someresources or get some data right thepeople over here in the bad namespacecan hijack the traffic of these usersand can say I don't want this to go toexternal uh web server i want it to cometo our internal pod any internal podthat they've got control over and theycan hijack the traffic and actually makeit go in theirdirection and this is kind of a featureof Kubernetes well I mean it is a CV soit's not it's a bug but it is also afeature so let's actually let's actuallytry thatout uh I'm going to exit out if anyoneis following along and is in the shellswe want to be all the way back to ourrepository so you want to be back towhere youare hereand then I'm going to go into theCV directory so we're going todemonstrate this just by creating alittle kind of client pod which will beour good the green namespace pod andthen our bad namespace people will willcreate a deployment and a service andthey're� going to hijack the traffic thatthat user is trying to sendexternallyso let's do thatso we've created a client pod and thatpod is up and well that pod should comeup and running put it in dev name spaceif I rememberrightly yep it's up andrunning and that pod wants to get somedata from the very important internetservice I canIP so we'll actually check to make sureit can get to thatservice and it can right so I can has IPjust gives you back your IP addressthere's lots of services which do that iremember that one cuz I always rememberthe name because I'm old enough toremember thememe but now we're going to hijack thattraffic and I'll show you the manifestexplain how it worked once we've done itassuming all goes to planwe're going to create two we're going tocreate a deployment and aservice and once we've done that we'regoing to go and tryagain that sentry to come up andsuddenly our request has changed andinstead of being a request that's goneto an external web server it's gone toan internal instance of the EngineX webserver so we've hijacked the trafficthat will work in any Kubernetes clustercluster you try it in unless someone hasspecifically blocked the feature that isor the other one is if you are usingpsyllium in non-cube proxy mode it won'twork there either if I remember rightlyso what did we do there we have justessentially exploited a high traffichijack inside a cluster this is akubernetes 132 cluster this is fullypatched how did thatwork the magic came in theservice so in Kubernetes another lesserkknown fact hopefully a thing that youmight know that you didn't know beforeabout how Kubernetes works is there aredifferent service types and you can seehere one of them we've got is we canspecify external IPs and that's what wedid here you can basically specify anyexternal IP you want and then Kuberneteswill say hey if anyone makes a requestto that IP address it's actually belongsto this service and not any other thingit might otherwise have belonged to andthe reason that works that way isbecause Kubernetes services are just IPtables rules when you create aKubernetes service what you're actuallydoing is having an interface to createLinux IP tables rules and a Linux IPtable rule can redirect traffic that'sone of the things you can do with IPtables so this that's exactly what aservice object is it redirects trafficand that's why it doesn't work in siliumi am right thinking about that becauseinselium they don't use IP table rulesif you replace cube proxy so basicallyall we did here was these are the two IPaddresses that belong to I can IP youcould change those for any other IPaddresses you like um and then we saidif you get any traffic that's going tothose IP addresses don't send it theresend it to the deployment that wecreated hijacking thetraffic if you want to fix that becausethat doesn't sound like a great thing toallow in your clusters the best way offixing it is you want to block creationof external IP address servers if youdon't need them probably a lot ofclusters don't need that as a conceptthe best thing to do is have anadmission controller and say to youradmission controller hey don't allowanyone to create those or maybe onlyallow it in a specific name space andthen have control over who can deploy tothat namespace if you do need thatfeature why did that work yeahKubernetes uses IP tables rules and IPtables rules can redirecttraffic yeah if you want to stop ithappening arbback is obviously don't letpeople create services and admissioncontrol the two best ways of doingitso quite well in terms of time but let'sjust talk a little bitabout the goal of this and hopefullyactually show of hands did people learnsomething they didn't know aboutKubernetes security did I succeed thisafternoon yay I succeeded awesome thankyou very much for that um the goal ofthis was to try and say look Kubernetescan be a bit complicated and there arelots of little and we've talked on someof them i could literally have doneanother 10 of those without too muchdifficulty there's lots of differentlittle things in Kubernetes that youdon't necessarily know or think aboutgenerally speaking the best way tohandle them is try and be very carefulwith permissions you give to things beminimal access and segregate stuff asmuch as possible if you got things youreally don't trust maybe you want to putthem in separate clusters depending onhow confident you are that you canactually know all of the places wheresomeone might do something like this andescalate their privileges or getsomewhere else if you want moreresources um a couple of things thatmight be useful I have wrote my blog isthe racine link there i've written aboutmost of this stuff at various times umthere's also that talk site if you everlike Kubernetes security talks uh I'veI've put a searchable index of everysecurity track talk from every CubeCongoing back to 2016 there's like 300talks so unless you really can't get tosleep um that should make you you knowhave a nice time and get to sleep uh asI mentioned earlier on SIG security onthe cub or Kubernetes security on theKubernetes Slack are great places to getmore information about this kind ofthing if this has sparked an interestyou want to learn more of these weirdand wonderful places that Kubernetesmight be a bit odd from a securitystandpoint and also tag security on theCNCF Slack uh one thing that's not onthe slides that I promised to someoneelse at the conference I would mentionis if you like hearing Scottish accentsand talking about Kubernetes KCD UK thisyear will be in Edinburgh in October uhand we would love to have everyone comeup to Edinburgh and join us for KCDwhich will be obviously I think it'smiddle of October if I remember rightlybut later on in the year and now withthat I've got some time for is everyonegot Oh contact details i promise contactdetails contact details there happy totake questions that way we've also gottime if anyone wants to ask anyquestions nowyeah would network policies block it ifyou wereblocking if you were blocking yeah ifyou're blocking the bad namespace andsaying no traffic could go to that badnamespace then maybe so yes if youlocked it right down with networkpolicies that might stop it just becauseit would stop the traffic altogether butI think the idea here is the bad namespeeches people yeah you might build ablock with that as well that's a fairpoint yeah network policies might alsomanage to block it um if you put them inby def but making sure you have adefault deny and on ingress and egresson all namespaces by default would bethe way toAny otherquestions yeahah oh there's a microphone awesome hellothank you for the presentation um forthe first pen test that you did you weused to get the token from a file uh ifwe use secrets is the token will bestill persistent in the file where we uhwhere we got the token from uh I'm goingto see I heard that but secrets in filesit's kind of hard yeah I mean we we gotthe token we we got the token from afile to use it in order to hang on it'ssuper echoey I'm going to comedownoh the file service account tokens inthe files I'm going to get this rightotherwise I'm going to not hear youoh yeah yeah so the question is if weuse secrets would the token still be inthe file the answer is yes you typicallyhave to put secrets somewhere insidepods if you want them to use them soyou've got to either put them inside afile or you've got to put them in anenvironment variable um ultimately it'sthe problem with secrets is if you wantpeople to use them they have to besomewhere for people to get them from uhyou can make it harder for people to getthem but the way because I can create aservice account in a namespace I can saywhere I wanted to go so I can say Iwanted to be there um the secrets youyou because you think we didn't have anyrights to get secrets in this scenariowe only had access to great pods um soyeah it's still generallyaccessible any otherquestions are we all have I said colddrinks and ice cream in the sun too manytimes and everyone's now can think ofnothing else other than it's Fridayafternoon in that case thank you verymuch for your time and I hope that wasuseful2025-04-15 21:57:55.142053 ��?:#��5A8Q8sFzODEUoafternoon everyone and uh welcome alongto this afternoon's tutorial uh veryglad to see you all decided to come tothe tutorial and not go outside in thesun and get an ice cream and a colddrink um what we want to do thisafternoon is we want to teach a bitabout Kubernetes security through ahacking scenario my goal for thetutorial today is I hope that everyonein the room leaves with at least one newthing they didn't know about Kubernetessecurity that's my personal goal so ifwe can get to that point it will havebeen a success for me anywayum about us you might have noticed onthe schedule there were three of usunfortunately life sometimes happens andnot everyone can make it uh so I am Roryuh and I will be running the tutorialtoday uh I've been doing containersecurity now for around uh nine years uhand I do a number of things like helpout with Kubernetes SIG security andalso help maintain the CIS benchmarksfor Docker and Kubernetes so if you'veever had a bad CI CIS compliance findingand you're wondering who to blame I'mthe guy um also uh if you're thistutorial sparks an interest inKubernetes security in you then SIGsecurity is a great place to learn morewe have meetings every other week uh Ithink the booth will probably beshutting fairly soon but we are aroundon Slack as well uh and my partners incrime who unfortunately can't make itare Ian and Marianso let's start what are we going to dowe are going to follow an attack path inKubernetes so we're going to take oneattack and we're going to walk throughseveral steps of bypassing and avoidingKubernetes security controls to show ushow someone might be able to escalatetheir privileges in a way that perhapswe didn't expect they would be able todo um and we're going to explain theconcepts at each step so we want to walkthrough and actually explain what we'redoing and why this works and also to anextent why it might be surprising topeople that this is how it all works andhangs togetherlogistics um phones on silent if anyonehas phone ringers on ��ong time i'm going todo my best to inunciate both awesomeprojects both very different from eachother very similar names uh but maybesome fun overlap where we could paint aworkflow that goes into production allthe way to the end but maybe also startsits story at the very beginning uh and Ithink that that's something that'smissing and that could be reallyexciting so um we're talking about Nyxwe're using Flux we're talking aboutGitOps and we're going to useFlux sound good all right um so thefirst thing that I want to point youfriends to um is a repo where I havesome experimental things going on um andI think the conference Wi-Fi here inthis room because it's not very dense isactually pretty fast but I'm pretty sureas soon as all of us run this speed testit's going to be a little bit miserableso if you are comfortable uh using yourfree credits in your GitHub account forGitHub code spaces I recommend uh comingover togithub.comybox is my repo and thencryo-enuh which is a little bit of a teaser forsome of the things that I'mexperimenting with uh and instead ofcloning the repo or forking it andinstead of clicking add a code space youshould probably click this three dotsright here and you should say new withoptions and then you can check it out onmain um in some uh you know region thatmakes sense for you and I wouldrecommend bumping up the core count uhas well as the uh RAM just so thatthings compute a little bit faster foryou this is uh going to run a Kubernetescluster uh and so it's nice to have alittle bit of headroom uh that will useyour core seconds uh on your GitHubaccount faster so if you are running outof those for some reason even though itis only the beginning of April uh thenmaybe just use the two cores it stillworks um so once you create your codespace um or if you don't Yes yeah i willgo back thank you for um for callingthat out there so the repo is here uhthere's a green button you click thegreen button you click the three dotsand you can say like launch with optionsuh and I recommend clicking fourcores and uh this is this is kind oflikethe the the point where I'm like tryingto pre-bake some stuff so sometimesthese code spaces can take a little bitto start up um some things that aregoing to be happening in this code spaceuh I will admit uh that the devcontainer that I've prepared for you isin an early state um it's not like themost um you know like perfectly puttogether dev container but what it isusing is Nyx uh in the dev container uhit's using Nyx via Flux which I'vementioned a couple times um and it'susing what's called a Flux environmentuh which has its dependenciesdeclaratively stated inside of the gitrepo that you are looking at uh andthat's going to create the dev containerwith Docker coming from Nyx uh providingyou your Kubernetes command line toolsproviding kind the Kubernetes clusteritself as well as a bunch of developertools uh and getting everything set upuh in the dev container and then usingMicrosoft's init RC and hooks and stuffto start the Kubernetes cluster for youas well as the Nyxstatement um so that's just somebackground uh as I fill the space uh toallow folks to create the dev containerum I'm going to start moving on butdefinitely if you need a little bit moretime please feel free to ask anotherquestion sweet so get the dev containerspun up uh what that will look like is awindow that kind of looks like this umyou can see I've also been working onsome local changes and things so yourswill maybe look a little bit differentbut when you pop into here you'll have aterminal i'll try to make this big foreverybodyand you'll notice um that you are goingto be at a shell your shell willprobably look different from minebecause I have my dot files loaded um ifyou're really suffering and you need uhyou know to import your dot files orwhatever feel free to take the time todo so i also have a simple um bash rcsimple it's like bash rcen simple undermy name on GitHub if you want to clonethat you can use those and then it yourrepo will uh look like mine in codespaces um what else was I trying toaccomplish her�e okay so you'll pop intohere and you're going to have a codespace and you're going to be in theworkspace cryonix i think that uh youshould also clone another repository sowe should go into the workspaces folderuh and get cloneum just httpsgithubuh.com and then it's going to bestealthyslashcale-nixgithops uh and that should put a repointo that workspace folder i alreadyhave it so git is complaining at me uhbut uh there is another repository thathas other folders with other micro demosthat we can play with so go ahead andget that other uh repository i'm justgoing to leave that up on the screenthere um let's talk a little bitabout dependencies and working withpeople on a team or even just becomingfrustrated with yourself over timeworking on the software that you like toplay with or the projects that you wantto work onum when I want to install a database onmy Mac so that I can write software andput information into that database Iusually will need some sort of packagemanager um I'm either popping into thebrowser going to a website and clickinginstall on s some sort of application ormaybe you get Brewmy teammate usesUbuntu and he needs to go and get thatsame sort of database probably adifferent version of it um because thepackage managers are now kind of out ofstep with each other uh but he's goingto use apt right and then my teammate ison Arch Linux is going to use Pac-Man uhbut then maybe they also might want toactually get the same exact version ofthe package as me and so they might alsouse Linux Brew and maybe they'll havethe same version but it'll becross-built for their machine their uhoperating system and CPU uh and thenI'll have my own version built for me umthis gets really frustrating when wehave different hardware from each otherand so Docker came and kind of saved theday because we just wanted to pretendthat everything was a Linux machineunfortunately that comes with a coupleof trade-offs one you've got to run avirtual machine and split up yourresources it's not so big of a trade-offbecause now we have really fast CPUs andstuff but it can make a difference andthen um two it means that you actuallydon't get to use the benefits of youroperating system kernel and your nativehardware and maybe that might beimportant if you're doing an ML workloadand like wanting to do some GPU um youknow computations or somethingwe I think need to get to a place wherethe system package manager evolves alittle bit luckily there are some folksthat got ahead of us so I'm going tomove on from having the repos on thescreen uh also please forgive mydisorganization i had to restart mymachine right before this um entirepresentation so I'm going to try to findthings as quickly as possible but thethe Nyx community has been around for 22years uh this project is huge if you goto github.com uhNyxos Nyxpackages and if you can typecorrectly you'll see uh in here thatthere is I had this up earlier what am Ilookingfor uh it breaks the contributor counton GitHub more like 20,000 or somethinglike that uh over 22 years with massiveamount of commits and you can see thatover this history there's a very largeuh maybe even kind of multiplying orexponentiating trend um some sort ofcurve that's going up uh people reallylike this stuff if you've never heard ofNyx before the benefits that it givesyou uh is that your builds aredeclarative and then you can maybe startto do operational stuff declarativelynyx looks like this it's honestlymiserable for anybody uh whoseexperience would be you know beginningprogramming Python uh and uh just normalstuff that everybody knows nyx is afunctional programming language uh andconsuming it oftent times uh it has areputation for being very hard to grockso uh I want to show you maybe an easierway to use Nyx but still get thebenefits of it uh it's an open sourceproject called Flux uh they do have likea a you know open service that's similarto DockerHub as well where you can pushand pull things but you don't reallyneed to use that part and we're notgoing to use it today um we're justgoing to use Flux to create environmentsthat� are reproducible we're going toconsume software from open source andthen we're going to try and use our samehabits to build a Go application[Music]so if you pop into the Oh also one thingthat's really important here uh afteryou clone both of your repos uh presscommand shiftp to open up the commandpallet and then uh you'll want to likemake sure that you type the like likeuh greater than sign like a little arrowspace and then you'll want to look foradd folder to workspace you need toclick that and you need to add scale nyxgithops to the workspace so that way youhave two repos and when you do that it'sactually going to like restart theeditor um so if you don't do that um ifyou pause this code space and come backto it later you may lose work so justheads up that has bitten me before[Music]um let's hop into the scale Nix GitOpsum repo here i have a softwarerepository i want to work on my Goapplication you'll notice I don't haveGo if I go into my dev folder scrapbookdev then you'll see that I have mysource code there i would like to gobuild this but I can't because I don'thave go but if I run fluxactivate then now I will be put into asubshell that will have my dependenciesso if you're a Nixxian in this roomwhich you know there's like a prettyimpressive percentage of you here thisis like no news right um this looks alot like maybe running Nixenv oractivating this thing called a flakewhich is a complete jargon term thatyou've never heard of for the rest ofthe room um but here what we're doing isactivating what's called a flockenvironment and so somebody has prepareda group of dependencies let's see what'sinside of it so if I list this fluxenvironment then I can see that I have aCA certificate bundle for making surethat my application can do TLS i haveGCC the GNU C compiler uh for doingCbinding stuff i have the Go uh toolchain for building Go applications ihave the image magic C libraries anddevelopment kit and then I have thisthing called package config whichhonestly I don't know what that is but Idownloaded I I added it to theenvironment after I had to deal withsome errors and it it helps build theimage magic libraries when you're doingstuff so it's it'snecessary so this is what I need todevelop my applicationum similarly if I go up into the uhfolder above and inside of the devcontainer you'll see that there'salso a flux environment here it's calledflux test it's got zsh nano kuberneteshelm cubectl kind jq for some jsonparsing or whatever you want to dothere's gnupg for validating you knowsome secrets stuff uh the GitHub commandline tool Docker etc um the ACL thingfor fixing some file permissions in thedev container setup inside of codespaces all kinds of fun stuff and uhthis environment is declarative let meshow you what that means uh thereis a flock folder we'll actually hopback up over intothe Sorry umlet's Could you ask that question i justwasn'tlistening yeahokay cool i will do my best to slow downa little bit here so in the scrapbookdev folder inside of therepo there is a folder that's checkedinto the repo it is a hidden folder sols minus la uh that shows you the filesinside of the scrapbook dev folder ijust ran the cd command to get in hereuh there's a folder called flux andinside of this umfolder there are manifests and thesemanifests I did not write them insteaduh if you want to change things insideof uh here you just like flocks installand then we could say we want to addlike python3 uh you can do the same thing with likeflock search um maybe we could searchforsome I don't know Elixir related thingsor uh how do you even spell Elixir maybelet's just look forNode so like you can get some Node.jsruntimes from the catalog uh thesepackages are ultimately backed by theNyx packages cache um and Flux is doinga little bit of work in their service toindex them uh which means that you canalso say like flux show uh the name of apackage so like say I did no.js you cansee the different versions that areavailable and what CPU and operatingsystem architecture combinations aresupported for that particular packageversion um so thi�s is a package managerthat probably feels similar to somethingyou've used before brew npm uh thosesorts of things uh we have a questionover here from the person in the Palumishirtso um basically those are versions thatpredate the release of the Apple SiliconMac um and so historically uh thosepackage versions uh when they were builton a communal build farm called Hydra umjust that the uh CPU and operatingsystem target for ARM 64 Darwin uh orlike basically an Apple silicon Mac justdidn't exist yet which is a really goodquestion for the other ones uh theoperating system tupil is just not beingshown because it's kind of canonicallyexpected that most packages arecross-built for all four targets um butif you end up looking in here and yousee um any uh architecture combo thenthat's a sign that it's not maybesupported on your on your laptop or yourfriend's laptop soum so this is really interesting becauseuh this package manager is showing me aa single idea of a package that hasmultiple versions that then is alsosupported on multiple operating systemsuh which is sometimes I think thatthat's pretty unusual um it's it's notunusual if you're just looking at likeversion manager tools for say Node.js orGo um but it is really unusual whenyou're thinking about a package managerthat has a scope that's as wide as Nyxpackages which is like over 120,000packages some of them come directly fromprogramming language libraries othersare command line tools others are systemlibraries a bunch of C tool chain stuffuh and even things like systemd anddocker uh you can install all of thesethings with nyx packages and many ofthese targets are cross built arecross-built um so that's this this isalready getting really interestingwhat's also interesting is if you runfluxedit there is a declarative manifestfor anything that you do imperativelyuh this I when I look at this I see thedesign of npm basically a package jsonthe format of this is called a manifesttoml fileUh and you can edit this as well um soif I wanted to say I just added Python 3and um ifI was privy with I don't even know whateditor I'm in nano um thenuh oh you know I'm just realizing someof you probably ran that flux editcommand and then didn't know what editoryou were in um I probably should haveadded the the variable to open that upin VS Code i apologize um if you are innano uh and you are like struggling toget out you can press probably likecontrol uh and x and that will exit justin case anybody got trappedum so there's a declarative manifest andthat means that if I change it and Ilook at my git status um then there arechanges to those files and then I can goand diff the manifestchange uh and you can see now that I'veremoved a comment here and I've added uhI guess I I made an edit but uh this wasuncommented just just a moment ago itwas added automatically anddeclaratively and I could you know uhget commit you know to push that up tothe repo and then anybody who used thiscommit uh would then whenever theyactivate that environment they would nowhave Python 3 at that particular versionwhen I say at that particular versionit's a promise uh because if youactually look at the gitdiff there's a lock file also a hugeidea like what a concept cept when I'mworking on a project and I would likedependencies and I install a version ofa programming language interpreter orsome C libraries uh or a CA certificatebundle that the the hash and the versionand the provenence of that decision thatI made as a developer to pull adependency into my software project thatwe would lock it and this is again not anew idea npm does this cargo does thisgo works this way so basically anybodydoing real software development hasfound this need for per projectdependency management resolution andlocking uh and we just don't treat ourmachines this way uhNyx kind of has this in a thing that'scalled a flake that's also moderatelyexperimental uh and sometimescontroversial depending on who you talkto um flux is not using uh a flakes itdoesn't like require them uh you can useflakes with this uh like uh environmentformat if you woul�d like like I can Ican go and install any flake ref umthere's a bunch of Nyx people in theroom so yeah you can install the flakerefs you can do all of your next magicum you know your years of experience umdoing functional programming and uh uhyou know crafting perfectly perfectly uhyou know tree shaken software builds uhand then just put that into a fluxenvironment and somebody does a gitclone and they do a flux activate andthey have all of your stuff running oreven potentially building so um verynice but uh these are locked not justfor one operating system but for all ofthe systems that my environment isdeclared to support that's really nicebecause this exact environment that I'musing on this Linux AMD machine uh inGitHub code spaces if you go and checkthis out uh on your Mac you will be ableto use that exact same environment soum this is I'm going to increase thefont size in a second here againum why is this not I'm in the wrong repothis needs tobe workshopscale there wego i need to check the status of whereI'm at here right now right i'll pushthatup oh wow okay hold up onesecond i have to mount adisc look away i'm doing uh developerbuild stuff rightnow okay there we areoh boy allrightum scrapbookdev okay so this is thatexact same developer environment uhright here right the caerts GCC go imagemagic package config uh I can activateit uh and then now I have thatenvironment on my stack of subshells andI have access to go go is coming from myflux environment uh you can see thatit's coming from here uh and thatactually comes from a immutable uhreadonly disc that you saw me just mountuh that's part of my Nyx installation umthat is built for Arch 64 Darwin uh butthis is the same exact git repo and thesame conceptual environment uh with thesame life cycle of management uh anytimeyou change the software project uh forthe uh Linux AMD4 stuff AMD 64stuff sookay we we've thought a little bit aboutenvironments we've talked through kindof what that means uh you have the codespace cloned you probably have both gitrepos cloned you have your your git repoadded to the workspace so you don't loseany of your work um so just one moretime unwinding since I know that we'vekind of gone on a couple tangents but inthe scale next GitOps repo if you go tothe scrapbook devenvironment so in this state and you runflux activate you will get thedevelopment dependencies necessary foryou to run go build of this applicationif you run that command then it willstart to download a bunch ofdependencies from the internet uh atthis point now I have only one thing tomention to the Nixians in the room thefolks who use Nyx already uh is that youwould probably if you're veryexperienced with Nyx write a a Nyxprogram uh to get your dependencies foryour Go build um and for people in theroom who are not using Nyx uh I'm justasking you to run a go build and do a gomod um basically like hydrate modulesthe normal go way uh so I would say thatthis is stepping out of maybe the normalNyx path but what we are winning is alot of Nyx benefits for everythingunderneath the developer environment umand nothing is stopping you fromcontinuing to use Nyx from here but justlike what's really nice about this isyou can get the benefits or at least agood number of the benefits but then youdon't have to actually ask anybody tolearn really anything new at all um sogo build uh what you'll end up with ifyou run ls I I like to run ls-laum is a binary that's built that's justsitting in here uh great i have myapplication built uh if you look throughthe source code uh of the app inside ofthe scrapbook dev folder you'll see uhinmain.go that this application is using aPostgress driver uh it's using the Gofiber um like web application frameworkuh and then we also are doing some stuffwith mime types and we are using the gographics imageic uh bindings so theseare Cb bindings from a go applicationand I've done that uh kind ofintentionally because now you're askingfor kind of cross language dependencymanagement which is the kind of thingthat's usually really frustrating anderrorprone and cause you� all sorts ofissues uh the beginning of this programinitializes immig um with the C bindingsuh and then we open up a connection to aPostgress database uh and listen for acouple of uh views and then um there's acouple of you know kind of restish waysto put stuff into a list of things uhand then we serve uh some things from apublic directory and then um there's youknow if you want to mess with the HTMLand stuff uh you can go directly intoviews and index uh and mess around ifyou are a front-end developer uh andwant to have a JavaScript Go applicationcool okay so that's a little bit aboutthe application that we're doing let'sgo ahead and run it go image app okay myfiber server is running um because I'musing code spaces it has noticed that myapplication is running on localhost port3000 um you'll notice we're at acloudnative conference and we'recurrently not using any containers atall we are using the thrill of localhostum so open in browser that's great it'sgoing to go in proxy me to myapplication and uh just oh gosh right Ithought things are not working so I I'mjust playing a game i'm I'm I'm totallyexpecting that because uh you know my myGo application is trying to connect to aPostgress database and there's noPostgress database let's fix that i'mgoing to show you why it kind of is funto not be inside of a container all ofthe time uh I'm going todo flocks so like with with my scrapbookdev environment already active I'm goingto run flocks activateuh d-remote and then I'm going to say in theflock namespace so flocksslashpostgress and then I'm going topass another flag I'm going to say startservices and this is going to pull aremote environment that's just kind ofephemeral and Um of course we're doing alive demoso something didn'tworkwhy is it a flaky problem why did thatnotwork service manager is unresponsiveplease try again later pulling updatesfrom Foxhub i could also pull copy thisand attempt to run it i also can justnot start the servicesthere's a couple of things I could dohere i'll be truthful i was expectingthis to be like a like a wow moment butinsteadum503 yeah maybe but I mean I'm on a codespace you know like this machine is inthe cloud um I did get a 503 from theflux meta git repo that'sinteresting what can I do aboutthis i am goingto make a new flux environment uh withyou friends so we're going to make adirectory i'm just going to callitPostgress and then I should hopefully beable to flock pulld-copythe flockhostgressenvironment presumably if flockhub is upright now then I will be able to do thatif not I can get it directly from thegit repo the problem here is that weneed a database and we want to run itand Wowthat's Yeah okay cool let's go aheadand go get thatenvironment fluxends there's a repo under the fluxnamespace SL called Fluxends and if youum look through this list there'sPostgress uh and basically I'm justtrying to in flock Postgress.fox envumI'm trying to get to this TOMLfile and um wecan kind of just copy the entirething okay i wasn't in intending orexpecting that service to ever go downthat's never gone down for me before soum thanks for suffering through thiswith me um so what I'm going to do hereis I'm in this new folder i'm going toflock init a new environment uh and thenuh I'm goingto set my editor i think it's like thisandthen maybe does thatwork yeah there we go then I just pastewhat I just copied um so again thatthat's from theflocks repo in the Postgress folderunderfloxuh manifest tunnel i'm just going tosave that file and accidentally close mycode spaceuh from muscle memory and we're justgonnaYeah yeah the upstream repo addressthat's um so that wasuhflocks flocksenv uh and then in the postgress folderthere is a flockuh and then there is an env folder witha manifest toml and we basically justwant to copy this um technically youshould also have like the lock so if youwant you can just like clone the entirerepository uh but uh this should shouldprobably work even just copying themanifest at this point now I'm I'mtotally just playing jazz so umbut yeah yeah um so once you make a newfolder you c�an fluxinit right okay so here it just sayscreated the environment Postgress uh andthen um you can also just like open thatfile you don't have to necessarily runfluxedit um but that will do like somevalidation checking similar tocoupubectl edit uh anduh basically so like in this Postgressfolder if you just like edit themanifest toml here uh I would just pasteit in save it uh and then if you're inhere and you flock activate uh then itwill want to build that environment soafter activating it I'm actually goingto like exit and I'm going to show youthat you can activate and you can starttheservicesso environment does not have anyservices defined that's really odd to meokay I'm like reallyuh losing it right now thank youum where are the services in hereoh did I not copy theright file oh I see there wego i just I I wasn't able to save thefile um and then I keep accidentallyclosing my codespace okay coolso I think I've saved that fileyes andthen so I just say I want to buildPostgress uh and then it goes and itmakes sure that I have um the Postgresspackage if it's not there it goes andfetches uh the one that's needed for myCPU and architecture from Nyx packagesuh and then it initializes a Postgressdatabase for me uh and then tells me heyyou know there's some stuff that youshould go and try um here I'm also goingto So right now I have a databaserunning right so if I run PsQL I canactually talk to a database that'srunning on local host now um and then Iforget yeah so you can just type exit ifyou want to get out of there uh I amgoing to quicklyGP through the cluster directory insideof the root of scale nyx githops uh anduh I am just going tolook kind of recursivelyuh for theword I think u think recursively and andcase insensitive just table so if you ifyou do that GP or you look inside ofcluster appd dby yaml uh there is atable creation command here uh and youknow we're we're doing local dev sowe're just kind of imperatively messingaround if you copy that command and yougo into psql uh and you paste it then itwill create a table for you and then nowum if you in your uh devenvironment if you try to run go imageapp which is the thing that we built uhthen you should be able to open that inbrowser and now our app should berunning so what happened here is we'rewe have our development environmentthat's giving us our developmentdependencies uh and then we also havethe Postgress environment that we builtuh with the database that initializesitself uh and then I logged into thedatabase really quick on my localmachine with no authentication oranything uh and then added a table andnow our app is working this is prettykind of quintessential um experience forlocal development and what's notablehere is that we sort of were stacking soyou'll see uh at the top of the promptor for you it might be at the beginningof the line and now it's taking up yourentire window um it says flock and wehave the flock test environment whichhas like docker and all of those thingsinstalled u for our home directory andthen we have the scrapbook devenvironment for my go uh interpreter andthen we also have the postgress databaseuh and then the Postgress database isrunning a service um so that's uh you'llalso notice you know like you can sayuh wecan throw some uh pictures in here andthen upload them to the database and nowthey're you know being serialized andstored within Postgress so if you youknow stop the application now the app'snot running and then you know you startrunning itagain it'll come back you know um sothat's just like my little uh scrapbookwhich is like a little modified to-dolist where you can put pictures um theidea as well was that image magic is inthe go application so maybe if you arereally good with the image API then youcan start um you know making the imagesa little bit more generative in in thefun 2000s you know turn of themillennium kind ofway great so we have our application umI think we should try to think aboutwhat we can do to put this in acontainerum there are some easy ways and thensome um some more experimental ways sothe very first thi�ng I can show you uhwould just be uh fluxcontainerize uh what this is going to dois it's only going to take theenvironment not my application just onlythe dependencies uh for my environmentand it's going to put them into acontainer um this honestlyis it's I think an an early form and a aglimpse of the future and[Music]um I guess suspiciousownership why would that haveoccurred uh I think that this was thealle thing this won't happen to you ummy my code space is in a very differentstate uh than yours uh because it hasbeen through multiple restarts but umthis is a file system uh permissionissue that sometimes happens on GitHubcode spaces when they recreate the tempdirectory as well as some of the devicetree uh and I have a script inside ofthe dev containerIt'sthe postcreate script and the probablythe poststart should have fixed the permissionsthe next statement's runningokayanyway I'm expecting that to work if itdoesn't uh then I will show it to you onmyMac uh oh you know why don't I do thatjust now already too uhlet's So we ran that and then I alsoneed terminal hereso I'm I'm here with the repo checkedout on my Mac obviously uh because I'musing my Mac i shouldn't say obviouslythe all of the binaries that I'm usingfrom Nyx packages are built for my Macum there's no such thing as a Mac OScontainer right now that is anythingthat you would actually want to usethere's some experiments um but uh if Irun Flux containerize then what willactually happen fail to populatei need to I need to start Docker becauseI stopped mymachine at the start of the sessionlet's go ahead and just run that with KLima it's my favorite uh containerruntime uh on aMac Iguess meanwhile okay I need to fix mywindow management here we gocool uh back on the code space uh Fluxcontainer I completed and it wrote acontainer image uh to theruntime um so this image doesn'tactually have my app in it it only hasmy dependencies uh that means that if Iwant to actually um build something withthat then uh I need to probably write adocker file around it you know so I canuse those dependencies um so I basicallyhave created a base image out of my fluxenvironment uh the cool thing is um thatI can also run that same exact commandon my Mac what what will happen here isFlux will actually resolve the Linuxversion of all my dependencies and it'llput it into a volume inside of theDocker Damon uh and then it will usethat volume to build a container imagethat is Linux native uh for that thatrespects basically the lock file um sofor all of the Mac packages you'll getthe same exact versions uh of thosepackages with the same build flags thatwere built at the same time in ahermetic way on the public uh buildservers that are called Hydra um and umthen that'll assemble it into a Dockercontainer you'll see that this umprocess I wish it was faster uh and Iit's it's very possible to make this alot faster uh it's just that it's anearly uh sort of prototype of how thisworks um some of the ways that you canmake it faster would be uh I startedmessing around with a um kind of moremanual way of doingthis uh so I haveum yeahuh addflux generic docker file let's go aheadand push up a commit to the repo uh youcan pull that down if you have it clonedon your code so this uh Docker file thatI'm messing with uh let's go ahead andjust read the sourcecode it's dockerfile.floxn what this does is from githubcontainer registry flocks flocks so aninstallation of Nyx with the with theflock usability layer on top of it uhtagged at just one version ago um sofrom this image uh I'm going to make aworking directory and from my uhapplication directory I'm going to copymy environment into the build contextand into the container and um what'sreally important about this is uh if youmight have noticed when I looked atwhere the Go interpreter was or or theGo tool chain was coming from it it wasactually inside of this folder but notactually it's being sim linked uhinto what's called the Nyx storeuh which is basically a flat directoryof a bunch of hashed versions ofdifferent what are called derivationsthey're like pieces �ofpackagesum and so because everything is actuallyin the Nyx store and it's only simlinked then when you do this in yourDocker file you're actually only gettingthe metadata about the environment sonone of the actual binaries ordocumentation or um library uh dynamiclibraries are actually being copied intothe container here and then when we'rein the container not on a Mac but in aLinux machine for uh Arch 64 we canactivate the environment which will goand pull down everything in order tomake sure that it's actually runnableand then here uh we can actually querythe Nyx store so I'm I'm saying I wouldlike all of the requisites for theruntime environment for my current builtsystemum and or my current system um fromthe uh Nyx standard library uh and thenuh we have to do a little bit offormatting and then I also threw acouple wild cards in there because I'mlazy uh and then basically after we getall of the requisites thenum these are recursive hard links to allof the files in the Nix store but in afolder that I can look at separately andonce we have that folder um I tested afew things this seems to be like themost performant and bug-free way to dothis uh then you can call that a buildstage where we basically get all of therequisite store paths for everythingthat our environment depends on uh andthen we can copy from that environmentand then into um the Nyx store on anAlpine Linux machine uh so this this newcontainer that we're producing it nolonger has nyx or flocks or anythingneeded for nyx or flocks it only has thedependencies that are necessary for theapplication's environment um this is uheffectively equivalent except the baseimage is different uh to what flockscontainerized does but the benefit ofdoing it this way uh is that all of thelogic sort of stays in the Docker Damonand you're kind of only in a Docker fileso if you're using tools that use Dockerfiles uh or you want um the cachecoherency and behavior of the Dockerbuild x builds then uh this is one wayto do it so youcan you can say uh something along theselines if I docker build with thisgeneric docker file uh and then I I canname it something like say likescrapbook uh runtime uh and then I pointit to a folder where the context of thatenvironment is uh then you should get adocker build um so it loads up flocks ialready have it downloaded that machineis like a gigabyte because it includesnyx and flocks and all of itsdependencies um but it we're just usingthat as a builder image uh and then werun the flock activation internet'sworking pretty fast uh here in this umconference center lovely uh this isagain happening locally on my MacBook uhand then we query the next store we findall of the paths we make sure thatthey're recursively hardlin isolatedfrom everything else that's not relatedto it uh and then um this usually takeslike what 25 secondsthis step could definitely be removed umif you know you just wrote like a abuildex front end uh or did this adifferent way like directly with OCIregistria storageum but um yeah we get copied into ournew image uh and then you end up with uhin the case of my Mac this is an ARM 64Linux image that is runnableum you're in a shell and inside of Nickstore there's all of mydependencies right so uh you'll notice Iwas using Alpine Linux there uh it'sjust so that you you know have areliable base for a shell and don't haveto go and find you know the sim link uhto there should be like bash in heresomewhereuh ls next door gpbash yeah so I mean like you could runbash here and set the entry point uh andthen you wouldn't need alpine Linux atall but um there's a just an alternativeway to build those sorts of base imagesbut we want to be able to get ourapplication to a format where we canactually run it inside of a container umso let's go ahead and look at um whatthat Docker file might look like um thisis probably the most uh kind of standardway uh to do it so if you've used Dockerbefore uh what's happening here is amulti-stage build um so the first stageis up here uh and then as soon as yousee another from statement we're intothe runtime uh which �the last stage isalways the output image uh so basicallyuh from the scrapbook dev environmentthat we produced either with Fluxcontainerize or with uh my specialdocker file hack uh we're going toenable CGO so that we can use our Cbindings with IMIGIC uh we have the workdirectory being set copy our source codein there make sure we pull ourdependencies so we actually run a fullycontainerized build uh that's notdependent on anybody's imperative youknow kind of process or debugging orwhatever they did on their local machineum and thenum we make a new uh runtime container uhso we don't want to have the the go uhinterpreter and all of that stuff in itwe just want our C libraries ideally uhso we have a separate flux environmentthat is the base image for this thatonly has image magic in it uh and thenwe take our go binary uh and we umcan just copy our web assets in there uhas well as the uh binary itself and ifwe build this so you'll need to makesure that you have run like fluxcontainerize on both the scrapbook devand the scrapbook runtime uh or run thedocker file on both of those then youcan go ahead and run this docker fileand that'sjust docker build and then dot uh I'vealready done this so it's built the goapplication and uh you end up with acontainer image that is usable um youshould probably tag it with somethingi'm just going to callitscrapbook that so then um this containerimage you can docker runitand oh that's right umyeah thank you appreciate it uh that'sthat was the exactly the thing that Ineeded to hear thank youreally appreciate thatokay there we are okaygo download uh that was an interestingone you have to have a temp directory inorder for Go to build i I can't say thatI knew that um but I guess when you youhave minimal container images sometimessometimes you discover fun things likethat uh oh I should uh actually unpacksomething here um I'm like glossing overa little bit of a hack that I'vedone so let's go ahead and just dig intothat for a second um this run statementright hereum in order for this Go buildto link up to all of the CB binding uhfor the C bindings to actually callpackage config and and build all of theimage magic stuff into the binary itselfum we need to make sure that ourenvironment is active uh and so uh ahacky way to do this is inside of ourbase image if you look inside of the Nyxstore uh for the development environmentthat comes from flocks and again I'm I'mdoing a little hack here because thisreally should be pinned to a hash that'sspecific to the image uh not thatthere's more than one in there there'sjust one but uh and then we run theactivate script um with the environmentfrom that same um thing and then we canexecute um go build and that allows youto actually build against those uhbindings that come from Nyx uh thesekinds of wrappers are necessary becausesince everything actually lives in thenick store instead of the standard umfile hier file hierarchy um so likenothing is coming from slashlib or slashuser uh there's like nothing to bindagainst in those directories uh we needto have an environment that's speciallystructured to uh inform the linker andthe loader uh when programs are builtand started where to find theirdependencies uh this is just kind of aninherent thing about how Nyx works umand something like this uh is you knowagain a sort of proof of concept uh kindof hack uh you wouldn't want like thefinal like usability to look like thisbut conceptually what's happening thereis uh we are hydrating the environmentof the build so that it can find all ofthe C stuff um in a non-standard placethe reason that we want to find it in anon-standard place again is umeverything in the next store is hashedspecifically to the inputs or to thecontent address of the bits of thepackages themselves uh and since they'rehashed uh that means that we can befully sure every time we build that theexact versions of the things that wewant uh that our binary will neveraccidentally get the wrong version ofimage magic uh or a slightly differentversion of gibbc it's always only evergoing to be built against exactl�y whatit's supposed to uh and then when yourun it it'll only be ever it doesn'teven know how to accidentally find otherthings it's just only ever looking forexactly the the stuff that it needs uhand this um these sorts of loader hacksand environment uh wrappers and thingslike that areare in like99.999999% of the time like completelyfree uh you never will notice anyperformance or behavioral differencewith your application it'll just workbetter than any other time you've had aweird issue where you've had a versionskew uh between something you were usingon your Mac and something that uh gotaccidentally pulled and updated on somereconciling thing in CI but you likeonly found out after an incident so hasanyone ever had a problem like that youknow you're like you had a version likeslip on you uh with something that wasactively running in production and thenyou ran CI to like rebuild something uhand and then it just like didn't workthe same or the build didn't evenwork yeah cool seen a couple of nods uhit's like one of the most mortifyingproblems that you run into let's get toa place where stuff isreproducible so um at this point we havea containeruh that runs ourapplication um right like if I wereto thenum should be able tojust run uh I think it's like go imageapp yep cool andum another hack that's happening here isthe entry point for this container isactually the flux environment itself soyou'll notice when I activated the umscrapbook runtime environment there uhor docker image we ended up in a fluxenvironment specifically for this go appuh and there's a bash shell so I'll showyou what this looks likeum say we wanted to actually run thethis like just with a single docker runcommand um we haveto I think just pass the uh binary thatwe're intending and then that getsactivated as an argument to the activatewrapper uh so that way our go imageapplication is finding the exact versionuh of image magic that it needs uh inorder to actually run so that's a now wehave our application containerized andthere's some caveats about how to startit uh basically our our entry point isnot pre-programmed because we need argsum so I could have added args to it butlet's look at a uh kubernetes manifestfor um this kind of thing so in thecluster folder in the scale nyx githopsrepo um there is an app folder that hasa db yaml and a scrapbook yaml so ummost of these things are forillustrative purposes um so ignore thethe missing um you know shutdown hooksand things like that that you would needand like missing resource limits and allof that for an actual Kubernetesdeployment uh but here we have aKubernetes deployment YAML uh and we'recalling everything scrapbook uh and thenwe can point it to some container imageuh here this is using a different nameand tag uh than what I built just nowbut um you can build it at that samename and tag and then here uh we'redoing that sort of workaround right likepassing an argument so that we can loadthe the binary um in a maybe likeslightly non-standard way uh and then wewant to pass it some connection data fora Postgress database uh and we'll justtell Kubernetes that there's an HTTPport on 3000 uh as well as a servicethat's pointing to that port and then inthe database YAML there is a config mapuh here I didn't build the database withNyx uh you can do that um one of thethings that's nice about using theupstream docker uh image just fromdocker library postgress is that uh theydo have life cycle for creating tablesso you can just putum a SQL file inside of a config map uhand then mount it which is really niceand then um what's also nice and whatwas kind of being used by my Goapplication to connect to my Fluxenvironments Postgress database is thatthere are standard environment variablesto direct Postgress drivers to the rightplace uh so my flock environment wassetting these things and I just copiedthem uh put them into a secret uh andthen um made them appropriate for theKubernetes service name there's a littledeployment here again this is definitelynot a productionready database it'sexactly the opposite it's a singlerep�lica deployment of a Postgress umcontainer instance so it's not evenusing a stateful set uh and then uhthere's the volume mounts uh for that uhtable initialization logic uh andthen there's that so sort of the fullstory here is that we would be able touh run our application in the Kubernetescluster um let's go aheadand let's go ahead and skip actuallylike applying this and running thisbecause we have I think only a handfulof minutes left in this session doesanyone have any questions about flocksand nyx itself u and we can maybe spenda few minutes on that and then I have afew closing statements about uh githopshi yeah um so usually stuff in Nyxpackages is very up to dateum and uh if we search for the top uhlevel Postgress SQL I think wouldprobably be what we're looking forum so yeah SQL let's go ahead and flockuh show Postgress SQL and then I'm goingto pipe it through head and let's go seeuh what's on top so currently 17.4 is uhthe most up-to-date package uh I wouldexpect that um for any hightra softwareuh the usage um the the user base isincredibly wide and active um if youneed to update a package you can open apull request on NixosNix packages umlike you don't you don't need to knowanybody it'll be reviewed by somebodywho knows what they're doing um but forpackages like Postgress they have anactive community of maintainersspecifically looking at that all of thetime uh and then there are bots andscripts that are constantly looking atupstream places to automatically updatethese packages uh the result is that umNyx packages is larger than pretty muchany other um package repo on theinternet and it tends to be the most uptodate thanks for thequestionokay yeah sbombs think the story uh ispretty like um like like it's like it'slike begging for an sbomb story right uhand like when you when you do that umnext storeum uh you know query uh requisites youknow for some um you knowscrapbook dev uh flux runtimeenvironment for some CPU architecture umthen you get a list of every requireddependency uh that would ever possiblybe touched by uh applications being umyou know run in this way um and uhthat's that that's like a little bit offormatting away from an sbomb there's nolike flag for it right now butdefinitely the story is like obviousthere um also uh something I would saythat it's like kind of weak right now inuh I shouldn't say kind of it's veryweak it's almost non-existent is avulnerability correlation uh in Nyxpackages uh just the way that thepackages are maintained and stuffthere's there's nobody you know who'slike fed RAM certified like looking atall these things and trying to correlatethem um to you know like does does someCVE like apply to this or whatever uhyeah uh cool uh that's going to be thelast next question i'm going to takefive minutes to just talk a little bitabout the vision of like GitOps withthis so one really cool thing here iswhen I update the environment I get a deeven if I'm working with it imperativelyI get declarative changes to a manifestinside of my project's repository i do agit push and then when my teammates comein in the morning and they do a git pulland they activate we're all using theexact same versions of all of ourdependenciesso that means that any time thatsomebody makes a decision to pull on anew dependency or update somethingbecause they want to use a new buildflag or a new API or a new you knowfeature or subcommand uh or they justreally want pretty printing for thatJSON output that cames from you know AWSCLI and it they're putting it inside ofyour um you know CI script thenum any change that we make it justnaturally becomes a declarative conseququence of iterating and collaboratingwith each other it ends up encoded inthe repo this to me has like a reallynice mirror toGitOps where the reason that Kubernetesbegs for GitOps is Kubernetes is so widein scope networking cluster managementcompute storage and often across manymany namespaces across multiple clustersmultiple organizations multiple teamsand apps um infrastructure that's likedeep and critical experiments thingsthat have been there in productionshould never change in the next 10 yearskubernetes it begs to own all of thatand so people are begging to collaborateon Kubernetes but ain't nobody walkingin into the you know the office in themorning and they're logging intoKubernetes to chat with their teammatesright kubernetes is not a collaborationtool and so using Kubernetes as a teambegs for us to glue collaboration infront of it right and that's what liketools like Flux and Argo are really goodat doing is educating people about howthe infrastructure works and then givingthem an almost invisible thing whereit's like you don't even have to look atit when you are making collaborativechanges to your infrastructure whenyou're changing code the robots arealways constantly at work reconcilingand making sure that exactly what isintended is funneling through the buildsthrough the lints through the artifactpublishing signingverification to your fan out inproduction and I think that there's amissing step here which is that that'severything operationalwe need to take GitOps further we needto shift GitOpsleft and I think it starts with how wethink about using our machines um againuh I'm I'm really happy to walk peoplethrough things um if you would like tocontinue hacking on these things um youknow I I want to be respectful thatrespectful of everybody's time but likein these last two minutes uh if you lookin the scale uh nyx githopsum repo there's this cluster folder andum for any flux users that showed up uhthere are a couple of new things I wouldlike to show uh one is a usage of theopen-source um flux operator by controlplane uh who full disclosure I'memployed with um you can make fluxinstances and then point them to atagged release of the flux operatormanifestsum for installing flux at a particulartag version scheme uh and when you makethis and you install flux operator andand you put this one flux instance inyour cluster now you have auto updatingflux uh that's really nice if you've gota fleet a fleet of a thousand or youknow couple hundred clusters orsomething uh running on tractors orwhatever umum the next thing is okay well if we'reum going to be installing Flux this waythen it sort of feels like it'sreplacing Flux Bootstrap which is sortof the bread and butter of anybody usingFlux um so a good way to sort of emulatethat is we don't have to worry about theuh installation manifests anymore let'sjust make a git repo um point at wherewe're at create a secret that has atoken uh and then um make our top levelcustomization so here you can see thegit repo is pointing to the one that wehave cloned uh and then it's configuredwith a token and uh it's going toreconcile every minute on the clusterpath so that would be this which meansthat um now the git repo is alsomanaging what version of flux you'reusing on all of your clusters uh as wellas itself uh and then creating anamespace for your app and thendeploying our database and our scrapbookapplication to the cluster uh using ourcontainer image that was built with Nyxand again that container image all ofthe dependencies uh they were alwaysversion locked from the moment that weused them so there was never a shiftwhere it's like we went into CI and nowsome something different is happeningand there's never a drift when I'mexperimenting on something and thensomebody on another uh laptop pulls downmy branch and they're actually using adifferent version of Postgress like justthese differences they just can't happenanymore because we brought declarationsall the way through the collaborationplatform not just for productioninfrastructure not just for ops but forour development environments too uh andI think that that could be a cool pieceof the future so uh thanks so much forcoming to the tutorial today i wasn'texpecting a production incident to takedown Flux Hub but you know um there'sthere's always cool stuff that happensuh when you are friends with people atstartups and um I'll be like around ifyou know we want to have a little chatabout whatever you guys want to chatabout um and thanks so much yeah2025-04-15 21:57:55.860906 ��C;#��=ANnYtnUeJi7Uhi friends how's everyone doing yeah uhmy name is Lee um thanks for so much forcoming to the tutorial today today weare talking a little bit about somethingthat you may have or you may have notheard about uh and then we're going tobe fusing that with an idea that youprobably have heard about uh we're goingto be talking a little bit aboutNyx um we had any Nyx users in the roomoh wow all right so we're in goodcompany now there's like a half of theroom that raised their hand uh and thenthere's probably another half of theroom that has maybe even not heard ofNyx or never touched it or used itbefore uh andthen I want to sort of paint a littlebit of the visions and the possibilitiesso I'm going to be real with y'all i amin the lab right now like I am doing alot of R&D uh and I wouldn't say thatI'm maybe as experienced of a Nyx useras some of the other folks in this roomuh but we are all in good company andwe're going to run some experiments uhlearn some principles and try to hacksome stuff together uh does that feellike a good vibe for today we've got 75minutes to uh get our hands on thekeyboard and mess around all right theother thing that I am much much moreauthoritative in is GitOps right so wehave Nyx which is a completely um kindof separate idea from lots ofcloudnative stuff but there wherethere's a lot of overlap nyx is thissort of functional package manager thatlets you build reproducible software ina hermetic way uh and then you have abunch of addressed blobs that you canget from caches and you can assembleyour programs with graphs ofdependencies uh so Nyx is all aboutbuilding software and doing it in aprogrammatic and a declarative way umGitOps isabout declaring yourinfrastructure in a way that iscollaborative andsharable and so uh we're going to belooking at Nyx uh we're going to be youusing aum easeofuse layer on top of Nyx uhdeveloped by the fine and talented folksat Flocks um I did used to work withthem so I'm a little bit biased uh but II still use their software every dayit's very good um great way to share Nyxwith your team uh and then I work atControl Plane again my name is Lee uh Ithink there's two control planes i'lljust get this out of the way controlplaneIO um and I control plane employs me tohelp maintain the Fluxproject i know that I I just mentionedFlux earlier um and now I am telling youthat I have I work on and have worked onFlux for a really l��orization can begranular and to be honest it's sodiverse and really depends on theapplication we don't want each serviceto create its own authorization thisleads to silos specialized patterns andboth knowledge and security gaps thereis no standardized way of solving itevery time you start at a new companythe way you do authorization changesalong with itlet's go on a tangent to a few decadesago let's look at a similar problem wehad as an industry in the past datastorage before relational model existedapplication used custom logic on howdata was stored and accessed this causedtight coupling redundancy and in solvingthe same problem it caused inconsistencydue to lack of standardization therewere some atoms of standardization likenetwork model hierarchial model but theynever took off this was until relationalmodel was introduced we found the rightabstraction to access our data that fitfor 90% of our use case that we neverlooked back similarly we need the rightabstraction forauthorization that's the search we arein nowgoogle in search of that rightabstraction released a white paper in2019 called Zanzibar Google's consistentglobal authorization system in theirpaper they explain how their systemsolves all the problem we have as anindustry need a authorization systemthat's generic for all applicationshundreds of client services at Googleincluding calendar cloud drive mapsphotos and YouTube uses Zanzibar you cansee that it can adapt to various needsof different applications need acentralized system zanbar is useful forbuilding common infrastructure on top ofunified access control system itprovides in particular a search indexthat respects access control and worksacross application please note thesearch index part of it we will comeback to it later in the presentation butsearch in itself is a hard problem buttry solving it along with authorizationit becomes even harder nostandardization to solve that Zanzibarprovides a uniform data model and aconfiguration language for expressing awide range of access controlpolicies this led us to build open FGAopen FGA is an authorization inspired byGoogle's Zansbar it's a modularauthorization system which makes iteasier for teams with hundreds orthousands of developers to manage theirauthorization system centrally and mostimportantly uniformly we are a CNCFsandbox project we are at CubeCon itshould go without saying we providefirst classapo support for Kubernetesand yeah now that you know a bit aboutOpen FGA let me show you the power of adeclarative authorization by showing howOpen FGAworks we and we as a industry alwayslove declarative code much more thanimperative show me raise your hands ifany of you use at least one of thesetechnologies SQL Kubernetes manifestTerraform or React i'm sure every one ofyou yeah all theansw declarative way of doing things allof these technologies are declarativeand they're loved by the industry so letme show you how you can achieve the samething with your authorization system aswell this is a representative example ofhow Graphfana system works in the modelshown here you can see differententities it has a user and teams whichare identities it has folders anddashboards which are resources here youcan see we are associating differententities with each other for example auser can be associated to a team afolder can have a parent folder in thesame way a dashboard can be presentinside of a parentfolder now that we've represented theentities let's add authorization role toit a admin we are associating a user tobe a admin of a folderthese associations can be even moreexpressive as you can see here a usercan be a viewer right prettystraightforward but you can also assignall the members of a team can be aviewer think how powerful this is nexttime you add someone to your team theywill be automatically be able to viewthe folder or if you want to make adashboard public you can set somethinglike user star and anyone will be ableto view the folder we call this wildcard now that you added all theinformation to FGA the interesting partstarts let's say you want to add apermis�sion where only admins can createa dashboard inside a folder you cansimply do that by adding this line toyourmodel you can start adding any number ofpermissions in a similar way but youcould ask me hey isn't this eyeback nopebecause you are assigning a fine grainpermission of a user to a folder fjthinks of how a user is associated to aparticular object it doesn't stop atassigning a particular role to auser now you might want even moreexpressiveness to represent yourauthorization system you can usealgebraic operations you can addoperations to denote unionsintersections and even negations hereyou're declaring that a viewer can beanyone who is assigned to it or anyonewho is a admin of the folder canautomatically be a viewer this means alladmins of a folder automatically becomea viewer of the folder but you could askme you want even more expressivenessyou can represent even inheritance bysaying any viewer of the parent foldercan be viewer of the child as wellimagine how powerful this is theinheritance can span multiple levelsallowing you to represent even complexauthorization use case it can be likeyou can go to users teams organizationscompanies guestusers your imagination is the limit torepresent your authorization systemhere and that is it think of how manylines you would need to implement asystem a authorization system similar tothis now hoping that I've convinced allof you the powers of Open FGA let mehand it over to Joe to go throughGraphana's experience of migrating toOpen FGA thank youand so well now it's the time that likeI burst Puvam's Raj's bubble like withlike now let's go with reality and likewhat we've seen uh so our team actuallydoesn't attend a lot of conferences likeother Graphana teams that you see soactually I hope that you find like thissmall look inside of Grafana's likeinternals like interesting so on yourside if you want to move to open FGA thefirst thing you need is actually likethe need like uh you actually need tolike this type of features and for thisI'm going to show like a bit of thebeginning of graphana which is graphanastarted as like a single monolith likeeverything was like tightly integrateduh access control was done haphazardlylike throughout the application bydifferent teams but then we startedfinding out that well people want extrathings so we added plugins and westarted like growing like the monolithlike bigger and bigger and bigger likewith these plugins we also got likepotential risks from third party code sothis change actually meant that weneeded to be um well more fine grainedso like this need for fine grainedaccess is well like open sj's likeentire cell point uh but ourobservability platform actually hasgrown to a point that is extremelyinterconnected now like we have likethings like Kubernetes monitoringsynthetics monitoring K6 IRM SLO andthese are not like integral parts ofGraphana these are like applicationsthat are like joining the Graphanaecosystem and the thing is before theywould only talk like with centralgraphana and graphana would talk back tothem and now they're talking to eachother and this is like just a small partof thepicture and so now we have componentsthat are talking to each other but theyneed to go back to graphana to like canI actually do this which means that wellour access control system maybe shouldnot be living inside of thatcore then we have actually theopportunity so we started with havinglike the need and we were like okay weneed open FGA we need like better likeaccess control like I have big teamsusing uh like our existing accesscontrol and like making like decisionson the fly of how they're going tointegrate with our system but we wantedlike to standardize this all like withopen SGA the opportunity for us came uhwith the Graphana app platform so thisis a project that has been like ongoingalready I think for a year to make surethat there's like no difference betweencore features and plug-in features inGraphana by this I mean that yourdashboards and your SLOs's should usethe same interfaces and storage so nomore like core and like peripheral umresources they're all like sharing likethe same likepathways and because of this new appplatform we didn't have a choice anymoreour current access control system neededto evolve and we need to support likethis deep level of integration both withour resources and whatever you add andwhen I say whatever you add is probablyyour app aswell so first we need to know who ismaking the request this is the identitybefore checking what they can do theaccess so authorization checks happen indifferent scenarios but actually for usfor us we're focusing on userauthorization system authorization isalso possible with Open FGA but rightnow that's not our concern for uh systemauthorization we actually use more likecoarse grained access control which youcan see here like on the access tokenswe have like a small thing called likescopes which is a incident write andaccess token sign uh which basicallylike give wide access to all resourcesof thiscategory uh for users we actuallyrepresent them through ID tokens andyou're like now looking at this like nowthis kind of feelsfamiliar we took heavy heavy inspirationlike from off and like some IRS like8693uh which is like token exchange toactually implement like servicecommunication and a authenticationinside ofgraphana but what interests forauthorization is actually two thingslike the audience in this case so likewhat tenant we are talking about andlike the sub so the user like what'sthis user's ID orUID once we have these two informationswe can like post questions to our umsystem like does user of ID 31 who ispart of tenant ID3 have read permissionon like a specific folder likeairwin so we've established like why weneeded a change like like why we coulddo it which is the opportunity and likewhere open SJ fits so let's like likelet's do it let's migrate to Open SJ inthree easy steps which well define aseasy we've actually gone through a bunchof systems like built inhouse uh orgsand roles were introduced in 2014 inGrafana 1.0 zero uh so we're talkingabout a lot of like depth product depthtechnical depth whatever you want tocallit and this is a list of our entities inGrafana and this is not the completelist there's so many entities and somany resources that already exist in oursystem uh that um well we cannot do itall at once so the first advice is pickone resource pick one resource that isrepresentative of your system thatmatters where there's actually trafficgoing through it and start with thatstart small and then test out yourtheories and then generalize to therest and this is the resource type wechose folders and dashboards dashboardsis like the resource folders a bit thecontainer but it's a resource in itselfit plays great um like with Open SJsince folders actually have recussionyou can nestfolders and handling um like nesting hasactually been a weak point of ourprevious system so this is likesomething that we're really excitedabout this screen like for example showsyou how folder permissions typicallyappear to a graphan administrator likeyou can assign folder permissions for afew different things uh you can give itto basic roles which kind of act asteams in in the end you can assignpermissions directly tousers and as well like uh you canum change actually if the folder is evenviewable to otherusers but on this actually we wantedlike it would have been nice if someonehad warned us of like a few like dangersfirst is actually the number ofpermissions that are written so likeOpen SJ calls themtupils the more tupils you write theslower your system is going to be likeeven though it's a black box still has adatabase follows everything just becauseyou can write a lot into it doesn't meanit makes it easier to read it back thenwe have the complexity of the schemalike Pamra told like your dreams are thelimit was that it but uh in in realitywe found that actually schema complexitycan actually affect your performance wetried to represent a lot of existingconcepts we had that in the end like didnot play well at all uh because likethey complexify the schema alot um and finally graphana supportsdifferent types of databases so my SQLposgress SQLite my like the base versionthis is um so there are three differentdatabase integrations in open FGA likewe actually contributed like the SQLiteone uh and they don't have the sameperformance actually they're notoptimized the same way so if anyone herelike feels like grabbing some like mySQLoptimizations like please check out theopen FJ repo and contributeSo now that we've refined our accessmodel uh we need to move on to how totacklemulti-tenency you can talk multi-tenencylike through the schemauh which we thought about but since wesit in a weird place in terms of likecontrol plane uh we decided to do itusing stores and this also comes likewith another side advantage uh stores inOpen FJ allow you to separate liketenants information like into separatestores uh and by basically namespacingthem so you know that if you're usingone store you'll only get data from likeone tenant this is an advantage becauseit avoids you from like oopsaccidentally granting like cross tenantpermissionsuh and uh well it simplifies a lot ourdeployment so how we do it is we assigneach tenant their own dedicated store wewrap open FJ with middleares like onemiddleware for us like uh acts onauthentication another one for examplelike maps the incoming request to thecorrect tenantstore uh and also on this I didn't saylike I said like we need one centralizedauthorization system i kind of lied likewe actually run one open SJ per clusterso like per region that we support sothat means we actually don't need likelong latencies or like global likespanning so regarding like datamigrations and this is something that wehave been bitten a lot in our previouslike access controladventures and this is like what likewe've learned is that feature toggleswill be toggled uh it seems quitestraightforward uh I wish we thoughtabout it as well but customers like willand users will like toggle them on andoff for every reason and you'll probablyhave like an instance on some side thatchanges like some behavior it will gettoggled on get toggled off so do writealways and if you do behavior changes dothem in both pathways the legacy and thenew one because what usually happens isif the old pathway still has the oldbehavior people will toggle off and youwon't get like feedback like on thatchange that you did secondly uh we haveuh upgrade and downgrade cycles and thisif you ship on prem happens a lot like alot a lot don't trust one timemigrations and actually verify the stateoften between the two stores uh we'vehad this like happen already severaltimes where we've done like a migrationso like okay like that's good we havelike some recurring check but like verylong and when it comes to access controlbeing out of sync for like a few hoursis not a goodoption and finally don't rush migrationstake time ensure like that your legacyis working but make sure that it doesn'texpand so move new teams to use like thenew access control system the new accesscontrol service but keep your old onelike running and stable and finally useshadow calls when you're implementing uhyour new access control system like havelike in the beginning have requests goto both engines to get to make sure thatyou're getting the same response fromboth this will help you make sure thatyour schema is really good uh and itwill avoid nasty surprisesafter and finally on actually using thisdata I would like to focus on searchwith permissions as said like this islike a hard problem uh and it'ssomething thatwe like took a very long time looking atthat we were like really afraid of ofthe how we'll make this work like we hadmeetings over meetings and like PC'safter PC's trying to understand like howto make this work and in the And I thinkwe surprise ourselves with theresults so for example for dashboardslet's let's think that well you have avery small store there's not a lot ofdashboards the user like also will nothave like a lot of like search resultscoming from it like this is like yourbest scenario which is the filtering islike very restrictive so you have fewresults and your pool is also not thatbig so on this you can basically querythe database so uh like with yourfilters and then like just check withthe authorization system like can theuser review this one this one or thisone that's like your probably like thesimplest one you have we'll call thisone like search and checkbut the search and check method wediscussed actually breaks down if theinitial search result is very big and aswell if you have a lot ofdata on this case actually this is fromopen SJ's guide that if your user canaccess like around like less than athousand of the results actuallyfiltering by these so what I mean bythis is going to open SJ and asking likeoh like what dashboards can this userand using that as like to feed your SQLquery might be thebest but the thingis when you're listing all of the objectIDs if it's more than a thousand objectsso like more than a thousand dashboardslike this stuff's working because it'slike just too much graph transversal forOpen FGA uh and as well like you're likebasically like having like huge thingsto fit into yourqueries and well it might be slow likevery slow when this occurs and like sowhen you have high cardalic access highcardality access using a local searchindex like might help like to pre-filterbut it comes with a few cons which isyou have like more things to maintainyour local index will go out of date andyou'll have to keep it in sync which issomething that Open FJ SH's watchendpoint helps with but it's one morepiece in theloop and sofinally there's a practical approach andthis is actually something that we werelike like we are trying to find like thebest solution for every case whichdoesn'texist and then we decided to bepragmatic when you cannot like when youdon't have everything aligned like makeit align filter more so actuallyremember when I showed you the the stackID on the open ID tokens or on the IDtokens well with that informationactually like our search service canalready filter out all of the resultsthat are not like belonging to thatstack or to thattenant so if you are actually like veryaggressively filtering at the beginningthen you'll have like less things tolike check we found this like works verywell like from most of our like tenantsbut we're also looking into the futureof having like a search selector of thedepending on like the statistics of thetenant to apply different searchstrategies so finally how do we packageall of this authorization logic like tomeet the different deployment needs bothfor cloud and for on premiseswe use a toolkit that's developedinternally called Grafana DSkit it'sopen source go check it out it has likea lot of cool things it's called adistributed services kit and uh this kitallows us to either run it everything inthe same binary communicating in processso it's actually gRPC over go channelswhich is pretty cool or just separateand deploy it separately so this givesusthe this still makes it easy if you'reOSS to deploy it using like a singlebinary without having to deploy a secondservice but if you need the scalabilityto split out like the access engine andlike scale itindependently on our let's say user sideand when I mean user side I'm talkingabout like our development teams we givethem um like we use we have it it'scalled offlib it's our um library forlike authentication utilities uh whichhandles caching it handles like tokenexchanges for them so it makes it reallysimple so they can just focus on writingI have this resource I have this userthis is like what I want to know if theycan do withit and so like from our side here atGraphana thank you very much forlistening today uh it's been a fantasticexperience working with Puam Raj and theentire open FGA team they also put outlike an official open adoption guidevery recently which is really good andum yeah like uh very happy to have youhere And I hope you had a good a goodcubecon thank youbeing the last talk of open uh of theday I think we have some leeway if youhave any questions please go to the micor else we'll be standing near the stagehappy to answer anything thank you folks2025-04-15 21:57:56.411656 ��=#��9ADmfZq70WOxIthank you for attending this latesession Um just quick uh introduction Myname is Larry Kurvalo I'm an independentanalyst and I uh bring great paneliststo the stage and we are going to talkabout compliance as code First of allthe caps we all wearing are the is theCNCF project uh that's in sandbox rightnow and that's the open source projectthat we are going to talk about a littlemoreUm compliance is code is really abusinessliability and you will learn how toachieve continuous compliance uh withautomation and standard��H<#��GAZOG1J1Niuh0let's get this started uh well I hopeeveryone's like feeling comfortable likenot too tired like on your lastbasically slot session ofCubeCon uh well thank you for choosingto be here like I know it's somewhat ofa stretch after like these days and wellthank you for choosing our talk whenthere's actually other like twosimultaneous like authorization talksgoing like on others so by the way thisis your cue if you just realize you'rein the wrong one like just just sayinguh so my name is John uh but actuallyeveryone calls me Joe so call me Joe uhI work at Graphana for the identity andaccess team uh we actually provide likeum like a application platform for otherteams to build features on so let's saylike having like the authorizationchecks like uh having like resourcesthat are like instrumented so let's sayprotected witharbback and actually our journey kind ofstarts like in 2024 at cubecon in Parisuh there we saw a presentation aboutopen FGA uh we actually were reallyexcited and we contacted the open FGAteam afterwardsuh like they were great like uh we werelike really excited to get like anintegration or like a partial migrationgoing uh but after this like initialexcitement we noticed that likesomething was missing uh we couldn'tfind a lot of like real world examplesof moving multi-tenant systems like ourson cloud to open FGA uh and well there'sa lot of examples like for new projectsand even like for new multi-tenantprojects and if you are starting like anew project in that case save yourselfthe trouble and like start integratinglike with like access control enginelike from the start uh but on our sidewe have a lot of like legacy like let'ssay baggage like almost like 14 years ofuh access control features and so wewanted to know like how are other peoplelike handling this transition so this isa bit what we want to share today uh andhopefully answer like questions likethat you have like well would open FJfit my use case like what hurdles mightI expect like in this migration and Ithink to start like the best thing is togive it over toPuamraj hey all I'm purajangati rajinI'm the technical lead in the fa team wedo everything from integrating open intoCM platforms identity platforms andobservability and uh yeah let's get intothepresentation if you start a new projecttoday would you build a login for itfrom scratch there are so many greatlibraries frameworks and services outthere that we don't need to build thisourselveswe all know of protocols like open IDconnect oatsu and samo that we can adoptbased on theneed but imagine authorization eachproduct builds their own authorizationthere is no standard we usually adoptsometimes even within the same producteach service does authorizationdifferently this is a problem right nowwe as a industry don't have a goodsolution forauthorization still notconvinced check out the OASP top 10 listfirst in the OASP list and three of thetop 10 in the OASP API list areauthorizationrelated so why is authorization a hardproblemthe requirements of auth�ization at thissession Um we will start with thepanelists giving one slide They'llintroduce themselves They'll talk abouttheir slide and for those who havequestions there isa mic back there if you can stand upthere and uh ask your questions it wouldbe great so that we can record itproperly and have it for for those whocannot attend and attend the replay Umso with that um without further ado I'mgoing to start with um Robert who willtalk about your session Please introduceyourself Sorry Can everyone hear me yepGreat Hi my name is Robert Fkala I'mwith Sunstone Secure Uh we started in2019 uh primarily around the open sourcecommunity as a group of senior SISOexecutives and our core area is publicsector healthcare heavily regulatedindustries like pharma and from day onewe were uh trying to solve the problemfor ourselves how to manage compliancein large enterprises with minimal staffreduction in in budgeting and yet anincrease in attacks and increase inthreats and From day one compliance iscode is uh an investment we made Uh webuilt out knowledge graphs and haveapplied machine learning to that overthe last 5 years Now that's actually acommon way to do things called graph ragand we've been involved in uh the policywork group in Kubernetes uh working onformal verification with NIST and othergroups and uh we are contributing all ofour AI related security and complianceto a nativecurity.org So glad to be hereLooking forward to a great panelsegment Yeah Uh hi I'm Simon Metson UhI'm SVP of engineering at EDB We mainlydo uh Postgress Um worrying about datahow that gets stored how you process itin various systems AI analytics whathave you Um I've been working on kind ofcloudy products for longer than I canremember 13 years or so And uh one ofthe things that our team had to takecare of is taking our our environmentthrough uh sock 2 audits and PCI auditsand things like that And so being ableto sort of automate those problems awayI remember one of my infra team was sortof leaving the office and 7:00 oneevening which is sort of ashen and it'scuz he'd spent all day pulling lists andlists and lists of servers and ports andall this kind of stuff for a sort ofvery paper driven audit and it's likethis is not the way to do stuff So we westarted building some software to do itUh that became a product we or projectthat we open sourced uh five six yearsago called auditry and we're workingwith the Oscar stuff and and compass tosort of see how we can kind of plum thatin and maybe bring it into CNCF in thefutureAnd Ana if you can say something aboutthe open source project also that wouldbe hi everybody my name is Anka Syler Umthank you for being here Um I'm adistinguished engineer at um Red Hat andIBM and um I have two minutes now uh1:45 to tell you everything aboutcompliance as code which is impossibleSo I packed as much as possible on thatslide that you can analyze afterwardsyourself I'll just give you the keys tothat So in that picture you have thehorizontals and they are the variouslayers going from compliance as code aslower as policy as code and this is whatSimon talked about and um on the top youcan see the regulations they typicallycome as PDF spreadsheets that cannot beused if not in a programmatic way So thecompliances code particularly Oscal opensecurity uh control assessment languageis a NIST standard for compliances codeprovides schemas for specifying yourcontrols catalog This is there in yellowIt allows you to specify uh fortechnology how those controls areexpressed as rules and then uh at thepolicy as code level allows you to mapto the actual check ids and the evidencethat is associated with that So theseare the horizontals Um now the verticalswe give examples for each of those uhhorizontals One is in the DORA contextand the other one is in the EUI for AISo there are controls rules associatedwith that and so on If you want moredetails now you can look at the text Yougo on the left side of the screen andyou can see u the Oscar link where youcan go and learn about the details aboutcompliances code and the uh artifactschema And you can also look at our CNCFuh Oscal compass uh project thatprovides an SDK trestle for the Oscaland also a platform on top of that So itallows you to manage all the layers thatyou see in the picture So the um needfor this compliance code became veryevident when we uh are bombarded nowwith many new uh regulations and alsothe industry is moving from annual andquarter rewarded to uh uh continuouscompliance So here is your solutionEugeneHi hi I'm Eugi Watan from IBM ResearchTokyo uh I'm working on the uh how toapply the AI and JI and Asian technologyto the this polish code field and uh umthe this area the uh the as you know thelot of the uh uh different persona uhwork uh involved and uh with a differentuh knowledge and different experiencedifferent tool So the very the processis very complex and now uh we have AIgen AI and agent So the uh we are tryingto apply the how the uh agent uh AI jican help the whole entire process Uh westarted from the uh an uh anal uh use AIfor analyzing the existing policydocument control and the requirementdocument and all the u uh uh knowledgefrom the different persona and then uhwe uh the new regression comes in theagent uh JI use is used for analyzingthe what is the exist what is can bemapped to the existing control asexisting uh uh framework and uh find thegap uh then uh identify the what what ismissing what needs to be filled thenapply the jai to generate the uh controland uh and this is over uh the processuh um so the we are we are using tryingto use this uh agent and the JAI to theimprove the overall uh speed and uhefficiency Yeah thank you Great Thankyou Thank you So we're going to gothrough some um questions that we havealready you know prepared for For thosewho have any questions whatever youdon't understand we got a great panelhere Please make use of it Please standup to the thing when you're ready and Ican I can get your questions to them Umso Ananka how do you prove the AI usedfor compliance is itself trustworthy andoperating correctly Okay so do you wantyour slide back again if you need totalk to it okay Yes it's um so you haveseen that now we have uh genai right andwe have uh uh in the pipeline multipleagents to help with the process and uhobviously the question is how are weable to uh evaluate and the answer isbenchmarking So together with the agentsthat uh Eugi mentioned and the first onefor the assessment for policy as code isalso available in open source We alsoprovide an IT bench If you are googlingfor IBM IT bench you are going to see umuh the IT bench with um two three uhsamples of agents for uh CISO complianceassessment for S sur for PHOPS and theycome with about 50 scenarios and theyare uh modeled uh in a real uhenvironment They are deployed in a realenvironment to test the agents that youcan subscribe You can subscribe your ownagents You can use the ones that we havealready to play with or you can registeryour own and you can test them againstthose in the case of compliance 50compliance scenarios These are typicalCIS benchmarks that are made availableuh deployed in a real environment withanible with OPA um for real and for umuh Kiverno and for OPA So this is theway that we can set up benchmarking andprovide a comparison uh uh not onlybetween um how the agents behave butalso against a standard of uh uh what isthe right answer because we haveobviously the ground truth together withthe 50scenarios Great Okay So next we'll go toYugi Um and um how do you approachnational natural language compliance ascode uh this could include crosswalkingvarious regulatory frameworks and downto specific controls as well as AIingesting ragging tooling with many youknow uh policies and procedures androundtrip engineering Um yeah the is uhuh initial input is a natural languagetech so it's highly very abstract andnonoperational so the first use AI toextract the uh input information andfind mapping with the existingcompliance framework and that uhrequirement that part is the the firstfirst layer so the uh we are using doingthe layer the layering approach Firstlayer is a very the uh uh analyzing thedocument Then second layer is a log Souh many many the curated body of thedocument around the framework thecontrol requirement is uh picked from byusing lag mechanism that input to the AIThen then AI use do the uh analyze thecrosswalk helps the crossworking map uhprocess So find the commonality betweenthe multiple multiple theregulation then the if the if the newregulation regulation requirement coversuh is covered existing the control thenit's okay but if if we if the new somedata is fine that part is filled by theuh policy generation and uh and alsothat this part if the policy exists Someimplementation exist but some changesrequired The you we use the agent to uhidentify the what is the del datarequirement and how to change theexisting policy to the new policy thatpart is also the done by theagent Okay So next uh next question isto Simon what operational challengesdoes a cloudnative environment presentand what about the opportunities thatyou get um so from a sort of compliancepoint of view I think there's there myanswer to both questions begin with a Dwhich I thought was cute Um so the thechallenges come down to the dynamism ofthe system So you want to have anautoscaler You want to have things whereuh pods are ephemeral coming in and outof of existence you want to be able toenable developers to spin up a databasethrough an operator or whateverUm and that's how you get the benefit ofthe cloud native environments but that'scompletely at odds with some of thecontrols and expectations that thosecontrols have So um we're looking atFedRAMP uh for for some of our productsand and there's things where like thischange requires a general to sign off onit right so having an API that lets youspin up a thousand of these things thegeneral is not going to be pleased withthat right so that dynamism causesproblems So not only do I need to sortof measure and and track and collectevidence in the sort of rapidly changingenvironment but also the policy needs tochange and adapt and adapt and pick upthat kind of uh real world new braveworld sort of thing But the the thebenefit of the the cloud native stuff isis another d declarative right i knowthat if we provision a database throughCMP for exampleum it's always going to be the samething It's the same image It's the sameset of ports It's the same behavior Andso I can I can use that operator to makesure that when I make another one it'sthe same as the last time that has beenapproved by the general And and that canthen feed into policy and behavior andprocedure because it's not about everytime I make one is it okay it's I'vemade one once was that okay okay now I'mhappy with making more of them andstamping those out That declarativenature that you get from Kubernetes It'sa lot harder to do with other other setsof tools You can do it but but cubereally gives you that opportunity AnaYugi you want to add anything to that uhto the challenges or opportunities or wecan go to the next question Okay Um soI'll go to Robert Um we all thinkingabout cloudnative compliance but whatwould AI native compliance look likewell actually I'll I'll answer that andand yeah continue the conversation onthe challenges that Simon was speakingto As you were saying that um I Icontrast cloud nativeuh confidence in that declarative codewith what now if agents are generatingthat code Um and that's the kind ofdesign I'm seeing In fact you come thegreat part about KubeCon is you comehere thinking you have this bespokeproblem that no one else has ever seenAnd sure enough this morning I went to agreat talk Google gave this morningabout using AI LLMs to generate theircontroller code right and and the pointthey made is the same point I'll makeThey generated a thousand controllerswith their LLMs Now that someone had togo review those so now the task is justshifted to a different type ofcomplexity So if if Kubernetes and cloudnative today is about the the comfortzone around declarative what how do wetransition to a model where agents andprompts define your system now how do webuild the skill sets for those humanswho have to be part of that trust loopuh how are they going to evaluate howare they going to keep up with the scaleof code that's being generated uhbecause you might not have humanscrafting very small you know blocks ofcode that are highly tuned It may bemore advantageous and scalable to havethe agents generating you know thousandsof blocks of code that are reassembledand dynamic every day So the one oneexample of how Fed ramp breaks down inthat world or Nate 53 breaks down isthey assume you have this staticinventory and they and they tweak thisto say okay well we understandcontainers may be a lot you know more ofthem and they may be ephemeral So atleast give us a container imageinventory but how is that going to scalewhen you have agents generating theimages generating the deployments evenwithout right you have the problem of Ihave an immutable image so how do I do aCV patch right that's the fun one I haveto change the image to fix the CVE and Ihave to fix the CVE within 24 hours orwhatever the timeline is so you've gotthat dynamism is is just prevailentacross the whole system and and you'vejust got to work out how to deal withso I think it's you know we're going tobe orders of magnitude more dynamic Thespeed is going to have to increase andthen the human ability to keep up withthat code and the risks and the threatmodel that go along with that That'sgoing to be the biggest challenge I seein the next Yeah I I can add one moreum as uh you execute allthese new controls and let's say thatyou know uh we solve all the AI neededcontrols and the DORA and all of thekind uh one problem that I have seen anduh if you are just starting on thatjourney you will notice as well is thecultural change that is needed in yourcompliance team to move from the currentmanual processes and screenshots totechnology and the main problem that isthere is that these teams are by designnon-technical and they are they are uhuh uh right now introduced in all thisworld not only on the technical levelbut with agents and with AI and they areoverwhelmed So I think one thing thatyou need to have in in your agenda is tohelp those teams bridge the gap into thetechnology either with um support fromum aentic uh advisors right or technicalteam it's much easier for a technicalteam to be um uh upgraded to skills thatdeal with compliance than other wayaround So this is one of the challengesthat one one of the things that we'vebeen trying to do and think about is uhhow do I how do I turn a complianceproblem into a software developmentengineering problem and so if I can uhuse an AI to sort of say um I don't knowthe control is that all data in rest inthe system must be encrypted it's likeokay cool I can understand that but ifyou say that means that lux needs to beturned on on every server suddenly itbecomes a configur uration managementproblem rather than a compliance problemand then you get into the evidencecollection around that But turning theseproblems into things that SRRES DevOpsteams engineers can kind of rationalizereally quickly or even use an agent tosort of turn into some system Um I thinkthat's how you get around that sort ofthing and and and the the roles sort ofare slightly different and it becomesthat sort of communication betweendifferent types of team Uh turning thecontrol into a thing that can becomecode or configuration or an agent promptor what have youGreat Um so the next one is for Yugjiagain Uh is there any questions from theaudience by the way i don't see anybodystanding near the mic so I'm taking thatthere isn't any yet Uh so Eugi the titletalks about speed where we talk aboutspeed of innovation Um what are thecurrent metrics and you know how do youbenchmark them and then the other one isa little loaded question is real timereally realistic and u you know how doyou audit agents in real time yeah thankyou the when we say the uh real time thethe actually the uh we we are notexpecting the real real time uhassessment of the by the agent actuallythe this uh the we are not fully rely onthe whole verification and assessment bythe agent that's not that's not todaybut uh uh by bringing the agent we canimprove the overall uh uh compliancelife cycle cycle Actually the forexample the uh the um agent can detectthe uh new comp uh requirement changethen or some control chain then the wecan uh agent can navigate the therelevant uh stake uh persona to do theaction uh and it can be a part of the uhCI/CD pipeline So the uh we can askbring the uh compliance uh activity inthe more shift form uh way the uh not sothat's a uh kind of efficient uh uhimprovement to bring bringing by broughtby the agent and for the uh benchmarkingthe uh we so one approach so approach isso how much uh how long the overall theuh new regulations new regulation newrequirement support that is one uhmetrics and another metrics is how muchhow many uh manual touch point includedin the some uh doing some uh uhsupporting new change So that kind ofmetrics is used for uh evaluating theeffectiveness of the uh uh AI and agentAnd the first last one is a audit auditfor the agent So actually that part isreally the difficult part thechallenging part uh one uh this is notspecific to the compress but uh this ifwe bring the ji agent to the thisregulated scenario that that part thatthat uh auditability is really importantSo one approach is the agent explanexplan explanability So keep the all thetrajectory of the agent is recorded anduh it provide easy uh transpartransferability transparency to the uhhuman uh uh verifier and uh keep therecord for the later compliance uh uhother compliance evidence that type ofthings uh is uh approached for the agentu uh auditability today Any thoughtsfrom you any of youokay Okay Uh we'll go to the next one UmAnka again what are the realistictimelines and budgets uh for digitizedenabled compliance as code you know Yeahthis is a question that uh immediately Iget uh when we're talking with ourcustomers when we bring the compliancemodernization and compliancedigitization Uh it was a uh we have avery good white paper um uh from thetime before Genai um around 2020 2019uh by call fire uh where they made anassessment of theuh uh uh total cost of ownership TCOwhen we introduce compliancemodernization and they show that usingOscalike technologies uh it will reduceuh the TCO for compliance by uh 50% Umand now we see as we introduceaccelerators like the one that Eugimentioned to his agents this becomes umuh much more complex uh cost evaluationbecause of on one side we have thereduction in the development cost usingagent to develop code it it is much moreaccelerated But on the other side wehave all the additional work in creatingthe benchmarks in creating the um uhcontrols uh the totally new path for uhexplanability like Eugi has mentioned Souh I think right now we need to have anew assessment I think the original 50%reduction needs to be uh re-evaluatedDefinitely there is an improvement Idon't think we have the choice not to gointo the digitization for the reasonsthat we discussed at the beginning ofthe call Right we have so many new um uhcontrols that are upon us We are movingto continuous compliance especially withAI I cannot rely on an audit that wasdone last year Uh so I think this wholething uh needs to be re-evaluated andand provide the new uh time uh uh costexpectations timelinesI always compare compliance as code toour experience with um with uhinfrastructure as code I see a lot ofyou here are very young Um I lived uh tosee uh the infrastructure configurationbeing managed with spreadsheets Okay youare lucky you haven't seen that So ittook about uh 10 to 15 years to matureand to have the infrastructure as codeas a de facto uh um practice So I do Iexpect 15 years for compliance as code Ithink with all the genai technology Ihopefully will not be there On the otherside I think compliance is uh u manytimes fold more complex thaninfrastructure as code which is dev secops people were technical It was with auh you know compli um a confinedbusiness unit In compliance we see thismany personas across many business unitsdifferent type of skills Um so just inyour plans with compliance modernizationdon't expect that to be done by secondhalf of the year This is something thatwill take many years to get there Andagain I cannot underline enough yourcultural change will be the one thatwill be the longest that I'll just addthat said you can get a faster time tovalue by just bringing the AI into theassessment process and the discoveryprocess So often we'll come into asituation where there's a lot of uhbespoke documentation on policy andprocedure There's kind of a mix ofinfrastructure as code and legacysystems gathering all that together andthen you know ragging that vectorizingthat reviewing that questioning thatthrough the LLMs you actually discover alot of unknown unknowns and there's beenyou know a lot of tribal knowledge inthese enterprises where now the LMS canhelp surface those you know lots ofcomponents in the component model thatthe the sysos and security team didn'teven know existed until the LLM processsurfaces so time to value I think isvery rapid time toperfection that might take years Soplease join our Oscal compass to help uhuh contribute to this content andaccelerate this path I think um theother thing there is that the analogywith an infrastructure is code is kindof interesting because there there'sthere's a like we used to use chef andyou kind of could pick up a recipe So ifyou wanted to deploy a system there wasa recipe for chef and you just do thatThe thing that's challenging I thinkwith compliance is even though theregulation is the same and the thecontrol is the same the implementationin the company is very very specific tothat company or that organization So onecompany might say well we use this toolfor our backups and then another companysays something different and so sothere's not that kind of the thingthat's going to take it make it takelonger is that there isn't that kind ofcommonality that you can take a solutionfrom there to there You have to think inthat more abstracted fashion of how dowe talk about controls how do we havetools that help us but the company'sstill going to have to make their ownpolicy and procedure and all that kindof fun stuff Yeah Not yetOkay One one last question I still don'tsee any questions from the audience SoumRobert what does a CI CISO's day looklike in five years as AI becomes moreintegrated into all aspects of the techstack and operationsyeah I would say that you know the thething that humans are good at isjudgment right so there's always goingto need to be a human in the loop to youknow provide the trustworthiness nomatter how many proofs we have ormathematics we delve into uh you knowhuman organizations are human so they'regoing to need to trust uh an expert oncompliance and the use of AI how thoseAI models are trained and the wholepipeline to how you ingest data theprovidence of data I guess for me thethe crystallizing moment is for those ofyou who have have had to be part of a anincident response or a breach responseand you're usually immediately broughtinto the seauite usually there's outsidecounsel and often regulatoryuh officers and suddenly the sis's jobis to be fact-based and to provide thatsubject matter expertise in a moment ofcrisis I don't see that role going awaywith an LM anytime soon I think that thegravitas and the judgment and theexpertise that you know humans bring tothe loop will always be an importantpart of the process and I think theskill set that you know a sysa will needin 5 years will be applying AI in theright places and applying human judgmentuh to counterbalance thatI think every cisu in the room washoping you'd say drinking my ties on abeach right so um yeah no I agree Ithink it's and it's that interestingpiece where there's the intersectionbetween um technology problem and abusiness problem This is a business riskWe're accepting this business risk Thatthat decision is hard to sort of pushinto an LLM because it's going to bevery very contingent on lots ofdifferent factorsVery good Uh any final comments uh butotherwise um if if you're all done onthe stage thank you for attending andplease give a hand to our panelists[Applause]2025-04-15 21:57:57.292030 om messages allowadditional messages to be sent betweenthe agent and server using the opampconnection this allows the connection tobe reused for new features and while theoverall structure of custom capabilitiesand messages is not defined by the specthe format and content sorry is definedby the spec the format and contents ofthe messages are not if you want toimplement a new opamp feature start bydefining a custom capability and definethe message type and data for eachmessage for messages to be sent both theserver and the agent must indicatesupport for that custom capabilityso to give you an idea of how this worksuh there's an example in the spec ofusing custom messages for a servicediscoveryfeature the server sends the agent acustom message requesting the availableservices that could be monitored andthen the agent could respond with a listof services that it discovers so firstwe define a capability here we have itcalled com example discovery then wedefined two messages one for the serverto send a discovery request to the agentand another to carry response from theagent to the server again this is justan example of a possible use of custommessages and while the capability umwhile a capability may be vendorspecific uh vendors are encouraged topublish their custom messagespecifications to allow for moreinteroperabilityavailable components is the most recentnew feature added to opamp um as we'llshow you later the supervisor allows forcustom distributions of the opentelemetry collector to be managed usingopamp this means that you can now createyour own custom collector distributionscapable of remote management using opampwe have a flag to indicate that thecollector accepts remote configurationbut how do we know if a configuration iscompatible with the agent whatcomponents were included with the agentwhen it wasbuilt on a message to the server theagent sends a hash of the availablecomponents and because the full list ofcomponents can be long the server isrequired to set a flag on the responseto request the full list of componentsthe agent will then respond with a fulllist um for example here it says that isusing a specific version of the file logand OTLP receivers there may be over ahundred available components in acollector distribution but with thisfull list of components the server candetermine if a configuration iscompatible with the agent and provideoptions based on different componentversionsall right so now that we've taken a lookat what op amp can do for configuringagents generally let's take a look atwhat are some of the considerations anduh implementation details uhspecifically in the collector so tocontrol the collector uh we need to dotwo things first we need to send itconfiguration so that's the whole remoteconfiguration part from the server uhand then in return the collector isgoing to return to the server status andtelemetry information from the collectorso that you can monitor it and make sureeverything's working all right um wehave two ways that we could do this uhfrom an implementation standpoint uh onewe could put all of the op amp piecesinside of the collector so the collectoritself would speak opamp uh secondarilywe could have something kind of do thattranslation so it would uh run thecollector process and the collectoritself wouldn't necessarily need to knowhow op amp works um what we're doingright now is the second approach umbasically it's first of all it's simplerum we don't need to worry about if weput in uh op amp into the collectorthere's a lot of life cycle things thatneed to go into consideration um andthen if something like a crash happensyou need to uh figure out how to recoverfrom that it's just more complicated soum having another process start and stopthe collector is just simpler uhadditionally it's a lot more powerful umin addition to communicating crasheslike I mentioned um we're also able toperform binary upgrades more easily yousimply download the new binary stop theold one and start the new one so whatthis looks like is in the middle of thecollector and opamp server um kind ofcommunication we have a supervisor sothe supervisor is started before thecollector um and does all the op ampcommunication with the server um itconfigures the collector by writingconfig to disk and then using thecollector's command line arguments topass it that config um in return thecollector can return um any informationthat you need on the server side to uhensure the health and um like justgeneral um good state of the collectoruh you can return um attributes thatidentify it so that you can figure outwhat the collector is that you'rerunning you also can figure out likeAndy mentioned supported components umyou can see the resolved configurationthe collector is currently running so ifthe collector is going over a network orsomething to grab additional config youcan see that um and then finally likelivveness information about all yourpipelines uh and the supervisor willgather logs through std out and makesure that those end up uh in yourtelemetry back end as well um thesupervisor also supports forwarding itsown telemetry as well as the collectorsto the telemetry back end of yourchoice so um to use a collector with asupervisor you need uh basically adistribution that supports the op ampextension um the upstream distributionthat you can use to do this is collectorcontrib uh we recommend you just usethis for testing uh but nonetheless itcan be used to test out the supervisorum additionally there are some vendordistributions that support op amp so youcan use one of those uh and finally umI'll cover this in just a moment you canbuild your own DRO with op amp supportum and control it that way so all youneed to enable op amp in your collectoris um ideally a recent version of thecollector framework so v122 plushopefully um you can use older versionsbut there are just more caveats um ifyou are using that all you need is theop amp extension if you just put that inyour collector it can be controlledthrough opamp so just a quick overviewof OCB um to kind of show you how you dothis with your own distribution uhbasically the collector builder um is away to create collector binaries onlygiven a manifest file so all you need todo is specify what you want in yourcollector um OCB will run this throughsome templates produce some Go code uhthat can be built with the standard Gocompiler uh and then you get your owncollector binary so uh here's an examplemanifest uh for OCB uh basically on thetop here what you have is just somemetadata around your distribution um andthen the rest of the keys are your umwhatever components that you want inyour collector uh the key point here isthat the only thing you need is theopamp extension uh it's it's fairly easyto enable opamp support in a collectorso earlier I described the custommessage feature that was added to opampthis last year um I'm going to show youhow that's implemented in this with thesupervisor basically the um opampextension maintains a registry of customcapabilities custom messages are sentfrom the server to the supervisor andthe supervisor forwards them to theopamp extension running in the collectorand then the opamp extension dispatchesthose to the component that implementsthat customcapability so this is how we wouldimplement a custom component that usescustom messages uh components registerwith the extension passing the name ofthe capability and some options and uhthis returns a custom capability handlerand the handler provides access to achannel where you can receive thesecustom messages and you can send custommessages back to the server so comingback to the example of discovery Idescribed earlier you would implement anextension that can do discovery andimplement all the the the necessarycomponents of that um and then it wouldregister its capability on startup uhand use the resulting handler to sendand receive these discovery messageswith the um opamp server then you buildthe extension into your DRO and add itto your collector config so you canlearn more about this by looking at thereadme for opamp custom messages um andalso the AWS S3 receiver uses custommessages to sendstatus all right here's where it getsexciting uh we're going to be both doinga live demo and I'm going to bereconfiguring the screens during thepresentation so uh let's prepareourselvesum let's see here so I need toAll right here weare looks good enough okay so um whatI'm going to show is a live uh runninguh demonstration of the supervisor usingthe example op amp server that you canfind in the op amp go repository umwe're going to have a link at the endthat will point you to all the stuffthat you would need to runthis so what we're going to do here isum I'm going to show you basically likea bare bones supervisor setup um so allyou need is a supervisor binary um acollector binary that you want to run umand then a supervisor config and ofcourse if you want to do remote configyou need a an op amp server so that'snot included here but is uh running onmy machine so what we're going to goover here is just kind of a bare bonessupervisor config uh basically you needto specify the opamp server to connectto um note that the TLS um like theinsecure uh like skip verification thingyou don't do that in public or in uh inproduction um that's just for testing umbut what we're going to do is we'regoing to connect to an op amp serverrunning on my local machine uh nextwe're going to enable um the remoteconfig capabilities uh they're disabledby default just for security reasons umyou know we don't want to make it sothat you're open to remotely configuringyour collector unless you specificallyopt into it uh note that the supervisorstill has uses even if you don't enableremote config um you know it can be usedto monitor the collector or add um liketelemetry settings to it stuff like thatum but we want remote config here sowe're going to enable it uh finally youjust need to specify where the binary isuh and then where the uh so it willstore some kind of uh just runtimeinformation like it'll cache the configand stuff like that uh on the localmachine so we just specify directory forthat storage so with all of that out ofthe way um you can see here we've gotthe server running uh there are noagents running right now um but we canfix that so I have a command here umthat hopefully is fairly readable uhwhere we're just going to run the binaryand then pass it the supervisorconfig if we runit you can see that um basically youjust have a message here that says noconfig present not starting agent um ifwe go into the uh server interface umwhat we're going to see here is a UU IDthat matches uh our collector that wasautomatically generated um and you'regoing to see that the collector isn'trunning so basically what is going onright now is um we have we started thecollector we got all its information andthen found that the server had not sentany kind of configuration um if there'sno config to run there's no pointrunning the collector so to saveresources we don't run it um you can seesome basic kind of configuration herethat was just used to bootstrap it uhadditionally you can also see theattributes that are um kind ofdescribing the collector um but what wewant to do is uh and also you can seeall of the uh the cached files here umbut what I'm going to do here is I'mjust going to add a really dead simpleuh no pipeline so that there's somethingrunning um but we don't need it to befancy so what we're going to do here isfrom the server so the supervisor isrunning um and from the server we'regoing to remotely send this config tothe supervisor the supervisor is goingto receive that and then restart thecollector with the new configuration uhone important thing to note here is thatum while we do support remoteconfiguration this isn't hot reloadingso it will incur a collector restart ofit'll completely restart the process ifyou update the config uh it won't justkind it won't it's not intelligent aboutit right now uh we'll cover that alittle bit later that's something wehope for the future but uh currentlythat's not how it works so if I save itand send it to the server or the uh thesupervisor um you'll see now it is upit's been up since you know less than aminute ago um and you can see here thatthe config has applied in the collectoruh if we look in the logs um it didn'tprint anything because uh it just ran asnormal so uh a couple other things tocover here um basically so this is justan example server you can kind of get anidea of what it's capable of um but umone thing that this enables is from afleet management perspective you haveall of this information that you can useto select which collectors that you wantto configure so for example if I'mrunning like a host metrics receiver uhand I want some specific ARM informationI can select on that host.arch attributeand choose just to update thosecollectors that's really where the powerof opamp comes into play again thisisn't a production grade uhimplementation it's just a uhdemonstration so that doesn't have thesecapabilities but hopefully that can getyou thinking about the sort of uhcapabilities that this unlocks umadditionally all of those attributes areincluded on the collector's telemetry soum self-observability is is kind of animportant thing of running yourobservability pipelines um and all ofthese will be the supervisor can be usedto configure all of these aswell that is it um we go back to theslides right we can go back to we canswitch the demo if you want you want togo to the demo yeah it's fine okayokay so uh let's start it over irecorded mine because I didn't trust uWi-Fi or my ability to click on thingslive um so I'd like to do another demoshowing opamp features like uh changingthe configuration of a fleet of agentsusing a custom collector DRO and how theavailable components feature allows theserver to know what components can beused in the collector so here isbindplane a telemetry pipelinemanagement tool that my company buildsand um in this simple example running mylaptop I have 50 open telemetrycollectors running in docker uh an opentelemetry collector running on my Mac uhthey're sending metrics overp to agateway that I built and um they're inthis case sending it to Dino traceum so we can see the list of agents uhand their status and some details abouteach of the agents that are connected tothe serveruh and if I click on one of the agents Ican see more details uh that were sentto the server via op amp um so let'slook at aconfiguration um I'm going to modifythis configuration and filter themetrics that are being sent uh to onlyinclude process metrics so I'm going tofind some metrics and say to includethemand uh this will change this will youknow add a processor to my configurationwhich means I'm going to need to applythis and send it down to the agent um soactually 50 agents so I'm going to startthat roll out we're going to see that weare sending new configuration to all ofthose agents and um they will receivethat new configuration restart and anduse the newconfiguration so now let's look at thegateway um the gateway like I said I Ibuilt it it's super stripped down itjust has an OTLP receiver OTLP exporterand a bunch of processors um I I So thisis a custom build of the um Open Retreatcollector and you'll notice that becauseum we're reporting the availablecomponents most of the um sources aren'tcompatible so here we can see that theApache Spark um requires the ApacheSpark receiver or another um destinationif I try to send this to S3 it says Ineed the AWS S3 exporter and we knowthis because this information was sentfrom the um from the collector viaavailable components uh via op ampall right that's it we can switch backto slides all rightgot a little onetricky so this is the real live demoherethis is the hardest part oh my goshnutsum here i'm probably just gonnaOh okay sorryforgot oh waitnice all right so close thanks forhanging with usokay so I want to talk a little bitabout doing this the challenges at scaleum soum suppose you deploy a new version ofyour server uh the agents willtemporarily disconnect from the serverand when a new node appears they willall attempt to reconnect um this isknown as the thundering herd problem umso exponential backoff helps with thisbut it can be difficult to manage thislarge influx of agent connections umthey'll also send their complete statusand the agent will the server will needto process this information and send anynecessary configuration updates uh tothe agents um there's also the agentsthe the the issue of agents that are notconnected so how do you know uh if youhave an agent that's trying to connectbut it's failing is it still trying toconnect or was it a container agentthat's been replaced and deleted um andthen lastly the the op amp protocoldefines the interaction between an agentand the management server butspecifically one agent so we know how tosend configuration and receive uhmessages from an agent butum if we've got a 100,000 agents how dowe ensure that we're sending the sameconfiguration to all of those agents umdo you do it all at once uh which onesdo you update first what if youencounter errorshow do you ensure that all agents areusing the same configuration um theseare some challenging problems but it'soutside the scope of this 30 minute talkbut I just wanted to present some ofthem okay so um let's take a look atkind of what's coming up um so over thenext six months to a year or so uh we'relooking at mostly just increasingstability of the supervisor um doingsome refactoring um stuff like that justmaking it really ready to go uh rightnow it's currently at an alpha stabilityyou can download binaries for it um it'sready for some testing um we don'tnecessarily recommend that you use it inproduction but uh if you do know that'syour choice um additionally we'relooking to implement some more opampcapabilities uh these are kind of inprogress right now uh we're currentlylooking at doing collector upgrades somaking it so that you can upgradecollectors through the supervisor uhrather than having to do like a fullredeploy uh can simplify um themanagement a little bit um additionallyuh we're looking to enhance how weconfigure and report telemetry uh withinthe collector and supervisor um so justmaking it so that these are fullyobservable uh additionally this is kindof in the early stages right now butwe're um kind of working with some ofthe Java SIG to help uh support opampinside of the JavaSDK uh further down the road um we'relooking for more integrations with theoperator uh the operator alreadysupports op amp um but uh with the helpof Jacob Areronoff uh we're um lookingat adding some more features or reallywe are going to be helping add some morefeatures jacob will be driving this butum we're looking at uh taking uh just itto the next level basically just makingit so that it's uh more Kubernetes readyum so trying to handle uh connections ata high scale high availability etc umadditionally we're looking to expandwhat the supervisor can run uh we wantto be able to run multiple collectorswith one supervisor um also just noncollector things so uh SDKs like theJava SDK um other agents are alsohopefully going to be supported in thefuture um this is further down the roadlikely um even further down the road umwe are looking at um or we have designsfor supervisor list collectorconfiguration so this is running thecollector um putting all of the op ampbits making the collector really speakop amp uh making it so you don't need asupervisor in all cases um again we havesome designs for this but uhimplementation is not really started orplanned um and then uh something thatwould be really cool would be hotreloading of the collector so making itso that uh just the bits uh that youchange when you do a uh a configurationupdate are actually reloaded in thecollector um this is going to be highlyintricate it's going to be an involvedprocess um currently also just in theideation phase right now uh but um wewelcome contributions so if anybody liketo see this uh we'd love to hear whatyou have um that's all we have for todayhere's a link to the supervisor readmeum in that you'll find links to the SIGum the uh Slack channel if you want toask any questions uh links to downloadthe supervisor instructions on how torun it um we'd love for you to try itout and uh hear your feedback that's allwe got thank you thanks2025-04-15 21:57:57.840623 ��5>#��!Ag8rtqqNTL9Qwelcome to smooth scaling with the opamp supervisor i'm Andy Keller i'm theprincipal engineer at Bineplane uh I'man approver of the opamp spec and amaintainer of the go implementation andwork on the architecture andimplementation of bindplane a telemetrypipeline management platform and I'mEvan Bradley i'm an engineer at Datraceand a collector maintainer um I'm anapprover on the go uh op ampimplementation and uh I help maintainboth the supervisor and the op ampextension in the collectorso we're going to start with an intro inintroduction and update of the opampprotocol then we'll look atimplementation of opamp and the opentelemetry collector and the role of theopampsupervisor and we'll demonstrate opampworking in an agent management serverand finally talk about what's comingnext for opamp and the supervisor so howmany are familiar with opamp showhands familiar with thesupervisor anybody using op ampallright so opamp stands for open agentmanagement protocol it's a networkprotocol for remote management of largefleets of observability agents thespecification is available within theopen telemetry github organization and ago implementation is also available itcontains a client and server SDKs uh andprovides some exampleimplementations opamp is used to connectagents to an agent management server theserver can coordinate telemetry agentsreporting on their status sending themconfiguration upgrading their packagesand it provides a command and controlinterface to a large fleet ofagents the protocol supports both HTTPand websockets in the case of HTTP itpulls for updates from the serverperiodically and websockets use apersistent connection to the server andmessages can be pushed to the agentdirectlyappdefins agent to server and server toagent as you can guess by the names theycontain data sent in each direction bylooking at the different components ofthe agent to server message we cananswer basic questions like what agentsdo I have are they healthy what are theydoing what can theydo the message sent from the server tothe agent describes what the server cando and it can also instruct the agent touse a particular configuration orprovide packages for modifying the agentand can also send commands to theagent while the project lives within theopen telemetry GitHub organization opampdesign is designed to be agent agnosticwe hope that other agents in the futurewill support opamp for remotemanagement opamp allows for partialimplementations of the protocol so onlycapabilities supported by both the agentand the server are enabled and thisallow also allows the protocol to beextensible new features can be added tothe spec without requiring support fromall agents andservers if you'd like to learn more Igave a talk about opamp at CubeCon NorthAmerica in 2023 with Jacob Aronov uh andsince then there have been a few updatesand I'd like to share those with younow many improvements have been made tothe opamp specification andimplementation over the last year i wantto highlight three new features thathave beenadded let's start with heartbeats soheartbeats keep the websocket connectionbetween the agent and the server alivemost load balancers will terminate idlewebsocket connections and periodicallysending an empty message prevents thisfrom happening agents that supportheartbeats can set the remote sorry theport's heartbeat capability and sincethe server is more likely to know thekeep alive requirements of inboundconnections servers respond with thepreferred heartbeatinterval custom messages are also a newfeature inopamp so suppose you want to implement afeature that's not defined by the opampprotocol cust bility in SW3 uh and I'm alsohotel and the looks good to me stackcontributor i'm also an active member ofthe open telemetry and Prometheus uhspecial interest group and um last yearI also became a graphana championso soum with um Prometheus and openmetry theysort of got off at the wrong foot and atleast partially that's due tophilosophical differences um around howto collect metrics right whereas uhPrometheus is predominantly a pullbasedmodel open telemetry at least in itscurrent implementation is predominantlya pushbased modelum so uh what's the differences what arethe pros and cons uh with the pool basedPrometheus model um Prometheus relies onservice discovery to know what's uhsupposed to be monitored and it alsoallows us um to have uh metrics such asthe app metric in Prometheus we canwhich we can use to monitor for failedscrapes uh however there are somedisadvantages to this approach um appsneed to keep the metrics in memory uhand if you if your metrics don't changefrequently um uh if they keep the samevalue for a period of time we're wastingCPU and uh network to collect uh thesame data over and over uh whereas withthe pushbased modeluh yeah whereas with the pushbased modeluh we don't need to keep the metrics inmemory and um we can only push whenthere's actual data to push however wedon't have the service discovery uh sothe promeus can't tell what's beingmonitored and even if it knew what'sbeing monitored it can't tell easily thedifference between an application thatwent away versus an application that'sjust unavailable due to for example anetwork issue um we won't go into thenitty-gritty details of things butPrometheus is also very much designed uhfor pull and regular intervals um andthe implementation of very popularpromill functions depends on thesecharacteristics um also um uh withoutlowering the uh look back delta inPrometheus uh you can also keep gettingthe basically last data point for yourapplication that away that went away forup to 5minutes uh so this philosophy differenceuh brings me to this uh peoplemanagement theory called the tree ofmonkeys where the same group of monkeysthey work towards the same goal there'sa banana in the top of the tree and theyneed to work together to get there butin reality the the monkey that sees if amonkey sees the other one progressingreally really fast instead ofcollaborating it starts to see the otheras a threaten to their ownsuccess and I I'm not proud to say uhI've been a Prometheus user for almost adecade now and I at the beginning I didsee open telemetry as a competitioninstead of a collaboratoruh but if we compare the two like why dowe see them as competitors likePrometheus is a time series databasefocused on metrics uh and it has a lotof features that allows monitoring andalerting systems while the opentelemetry is not a backend it's not adatabase it it focuses on creatingtelemetry and pro making sure thetelemetry types correlate well togetherso if Prometheus doesn't exist opentelemetry still needs a good backend forthemetrics we could say that we compete onthe SDK side like Prometheus has SDKs toinstrument metrics in your applicationsopen telemetry has the same butPrometheus focus on the very efficientPrometheus way of doing things open opentelemetry has a different goal opentelemetry focus on uh contextpropagation between between thetelemetistry type and to make sure thatit works for every vendor out there notonlyPrometheus so now that we we realizethat Prometheus can focus on being acollaborator with open telemetry andwe'll work to make the best uh opentelemetry matrix backend actually that's what happened inprom con 2023 the team uh meet uh inperson in Berlin we had a full daymeeting and we decided that yes we dowant to be the best uh the best back endfor open telemetry and to do that werealized that we needed to uh introducesome breaking changes and that's how uhpermittus 3.0 there are not only opentelemetry but open telemetry was a bigfactor to release a 3.0zero when we get to the work it's a lotum we we have too little time to talkabout all the good the the new thingsthat we developed in the last year so wewill hush through it and if we havequestions at the end please ask as manyas you want starting with UTF8historically Prometheus uh only acceptsdigits the English characters characterset uh underscores and columns whileopen telemetry accepts the whole UTF8character set that includes foreigncharacters dots symbols emojiseverything that can comes up to yourmind open telemetry acceptsuh and in fact if you use open telemetryyou know about the semantic conventionsand there is dots everywhere there is noway we can adopt uh open telemetrymatrix if we don't accept thedot with 3.0 uh promeus starts to acceptUTF8 by default we are also making theum open telemetry receiver u endpoint isstable we don't need a feature flaganymore we do need a flag because ofsecurity reasons we cannot ex we cannotexpose an endpoint that can write to adatabase by default that's a securityproblem so we need a a flag and we alsohave um in the configuration filestrategies to translate open telemetrymetric names to Prometheus metric namesthat'suh yeah this is something that wedefinitely want to improve is I don't Iwouldn't say this is thefinal open telemetry to permitexperience but that's uh as far as wegot uh if you were in this roomyesterday uh Bartk and Ariana gave atalk about Prometers adopting opentelemetryschemas this is still work in progressis not released but it this helps likeyou can specify there's a metric thatwas following the Prometheus namingconventions on version let's say one andthe same metric there is a new versioncalled let's say 2.0 zero that followsthe open telemetry semantic conventionsprometheus will be able to query bothand they will they will show show up inyour graphs as one single metric thiswill definitely help with no migrationbut still notreleased if you you're using Prometheuslike backends but not Prometheus just besure that they do accept UTF8 i'm notsure if all vendors all implementationsaccept so just keep an eye onuh we still have work to be done on UTFAsidei I have no idea how to predate the opentelemetry spec but that's on my to-dolist i want to make propose change tospec to make sure that opentelemetry when open telemetry is sendingdata to Prometheus it's fine to keep theTF8 characters today in the spec itrequires the translation and I want Idon't want to toh to stay thatway the open telemetry collector has twoprimeus exporters the promeus exporterand the permits of remote write exporterthey also they are translating uh f8characters to underscores I don't likethat people don't like that uh we areworking onit oh sorry uh changing topics uhdeltas deltas um if you work onPrometheus you realize that countersonly go up uh when a counter goes downin Prometheus world we say that this isa reset and the rate and increasefunctions they they are implemented in away that we are able to work aroundthose resets and the the the queryresults still work but in open telemetryuh there is more counter types uh thedeltas so there's a quick illustrationifuh let's say we have an app we aremeasuring HTTP requests if we receive in10 seconds 12 HTTP requests thecumulative value will go up to 12 andthe delta 1 will also go up to12 10 seconds pass we have only tworequests the cumulative will keep thestate from the previous uh measurementit will increase by two the deltas theymeasure the they increase between onemeasurement to the other so it won'twill only uh count plus two instead ofuh 14 and then have that happens forevery single measurement so we cantranslate things we can translatecumulatives into deltas and deltas intocumulative but to do that we need tohold the state we need to keep in memorythe previous uh measurement and with thestate we can calculate the translationbetween the two if you're sendingmetrics to Prometheus so far we onlyaccept commumulative so we built in theopen telemetry collector a processorcalled delta to commumulative it willhold the the states of the previousmeasurements will do the translationthat just like like I just saidum as every stateful process process ishard to to tomanage so we the outer of the delta tocumulative process pro processor uh openin Prometheus so we can embed the thisprocessor in Prometheus instead ofrunning two stateful processes you runonly Prometheus and it will run thedelta to commulative processor inPrometheus so Prometheus does theconversion uhalone this is not the best experienceever like I don't think this is going toto be scalable uh we are working on abetter support we are want to do theconversion actually I'm not even sure ifwe want to do the conversion like wewant to ingest deltas as they are and ifwe want to transform to into cumulativesduring the query time or if we want tohave different uh promql functions tohandle deltas this is still uh underdiscussion but keep an eye uh we havePRs open we have uh proposals we aredefinitely improving the experience uharound deltasum on another topic native histograms umas you might have heard uh nativehistograms became a stable feature inPrometheus 3 uh0 uh and um they improvedthe performance and accuracy um they'realso um atomic so a whole histogram umuh can be contained in a single requestwhich allowed us to create thePrometheus remote receiver in the opentelemetry collector umuh there's still some work to be donethough uh so they're not officiallymarked as stable yet one of the thingsmissing is uh the implementation of uhnative histograms in the um uhopenmetrics text format and there mightbe still some backes needed for thequery set of things but you can use themin production already uh and as you cansee on the screenshot 733 uh issues havebeen already resolved and basicallythere's one last pending so it's it'svery close to becoming like anofficially stable featureum the another new feature in thePrometheus 3.0 Z is Prometheus's uhremote write protocol version two um itenables new features while definitelyalso m maintaining the efficiency aspectum oh sorry yeah uh it does have ansupport for the native and classichistograms uh which as we mentioned aremore efficient and accurate um do youwant to hold this yeah probably let's dothat um it also has uh support forexemplars uh which allows linkingmetrics for example to traces umpreviously there was only anexperimental support for it in theremote right protocol uh it also has anative support for uh metadata um whichnow is no longer just scoped uh to themetric name but it's scoped to theactual series um and it's uh themetadata is basically the description ofthe metric uh the type and the unit andthey're also no longer optional uhcreated timestamp support uh whichimproves the accuracy of rates uh andfor shi time series um and as mentionedwe also have in the promeus rateprotocol we also have full UTF8 supportadded specifically to improvecompatibility with um because in thepromeus worldwide one uh version youwould have to do the translations umwhat does this mean for open telemetryuh so um as I mentioned this enabled usto uh implement the primitive use uhremote receiver in the auto collector uhand allowed us to make the componentstateless uh there is an ongoing work tomake it better uh you can uh check outthe details in the tracking issue um uhit also brings improved support for allexponential histograms uh which gettranslated to the native histogram oneof the reasons being now they can beshipped in a single remote ride requestsuh request uh instead of being splitinto multiple uh time series um in theindividual counters in the old histogramwhich can come in multiple differentrequestsum avoiding conflicts in the metadata souh previously um because the metadatawas scoped uh to parametric name if youhad two different applications with thesame metric name for example like theHTTP metric in hotel which is uh whichhas an ongoing migration to change theunit type uh you would have a conflictin the unit basically if not all yourmigration applications were alreadymigrated this solves it um it also theUTF8 support enables us uh to store theold metrics uh without having to escapelots which are commonly used asmentioned before and um the Prometheusremote write V2 also brings partialright statistics uh which give an autocollectib of idea uh if um a request waspartially uh partially written uh howmuch data was uh written how much datawas rejected and if it's worth retryingum And there's also ongoing work toimplement the support for the Prometheusremote v2 v2 protocol in the remote redexporter you can check the trackingissue for more details um resourceattributes that's a spicy one so um onething that I really like on the hotelSDKs is the ability to find attributesabout the resource about the applicationthat is running for example if it's aJava uh application running with theJava SDK we'll find information uh forexample the Java version or the cmd linethat run the application if this app isrunning inside a Linux uh host it willfind information about the host itselffor example the uh the host the OS nameuh things like process P ID um thearchitecture of the of the of the CPU ifit's running inside a container it findsthings like container name container IDuh if it's running inside Kubernetes youcan use Kubernetes um receivers to addmore metadata to the to the applicationbut when we when we are sending thisthis information to Prometheus you justput that inside a huge bag ofinformation we throw it to the otherside and we hope for the best uh hope isnot a strategy uh things do do not workas as we wantand while trying to support resourceattributes we are iterating severaltimes and I think we are we are stillnot in the place we want to be the firststrategy that we had was to transformall the resource attributes intoPrometheuslabels if you run Prometheus as I couldsee everybody here is using Prometheuswe if we have labels for example pod idcontainer ID process ID like this is nota good label like you will never run aquery that oh let's let me sum memoryusage by Java cmd line you just don't dothat and we are storing that as a labelwe are wasting CPU we're wasting memorywe are wasting resources and we arealways hitting cardality uhissues we changed the strategy uh thisis what we have in the spec the hotelspec at the moment all resourceattributes are being translated into asingle metric and the single metric iscalled target info and you can use proQLjoins to to join the target info labelsand you put that in your metric thatmatters the metric that actually queringfor but joinsthat's I don't like joins uh I've been aPrometheus user for nine years i'm amaintainer for I think two or threeyears and I still Google how to do joinsand I think you do the same like this isa hard this is a hard like uh that's abarrier that's a barrier for users torequire to use joins to use in yourmetrics yeah third strategy we aretelling the user you are intelligentenough to tell us what are the resourceattributes that matter for you so in theconfig configuration file there is aoption called promote resourceattributes you tell us what to promoteinto labels and those resourceattributes will be there in yourmetrics this is so far the best userexperience for a Prometheus userterrible experience for the Prometheusadmin uh because if you change theresource attributes if a let's say I'm aplatform engineer and I providePrometheus as a service to my mydeveloper teams and the my developersays I don't like this resourceattributes anymore i want a new one soyou change the configure ficonfiguration file you need to restartPrometheus and uh and we change the timeseries uh ID let's say you you need awhen you change a label you create a newtime series and Prometheus for to belike a real time monitoring system wekeep the most recent metrics in memoryso when you switch the resourceattributes you have a set of uh metricsin in memory and then all the labelschange so you still have the old metricsin memory you have new metrics in memoryso you double the amount of memory for awhile and usually that breaks yourinstallationum we are doing some UX research we arementoring a young woman from Nigeriashe's interviewing a lot of uhPrometheus end users hotel usersco-founders we are asking a lot ofpeople information about like when hotelwas the hotel metric uh data model wascreated how was like what was theco-founders's thinking of a gooddatabase for the resource metrics likewhat as a permiss as a hotel SDK uh userwhat do I expect from a metric databaseto handle the resource attributes thisis still a work going on uh I hope inMay we have some good new ideas andafter May we start implementing themuh one thing that I'm really excited uhabout resources built is a OTAP uh opentelemetry enhancement proposal uh thattalks about entities when I like if yougo back a few slides I was talking howwe have really nice resource scoped likethe the Java resources that are the hostresources container resources they arevery well scoped together but when wesend them to Prometheus we just put themon a bag and they like we lose all thethe scope if they with the entities wewill be able to identify what from thoseresource attributes which ones aresupposed to be Kubernetes attributeswhich ones are supposed to be containerattributes and we'll be able to have abetter experience in Prometheus based onthisinformation so just to recap uh thingsthat to to look for in the future wehave uh we have ongoing work to adoptopen telemetry without any uh metricname modifications so far we have we areable to accept UTF8 characters withouttranslating to underscores but we arestill adding the suffix for units andtypes for for example the underscoretottoal things that in the Prometheusworld is makes sense is that that's howwe always done uh things but in opentelemetry that's new and the feedback wereceived is that people don't likethat we're working on native deltaingestion we are working on remote writereceiver and exporter in the collectorwe are workingon the remote right protocol was alwaysdesigned for Prometheus and Prometheusnever accepted delta in the past sincewe are starting to work with deltas wemight make changes to the remote rightprotocol to also have a good experiencewithdeltas uh we want to make changes to thehotel spec uh mostly around nativehistograms andUTF8 uh I promise not to say this but it umuh we are also we all we also have aworking group uh around paret it's rereally really really really really earlybut with parquet files it we should beable to handle better high cardality uhattributes it's and I think for hotel itmakes a lot of sense since resourceattributes are usually in the set ofhundreds 200s of resource attributes andpark might be a very good use case tohandle resourceattributes this is a QR code if you wantto give us feedback and thank you forbeing here thankyouany questionsthank you for the talk i have a questionfor the resource attribute instead ofpromoting the resource is there anoption to have a reax so we can put astar like I don't know resource dot staror something along those lines this isthe first question or can we put insteadof promoting blocking so skip thoseresource attribute like a command lineyou don't want you want to filter basedon that but maybe you want to filterbased on the rest like a a deny listright right yeah those are good ideas II don't think anybody suggested that uhI think that's fair ask that's open anissue and implement it uh also if youuse the open telemetry collector to shipsome of this data you can always do usethe hotel collector processors to do theuh to the transformation on data youwant to do before it comes to the remoteexporter which might be able might allowyou to do some of these changes alreadyright this is a good idea to thank youguys you're welcomeokay sounds like we're doneuh anyway if you Oh thanks for thepresentation uh a quick question comingback to the join topic and the infolabels uh I think you have added a newinfo function the prom languagebasically for this uh case and do youthink it uh solves it for this opentelemetry use case or is there somethingspecific there that I I think that theinfo label is definitely better thanjoins uh but I don't think it's thefinal the final solution but yes it's avery good uh it's a definitely betterthan the joints thanks for calling thatout all right let's have lunch[Applause]2025-04-15 21:57:58.328846 � ��@#��iAnXdGXdxmWNQhi everybody Welcome to this talk aboutmemory noisy neighbor I hope you're allcaffeinated and ready togo I want to start with a few snapshotsfrom this YouTube video called positiveaffirmations for site reliabilityengineers I highly recommend you watchit It starts at night A lone SRE issitting at hisdesk looking at exceptions in productionand the rising cost of thedeployment He's feeling so frustratedandhaggarded Hemeditates A calm female voice tells him"Your pipeline isgreen Your tests are well written andstableyour friends and family understand whatyoudo You were born to deploy Kubernetesclusters And so this got me thinking whydo we take on this heroic task ofmaintaining these large deployments andfor me the ultimate affirmation is thisYour deployment is highly available andcostefficient Users delight in theproduct and its performance And this isthe holy grail what we all strivetowards and performance that uh part ofour holy grail really matters and bothcompanies understand uh the importanceboth to users the user experience andtheir bottom line Uh Amazon released acase study uh saying that for every 100millisecond increase in uh user responsetimes they lose 1% of revenue And therehave been dozens of case studies overthe years uh reinforcing those results Ionly were was able to bring just a fewof the highc caliber names I was I foundUh one of the one of the uh case studiesI really liked is this one by Raku 1024which is a Japanese online grocery wherethey ran an AB test and they ran twoapplications with exactly the samefunctionality but one of them wasoptimized and ran 400 millisecondsfaster and they had startling resultsfrom the faster application It increasedrevenue per user by 53% and reducedbounce rates by35% So performance reallymatters But with that users also expectbetter functionality And your systemsmight not be as complex as these extremeexamples of service maps published bybig companies but still uh your systemsare probably quite complex users expectuh better functionality right they recombetter better recommendations searchengines maybe AI and in general just amore featureful application And as oursystems become more and more complex itbecomes harder to reason aboutperformance And it's easy whenperformance degrade to blame thespaghetti monster of complexity or thenetwork But really a lot of that uhdegradation in performance comes fromsimple resource congestion on the serverthat we just have no visibility into Andthat's the topic of this talk Now whatif I told you that a small cabal ofengineers got together and created thisresource allocation capability thatallowed them to run 50% moretransactions on the same size of serversand reduced their tail latency their P95and P99 by factors of five 14 Like thatwould be fantastic right how can we notknow about this how can they not tell uswell it turns out that this capabilityactually exists but it was not developedin secret Over more than a decade therehave been more than a dozen papers bywell-known research universities and ourhyperscalers that explore thiscapability and uh and give us detail onh��\?#��oAJFS0lSfHtMIokay we are talking about the state ofPrometheus and open telemetryinteroperability first ofall show of hands who is usingPrometheus or opentelemetry um who is using Prometheus andopentelemetry and who is liking theexperienceokay I'm sorryum I'm Arthur i work for Graphana Labsi'm a maintainer of From Meus for a fewyears now i joined Open Telemetryrecently i've been contributing for likesix seven months i got approver statusin the collector uh besides that I alsodo mentorships through CNCF mentorshipprograms Google center of codeLFX and I graph pays me to make surethat Prometheus and open telemetry workwellapparently I'm not doing a good job uhbut I hope uh we improve over time hi soI'm Uray um I'm an SR with a focus onobservaow to mitigate these noisy neighborsAnd today I I'm giving this talk becauseI think it's time for the Kubernetescommunity uh to get a a solution basedon this research uh that we can allenjoy Uh so in this talk I want to talkabout what is memory noisy neighborshould we care and what are the benefitswe're going to get from solving it Uhtalk about some of the hardware andsoftware mechanisms that we have inorder to approach the problem and whatwe're doing in open source right nowUh I did my PhD in noisy neighbormitigation specifically network noisyneighbor mitigation I started a networkobservability company which we sold toSplunk and I'm a maintainer on the opentelemetry network collector which wecontributed and I'm now working on noisyneighbormitigation So let's start with what ismemory noisyneighbor Our cloudnative applicationsultimately run on physical servers andthe applications have to share thefinite resources that these servershave when uh in one in noisy neighborone application takes more than its fairshare of resources and so otherapplications cannot get that resourceand their performance degrade In thistalk I'm going to cover I'm going totalk about the last level uh cacheswhich I'm just going to call caches andthe memorybandwidth So how does cache uh noisyneighbor look like uh usually yourapplication fits into the differentlevels of hardware caches that are onthe CPU You have the blazingly hotworking set in L1 cache and then the hotwarm and cool working sets in L2 L3 andDRAM When a noisy neighbor comes in ituses the shared L3 cache veryextensively and evicts your applicationout of that cache And so when theapplication wants to access its warm uhdata set it has to go to DRAM instead ofL3 In the more extreme example where thenoisy neighbor is running at thehyperthread next to your application italso evicts your application from L2caches And so now your application hasto go to DRAM where it would have goneto L2 which is a lot faster right it'slike it could be up to 50 times uhdifference So that could degradeapplications quite alot and in practice it does uh Google uhpublished results from experimentingwith different synthetic noisegenerators on their production servicesUh so here you can see three productionservices Uh you have web search which isthe Google search node ML cluster is anonline machine learning textclassification system that works in realtime and memal is an in-memory key valuestore like memcache and they found thatthe slowdowns across the different loadson the system was uh five times for websearch five times for ML cluster and upto 14 times formeal So performance degradation on theP95 P99 is there anddramatic So I'd like to take a shortsurvey Uh can you please raise your handif you know what type of VMs are used inproduction is it bare metal is it likethe four core 8 core 16 core roughlyspeaking now Okay Leave your hands upplease and leave your hands up if younever use fractions of a CPU You onlyuse bare metal or you only use fullCPUs I think um 75% of hands uh wereretracted Thank you Um and so bestpractices today is to separate our bigdata analytics from our productionuserfacing latency critical applicationsbecause we don't want somebody to run abig data analytics job and uh destroyperformance for our latency criticalworkloads But if you're working with VMsthat are smaller than a full physicalmachine then what ends up happeningactually looks like this right like nextto your VM with your uh interactiveworkloads are other tenants of you knowyour cloud provider or infrastructureprovider that might be running batchanalytics right and some random dude onthe internet is running uh these batchworkloads next to your workloads um andit's fineum the question is does the qu cloudprovider protect you from these noisyneighbor running alongside potentiallynoisy neighbors and I've been able Ihave not been able to find evidence thatthey do and I I believe that there isvery little protection from at leastmemory noisyneighbors So imagine this It's you knowour engineers can spend months and yearsof their lives optimizing deploymentsThey could be adding indices to thedatabase or uh changing data schema soqueries run faster They could be takingservices and breaking them down tomultiple services so that queries run inparallel and they could just doprofiling and alleviate bottlenecks Allso that the performance uh for usersimproves and all of that work can beobliterated destroyed by a noisy runneighbor running alongside the workloadon the compute infrastructure that isjust degrading the computeinfrastructures uh performanceAnd so one takeaway is if you're runninglet's call it a thousand cores or morethen you should prefer running on fewer100 core instances the the larger baremetal instances rather than a lot of thefour core 8 core machines Uh becausethen you at least know who your noisyneighbors are They're going to belatency sensitive workloads that aremuch better behaved than batch analyticsworkloads or some random workload someother folks could be running next to youNow do you have to have batch analyticsrun next to your workloads in order toexperience noisy neighbor the answer isno Um researchers from MIT ran memcachealongside a garbage collected workloadYou can see on the top graph here thememory bandwidth It is relatively lowand stable throughout the experimentuntil the mark phase of the garbagecollector which is very memory intensiveand saturates memory bandwidth Then ifyou look at latency it is pretty stableat 50 micros 99.9th percentile latencybut then as the mark phase starts itjumps up increases three orders ofmagnitude a thousandtimes So you don't have to have big dataanalytics neighbors in order toexperience noisy neighbors There aremany workloads that we run today thatexperience them and for example thegarbage collection Uh if you have asystem that has heavy transactions andonce every 10,000 transaction it justhas to run through a lot of rows in yourdatabase then that could be noisyneighbor aswell So should we care does it matterand what can we get from solving this uhyou might have seen one of these uhpublished uh surveys This one waspublished by data dog showing that a lotof the containers that we run today usea lot less of the requested CPU Andusually I run surveys and I ask peoplekind of how much is your CPU utilizationand I get answers between 10% and 40%for most people And that's uh in goodcompany because some of the world'slargest companies have published theiraverage clusterutilization Uh and these are companiesthat certainly have resources tooptimize their deployments There's a lotto gain from optimizing them Um but whathappened over the last uh few years isthat some companies have been able toget a lot betterefficiencies Uh Google was uh publishedaround uh 35% in 2011 and increased to50%uh in 2019 So that's one and a halftimes more efficient They run a lot moreon the sameinfrastructure So what happened uh toincrease efficiency so much in theseorganizations I was able to find twomajor contributors There might be moreI've I was able to find these major onesOne is advancements in verticalautoscaling So uh making sure that thememory and CPU requests fit the workloadSo you don't have waste and you canbinack better And another one ishandling noisy neighbor which decreasesP95 P99 latency and enables uh to run alot more workload on you know more denseon the same amount ofsystems And to just to illustrate uh whywe are not able to run at high densitytoday we simulated two systems One runsat high load You can see it on the leftAnd so it has a lot of these memorysaturation events or you know garbagecollections for you know just a as a wayto think about it And a system with lowload that has fewer of these eventsbecause it has low load needs lessgarbage collection And if you look atthese you can intuitively say that youknow the low system would have betterP95 P99 right that has a lot less ofthese interference events happeningslowing down our applications And in Ithink a lot of us intuitively know thatwhen we experience high P95 P99 we scaleout right we want to reach this low loadSo let let me say it uh another wayright like we have high load on oursystems which generates memorycontention decreases CPU efficiency wesaw that you know our noisy neighbor arekicking the applications out of cachesso they're slower and that causes highresponse times and us breaching our SLOtargets and what we've been doing todayis scale out our systems so that we runat lower load and I think this is ananti-attern we should stop doing it it'swasteful andinefficient Instead we should make surethat the memory contention these noisyneighbor events do not decreaseperformance on our systems by mitigatingnoisy neighborbehavior And so if we're able to do itand run at higher load then first we canscale down our systems and we'll stillkeep our SLOs's uh and just run athigher CPU utilizationWe can allow product teams to build morefeatures They don't have to go andoptimize the system and reduce theSLOs's reduce the P95 P99 before theyship more features They can just goahead and we can have users enjoyimproved uh SLOs's And in practice weget a mix We get all of these and uh ifwe mitigate noisy neighbor and so whywould only the hypers scales be able toenjoy it we should all be able to enjoyit And I think this gets us closer toour holy grail right it touches on threeof the important uh uh goals that wehave both cost you know the costefficiency product and uh sorry theproduct features and productperformance So what mechanisms do wehave in order to measure and mitigatenoisy neighborsmodern CPUs allow the operating systemto control how much of the cache andmemory bandwidth each application canuseBut even if you don't have support forthese techniques in your CPU you canstill restrict the ability of noisyneighbors to create damage and noise inthe system by pinning those noisyneighbors to a small number of cores andperhaps reducing the frequency of thesecores so that they just can't do uh muchdamage I get this question a lot Youknow containers should do this rightlike we expect containers to isolateworkloads u but they don't today this isnot built into our containerinfrastructures and in fact containersare not very good at isolatingperformance more security uh but not allis lost there is access to thesemechanisms but through a differentsubsystem called resource control it'sconfigurable throughCISFS okay so we want to tackle memoryuh contention and memory noisy neighborwe have these mechanisms that allow usto uh to uh control and limit noisyneighbors So what do we measure in orderto uh uh close this loop uh we careabout application service time So whydon't we measure P95 P99 and then adjustresource allocation based on whichapplication has a bad P95P99 Uh this turn the the thesemeasurements turn out to be noisy I meanfirst we have to measure P95 or P99 Soyou know just to begin with we have tomeasure at least a few hundredtransactions in order to get a goodsignal for P95 or P99 Uh it turns outthis is noisy so it reacts veryslowly Uh another thing we could do iswe could measure CPU efficiency So CPUwants to execute some number ofinstructions If there is a lot of memorycontention and the CPU has to wait formemory the number of cycles that it hasto uh take to run those instructions islarge If there isn't a lot of memorycontention then the number of cycles issmall So this ratio of cycles perinstructions could be a good metricUh the problem is that this is alsonoisy because it measures a bunch ofother things happening on the system andit is also uh these systems end up beingcomplex because we don't know what agood cycles per instruction value isright like is three good is it terriblewe need to profile the applicationbeforehand so these systems end uprelatively complex uh but it turns outthat Google uh has deployed this type ofsystem on all of their shared clustersum as of 2013 So it's been it's beendeployed for a whilenow Um another thing we can do is we canmeasure the memory contention itself Wecan measure the utilization of memorybandwidth and caches in our applicationsdirectly and then if we find somebodyusing more than their fair share welimit them And this is actually a goodmeasurement and this is what ouropen-source collector is doing right nowAnd in fact Alibaba published that theyhave a system based on direct collectionof uh of memory contention events uh inproduction for over two years on orderof a million cores as of2020 The problem is if we want to solvethe memory noisy neighbor problem weneed very frequent measurements Um onthe top here is a simulation of uhyou've seen this before right like theuh the memory uh contention events andon the bottom is a a sampling of thememory metrics every 1 second And youcan see that you know you lose all ofthe signal if you don't measurefrequently enough And so with thiscollector we set out to measure at 1millisecond granularity Um and westarted this as an Apache 2 project uhcalled the unvariance collector rightlike the the idea is to reduce thevariance of responsetimes Okay So uh with that uh I want totalk a little bit about what we're doingin order to build thiscollector Well I I told you we want totake these 1 millisecond measurements Sowe want to have uh measurements acrossall of the cores that we're runningevery millisecond So we expect it to beyou know very you know very regular andwant the measurements to be really goodIn practice what you get is this jitterSome cores might take longer to torespond to these timers and to do themeasurement And so we wanted to know howmuch jitter we're going to have Jitteris bad because if each core is measuringa different time interval then how canyou take all of these measurementstogether and say okay this applicationused this much memory bandwidth in thistime interval or that much uh bandwidthright like we have fuzzyintervals And so to measure jitter weset up Linux high resolution timers at 1millisecondintervals And these graphs show how muchuh uh each core the time that itresponds to a timer differs from the thetime that we wanted the timer to fireright and you have these vertical linesbetween the fastest core and the slowestcore So the top of the graphs are theslowest cores that took to respond Andso um I'm I I want to ask you what doyou think is the noisy graph on thebottom right like the noisy graph is 300microsconds to respond to a timerinterrupt That's 30% of our measurementinterval That's quite a lot of jitter Umyou know it could be the metal becausethere are 96 cores It's like the sameinstance but we take either the fourcore slice or the 96 core slice So itcould be the the metal instances becauseyou know 96 cores one of them is goingto be slow or the four cores because youknow the hypervisor is adding noise andthere's noisy neighbor Sometal extralarge okay so I kind of gave it away YesThank you Uh yeah um the extra large isa lot more noisy So the there's I thinkthere's an artifact with the hypervisorThese graphs actually show it much worsethan it is It just shows the extremesbecause the scale is seconds and we havea thousand measurements per second So itkind of hides the good measurements butso statistics lies and all that But uhthese are both workable but there's alot more uh noise on the smallersystems Um so this influenced how webuilt the collector and its architectureUh so I'll go through it very quicklyUm whenever there's a fork in the systemso a new thread or task comes in thecollector assigns a resource monitoringID RMID to it This is something that thehardwareneeds On every context switch betweenapplications the collector tells the CPUwhat RMID is active now And you canthink about it as coloring the trafficYou say okay these are all going to beblue axises These are going all to bered axises And now we can ask laterevery 1 millisecond how many blueaccesses did you have how many redaccesses did you have and how much ofthe cache is blue or red or the othercolors and there are hundreds of colorsyou can use There's enough for you knowhundreds of containers on the system Sothis all of the telemetry at 1millisecond granularity and all of theallocations ofuh containers to RM ids flow to a sharedmemory buffer that the user spacecomponent can then come and analyzeeither output or actionon what we're doing today The goal weare outputting the raw telemetry intoparquet files and the goal is to developgood detection algorithms and the ideais like let's let's back back test usingthe raw data that we have um in in thesefiles to make decisions which who is anoisy neighbor at any given time And sowe are looking right now for productiondata We're building synthetic workloadsand we're looking for production data Ifyou have a system that you can run thecollector on and share the data maybe atest cluster or a um a staging clusteruh that has real production traffic wewould love to see it so that we canbuild better detectors Uh the ideaeventually is to output forobservability statistics on detection Sothey would say okay this pod was noisyneighbor 1% of the time and that pod wasnoisy neighbor 1.5% of the time And uhwith with these detections on each 1millisecond slice hopefully we'd be ableto also use those to mitigate noisyneighbor by limiting by configuringresource control So that's the plan Uh Iwant to uh to add one uh note onoverhead We're designing the systemfor.1%uh overhead in line with the traffic and1% of the analysis in user space that isnot in line does not increasetransaction uh latency Uh but even ifthe overhead was high it almost doesn'tmatter if you're able to mitigate noisyneighbor So the on the bottom here is anexample So imagine if the averageservice time increased in your servicefrom 40 milliseconds to 42 millisecondsThat's huge 5% We're not aiming for thatWe're aiming for.1% but 5% 42milliseconds But then we're able toreduce the P95 latency from 250milliseconds to 75 milliseconds I thinkalmost everybody I talked to would sayyes let's do this right like this is agreat trade-off to have Of course we'renot going to have that much uh addedoverhead but it almost doesn'tmatter Uh so I'd like to inviteeverybody uh to contribute I want tothank our existing contributors Uh andif you want to join the project pleasedo And and come My details are here I'mavailable on CNCF Slack Um and if youhave these clusters that you cancontribute data um we'd reallyappreciate it It's a very powerful wayto help the project right now Um and soI hope this will help us get closer toour holy grail of better performancecost efficiency and better product toour users Uh I'm thank you for beinghere and I'm happy to take somequestions[Applause]Microphones over thereHi Uh quick question So you mentioned umthat can you hear me yeah Okay Um youmentioned that there are two basicallycauses that are not identified is likethe shared cache uh levels and thememory bandwidth and memory um IO'slet's say um if you are running like astatic CPUs incubator which is uh fairlyeasy to to set up uh do we know whichwhich part is in your measurements orthe data that you've collected whichpart is due to the cash poisoning let'sand which part is due to the memorybandwidth or latencySo the question is what percentage ofdegradation can you attribute to memorybandwidth or to caches when you run onfixed CPU so when you pin workloads todifferent uh to different CPUs Yes YesOkay Um so uh usually memory memorybandwidth itself when you saturate it umthe latency can increase from 75 ncondsto 250 nonds It's on a that benchmarkwas done on a specific hardware butthat's kind of what you can expect So umwhat is it three four times uh withcache you can expect 10 times uhdifference but they compound frequentlywhen you have memory bandwidth noisyneighbor you also have the cache noisyneighbor If you're running on thesechiplets there's been advancements inKubernetes where where you can run someworkloads and reserve a chiplet you knowa set of CPUs to them and they only theyhave exclusive access to the cache Thenonly those workloads are noisy neighborfor themselves on the caches and youonly experience maybe memory noisyneighbor but it's very hard to manageYou know these are very specific usecases maybe in telco that need thereally low latency and it's makes itreally hard to um pack more workloads uhwhen you run on these chiplets uharchitectures usually I hope thatanswers the question Yeah thank youThanks for the talks Important work Umif we're sampling every 1 millisecondswhat are the storage requirements can'thear you Sorry Sorry If we're samplingevery 1 millisecond what are the storagerequirements to to put on disk i'mreally sorry Really can't hear youCan you hear me now yeah Okay Sorry Ihave to put it in my mouth Um if we'resampling every 1 milliseconds what's thestorage requirement for that right Sofor the raw data it's goes into parquetformat which is a columner store andit's compressed and uh it is relativelyheavy weight It's one I think wecomputed up to one megabyte per secondUh but you wouldn't you wouldn't outputthe raw telemetry What you would do isyou would analyze it You would dodetection for noisy neighbor and thenyou would output statistics on who isthe noisy neighbor And then for maybeevery 60 seconds right today we measureat 5 seconds or 15 seconds or 60 secondsyou would output per pod how whatfraction how many how many 1 millisecondslots out of the full 60 seconds Forexample it was noisy neighbor So that isa lot less cardality and a lot lessvolume and that's something that we'recapable of Okay Thank youThank you Great talk Exciting project Umdoes it have to run uh can you hear medoes it have to run always on you knowor it's more like you know I'm profilingthe if the applications are safe it canbe done on you know periodically thanclosed you know profile identify thenoise numbers I don't think it does anyyou know you you need to analyze do someadjustments and you know keep onmonitoring It's more like that rightso the question is do you need to keepit on the entire time or no or not so ifyou do automatic mitigation for exampleif you have garbage collection eventsthose are really hard to optimize whatyou want to do is to maybe eitherautomatically label those as noisyneighbor and decrease the amount ofresources that they can use uh or um Iguess I mean that's what you can do Soyou need this automatic mitigation forsome types of events Uh there you coulduse the data to find applications thatare consistently clashing on resourcesand create an anti-affffinity rule Sothat is possible and it's a use case Umand we want to support that too But Ithink the the next level of value thatyou get out of the system is automaticmitigation And I believe a lot of theincidents uh that you have of memorysaturation can be mitigated by uh by uhautomatic mitigation without need of forprofiling and changing of theapplications Thank youI think last question Thank you Uh so Ihave very uh specific question So weingest like pabyte of data daily fromour customers and usually what happenswhen we see performance degradation Weupdate the instance type or we updatethe disk type Uh and and this isbasically our solution today But as wescale this is not a viable solution Umplus so the question is how do we know uthat if this is happening due to noisyneighbor Uh and the second point is likehow do we enforce to our platform teamto solve this problem uh because um likebasically what would be the costimplication if the problem is solved vialike implementing a solution for noisyneighbor as compared to just doing uhinstance upgrade for differentdeploymentsSo I think if I understand the questionis how do you know that you have noisyneighbor how do you know that you'vesolved it uh and would vertical scalinghelp yeah I think so What would be thecost implication as compared to justupgrading the instance type uh yeah souh I think you should run on largeinstance types fewer of them So thatthat should be a good recommendation Youwould know that you have noisy neighborby using a collector like we're buildingnow uh and you know that you've solvedit by seeing that you have the highincidence of memory contention but nodegradation in performance Um as to kindof the different resource types I thinkyou mentioned you have disks and andothers uh IO is another noisy neighborthat we'd want to solve but we'restarting with this So I I hope Ianswered the question Yeah All rightThank you everybody for coming Enjoy theconference2025-04-15 21:57:58.981369It's aopen source AI platform for KubernetesSo today I going to show you someinsights I learned over the yearsHere is the today's agenda I going toshow you how to reduce the impact ofrestart using three different type ofapplication as example Let's moveon So why does restart matter It comesdown how Kubernetes traits your app Youmight have heard the famous analogyKubernetes treats you up more likecattle rather than pet But what doesmean If your application is a pet itrequires special care manualintervention every time it goes wrongBut Kubernetes is optimized for cattleIn fact many powerful features such asself-heering roaring upgrade andautocaring fundamentally rely onrestarts So take out take advantage ofthose powerful features Putrestartability asimportant A restart isn't just a singlescenarioWe'll dive into the detail later but fornow let's get a high leveloverview at the container level if yourcontainer exit or livveness provefailure the kubertrestarted and second at the replica setlevel even if your part gets the readedit automatically recreated by thecontroller to keep the desired replicacount at the deployment level during inga roaring up degrade Unlike the previoustwo scenarios deployment creates a newport forest before degrading the oldone So how can we ensure our workloadssurvive in thesescenarios Let's examine the simplestscenario a basic duration and recreationflow We have a simple deploymentThis defined only image and argumentsbut doesn't handle anysignals Let's say we direct its potusing cubic cut direct pot for threethis port receive a stop signal usuallysix time However if the applicationdoesn't handle the signal it continuerunning during the default 30 graceperiod Once that time is up the port isforcibly stopped with sigkill Meanwhile the number of runningport drops below the desired number Sothe port isrecreated Then scheduling image brewingand container startup all happens asusual Now you might wonder why the portkeeps running even after it receives sikIf you run some program on your laptopand send it signal it's typicallyterminated However in the containerworld application typically runs as P1So the Linux kind of treats it speciallyIf your application doesn't handle Stime it's just ignored That's why theprocess keeps running until sig kill Andhere's a problemIf your application just wait the entiregrace build the can waste resource alsoyour application has shutdown task likecrossing connections or writing backdata Sudden shutdown can lead to requestrows andinconsistencies To avoid this it'simportant to handle the stop signal andshutdown app before the graceful periodendsHowever some application is a differentsignal for graceful shutdown For exampleNGXexpect But what if you can't modify theapplication code directory like thosethird partyapplications In such cases there are twosolution Docker file stop signal andKubernetes pretop In the docker file youcan set stop signal to tell which signalyour applicationexpects In this example reset secret Socontainer gets secret instead of sigtime The second method is pre-topactions which runs just before thecontainer shutdownbegins There are several options Firstwith an exit hook you can send thedesired signal directory like using Qsecret but it's required kill binariesin your applicationcontainer Second HTTP hook is anotheroption It calls a shutdown end point onyour app which is useful if yourapplication provides the HTTbasedgraceful shutdownThe last thing is the life cycle stopsignal This feature isn't available yetbut you will able to specify customstops signal directory in the potmanifest This is useful because it workslike docker file stop signal but youdon't need to rebuild your imageSo now we understand how terminationworks for a single container port Butwhat happen when your pot has multiplecontainers For primary containers thetermination process all happens at thesametimes But what aboutside The sidecore design pattern whichis extend the functionality of theprimary container like loggingagents like that has been known for manyyears but cyto support as a nativekubernetes feature is relatively new Itbecomes a default starting in1.29 This feature builds on the initcontainer and it's only applied when theport restart policy is set to alwaysSo here how the shutdown works Unlikeprimary container cytos shut downsequentially Once all the primarycontainer have exited the cyto get stopsignal one by one in the rebas order ofhow theystarted In this example cyto stop in theorder three two oneOne thing to keep in mind the porthommination grace period is shared byall containers So make sure everythingcan finish in thisperiod So far we've learned a greatsource down However portination graceperiod are not always guaranteed Ofcourse if your app crash due to bugs orget killed by killer there's no gracebuildAnd even when Kubernetes does to dotermination it can sometime beoverridden or ignored For exampleduration and eviction API can overridethis period For example if you run cubecut date pot with a 1 second grace perfield option the pot will be immediatelyterminated regardless of its pot graceperiod settingsAlso when a node is under severeresource pressure the cubate epic spotto reduce nodeusage in a software eviction this isdisabled by default but cubate overrideit by its configurationvalue and harder eviction is moreaggressive but uh immediately evictedwith any graceperiod So we've explored the standaloneapplications with that But how aboutapplications that handle network trafficlike HTTPservers Let's consider deployment with aservice When you delete a portKubernetes sends a stop signal as welearned before then eventually removes aport from the routing table A new portstarts up and once it mark as readytraffic is direct thereHowever even if we've implemented thegrace version down it's not enough toguarantee zero downtime We still face tochallenges Let's look at the firstissue On the right this go code shows atypical HTTP server After receivingsignal the shutdown function is calledfor graceful shutdown And this functionkeeps handling infrared requests butstop listening immediately Here's acatch The server stop listening to newrequests but new incoming request willstill be routed to your pot It's causinga requestdrop To avoid this problem shutdownshould be delayed until Kubernetes stoprouting traffic to thepot There are some ways to introducethis ray One is just modify yourapplication code and adding strip beforethe shutdown logic Another way is pre-stopthree from 1.3 This feature is availableby default So you can set three secondsin the port manifestdirectory Let's move on to the nextchallenge If your app isn't preparedright after it startup that Kubernetesimmediately starts sendingrequests these requests can getdropped Readiness prove can solve thisproblem This pro check whether theapplication is actually ready to handletherequest Kubernetes will starting requestonly after this pro passesHowever using only a readiness probe haslimitations because this probe checksthe status throughout the entirecontainer life cycle not just duringstartup Generally you need differentthreshold or inverters for startup andpost startup proves If you rely on areadiness proof for each phases it couldlead to slower startup or two strikehealth checksThat's where startup prove comes in Thisprove runs only during a initializationphase Once the startup prove succeededladiness and liveness provetakeover This separation let youconfigure startup proofs withoutaffecting the post startupproofs Next let's see the actualbehaviorrestart On the left we have deploymentand service management It doesn't havespecial settings On the right this is acorrected version deployment It's addedpre-top three and proves which ensuresmooth traffic transition with zerotime Let's see the fader case first Onthe screen each panel show the fadercase deployments port and the serviceendpoint And now I going to drape theport Look at the log Our server receiveda sig time and start the shutdownHowever IP table haven't been updatedyet and traffic is still being direct tothe potAs a result these requestsdropped Now Kubernetes has created a newport and the routing has switched but westill seefailures This is because the new serverisn't ready to handlerequests So next let's see how we canavoid theseissues Here the corrected version ofdeployment which hasprior Now I going to delete supportagain This time thanks to P of threeKubernetes waits 4 seconds beforesending the signalsignal While an old P is threeing a newport is created The readiness provereturn success once port and endpointare ready The IP table will also beupdated As you can see traffic smoothlytransition without anydrops Finally the old port receive signandexited This way by applying pre- stopthree B proves we can minimize downtimeduring the port restartWe've covered traffic handling duringrestarts but we haven't touched when thetraffic shift happenedexactly Normally cube proxy do not sendthe traffic to terminating pot but if noother end points are ready traffic stillgoes to the shutting down ports Thiswill change the restart behaviordepending on replic accounts and restarttypesIn multi library scenarios Kubernetesremove the the point almost after theport isderated Then it ask the new port and thepoint once it it becomesready Of course there gap between theold port stopping the new and the newone being ready But it doesn't matterbecause as a replica can still handlethis requestNext in the routing upgrade new portsare created first So regardless of leakaccounts traffic will beswitched when the new port are marked asready But one point note that trafficdoesn't stop at the same time ofdurationcomponents like cubit endpoint thricecontroller and cube procy all componentfetch resource status from API serverand process them as synchronously sothere could be athre then how about the single replicaworker as I mentioned if there is no asthe lady endpoint traffic keeps going tothe terminating port as you can see onthe left the old port keeps receivingtraffic after it starting shuttingdown Once the new port is ready thetraffic isswitched With this seamless trafficshake even if it's a single replicaworkload we can avoid request dropduring restarts However as we learned inthe previous demo if the old port endstoo quickly the new port can't take overthis requestit there is a risk of the gasdrops Therefore giving enough pris andsetting proof isimportant Next let's look at whathappens when a liveness prove fails in asingle replica In the container levelresult all events happens sequentiallySo there's a short gap before it comesback unlike the routing upgrade case Noone can take over this trafficSo during this gap the request isdropped That's why in some cases wecannot avoid downtime in a singlereplica So multiple replicas isrecommended for important applicationsNow we learned about simple app and HTTPservers But what aboutcontrollers We can apprise the sameapproach to controller but with oneextra factor which is a leader electionTypically the controller implementcontrollers implements leader electionto prevent conflictingupgrades So let's focus onthatHere we will take the controller longtime as an example This is a commonlyused library to implement acontroller In this library once thecontroller gets saved time and thecontext is cancelled the great shutdownbegins It goes through severaltermination procedures and finally stoprenewing leadershipsThen new reader gets aleadership So how can we reducedisruption duringreelection Wetalk before talking about theoptimization technique Let's see theKubernetes leader electionbehaviors Kubernetes provides a listresource for radar election Eachcandidate continuously try to updatethis shared resource and the fast thatsuccessfully updated it becomes itbecomes aleader Here are two key parameters thataffect reader transitions First listduration set the maximum leader durationwithout renewal The default period is 15seconds seconds Retry period specify howfrequently candidate attempt to acquireleadership The default interval is twoseconds With these default values in theworst case a leadership takeover couldtake 17 secondsYou might think we could just shortenlittle duration to reduce take over timebut setting a two short periodincrease split brain risk like multipleportvar So how can we safely reduce thistime The controller runtime provides anoption called leader election release oncancer When enabled this the leaderproactively ends the leadership not justwaited forexpiration Internally the managerupdates list duration seconds to 1second only at theend With this configuration theleadership transition time can speed upto 3 seconds while keeping the defaultduration except on terminationends In summary although carefulattention is required temporarylistation updates allow you to safe andrapid leadershiptransition So far we've discussed howapplication can gracefully handlerestarts But what if cluster maintenancetarget multiple restarts and affect theentire workload How can we minimizethese disruptionsPort disruption budget called PDB canlimit these voluntary disruptions On theright we have to manifest a deploymentwith two replicas and is secured by aPDB This PDB allows only one port to beunavailable In the scenario on the leftthe eviction request succeeded becauseit respects the budget we've set But inthe scenario on the right the evictionrequest is rejected This is becauseanother port is still pending and theeviction would exceed thebudget In this way when new setup PDBresol requests that would viate thedisruption budget So you can reducedisruption We found the PDB isbeneficial to reduce those disruptionthat there are some limitations Firstunlike Ring upgrade port evictiondoesn't create a replacement port aheadof time So if you want to keep at leastone port during eviction you need atleast two running ports to beginwith Second if you set PDB to stry itcould permanently block evictionrequests This would happen in severalcases such as when V workload has only asingle replica max unavailable is set tozero or the application is never in arunning state due to bugs or somethinglikethat This can complicate nodemaintenance like cube drain and itrequires lot of manualintervention I mentioned thatapplication part can also broke thiseviction butKubernetes provide this solutionstarting from version 1.27 Kubernetessupports unhealthy port eviction policyas a default When you set always a loadto this policy unhealthy port do notbroke the evictionrequest Lastly remember that PTB onlyapply to the eviction API and death ofpreemption Of course if you delete potPDB constraint won't be worked Alsoalso when the scheduleuler try to printlower priority port to make a note spaceit normally will follow the disruptionbudget But if finds no plates matchthoseconstraint budget in which casepreemption may ignore the PDBSo finally let's move on to the keytakeaways for today'ssessions First make sure yourapplication can handle sig time andstart grace version down If anothersignal is needed use stop signal or pre-stop actions Also cycles are terminatedsequentially in reverse order of theirstartup time and remember the pottermination grace period is not alwaysguaranteed Next pre stop three and lethelp minimize traffic downtime Bothfeatures support smooth traffic shiftingduring restartThird if your controller use leaderdirectionction you can speed up thetakeover time by setting in short periodas a list duration only at the managerstops But be careful for sprint brainLastly PDB can mitigate the impact ofvoluntary disruption without losingoverload leasersThese practices sees will help make yourpart more list friendly and takeadvantage of Kubernetes automationfeatures And last here's some list Thisslide has been uploaded to the eventsite and all example are available on myrepository That's wrap up our deep diveport restart session Thank you[Applause]2025-04-15 21:57:59.491734 ��]B#��qAAYxjk8ZZclookay everyone Uh welcome to uh Tamingthe Beast on the uh last day of KubeConIt's great to see such a good turnoutconsidering uh last days Typically it'snot really like this Uh my name's LucyI'm an engineer working Uber and I'm acontributor to Kubernetes Uh for most ofthis presentation I'm going to beplaying uh more of a role as a user ofKubernetes who uh has a uh what is itwho has a lot of struggles with uhresource management I'm thrilled to bejoined here today by Dorm Hello everyoneMy name is Don Chen I'm the softwareengineer from Google and also I'm thetech lead of the Kubernetes signal sinceits inception And I'm thrilled heretogether with the Lucy and together withus uh we are going to show you folks uha lot of the new tricks we developmentuh last one year and or couple years andto help you to management resourcesbetter and help you gain better controland also management the Kubernetesresource a little bit less and tend YeahAwesome So uh oh can I borrow theclicker ah yes So uh Dawn uh I've beenusing Kubernetes a lot at work uhrecently and it's been quite usefulespecially for more basic stateless uhworkloads and uh the more basic stuff wehave but we've really struggled with uhmore advanced uh workloads and moreadvanced resource usage I uh thoughtthat uh for to show my artisticcreativity I would draw and uh how itfeels like to me It's almost like uh ttrying to tame this wild beast uh towork with our platform There's a fewthings in particular that I'm uh reallyuh really having uh trouble with I waswondering if we could maybe step throughthem and see if we can work out uhwhether we can do anything better Yeahgo ahead Absolutely Let's tame thislet's tame the beast Okay which is theforward one That one So first offum it's really it's uh setting resourcesat per container level It's sometimesometimes it makes sense sometimesthough it's a bit of a struggle I don'treally want to dedicate an entire sliceof my CPU just to some site loggingcycle I just wanted to share it with themain application Uh there's also a fewapplications we have where we have onecontainer normally has high utilizationat the time another container has a lowutilization and then it flips on theinverse and really we have to reserveall of space for both containers uh inorder to use it There's no real conceptof sharing um you know you could maybedo stuff in the CR but it's really notthat great Is there maybe anything wecould do here yeah of course signaled uhcommunity have been heard this kind ofthe request for years and we finallyintroduced a new feature called P levelresource uh uh to the so the featureactually it is in F2 in this recentrelease 32 uh uh 33 sorry right so anduh it is allow you not just uh apply theresource requests and limit at the portlevel instead of the individual of thecontainer level Right so to really diginto how this is work let's start withthe how we apply of the resource uh atthe co ��gA#��AeO8szEGNwoohello everyone Thank you for so much forjoining my sessiontoday Let me ask you a quick questionCan you put survivalrestart Potter restarts might soundbasic but handling them correctly is abitchallenging Kubernetes restart port inmany situations And if your pot isn'tready for prepare for that you could runinto unexpectedissues Before we dive in let me brieflyintroduce myself My name is Ay Zawa I'mmember of technical staff at CrownaticsI'm currently developing L Marina !ntainer today and you can see thefrom the yama file and you have to applyCalifornia syncing up front planning anduh how much of the resource you want toapply to the each individual of thecontainer and you always there's thekind of struggle you are going tooverprovisioninguh for each of the container and uhbased on their peak usage and the whichis need about the class level or nodelevel of the resource visiting and oryou are going to risk your underprovisioning of the critical forapplication and impact their performanceright so so this is kind of the currentapproach with the introduction introduceof this new feature ple resource and youcan see that from the yama file right soyou are hugely simplify yourconfigurations so you can apply simplyat the port level and you can say thewhat is total of the provide theboundary total of the resource uhrequest and limit at the p level insteadgo to the detail about the container soyou can see that the two container andshare of the the CPU and the memory uhdo you want to try the demo yeahabsolutely it's uh that sounds reallyexciting let's see if we can make itwork on a real system I will say this isover the internet so uh just uh bewarnedUh so if I come in here I have aKubernetes cluster here Let me remindmyself what I have in here Good So if Icome in here I have first off a uh YAMLfile here which I'm going to cap for theaudience Uh so this is just a uh basicweb app and it has a resources Uh it hasa sidecar container that's doing somelogging and it has a separate resourceresource request right so let's justapply that to the uh cluster pod levelbeforeOkay Yes I should make that bigger OkayAnd I'll just quickly remind soresources are at the container level andwe're all used to this right now Whatcan we do differently well let's have alook maybe at pod level after.yaml whichis the exact same uh application exceptnow look the resources are not set atthe container level at all They're setall the way at the top in the pod levelAnd the pod is going it's going to sharethese resources among all of thecontainers uh within So if we now applythis pod uh what is what what was thefile called again it was called podlevelafterl and that's configured and if Ilook maybe at the uh pods uh once theycome upuh which one was it i can't evenremember now Probably this one and thenI get this uh podhere and we go up to theresources Ah here we are So you see herethat the resources are now set at thepod level They're being shared among allof the containers Awesome That's reallycool Um so Dawn um is is this somethingwhere you have to make a binary choiceis this something where I either have uhpod level resources or I have containlevel resources or can I mix and matchhere as adults we want both We want allcombinations So this diagram show youhow powerful it could be And so you canOh thank you This is Oh yeah Yeah Wellthis is the one Okay So you can see fromthis diagram you can see that you canapply the uh the the resource at thecontainer level You can also apply theresources at the pod level You can alsoapply as the combination So this isactually the highlights of the com theflexibility of this feature gave to usright so you can of kubernetes So thisone with this feature you are hugelysimplify um your the resource managementuh configurations right so and reduce ofthe possibility because themisisconfiguration based on the sum ofthe container resource and whichexceeded of the node capability but iffor advance of the uh uh applicationsyou want for example for the machinelearning of the workload which has theone data process uh pre prepile ofcontainer and which require a lot of theresource spec at the beginning and butalso we need to ensure about the uhmachine learning main app and have theenough of the uh CPU and the memory canuh do the training jobs So in thesecases actually you can apply uh powerlevel resource the CPU and the memory asthe boundary as the total ensure of themain application can be uh makingprogress but at the same time you canset the relative really high limit forthis data pro pro make sure they ca"nusing some of the resource at thebeginning and but not all the time rightso that's the kind of things you canthink about that another of the exampleI'm can thinking about it is applicationwith have the web services along with ofthe cached proxy right so so youapplication the web services they may behave to due to the traffic pe you may beonce a while need more resource and buta lot of time actually it is the normalyou can set the p uh p level uh boundaryas the normal usage of those combinationof the two containers So that's kind ofP level in this case as P level resourcegive you the the guardrail as the wholeright So since you can think about thepart as a unit but at the same time vingof that part internally you can do thebetter sharing you can also uh at set ofthe peak usage but at the same for theuh at the same time you can ensure ofthe normal time and the resource usageThat's awesome So uh this is uh in alphastill right so if you want to use it youhave to turn on the feature gate Butyeah and uh hopefully folks can use itand they give give us feedback AwesomeSo that's great Uh we have thoughanother problem as well Uh right nowwe're trying to run stateful workloadson Kubernetes databases uh things thatare more tough to disrupt as well And wewant to change their uh resource usagebut this isn't really possible right nowPods are immutable and their life cycleis one way Once you said that you havethis many resources for the pod youcan't really change that When we rundatabases for example at work thatreally can be quite disruptive becausewe have to create a brand new pod uhmove everything over and uh and becauseof that we're we're really struggling Isthere maybe uh anything that we can dohere to make this better yeah this is areally good question actually andrequire about the uh powder restart andalso container within that powderrestart and it is with uh when there'sthe little resource uh change actuallyis definitely introduce a lot of theoperational burdens right so this is whywe start see signal community start tointro working on this imp place boundarysensing for years so I just want toglide to along that in the1.3s3 and this feature actually promotedto the beta and uh so that's and also itis enable default uh for kubernetescluster so so this feature just likename uh implied and um so you can uhlet's go to the previous let's look atthe before picture first and so you cansee that okay you want to change smallthe machine have some of the slackresource you want to change because dueto that traffic pe you want to change uhin uh increase of the CPU and the memoryand you have to today you have to killoff the uh uh kill off the uh runningcontainer and and rebuild the recruit ofthe new part with the new update of theresource requirement and then schedulersto schedule that to the whatever noderight the suitable node So this isdefinitely introduce of the disruptionand uh could be and then the theterminate about the services all thosekind of things loss of the networkcollection all those kind of problem wehave right so and also even for themstateful side could be cause of the datacorruptions too because those kind ofthings so with this in place powder uhresizing graph a diagram to show so youjust can kubernetes take that yourrequest and also just dynamic change andif the node have enough of the slack andthe dynamic change of the request andthe services continue run and do youwant to try yeah this is really excitingLet's see if we can get it to work Uh solet's come over to the terminal again Ineed to quickly change uh context uhwhat is it uh QCTL uh config use contextuh pod uh what is it in place is it yepokay so uh in here I have a uh series ofpods here uh awesome so let's have alook at these so if I open up this podhere uh it's just uh running anapplication if we come down a bit so I'min edit mode okay so say I'm in here andlike I want to increase the uh CPU limitof this pod right I've got more trafficcoming in and I don't want to blow upthe whole pod it's running a database orsomething but I just give it more CPU SoI'm going #to come in here I'm going touh in change this 600M to 800M I'm goingto write that and it's been configuredAnd now if I have a look if I go QC I'llget uh this pod here Oh yl 800 The CPUhas increased The CPU has increased 800m800m So I've done that without stoppingthe application at all I've just givenit more CPU with absolutely zerodisruption to my users and myapplication owners So yeah that's thatis really really cool and it's one ofthe most exciting features I've seen inQ&A in quite a while But yeah awesomeThank you So it's not that we can dothat with everything are the limits hereFor example adding resources istypically quite simple for systems buttaking resources away from alreadyrunning workloads is a bit tougher Forexample I could see maybe it works forCPU uh shrinking but does it work withsay memory shrinking yes that's a goodquestion leading to the next one And uhso um so so it's not all application youcan uh easily to change the resourceright so not so even like the not justshrinking and even like the in of theresource request like for example Javaapplication when you want to uh readjustof the hip size and you may require ofthe restart of the application So tomake this this is used based on thosekind of workload node and different uhapplications So imp place recessingactually intro in introduced the policyresize policy when two option today wesupport is prefer no restart and anotherone it is restart container right sorestart container basically justprevious behavior right the defaultprevious default behavior and uh for theVPA and integrate those kind of thingsand uh but the preferred but even withthe preferred no restart we didn't sayno restart here's the reason just likethe Lucy just ask for for for memoryshrinking actually doesn't work verywell because that's have the due to ofthe kubernet the kernel limitation sokubernetes sign community actually workvery close with the linux kernelcommunity we try to make this is betterespecially in the sigu v2 and and whenshrink off the memory and this willtrigger about the negative recon of thememory and but in the sigu v2 actuallyprefer it is w killed because the reasonthere's also have the good readingsbehind because we want to make thatpredictable performance all those kindof things So but we work with thembecause the different requirementespecially for the booming of the AImachine learning workload node and careof those those tasks actually cause alot of the cost is really high Soanother things I want to point out anduh this feature actually also can worktogether with the uh pod level resourcewe just introduced earlier and uh uh theeven you can apply of the pod level uhresource as a whole for the whole partbut the restart policy actually dictateabout the behavior about the containeritself So I want to point out that onealso I want to point out um there's thelimitation right so the feature itselfactually it is node feature but to makethat in power is a holder feature andmake that more useful and there's the uhcluster level of the orrationintegration for example vertical powerthe autoscaling and even like the someworkload framework for on top ofkubernetes like the array and they havetheir own autoscaler right so they allthose kind of things we need tointegrate actually there's the communitywe work with them and to make that isintegration better so so so it's easyit's not like Lucy and have to manuallychange the yama file and so in thefuture it will be automated um and alsoplease notice that there's otherlimitations right so uh currently forthe battery release uh we are onlysupport CPU and memory and uh there'sthe other extended uh resource we uhunder development we are thinking Eachone of the resource actually introducedifferent type of the complexity to thesystem because by the nature of theresource itself So but we are continueof the improvement of those kind ofthings Another small things I want topoint out right with the even with ofthe uh the impress powder resizing withthose resource and you cannot change ofthe quality of the services guaranteejob will stay withi$n job burstable staywith burst even you may be change totheir CPU requests and the memory equalCPU requests and equals to their limitsand also static CPU and memorymanagement actually won't be changedbecause even you change off those kindof means there's a hard reallycomplicated to dynamic change of theyour exclusive uh or reserved CPU andthe new load all those kind of thingsright so another thing I want to pointand we make the to simplify to get topromote this to easier earlier so peoplecan try give us the feedback and we alsomade a decision it is atomic resizingright so you ask partial of the if onlypartial request adjust hit uh on thenode admission time we will reject theentire of the uh resizing because uhthat introduce unnecessary complexity tothe system and also it's also for theuser it's really difficult for them toreasoning what's the behavior and forthe we cannot that so that's kind of allthose limitation and we have but pleasetry because it's quite powerful and uhmaybe in in last couple days of the ofthe talk and a lot of people alreadyutilize this feature and in theirproduction and uh and yeah so yeah givesome a lot of guarantees Well that isreally awesome I think this is a reallypowerful feature Actually let's just doa quick straw poll Put your hand up ifyou're planning to use this feature inthe future Put your hand up Wow Yeahactually I should put my hand up tooThat is a lot but this is a huge boonfor us all Okay awesome Uh oh am I overhere yet uh awesome Okay now this is allreally cool There's just one other thingthat we're kind of missing right now UhI run a lot of Java services and a lotof services where on startup they haveto warm caches or pre-process things anda lot of the time that means theirmemory usage it looks like a big burstat startup and then it kind of falls offRight now I have to give them uhdedicated memory all the way through totheir startup load and then I can'treally take it away Uh even even now Ican't do memory shrinking right so thatmeans I have a load of idle memory thatI don't want to use I want to use swapjust like in any other system butKubernetes is uh almost allergic to swapIt's not just not supported If you run acublet with a on a machine that has swapit immediately it immediately blowsitself up So um yeah it would be reallycool if we could maybe use memory swapfor at least for burst for burst memoryin Kubernetes Do you think we maybe wecan get to that as well yes Can we go tonext one so so yes we actually workingon the node swap support and uh um anduh unfortunately this time still is notgraduated to the G and you can see thatthis is for last many of the uhimprovement every release we haveimprovement on the node swap features uhand uh so in the1st3 and we promote to the beta 3 orbeta beta 3 I and and add moreenhancement uh to the features So I justwant to say something at uh in the pastas the signal and the comm community andthe community tech lead I'm reallystrongly uh against of the end of thenode and about any of the format of theswap usage one of is the kernellimitation right so the top reason it isbecause they introduce performanceparity associated performance monitorright so and also kernel cannot reallyum accurately detect what's inactive thememory So and uh and also we don't havethe really good way to detect those kindof things and carefully calculate howmuch of the those inactive memory canpush down to the disk and also thosedisk activities exhaustive disc act diskactivities really really slow and alsoexhaustive disk um uh activity could beintroduce uh the the neighborhoodloyalty right so of impact not just it tthe complication itself's performanceand impact entire of the node otherapplications performance not just siteand also kernel once you end up offthose swap and the kernel will spend alot of time to try to do reclaim uh pushthose pages to the disk and instead ofspending the time to really run off theapplications right so that's thethrashing uh all over also associatethose kind of things so that's why uh Istrongly against to support of the nodeswap but at the %same time um um in thesame time we realize because we evolvedKubernetes we support the state for siteand now we are support of the batchworkload and also especially for thoseAI machine learning large scale of theJI machine learning training jobsuh and we know and uh resourcemanagement and the planning is reallyhard and care of those certain jobactually it is at a more cost to uh topush those and act those certain of thepages into the disk So that's why westarted to accelerate of the developmenton the node swap uh uh memory swap Sothat's canso this is the this is the before andyeah you can see that the featureactually it is the um and at this momentthe feature actually is node levelfeature you can see the cub kubernetactually play uh control plane uh majorcontrol plane uh here and so kubernetyou have to config and this node andenable of the swap or not right sokubernet will do the carefully how muchand uh and the monitoring of theresource those usage about those kind ofthings Kubernetator also will all bebased on the both of the uh raw memoryusage and also swap usage and makedecision when there's the steel cannotprevent and make decision Okay So so sothis is but this one actually uh meansthe node level of the of the swap Westill not really finish give the podlevel and uh and uh make the applicationlike a pod for free to use in those swapSo we are continue work within thecommunity and try to make that finalgrand of the control uh for the each ofthe pod and applications So awesome Thissounds really this sounds really coolMaybe we can give this uh this this a goas well Okay so uh if I come over to myterminal here uh let's see if I haveaccess to the cluster first off becauseI've been fighting with it this morningYes I do Awesome So on this machine hereI'm running a basic cluster using cubemUh if I just have a look at my swap Ihave swap memory here Uh if I have alook here and so uh sorry yes let memake this bigger So you see here I'm ona I'm on a cluster This is a single nodecluster I have swap memory here and it'sbeing used properly If I have a look atmy uh cubadm configuration you can seehere that I've turned on the no swapfeature gate and I've said that the swapbehavior is limited swap So that meansthat uh the burstable component of thememory of uh workloads uh can end up inswap What's the burstable component wellif I have a look in uh and I c actuallylet's cap the live pod qctl getpod Okay let's cap the fileuh uh you can see here So uh the uhrequested memory is the part that'snon-burstable and then once you're aboveyour request but within your limitthat's the burstable part that inlimited swap can uh end up in limitedswap mode can end up on swap memory Andthen there's another mode where you canactually end up with all of the uh podmemory on on swap Okay so that's reallyreally cool And yeah unfortunately Idon't think the cluster is going tocooperate with me today I don't know Noit doesn't like me Okay I think itliterally just crashed mid demo That wasbad timing But yeah that's really that'sreally cool though as a feature Um letme put this side back Two out of threesucceed Two out of three is way betterthan we thought it would be So that'sgreat So I guess now that so I work atUber So I guess that's great because itmeans that with like many pabytes ofthis space that means that I have likeeight pabytes of RAM now right yes Realtotally right Yeah So that's why you cansee the demo of the node swap fieldThat's mean that indicate how cautiouswe are try to use the end of the nodeswap by default even node swap featureit is g by default we will make that isavailable I want to platform adminCalifornia thinking and plan and howwhen there will want to end up after thenoodle swap okay so I talk about a lotof those yeah using cautious and therethere are more of those kind of thingsmost of the things I already mentionedthat earlier and They also have thesecurity right so security concern So umso the reason of this time it is notgraduate to the G most it is because ofsecurity concern and another one it isthe eviction management uh that &thepotential policy the good behavior ofthe part without using swap So that'swhy we we we kind of think about we needthe more uh sorry sort through and uhbut that's kind of thing but each one ofthose kind of things actually have themedication suggestions So we are goingto document those mitigation and uh bestpractice to enable about those featureand uh so so looking forward afterfeedback when you try in your in yourplan in your clust production clusterYeah Yeah And awesome and people yeahplease turn it on use it maybe give usfeedback Okay awesome Well thank you Dorfor helping us tame I've got amicrophone on here Thank you Dor forhelping me tame the beast Okay so nowwe're gonna go into the and talk a bitabout maybe the future of what may endup happening in Kubernetes sign Before Italk about this I just want to be veryclear I'm in a I'm a human who hasopinions which is the most dangerousthing of all But when I say somethingthat doesn't necessarily mean that it'sa promise and it doesn't mean that uhother maintainers in Kubernetes agreewith me It is an opinion that I have butit is not a commitment to anythingparticular But I want to start by maybethinking about why uh why Kubernetes isso ubiquitous And I think it was ClaytonColeman who said it's because it can runmost things reasonably well And that's areally good baseline to maybe setourselves towards Um the landscapethough of of workloads that are runningon Kubernetes is changing and it'schanging all the time uh and over timethroughout the project we've beengetting maybe better and better atrunning more things reasonably well Ifwe look at maybe at the start ofKubernetes you have basic web serversbasic chrom This is what Kubernetesreally started with 2016 you have theadvent of the stateful set which hasdeals with basic stateful workloadsEventually there's progression offeatures so that it can uh get betterand better 2020 you have the uh job setuh object which helps you run batchworkloads in a much more better way Andwe've talked today about these newfeatures that uh they could help you forexample run developer environments Uh Iknow that in the past maybe we've nothad the best support for this I knowthat GitPOD was on Kubernetes and thenleft because of the challenges they werefacing Um with this for example with inplace resizing a lot of developerenvironments require a lot of resourceswhen you're building and compiling butnot many when you're just writing codeThis kind of allows you to respond tothat Also more advanced statefulworkloads things that are less able touh have be disrupted These are easier tosupport with these new features but weneed but we need to uh keep uh relevantwith the industry and we need to be init in my head it should it shouldn'ttake you being a maintainer to run yourworkload on Kubernetes as long as it'snot really crazy right most workloadsshould be able to run in a reasonablywell fashion so with that maybe Dawn doyou maybe want to talk about what whatwe're looking at next and how this tiesinto what may come next in nodeKubernetesThank you LucyAnd you're lucky have the two mostopinionated people on the stage That's abad that bad combination That is myopinion and also my proposal So uh soI've been trying to propose those kindof just like what Lucy said and I alwaysbelieve we can evolve Kubernetes tosupport more workload even most of theworkload right like the last coupleyears I even we even signaled to makethe windows container supported here sothat's why I always believe we can dothat so but I also know that today'skubernetes make a lot of the primitivesto support uh certain workload like forexample those complicated the uh batchworkload and the GNAI machine learningworkload we do we can support but wedon't do good job and it's really painfor a lot of the users right so one ofthe I want to uh to talk about one ofthe proposal I made last year is Pgrouping right so and Kubernetes todayhave only have the p concept of coursethere have the high level concept thejob set and the replica set the demonside tons of those kind of things Buteach one have to have that a specialmeaning and a special and also it's topright So it's not a first first layer ontop of uh uh work to direct with thecontrol plan and work d sorry no controldirect with the schedule thinking aboutis the per pod and uh so this is one andpropose especially it is addressing theneed of the distributed of the batchwork node we introduce okay it's notcalled the p group and it introduce pgroup lack of the concept the first lineof the concept describe those workload Iwant and so so then that require of thegroup of them they can colllocate it anduh and the tight couple of the containerall those kind of things right so I wantthis is kind of a fundamental of thebuilding block and uh so the so allthose kind of things can be uh groupedtogether and uh so end up of the hacklescheduling when I talk about hackerscheduling it's kind of like the reautoscaler and the snermuler and alsolike the this week and announced of theKai scheduler and by the uh uh by therun AI and so all those kind of thingsI'm thinking about those p group on topof the kubernetes and people cannaturally build about their ownreccluster and slurm cluster on top ofkubernetes so then we can and also thewhatever and for other customer of theum customized of the scheduleuler rightAnother things So that link about theproposal and another one it is also itis I want to relax of the constraintlike today we talk about the p levelresource right and we also talk about uhwhich is make this is happen and we alsotalk about implants p uh resizing or itis make a p less static right so thisone is I want to further make part lessstatic so you can inject you can removeadd and remove the container right Sointo a pod So when you have the this isthis feature and together previousfeature pod group actually represent awork node batch work node right So thenthe batch workload control plan they canschedule about their task and jobs intothe those pod group and inside each podgroup and uh inside of the each pod theycan schedule their task and jobs alsoAnd when you need more resource you canalso enlarge off your powder and and ifyou don't need that much resource youcan shrink or you can share with otherother contuh uh jobs on application on the samenode right so there's a lot of thethings we can do so so this is kind ofthe what uh we we are talking about thisone so once I mention this kind ofthings and forexamp signaled and I myself we have oneminute actively work with the ray andthe snerm community and also we justdiscuss with the kai scheduler uh peopleand how we can group together Theultimate goal for me it is the uh definea lot a lot layer of the abstraction ontop of the copenet and so uh the batchworkload can work on top of those kindof things right so just a little bitsimilar signal actually start about theDRA and then that is build of the umabstract layer on the standardize allthose specialized of the device sothere's another layer we want to buildit is on top of the pod objects So thenuh all those kind of the customer or thebatch work node scheduler can scheduleuh target to the pod group together Ithink we may have to I think we may haveto be very bridged here Yes The fi OkayGo ahead Uh no no you do you want toquick like no actually yeah we need togo forth Very sorry folks but we arealready over Uh the final thing I wantto say and it's going to take 30 secondsDon't cut me off isuh is um this is not being built inisolation or alone A significant amountof people the London one wasn'tavailable yet A significant amount ofpeople were involved in building thesethings and in making these things readyfor production So please if when youdecide to applaud if you decide toapplaud please reserve some of thatapplause for those people I'm not goingto name people because if I do I'm goingto forget someone and that's going to bebad But I'm looking around the room at afew of them that I can see And with thatuh thank you I don't think we can do Q&AThere is a there is a QR code if youwant to give feedback on the session andyeah thanks for coming[Applause]Thank you2025-04-15 21:58:00.126440(gat extending Kubernetes to actually domore to not just do what Kubernetes doesout of the box but to bring more valueto their organizations and and to theirteams so if I simplify this down evenfurther a user sees Kubernetes as someinfrastructure an administrator sees itas a container scheduler and aself-hosting solution for tons of toolsin the CNCF landscape and outside and adeveloper sees it as a powerful APIserver and all of these are true andit's part of why Kubernetes is takingover the world is because it does solveall of these problems extremely well butit does it's like touching differentparts of the elephant you never quiteknow who you're talking to and today'stalk to be very crystal clear is allabout this developer p persona andexperience because again we don't talkabout it as much from an entrylevel whywould I use as an API server how does itwork and that's what we want to dig intotoday so hopefully that sets the contexti won't be offended if this was not whatyou thought it would be we'll learn towrite better CFPs it's okay but this isthis is what we're going to be talkingabout today and the reason we need totalk about it is because the number ofacronyms the number of tools the numberof ways of working with the KubernetesAPI is confusing and we want todemystify how it works yeah and we willstart with the first one i will explainyou everything about CRDs what is a CRDokay let's go crd three words and firstof all it's a resource okay it's aresource that you will be able to applyon yourcluster resource is just a reminder inKubernetes define a desired state when Ihave a deployment resource I define astate of I want three replicas and thatis my desired state and the cluster willmake sure that the desired state uh isrespectedokay we all know about native Kubernetesresources a deployment a pod an ingressa service those are the native resourcesthat existcustomdefinition that means that me as adeveloper I can create my own resourcesand apply them toKubernetes and you can define basicallya new type of kind and just it soundsreally simple but for me in my opinionthat is one of the most powerful featureof Kubernetes we often hear heyKubernetes is a container orchestratorthat's true it's really good at it butwhat is the most powerful feature isthat you can build on top of it in areally easy way and the entry point isbuilding your CRD you extendKubernetes let's see together howexactly looks like a CRD so first of allthe first part is um if you know a bitthe the the native blocks like a pod oran deployment it's exactly the same youhave an API version and you specify thekind then you have some meta data asalways and then you have the spec partand basically in the spec part you willbe able to put an AP open API schemabecause a custom resource definition isnothing more than just extending the APIof your clusterlet's focus a bit on the the spec partokay so uh here we can see a reallysimple uh CRD and I haven't expanded thethe spec part here but here you give itthe kind that you want i could call itwidget i could call it anything thatmade sense for your organization i couldcall it pizza quason anything if itmakes sense for your platform i candefine whatever I want inhere let's focus on the spec part againit's an open API v3 free schema thismeans that you can define there yourfield types do I need strings integersbooleans enims I can add constraints toit hey for this number this integer Iwant a minimum and a maximum this stringis required and that will help me forvalidation if I don't respect any of thethose constraint I won't be able toapply my custom resource yes because youcreate a custom resource definition andfrom there you create custom resourceresources if you are a Java developerlike me uh think about the customresource definition is your Java classdefinition your POJO definition and yourcustom resource are just the instancesthe objects that you create i like thisanalogy we have other analogies like adatabase schema but as a Java developerthat's the one that I use and byrespecting those validation you alsoensure data int)egrity in ATCD because inthe end it's a resource that will liveinATCD here you will have um a morecomplete uh spec part okay so imagine Iwant to create a custom resourcedefinition to to define a posgressdatabase what do I need i uh just wanthere three fields i want the name of mydatabase and here again look you cangive some defaults you can give it adescription really useful and you cangive it a type then I have another fieldwhich is also of the type string andfinally I have another uh type which isa nim of the type of my uh environmentis it uh dev or prod so when you willcreate your custom resource you will beonly able to choose between dev and prodagain if you put something else thecustom resource will not be accepted byyour APIserver crts are definitions and againwhat I say custom resourceare for now they are valid your clusterwill accept it but nothing will happenyou have to assign something to it andAbby yeah at this point it's just anentry in a database it's not anythingrealized the thing that helps us takethose definitions those assigned actionitems is a controller so a controller issomething that listens for these entriesinto the database and very specificallya certain type of kind and then takesaction on it so let's jump into what acontroller is because at the end of theday a controller is just a applicationit runs as a pod so we know how our webapplications run as pods this just runsas a pod and what it does is it watchesthe state of your cluster the datathat'sincds that are occurring across yourcluster and then it actions what it istelling it to do so the controllerlistens to a trigger resource and thenit tries to take that trigger resourceand action it in the way that it issupposed to so the to-do list becomescreate something update somethingconnect something and that kind of thingdelete something etc and so this isreally just an eventbased application solet's have it a look at it in practiceso here we're going to look at somethingthat is like a built-in type withinKubernetes a deployment so if you'vetouched Kubernetes before this is one ofthe resources you may have have alreadyhad experience with so the deployment issomething that sets a specification fora set of pods a template for that set ofpods and it sets the number of thosepods it wants to run this helps you withresiliency by having more than one podrunning the same thing so we have thisdeployment resource which as of rightthis second is just an entry into adatabase but now we have this deploymentcontroller when that entry happens allof a sudden the controller kicks intoaction and creates a to-do list itsto-do list is actually to create a newreplica set so a replica set is sitsbetween the abstraction of a deploymentand the abstraction of a pod it says ittakes that template from the deploymentand it transforms it into the nextresourcethen we have more controllers who listento these types of resources and continueon which can create a huge chain if it'ssomething that you're not familiar withbut actually can make a lot of sense ifyou have any experience with the Unixkind of framework and style of thinkingwhere you want something to do somethingone one thing and do it well and that'swhy we have a deployment re controllerthat creates a replica set which createsa pod when that's all you even wantedwhen you started with your deploymentdefinition allright so the next thing is we've beentalking about why Kubernetes is sopowerful as a developer extending it andusing the API and we've learned how tomake custom resource definitions so newentries into the database but now how dowe react to them well we make customcontrollers so I'm going to look at acustom controller that I help build butthese are true for pretty much everyproject in the CNCF the way that itworks with cretatics is that we have acustom resource called a promise thepromise is different from any otherresource in Kubernetes because insteadof focusing on deployments or pods oringresses as we were describing itdefines a platform service so what doyou want to provide as a service in yourorganization s*o it has its ownspecification that helps an organizationdefine that it becomes an entry in thedatabase kratics the controller thenwatches that and creates its to-do listone of the things in that credic to-dolist is that if the intention is tocreate a platform service a self-serviceondemand API well we need an API forthose requests to come into so one itemon the to-do list is to create anotherCRD the service API that was defined inthe promise the other thing in the to-dolist is to create the backend you canthink of this like the controller forthis service this controller is thenlistening to the new CRD type so whatwe're doing to extend Kubernetes iswe're removing or introducing anabstraction that takes that idea ofcustom CRDs or that's duplicate languageisn't it say it all the time but customresource definitions and customcontrollers and we're pulling thattogether into the concept of a platformAPI so coming back to the basics of acontroller controllers are eventdrivenapplications that run as pods in aKubernetes cluster that is it so ifanyone is making you feel silly for notunderstanding them they are prettysimple stuff uh that you can getinvolved with those controllers are ableto watch resources and then manage otherresources they tend to watch one andmanage another one because otherwise youget into circular issues right so youhave a trigger resource a set ofactivities that the controller knows howto to we say reconcile to bring togetherthat defined state with reality andworks on those managed resourceKubernetesbut before we close up the talk we wantto go through a few of the myths that uhI think are prevalent around thecommunity when it comes to CRDs andcontrollers and the first one is thatcontrollers are only for Kubernetesresources so I'm going to go straightback to that diagram we were justlooking at and talk about the fact thatKubernetes that controllers do run inKubernetes because they use the API asuh a dependency for themselves but andthey can manage things in Kubernetes butthey also can manage things outside ofKubernetescrossplane uh all the cloud providercontrollers uh that come from them theseall do a great job of calling APIs thatare outside of Kubernetes from thatcontroller to take action so there'snothing stopping a controller frommanaging bare metal we we're doingthings on mainframes and edge computeand anything VMs and anything else thesecond thing is is that it can evenwatch non- Kubernetes resources now thisstarts to add a little bit of complexitywhen you're talking about offcluster sothis is a more advanced use case in alot of ways but it's very much done andit's very and there's lots of examplesacross the community that you can golooking for and learn from if this isimportant to you so the thing aboutcontrollers and CRDs and why they'vetaken over Kubernetes has taken over theworld as an API server is because itdoesn't limit you to Kubernetes but itgives you a single control plane asingle way of working across all of theinfrastructure in your complexenvironmentoh so the last thing is it can be on andoff the cluster that's that's the totalslideuh yeah okay uh the other myth that wewant to debunk is hey goong is the onlyway to work with kubernetes and to writeyour controller your operators okay so alot of people think that because yeahkubernetes is built with golang most ofthe first operator that we saw werebuilt withgolang but again a custom controllerit's really important to understand thatuh a custom controller is nothing morethan a pod it's your pod and what youput in the pod it's up to you so you canreally use anything you want to createyour custom controller we can see herethat there areeven controllers uh built with shellokay Python 0% is funny uh I even sawcustom controller written in JavaScriptokay uh if you want to have fun ofcourse a few examples uh the hypelanguages of course you can write yourcustom controller using Rust we have thecube uh Russ i don't know how youpronounce that rust yeah rs rust okayum you have the uh Pythonic framework ihave never tried it but uh because I'mnot a Python+ist but there's a greatframework to build your operators usingPython uh there's a NodeJS operatorframework and personally the one that Ireally love because again I'm a Javadeveloper is the Java operator SDK andit's great because those SDK came a bitlater than the Go operator SDK so theycould learn from the mistakes and theycould also learn from what works well ornot well and uh the Java operator SDK asa Java developer make it really I I loveto speak about developer joy and whenyou start building an operator using theJava SDK it's really a pleasure you haveso much developer experience uh youwrite your POJO that is converted to aCRD that is automaticallyuh applied to uh to your cluster youjust have to implement the event loopcontroller method it really makes yourlife easier so if there are any Javadeveloper in the room and I'm sure thereare some okay uh give it a try again youcan build your custom controller withwhatever you want you're just ignoringthe shell up there hey you know oh yeahignore it even Bash even that yeah coolso the next myth we want to talk aboutis that you just because you don't knoweverything you don't know enough to getstarted so one of the things about beingat CubeCon is that it's such a largeevent with people at all differentpoints in their journey and everyone'slearning slightly different things asthey go and at slightly different pacesand with slightly different vocabularywhich just adds to the confusion and thething is is like how many people haveheard the term operatorbefore okay and keep your hands up howabout we'll go downcontroller okay how many people thinkthat these are the samething all right how many people know whythey'redifferent okay cool so we had almost theentire room knowing hearing about bothcontrollers and operators we have a verysmall number of people being like "Yeahthey're the same." And a very smallnumber of people being like "I know thedifference." Well the thing is is it'slike squares and rectangles so if you'regoing to get into a debate aboutcontrollers versus operators it's a bitlike getting into a debate over Emacsversus Vim or something else that isjust that kind of esoteric technologydebate right there are nuances andthey're important to talk about and tounderstand in the fullness of time butdon't let people using one or the otherstop you from getting involved in theconversation given the knowledge thatyou've picked up over this week andhopefully in this talk the operators arelike squares they usually talk aboutoperating specific software so the lifecycle management of creating updatingand deleting and there's actually anoperator framework that was put out byRed Hat and that's where the kind ofterm really gained popularitycontrollers are the more generic podrunning on Kubernetes that responds to aresource and affects another resourceright so you got squares and you gotyourrectangles um I have the slides that goalong with that sorry about that uh sothe operator is the operationalknowledge in software so as I said thatlife cycle management and the thingabout life cycle management andoperations is that it's a bit of ajourney so this is something that thiscapability model came out with the redhot red hat uh framework and it talksabout the fact that when you're thinkingabout operations whether you're writingthat in into a Kubernetes operatorotherwise you can start small and buildyou can start with installing then doupgrades then deal with life cycleinsights autopilot you know taking thehuman out of the loop the same thinggoes for learning about all of thesethings writing a controller that autopoppopulates and updates ated image tag inyour pods or you know we wrote one atone point a few years ago that um autorestarts a pod when a config map changesyou can write really small things justto get a sense of how the the worldworks and how the system works um beforeyou start trying to to run with yourentire uh you know organization runningon a set of custom controllersbuilding on top of Kubernetes is onlyfor vendors another myth you think thatoperators are only I don't know SASproviders cloud providers that providesyour uh operator so you can interactwith it um and again that's not true umand here is just to show you uh how muchuh projects are being adding over theyears to the the thethe CNCF landscape but again if it makessense for your organization to writeyour own operator just do it it's notjust for vendors start building oneoperator start building two operatorsthree operators if you have too manyoperators usecritics anywayum we know that platform engineering isreally hot it's the hot topic uh wellcurrently it's MCP but let's talk aboutplatform sorry there's no MCP in thistalkum when you will start building yourinternal developer platform okay CRDsand controllers are crucial and what I'mgoing to tell you here is it's a bit myopinion it's quite opinionated but uh Ithink that the source of truth of yourinternal developer portal should be CRDsthe way you model your platform domainmodel should be CRDsCRDs are APIs okay treat your platformas a product your product has awell-defined domain it has APIs doexactly the same with your internaldeveloper portalokay one of the thing that you want tooffer with your internal developerplatform is offering selfservice to yourplatform users we want to make yourdevelopers self-sufficient autonomousand providing a set of operators andCRDs is a great entry point you canstreamline exactly what you what youwant to expose to your developersbecause you are in charge of creatingyour CRD okay you can abstract awayanything you want and as Abby saysdoesn't matter if it affects stuff thatis not living in your cluster youroperator can operate stuff outside ofyour cluster so even your legacysoftware your SES vendors that you areconsuming they probably offer youoperators put everything represent allof this as CRDs and uh you can see heremy design skills in the end what youshould end up with is having a main uhbase platform that operates all yourother platforms because that's anotherlie every instance say hey I want tohave just one single internal developerplatform believe me you will haveseveral in your organization includingSAS vendors that you are consuminghaving this approach of CRDs andcontrollers is a great way to harmonizeyour internal developerportal um wow we are pretty good on timeinconclusion Kubernetes is not just acontainer orchestrator scheduler even ifit's really good at that and I love thisuh feature but you can do way more thanjust scheduling containerscrds are just like database schemas ijust wait for thesounds like it's a good one hope you allare okay being here and it's truebecause in the end it will end up in thedatabase crs are instances of a CRDschema you can also use my anology crdsare Java class definitions and CRs areinstances of this uh Java uh classdefinition controller andapplications are to oh that's the otherroom probably controllers areapplication that subscribes to CRD typesokay really important controllers reactwhen something happens with your CR thatis being applied updated deletedanything it's really powerful becausecontroller will always make sure toreconcilate the desired state that youdefine in your customer resource and thereality of your cluster and again itsounds pretty obvious but this is such apowerful feature uh that reconciliationloop and uh do I have do we have anotherone no no i think that's a that's athank you for coming and and chattingwith us uh the CNCF loves to hear backfeedback about the talk so please usethat QR code to to share your feelingsabout this talk and also please justcome talk to us we'd love to hear uhwhat we could have gone into more whatwe can do to help um and also please getinvolved in mentorship if you can withinthe CNCF thank you thank you so much[Applause]i guess the we should say that both ofus build both of us have built and arebuilding controllers and operators soport has a great one that connects yourKubernetes clusters uh into their portaland Kratics is obviously built as acontroller so come by the booth in thelast couple hours if you want to hearmore about it thank you2025-04-15 21:58:00.800199 ��SD#��]AFqUPqroF-Rwthank you so much for being here i knowit's Friday the last day i hope so faryou had a great event so far and onceagain thanks thank you so much for beinghere before we start I'd like to do aquick question for you make a quickquestion for you about do you know howmany CNCF projects security projectsexist today i'll give some hints someoptions to you to think about it sobetween these options what do youthink before you provide the answer holdon let me provide some an image thatwill help you to to make the the mathfrom your side so this image is aboutall the secure CNCF opensource projectsso now it's easy to know the optionright so let's check who thinks that thethey the answer is letter E asorry no one okay one thereB okay a lot of people what aboutC okay i think B is the winnerso the letterC and to be more precisely it's 80 andcounting it's growing fast it's not easyfrom the image to get this number youneed to count because the the squaresare not the same size my name isHickantana i'm principal cloud supportengineer at AWS i work with uh customersfor ECS and EKS for about five yearsalmost six year now i'm based in Dublini'm Brazilian and together with me it'sHello I am Bruno Silva senior solutionsengineer at CYIC i'm also Brazilian livein Dublin a little bit about myself uhbefore knowing this Kubernetes thing iwork a lot with no corporate leashlegacy let's say technology likeMicrosoft products and since then Idiscovered this open source andworld I shift to this and welc-��8C#��'AdVM20108SRcgood morning everyone how are you doinggood yeah last day so few hours and thenyou can crash okay anyway thank you somuch for attending our talk we will betalking about CRDS controllers let meintroduce myself i'm Sebastian Blondeyou can call me Sebie i'm a developeradvocate for port and developer uhinternal portal and uh Abby hi i'm AbbyBangzer i'm a principal engineer atCentaso uh where we're building aframework for building platforms and alot of what we're doing is trying tomake all of this a little bit moreaccessible but today isn't about eitherof our products it's about what makesKubernetes so powerful in so manydifferent contexts and the thing is isthat there are different modelsinteraction models for Kubernetes andsometimes when you're talking to someoneand you're confused about how they useKubernetes it might be because you'retalking across these models so I want tocall them out and call attention to themso the first is the user personality andthis is people who who might use cubectlwhich is the command line tool to beable to see what's going on in thecluster but the big thing here is thatwhat they're focused on is what's whatapplications are running in the clusterwhat are the pods that are running howdo they get logs from the softwarethat's running in the cluster and thatkind of thing the second interactionmodel is around the administrator andthis is people that probably definitelyuse the CLI tool uh but they're going tobe focused around different things thatthey care about they're moreinfrastructure oriented they're lookingat administrating a cluster so how dothey create the nodes how do they dotaints how do they do upgrades ofKubernetes itself and they may besetting standards for Kubernetes aroundthings like arbbacks or role-basedaccess control and other things thethird interaction model is the developermodel and this is probably the least mothe least spoken about here at CubeConor just in general and often thesedevelopers have a background as eitherusers or administrators or sometimesboth but what they're looking at in thisrole as a developer is they're lookin'.ome tosurviving the two picking the right toolto secure your cuberneteshabitat so if you read uh thedescription of all your talk you may bethinking how can we correlate technologyand you know open source tools well Ihope our presentation uh make this alittle bit clear so starting with youknow the animal kingdom we have someanimals that are very similar even insize in shape but they're not the samelet's get for example like a tiger and alion but they overlap in some things youknow kind of the family etc but theyhave some social ways to do or differentstrategies to hunt and this is where Ithink the tools in open source space aswe saw 80 at least well when the slidesmaybe a new one was announced duringcubecom and this is how we want to getthis uh clarify in thistalk we don't have time of course tocover all the 80 tools otherwise wellit's going to take a long time andhonestly myself I know don't I don'tknow all of them top my head and whatthey do so we select some tools thatbased on our experience and communityusage so GitHub uh stars or blog post oreven you know uh cubernetes uh cubecontalks and based on those we kind likeokay how can we structure this in a waythat I can understand the flux or theflow where the security tools fits inthe security space as also the idea hereis get a little bit of oneonone idea ofhow the secure around kubernetes can beused for those who decide to get fourkind of stages build deploy start andrun and this is where we're going tofocus today but of course there's waymore as we saw the number ofprojects starting in the build phase iheard so many times so many times thisactually was why I discussed with Hickeyto create this uh this talk is hey canyou use the project XYZ to scan mycontainers for vulnerabilities and thenI kind of step back and say "Oh you meanyou want to scan your container imagefor vulnerabilities." This is a very uhI would say spreads misconception and Iunderstand the person wants to say theimage but we need to be very clear whenI talk about the image is a green squarewhere is like we have the layers so ifyou like docker file before to create acontainer for example we're going tohave the layer one from Alpine so thebase OS then I'm going to copy aconfiguration we're going to run APK adso basically installing something andthen at then I'm copying my app directlyyou can see that the layers are betweenbrackets O like read only so this is astatic file so the image is veryimportant to remember is something thatis a static when I teach in my companyespecially when I talk to people thatstarting the security journey I like tocompare this when like you do a recipeof a cake you got a recipe you're goingto start with you know the cake pen solet's say the Alpine and I'm going tostart adding products over it so I'mgoing to add sugar egg milk etc and thenwe're going to go to the build phasewhen I'm going to you know like bake thecake itself then we go have you know theblue square the container when you'regoing to have a new layer which calledthe runtime layer or sometimes referredas a sandbox layer which only existduring the time that the container isalive and this also we change thebehavior to be readrights all container images they arebased on operational system on this casewe're using Alpine as base so you cansee here like the file system of Apinelike /bin mnt proc root etc so thingsthat my application will need to workand then makes itjob so once again image or containerimage is based on layers so going backto the cake idea if I get something fromsupermarket directly from the shell Ijust basically trust that product to addon my cake but I may don't know exactlywhat is inside very similar in thecontainer or Kubernetes world we justdownload packages or use you knowsources that maybe are not so trustableor we don't have the uh knowledge ortime to inspect them so what is you cando in this case is thinking aboutscanning this image so check what isinside for this case the tool that weselect for doing are very well known inthe community is 3vy and simplifying thetree uh steps we're going to star/t withextract the layers so get those layersthat I explained before then we're goingto build the file system so you knowextract and go similar to the image thatI show before third part identify OS andno packages so kind of made distinguishbetween and the check we're going to getthis value and cross check with thevulnerabilityDB on this case what I'm saying is heyget specific product or you know againstget something for the shelf let's lookthe composition of it let's try identifyif the shelf life is okay or in thiscase the VDB will have the CVS thatmatched with a specific package orlibrary when we run 3V this is how uhresult looks like we can see that thebase O base image here the Python 39 etcit has zero vulnerabilities but somePython packages they have vulnerabilityso in total three and informs directlythe package this is one of the ways thatif I have the image built I can getextract the data from it once againtrack the layers build etc and then fromthis I can get you know what are thepackage or the vulnerabilities beingaffected on specific so this is one ofthe open source tools to use in thebuildstage and well which animal can wecorrelate with this well actually was alot of interactions between ourselvesand we come to the conclusion that maybea raccoon is kind of the best betterfits it especially because you know araccoon has something that it scavengesfor looking you know for food orsomething and he knows for differentlike hey this is plastic this is foodthis is something I want so he has theidea to you know open the layers so openthe trash open our trash can to knowwhat's inside and then get what hewants okay moving on for a deploy phasewhere you are hitting the Kubernetes APIto change something like to create a podor delete something or modify somethingso for this uh it follows a a specificpath so it goes to the admissioncontroller which basically is acheckpoint that inspects all therequests and enforce the defined policythat you have like for example you canhave a a company policy that's sayingthat you cannot allow X Y and Z so thisadmission controller is the gatekeeperthat make sure that this uh you you thisis a checkpoint that will make sure thatthe rules are are compliance with thecompanyso and after that with all therequirementenforced the objects can now been bestored into the etc so basically thethis is the workflow for for everyKubernetes API that every modify orupdate or delete Kubernetes API that youare making for the Kubernetescluster talking all about the projectthat we select for this stage uh weselect the CNO that it works as adimission controller for Kubernetes itworks with YAML file which is the sameuh language that almost everyone that isusing for creating the the resource likethe pod definition deployment definitionso you don't need to learn new things touse cavo which is very good for that uhthere are some policy code that youbasically need to to flip the languagethat you are using and this is good forsomeone that is uh familiar with alreadythe kubernetes world so it's it's verygood for thatum so there are two things that for therequest the cavern will do it can domore than that but this these twovalidating and mutating are the mostcommon ones so first it mutating sobasically mutating is a on the fly uh onthe modification that for the requestthat is uh is making uh it will make thechange on that that object to moveforward so for example let's say thatyou have a policy in your company thatis uh make sure that you have a memorylimits for your p definition right sothis is an example that you can do solet's say that the developer forgetabout that and try to deploy a podwithout this memory limit so the cavernoum mutation workflow will make sure thatyou assign a minimum amount of memoryfor that resource before move forward soit's not it will not block the the themodification it will just change it andmove forward the next step stage is thevalidating so the validating is now is aokay if you are not compliance you notnot be allowed to to move forward andsave in the TCD so an a validatingexample would 0be for example if you havea cluster that a production cluster andyou have a company compliance that youneed to have a label on your name spacethat you arecreating a validate uh hook that youbasically make sure that for that namespace that you are trying to create itshould have that specific label withthat key and on that if it doesn't itdoesn't have it will block and not allowto so it avoid a lot of mess if you havethese uh two things there are two morethe generate and clean up that for moredetails on that I would recommend to gofor the documentation because is it'snot so easy use case we are at 200 levelhere so these two are more most morecommon the validate and mutateones for the analogy that we arethinking we are thinking a vero is likea a forest guard that you for example ifyou are going inside a forest at nightand you forget your lantern the forestyou make sure that you bring the L so itcan provide the it's the mutation onethat I just speak about so it canprovide the lantern for you to go insideof theforest at the same time if you are goinginside of the forest with something thatis not allowed it to block you to goinside of the forest so it's it's apretty much align with the cavano thingthat I just spoke about so it's analogythat I will bring toSo moving to the start phase this isjust before the pod is running so it'sit's once you the deploy is is the stateis saved on theTCD and before the pod is running thereis a start phase that basically it's forENI attachment and as secrets forexample that you are using so fortalking about specificallysecrets it'suh it's about special object that thekubernetes is storing uh sitiveinformation so you can think aboutcredentials tokens keys certificateswhatever that you can save intokubernetes so you can use a secret tosave these objects instead of puttingdirectly for example the password insideof your pod definition so with thatconfiguration your pod instead havingthis uh hardcoded password it will mountas a volume or as environment variableinside of the pod and your applicationcan read safely read this uh informationuh without being uh the the descriptionof your pod with the informationdirectly so it will save on intoit so however there are some well-knownlimitations with uh the Kubernetesdefault secrets on Kubernetes forinstance there is no built-in uhrotation mechanism in place so basicallyif you are trying to change yourpassword it will it will be a bitchallenging also there is no versioningabout the the like the change that youare making you cannot like easily rollback at least not out of the box andalso to integrate with GitHubs and otherpipelines workflows is not also a a easyway to dothat one more thing that we we we seethat is missing here is aboutintegration with external vaults likesecrets manager um hash corp and so onso there is no integration by defaultwith the the secrets onkubernetes so that's where externalsecrets operator comes to the picture soit it brings the secrets values fromexternal resource like in this exampleas secrets manager into the kubernetesso it's it also supports others as Isaid parameter store hoscop vote andothers uh other external vote systemsbut looking on how it works with AWSsecrets it will fetch the secrets fromAWS using the EM role that you have soyou basically configure the externalsecrets uh operator with your ERM rolewith permission to fat the secrets fromthe the secrets manage and also itperiodically syncs this with the theexternal secrets and keep the secretsinside of the Kubernetes synchronized sowith that you have the all the the thegaps that you just mentioned about theKubernetes secrets so you have thebenefits of automatic rotationversioning uh that is also easilyintegrated with the GitHub because withGitHub you can check the the secretsfrom secrets manage as well so it's it'sa pretty handy and and a good way tomanage your secrets uh with Kubernetesand also integrates with external uhworkflow so thinking again about theanalogy here you can think about thisexternal secrets as a skill that gettheir resource from outside words and1bring it for its home to consume whenthey need so similarly the externalsecrets you get the resource from theexternal secrets whatever you are usingbring into the kubernetes cluster andmake it available as external object foryour pod your application to consumewhen theyneed so we scan our images we have anemission policy in brace we have safetsecrets management we are safe rightrightwell now finally we came to the finalstage of this which is going to be theruntime and yeah the colors are reallyto be again I know I said this six timesalready or maybe more image greencontainer runtime blue and of coursewe're in CubeCon we're going to go onemore level and talk about pod whichneeds to have at least one containerrunning so if I need to observe or scanlet's say the container when it'srunning how can I do that well I coulduse some you know instrumentation someSDK or something but this creates a lotof overhead and especially I need toinvolve maybe a developer to be part ofRight so if you think how a container ora pod works in the kubernetes world it'sbasically he needs like hey I let'ssuppose I have like engine x containerand then container I needs to you knowwrite access logs to a specific pathwhat he does is the pod will know usinguh docker and other runes go usingsystem calls to talk to the kernel andthen the kernel will do make the actionknow to write to disk etc so this is howthe application works important tohighlight that this if it's a pod or ifI have a you know a bare metal orvirtual machine with my application uhrunning that directly it's going to workthe same but again we are at cubeconwe're going to focus onpodway so if I need to follow this howcan I do that so the tool that we chosefor this it's focal so the foco has theidea of using ebpf you probably haveheard about this uh during the cubecomif not there's lot of talks I suggest totake a look but to simplify eBPF is likea bridge away to talk to the kernel in avery light light way but also very safeavoid us to using like kernel modulethat if you use kernel module before andby mistake you did some wrong parameterand you saw a kernel panic you'reprobably going to want to know to usethe newversion and how the fog works in thiscase is he's connected to the same placeso instead of try to you know be insidethe pod or being you know a sidecar podit's actually try to do like a mirrorlet's say during the kernel using ebpfso whatever the application in the podis doing falco is watching and then hecan react or make events of it so talkabout Cisco calls if you saw this beforeif you got like let's say 5 10 secondsof Cisco call of a Kubernetes nodeyou're going to have like thousands oreven maybe millions so how can we getall of this data and transform this oninformation something that we humanconsume information so in this for thiscase FOC has something called rules therules are a logic or a way to get all ofthis massive dump of sys call data andinterpret them to then generate an eventgoing back to my previous example let'ssay I have you know read read and writecalls all the time that well actually wehave but I don't want to be notifiedevery time someone reads and writes butif someone or some process is writing tofor exampleslashtmp I want to be notified and thisis where the falco rules come in placeyou can define a specific score specificbehavior to a specific path or programthere's several ways to doAnother very famous rule that's beenused uh in Falco is detecting if someoneopen a shell in the container becausewhen you think about shell in thecontainer most people think oh yeah Ibypass my application using avulnerability or something like that butmaybe I have access to the node maybeI'm using a docker or you know I'm usingcubectl I got a credentials leaked forcubectl and got there so I need to havea way to find this and this is wherefalco comes inplace if you're still Not clear why dowe need runtime you may saw last weekabout this in the security world theingress nightmare if not type on Googleyou're going to find lots of resultswe're not going to go in details herebut the thing is we had like aconversation one of the exontainers ofthe Kubernetes uh engine x ingressengine x and he explained details thatthis was something that happened atruntime so if we scan with tri we didn'tfind a package with a vulnerability ifwe try for example use cavo to denythere was no kubernetes it was insidethe engine x itself configuration so theonly way we can get this here is aboutruntime the behavior that happened in myengineext ingress pod is something weare able to detect and why this is sodangerous just to give a little bit ifyou didn't see it basically the enginexingress controller has access to allsecrets of your Kubernetes clustermeaning everything we did safely to keepan external vault and sync in Kubernetesit goes to this engine X ingress sosomehow I'm able to exploit this I needto know so this is why I really need aruntime uh security technology for itwell this was the easy one to do justwas some interactions with you know AIto get a kind of nice image and the easyway I want to highlight this this isfalco is actually the Italian word forfalcon so very easy to do it and similarto falco he knows where to go to attackattack a prey because in his brain thefalco rules know that hey this is a rockdoesn't make sense but when I see amouse running it's time to go so this iswhere I know the action needs to happenand then I'm going to react toSo the main takeaways I want you to gohome with is please remember scan imageis different than scancontainers the best threats can beenforced so just because Kubernetes outof the box doesn't have a way to preventpeople to don't follow my orders let'ssay as a platform engineer doesn't meanthey're going to just allow to dowhatever they want so cavo or other uhtools can help us a lot on that keepusing external vaults without usingkubernetes way i think something thateven uh hiki mentioned is very importantwe tend to think that kuberneteskubernetes but if you're running let'ssay in a cloud provider we may have anEC2 instance or other products they alsoneed to share the secrets so how can webe sure that the same RDS password isbeing synced between my EC2 instance andalso sync between my Kubernetes podsextra I want this to uh give extra workto my developer so he knows hey get fromsecret in my ammo just works he doesn'teven know where the secrets coming fromand for sure the fourth runtime to therescue if the previous barrier theprevious projects were fail or maybefail is not the correct word they werenot designed to so if you go back againto the animal kingdom comparison there'sdifferent types of hunting differentskills and they are designed forspecific specific animal to survive andthen evolve and then we know new toolswill also show up to cover other placesother use case something really reallyimportant that's not written takeaway iswhen we were uh writing this uh talk ortalk discuss about the talk it's veryhard to find oneonone content forsecurity i work with security for someyears now so of course for me it's likewhat's easy but if you go todocumentation of some of these projectsthe documentation is awesome a lot offlags how to do this how to parse thatbut okay I'm a very beginner how do Istart with this what this is doing forme what what stage should I use this andthis is where you know create the ideafor we beinghere we have some resource learningdon't worry you don't need to try toclick the hyperlink uh sitron in yourplace we're going to share this in theofficial CNF place now one small ask Iknow you have a smartphone in yourpockets I'm sure you have it so this isthe care code for doing feedback I'mgoing to keep this for like three fourseconds if you want this is veryimportant especially also to providefeedback connect to CNCF etci say hey you you please doit and if you want to connect with usthere's a you know social network calledlinkadine you may have heard about soonce again trust us the care code safedon't need to worry you should just scanand good to go thank you so much if youhave any questions please let us knowthankyou thank[Applause]2025-04-15 21:58:01.2945903eallyyou know augmenting and helping the TOCin accelerating some of those uh youknow interactions the other part alsowas that from an enduser point of viewreally bringing uh to the TOC as well asto the projects uh feedback from the enduser community which is you know againuh really uh a measure of success forall the projects and the technologyinnovation that is happening in the CNCFprojects and uh also being able to givegive back uh project feedback in termsof what works and doesn't work for endusers and last but not least uh whenthere is an uh concern uh in terms ofrolling out uh you know this technologyin production because at the end of theday um you know the end users are theconsumers of all the technology that isbeing innovated upon in the CNCFuh to give that feedback back to theprojects to the TOC and work together asa community to uh you know impact theroad maps of projects sometimes to uhhave that alignment so there's again youknow a complex interdependency but yet avery collaborative way of working andgetting that feedback together uh Katiedid you want to add or Joe go ahead gono I I was just going to say um I I IAlita you kind of hit on some of thethings that I think are really importantwhen I stepped into this role you know Ireally was like there I've been an enduser at a lot of different levels rightnow I'm at a larger company but I'vebeen at smaller companies and you knowone thing that was really key to me isjust being able to find my way inwhatever that whatever that size companyis and helping end users to be able tosee how they drive impact and I thinkthat's one thing that if we have a goalfor I'd like to see out of this is youknow helping so that wherever you're atwhatever size you are you find thoseentry points as well as like driving itinto the projects themselves i thinkthat's going to be really key katie andI actually talked about a little bitabout the future of the CNCF over thenext 10 years and I think that's one ofthe things we think is critical that'sgoing to help us be sustaining and tohelp this thing to grow uh one one lastthing i I think we have one other teammember here who's key we had somebody inthe past Taylor Doleazal i know Brian'sover there hiding out that's one of ourkey support persons i just want to callhim call him out because he's we'rereally excited to work with himgot to stand up so we canThere we goi'm Yeah so I I joined the tab recentlyand one of the reason that Ispecifically joined is to break thesilos in in some way to like feel thatwe are all working as in our companiesin like in a similar problem uh onsimilar projects but how to likecollectively be able to like make animpact as Joe was saying like how to getthis back and report all of our feedbackback to just like the CC CF and saythese are the projects working for usand how can we make improvement on theseprojects so to be able to have a voiceand understand the ecosystem better Ithink that's one of the great way ofhaving a line of communication betweenend users and the CNCF the TOC and likebeing able to share all of that feedbackthat we get from working on problemsthat hits our industries but also someof these problems we believe as I'm anengineer by nature so sometimes I wouldbuild something I think like I'm theonly one who build it but like a lot ofpeople build the same thing so likewhere we can like drive impact togetherto make the community and the opensource as a better place for all of usso you all have sorry can you saysomething no I I think this is like avery very good representation of whatthe tab is doing and perhaps to put itin perspective within the from anecosystem standpoint end users havefeedback they have the real use case ofof all of these technologies and theyknow what works what doesn't work howeasy it is to to engage with thecommunity or not and I think the mainidea for the tab is to give end users avoice or an input into the ecosystem aswell because it's been very dissipatedso far it's been very uh dispersed ingeneral because only end users who wouldgo to the project directly and engagewith them dir4ectly would be able to moveforward things but now we can do this aspart of the tab and we actually canintegrate with different processes andwe have what what's important about thetab is like we have people that you caninteract with you know how to engagewith us um and we definitely will beable to perhaps channel this back intothe ecosystem as well so I think it'skind of this public facing interfacethat's very important that has beenmissing and kind of at leastconcentrating some of this effort intohow we're collecting this feedback soyou you've sort of like all touched onsome issues of that end users have hadwith sort of engaging the ecosystem andthings like that what would you say arethe key challenges and you knowpotential tab initiatives or things thatare actively going on that might helpwith this sort of thingum yeah so uh the last year we had acouple of areas that we focused on uh Ithink the one that was the mostsuccessful was around referencearchitecture so this has been a verylong uh discussion topic in thetechnical oversight committee uh but theexperience there clearly showed thatthis this information should come fromend users not from the projects uh or orany other body so it was decided thatthis would be uh one of the workinggroups that we would form joe has beenreally active on it so he can talk moreabout it the other areas are aboutproject health um we heard in differentsessions during the week that we have achallenge which is we have this maturitylevels but we have a challengeunderstanding we we put a lot of effortwhen uh the projects move between levelsbut we don't have a similar effort yetwhen the projects remain at a certainlevel to ensure that the expectationthat was true when they moved is stilltrue uh today so this is where webelieve that the end users can reallyhave uh a very large impact from theother perspective which is to give toolsto the end users to uh improve their ownprocesses internally uh there will be alot of uh um work during the year togive you the tools to create your ownview of the cloud native ecosystem it'sa very hard to navigate landscapebecause it's so large uh there aredifferent views of uh the the maturityis only one metric that you might have alot more metrics internally and what wewant is to hear everyone and try to givethe tools to the end user so that theycan build their own uh landscapesecosystem with the additionalinformation they they rely on internallyi was going to say yes so definitely thereference architecture has been one thatI've been really passionate about ithink you know when I look at like thethe landscape and I know we see thememes and the jokes and everything aboutit because there's like so much and it'sjust you know it can be confusing but II find it really interesting when I seelike organic projects stir up likethere's a project like canoe wherepeople are putting building blockstogether and what what it's telling meis that we're all looking for ways tofind patterns or ways how we can makethese things work together ideally if westart to like as as Ahmed mentionedearlier we find ways to collaboratebecause we share some of the sameproblems it's similar like in an opensource like it's always better when youcan share something with a hundredpeople than just doing things on yourown or find lessons learned also I likeit when I see like projects like Arggofor example which was an end user-drivenproject i I get excited to see thatbecause that to me tells me that we'recreating like this ecosystem thatenables this type of innovation whereend users start to make that journeystart to turn these things out soideally hopefully that works and then Ithink Bobby and you and I we we bothshare a background in the Kubernetescommunity the one thing that's differentfrom the Kubernetes community versuslike the CNCF in general is that youknow over the 10 years there's astructure right we we kind of know howto we know the enhancement protocol howthat works I I also feel responsible forit but guess what we know how it workswe can show people how to get featuresin ideally we can kind 5of do somethingsimilar in the CNCF because there's alot of projects some have differentgovernance different structures they'reat different levels but if we couldcreate a way that could maybe make surethat they're getting the right signalfrom end users the stories the use casesas a product guy so I'm obviouslylooking to make sure that we're we'rebuilding projects that are getting theright feedback so we help them so thatin partnership with the TOC maybe we canhelp these things kind of graduate ormove along get adoption so that's thethings that I think I'm hoping to reallyover the next year help to drive andwork on very quick because I know wehave two questions as well um the otherkind of not necessarily challenge butone of the gaps is what is what areother users using within the ecosystemand I think this is where tech radarshas been quite beneficial because thematurity levels provide a level ofcredibility however the radars as wellthey provide like another layer ofperspective of where this is where thisis used or how much it is used by otherusers so I think that perhaps is anotherinitiative I would like to highlightquicklyquestion please go ahead please go aheadhello uh I have a question so I thinkthe tib just collect the uh usersfeedback and it can also give suggestionuh and influence to the end user andwhen we see CNCF a lot of projects theykind of like tackle and solve similarproblems and how could TDA guaranteeingtheir suggestion is neutral and fair andwhat do we have a standard when weevaluate uh different projects and giveend user feedbacks yeahuh I think um whenum when you have different usersproviding feedback you know for specifictechnology right uh in an environment uhespecially like the CNCF and you knowKat Kitty alluded to the um complexityof the landscape today uh you haveseveral projects you know in a s in a alarger group of projects right which areworking on similar solutions ordifferent parts of a larger solution andwhen you're providing feedback that'sone of the things that we are looking atthat you know it's like how do weprovide that feedback in a morestandardized way back to the projectswhere that pro then can be you know uhreviewed by all the projects in thatspecific focus area and then weconsidered for you know giving back uhtheir comments or their you know theirprioritization in terms of whether thataligns with their uh project roadmapright so again I think it addresses partof your question but um there is noguarantee in one sense that all feedbackfrom end users will be incorporated bythe projects because it has to be alsoaligned with their technical you know uhroadmap and where uh they actually seethat uh technology evolving to andneedless to say you know what we want todo from the tab is to take uh thoseuh areas of you know recommendationsthat are coming from end users forspecific roadmap items and make that youknow visible as a general backlog toJoe's point where there is morealignment cross project right may maybejust to take the other part of yourdiscussion on being fair to the projectso the CNCF is has the principle of noking making for the projects uh thethings like the reference architecturesthat we talked talked about are reallyuseful because they put things intocontext and this is where what is reallyimportant is when we have the end userscoming forward and giving presentationsat Kukon they don't tell you uh theychose the project X over Y they tell youthey chose project X over Y and thereason why they did it and they relatethat to their specific use case sothere's always some context when youhave to make decisions the best areathat or the easiest area I usually talkabout is storage where you might choosea project because you need the highthroughput or choose another projectbecause you you you need the reliabilityand this is really related also to thedomain you're you're involved in soprobably the reference architecture willbe user based user specific but weprobably will also find domain referencearchitectures where where differentprojects might play a similar role forfor multiple reaso6ns so I think this iswhere the information will be veryinteresting if we get a lot of momentumfrom end user reports in this area wecan really build this knowledge and helpout everyonethank youso my uh question half suggestion ismostly about the reference architectureso um I'm very keen on making whateverarchitecture we have also publiclyavailable um and I've noticed I've beenhanging around the pavilion uh way toolong um that there are many differentlevels of maturity and levels ofunderstanding so people will usuallywant something they have a an outcomethat they've seen somewhere else andthey want the same outcome but theydon't have or don't want to invest thetime read the documentation makesomething that fits their use case soyou end up with the the hello world uhproblem where they will copy and pastethe hello world example then perhapssomewhat silently complain that itdoesn't really suit their needs and thenonly a small fraction of those will pushthrough and actually engage askquestions take the next step is thatsomething that it's not necessarily partof the reference architectures but isthere some way that as a community wecould communicate this referencearchitecture requires you to have thislevel of knowledge or skills orconsultants to help you actually makereal world use of this referenceso it's almost kind of like uh if Iwould give like an analogy be like youknow if you're going to go on a ride youhave to be this high to ride this thingyou know basically is what I kind ofhearing and I think back in the day Iremember there was like kind of a aphrase and maybe you might know this butwe used to use like gy when it was likeGoogle's infrastructure for everyoneelse and the point was kind of likeGoogle made something to show you whatthey could build but that may not besuitable for you but it was a patternfor you to kind of build off of and Ithink if I look at it that's kind ofreally I think that starting point iswhere we're at like you're right likethis is probably not going to fit whatwe we put out there but we at leastwanted to show how we built and designedit and then hopefully you canextrapolate from there maybe there's apotential for us to like take some inputon how we can improve maybe there islike hey give me a starter guide ormaybe some suggestions but love toexplore if if you have some ideas aroundthat yeah in fact I I think it uhactually refers to more in terms of whatyou're mentioning as a maturity modelright because again there are differentdimensions in a maturity model and it'snot only the technology referencearchitecture but it's also thedependencies in the ecosystem rightwhether that is experts whether that isyou know really level of understandingcomplexityuh as well as other factors so you knowagain that's a great way to also uhextend and expand upon uh you knowreference architectures in themselvesbecause it becomes more of a model atthat point than just a you know staticreference architecture in time yeah soinstead of being a blueprint like takethis blueprint and now you're us that'snot really how it works the same is ifit works for Google it is good enoughfor us well unless you're also Google itmight not actually be good enough foryou absolutely absolutely because thereare whole other you know dimensions inthe ecosystem that we really have toenable in order for that blueprint toworkgood questionhello yeah thank you very much um couldyou perhaps give us an idea of what lifeis like as a end user technical advisoryboard member how many hours a week doyou sync into the work what's an averagetask you would do in a week like what'sit likeyesso I just started uh like all I know sofar is uh so far like last week uhthere's a weekly meeting that like weattend together to discuss like specificagenda items that we go through butthere are other working group and workstreams that like you would put effortto so I don't want to like from myperspective my estimation aroundsomething like that would be betweenlike 5 to 10 hours that's where like Isee depend also on the size 5 to 10hours like maybe like a week depends onlike what working group or what effortsthat you are implanted in but I'm stilllike exploring so this is my first weekassumption ask me next year i might Youwere kind of thrown right in the deepwater the deep end of the pool on thatone yes so you or Katie or Yeah I I canI can say that you know having been uhuh on the tab for a year now uh it'sit's definitely uh you know uh at leastfour hours of a commitment every weekbecause uh think about it this way rightlike we started off this year and now wehave kind of converged into one meetingper week but we actually had uh spun upthree workg groups through the year uhwhich were about an hour long meetingsuh and and these were discussions youknow focused on three initiatives thatwe were actually um working on throughthe year one was reference architecturesone was project feedback and one wasproject health plus we actually also hadan general you know over a tab uhmeeting so that was about four hours ofmeeting time but then you have also someamount of work you know for an hour youknow for each related area uh that youcould easily invest for the proposalsfor the review of you know and providingfeedback and being thoughtful aboutother additions to those proposals so uhMS is actually quite you know accuratein terms of about 6 hours to on anaverage because uh however uh that beingsaid we have consolidated the you knowfour hours of meetings into one hourevery week uh we decided to go for aregular cadence and really beintentional about how we use the time todiscuss you know a specific area and godepth in uh but also come prepared tothe meetings with you know work that weare doing on the proposals or onfeedback and on any followup with theprojects or TOC or othersyeah so I I think the fair uh the therough guess would be around 10%uh of your week uh but what what we aretrying to do after the first year ofexperience is to mimic a lot of what theTOC has learned over the last 10 yearsuh which is um we we do need the weeklycheck that's really important to to beavailable for that but a lot of the workhas to work has to happen offline andthis is where you can scale out andreally be effective so the effort thisyear will be on having this moretimecoped and well- definfined uhdeliverables that we want to come upwith uh and to be to have the automationin place to to to be able to delegatethese tasks and do them offline uh thegoal being also to not be restricted tothe uh end user technical advisory boardmembers but to have uh some sort ofstructure similar to what the TOC haswith the technical advisory groups wherethey can really uh make use of a lotmore people and not only that open theaccess to this initiatives to a muchbroader uh community so that if youmaybe don't feel you have the time rightnow to commit to become a member orapply to become a member of the boardyou you can still do your contributionin some specific deliverable or helpreview a white paper or collaborate in awhite paper this kind of thing willhappen more and more so I would suggestuh if you're interested to to reach outuh propose an initiative or show yourinterest in a certain domain and we'llbe super happy to to follow it up aswe're just about out of time do you wantto mention like how people canpotentially follow that the end usercommunity and the tab or how topotentially get engaged yeah so the bestentry point is to go to our GitHub repofor the tab you have all the informationthere with the contacts uh the meetingtimes uh and we'll start having a publicmeeting very soon that you will be ableto attend as well uh the secondpossibility is to look for the existingend user groups in the CNCF uh there area few per domain we'll actually bereviewing those as well during the yearuh but those are also a very nice entrypoint they have regular meeting pointssome of them even in different timezones which are really nice for fordifferent people so the tab the tab willalso have the wickling meeting uhstarting to be in different time zonesso that it's more open uh to new newpeoplethank youthanks thank you[Applause]2025-04-15 21:58:01.879690 ��F#��uAV7HbBO_umOUhello everyone and welcome to Friday thelast day of CubeCon and thank you forbeing here for this live technicaloversight committee meeting which willreally amounts to a ask me anything orask the experts type of thing For thoseof you that don't know the technicaloversight committee is one of the threetop level governing bodies of the CNCFand is there to largely act as thesteward of the projects and set thetechnical vision for theCNCF Withthat here are members and if yo8��RE#��[AVU_vj2r3BgIhello everyone and welcome for the sortof techn end user technical advisoryboard ask the experts or aAMA the technical advisory board is oneof the top three governance bodies ofthe CNCF alongside the governing boardand technical oversight committee whowas just on stage a minute ago with thatoh whoops i'd like to introduce our tabmembers would you like to startintroduce yourself sure hi everyone i'mAlorita Sharma and I am uh leadingobservability engineering at Apple uhand also uh have been involved in theCNCF as a maintainer of the opentelemetry project uh and also uh inCortex uh very actively working also onthe tag but uh very happy to have beenrepresenting uh you know the interestsof the end users and some of theinitiatives on the end user tabcommunity so thank you for you knowagain joining in Joe sure i'm JosephSandal i am with Adobe i am also amember of Kubernetes SIG release kind ofabsent right now as I work on being aCubeCon co-chair and then uh I thinkthat's ithello everyone my name is Katie Gamjiand I work as a senior software engineerat Apple i'm also part of the TOC ortechnical oversight committee and partof the tab as well uh I'm Ricardo i'm aa computer engineer at CERN where I leadthe platforms infrastructure teams uhI'm also in the TOC and in the tab and Irecently uh chair thetab hello everyone my name is Ahmed Parsi'm a principal engineer at the New YorkTimes and one of the newest members soI'm still learning a lot from all of thetab group uh but happy to berepresenting the end user and the NewYork Times as well on the tabso do you want to tell us a little bitabout the tab and sort of what the taboversees within the CNCFokay I I can start and then others uhcan come in so the the tab actuallyexists uh since just over a year um itwas created to try to complement some ofthe roles that peoplewould expect from the TOC but the TOCwas not necessarily the best place to doit so everything that needs uh uh a lotofenduser point of view uh things like uhreference architecturesuh better interaction between projectsand end users uh so we we decided to toto try to create a new body that wouldcomplement the technical oversightcommittee and the governing board likeuh Bob mentioned uh but would bring uh amore end usercentric perspective uh tothe ecosystem uh so we've been scalingthe the the body in the last year soAlolita was the chair uh during lastyear and she did a great work getting usuh into into speed and understanding howto work uh better uh but yeahyeah I think sure um again I think thatum you know as Ricardo was mentioningthere are two aspects uh to the tabright that we started off with one wasof course that uh there are several youknow uh dependencies and andinteractions with the enduser communityuh and including end user members of theCNCF who actively participate on theprojects and the technologies that uh asend users we all consume and and r29u want todo a little intros or pass it aroundOkay Hello Hi I'm Karina Angel I'm thecurrent chair of the TOC and I work atRed HatHello I'm Alex Kerop and I work at AkamiHello Uh I'm Kevin Juan I work onmultiple projects and uh very happy tobe on the TOCHi I am Jeremy Rickard I am a co-chairof SIG release in Kubernetes and I workat MicrosoftHi my nickname is Dims I'm part of theKubernetes community and I work at AWSHello everyone my name is Katie Gamjiand I work for Apple and I'm also partof the tab or technical advisory groupwith RicardoHi I'm Facil part of the STO communityand I work for EricsonHi I'm Ricardo I work at CERN I'm alsoin the TOC and the end user technicaladvisory boardI also want to highlight we have one ofour newly elected TOC shadows hereRicardo do you want to please stand upand so people can seeit the TOC shadows are a new thing forthis term um that'llbe that are there to help the TOC umsort of not see completely take overtheir duties but help out where neededor if a TOC might has to step down orthe shadows will be available to tohelp With that I think we can get rightinto it Karina do you want to tell uswhat the TOC really is andthe TOC provides technical oversightover all the projects within theecosystemUm thatincludes licensing uh trademarkstechnical oversight I know many peoplehave projects that have gone through thedue diligence process or move throughthe levels So the TOC really looks atthe technical specifics of the projectsand ensures that most importantly thatthey are viable for the end users whoare going to be adopting the projectsincluding sustainabilityum including uh making sure the releasesare transparentum yeah all the good stuff ExactlySo what do you think is the or biggestchallenge to the TOC and the projects uhfor the next you know we'll say coupleyearsand anybody else can also jump in I knowI'm looking at Ricardo and Katie overhereI mean we Yeah Yeah Um some of thechallenges from a technical standpointwe highlighted today as gaps within ourecosystem during the keynote and theseare mostly f well I'm going to mentionthem again for anyone who didn't had achance to attend the keynote there arethree main areas and we categorize themfor a very specific reason because oneso one of the domain is aroundmulticluster management andobservability so yes it's challengingespecially if you do it cross providerbut observability is challenging inmulticluster as well so we hopefullyhoping to bring people for example fromexisting dagup delivery andobservability together to cover some ofthese gaps So we trying to have morecross that collaboration for theseidentifying these gaps and by doing theum restructuring of the tags as well Sowe have multicluster management andobservability The other one is aroundcost management and sustainabilitybecause we definitely focus around ourcost spending but at the same time weneed to think about our carbon footprintas well and these are groups that shouldwork together and finally we havetooling around infrastructureprovisioning and uh secret managementand this has been a gap within ourecosystem when it comes to fully opensource tooling for many years So wedefinitely want to kind of again bringpeople together from different groupsand hopefully cover this this area aswellI I think um one of the other focuses uhwhich is you know becoming more and moreimportant as the CNCF enters its 10thyear uh of existence and you know we hadthe Kubernetes birthday last year isthat you know there are almost like twoparallel focuses now there there thereis the drive to innovation and you knowfilling in some of the gaps that thatKatie was talking about and and umreally good keynote this morning butthere's also how do we keep the healthof the existing projects going and howdo we make sure they're sustainable andthey're healthy and they have the rightamount of contributors because of coursethe more sometimes the more mature aproject gets um the less exciting itmight be for some and therefore you knowwe have to kind of keep that momentumgoing as we go alongWell with that I know the TOC wentth:rough a few exercises this week tosort of set the vision and direction forthe next year Uh would you like to careto share some of what you'd like toaccomplish and where where things aregoingso one thing that we haven't done agreat job of is really scaling the etechnical expertise um that's in thecommunity Um we've had a lot ofattrition burnout pandemic obviouslyhasn't helped and so right now we'regoing through a restructuring of ourtechnical advisory groups so that tohelp it scale and provide betterguidance to the projects as they'recoming through the maturity levels umalso you know applying to the CNCF thisway uh one of the problems that we'vereally seen um you'll see that we stillhave a backlog for incub incubating andgraduating projects projects that wouldlike to reach those levels One of thebiggest problems we've seen is that theyare not ready So once a due diligence isstarted uh we find out that u maybe theydidn't have technical expertise incertain areas where if they had guidanceum in those areas uh they would be at apoint where we could graduate them tothe next level So that's something thatwe'll really be focusing on through thisnext year And again uh that's going onright now So after KubeCon if you'reinterested in applying to be a chair ora technical lead please you know umnominate yourself or nominate one ofyour peers We really need uh technicalexpertise to be helping the projectsI I can add also the other thing wementioned as well is this new idea ofhaving initiatives in addition to thesub projects and the tags and these aresupposed to be time scoped with concretedeliverables uh and a specific domain Sothis week has been extremely useful alsoto discuss on the possibilities and theideas people are coming forward Uh wehad uh really nice conversations aboutpeople are very advanced in doing thingslike uh cold recovery disaster recoveryand getting together and trying to uhstandardize on best practices for thatThere were people like Lego and uh uhother automotive sectors that work havevery concrete manufacturing plans andthey have challenges exposing deviceslike PLC So this could also be aninitiative to get them together and workon uh best practices for this type ofdeployment to help other end users doingsimilar things So if you have ideas uhyeah we're all here just come forward uhand propose one Uh it's open to anyanyone to come forward and help outUm and just building on what Carter saidon on in the maintainer summit on Mondaywe had this amazing workshop and some ofthe ideas that we uh covered there aspotential initiatives andprojects We'll also be focused onhelping better integration between theTOC and the um end user tab uh andgetting more feedback through thatsystem and hopefully um growing thatcommunityI thought that the uh maintainer summitworkshop was really cool to see all theenergy that people had brought intothinking about new initiatives Oh canyou tell us a little bit about theworkshopso for the workshop what we did waitthere for the workshop who was atmaintainer summit Yeah thank you I Imean honestly I'd rather hear from someof the audience members on what theirexperience was in the workshop Ifanybody wants to stand up and share themicrophone's right back there if peoplewant to ask questionI know right there's there's audienceparticipation Yeah Who has the winner gyOkay It was fantastic So we had the roomfull of people We had about 11 tables Umseriously we ended up having about 8 to10 people per table and split up intodifferent areas such as developerexperience or operational resiliency ortestinguh really workloads right um and they'regenerating ideas on what initiativescould be Um interoperability We heard umsome great ones about uh test gridWouldn't it be great if we could youknow see test grid come out ofKubernetes right and um have that beused to help projectsUm you all were there What are yourfavorite initiatives that came out ofYeah What were what were your favoritesi know you have Um so I was working witha group of people focused on securityand one of the things that came out wellactually the;re were a couple of thingsOne of them was around better automationaround the self and joint assessments umand to keep those reviewed and healthyand and current Um and there was anotherinitiative around how to get um betterdependencies uh handled between projectsfor things like identity management andintegrating spiffy inspire into theecosystem in a more fundamental wayOkay So uh one one of the initiative Ijoined discussion is about how uh theproject better kind of deal with oralready downstream vendors We we have uha lot of maintainers have met this issuelike the user using through uh theproject from vendor they met some issueand went to the upstream community toraise the issue but it turned out to beyou know there's a lot of kind of breakpoint uh because the environment mightmight vary uh by different vendors Sothe idea was that actually we can youknow there are already some of theproject has the um uh conformanceprogram So basically the um initiativewas discussing how we can uh providemore practical guideline also with maybesome of the tools to help more uhprojects to set up their confconformance test suits and how can theum or the project when they are becomingmore mature mature uh go through that tohave their own conformance YeahI think that the the collection ofthings for me personally that wereinteresting were all the operationalresiliency things because it's kind ofnear and dear to my heart Um our tableuh the way it kind of worked was a tablewould work on an idea and then pass iton for elaboration And the one that ourtable got past was um thinking aboutmaybe producing a filter or something ontop of the existing landscape themedmore around personas so it would be alittle bit easier to navigate andunderstand I thought that was kind ofinterestingWhat I really liked was everybody stayedtill the very end and all the tableswere like completely occupied like youknow heads down talking to each otherand uh that that was really great to seeand like me and Karina were like okaywatch for the energy levels to fall downand then we'll you know twist and changewhat they are doing and you know wenever felt that you know the energylevels were flagging So it was reallygood to uh hear and see from everybodyin the room and uh you know we reallywant them to continue to do the work Umand and there was quite a few people whowere like do we really have to give ouridea away to some other people what arethey going to do with it can we justrefine it ourselves and so we had toassure them that it was a livingdocument and they could continue to workon it And so yeah it was you know it wasreally good overall for me Thank youI think that's Katie needs a microphoneYeahbut she's busySome technical issues Um I think I justwant to echo what Dim said because wehad a great attendance The room waspacked which is absolutely outstandingto see Uh the tag reboot is perhaps oneof the biggest change we have donewithin the TOC and within the landscapein terms of how we interact withdifferent micro communities within ourlandscape and seeing so much interestfrom folks and they want to discussabout it and perhaps they want to applythe new the new kind of initiativestructure it is great like I think thiskind of enthusiasm it is amazing and Ireally hope for everyone who has beenthere in the room and for everyone whois here in the room today for you tocollaborate on these new initiativeswith us and perhaps again I'm going toecho the message come be part of theleadership uh team be part of the be atech chair or a tag lead This is thesekind of people we need uh at the momentUm and I'm not sure if the initiativesdocs are available for the public butbecause yeah we have some initiativeswhich are quite interesting as well butI didn't think we're going to go forthem because people will have access tothem anyway So I just wanted to echo thecommunity enthusiasm I think that's thatwasreally actually wonderful and beautifulin many waysOne other thing I wanted to mentionabout our table was of course theattendance was really nice and at thesame time it was diverse So on our< tablewe had people from Japan and uh Europealso as well as US and all other placesmany places actually So one thing thatcame up was about the time zone issueswhat they are facing and also thelanguage issues So uh it's basicallysome people had the concern that they'renot able to attend some of the meetingsor not able to communicate properly insome of the meetings Um that is onething probably if we can see somethingabout it that would be nice YeahYeah I I don't have much to add but justuh maybe maybe when on my table therewere there were quite a lot of ideas UhI think that the most interesting partwas to to actually uh like people didn'twant to pass their ideas forward butactually receiving other people's ideaswas was the most interesting and youcould see the different approach likethe thefirst the first uh round was kind ofvery everywhere and then the secondtable looking at it kind of like okaythis makes sense this doesn't and theykind of started making it a lot morestructured So I guess it's up toeveryone now to look at them and andreally make them concrete but I thinkthat we have a lot to to workon I I just wanted to because you can'treemphasize or repeat call to actionstoo often It would be lovely to see moreof the community that participated inthe maintainer summit and obviously umall you amazing people who have come tosee us uh with your own about this to toactually take a more active role in uhjoining as and nominating yourselves oryour colleagues uh or other people inthe community for tech leads and chairroles I think I think we really want touh increase the vibrancy of ourcommunity as we growOkay I think we're I have a question Howdoes one do thathow does one get involved is that yourquestion awesome Okay So if you watchthe TOC repo right after KubeCon we'llput out the call to action also in theTOC mailing list as well as the TOCSlack channel So just CNCF SlackTOC So we'll we'll add all theinformation information thereInitiatives will be um put into the TOCrepo So you'll start seeing them landthere And then we'll have a projectboard for people to who are interestedin contributing to an initiative Andremember you don't have to be part of atechnical advisory group to help work onan initiative As you have time we reallywant to see them as time bound smallerpieces of work so that as you have timethroughout the year you can get engagedAlso we'll be having another maintainersummit in Atlanta and we'll do a reviewof the work that has been done up untilthen And I know we have another questionSo go ahead Uh hi It was yeah reallygreat to be part of the maintainersummit for me was the first time Um andit was uh the energy was really excitinginspiring So um yeah I hope there's morepeople in the crowd if they haven'tparticipated in this kind of event thatum they they go for it Um my question isum so my I'm coming from the perspectiveof the TAG environmental sustainabilitywhich is being merged into the TAGoperationalresilienceand how can we so the next steps willincludeuh collaborating to write the charter umgoals and non-goals um talk with theother tag communities that are uh goinginto this new tag And I know we're we'redoing this for a few other tags So forpeople who'd like to be involved withthis what are kind of the interimum communication channels or uh goals ortimelines that you'd like to share withus and suggest for us to get involvedwithso after Coupecon we'll publish more ofthe timeline because we are in theprocess right now of redoing the reposWe're going to pull them all intocentral back into the TOC repo and thenprovide instructionsetc there So keep looking for thosechannels and we'll put a call out Wehave been working with several of thecurrent leadership on that Um so againwe'll reach out to each tag each currentone and then also any other communitymembers Keep looking out for thosechannels for more information Butunfortunately that's going to be afterKubeCon which after today but give us acouple weeks I know to get back onthings Yeah thank you for that questionDoes anyone else have any questionsactually how =many people here have beenthrough the moving levels process orinvolved with theproject okay at least a coupleUm what would you say have been thebiggest challenges you've seen when itcomes to projects movinglevels the most common things i've beentalking a lot Kevin do you want toanswer this one yeah Uh so uh my kind ofobservation and also experience uh forthe uh project moving levels especiallya little bit before like we uh I thinklast year like we updated a lot aboutthe template checklist to make it moreuh clear for the project team tounderstand the criteria and uh uh butstill I I I have seen some of theprojects Um uh first of four is the uhthe answer are not that uh ready I meanideally as a TOC or reviewer we wouldexpect you give a uh some of thestatement to the answer as as well asproviding reference links to providethat you are you prove that you arereally working on implementing what yousaid like the open community governancemature release m uh release process andhow you deal with the uh security thingsBesides that we also added like generaluh uh technical review and uh domainspecific technical review to kind ofhelp evaluate uh I think from thetechnical perspective the maturity ofthat project So uh the first of I Iwould recommend uh for anyone uh kind ofuh preparing the application or in uhprocess of the application uh try atleast another round to take a look youruh application answers YeahI I will add to that and thank you KevinI I'll add to that that um another thingthat we've seen are that projects maynot be thinking about security firstright not thinking about their securityhygieneum how they're planning and we'll begoing over a review of the currentrequests that we're asking through theprocess and do dduplicating I know rightnow a lot is being asked to between uhthe security assessments the securityaudits and we'll see what makes sense tobe moved into the audits but I do seemembers of tag security here and they'vebeen invaluable So thank you like tohelp reduce that load but um wanted toadd that So projects should also belooking at security first righti want to mention perhaps in addition toall of this is we have because yes nowour queue is actually shorter So we havea shorter response time and engagementwith the projects But what usuallyhappens unfortunat not usually sometimeshappens is that the maintainers are notresponsive after opening the PR foreither incubation or graduation And itis crucial because the due diligenceprocess is a collaboration between theTOC and the project maintainers and it'scrucial to have that channel So if youopen a an issue or if you open APR forincubation or graduation please beresponsive We this is the thing we aretrying to help you to get to the nextlevel and we need that kind ofresponsiveness back and I just want tokind of perhaps emphasize that because Iknow everyone is very busy and thingskind of get can get very chaotic at alltimes but that kind of channelcommunication it is very important So beaware of that pleaseUm so one other thing that usuallyhappens is people are on a deadline Wewant to get the project out in terms uhfor Cubecon NA or CubeCon EU or likesome internal deadline and so you knowwe try to do things but uh it's going totake us some time and we have projectsin the pipeline so please be aware of uhthat and the other thing that we usuallyend up is um you know there we have alot of material that we've written downand there is lot bunch of checklists andthings like that but sometimes Sometimeswe really have to sit down with theprojects andsay exactly what why we are asking themthose questions or like why we areasking them to make those changes Uh youknow simple examples will be around likehow do you think about the communityversus your uh company and your productand like where do you draw the draw thelines and so we really have to have thatheart-to-he heart talk with them andthen they figure out okay we are missingthis in our governance we are missingthis here uh maybe it reflects in likehow we do our uh CI/CD artifacts or likehow we provide support for something orthe other right So and this is arecurring thing This is not like a thishas happened again and again where we'vehad to sit down with uh the folks in thedifferent projects that are comingthrough and then make sure that theyunderstand why we are asking them to dowhat they what we are asking And Ricardoyou're one of the longest running TOCmembers Be nice to hear one We have uh alittle about three minutes left and Idon't want to Yeah we we got peoplequeued up now So let's try and getthrough some of their questions quickHello I was wondering if um the TOC haslike any um like success metric thatthey measure to see how um how wellthey're doing like on a you knowquarterly basis or something like thatlike how do you guys like report on thatthank youWho wants to take that one it's usuallyuh how fast the projects are movingthrough the process and it's moving alot faster Umyeah I think may maybe there are twoparts One is how fast we are helping outthe projects whichis shown by by the rate of moving levelsUh there there are some metrics that uhwe started very recently looking at Ithink Bob was making them which are howlong the projects are actually stayingin the same level themselvesum before applying to change level andthis one is quite interesting as wellbecause they are interconnected butsometimes the projects uh need some helpalso to to push them I think this is oneof the metrics that will be veryinteresting to explore a bit more indetail in this year as wellAnd maybe maybe not necessarily a metricof success but an indicator of successisthat we are having time to refine ourprocesses rather than just actually dothe due diligence because previouslymost of the TOC time would be focused onrevising sandbox sandbox inclusionapplications doing incubation andgraduation due diligences and that's allwe could do and that was our full timeand because we have been focusing a loton streamlining the processes becausewe've been trying to be as clear aspossible we are trying to automate a lotof that and just even like all of theseforward-looking initiatives that weinitiatives like workloads with the tagreboot and so on and kind of reshapingand thinking about what's happening inthe future I think this is a very goodindicator of us scaling in terms of ourworkload and maybe it's a metric ofsuccess in that as well Maybe notquantifiable with numbers but in the inthe overall dynamic of how we engagewith ecosystem Absolutely it is We arejust about at time What's your questionlet's see if we can answer it quick Thisis a relatively short one Um so whenthere is a project that has troublecompleting some tasks or moving to adifferent level uh and you have to havethis additional calibration additionalconversation uh does it sometimes end upbeing a problem where the outcome wasn'tdefined like the the why needs to beexplained and the what needs to happenbut not necessarily what the desiredoutcome was and that turned out to bethe reason why it didn't work outUh we have many examples allcombinations are there Um so we we dohave to tell them what we are lookingfor why we are looking for that and howwe are going to typically what we alsotell them is like here is a set ofthings that I we want you to think aboutand see where you all have to applythese things in different areas and alsowe want to see how it works in actionright like don't just change a readme ora governance.md and then say you're donewe want to see it how we areimplementing it over the next 3 monthsor next 6 months and then with theevidence we'll be able to like you knowcontinue the conversation so to sayThis kind of goes back to the lastquestion about KPIs It's hard toquantifyOh we're now like a minute over So Idon't know if we you'd like to follow upafterward Well okay Yeah please cometalk to us I will Thank youAll right Well thank you for being uphere and thank you Anybody's questionsquestions2025-04-15 21:58:02.481170?cs these guys uhare able to see on their dashboardswhere the next parts are needed to bedeliveredyeah the big mysterious glass boxactually uh here the magic is happeningand here small components become biggerparts and bigger parts become the drillyou use at home or maybe the electricmotor or other component in your car andwith another uh piece of software calledpart traceability uh we are able to seeuh the the the traces of a part from thebeginning uh when it is bored and fromthe moment it it leaves the the factorygate all right and yeah we also havethese cool guys moving around thefactory moving parts from the warehouseto the to the shop floor area uh theseautomated guided vehicles on short AGVsuh are controlled by another piece ofsoftware that we calledAGVCC um and for each plant there is amap and these guys knows basically everycorner of afactory all rightwhat if I tell you that uh all thesepieces of software that you've seenuh are actually this big brother of offactory that's the the the industrycalled MEES and yes we at Bosch uhcreated our own MEES system and it'scalled NixedMEES all right based on this uh to havea cleardefinition a manufacturing executionsystem it's a software that monitorsmanage optimize production on the shopfloor in real time and it basicallyserves as a bridge between enterprisesystems and shop flooroperations all right now let's imaginethat a factory it's actually a humanbeing and we place it in an MRI machinewhat we would seeinside first we see the compute andstorage power that's the brainit uh basically just like a human brainuh the these are processing decisions uhmees relies on on this um to manage thecriticalinformation moving on we find theexternal systems these are functioningas the circulatory system and exactlyhow blood carries oxygen and othernutrients to the organsuh this system are distributing uhessential information such as uhproduction orders or inventorydata moving on we have the messagebrokers and these are acting as the asthe nerves um similar to how nerves aresending signals throughout the body uhthese me message brokers uh ensure uhthe communication across thefactory another scan and we have theindustry protocols and the sensors uhthese are acting as the senses and uhbasically are correct collectingreal-time input from the machineryincluding temperature speed pressure andsoon okay and finally we have the shop forequipment uh all kind of machines uheven those little AGVs that you've seenbefore these are acting as the musclesand uh these are ex executing theproduction task based on the meesinstructions all right uh you mightalready think all right this system it'scomplex how does these people manage toto uh to do all the stuff to well it ishard because um these systems havehighly indiv individualconfiguration and because each planteven each production line is differentuh it is hard to uh to standardize it'shard to replicate scale and andupgrade next manual configuration wellyeah uh manual configuration is it'sdone at two levels uh we have the thefirst level operations uh level which isdone by the ops guys uh includingsetting up databases uh network policiesuh use user uh permissions and so on andwe have the business logic level whichis done by the application engineers uhthese guys are basically responsible touh create productrecipes uh and uh doing all all kind ofroutingrules uh yeah moving on the deploymentprocess well the deployment process istypically typically done on um uhindividual individually ordered Windowsvirtual machines and this is making theprocess both uh slow and resourceintensiveum yeah in the end deploying an MEEStakes a long time uh also the manualchanges um increase the risk ofmisconfigurationuh potentially leading to to systemfailures and oh god it's bad when thesesystems are going downuh because when MUS goes down productionstops workers are standing idle uh andbasically this translates to lostweighted materials missed deliverydeadlines and uh of course millions inrevenue slippingaway all right now even if configuringand maintaining legacy MEES was a @burdensome time we have to admit that uh foryears traditional MEES have successfullypowered uh pro manufacturing operationswithin Boschuh but with the rise of industry 4.0zero uh that demands faster moreflexible more scalable approaches uh weneed to and we also need to find a wayto avoid the downtimes of course um yeahwe decided to stop pressing next nextand finish forever and uh we took theclassic lift and shift approachwe basically embraced cloudnativetransformation and building on top ofNixie MEES uh we started to develop thenext phase Nix IAS a cloud capabledesign software that made the firststeps in bringing uh this flexibilityscalability and efficiency to the shopfloor uh but yeah in the first phase uhwe just packed everything intocontainersbasically each mees module became a setof containers and now 30 differentmodules uh became 200 plusmicroservices we took all thosemicroservices and deployed uh them inLinux machines uh basically no nofundamental changes just containerizedversion of the already existing appsso we had minimal refactoring we justfocused on the portability rather thanuhrearchitecting uh we had two approachesthis time um one was designed for theonremise environment and the other onewasum was an early version that that threwus into the cloud worlduh in the case of on premise situationum we we had all these microservices wewe needed to find a way to bundle themup so we used docker compos for thatdeploying the microservices with anibleand um uh for the for the cloud approachuh basically we used anible to renderthe the kubernetes manifest and deploythem in in the cloud uh but no matterthe environment uh we ended up withsequential and very slow deploymentsuh panned updates and an update usuallytook us around six or 10 hoursuh depending on the installedmodules but years have passed and uhyeah in our days looks like this uhnowadays we have cloud and onrem uh thatuse that same deployment structure uh wewent from sequential uh deployment withanible to simultaneous um um pipelinedeployment um of the microservices inmulticlusteruh nodes in the case of cloud scenariosand uh in single node in case ofon-remisescenarios uh but in addition as aglimpse into the future we thought aboutbuilding a local cloud uh we called itBMLP and it stands for Boschmanufacturing and logistic platformyou can think of it as an app store butforfactories um yeah and here the thingsare a bitdifferent um each factory has itsmicroservices in a multiode clustereverything gets centralized um andchanges are applied automatically uhusingArgo yeah uh now I will let my colleagueManuel to show you what challenges andsolutions we encountered along the wayyeah thank you Andre um as mentioned wenow know now know how our system we aretrying to operate looks like right andI'm now going to walk you through somelessons learned we had on the path toreally make it more cloudnative andbring it into the cloudonce we started with a lift and shiftapproach and basically packaged all ourmodules into different microservices weimmediately faced three differentdimensions of complexity the firstdimension is quite specific to themanufacturing context as alreadymentioned most of the plants and eventhe individual lines use individualconfigurations that is something you donot always have in in the microser worldwhere you often deploy the same replicaof the same thing that was a realchallenge for us to support all thoseconfiguration and deployment parametersand reducing human errors of coursenobody would want to build a new plantjust because we DevOps guys think wewant to standardize our input values sothat was the first challenge we werefacing the second one was not as a bigchallenge right once you had packagedall your components all your existingsoftware we put it on Kubernetes um wehad already some Kubernetes knowledgeinside um our team during that time sothat part was easy we added the missingcomponents deployed everythingworked the hard part then are the humansactually because as you could imagineour software solution is comprised ofmore than 30 different moduleAs eachmodule comprised again of a plenty ofdifferent microservices and as you wouldexpect each of these modules is built byan individual development team so wehave a a set of multiple developmentteams and inside those development teamswe have a very unequally distributedknowledge of Kubernetes at that point intime there were teams without anyKubernetes knowledge and that of coursefor us as a DevOps platform team causedthe question how do we how do we handlethat how do we ensure that the thing weare building is something we can easilydeploy and operate at the end withoutputting too much burden to thedevelopment teams and slow them down inthe dailywork so one decision that was done quiteat the beginning was that we willpackage all the modules as individualhelm charts the question here was how dowe do that there is a naive approachthat comes to mind at first yeah easylet each development team create theirown Helm chart let them do whatever theywant they have a great level offlexibility but there are some problemswith that it's hard to enforce anystandards and because of the distributedknowledge within the teams the outcomemight be not really what you wouldexpect in theend the alternative approach would havebeen to say okay let's embrace somecentral teams maybe the DevOps teams andlet them be responsible to create allHelm charts for all the individualmodules it would work we would be ableto enforce certain quality standardspeople would be knowing what they aredoing and things would be great but ofcourse in that case we would havecreated a huge bottleneck in ourorganization the central team wouldbasically be a dependency for all otherteams and it would slow down our releasecycles so we opted for another approachand that is really making use of thepossibility to write library helm chartsand that is what we did and are doingtill today we are implementing andcreating a library helm chart and wehave the mandate for all the developmentteams to make use of that library helmchart and the chart is designed in a waythat everything that is rendered bythese individual module helm charts isactually funneled through that librarychart and by that we have a closecontrol what developers are actuallyable to do and what they are able torender to our clusters and that turnedout to work very well for us of courseif you have something likethat you are thinking about how do yourelease these kind of things and here wehave really a special requirement whichmight not be very common in the microserworld um even though we are technicallyable to deploy individual services weare still required also for externalcustomers to bundle everything we dointo one single release and that isstill possible even if you go withapproach like that you just create anumbrella helm chart and call that yourrelease and basically use this as afinal release artifact you woulddistribute internally and to yourcustomers so that part got be solvedquite quickly uh we basically were ableto fulfill thisrequirement having that the questioncame up who owns what who is responsiblefor which parts um of course as expectedthe module Helm charts are owned by themodule teams and what we then did weembraced a couple of additional teamswhich we call library teams you couldalso call them DevOps teams platformteams as you like and those are theteams like us who are owning the librarychart the umbrella chart and a couple ofsupporting components which are thenbundled all together with the differentmodules coming in from the module teamsbetween our teams we have a closecollaboration of course we do knowledgeexchanges and that worked out to be asetup that works quite well for us andthat is the way how we are operatingtoday if you go with this library helmchart approach we had a big learning atthe beginning um if you make a singlehelm chart or library helm chart such acentral component within your overalldeployment or within your overallproduct then it's really important tounderstand and treat that library helmchart as a first class citizen it's notjust some deployment code developed bysome DevOpBs team that is living on thesite and does not need any specialattention that's wrong it's importantit's basically deploying the entiresystem and can break the entire systemand changes there often have an effectto many different teams so you have toembrace a full software development lifecycle for that component you should makesure you are able to hand out a road mapto the team so they know when whichfeature is expected to arrive in thelibrary helm chart you should um do somerequirements management maintain abacklog so everything you do for justevery other software project and at theend we embraced in our organization thepossibility that each development teamcould basically do pull requests andcontribute to that library helm chartjust to accelerate the time whenadditional features are needed and toplace things quicker in that centralcomponenttalking about more about talk sorrytalking about a bit more about thetechnology behind it well as mentionedevery module team has to use a libraryhelm chart that is a mandate and whennow a development team wants to starttheir Helm chart for their module theybasically start as anybody anybody wouldstart with a helm chart they createtheir chart yl with the information likename and version nothing special therebut they only have to place one singletemplate in their helm charts and thatsingle template typically only containsone line that is a line that is callingthe rendering function of that libraryhelm chart and nothing else and that isthe single entry point to the libraryhelm chart from which we could enforcethen how the final resources will looklike then and that is a bit special thedevelopers are basically required to puteverything else they are wanting toexpress in the Helm chart directly inthe values file that looks a bituncommon of course but that had a lot ofbenefits for us we basically definedhere a customdriven opinated datastructure to which the module teams haveto obey and this opinated data structureis helpful because we could abstractaway quite some details from most of thedevelopment teams because they would notneed them anyways for example ingressdefinitions do not need to be explicitlyexpressed they are built in by defaultif they don't define anything else ordon't overwrite it still this approachallows development teams to make use ofall possible features you would have ina normal Helm chart it's just they don'thave to and that makes things quite easyand allows us to control a lot of thingsof course it's uncommon to puteverything in the values.yamel off yourmodule Helm chart but it has a secondbig benefit and that is schemavalidation we do schema validation a lotand we do it on two levels on the onehand the module teams are asked andrequired to place a schema file withintheir own module helm charts thoseschema files are typically used to doinput parameter validation so thedevelopment team exactly knows what theyexpect in a certain parameter they coulddo a reg x and check if the input iscorrect then we have a second level andthat is a global schema and that ismaintained by us as library chart teamand within this glo global schema we areable to do some structural checking andreally make sure that the overall datastructure that is defined by the moduleteams is correct and fits to the currentversion of the library chart that turnedout to be very helpful because peopleearly notice if they do a mistake intheir module chart or forget about someinformation they need to put in alreadyduring templating time and not onlyafter the fact when things are deployedto some testsystem given all that and you alreadysaw the separation of concerns betweendevelopment teams and operations orlibrary chart teams we do the same forthe data model itself so as you couldimagine um a lot of the modules areactually connected to some outsidesystems be it a database or messagebroker and within our data model we arerequiring from the module teams webasically have the separation ofconcerns again we have this global levelwhich is a level that is typicallyfilled in by the operator for example usthat wouldC contain information likeconnection details of a database serverport information and so on then on themodule level the teams just express okaymy module needs either an MSSQL databaseor an Oracle database and those are justexamples could be anything else and theyenter additional requirements that areneeded for example database roles theywould want to have and we then make useof the Helm templating to mix and matchtogether these information and by thatrendering out the final deployment thatis then automatically set up in acorrect way to connect everythingtogether but again you have thisseparation of concerns development teamhas one place to edit things we haveanother place to edit things and thatseparation worked quite well for us wedo we do similar things for differenttarget environments so we are deployingon public cloud we are deploying on ourBosch manufacturing and loistic platformbut we also have a couple of edgedeployments running on single nodeclusters and all that is supported bythat single library chart via this kindof mechanismthere are some cases in which moduleteams came to us and said "Hey there'sanother module that has a similarparameter like we do and we need toensure they have the same setting so wewant to share configuration informationbetween different modules betweensubcharts that is not possible out ofthe box with Helm right but if you makeuse of a small trick you can stillachieve that each Helm chart or Helmduring templating has within the valuefiles or within the values this globaldictionary and this global dictionaryalways is available to each of thesubcharts in a Helmdeployment on the other hand it ispossible that you are import data from asubchart to a parent chartso if you combine both of these thingsand simply say hey I use a trick that Iimport the data from subchart A into theglobal dictionary I immediatelyimmediately make it available to allother subcharts within my deployment andby that you can easily share informationbetween the subcharts in our casebetween the modules of course you needto take that a bit with care you areoperating in one globalnamespace take that with caution can cango wrong ofcourse further if you look again on theapproach to put so many things into thevalue files you might think okay but howdoes then a module team actually dotemplating they maybe want to do somestring formatting or some easytemplating functionalities in theirenvironmentvariables that is true and we stillallow our teams to do that and we dothat by simply adding the TPL functionwherever it makes sense so it is verycommon in our codebase to see somethinglike this here a module hand chartwithin their value file they would againhave go templating language written aspart of a string and then at the pointin time where we render thoseenvironments we would call thetemplating function again and executetheir template that has a benefit thatwe as a library team have control overwhere teams are allowed to template andwhen not which again comes in quitehandy if you want to ensure qualityacross the boardtill now we only talked about Helmrendering templating and deployments ofcourse that is not everything you cannotsolve all problems with this and we alsonoticed that quite early so we built acouple of custom operators we havebasically two classes of operators thatwe build for our system they areinfrastructure operators they are nottoo uncommon and I think that is notreally something very applicationspecific in our case they are managingdatabases creating schemas and so on sothey are doing all kinds ofinfrastructure things and control thatand then there is a second individualoperator that integrates with ourcustomuilt identity and accessmanagement solution which is part of ouroverall MEES solution and are ensuringthat these different modules once theyare started are registered correctlythat they have the permissions to accessthe other modules exchange secrets andall these kind of things are fullyautomated using operators and by that wedo not need any manual interventionanymore when we deploy aSo lessons learned and sumDmary when youdo a library hamm chart which getsthousands and thousands of lines oftemplating code unit tests are yourfriend because everybody is using itevery buck you introduce will affecteverybody so the helm unitist unit testplugin is really useful and we make useof it a lot on the otherhand I told you we have to bundleeverything as one helm chart which meansit becomes one helm release once youdeploy it there you need to watch outfor your Helm for your Helm state andyou might switch the Helm back endsbecause your state might might not fitanymore into your into your secrets umin our case a single instance of thesystem has about 1.5,000 differentKubernetes resources and then you hitthis um boundary of 1 megabyte quitequickly on the other hand we noticedalso a lot of improvements coming fromthe open source community a good examplehere is Helm itself since we are doingtemplating on templating on templatingour Helm templating became quite slow informer times it was more than 100seconds on a normal development laptopafter the update to Helm 3.14 the Helmteam actually improved the performanceof the templating itself and we saw ahuge drop in the templating time mightbe a small thing for normal ham chartsbut in our case where we really make usea lot of this templating approaches itreally saved us a lot of time and thetemplating dropped to under 10 secondsand finally switching to Kubernetes wasa good decision for us switching to Helmwas a good decision for us overall ourdeployment times dropped from hours ordays to minutes and we are now able todeploy such a manufacturing systemwithin a few minutes and it will justwork thank you very much and we would beopen for questions nowso help yourself there's a microphoneuh hey guys so I work at a company thatalso has uh some of these challenges toto undertake um we as uh have also a lotof uh VMs running Windows Server but ourmain difficulty right now is what to doat the human level where there are thereare no clear ownership of some of thoselegacy components that were in some ofthe sites and not at uh every every siteand now we don't know how to to managethem and how to uh lift and shift themuh into our Kubernetes environments sohave you any insight on what is was yourapproach um yeah that is a good questionand I think in general a lot ofcompanies might face that right um theremight be always this one old legacysystem around that you cannot lift andshift and I think in those cases youmight have even to live with having itstill in a Windows VM somewhere becausewhat else can you do right either youswitch to another solution do a fullrewrite which might not be taughtpossible because nobody knows how itworks um one thing that comes to my mindon that I was attending one of thesessions on Tuesday for ISTTO and theywere announcing STO on on Windows forexample and at the beginning I wasthinking okay why but maybe exactlythese kind of cases might be a thingbecause there will be always a companywith this one Windows VM with runningsomething and maybe in future we will beable to connect those things more easyto our Kubernetes clusters and by thatmaybe able to at least improve it butsolving is hard without switching thesolution sure thank you you're welcomehello so this Thank you for this talkit's very relevant to us uh we saw a lotof parallels with your experience andour experience great uh my question isyou mention about you migrating legacysystems to Kubernetes and you basicallycontainerized the systems first thenwent to Kubernetes later my question isbecause we're trying to do the same whenyou're migrating uh like industrialcontrol systems or similar aspects theyare built with a very different paradigmthey expect they are running they haveall the state they do their things somoving to Kubernetes means everything isephemeral everything can spin up spin upa replica delete a replica it can bedeleted and relocated to a new workernode so how do you do this migrationwhich as you said is there are dev teamswith limited knowledge of Kubernetes sohow do you make them re-engineer theapplication to be cloud native insteadof being based on running and having allthe state in one area yeah good questionwe know exactly what you mean I guessright um so one thing that needs to benoted here we are as as Bosch connectedindustry we are actually the softwareproviders who produce or write that thatsoftware right and as Andre mentionedthere was this decision to make thesystem more cloudnative and since we hadthis decision in the organization a lotof the components which have beenrunning before on the VMs have not onlybeen lifted and shifted but at pointswhere it made sense also rewrites happenso there are changes in the softwareitself it's not that the code base isone to one the same still it is based onthe old codebased but it's made morecloud native where possible we are stillnot fully there and we are still as anoperation system having challenges hereand there because not all workloads areas good as others in terms of failuretolerance and things like that what wenotice is really helpful we do forexample as an operations team or devopsteam we do regular devops exchangeswhich means some one of our engineer isjoining on a bi-weekly basis thedifferent development ment teams andhelping them with things where theystruggle with be it the Helm chart orbeing things like retries implementingthings correctly what can we do with ourstate and I think that is really againone of these human problems you need togo in there and and the people you havewith the cloud knowledge need to go tothese development teams and try to helpthem wherever possible and once theylearn about it it's becoming much easierthat is at least our experience and thenat some point it will work thank youhello my name is Sian Kenny and I'm herefor BSF which is the biggest chemicalpro company in every scale everymeasurement um so we also challengedthis we also had the same challenges asyou had with our manufacturing systemsum and one of the issues we had with ourlegacy system we partially well we weactually start to introducing um cubeweird right now to connect to legacysystems have you ever thought of usingthat maybe um for example to accesssystem that still use opcda and habecause nobody wants to migrate like 400plans plant sites with like tens or 20plans inside um just because you pushkubernetes to the edge yeahum we are aware of the cube word projectum but till now if we have mixedworkloads that we have something thatstill needs to remain on a VM wetypically keep those VMs there and justwith the on-prem clusters then do adirect connection to the rest of thesoftware so those VMs are not becomingpart of the Kubernetes cluster itself atleast for now right so so you don't wantto push it really to the edge but justclose to the edge your MES systemsexactly it's close to the edge close tothe edge but not in the edge in themachine itself so to say right thank youvery much it was a very interesting talkyeah thank you very welcomeuh you mentioned uh that you moved someof the modules into the public cloudright on AKS clusters and such yes uh Iwas wondering what percentage of themees in terms of functionality you wereable to move into the public cloud andhow you ensureuh that the the connection is reliableenough uh fast enough are your plansconnected via private line to the to thepublic cloud or yeah how do you do thatactually um our Bosch plans itself arerunning entirely on the PL Bosch on-premclouds so there are centralized cloudsthere are cloud or local clouds withinthe plants but the end the Azureoffering the public cloud offering thatis offering a dedicated instance of thesystem that we offer to externalcustomers and there we mainly operatenon- latency um non- latency sensitivemodules so functionalities that are notso sensitive to any latency so not theline control things or things like thatand by that um we have offering that canbe utilized for example to do shopliftmanagement and so on within the cloudbut that is not the part that iscontrolling the Bosch factory itselfthank you thankyou okay if there's no otherquestion thank you again thank you2025-04-15 21:58:03.098898 ��\G#��oAGgxRHpQIEfghello everybody welcome to our talkabout into the shop floor movingmanufacturing execution systems tokubernetes my name is Manuel Pster i amwith Bosch connected industry i'm asenior DevOps engineer there and youmight have heard of our company Bosch weare a very large manufacturer based inGermany and we are active in manydifferent fields we are doing mobilitysolutions industrial technology we doenergy and building technology as wellas customer goods overall we have morethan 400,000 employees worldwide and weare having more than 400 plants that areproducing our products in more than 60different countries and these plantstypically look somehow like that here soit's a very complex system as you couldimagine running in most cases 247 and weas Bosch connected industry are asoftware provider within Bosch and weare focusing on industry 4.0 from zerosolutions and especially we areproviding a product which is amanufacturing execution system and forthat let me hand over and introduce mycolleague Andre who will walk youthrough the details on what amanufacturing execution system is andwhat it actually does thank you Manuelhello everyone i'm Andre and I work forBosch connected the industry as a juniorDevOps engineerall right so uh what isNES maybe let's take a look at thispicture we might find someclues yeah we see some guys working hereand there uh we have a guy that isdriving something that resembles aforklift overthere looks cool right but we don'tunderstand much from it rightmaybe let's take a deeperlook all right see that little screenthat is something that we call end onlive um to the worker is able to see onthat screen how many parts will becreated in his shift or if any parts hasanymalfunction let's moveon yeah moving to the guy that isdriving the forklift that is actuallycalled the milk run uh with some somehelp from another piece of software thatwe call intra logisti>Gmsum you know with all the capabilities ofcustom scheduling uh and autoscaling andinferencing and training and fine-tuningwith you know various set of tools thatwe're going to discuss soon Butyeah Sachi Yeah on Azure we see users umalong different paths or parts of theirAI journey So starting off just wantingto experiment with different modelsconduct benchmarking and evaluate theperformance um starting off with servingmodels and doing inferencing workloadsand then moving forward with tuning orconducting rag as well to improve um theperformance and make it morecontextaware in the context of theirapplications and of course um then moreadvanced AI users that want to traintheir models build them inhouse fromground up when it comes to um moresecurity reasons um governance overtheir data etc So we're looking tosupport them along that journey atwhichever stage they're at and meet themwhere they are with their AI And I thinkwe'll go into this a bit more indiscussing the projects that we'reinvolved in um within the CNCFUm Google is a little bit uh specialhere We also have a lot of the mixedworkload and also customer at thedifferent uh stage So from the trainingproc uh data processing preparation anduntil and until to the all those kind ofthe inference serving and uh one of thespecial like the Google unique it isGoogle itself also have the hardwareright so the Google also have the TPU socustomer came to Google and not just foruh GPU they also have the uh TPU forcost efficiency all those kind of theread different reasons and also we alsohave like the special things Googleitself generate be build those LM modelsand uh so a lot of customer came toGoogle ask for gemini and all differentof the gemini all those kind of thingsSo one of the special things it is sointernal of those uh training jobs or uhnot just third party of those kind ofthings and and one and some of it isalready on the GKE some is talking aboutthe migrate or in on the past to themigrate of the kubernetes and GKE sothat's kind of give us a unique of theperspective and how to we build of thestandardized of the hardware and uniformof the things and to support the userand also how do we uh build astandardize of the framework on top ofKubernetes and to support the differenttype of the customers things Yeah thankyou So let me start back from you Don Umuh you are you know leading the effortsin SIG node in Kubernetes for the thingsthat everybody is going to use you knowsoon real soon now like things like DRAswitching on some some features likethat So um like what the cubernetes isevolving to uh suit uh the AML workloadsThere is some work in progress Last yearin Paris and in Salt Lake City we goteverybody excited on like all the thingsthat are coming out So can you give us astatus of like where they are and youknow what can we expect from thecommunity and then on on the side tryingto figure out like how and when uhpeople can start using it and also wherewhere does everybody think thatKubernetes needs to evolve beyond thecurrent set of things that is inprogress That's good question Thanks Souh just like of the teams mentioned ofthe DIA and also earlier I mentionedthat uh because the Google add specialunique space So then we realize abouthow to standardize the hardware especithose specified the hardware to supportAM machine learning workload node it isreally leaded right So this is after DAis is trying to uh standardize and buildabstract uh representative to Kubernetesright so the Kubernetes can uh managedand also scheduled all those kind ofworkload so there's a lot of progressmake those kind of things and there isin theproduction stage almost in that stagebut we need to figure out a lot ofothers issues like how to migrate andhow to u uh build off the monitoring allthose kind of things we still used to bebut besides about the hardware there'sthe higher end of the objects and thekubernetes actually in the past we startfrom the support of the web servicesright state and then evolve to supportthe state for site and we don't doreally good uh we try but becausepriority we didn't dHo good job on a lotof to support of the batch workloadespecially batch workload complex Xbatch workload and with their frameworkSo we are looking this is why there's alot of customer scheduling and thetypology aware of the scheduling and andalso topology management and findinggrand of the resource management requestcome to the Kubernetes community andevery day so community actually build alot of things like wino and also Kadunerfrom from Avidia and also Q uh around ofthe Kubernetes in the Kubernetesecosystem try to do the job queuing tryto do the file sharing and all thosekind of things But because this kind ofthings we done a lot of more and moreand well the time we realized actuallywe are missing of the primitives definedobject to support those kind of thingsright so for example kubernetes stilltoday couldn't really support uh batchworkload especially complex batchworkload their uh dependency right sothe job of workflow and the dependencyhow we are going to achieve that andalso kubernetes there's a lot of the uhcore scheduler or customer schedulingdefine about the part group So we can dothe G scheduling because especially fortraining model training jobs they reallyrequire about all or nothing about thescheduling and also what it is how wefast detect of the hardware problem andthen do the hard swap and and quick orsmart rep about those failure GPU andthe TPU all those kind of thingsactually is the new challenge for us Sothat's we are exciting to work ontogether with the communitySo umSachi when Kubernetes starts makingprogress how soon can you get it intoKato and you know what other things canyou get it into either in the communityor in the service uh what are youthinking about yeah great question Umwhen I think about where Kaido fits intoall of this more so with the name itselfSo as a tool chain operator as a toolwithin your entire AI pipeline we see umcomp composable architecture So pluggingit into the other tools that you'reusing for observability for checking umyour GPU node health for example andoverall making optimizations across theboard for different workload deploymentsSo as Don was focusing on um withregards to scheduling and batchworkloads um it's great to look at uhthe changes that can be made in thecubeuler framework for example and thereare scheduler plugins that have beenproposed um as part of certain ks likeum gang scheduling co-scheduling as wellas topology aware uh scheduling pluginsthat can support these specific workloaddeployments um not only just thedeployment part but also longunningavailability um so within In Kaido we'relooking to just see and validate um howthese all work together And anotheraspect to look at is um node health andreliability So when you have these umstateful AI enabled applications notonly do you want to um have successfuldeployment but for training for exampleum you want to support it throughout theentire uh workload process So if there'sany kind of interruptions um due tochanges in the GPU health you want to beable to migrate that over let's say toanother node and continue on the processnot lose all of the pro progress thatwas made in the first part and do thatin a safe secure manner so that you canoverall reduce cost because we know thatGPUs can often be quite um costly andhave limited availability at times Soreally um supporting node uh reliabilitythroughout the deployment process andconsidering different aspects of howthat can affect your an AI enabledapplications So V you are trying tointegrate a bunch of these things intothings that are useful to customersYou've you know done things like data onkubernetus and things like that So canyou speak up about like what do you whatdo you face when you go to a customerand like what their challenges are andlike how do you go around figuring outlike what are the components that youneed to put together for them to youknow do the things that they need to doYeah Yeah Sure So uh probably I mighthave to tell the a little bit of historyaround um before we get into theKubernetes and so a lot of the batchworkloads or AI workloads traditiIonallyrun on HPC or maybe uh some bunch ofCPUs and in the past um you use clusterslike Hadoop for running your Sparkworkloads but since the Kubernetes hasbecame more stateless to stateful over aperiod of time it's grown it became morelike an AI native uh platform for allAML workloads on Kubernetes The mainthings are when Kubernetes introducedthis PVPCs with the running the statefulworkloads support for the distributedfile systems like EFS and FSX and so onuh that help move a lot uh running thesebatch workloads So and these frameworksum like spark flankpeno and even the al workloads now theystart to support um to run on kubernetesby leveraging these features thatkubernetes offers and there is there area lot of operators that have been builtover a period of time to simplify thisjourney So at AWS and you know throughdata on Kubernetes uh we build a lot ofpatterns to show the customers how youcan actually run high scale a largescale spark workloads on Kubernetes orAML training workloads on Kubernetesright so to do that uh sometimes youneed to think about a lot of thingsright running these workloads onKubernetes is pretty straightforward butyou need to start thinking about fromthe point that you want to pack theseworkloads distributed workloads You knowthe things that you want to do is GPUoptimization and CPU optimization andbeen packing these workloads uh byrunning uh using custom basketers likeApache Unicorn Volcano or Q Um and youalso need to consider um the failover ifsomething goes wrong can thesedistributed computing parts can recoverfrom it So these are the aspects is thecommon aspects that we hear from usersand the customers and they also hit thescalability issues with the Kubernetes aKubernetes we say it supports 5,00010,000 15,000 nodes but with thestateless microservices yes answer isyes but for stateful workloads like allthese data workloads and AML AMLworkloads that changes for workload typeto workload right there are some set ofworkloads like in batch can go up tothousand nodes just because of thebursty nature of these workloads Butwhen when it comes to AML workloads youcan scale up to maybe 2,000 3,000 andthere's a lot of tuning that needs to bedone in every aspect But it changesbased on the framework that you selectSo we we just I think it is evolving alot and there are a lot of tools thatare coming up caching techniques andbatch scheduulers uh and optimization onthe GPUs uh using you know uh fractionof GPUs and so on but there is still lotto be done in this space uh by thecommunity Thank you So uh this is aquestion to Erin So as an industryleader right you have to lead theindustry you have to take us where youwould like us to go and people are alsotelling you you know this is what wewant So how do you balance the two andlike how do you think about like youknow where do we put our resources howdo we prioritize requests we you knowwhat does the future look like you knowwhat do we want people to be doing fiveyears down the line how do you thinkabout thatgreat question Dims So you know Ihonestly think you'll see a lot morefrom Nvidia in terms of open source Youknow the landscape Thank youthe the landscape is is certainlychanging Um you know I've been in opensource forever and part of the CNCF andum you know I think we're we'reinvesting both in our talent and ourfocus because we know as you know Sachiwas indicating you know GPUs areprecious and they're a resource thatneeds to be managed and you know for usto be able to contribute and have thedeep understanding and give that back tothe community how they can easily manageand operate that is you know fundamentalto all of our success and you know werecognize that we're better together asa community and so you know I I don'tknow exactly I can't point today but I'mon LinkedIn You can always reach out tome I'm very open CNCF Slack KubernetesSlack I you know I I I do want to hearabout the use cases I want to hear aboutthe pain I want to hear about how NVIDIAis an industry leader can help solvethese problems But there are quite a fewactually open- source projects in NJvidiathat people probably aren't aware ofThere's one called Skyhook that allowsyou to update you know some of thenickel parameters on the fly for jobsthat you're running to get betterperformance Uh we're going to bepublishing a project called EnvySentinel which looks at the node healthIt's it's not just good or bad It's invarying different states and we need tobe able to work very closely with ourcloud partners to be able to you knowunderstand the node health get them backinto service very quickly And you knowNvidia is not on an island in this youknow DGX cloud which is what I run theKubernetes team there you know activelyworks with all the major cloud providersto provide both the technology and thehardware to make that possible So youknow I think the tides are changing andI think you know just the relevance ofAI within the Kubernetes community isobviously changing I think at Parisalmost every keynote mentioned somethingin AI and it was very different Itwasn't that bad this week It wasn't thatbad this week but um I'm thinking back Ihad you know a picture come up that wasKelsey and I at CubeCon London nineyears ago and it was a hundred peopleand I did a keynote about storage and Iremember he was the moderator and he waslike I just don't think it's ever goingto be stateful Like you're wrong on thisoneAaron So you know things change and wehave to evolve and um this is also justpart of our evolution Sounds good SoI'll let you go first this time OkayWhat is one question you would like toask the other panelists or you know aspecific panelist pick oneUm I mean if I wasn't here I'm areplacement moderator just so you allknow There was one more Shinas who wassupposed to be here So I'm thereplacement So and they were like uh heywe just going to ask each otherquestions So this is your chance I Iguess I'd pose this to all the panelistslike what can we do better you know toyour point you all are deeply involvedyou're all helping customers like whatcan Nvidia do to help to be you know noteven more part of the community but alsojust like this is the pain va featurerequestsyeah um sure uh I'm looking at some ofthe pain points that customers came tome right so one of the common pain pointuh that I noticed was especially for thetraining when they're runningdistributed training uh across multipleGPUs and uh they stream the data fromvarious sources like uh S3 or other filesystems Umbut sourcing the data or reading thedata and um and running the trainingworkload takes a longer time becauseit's missing the caching capabilitiesSo for the training it requires uhproper caching across all the nodes anda lot of these customers are actuallyduplicating the data the entire data setacross the nodes to do this training Uhif there is any possibility around thisof sharding the data and splittingacross these nodes to optimize and alsobuilding a cache clusters whichsimplifies this you know whole trainingpartAnybody else wants to chime ini asked the question uh request to amedia last year at the Paris So I wantto repeat three asks here and somewidely for fail but you know justmention something right when it is I askfor the faster detect of the GPU problemand the hardware problem because ofoften customer did the training andafter the the they have to do the humaninvolvement validate that the model iscorrect incorrect all those kind ofthings there's no visibility and and uhand that's with a lot of time andresource there's a thanks there's a lotof proment there and another thing it isI asked last year it is the sharingbetter sharing GPU GPU it is expensiveand and uh it's the how we are going toshare right so there share have many wayto share but uh and also security likewhat kind of additional isolation whento to offer to share of the GPU and Ibelieve there's some solution we need towork together and to move forward solast one I ask it is how to plug in GPUbecause GPU like GPU is expensive and aspecial device Um so which is means alot of customer have to preserved aphysically attach of the GPU to the uhmachines and once they reserved and thatGPU cannot use Kthat that node alsocannot be used So uh how we are going touh uh dynamic plug attached of the GPUto the machine that's can totally changeabout the cloud and uh and how weassemble of the node right so so that'sthat's kind of the three things I'vebeen keep ask of course kubernetescommunity and also all of us like allthe part partner and the contributor wework together and evolve and one of thethings I really worry about it isbecause so many of the uh scheduler uhidea and so many CR propose to customizeof the Kubernetes right so one of theconcern is we don't evolve Kubernetesright so to standardize standardize alot of things standardize the DRstandardized hardware device and I hopewe can also standardize about the likeworkflow maybe even not to that highlevel but the standardize about a groupof the parts describe of the workflowand so uh customer customized of thescheduleuler can work against those uh pgroup instead of we fragment ofkubernetes for different purpose uhfragmentation uh sometimes actually it'sokay because you customize for differentindustry needed but that's also ispotential for the uh reduce of theefficiency right reduce about uh uh theefficient and performance all those kindofand also it's not elastic uh forcustomer perspective So that's kind ofgood the things we need to work togetherand move forward Sachi before you askI'm going to ask people to start liningup Uh if you have a question and the micis right there and if there is nobodywe'll just continue talking So I hopeyou know you have questions and you knowline up So otherwise okay Sachi Yes Umyeah So what we see users uh trendingtowards is wanting to learn more abouttheir AI HPC workload So using differentmonitoring and observability tools andthere's a lot of interest around forexample the NVIDIA DCGM exporter andlooking for uh broader support of thattool maybe across different types ofGPUs and um compute that's introduced byNvidia for example maybe confidentialGPUs and so forth So just I would say umlike plugging in these tools togetherand allowing them to be seamlesslydeployed monitoring observability usingour kind of standards across Prometheusand Graphfana for example and then umeasy integration into all the differentcloud providers Um and I wouldsay yeah generally that's the mainfeature ask Okay if nobody's coming tothe mic I'm going to Good Otherwise Iwas going to call on people like Sergeyhere I'm going to answer one of Don'srequests just right out of the gate Youknow we we just open sourced our runningeyeuler It's called Kai and it does haveyou know fractional GPU Uh so you know II love that you mentioned like thatthere's a lot of schedulers going outthere It's really showing kind of theubiquitous of what people are focused onSo we're excited to be part of you knowthat evolution as well Thank you PleaseYesAh perfect Yes Uh I have a question Wewere focusing very much on AI and MLworkloads which are certainly veryinteresting very hot topic right now Butwhat I am interested is um uh whatchallenges do you see in the context ofKubernetes um for the workloads tointegrate them um beside and beyond uhGPU workloads excellent question Whowants to take this i can take Yeah Uhbesides GPU workloads and I think I'llstart with the batch workloads if thatmakes sense Right So um the challenge uhtraditionally on Hadoop clusters whereyou run a lot of these data processingframeworks as Spark Flink and you knowwhatot um Hadoop clusters used to scalefor thousands and thousands of nodesthat traditionally works well but uhwhen you move to Kubernetes um there isa lot of um orchestration components uhwhich needs to work together to make itwork for large scale workloads startingwith uh distributed workloads like SparkSpark uh requires gang scheduling rightso which is something the Kubernetes uhyou know default scheduleuler does notsupport but then the other scheduulersevolve to make that gang schedulingpossible uh and then optimization of theCPU so distributed computing heavilyrelies on the storage right so how do weactually use the storage the way we usedto use in HDFLS like a common file systemright so we came up with the PVPVC'slike leveraging uh node level storageand also leveraging some distributedfile storage and so on So those are theareas where uh improved on Kubernetesand still there is a lot to be done interms of the scale aspect because wenotice when you run like 30,000 parts or50,000 parts of these workloads andthere is um always a bottlenecks aroundit CD and uh API servers it just becausethe nature of the bursty workloads andputs a lot of stress on that CD SoKubernetes was never designed for thatBut now with this type of batchworkloads and AML workloads and there'sa lot to be done in the space I mean inthe end your data AI stuff has to benear where your existing stuff is rightthat's why we are trying to do thiswithin Kubernetes Uh thanks VA We haveone more questionUh hi my question is more targetedtowards Nvidia but all feel free to chipin Um so I was thinking about what couldbe the future of the scientificcommunity the research community alongwith the open-source community whereperhaps we want as a informal group umall u gather together and uh dosomething with AI but we don't have thebudget to go to a cloud provider andreserve a cluster with some GPU nodes Soperhaps we all uh are gamers and have anNvidia GPU at uh our home How can weleverage the ability for the Kubernetesto work at the edge and take advantageof the processing of one of thecontributors in that community to uhallocate some uh GPU time from their ownmachine and then uh all uh train theiruh open source model Yeah Before uh I'lljust quickly answer one one portionwhich is if you're part of any of theCNCF projects uh CNCF projects haveaccess to cloud credits from everybodyum you know GKE AKS and so espec forexample we run um you know uh sanitychecks sanity tests on all threeproviders for GPUs So uh you know pleasecome join the community you will haveaccess to some of these things but yesuh for individual re researchers andresearch teams you know it might not bethe right solution for it but uh you'rewelcome to talk to the CNCF staff uhthere is various mechanisms where wewhere CNCF makes things available andthen I'll hand it over to Erin for theNvidia side Dim's pretty much answeredmy exactly what I would say Sorry No I Ithink it actually brings up anopportunity for us to to collaboratemore widely We have a lot of researchwith NVIDIA you know across the spectrumUm andso I'll look into it I mean that's all Ican say I'm sure you have researchprograms and scholarship programs andthose kinds of things Yeah we do But I Ithink it's a great point and and that iswhy the CNCF went to the cloud providersgot credits made sure that um academicsand new startups had the ab that theyweren't constrained by getting computeright and and also the cloud providersendup not just Nvidia but also they doneuron and tranium and other things tooSo uh all those things are available tothe community as well Yeah greatquestion Thank you Yeah thank you Uhanyone else yes Sergey Thank youOkay Hello I will ask you to predict thefutureYou beat me to it dude So how do you seeit going forward do you think do youthink that um we will have um likeMicrosoft has like PC in every everyliving room we'll have accelerator inevery cluster or we will have peoplecalling to some well-known APIs and likeonly few companies will provide servingand great models and second when Iassume that you'll answerfirst if you answer first do you thinkthat I know all three clouds buildingtheir own accelerator ators and Nvidiahave accelerator Do you feel we willconverge sometime and then acceleratorswill start doingum we will be differentiated bycapability rather than by vendor nameAbsolutely And and I think this goesactually beyond accelerators I thinkthis you know VU made an excellent pointis that you know the foundations ofcloud computing you know are thethree-legged stool They're computestorage and networking and we haven't Ihadn't had any questions in networkingwhich I'm kind of shocked because a lotof these benchmarks are based on latencyright and it's not just within the GPUso I mean I do think the evolution ofthat is like how do we do that well andhow do we orchestrate that well and Ithink that's why Kubernetes is soimportant here um in tying thatinfrastructure together beyondpredicting the future of that no I'm noteven I'm not even going to touch thatquestion but I'll I'll hand the mic offso everyone can take a take a chance atit Yeah Uh we probably are out of timebut you know G 10 seconds each Oh I Idid I did hear Microsoft mention inthere so I just want to give my twocents Um but as far as the future goes Ido see um users wanting a single pane ofglass across all of those main threecategories So when it comes toobservability using different toolsacross GPU vendors or even wherever theaccelerators are coming from being ableto have more fine grain control um andmonitoring over all of those resourcesum using a single tool or just standcreating standards across um as well asfor networking So for RDMA overInfiniband or um other approaches to uhlike distributed networks then how arewe going to approach that and make thatstandard um for different types ofworkloads not just a IMLyeah sure Uh just add to what Sachi saidUm I think standardization is the mainthing Uh that's now if you look at AItools on Kubernetes there are like lotof tools and customers are struggling tounderstand which one to use it So thingslike anthropic launch this MCP serverright standardization uh something likethat within both training inference uhusing GPUs or maybe using inferentia andtrrenium from accelerators various otheraccelerators there should be astandardized way of accessing theseaccelerators and I think that's wherethat innovation that needs to be done Ithink it will be done in future I guessJohn has the last wordUm yeah I just add uh what my partnersaid here and uh actually from myimaging of the world I even have theproposal since last year and uh I wantto standardize the different layer rightso we already doing that hardware layerand I want to standardize at the uhscheduling layer what kind of the newparameters to efficient of thestandardize those kind of things andalso I want to standardize about theauto scanning and the build of the pivesto Nate those workload node the customerscheduler right so for different type ofthe workload node and different type ofthe batch workload node they give thesignal to the anton infrastructure likea cloud provider and GKE AKS and EKS allthose kind of things they need to scaleuh auto scale of the node expanded ofthe uh and uh and uh and also VPA uh HPappear all those kind of work togetherand the most of the important thingswhat I want in my imagination it is wecan evolve all those batch work node orbatch work node framework together butpreserve a lot of those good thingsespecially of the UX for example earlierhave the data scientist one of the datascientist ask I don't have the answerhow you access the GPU TPU all thosekind of things but I want to preserveyou still can using snerm and snerm APIand the r and the r api you're familiarwith and submit of your jobs and onceyou finish off you can using ry notebookand to do those those testing andexperimental but once it is ready todeploy I hope that deployment it iscommon and standardized and all thosekind of things so we can decouple on thekubernetes layer is the core and I candecouple of the rest of the workload thedetail management based on your lead andthen send it to the kubernetes that'skind of the my vision and also I've beenpush since a year ago and that's justwant to share here we are out of time alovely audience thanks a lot thanks forusing kubernetus thanks for trying touse a stuff on kubernetes and giving usuh enough feedback uh as part of thekubernetes project you know you'll seemore stuff coming out we have 133 comingout and then hopefully there's stuffthat you can switch on and try and yeahplease give us more feedback and uh giveus your use cases Uh we'll hang aroundhere for some time if you have if youwant to come chat with us Um Don thankyou Sachi thank you V thank you Erinthank you2025-04-15 21:58:03.584672 ��#H#��}AjtGSzIvw9jIhi everyone We are today here talkingabout adapting Kubernetes to specializedapplication workloads Uh anduh the this is us Uh who you know let'sstart from the other end Uh Don pleaseuh would you like to introduce yourselfmy name is Danten I'm a softwareengineer from Google I'm also uh thesoftware engineer and the tech leadsince Kubernetes inception Currently I'mleading of the signal the community andless to be here and talking about theKubernetes evolvement for those speciallast of the work nodes like such as theAI machine learning work node and theHPC work nodes SachiHi everyone My name is Sachi Desai I'm aproduct manager on the Azure Kubernetesservice team and my focus area is AI GPUHPC deployments onto AKS and Kubernetesas a whole And I'm involved in theKubernetes AI tool chain operator CNCFsandbox projectV It should be on Yeah it's on Yeah Uhhi everyone Uh my name is Vuntu I'm aprincipal open source specialist essayworking with AWS Uh my key focus areasare uh scaling data and ML workloads onKubernetes Uh especially the Amazon EKSSo um my strengths around building somedata pipelines using Spark Flink andTrino various other data frameworks onKubernetes and also building a MLplatforms on Kubernetes Looking forwardto speak to you allUh welcome Uh my name is Aaron Boyd I'ma senior director and distinguishedengineer at NVIDIA Um super excited tobe here I'm just thrilled to see howmany people are here on a Friday Like 10years ago there might have been seven ofus And you know we just went out forbeer because like let's just go do thatinstead So great to see you all Hopeyou've had a exciting CubeCon I can'twait to hear your questions for thepanel Yeah this is exciting You know wehave the three hyperscalers we haveNvidia What would you want so how manyof you already have AI workloads inproduction can you raise yourhands quite a few And how many are youplanning in the next fewmonths okay so that's much better So uhlet me open up uh a warm-up question toeverybody Uh what is the state of artfor your customers who wants to go firstwhat is the state-of-the-art for uhworkloads and in general and a IML inspecific Go ahead I can go ahead andstart So I think Nvidia has a reallyunique set of customers We service likea ton of internal customers that aredoing amazing a IML research um you knowfrom uh biological models to robotics umyou know is obviously known for itspresence in AI but we're rapidlyexpanding to an external marketunderstanding you know what customersare doing for training and inference umand the workloads are expansive you knowwe have anything from hosting thingslike deep deepseeek to you know using uhnims and blueprints to go create yourown models or augment your models or dofine-tuning and training so I would saywe we service a lot of customers with anextensive diverse set of workloads andyou know as Don will get to eventuallyyou know the hyperscalers are prettyunique in that you know not only are weseeing the intersection betweenKubernetes and AI we're really seeingthe intersection of HPC and Kubernetesand how is that rapidly changing ourlandscape I think it's an exciting timeto be here and I'd love to hear how manyof you um with your workloads and withyour customers are experiencing the samethingYeah sure Um uh at AWS uh you have youknow pretty much uh a lot of customerswho are running their AML workloads onKubernetes uh which is Amazon EKS andand the focus is in every vertical uhmostly in automated financials uh inevery vertical supply chain they'refocusing on it and one of the biggestexample is Amazon retail and whereAmazon focused on building um AI withinthe retail itself self and which isshowing as example for a lot of these uhcustomers at AWS and and they'releveraging a lot of these technologieswithin you know Kubernetes within AmazonEKS and the open source solutions thatcame out uh building their AML platforFO data dog we've beenworking with validating admission policyfor a little over a year now so today wewanted to come here and show you how youcan take some tutorial policies andtransform them into fully adoptedvalidating admission policy at scalequick overview of what we're going totalk about today so we're going to startout with just a little bit about whatKubernetes means at data dog uh thenwe'll transition into data dog's historywith admission control kind of ourmotivations for migrating to validateadmission policy we'll dive into one ofour policies and kind of take it fromwhat you might get out of the box in atutorial into something that we feel uhis good enough to deploy out intoproduction uh next we'll cover a fewmigration uh and monitoring strategiesthat we've used and finally we'll wrapup with some teasers to uh some day twooperations and kind of next stepsso for a little bit of background ofKubernetes here at Data Dog we have over2,000 employees in our engineeringorganization we run multicloud with overa 100 clusters over 10,000 nodes andover a 100,000 pods so every single oneof our workloads runs on Kubernetes soit's not just our infrastructure butalso all of our platforms and all of ourapplications so as such we need agranular and configurable set ofsecurity policies to match ourheterogeneous workloadsand so data dog's kind of history withadmission policy starts around 2020 whenuh pod security policies were beingdeprecated uh out of kubernetes we werelooking for a replacement at the timeand uh pod security and mission uh justwasn't going to be a great fit for us uhso around the time OPA gatekeeper waskind of the go-to default solution um inthe ecosystem at the time and that'sultimately what we beganwith just a quick overview opagatekeeper OPA or open policy agent uhis a graduated CNCF project it aims tobe a generalpurpose policy engine um itprimarily uses the Rego language towrite and define its policies andgatekeeper is the part of the projectwhich kind of wraps the OPA engine intoa validating uh admission web hookso as of Kubernetes version 130 SIG APImachinery has a new uh feature outcalled validating admission policy it'sactually graduated to stable so from thedocumentation validating admissionpolicy offers a declarative inprocessalternative to validating admission webhooks so this means that VIAPvalidations are evaluated directly inthe API server rather than as anadmission web hook like gatekeeperfurthermore VIAP's policies are writtenin the common expression language ratherthan rego so if you'd like a little morebackground on VAP and CEL we invite youto check out Anish and Kevin's CubeConNA 2024 talkso um once Kaitlin and I kind of got ourhands on validating emission policy uhwe did a proof of concept and ultimatelycame to the conclusion that it'd be agreat fit for data dog we kind of cameup with three main categories as tofeatures that we really liked and uhultimately you know the first one isjust moving that from an external webhook to an entry process was going tosave us a lot of operational complexityit's going to reduce our cloud resourcesand cost and finally it was going togive us just a little bit bettersecurity posture by not running thatexternal webhook uh next we really liked the CLlanguage um we see it being usedincreasingly across uh Kubernetes thingslike validating fields and CRDs um it'salso being used internally at data login some developer tooling uh aswell and finally we really liked thenamespace scope parameters thatvalidating emission policy gives um as asecurity engineer it's great it gives mean entire namespaces security posture uhat a single glance and uh it alignedreally well with some of data dog'sexisting kind of name spa namespaceownership and uh arbback boundariesright so once we've decided to migrateto validating emission policy one of thefirst things that we needed to do wasactually take our existing policies andthen translate them into validatingmission policy um so we wanted to shareone of our policies with you today ourcapabilities policy this one's come allthe way from Pcome with us all the wayfrom our pod security policy days um andit very simply just restricts uh theadded capabilities that a container cangive to its security contextso to get VIP up and running we're goingto need a few different components sothe first one is our validatingadmission policy resource so you can seein box one that this policy isvalidating against the update and createrequests of pod resourcesdown below in box two you can see ourvalidation expression written in thecommon express common expressionlanguage so you can see that thisexpression only validates pod resourcesand it will only pass pod resources thatdo not have anything specified in theadd field of any of its containers uhsecuritycontexts so in order to have this policytake into effect we're going to need tobind it to our name spacesso we can do this by using a validatingadmission uh policy bindingresource you can see in box 3 that we'vespecified a namespace selector so thatmeans all of our namespaces will bebound to this policy by default exceptif they explicitly include thisexclusion label so right now we have theability to exclude entire namespacesfrom entire policiesso you can see here an example pod thatis attempting to add a capability withina uh one of its container securitycontexts so this pod would actually bedenied admission by our policy butsometimes we do have legitimate usecases for these capabilities andunfortunately with this current setup wedon't have um it's not adequate enoughfor our needs so right now it's eitherall or nothing pods would either bedenied admission from our policy forusing any capability or they could beexcluded from our policy but allowed touse all capabilitiesso yeah as a security engineer isclearly not a great security posture uhwe'd love to introduce some moreflexibility into our policies beforethey actually go out into production toaddress these legitimate use cases umand the first way we can do that is byadding um a variable to our policy therein box one we've just called it globallyallowed capabilities and this is simplyjust a list of capabilities that wethink might not need additional securityreview before they're used um inworkloads once we've got our variabledefined there in box one we can go aheadand reference it in our CL expressionthere in box twoso to make our expression even moreconcise we have replaced our former useof the has macros with CL optional fieldselection syntax so that question markoperator in conjunction with theor valuefunction seen in box 3 this allows us tosubstitute in a value in the event thata field we're checking doesn't exist soin our case we're substituting in anempty list so this allows us to checkfields without having to check theirexistence every single time and thusmaking our expression a little moreconcise and also in this example anempty list would in fact pass ourvalidation and therefore admit apod so globally allowed variables are agreat start to add flexibility in ourpolicy but we can go a step furthervalidating emission policy has a conceptcalled parameter resources and thisallows you to take basically anyKubernetes resource and inject uh someinformation from it into your policy umthis could be anything from a config mapsomething like a pod disruption budgetat Data Dog we uh made the decision togo ahead and create our own CRD whichpackages all of our variables that weneed for our policies there in box oneyou can see kind of the beginnings ofour CRD and in box two an exampleresource created from the CRD uh whichuh includes our allowedcapabilities once we've got ourparameter resources defined and createdwe need to tell our policy resourceabout it so there in box three we'veadded a param kind field to our policyand we're specifying the API version andkind of our CRDfinally we also need to tell the bindingresource about which parameter to go andlook up uh and so we can add the paramref field there uh to go and look forour parameterresource finally once uh the policy andthe binding both know about ourparameter resource we are then free toadd it in our CL expression there aQt thebottom in box five in our policyresource um ultimately this gives us theflexibility to uh take a single policyand configure it differently across eachnamespace individual uh capabilitiesneeds and their security requirements soright now we have a configurable policybut it's only validating against podresourcesit's perfectly acceptable from asecurity standpoint but it would be abetter user experience if we couldvalidate against higher level podcreating resources likedeployments so that's exactly what we'regoing to do now um you can see in boxone that we've expanded the scope of ourpolicy to also validate the create andupdate requests of these higher levelpod creating resources such asdeployments and crown jobsso you can see down below in box twowe've created a podspec variable so weneed this podspec variable becausetraditionally in VIP we would need anadditional two validating uh validationexpressions to accommodate for thesedifferent resource kinds and this isbecause all these resource kinds havedifferent paths to their podspec so youcan see we've enlarged this expressionin box 3 and we've expanded the scope tothe apps API groups and job resourcescron jobs and pods ingeneral so you can see that we use twoterinary conditionals to extract thepodspec from all these different kindsofresources and finally down below in boxfour we've replaced our formallyhardcoded podspec path with our newpodspec variable so this allows us touse the same single validationexpression to validate against all thesedifferent kinds ofresources so with this posback variablewe've been a little bit more friendly toour users um they you know they'regetting uh admission denials now fromresources they might be more familiarwith uh and might interact more directlywith um but we can go even a stepfurther uh and provide some helpfulerror messages to our users wheneverthey do have an admission denial uh andwe can do that here with what's called amessage expression this is also a CLexpression and it is attached to our uhactual policy validation expression andit's uh rendered and run whenever theassociated validation expression deniesadmission to a workload uh you can seehere in box two we've also added anadditional docs variable which just hasa hyperlink to some further informationabout our policy so when we combine thatwith our message expression there in thebottom right you can see that we are notonly returning the actual containernames which violate our policy but alsowhere our users might find moreinformation about why uh the resourcewas deniedgreat so with all those enhancements wefeel comfortable now deploying ourpolicies out into production so justwant to take a second and kind ofsummarize the enhancements that we'vedone so far uh so at the beginning wehad this barebones kind of inflexiblepolicy um and the first thing we did waswe added this globally allowedcapabilities variable to just give usthat slightly bit more of flexibilityuh we've also reduced the complexity ofour policy by using CL optional types tomake expressions more concisewe've added uh even further flexibilityby introducing namespace parameterresources this allows us to take thatsingle policy and configure itdifferently across each namespace uhdepending on that namespace's securityrequirements and capabilities needswe've introduced a podsp spec variableto allow our policy to validate multiplekinds of pod creating resources within asingle expressionuh and finally we've given our userssome helpful error messages uh totroubleshoot whenever they run intoissues so now we have a production gradepolicy but how are we going to safelyand securely migrate from our old OPgatekeeper setup to our new validatingadmission policy setup so we can firstdo this by utilizing VAP's auditvalidation action so you can see in thetop left our policy binding resource andhow we specify that so in audit VIP willlog all policy violations to a specifiedaudit log and if audit is the onlyvalidation action it won't actually denyrequests at this time so this gives ussome time to check the validity andbehavior of ouRr policiesso you can see in the top rightvalidating admission policies areevaluated before validating admissionweb hooks which is where opa gatekeeperruns so when we're checking we can uh wewill check the validity of our policiesby first seeing all policy violationslogged in vap's audit and thensubsequently um denied admission by opagatekeeper consequently we should alsosee that every single um pod admissiondenial from OB Gatekeeper has a matchingum policy violation in our VIP audit logso to check that every single one of ourpolicies that we've migrated from OpaGatekeeper to VIP are correct we've madesure to uh create some unit andendto-end tests to check every singleone of our policies and make sure thatthey have matching behaviors so now thatwe're sure that our new policies arecorrect how are we going to starttelling VIP to actually start denyingour requestsso once we're confident that VIAP uh allof our policies are ready to go we canthen add the deny configuration there inthe top left to our binding resource andagain because validating admissionpolicy runs before validation web hooksin our admission control chain we willnow start seeing uh admission denialscome from validation admission policybefore they ever get to gatekeeper wecan again confirm that our policies arecorrect and match our old gatekeeperpolicies by confirming we no longer seeany admission denials within Gatekeeperitselfonce we're confident that this is thecase we're free to go ahead anddecommission uh gatekeeper and uhcomplete our migration um but now thatwe're done and validating a missionpolicy is now in the critical path weneed to make sure that it's runningreliably andsmoothly so um a few ways to do thatfirst of all we can take a look at somemetrics that Kubernetes offers us um nowthat we've moved our emission controlfrom an external web hook into our APIserver the first thing we want to do ismake sure that the API server is runninghealthy uh running with good health andreliably um and so two um metrics thatwe can use to do that are the totalnumber of policy checks and the durationthose uh checks are taking so to furthermonitor that our validation expressionsaren't causing havoc to our API serverswe can look in at an additional twometrics provided to us by the API serverso these two at the bottom right concernCL compilation and evaluationduration so the caveat to these metricsis that they are aggregates of all CLexpressions across your API server sothis doesn't just include theexpressions in your policies but alsoexpressions used for CRD validation umeven though these are aggregates they dogive you a general sense of how CL ingeneral is affecting your API server butif you did want to pinpoint exactlywhich expressions may be causing youtrouble there are some greattroubleshooting tools out there um suchas this playground so as you can seehere we've input our validatingadmission policy and some test requestobjects so this tool allows us to kindof check the um the validity of ourpolicy and of ourexpressions on the bottom right side ofour playground you can see that it has avalidation cost associated with it sothe API server has a concept called thevalidation cost budget um to protectitself from any runaway CL expressionsso down below you can see there's astatic estimated cost limit which is thelargest allowed static CL cost on a perexpression basis and this is hardcodedto 10million so as you can see on ourplayground our capabilities policy thatwe showed today has costs associatedwith each one of its variables and abovethose you can see a total cost of thepolicy so this total cost is a sum ofall the costs of the variables and allof the expressions and messageexpressions so you can see our totalcost of our policy today is relativelylow in comparison to the hard-coded 10million but validation cost is stillsomething important to keep in mind umthis is because your costs canexponentially increase especially withthe use of nested and chain macros so umvalidation cost and validation costbudget is still important to keep inmind to protect the reliability oSf yourAPI serverso we kind of come to the end of thetopics we wanted to cover today so Ijust wanted to kind of highlight andsummarize what we've discussed uh so farso at the beginning we talked a littlebit about data dog's kind of historywith uh admission policy and some of ourmotivations for migrating to validationadmission policy namely moving from thatexternal web hook to an entry processnext we stepped you through one of ouruh policies to show you how we kind oftook an outofthe-box tutorial and put itand evolved it to something that we feelcomfortable deploying out intoproduction namely uh how we usedvariables to simplify our configurationhow we added parameters to uh increaseflexibility into our policy and finallyhow we're helping our users troubleshootwhenever they run intoissues uh finally uh after uh after themigration we showed you a few uhmonitoring and metrics to keep an eye onto ensure your API server health andreliabilityso uh this might seem like a lot butwe've still got plenty to talk aboutunfortunately we won't be able to coverall of this uh in our talk but justwanted to leave you with some day twooperations um and some next steps thatthat we've been thinking about and umsome ideas that you might be able totake back um first of all in our policywe were just talking about our main uhcontainers in our pods and pod creatingresources however both ephemeral andinit containers can also addcapabilities to their security contextand so expanding the scope of ourpolicies to cover these is somethingthat we have to cover as well um furtherum adding flexibility into our policiesuh at data dog we inject sidecars intomany of our workloads um some of thosesidecars might need uh additionalcapabilities and it would be kind of alot of manual toil to go in and addthose exclusions one by one uh so we'veadded an additional parameter oradditional variable into our policieswhich basically maps a given containerimage name and an allowed list ofcapabilities to automatically allowadmission to these uh sidecarsuh we've been really happy using the EDframework to uh to run our end toendtests like Caitlyn mentioned to ensurewe're not regressing uh any of ourpolicies and you know just to doublecheck that what we're running locally uhand developing locally is what'sactually happening out on ourclusters uh I'm really excited about theself-service platform that we aredeveloping internally at data dog asAPIdriven way for our end users torequest exclusions basically thecontents of those parameter resources umand be able to configure that and beable to request it uh themselves whilegetting an a security review from us umand finally we've got some really coolideas about how to prune um and identifyand prune unneeded policy exclusionsover time uh workloads might change uhnamespaces might be abandoned andleaving those policy exclusions in placejust creates a little bit of a securityrisk and so we've got some ideas on howto identify those and ultimately uh getrid of those so that all being said uhwe're wrapping up if you would love touh if you'd love to see more of this uhplease leave us a review uh we'd love topresent more on these topics later um weuh are both available on LinkedIn ifyou'd like to connect and talk with usum and we'd be happy to take anyquestions thank you very much thank youis this on yeah hello uh thanks for yourtalk was great um I have two questionsum how do you reference other resourcesin the cluster in your uh policies doyou just load those in as parameters oris there a better way you referring tothe parameter resources yes exactly likeif you want to write a cell policy umthat uh has some dependency on a thevalue in another resource on the clusterum can you do that somehow right um uhwe don't have the slides back up butyeah exactly if you take a look at theslide uh referring to the policyresources here you can specify the APIversion and kind of the resource thatyou want to look at and then um actuallyadd the the path the the spec path thatyou want there in the CL expression um Ididn't mention it uh explicitly butthere in box four there's a line tobasically configure your policy what todo if that parameter doesn't exist inyour namespace or is somehow uh yeahmissing does that answer your questionokay so yes parameters are the right wayto go for that okay um wait what was theother question um you mentioned theaudit log um to validate if the policiesare doing the correct thing did youwrite a tool or a script to do that andis that open source uh yeah we kind ofjust parse our audit logs so for thisaudit log you can specify the path tothe log so we just parse it and thenalso parse our uh open gatekeeper logsand just check that they have matchingbehavior yeah it was mostly just somesome in-house scripting but um there arethere's an advanced feature invalidating admission policy to addspecific audit messages uh and tags andso that could uh be an easier way tokind of filter through um some of theaudit logs as well okay thank youhi there um anything that you lost frommoving from the web hooks to the theother one the admission policy um one ofthe things that uh was a bit of astruggle at first was just wrapping ourhead around CL logic um it's quite a uhit's quite a declarative uh languagewhere you know Rego is a little bit moreof a procedural language so just gettingour head around that paradigm shift wasa little bit of a challenge um and alsoum you not not necessarily having umuserdefined functions within RCLexpressions can you know it wassomething we were missing a little bitfrom from the reggo languagethank youhi uh first of all great talk uh we alsoquite heavily adopted um validatingemission policies uh quite a while agoand we did a very similar approach uhbut we started a bit earlier when theywere way less powerful and my questionwould be so I feel like testing this endtest is quite cumbersome or it's quiteresource intensive and we like donightly builds on all regions all thetime and we when we introduced themplayed around with trying to write unittests for validating admission policiesthemselves in some way but veryungracefully uh failed and I just wantedto ask if you uh touched into thatanyways and maybe cuz honestly I Ilearned a lot today we we're going todefinitely go home and rewrite ourpolicies uh uh if there's somethinghappened in this space that made thateasier or if there's any movement intothat direction is specifically about theunit test yeah specifically uh so I meanit's it's quite simple right it's uhit's basically defining uh a bunch of uhpod resources some that some that passthe uh policies some that fail and uhmore or less just like rendering andapplying the uh policies into a kindcluster uh and just trying to apply themvia dry run uh cubectl andthat's what we do we just had for uhopen policy agent policy previously likeyou can test reggo itself in a unit testand we I was just wondering if youplayed around like doing it in linewithout the need for for a kind clusteror any cluster or something yeah that umthat is a bit of a challenge and kind ofwhat Caitlin was mentioning with the CLplayground that's kind of when we'restarting to develop CL from scratch fromnothing it's a lot of playing around uhin the CL playground okay very similarexperienceuh hi uh from an application perspectivethat you want to kind of stay in linewith a policy I kind of want to be ableto use policies like this as a lint uhis it possible to test my YAML againstthe policy uh at the client side orwould I have to basically set up a testcluster apply the policies and then tryto apply my YAMLuh yeah that's a challenge you knowgetting kind of these warnings toapplication developers earlier in thedevelopment cycle um it's something wedon't have a great answer for yet butcertainly are feeling some of the samepain and yeah if you've got any ideas behappy to to to maybe brainstorm laterokay uh thank you very much anywaythat's a really good talk thank youokay thanks very much everyone2025-04-15 21:58:04.199472 [ [��J#��WAlj_qgsb4h38good afternoon KubeCon how are wedoing so before we get started todayKayla's going to take a quick selfiewith the audience that's all right readyokay on three flex if you're up for ittwo threewoo i think maybe a little more woo woowoo awesome all right so I would nothave worn this outfit if I knew how manypeople were going to show upbut I'm here and I'm in it and I'm sohappy to see you all thank you so muchfor taking the time on the last day ofKubeCon in the afternoon to come to ourtalk it means the world to us so whatare we talking about i assume this iswhy you're here it's the title it'sfunny um and that's what we're going togo over this is an entrylevel talk umgoing over the risks of containers ifyou just make them and you don'tnecessarily know what you're making sowe will kick off with our introductionwe are your coaches my name is KaynEdwards i am a security engineer atOzero by Octa and also a co-chair ofKubernetes SIG Security a CNCFambassador a very enthusiastic dog momand a Sander fan i've got my Bridgeportearrings on if that's a reference youappreciate then uh we should be friendsand I will pass it over to Dan tointroduce himself so my name is DanielMuU��3I#��AOJ1WoQjYAJowell helloCubeCon thanks for saving the best talkforlast this is evaporating Kubernetessecurity risk adopting validationadmission policy at scale uh my name isJordan Connard i'm a security engineerhere at Data Dog um and I work on one ofour compute teams helping secure ourKubernetes control planes and I'mKaitlin Lee a software engineer also onthe comput team here at Data Dog um Istarted on the control plane team andnow I focus on our workload autoscalingum but here atNVrphy i am also a senior securityengineer at Oslo by Octa i am a lover ofphotography coffee and outdooradventures and also a massive sounderfanokay so what are we going to go throughtoday we will start off going over thelingo we're going to go over the whatthe how and the TLDDR we all know thatthe Kubernetes landscape and thesoftware landscape in general is full ofoverloaded terms so let's break themdown oops and make sure that we knowwhat we're talking about next we'regoing to go into the before we're goingto meet this soft and vulnerablecontainer get to know what they've gotgoing on under the hood finally we'llget into what you're all here for whichis the montage we're going to take thatcontainer from squishy to strong uh andthen we'll finish off with coach'scorner we're going to offer some tips totake forward to your company to yourpersonal home project so that you cansecure your containers and haverocksolid showworthycontainers finally um I'll start withthe lingo as I said we want to make surethat we're all jumping into thispresentation with a shared understandingif I say any words I want to make surethat you are thinking of the same thingthat I am thinking of starting offbecause this is KubeCon we'll go throughthe Kubernetes components at the verytop we've got the cluster that's whereeverything lives that's right there acollection of worker nodes then we'vegot the node these are your workers theycan be virtual or in the cloud um andthen we've got the pod this is thesmallest deployable uh resource inKubernetes and this is where yourapplications live and then finally we'vegot the container which is a lightweightunit that contains all the componentsneeded to run an application such as theapplication code runtime libraries andconfigurationand a few terms that we're going to beusing that are kind of containersecurity specific so starting with baseimages images like onions are made oflayers and almost all are based on someanother image uh we have ourdependencies so these can be operatingsystem or your programming languagepackages that are used by an app to dowhat it does and these can be direct orindirect and it's dependencies all theway down dependencies ooh supply chainis so hot right now i know right we'vegot vulnerabilities and exploits thesecan sometimes get used interchangeablybut when we say vulnerabilities we meanweaknesses in software that can be usedto do something that we don't want tohappen and exploits are essentiallytools to take advantage of them andlastly scanners so these are just aspecific type of tool that are designedto examine your container in some way todetect things like insecureconfiguration things that look like theymight be secret vulnerable packages thatsort of thingand so why does this matter so I did abit of a lit review in preparation forthis talk and found that uh a study from2021 found that nearly half of theofficial Docker Hub library images thatthey studied contained at least onevulnerability that had a proof ofconcept exploit so by that I mean anexploit that can demonstrate that thisthing is actually usable but has notbeen built to do something activelymalicious like xfilling all of yourcredentials or starting up a cryptominer and in a different study from 2023they found that about 30% of popularDocker Hub images were based on a parentimage that was outdated by over a monthby the time the child was built and 70%of the latest versions of those imageswere built from an outdated parent witha median stalness of more than 5 and ahalf monthsso probably we should just use latestall the timeabout that so in yet another study from2020 so all of these uh have got linksat the bottom for sources that will beon shed after the show uh there'snothing actually all that special aboutlatest it's only created if you don'tspecify a tag when you're building theimage and it only gets updated if themaintainer of that image remembers topoint latest at the version that they'vejust built so when you track latestyou're relying on someone else todictate which image version you'reactually running so this study fWoundthat nearly 12% of images that theylooked at had a latest tag that doesn'tpoint to the latest version and about 4%had a gap uh between latest and actuallatest of more than five versions sothere there is kind of a problemhere uh in yet more data from 2023uh secrets turns out they show upeverywhere so of the roughly 340,000images that were studied from bothDocker Hub and quite a number ofallegedly private but unsecuredregistries on the public internet theyfound uh more than 28,000 of thoseimages had at least one valid secret andthey found a total of55,000 secrets so most of these were TLSand SSH private keys couple thousandcloud API keys hello crypto mining and acouple thousand compromisedcertificates and those secrets they stayin use so a couple hundred of thosesearchs from public CAS and nearly 5,000of them from private CAS were stillvalid when the image was pulled for thestudy and moreworryingly like 275,000 hosts were stillactively using these keys to secureservices and these were like HTTPSservers SSH or posgress LDAP MQTTwof those are some big numbersyou haven't seen anything yet how abouta million 1 million exposed containercomponents on the public internet datafrom 2024 founduh more than a million of these exposedcomponents and by this they meant RESTAPIs like the Kublet the Kubernetes APIserver image registries Docker runtimesockets or SCDclusters of those the majority wererunning outdated versions and usingdefault configs which is a recipe fordisaster when it comes tovulnerabilities about 3/4 of a millionof them were coup API servers a quarterof a million cublets and these arethings that by their design are supposedto give you remote code execution as aservice and 86% of them were sorry 86%of the exposed cube API servers allowedanonymous off and this isbecause from version 1.6 six the defaulthas been that if you have anauthorization mode other than alwaysallow configured then anonymous or isalso enabled by default you can turnthis off but because Kubernetes isdeclarative it's going to take someliberties if you've not specifiedcertain pieces of configuration itdoesn't default to the most secureoption instead it's going to sometimespick a more usable option so that you'renot pulling your hair out trying to workout why something isn't working uh so weneed to be very intentional aboutensuring we secure our applicationsproperly now of these million componentsthat they found about a thousand of themare on public IP block lists from uhthreat intel providers which means we'vegot a pretty good indication that threatactors are taking advantage of theseservices to do badthings so now Kayn is going to introduceour lucky example container so if wecould have a drum roll pleasemore more more more comeon all right this container lovessharing secrets it takes candy fromstrangers gives its one-time passwordover the phone has sent Amazon giftcards to princes in far away lands andhas the same password on all itsaccounts kept in plain text on its phoneit's soft it's fuzzy and it's going tocompromise your applications introducingSquishy Bin please don't hackso let's take a quick look at what makesthis cutie so squishy starting with ourDocker file now the application thatwe've containerized here doesn't reallymatter in this case it's a Flask APIthat returns a string but we're tryingto demonstrate how we've containerizedit and how we're deploying it so we'restarting from a fairly slim Python baseimage using Alpine from the Docker Hublibrary we're setting up our work datewe're copying across a pip config sothat we've got access to our privateregistry installing our dependencies getrid of that pip config copy across therest of our app source code set up ourentry point and we're off to the racesit's pretty small can't be much wrongwith thatright oh it'sme right yeah so it's a little littlebaby application it's not doing too muchwe're going to look right now again atthe Kubernetes deployment um here youcan see we've got a pretty standarddeployment it's got a namespace labelgreat job um it's got all kinds of basicsettings and this coXuld have come fromanywhere this could come from theKubernetes.io docs this could come fromyour favorite ai.com whatever it is thisis a really normal configuration andthere's nothing wrong with usingit um before we get our little friendinto fighting form we need to get anidea of where we're starting so for thesake of the metaphor this is equivalentto getting a read on our bodycomposition um or doing those awfulbefore pictures doing our one rep maxesum Dan's going to take us through theresults and give us a good look at wherethis squishy bin is starting thanksKaylin so first we're using a tool fromTrivy a tool called Trivy which is anopen source tool from Aqua Security andwe're using this to scan our images forany known vulnerabilities ormisconfigurationsand as we can see here we have got acouple of vons that we need to take careof including a pretty high severity onethere that we will come back to in justa minute we are also going to point atool called Truffle Hog which is anotheropen source tool from Truffle Securityto scan our images for any secrets thatmight be lurking there it can find allkinds of secrets in all kinds oflocations even though we're just lookingat an image today but I'm not sure if itsupports signal group chats just yetand as you can see here so much for oursuper secret password e right there inour pipconfig so trivy can also be deployed asan operator to your cluster and in thiscase it is scanning for the clustercomponents itself the workloads thatyou're running on it the infrastructurethat sits underneath your cluster uh thearbback configuration and it createscustom resources incd which means youcan consume this programmatically in anyway you choose so I realize this is alittle bit hard to see but this is asort of overview of some of the findingsthat we've got from the triv configreport uh and in this case we've gotum a mutable root file system we've gotthings like no security configsconfigured we can escalate our ownprivileges and we're running as the rootuser we've got an ID for the check we'vegot a little description of what's goingon here and a message giving you a bitmore details of exactly what you shoulddo to try and remediate thisuhand these are just some of those zoomedin a bit so you can read them a littlebit easier so that v that we looked atearlier who doesn't love a CVSs 8.8remote code exec in their packagemanagement library so CVSS for anyonewho's not familiar is the commonvulnerability scoring system it takes anumber of factors about a vulnerabilitysuch as what network conditions it needswhether it compromises confidentialityintegrity or availability that sort ofthing and gives you a score from 0 to 10to give you a rough indication of howsevere a problem is so our demo app ispretty small and has a nice shallowdependency tree but who here remembersthe log for Shell Vong and the pain ofresponding to that incidentso time for our montage let's lift somewater jugs and get this container in abetter place starting with some quickwins we're going to patch almost all thethings so taking recommendations fromany scanners that you've used andupdating your image version but somethings to bear in mind when you'rerunning these updates uh if the thingthat you're updating uh aderes to sembicversioning standards that's like majorminor patch um that can be a really goodway of trying to get an indication ifwhat you're updating to is going to havebreaking changes in it if that majorversion number's changed uh also worthhaving a look at release notes for anyof your sort of critical dependenciesbecause they can al some yeah they canalso highlight when things that aren'tnecessarily breaking changes but willaffect you are going to be in therelease you're updating too uh make sureyou don't accidentally run alpha in produm if you do you're a braver person thanI am and uh just because it should justwork please test your application ican't tell you how many things I've beentold should just work that didn't uh bigshout out here to endoflife.date whichis a really great uh online referencefor a bunch of projects thYeir supportedversions and like support dates for whenthose versions are going out of supportuh as well as renovate from mends whichis an open source tool that will helpautomate a whole bunch of differenttypes of dependency updates um not justsecurity rel relevant ones uhyeah and fun fun shout out I actuallymucked this up when we were preparingthe demo and went ahead and pulled aalpha version that was not latest sojust because you're in the field doesn'tmean that you're just going to do thingscorrectly all the time i think more eventhan bad actors we need to watch out foruh not having paved paths and guardrails because we all make mistakes we'reall tired overworked and uhexcitable so let's get into some of thechanges we made to our Kubernetescomponents to ensure that Squishy Bingets nice and secure um the first thingwe did is another quick win we justchucked Squishy Bin in its own namespacethis is a really common strategy inKubernetes to ensure that thesemulti-tenant applications that aresharing resource space are not actuallyaccessible to one another so this allowsus to have bespoke security policies foreach of the workloads that we're runningand you can see right here it's assimple as adding a line to thedeployment we can see this in actionright away with our next step that wetook for security in the namespace wecan go ahead and add um some labelsthrough the pod security admissioncontroller this is a native Kubernetesum feature it's been available since1.25 you can also configure the securityadmission controller um in the YAML weall love YAML and want to be there allday i love YAML um but we don't havetime for everything today so I've justdone a really basic example um you cansee the top labels there say enforce andI think that's pretty self-explanatorythat's going to go ahead and enforce theum pod security level and this isbaseline this is just kind of your okayyou've met the minimum securityrequirements and then below we have twoother levels um set to audit and warnand so warn is going to warn the userwhen they create uh resources and auditis going to add it to the event logs sowe'll go ahead and take a peek at whatthis looks like um so first we will lookat the namespace uh itself we see thoselabels we've already talked about themum so there they are and then we willmove on to looking at our poddefinition for these examples I've madejust some kind of risky business podsbad pod you can see it's got a securitycontext but it's commented out it's notreally doing much lifting at the momentand uh the security context that is setthere is unconfined which is the mostrisky one but it's not it's commentedout so we'll go ahead we'll create thepod we get a warning um because it hasno definedsecurity context um that it wouldviolate the more advanced restrictionsso we'll delete thatpod and then we will go ahead anduncomment that security context i thinkmaybe we'll see could be a surprise ohyeah i knowme and then we will go back in and seeif we can create the potagain ah no luck could not create it umso this is really nice if you have uhvery risky workloads and you want toensure that your security policies arebeing met you would do something likethis that enforcesuh if you want to get even further youcan use tools like gatekeeper um or chaoor kuborton to do the same thing theexamples I'm using are gatekeeper um butthere's lots more uh that you can useall of these examples come from thegatekeeper policy library and there areloads there that um would probably workin your job and also again for your homelab uh and so the two main types ofadmission controller are validating andmutating we'll start with validating uhthis just means that it's going toperform a check when there's a change ona resource and validate that it meetswhatever policies that you have setwe've started out with a very basic onewith the SEC comp uh secure computingmode profile runtime default is kind ofagain the default basic uh profile andthen localhost would mean that you'regrabbing the local host profile um andit has to be available on the node diskwe've alloweZd all local host filesbecause we don't give a hoot and nowagain we will go ahead and watch this inaction and so I will apply the policy uhthis is all made to run like all of theKubernetes we've got a YAML file we'regoing to apply it the constraint iscreated and we will take a look at it ohno we'll take a look at our bad pod samebad pod as before it's got that uhunconfined profile which means no holdsbarred it can do whatever it wantswithin reason uh and then we will goahead and try and create it you can seethat it is denied because of the sec umpolicy that we just added now we will goand we'll add this exemption and so Iwanted to highlight that you're able tomake exemptions because not all yourworkloads have the same securityrequirements and it's very likely thatyou're going to have some things thatneed a little bit more uh of apermissive permissive security policyand this allows us to secure by defaultand start very secure and then open upwhere we need rather than lock up andthen secure where we can and you can seenow that we were able to successfullycreate the pod woo uh next we will lookat the mutating so uh same sort of thingwe have a policy and in this case thepolicy will make changes based on a setof criteria this is a very simple oneit's going to look and see what the uhset comp profile is set to if it doesnot exist it will set it to runtimedefault and we will watch that in actionas well and this is really powerfulbecause it means that you are takingsome of the burden off your developersand ensuring that your minimum securityrequirements are being met we're goingto look at the pods we made the pod itwas able to be created now we'll deleteor we'll look it into the details of itand see that it does not have a set compprofileset which is what we expect now we'lldelete thepod and we will adduh ourpolicy applied our policy so now itshould pick up that the pod does nothave that sec profile set and set itmaybe I'm just kidding i know what'sgoing to happen i didthis and then we scroll up and there itis hello magic very very very powerfuland so we've already touched on updatingthe image that we're using but we'vealso made some other changes to ourDocker file starting with that pesky pipconfig so instead of copying a file intoour container build and then trying totake it out and leaving it behind inanother layer we're instead takingadvantage of the ability to mount asecret at build time and it will then beavailable at the path that we werepreviously copying it to but not getadded to a layer in the final imagewe've also added a user so that we areno longer running as root and we've setup a health check so that our containerruntime can tell if we are still aliveor not uh it's not shown here but aquick shout out to uh the docker ignorefile uh to make sure that you don'taccidentally copy your virtualenvironment or node modules if you'regoing to be installing them in thecontainer build anyway or moreimportantly things like an envr filewhich some of us use to manage uhdifferent environment variables andproject settings as we're moving arounddifferent projects on our disc and thoseusually have secrets in them uh and thatuh trusted host stuff you can ignorethat for uh general use uh that's justan artifact of how we've set up our demoenvironment it's just telling thecontainer runtime to just ignore any TLSwarnings while we're doing ourbuild uh we are also making someimprovements to our deployment so we arenow pinning our image to a specificdigest to avoid any risks associatedwith mutable tags so if a threat actorsomehow got push access to your imageregistry they would be able topotentially push poison versions ofimages overwriting existing tags andthen when you spin up your workloadyou're pulling not the image you thinkit is by pinning to a SH 256 digest ifanything at all in that image changes weget a completely different digest andour deployment will fail we also makeuse of very high user and group ids toavoid any overlap with the host name uhhost range we are running on a readonlyroot file system we're blocking knownprives we [are dropping capabilitieswhere we can we're applying a set compprofile and we're applying some resourcelimits to prevent things like redos orcryptocurrency miner from exhausting ourcluster resources and generally wastingelectricity so introducing catchingcontainer this container backgroundchecks its friends and family it has aUB key in a safety deposit box in caseit's death it doesn't hone a single IoTdevice and it sleeps in a Faraday cageit's tough it's unapproachable and wemight have actually compromisedreliability to get it this rockolid meetcatchingcontainer so time for a before and aftereveryone loves a before and after and bythis we we still mean scans and we'regoing to see how squishy bin has madeits transition to catching container sobefore we had a bunch of vulnerabilitiesthat we needed to take care of and nowno more known vulnerable packages areinstalled[Applause]wehave a CIS benchmark of Kate's clusterand our workload deployed on it and nowit is clean it is fresh it is allpassing now this is just our name spaceyour control plane or a multi-tenantcluster is probably going to haverequirements that mean that you're notgoing to perfectly be able to meet everyrequirement on a benchmark so you'regoing to need to do some risk analysisand some bespoke enforcement based onthe workloads that you are trying to runin your environments but don't letperfect be the enemy of good hereapplying some of these controls is stillgoing to put you in a much better placethan doing nothing at allawesome so now we are going to hop intothe coach's corner for those of you thatwork in security um how do you do thison the scale that you need to acrossmultiple teams multiple stacks multipleneeds um we're going to give you sometips to ensure that your containersecurity program gets you the resultsyou've always dreamed of so tip numberone um just like when you're trying toget that rock solid baud you want tomake sure that the healthy choices arethe easy choices and in the case ofcontainer security or any security forthat matter uh we want to make sure thatthe secure choices are the defaultchoices um in practice we being securityengineers and security teams don't havethe time to go through every singleDocker file or every Kubernetes YAML umor run scanners in our terminals like wejust did uh what we really need to do assecurity professionals is ensure that weare paving the roads and putting upguard rails we should have a privateregistry of blessed images but alsoensure that we're tr we're making surethat we want and are getting the imagesthat we depend on uh we should ensurethat we're patching and upgrading ourimages regularly that we're enforcingthe use of slim and minimal imagesensuring that dependencies are onlyincluded as needed and they're vettedwhen they are included um finally havescanning in your pipeline and alert whencontainers need to be patched andupgraded don't trust that a securityreview on the creation of a service isgoing to be reflective of what thatservice looks like in a week threemonths threeyears next uh it's important to makesure that you understand the use casesand goals of your customers and by thisI mean the developers that you work withensuring that you are fully aware of theprocesses that you're trying to secureand reducing friction in any way you canmeeting teams where they are and addingon to any existing workflows they mightalready have by communicating early andoften and involving users in the processyou can avoid the nightmare scenario ofdeploying a control and then disruptingeverybody's ability to do their jobthere is no faster way to upset acompany than doing that and a reallygreat way of doing this is by runningthreat modeling sessions with your teamsit gives you a great opportunity toreally understand what these teams arebuilding and at the same time it givesteams an opportunity to think like thebaddies which can be both enjoyable andquite eye opening and building off ofthreat modeling um that's a great way toeducate teams and encourage buyin if youtake time to learn about theirapplication which presumably you will dowhen you're building a threat modelgetting them to data flow they'regetting them to work through the dataflow of their application um you'regoing to learn a lot about theirapplication and then when you presentthe threat model you're able to offerrelevant and valuable insight andhopefully they're going to be reallyenthusiastic to action those itemsbecause you came up with them togetherwe all know that the most secureapplication is one that's turned off buthow could you use the turned offapplication and when we make assumptionsabout developers and the teams and thetools they use not only do we damagerelationships but we also risk impactingour customers which at the end of theday none of us want to do we want tocreate a culture at our companies thatvalues security we're both really luckythat we work at a security company andsecurity is in our blood in our bonesbut that's not always true you mightbuild something and not think thatsecurity is necessarily the priority andwhen security is successful it's silentyou never hear "Oh man it's been 6months and we haven't been hacked butyou hear the day that there's acompromise." Everyone knows about it umso we need to make people know thatwe're doing stuff we're doing stuff allthe time please keep employing us weneed it um and then a way to do this ina at a in a mature level is to create asecurity advocate or champion program umthis can be as formal as you want it canbe pretty casual but you bring membersof various teams in educate them aboutsecurity and then you have like thismold your your person on the insidewho's going to be able to represent thesecurity interests and also come to youand be like "Hey we're doing this i'dlove to hear if you're going to be madat us in six months." um which helps usto stop showing up at launch day beinglike "Hey no sorry." Um and gets gets usinvolved early um run securityeducational sessions do cyberc monthit's so fun every single person lovesthe CTF you just have to make sure theyknow they're welcome to try um and thencompletely unrelated to the talk AnnieMurphy was on my train here in Canadawhich is very cool okayand so we're going to start wrapping upnow so these are some of the tools andresources that we found really usefulduring this uh big shout out again toend of life.date that has made that Iuse that like all the time at work uhshol which is a CLI tool for scanningfor end of life software or softwarethat's about to become end of life overgatekeeper triva truffle hog lovetruffle hog uh and helm which made it soeasy to get started with the trivioperator in our cluster saved us so muchtime and uh docker and kubernetes dockerthe docker desktop integration withkubernetes gave us a really easy demoenvironment but there are plenty ofother tools kerno confidentialcontainersand all of our so the slides are alreadyavailable on sheds and we've got some ofthe resources in our repo uh but we'llbe adding some more when we have achance to recover kayen has to get backto Canada personally I'm probably goingto need a couple days of hibernation torecover uh but everything will be onshed and on GitHub perfect so if you'rereally amped up about security which Ihope you are i am i'm pumped um join theconversation you don't have to just dosecurity at your company you don't haveto do security at all to help withsecurity you can join me and myco-chairs at six security that is thegoose QR code where you could come tothe Kubernetes Slack and talk to usabout the security come to our meetingshang out and help improve the securitylandscape you could do it at the CNCFlevel through tag security but we wouldreally love to have you honestly wewould love to hear your opinions wewould love to hear your problems pleasecome join us and and make securitybetter no make Kubernetes more securesecurity is the best so thank you somuch for attending this is the QR codefor feedback if you've got commentsconcerns we have a tiny little bit oftime for questions maybe if you havethem otherwise you can find us i thinkwill be easy to spot and we'll be2025-04-15 21:58:04.808537] private keys um tokens thatgive you access to a system and give youaccess to the system as a regular userso we are looking and we are going afterthis kind of secretstoday so first uh let's have a look atthe the content of a docker image um somy goal today is not to explain what isdocker is to explain what I will dolater on when scanning secrets so firstwe start with a docker file and thedocker build command if you are using abuild kit so we build image here thename is my namejaladon/cubecon and then uh we have thetag latest and you push that with dockerpush into a registry one of them isdocker hub and your uh image is storedwithin the registry into a repository soit's a bit like a g repository stuffstoring images and the target today isthe pull operation and when you get animage uh without anyauthentication so let's take one morestep and have a look at what is inside adocker image so we start with the dockerfile here really small few instructionswe have from copy run expose and of andfinally a cmd instruction and here whatyou can see that some instructions havewhat I call a side effect on the finalimage they create a layer so from willbe a layer its name is SH256 sorry I'm going backwards sh 256 colCD27 and this name is indeed the ash ofthe content so if two images they sharethe same from the same base image theywill also have the same uh layer withthe same name layers they're also calledcalled blobs and a blob is just a tarblewhich is usually compressed with gzipcopy will create another layer run willcreate another layer so we have threelayers and finally there is some strangeoperation and build kit is doing that ittakes a docker file plain text and dosome transformation some ones areunfortunate for secrets and then it willcreate another blob with a JSON configso for in an image we have layers configand on top of that we have another JSONdocument which is called the manifestwhich is let's call it a list of layersplus the config file if you want to playuh with an image content I wouldrecommend to install in a tool calledScopio scopio is really great you canget an image and get the blobs on yourdisk and start playing with the contentsso basically scanning docker images forsecrets is as easy as scanning JSONfiles and files within tarbles um and wehave a tool for that at gorian it'scalled gg shields you can install it uhon your laptop and JG shield will scandocker images in that example I justforgot u key in my python code so herewe have jg shield saying that okay Ifound a key and it's invalid thisvalidity thing is really important forattackers because they are only lookingat secret that they can use so what is avalidity check so most of the time youtake the secrets you test it against aknown endpoint and then you check forthe HTTP status code so most of the timedepending on DAP it could be a bit morecomplex but it works uh it's it worksfine on the right here we have thevalidity checks for two cases on top andin green a valid GitHub token and theanswer is 300 the token is valid and ifit's invalid we got 401 so the invaliduh token indeed there's a third case inthe case you have for example a GitHubenterprise which is only accessible uhwith a VPN so it's not possible to testit for a public IP address so in thatcase the status that we have in ourapplication is cannot check that's kindof useful because it mean the secretmight be valid it's not possible to toto know what todo um next uh so we scanned one image sowhat should we do to scan moreimages and I'm on time bitfast so uh the methodology is thefollowing we have four simple steps iwill skip these slides why because thesesteps they map to um Docker registry APIendpoints so first when you want to scanfor secrets in a docker registry youwill look for the um repositories you'lluse the catalog endpoint that will giveyou the list of all of therepositories then for each repositoryyou will get the list of tags that's thesecond step third step for each tag andrepository you will get the manifestwhich is this JSON file describing thecontent of the image and then you willget from^ the manifest the blob for eachlayer and the blob for the config andwhen you have the blobs you can scan forsecrets when you have the blobs you alsohave other things than secrets obviouslyyou have the source code of theapplication or the compile version ofthe source code so that's for a regularregistry um and do is a bit slightlyslightly different so why because uhit's not possible to use the catalogendpoint so it's not possible by defaultto enumerate uh the repositories whichis not what we want there's a trick herethe trick is to use the docker websearch uh and to use a keyword so wecould use like well-known keyword likeproduction staging things like willlikely get you images with secrets umhowever there's another problem that'swhen you use a keyword the docker websearch only return 10,000 results so youneed to do some tricks for exampleenumeration like kind of brute force youwill start with a a a b a a a c and soon so here on the on the right side youhave SKO so the step we we are at rightnow and the docker web search isreturning the list of repositories withSKO on the left part uh that theusername on the right part that thelet's call it the repositoryname and that's great takes almost oneday uh and you do it quietly um and atthat step you have one pabyte of datathat's huge i don't know what to do withthat i don't have the money to storethat i don't have the time to to scanfor that that's 9 million repositoriesand the first fun fact of the day uh isthat the most used uh repository iscalled Docker 101tutorial if you convert it to imagesthat's almost 26 million images uh andas I said before only scan 15 so my goalis to go from 11 pabyte to somethingelse the target is 50 terabyte which ishuge still but that I can I can manageso next step is to have a look at thetags so you you retrieve the tags ittakes like two or three days uh usingthe the the endpoint that I presentedbefore so that's step number two andthis plot is quite interesting it showedthat uh most of the layers 93% of themthere are five layers or less five tagssorry or less and that's what I use idecided to only take repositories withat most five tags so that's the secondstep so concepts is roughly half of whatwe had before 6.5 pabytes that's 15million images and the other fun fact ofthe day is that latest is the most usedtag so the other fun fact that took mefive days to find that um not reallykind of waste of time after that we havevis 1 v 0 1 zero and so on that manyversionnumbers um next slideum there's a trick not a trick but someimages are huge so this oneis 170GB that's big the first layer is 128 GBthat's still big that's a machinelearning uh machine learning modellikely not containing any secret so whatI did next I say okay maybe they're biglayers i don't want to scan them twicelet's dduplicate if you dduplicate youwill getthat 44 million unique layer the biggestone in the experiment was 148 GB 148 GBit mean that when I try to do thescreenshot um the layer disappeared sofrom time to time you have layers andfrom time to time they disappear um andto get this number to to get the layersI had to get the config files to get theconfig files from the and the manifesttakes approximately 10 days quite a lotof time and the next thing when you haveconfig file you can have a look atinstructions so we'll start here on theright so on the right we have the numberof instruction run is the most used uhinstruction 30% of the instruction arerun instructions and then we have copycmd add and star here means everythingelse expose entry point uh and the otherones so that was the count for the sizeum so if you recall um we have 48pabytes at that stage run that's 70% ofthe total 3.3 pabytes and indeed a runlayers are not really important most ofthe time you will see calls to APK aptinstallation of softwares installationof Java SDK so what I decided to do isnot to scan run layersum because that will exclude most of thevolume i will focus only on copy and addand here the idea is to only have a lookat files that are copied into the finalimages so obviously I'm likely missingsome_ uh secrets by um not scanning runlayers and I will discuss that later onwe we found tricks to actually identifythe good layers without scanningeverything so if you do that you go fromum a bit less than one pabyte to 412terabytes that's 18 million uniquelayers and we need still to do one morestep to go to50 the the trick here is to have a lookat the the add and copy layer sizes umand this graph is also interesting um soin in in uh in right here the the whitecolor 99% of the layer are below 200megabytes what I decided to do to do istake a smaller size so what I did I onlyhave a had a look at layers smaller than45 megabytes that's 90% of the layer soat that stage we have 60 million uniquelayers with a max layer size 45 it'sstill 15 million images it's just thatI'm looking at lessdata so if you put everything togetherto scan Docker Hub or indeed to scan anyregistry but Docker is the biggest oneit took me 15 days to do the discoveryphase so identifying the repositoriesand the images uh three days to scan thethe config files with the gigard giganproducts 20 days to download and scanthe the layers and here you have the thethe the curve which is more or less thespeed of the scan over 20 days in theend it was Christmas time so I was off ididn't look at the experiment so it tooklike three days to complete that wassuper slow at the end I was offbasically celebratingChristmas okay um charts numbers i knowit's not always super fun to have a lookat chart and numbers so I try to to makea short version of that first reallyimportant point before the numbersthat's um what we call at Gardian secretdetectors we have two types first thedetector that we call specific and as anattacker I will go for that or as apentester red timmer I will go for thespecific why because they belong to wellthey use sorry well-known patterns andformats usually they can be mapped tospecific uh services and provider GitHubGCP AWS and so on and they may be umautomatically validated which is theexample I presented before for GitHub weare able to identify a GitHub token andto check it against an API so maybeautomatic automatically validated whybecause the specific could be an openSSH private key and obviously it'sdifficult to know which host to connectto and then on the other hand we havegeneric secret generic secret they lookrandomum we don't know the service or theprovider and we lack the context sogeneric secret are not are kind ofdifficult to automate so I I thinkthey're not great um but a human musthave a look at them and I'm really luckybecause at GG we have really greatpeople really great machine learningscientist and my colleague they managedsomething really really cool they tooklike this gen secrets they applied theirown model and they are able tocategorize 13% of the gener secret whichmeans that usually you don't know whatto do with the secret but because ofthis model we are able to know if it'sfor example a Microsoft SQL server loginand password so I think it's reallygreat but that's out of the scope fortoday so I'm going after specific andaftervalidUm there's okay one information I needto give you the only information youneed to to to take back home um inEnglish this is called a pie chart soI'm French we call that a cambear that only information you need toto take back home from cubecon the cambear is a cheese so 5% of therepositories they contain a secretthat's in other words that's one out of20 that's huge um first uh number secondnumber origin most of these secret theycome from they come from layers not fromconfig so several possibilities I'mscanning way more files than layersthat's why this percentage thingy is notsuper great or people are not actuallyleaking uh any secret in dockerfiles next slide um the best forattackers 20% of these secrets specificsecrets obviously are valid that's huge100um,000 secrets are valid today in DockerHub um not all detector are equal sothis 100,000 secret come from 230detector types obviously some of themare not super interesting for attackeropen weather map API keys uh you can getthe t the weather in London New Yor`kTokyo Paris do that a lot of time it'snot an attack not super great but I willas we will see in the next slide we haveother more powerful types for attackercanot check here is really great againcould mean that we managed to identify agitlab personal access token and we alsomanaged to identify that the host is notreachable so these secrets are reallyinteresting especially for attacker inyour internal network because they canuse that to do lateral lateral movementso they can go deeper and deeperum types types 28% of the secrets secretdetector that related to data storage sofor us at GitGu means databases S3buckets and so on so total of this 28%that's 140,000 secrets 13% are validthat's a huge number again um and thesecond biggest category that's cloudprovider we haveuh um 100,000 sorry secrets and 50% arevalid so again in Docker up today youhave a lot of critical secrets for manybigcompanies something I didn't tell whyI'm doing that i'm doing responsibledisclosure because as a securityresearcher that's my duty to reportanything critical to companies and inthe past four to five months I've beendoing that i will I will not like giveyou names we have a blog post on that uhbecause from time to time people arereally happy that we give them secretsand they put them on their wall of fameor we havevulner we are finding u yeah secrets forbig companies the last[Music]um graph this one is just crazy uh 60%of the valid secrets were leaked before2024 so it mean that the more like youwait the more secret youfind crazy and for 2020 so five yearsago 2,000 secrets are still valid so ifyou're able to grab them you can attacksomeone you can go deep in the uh in thearchitecture that's uh for me thisnumber of 2,000 that's justinsane and I'm seeing that every dayyesterday I was having uh coffee kulcoffee at breakfast i found a validGitHub uh token that gave me access toKubernetes related open source projectso it means that I'm able to do any kindof supply chain attack against projectthat's my breakfast like do that everyday crazystuff um so conclusion docker pitfallsand takeawaysum I hope some of you will not believethe next slidesso first I guess you know dockerremembers here that's a a fake dockerfile but it's in inspired by things I'veseen uh from copy do stuff run npmci andthen remove npm rc so npmrc was added tothe docker image with the copyinstruction but the thing is npm rc isnot removed from the image it's justremoved when youactually do docker run and the file isstill in the copy layer and if you getthe copy layer you get a secret so thisis a pattern that is not really frequentbut you can see it so dockerremembers next point is more morequestion my question is are peopleleaking or hardcoding secrets in dockerfile i don't think so in practice thereare really few hardcoded secrets indocker file so here I'm using scopio anddq that's perfect tools to to go fastand analyze docker images and you cansee that someone actually tried to doecho that's s3um client ID client secret echo clientID client secret into a file this squarevalid and this pattern is not reallyseen really often so what is goingon so let's have a look at the otherpattern again copy and GQ so here wehave a bunch of arg instructions on andhere that's really really again reallycrazy insane um leak the appearingconfigs as a result of docker buildstrange so let's have an example thisone was just crafted crafted for you uhon top we have a a docker file is reallynice this docker file no secret at allthe only bad stuff let's put that waythat the echo test and in that case thepassword will be seen in in yourterminal at build time which is let'ssay okay just for um the testing purposesecond phase docker build and we havethe build arg and for me that'sperfectly fine I've been doing securityfor years 15 years or more I've been adeveloper before and everyone told menot to art code everyone told me to usevariables to to pass secret and fromtime to time it's not possible to usearguments so this looks fine I mean it'snot hardcoded in the bottom What you have that'scall it a flat representation of theJSON config file json is not really easyto put on slide so it's just a flatversion and inside you have the secretsthis funny way of writing cubecon inarg and run layer so it mean that dockerbuild leaks secrets into the config sothis is just insaneso what people do usually when they findinsense stuff they clear ah it's toocrazy and then they read thedocumentation there the big warningbuild arguments and environmentvariables are inappropriate for passingsecret to your build because they areexposed in the final image instead usesecret mounts or SSH mounts which exposesecrets to your build securely don't dothat so don't use build ar so what elseso that's after the camera that's theother thing you need to to take backhome use mount secret but I think in theprevious presentation they also discussum this uh best practice mount secretthe only drawback for me the the way youactually use it run d-m type secret isnot really easy to remember um anyway ontop a docker file in the middle the newway and clean way to pass a secret dashsecrets so here we are mapping therandom variable to secret and the trickhere random israndom so every time you call thecomment that's a new secret injectinginto secret just a trick to to see if itworks and at the bottom you have the theresulting JSON config file so no secretinside so that the best way to dosecrets are only accessible at buildtime so that's I guess one of the maintakeaway from thispresentation next um anotherquestion what if people are actuallydoing such bad stuff that secrets arefinally leaking in run layers so this isa fake example obviously no one willever write that no one will ever use runand redirect that intoum so that could be ofconfig.gsing py any file storingcredential that could be used by anapplication to do something else forexample connect to a databaseokay might work so let's have a looklet's select grab whatever kind of toolyou use and that's just a small exampleof real instruction in my data set andactually people are using that so thelast line here that's my fake examplewhich is indeed real people are doingthat um in blue do you know what what'sthe blue stuff that's a question for youwhat is the blue partokay so blueparts that two slides ago we discussedmount secrets so mount secrets canassign a secret to a variable and we'llalso giveyou this pass run secret so here we havesomeone that is doing the best stuff inthe docker world to give a secret to animage at build time and at the same timeit's our coding the secrets in the endso this is the worst thing ever don'tever do that please tell that to anyoneyou love don't ever do that the worst ofthat that's in the data setum it's time toconclude brief conclusion uh I'm sorrybut you are probably leaking secretsmaybe not you maybe a friend maybesomeone in your company and it's notbecause someone in your company is doingbad stuff it's because it's difficultit's difficult because there's a widerange of docker pitfalls and I know somepeople that have been doing docker foryears that they didn't know for thisbuild arc stuff so please audit yourimages for art secrets so obviously ifyou use our tool gardant will be it willbe good and be really happy mycolleagues also do whatever you do uhyou want to do please scan for secretsthere are really bad stuff if you arenot technical and more into the let'scall it the governance side GRC ISO 2701sock 2 whatever please just try thisyour 2025 to have an exercise pretendthat someone in your company publiclyleaked a secret let's call it like theAWS root accounts and see it goes canyou revoke can you identify thedeveloper what will happen that's areally good exercise and why um whyscanning while doing their size thatvehicle prevention that's my guess ismore effective than actually dealingwith a leak and dealing with umrevocation so that's the end of the talki have the the QR code you can get thethe slides online if you have anyquestion I will be around having acoffee and try trying to have a a wateruh so yeah you have microphone here ifyou have any questions2025-04-15 21:58:05.438379 ��VK#��cA2r92tTuFYg8i'm really happy to be here today thankyou for being here with me i've been inthe all and I've seen a lot of peopleleaving so thanks a lot first owe you anapology there's a typo in the title iscanned indeed 50 million images a bitmore than on the title so I did somehomework uh on the way to the conferencewho am I i'm G French uh French guy usedto be anetwork guy uh you might know um a toolthat I'm working on Scrappy which is aPython library dedicated to um umsending and receiving um u packetspacket in Python and um these days I'mdoing security research for GitGuardiangitg Guardian that's a French company uhwhich is doing secret detection so atfirst obviously in Git and GitHub butnot onlyso that's a bit disturbinguh not only in in in in code but do wedo that in messaging systems we do thatin uh things like Slack uh and our goalis to help customer identify secrets andremediate two weeks ago we released areport called the secrets pro report Inthis report it's just let's say um uh asummary of the the discovery we do likeevery day with secrets the crazy crazyworld one of the information we like inthis report is uh the fact that peopletend to leak more secrets in privaterepositories that in public ones so ittends to indicate that when people feelsafe they will arcode secrets you canget it online uh it's free there is nopayw wall no no address if you want toplay with the product it's also free soyou can just create an account plug youruh GitLab instance GitHub instance andit will start scanning the secrets foryou uh but my goal today is indeed to tohave a look at um what can attackers dowith secrets um in the news attackersare looking for secrets in many thingslike PIP packages they're scrapingwebsite forum in the past weeks attackers have beenlooking for secrets in GitHub actionlogs with great success it seemed thatthey managed to hack into Coinbase or atleast some Coinbase dependencies umthat's my focus today i want to behaveas I did in the past as a pentester ordoing offensive security so what canattackers do with secrets um in securitylike in many other fields uh we haveframeworks uh to define what we do oneof them is attack that's a way to defineattacker attack path so today we'lltarget counter registries in that casedocker up we start with validcredentials and these are not valid inthe sense that they're default you don'tneed credential to talk to docker hub sowe have default credential then we getall of the registries so docker imagesand then we scan for unsecuredcredentials so that's the target oftoday and an attacker what you can dowith a valid credential for example it'sa valid GCP uh account it will connectto GCP it will create a new instance forexample with a big GPU big CPU and youwill start to do crypto mining sobasically is stealing your money sothat's something an attack can do butattackers they don't only go after uhcloud credentials they go after data sothey look for um credential fordatabases uh S3 buckets of course codesand um they scan also for registries inthe case of registries like piartifactory docker or github and gitlabone of the goal is to look for secretsbut also to perform some supply chainattacks something that I forgot to toexplain that's what is a secret a secretcould be anything like a usernamepasswords\c thatnor do we want to throw shade uh becauseall of these ideas are actually reallycool and and we think they're quitevaluable so we're going to compare themand we're going to take a look atmulti-tenency as seen on TV and alsowe're going to figure out the the rightway to have that difficult conversationtalking to your doctor about which comwhich container runtime is right for youthankyou so as wesaid Kubernetes is great and it's anabstraction for us to understand andmanage our infrastructure containers arealso great but I'm going to say thatthey're great at deploying to differentenvironments our containers are justprocesses ultimately running side byside uh on some compute somewhere thismight not be a concern when you'rethinking of Kubernetes when your clusteris all the way up in the cloud but inthe cold LED light of a server room it'sa process and maybe we should startthinking about where this is andultimately what's aroundit so what is a container runtime we'llstart with definitions to make surewe're all got a grounding base myoneliner is it's when we provide acontainer image that we wish to run onsome compute the container runtime wouldstart it and manage it accordingly idon't believe them to judge a containerthat they are running uh they are justto work on this process they don't knowif it's right or wrong to run acontainer they just run it so let's looka bitdeeper so the OCI stands for the opencontainer initiative and they have theOCI runtime specification it specifiesthe interface that we use so we canselect a runtime to use in ourenvironments pretty much everything Isaid in the last slide but to betechnical and to build on the interfacecontext it's required to handle createstart kill delete and many othercommands the Linux namespace and this iswhere naming starts to get hardnamespaces Kubernetes namespace Linuxnamespace user namespace we've got a lotso the Linux namespace provides us withsomething that feels like isolation theruntime creates Linux name spaces suchas app and mount and user for examplecroups help us limit the resources beingused allowing us to prevent the noisyneighborproblem set comp and nama assist with asecurity posture of our container uhwhen it can't what it can and can't doessentially and this is a form of likefiltering sys calls or to mandatoryaccesscontrols so we need a file system forour containers and we mount from thecompute or storage into ourcontainers now that we know what ishappening when we run a container let'stalk about the different types ofcontainers that we might run from asecurity context and so I believe I'vegot three hopefully I got three um so wegot untrusted code or an image uh just arandom image on the internet um and sothis is someone else's code essentiallywe don't necessarily know who's made itor where it's been made or where it'sbeen hosted um there might be little tono addestations so um like the originsum of being able to find out what'shappened there uh associated with thisimage ultimately we don't know if we cantrust the running image uh but thenthere is nothing much stopping us fromdoing so now if this is like a newconcept to you please check out Salsa umthat can help you understand uh betterhow to start putting these uh securitythings incontext now has anyone seen this newacronym as a service before well um thisone's by Jed Salazar so I want to give amassive shout out for that remote codeexecution as a service um security trackso when we have remote code execution orrice um it's usually a scary thing umit's usually terminal um that's a verybad dad joke um but the point here is isthat when you run remote code executionas a service remote code execution iswhere we can start to do bad things andwe can like exfiltrate x excfiltratedata or we can start manipulatingsystems um but Rice as a service todayis for a developer platform so AIdevelopment is rapidly expanding anddevelopers require hardware that isn'tphysically available to themselves um ontheir laptops so companies such as coderand githod they offer compute with anIDE and nowadays this compute has GPUsso yodu'll be able to take a containerand run it there but have access to thehardware that's right as a service todayand we need to be able to securethat now this is the most simple of thefree but arguably the most vital as wellwhen we talk sensitive applications it'snot just your payment details there havebeen data breaches in the past that haveresulted in people's lives being at riskfrom a container standpoint we might beputting all the best security that wehave in place and we might believe thatwe're doing a greatjob but if we go back to the processesbeing run is there someone in this roomwho is just looking at our processes uhpoint being tools can get us so far butimplementation is also key so personallyI love using Signal to be able tocommunicate to my friends but I get abit concerned when a journalist has beenadded to our group chatso um oh here we got uh runtimes andcontainer security postures with uhsharing a kernel and host access we'regoing to go into these two but I grew uphere in the UK and as part of myupbringing I was taught that jungle ismassive today i know that also kernel ismassive and the kernel has a significantfactor in security today as as anythingsitting on top of that kernel will passthrough that said kernel so we need tobe able to secure this as best as we canbut when the kernel is massive in sizeand importance we have to be careful asany changes here could be detrimentaldetrimentally impact on theperformance so host access is the secondpart here and so at 5:00 a.m thismorning when I was trying to think ofthis one um I was I wrote modern cultureand then I'm referencing the matrix andI realize oh god um but the point hereis is the architect from the matrix ismy concern here with a godlike cis adminmode uh they can access the OS that acontainer is running on and view all theprocesses when they're on your sidemaybe that's okay but what happens ifsomeone has malicious intent and also doyou even know who the sis admin is who'smanaging your compute and I stress todaythat if you're using a service is yourfinancial payment providing you enoughtrust on thatperson so taking those concern Takingthose concerns uh let's get deeper intothe weeds and attempt to group runtimeclasses bytype so we have the kernel type and thisis our standard type of runtime class ifyou're running a cluster today and thisis all brand new to you you're probablyfalling into this category exam exampleruntimes here would be runc containerdor cryo sandboxing is achieved by basicLinux namespaces and croups and asubiquitous as it is we see that they canrun general purpose workloads if youcheck a cube color output you'd expectto see the runtime class name as Rancinative orLinux so moving on from that and this iswhere we've seen advancements in theindustry this is uh this is and thisadvancement is going towards isolationtowards multi-tenency and additionalsecurity so within this group I'mputting G visor calontainers andfirecracker but we'll dive into thosethroughout this talk but this is whereit starts to get complicated i'mgrouping these together as they aregenerally focused on isolation andisolation to put focus on multi-tenantworkloads but grouping them togetherdoes not mean that they are attemptingto solve the problem the same way tothat I'd also like to bring a dera intothe mix in thisgroup now miscellaneous and again noshade but this is my personal opinionsum and to quote Caleb's friends from NewZealand please be constructive with yourfeedback as I'm trying to as well wom umI still don't fully understand Wom andthat's probably a me thing um I knowthat there are good things to it and Ibelieve that they are secure um I justcan't get my head around it and I feelthat I need to rebuild everything toutilize it for me we have to meet peoplewhere they are at today and sadly for meI'm not where needs me tobe the other one that I put into MISK isrunning everything as separate VMs umthis is an option yes but I wouldn't saythis is Kubernetes to meyes Kubernetes is scaling architecturebut Kubernetes is about efficiency aswell if I'm paying for compuete I shouldbe using that for my workloads solvingisolation this way might be one of thebest isolation options of the past butthis is at a financial cost of lost CPUmemory and energy but let's not alsoforget that there is human cost here tooum managing another VM takes time awayand a single misconfiguration can causesignificant issues goodwe could just juggle that would beeasier thank you for that Louis um yeahso workload isolation uh who likes thatyeah few hands good let's talk about thehistory because I think there's there'sa lot to be told about it um as Louisajust mentioned about uh having severalvirtual machines uh historically I meanyou probably still do this in somecapacity but you may wish to have acompletely separate physical machineand good for you uh doesn't work for allof us uh Cherut yeah any Cherut fans inhere yeah cool um love Taruteuh don't don't use it for anything secthat you need anyways uh tuto has itsplace in in things uh I think mostlythese days it's used for like packagebuilding and stuff that's cool nothingfor production u virtual machines greatfor rich isolation there's heaps of goodimplementations of this that uh you'reusing uh whether you know it or not ifyou're using public cloud virtualmachines cubeote I love cube vote justwant to say that uh virtual machinescool um also uh croups so later alongthe line we had croups introduced forfor providing some level of packingthings into a box and then uh making ourneighbors a little bit happier with uhnot overflowing onto each other'sprocess runtimes um out of Docker we hadrun C which was eventually spun out ofthe the project and um because we havethe open containers initiative and uhyep you're all using run if you're notthinking about this so um let's talkabout that g Visor G Visor is anapplication level kernel that limits theCIS calls and various other thingsaround like what what disk access andstuff you may be able to have in yourcontainer so G Visor that that's anotherone that's pretty popular um it it it'squite good uh one of the options you maywish to have bubble wrap uh anyonerunning desktop Linux yep flatpackflatpack uses bubble wrap um it's it'sit's run C basically uh at least forfrom an ideas uh standpoint so bubblewrap is just scopes down exactly what adesktop applicationuh will be needed i mean it it it can beused for other purposes you'll just findit in flatpack is uh um a good takeawaythere chem is a specific implementationof virtualization so Chem is a prettyold project but it's it's uh beenrigorously tested by um I'm sure there'sa few folks from various cloud providersin this room i'd be surprised if therewasn't you're probably using Chem alexthat's a a bit of a classic one it's abit more niche these days I personallyfind but it does have its placesomewhere it's a bit It's a bitchallenging to configure but it may workfor you that's coolwe have microVMs such as Firecracker andmicroVMs are a great great way to putyour application in a tiny box and sayuse the kernel that's running inside ofthat tiny box and then you can't seeanything else you're in you're insidethe box get in there's some excellentwork happening over at ConfidentialComputing the Linux Foundation projectand um this is a bit more uh hands-on soyou do require having access to thingslike special hardware and and stuff likethat and it also is really great becausewe we care about attestations andproving the environment which I think isawesome uh it is it is today it is alsofuture uh but uhyeah look at that space uh also I'd liketo hand it over to Lewis to talk aboutso I previously worked at a place wherewe made images and I was there when uhWolfie was founded by Ariadne and sofrom the brilliant mind that is Ariadnewho's our CPO at Ada um she's created uma styite so this is our version ofbubble wrap uh the difference is thatit's much smaller than runc and it'sfully written in rust it's fullypragmatic and has rich APIs and it'sopinionated for security security andefficiency and it's open source so it'seasy for local sandboxing so we're keenfor your feedback so please uh checkthfis out um we'll share links to theslides as well after um let's workthrough the options that we just wentthrough to see what we could be usingtoday so back to run run is a referenceOCI runtime used by docker containerdand cryo it directly uh interfaces witha Linux kernel to run containerscritical for understanding containerisolation and breakout risks and regulartarget for security harding and updatesas we'll talk about with leaky vesseland it uses uh user name spaces uh forrootlesscontainers now moving on from that we'dlook towards crunch um orcrun acrony when you get on stage andyou read the dogs but you don't know howto pronunciate them crunchy crunchywe'll go crunchy is a faster leaner OCIruntime written in C um it's ideal formodern workloads fast start low memoryrootless support and this is a greatchoice if you're optimizing forperformance and kernelfeatures but then we're going to move onto G Visor now traditional containersrely on the host Linux kernel for SIScalls g Visor will intercept those callsand emulates the Linux kernel behaviorin the user space drastically shrinkingthe attack surface the previous tworuntimes uh talk directly to the kernelso G visor is helping us out there and GVisor containers talk to a fake kernelthat decides what to allow this canprovide stronger multi-tenant isolationthat the previous uh two uh cannot butit can't solve the problem of theprivileged container risk so privilegedcontainers that's when we just removeall the uh security defaults um and GFcan't help it the caveat for the windseries is that the cost and performanceuh that we will talk about later on withour graphsnow moving on to Kata container so Katauses KVM or kernel virtual machinesunder the hood to provide the isolationand implementation uh runs similar torun C the microVMs available are QMU orfirecracker and this was the defaultoption for isolation within Kubernetesfor some time in my honest opinion butthere is a higher overhead here runningKata than the previous mentioned optionsbut I've yet to see someone u who haseasily implemented it within theircluster if you're here today please comeand see me uh I'd love to talk to you umso there is a barrier to entry in myopinion here and I want to state thatcolor containers are solving a reallydifficult problem and complex problemslike that um it's just not as easy toimplement to an end user again in myopinion but we were discussing this thismorning weren't we yes uh wellpersonally I'm a a very big fan of TelosLinux and they have an incredible uhthing called uh image factory and allyou need to do is check Cartercontainers and then it it gives you ainstaller which has exactly uh theoperating system with Carter and sothat's that's my personal easiest way toget it up and running and yeah cool nowif you're using WOM then I'm assumingyou're using Spin Cube and as I statedbefore I'm not a WOM expert so I'dappreciate if there's anything anymisunderstandings here uh to becorrected by someone with moreexperience i can see the implementationof this to be able to use uh ourinfrastructure to distribute WOMworkloads on our clusters but Wom canoffer more effective workloads andoffers a way to run WOM in amulti-tenented environment the cost herefor me is learning Wom and rebuilding myapplications so that they can run on Womand again if you're a WM person thenthis cost might be less significant toyou but for me today uh this I I can'tafford thetime now CISbox uh uses a Linux username space of all on all containers toassist withisolation now this is coming towards akey theme of this talk is what isisolation i've speak spoken to people atthis conference about this talk to saywhat's your definition of isolation andthere's not been one consistent theme weall think we've got isolation to anextent but it's like where are we all atso to go to this and to quote theirreadme um unlike alternative runtimessuch as Carter and Cubert it does notuse VMs this makes it easier to usewhich is a great win in my opinion itcan also run in cloud environments byavoiding nested virtualization anotherkey win becauseg if we're dependent onbare metal bare metal doesn't scale andat a Kubernetes conference we do requirescale but here's a key quote from theirreadme although it does not provide thelevel of isolation that VM basedruntimesdo so we're still just saying the wordisolation and so it's like a VM isolatewhat is a VM isolation what's a KVM isolwhat is this isolation now on their siteuh there is a comparison chart availablewithin their readmemes but as I'll comeinto and um I want to be moreconstructive of this and I hope I dohave that option available to you in amoment but I'd also like to mentionAdera because we launched Adera Protect1.0 last week and it's officially Gaderero protect uh was an amazing ideafrom our CTO Alexander uh that she hadabout 16 months ago and our impleimplementation is both old and new we'rebuilding off concepts of Zen but we'rewriting it in Rust and we're creatingzones that you can run on a type onethat run in a type one hypervisor sothere is no nested virtualization andbecause we're so close to the compute wecan be as efficient as runtimes thatwere grouped within the kernel type eachzone has its own kernel and our kernelsare just distributed via OCI we not onlymonitor CPU memory and usage within azone but because we're so close to thethe metal um we can also see the actualenergy usage of a zone via Prometheusendpoint and we don't require specifichardware public cloud to bare metal youcan run adera anywhereso breaking the sandbox and I'm nowcreating a new acronym apparently um ofICU these runtimes are fundamental toour clusters working so what couldpossibly go wrong well with great powercomes great responsibility being theinterface between a a desired state andan actual state can cause problems suchasuh noisy neighbors um Sarah from Ada shealso gave a great talk on concurrencythis week and um ironically JonathanPerry is giving a talk about noisyneighbors right now so I would recommendgoing to YouTube afterwards and havingit on your playlist but the concern forus here within isolation is if we're inisolation we can't see what's around usbut we're sharing resources so ifthere's another process that's takingall the resources so if there's someonehere taking all the seats it's not idealfor the rest of us um things to look outfor here would be CPU memory and IO butagain I would defer to Jonathan's talkum where you'll probably get moreinformation now leaky vessels so back uhlast year in January um sneak uh mefound uh four uh CVEes one of which waswithin runc and the issue here was itallowed for lateral movement out of acontainer runtime because there was adefault there's a deep there's an issuethere within the code um that brokeisolation so and we're talking aboutruntimes constantly here and that'sbecause we we're a pinch point of oursystems like everything has to gothrough here so if you're able toexploit that risk you're able to getaround that nodeJust a caveat the fix of this was anupdate so as ever it's a great reminderthat we need to make sure that we'reable to update and patch our code at alltimes ha all righty uh so let's bringmulti-tenency into the to the mix withworkload isolation hereso who thinks Kubernetes name spaces isall you need hands raised yeah yeah okayuh you can leave i don't know um it it'slogical so we're we're putting it's it'slike a folder who uses folders on theircomputer yeahgood don't know what to say anywaysmoving on we also have Linux name spacesso this is kind of like a a differentkind of box that sits in in the kernelthat you're you're able to uh have thesespecific like mounts and uh maybe theresource constraints and stuff um sothat's good but this requires rootaccess which your container runtime doeshave so think like uh runc throughcontainity um but then we have therootless which would be um throughusername spaces but the problem here isthat this violates the kernel self-pprotectionuh what's the other word guidelinesthat's the word I'm looking for um soyou can you can have username spaces andbe rootless but uh it does reduce youryour security on your machine um whichmight noth be a problem but we're we'retalking we're talking secure here yeahyou could have different nodes so let'ssay uh in a made a a totally madeupscenario you've got uh some computeenvironment that you're running foreither some friends or customers or whatwhat have you and maybe you're you say aparticular customer can have aparticular node pool that might beenough isolation for you that's coolthat's an option we also have networkpolicies this is in in a different realmbut uh in in the magical uh uhnetworking realm policies are useful forisolating who can talk to who that'sgreat virtual cublet think of it like uhI want a VM when I say I want a pod andand that works great that might beenough isolation for you um there's someimplementations out there for differentcloud providers vcluster is more of aAPI isolation historically I I haveheard through the wind that there's useof uh container runtimes to securethings these days but um more of a APIkind of thing so it's still running onthe the nodes alongside the other thingsuh let's talk about in the wild so maybeyou're going on a big old hike and youwant to see what container runtimes uhyou might be looking on your chart forum so here we have the CI so thinkGitHub actions GitLab CI the one thatyou're using that I haven't said that'sgreat um cloud providers it's justmulti-tenency as a service um that'sthat's it uh that's great surprise uh wealso have some hosted serverless somaybe you're using um some cloudprovided serverless that'smulti-tenanted k native serving recentlyin 1.16 I think it was startedsupporting runtime class nameconfiguration so you can say anythingthat's running in my my cluster throughK native will be using a specificcontainer runtime by default or if Ispecify uh you may be using Google CloudRun AWS Lambda or Azure Functionsmulti-tenencywoo uh so there might be a cost you youmight decide I want security but thenwhat's the cost for that so often forcontainer runtimes uh when we're we'rewe're wanting to have like a veryisolated it's in its own using its ownkernel and such things you need to havea VM so then you need nestedvirtualization and inside of that VMit's running a kernel blah blah blahit's expensive so you got to you got tomake sure that there's those things thatare known the the known unknowns withrunning uh other container runtimes andvirtual machines um some of you areprobably very familiar with thissometimes you need hardware requirementsthere there can be uh specific hardwarefor for running virtual machines or whathave you so that that's one thing thatyou do need to consider when you'reactually deciding okay my doctor saidcontainer runtimes are right for me whatI need to do is configure it so afterit's installed on all my nodes becauseuh make sure that they're all the samei'm I'm looking at you at the back thereso you can do this either throughKubernetes container runtime classes andthis is very effective um but if youwant to make it even more effective youmay wish to configure containerd or cryoor a container runtime that I haven'theard about um so one idea that we'vehad is why not just not install run c onyour nodes um maybe use something thataligns it might be okay for you to useit sure but what if what if you youdecide okay uh I'm going to only run myapplications with a secure containerruntime i'll I'll use that foreverythingcool so I'm going to go through these alittle bit quick but uh Marina Moorewho's a researcher at ADA and also tagsecurity chair um she's created uh wellshe's gone through and done performanceuh to check to see the comparisons thatwe have today and I'll make sure theslides are available i've got lots hereif you go to adera.dev you'll be able tosee them all and essentially thesechange and on the left it'll say ifhigher or lower is better but these areall the considerations that we need totake into it as well i know we're on asecurity track but we also need to haveefficiency and performance and the pointthat we're trying to say here today isis that we want to see as an industry wewant to have efficient isolation whilsthaving the performance that we expectthat from the kernel types and that'swhat we're starting to see today wherewe didn't have it yesterday now theseare the important things for me is justto be honest about your needs back tothis isolation i want to see somethinglike Stalsa but for isolation i want tounderstand the different grades ofisolation because if you're a startupand you're just doing something on theweb you probably don't need secureisolation if you're a financialinstitute or a healthcare provider orsomething I want to see level three andI want it to be assessed i want anindependent person to assess in the inthe places I can't get to to know thatisolation is met but for you beconfident with your choices but to dothat you need to know what your needsare and if it isn't the right thing ifyou're just copying someone else that'sprobably not right for you so you needto do that and for that then thoughlet's be open about it as well as anindustry let's be open about how we'redoing things how we are we implementingthese things because that can give otherpeople advice as to how to proceed tothat am I isolated we open this is anopen source project that we've done nowas of yesterday we had some strongopinions on this from others and that'swhat we want because this is causing thediscussions that I believe that we needi want to have a tool that I can run inmy infrastructure to tell me what levelof isolation I have the problem is it'slike the movie Inception i don't know ifI'm in a dream state i don't know if I'mstill spinning there because that's whatvirtualization should provide me butthis tool is a love letter to uh JesseJesse Friselle's Am I Contained from 10years ago when we had the problem of amI running am I in a VM or am I in acontainer this is supposed to be aneducational tool to help us understandwhere we're at and I'd love to have yourfeedback on that this is the kind offeedback that it gives in your terminalit's very simple and I would love if ifyou're if you're if this matters to youas it matters to me then I would love tohave your contributions towards itso our conclusion yesuh well execution runtime that'ssomething that is hopefully at the frontof your minds right now because we'retalking about it to you that would makesense uh so yeah we're we're I I verystrongly believe that you should beconsidering having some kind ofdifferent runtime that isn't just run Cregardless of if you're doingmulti-tenency uh but in the case ofmulti-tenency it's a very effective wayto make sure that there aren't problemsuh until there are then you sort thatout but there's less problems and theproblems aredifferent but uh yeah let's provideisolation between our workloads and makesure that weare prioritizing securityso and last 30 seconds things I'velearned today wearing a kilt that I'venever done before first of all you wantto invest in the best underwear that youhave because I don't know how this isstaying up right now um and I'mcomparing that to runtimes because runtimes are that low-lying feature that wedepend on so much but we don'tnecessarily give enough love to thesecond part is implementation of that wemight have the best technologies but ifwe don't implement it right thank you umif we don't implement it right then itdoesn't matter and that's the thing iturned up to this conference i had RoryMccur Scottish person say you're wearingthe kilt bag to front i you not andso I had to go to the toilets and turnit around because I didn't implement itright third of all after 20 seconds ofwearing this I realized I didn't havepockets i've heard so many people whoare in this room right now who've toldme over and over again they don't havepockets on dresses or skirts i felt thattoday i had to experience it to feelthat um I'm so proud of the companywhere I'm at because we have threefounding women our CEO M is here todayas well and I know that I'm part of amajority in this room but I can onlysupport a change in the future that Iwant to see by supporting others and I'mincredibly proud of that but to all thatthank you for your2025-04-15 21:58:06.125198 ��.L#��AI9t7qfOjgboso we're going to talk about isolationtoday and why is isolation importantwell first of all sometimes we can havea bit too much information so let's sortthat out by isolating part of it andI'll pass to you Kevin thank youeveryone or uh hello or Kiora as uh asthey say where I'm from i'm from NewZealand i'm so glad to be here um thankyou for all coming and uh I would liketo isolate some workloads so uh let'stalk a bit about thatand hey everyone my name is Lewis um asyou can see I've got the most kick-assflag in the world um that's resemblingup there down here resembles somethingdifferent uh today is my first day ofwearing a kilt and I'm doing it in frontof all of you so I think that deserves around of applause for the amount of fearI've had for the last fourhours we had to get one round ofapplause in our talk um and the reasonfor this is because I'm an organizer forKCD Edinburgh uh CFP is open um yeah weweren't allowed to upstage thisconference this year by having it inLondon so we've moved up north um soplease CFP open we'd love to have youthere but we'll crack on and by crackingon I would like you all to stand if youcan for a moment this is turning into amagic trick no everyone has to standplease please please pleaseyeah look at everyone standing so muchpower right this room we're in today isour node this is for compute this couldbe the bare metal machine it could bethe instance in the cloud that you'venever seen before now looking here now Ican see pods because I can see groups ofpeople that I know together and alsothere's all of us together so you mightnot know people in this room before butthe concept here is is that the pods aredifferent shapes and sizes we got somepods that just have one person and wegot pods that I'd say optimistically 20people but um and we got some pods thatrequire additional resources so I gotpods right at the front here that needto interact with us because we need hopeand help right now um and then at theback of the room we got like ourephemeral pods that are coming and goinglike some of the pods are off in thedistance that's fine that's fine but wegot more comingin now each of you were there thereforeare containers and within containers werun processes and processes are yourbeautiful minds with everything elsegoing on there so just for context ifyou look around the room now that's whatit's like running on the cloud that'swhat a Kubernetes node is to me at thismoment thank you so much for that but ifyou could sit down that'd beawesome so there are a couple of peoplestanding up which I didn't tell youabout now these are the maliciousworkloads that I had in this room and Iplanted them on purpose whilst you werestanding up they were judging every partof you if they weren't judging you theywere going through your backpack andthey've taken all your swag that you'vegot the last two days like these are myfriends by the way lovely people thankyou for trust here but we wanted to showhere that Thank you yes you've been herethank you but we wanted to show you hereyou can sit now as well thank youfriends thank you thank you we're allfriends now we're all friendsbut the point here is this is isolationthis is isolation for me we're all herebecause we're running Kubernetes we'rerunning containers we're runningworkloads and we don't necessarily knowwhere they're running we can'tnecessarily touch where it is and that'sokay but what's running next toyou okay so uh today uh what we're goingto cover is an overview of whatcontainer runtimes are the history andyou might recognize a few of theimplementations we're going to talkabout the risks so what are the risks ofI mean running workloads at all and whatare the risks of using particularcontainer runtimes but the main thingthat we want to cover here is awarenesswe're we're not here to prescribe uh youmight have something that works betterfor you and we can all appreciatebklike that and then on thecertifier side it's um going to umvulnerability sources like osv.dev to tosay it looks at the graph and says whichwhich packages are there and then itgets more information from from thoseexternal sources sovulnerabilities end up being added asnodes to the graph as well as um likethose scorecard scores I was talkingabout so here you see um on the bottomgoing in into in are the differentsources um that those collectors arewritten for and that that gets pushed upinto the the guac system and then on thetop are things that you can use to to umaccess all the information in theGraphQL API so uh there's of courseGraphQL itself if you write queries toyou know say I want I'm interested inwhich which package is connected to thispackage is connected to thatvulnerability things like that um wehave a CLI to just you know answer sometypical questions uh a visualizer I'mgoing to show that a little bit laterand then um your whatever your ideas areso I think at in Paris there was a talkabout uh guac and um OPA and to dopolicy for for Kubernetes so there'sthere's a a plugin for that as well umyeah I think that's aboutitokay so uh few words about um my projectmy project my CNCF project but theproject I'm maintainingum so Cubescape uh started u like fouryears ago the project um it used to beuh it started as uh CLI for scanningKubernetes and uh live API and YAMLfiles uh for misconfigurations from asecurity perspective uh it originally itwas really built upon um uh open policyagent and implemented a lot of lot of uhuh controls security controls frameworksand so on and the project really fastlyevolved into a full-fledged Kubernetessecurityplatform so both you can cube cubecapehas a version of as a that can beinstalled as an operator in yourkubernetes cluster and it covers uhquite a lot of things that you wouldneed u from a security perspective whenmanaging a cluster and kubernetesdoesn't give you that out of the boxuh you know we started by fromconfiguration scanning uh we have avulnerability scanner which is going tobe I think the focal point of thediscussion today uh but beyond that weare uh we are doing we have an ebpf nodeagent u that does a few quite nicethings like uh um network policyproposal second profile management uhand be beyond that anomaly based uhruntime detection and we are cubecape init different like versions uh areavailable as I said as a kubernetesoperator as a cla tool as a githubaction as uh um visuals uh code uh uhplug-in so uh the project went up to agreat start and today we have a lot oflove that we get back from the communitya lot of contributions a lot of uh uh umusage uh which is like uh really reallygreat as an open source project so thething I'm going to focus on today is uhrevolves around the vulnerabilityscanning and package management as bomband here's our connection uh with Guakso um so is there here anyone who is inhis daily work does vulnerabilitymanagement hands up yeah is it fun rightcool thing very nice we love it so um soone of the thingsuh we really like in our project is isis just to try to come towards theengineers towards the security personsand try to solve uh things that youwould need need to do uh manually andsolve them you know through differentsmart things and one of the uh uh majoruh uhproblems that do all of us who are whoare doing uh vulnerability management uhare more or less you know uh uh affectedby is that we need to fixvulnerabilities we need to scan forvulnerabilities we need to know we needto keep an updated log of what kind ofpackages we have in our environmentswhere they are running uh uh and whatvulnerable they have and we need to doit constantly and monitor this thingwhile the main issue with that is thatalthough all of these vulnerabilities tosome extent there is some kind of aconjuncture where where you couldexploit them because otherwise theywouldn't be called vulnerabilities uh inmost of the cases they are not leadingnot real threats to our environments ina Kubernetes cluster because of myriadof reasons uh which I'm going to talkabout a little bit lalter so here comesuh the reachability feature uhreachability analysis in cubecape so umwe love uh other open source projects aswell so uh when we went into the eBPFworld uh we decided to use inspectorgadget which is another great CNCF uh uhproject uh uh in sandbox today and thething that we come up with is thefollowing simpleidea um today when you take a kubernetesworkload um you can look at the yl filesee where the image is being what theimage that it is running uh take thisimage put into a scanner and get likehundreds of vulnerabilities instantlywhich is like super annoying now theproblem is that uh that all of thesevulnerabilities although they are therein the image they are not actually umloaded into the memory or not even usedby uh uh the container during runtimeand um this difference between sayingwhether vulnerability belongs to a to asoftware package that is being usedduring runtime or not was something thatwe felt that we could uh uh uh you knowanswer uh this question really simplyand this led us to developing uh thereachability feature which where we areusing ebpf monitoring of uh of file umactivity in the containers in runtime uhin order to uh uh to say which softwarepackages are touched by the runtime sothe way we are doing it is as you seehere we are creating an sbomb uh for theimage that is running in in a kubernetescluster we are using uh sift and gripeuh for this uh on the other hand we getthe uh observability stream of fileoperations uh from uh uh inspectorgadget and take and looking at the sbomband saying okay uh which package wastouched during the runtime uh by thecontainer and marking them as reachableor loaded uh this enables us to create ashort list of of sbomb anbomb which isonly includes uh the those softwarepackages that are being used by the contuh by the container during runtime anduh uh and remove all of those that areare notso just some further explanationCubescape we are big believers ofopen-ended uh uh integrations and APIsuh the way it works that uh thecubescape operator component detects animage that we haven't seen in thecluster before triggers uh an SBOMscanning through our component the cubewoolen then the cubulan scanned uh uhthe image created an sbomb based on whatit saw in the image and stores it as akubernetes custom resourceum this custom resource can be accessedby all of you and actually you'll seethat uh how it works uh but in generalthis custom this sbomb is already thefull sbomb uh the node agent with theebpf uh data stream is able to markevery entry on this esbomb as eithersomething that we saw during runtimethat's executed and loaded into memoryor something that hasn't been andtherefore enables us to create adifferent object which we call thefiltered asbomb so you have twokubernetes API objects you can accesswith your cubectl and uh you will uh youwill be able to see the differences sowhether this is uh whether this ishelpful so I usually like this althoughthis is not an updated uh uh uh graph uhI like to show it because uh as you cansee in the case of the this radiusexample uh the number of vulnerabilitiesfrom around 150 170 vulnerabilities uhwhich are all the vulnerabilities thatare existing and all the softwarepackages inside the container during theruntime based on the runtime analysisare reduced to around uh uh 15 uh uh andand this is like uh something around thenearly 90% uh uh noise reduction so inin general I brought you like a fewother examples uh posgress uh elasticsearch you can see that only uh only uhjust um uh uh two/3s in these images arereachable of all the cvees um in otheruh other uh examples which are actuallylike quite a good examples are uh youcan see also that only just part of theuh the vulnerabilities are actually partof uh uh of software packages that areloaded into the memory so over the timeuh we've been in contact not especiallywith the graphana team but few otherteams and us showing their uh them youknow the results uh enable them also toremove software packages that thatweren't in use and enabled them toimprove um and just last word so ingenemral in the general population we seeat armo of what how you know how muchyou can gain from this kind of filteringis is between around from 70 to 90% uhin the general you know applicationpopulation most of the time every if youthink about all those software all thoseimages that contain based on uh baseimages that are like uh taking Ubuntubase image or redhead base image so verybloated base images uh this you knowgain of of filtering is higher in caseof uh uh you know very slim uh uhcontainer images usually this gain issomewhat uh uh smaller back to youcool all right so we have um a greatplace to analyze supply chain data wehave some really great supply chain dataso it only makes sense to get all ofthat cubecape uh sbombs and filteredSbombs into guac so it's it's what I wasjust describing here with the collectorum I just wrote and released a Coupecapeuh collector for Guac uh just releasedlast week and a little bit about howthat works is um as Ben mentionedCoupecape is putting all the all thesbombs into the API server as customresources so the collector is just asimple um Kubernetes client that watchesthe API server for those uh sbombs asthey're there and then automatically umuploads them and ingests them into theguac system that you have running prettystraightforward um so I want kind ofwant to show like now what can we dothat we have these different all thesedifferent sbombs in guac and try toanalyze them and see what thedifferences are and explore that um soto do that I'm going to show a kind of ademo or you know a little bit of anexample applicationum before I show the example I do wantto show here um so this is the clusterthat I have running um I have up at thetop I've got like I the regular sbombsthere which are just the image the imagescan or you know full image sbombs uh ohin the middle those are the vexdocuments so those can go into to guacas well but I won't I won't be exploringthat right now and then at the bottom isthe uh filtered sbombobjects and so to to take a look atthose uh I'm going to use I wrote a veryquick sample um you know applicationthat let's say it you have a server anda job right so this is go code so bothof these things live in one gitrepository um they have one go mod butthe server and job like share codebecause it's one project but they havedifferent um executables differentbinaries that get compiled so if we ifwe see here our packages on the right Ihave um a server and a job containeractually separate ones that get builtand for this example uh in this in thego mod it has like gorilla and zero loglet's say are two things that are usedbut if we look at the packages or sorrythe the the source code the server isonly using gorilla it's not using zerolog and then the um the job is usingzero log but gorilla so remember that aswe go and look at the at the uh at thesbombs so I have I had already run theguac collector that went and collectedall the cubecape sbombs but also um Ihave sbombs from a source code uh sbombgenerator and I also have an sbomb froma buildtime sbomb generator and all ofthose have been put into guac sonormally like I was saying you know youwould do an bomb during your build so wehave that plus the cubecape uh imagesbomb and then the cubecape filteredsbomb so in uh guac we I'm I'm here inthe playground for I can where I canwrite graphql queries and just run themand explore the data and I have a aquery here that's just all sbombs and soI can look here and I can see that uhthe file collector collected this resultspdx and the um file collector alsocollected this server cdx um but it'sthe cubecape collector that collectedall these other sbombs here uh thatwe're going to look for so the first oneI'm going to look at um I have some morequeries down here thatuh take a look at all the packages in anesbomb and this is just to to explore sothere's a lot more uh things that youcan you can query in guac um and so I'mgoing to use the ID of the source sbombthere to query and see which packagesare inthere and so this is a um this was Igenerated this with um OSV Excalibur uhand it has you know tnhe the thetransitive dependencies um but I seeboth zero log and gorilla here so thisis an sbomb of your of your everythingthat's in your source repository it'sbasically like looking at your lock fileor that go mod or that that um what isit the uh the go shopfile someum so but that's not that's not um someof these things aren't being built intoeach each each binary right the job andthe and the theum the job and the server had differentstuff so this was source so I I alreadysaved the ids here from all thedifferent sbombs that I've ingested uhthis is the build sbomb ID ID so I'mgoing to run that samequery and look at the packages in thereand here we see what we expect so it'smy sample server and it's just has httpsnoop gorilla you know so gorilla andthe the transitive dependency so thisagain builds sbomb is just going to bewhat actually goes into a build what'sgoing into a binary something that'scloser to what you're shipping so sowhen you're building sbombs those aresome things to think aboutand then now I'm going to take a look atthe imagesbomb and query the packages in thereand the image sbomb has oh wait I'mseeing I have uh wolfie stuff I havebase layout um and then of course allthe go the go packages as well not notzero log only the gorilla mucks stuffthat's built in and then um see So thewolfie packages like CS certificatesbundle and TZ data so you're and thenthis was this was um I didn't actuallyrun an sbomb generator on the imagemyself this was the one that cubescaperan and and put into the API server forme now if I if I take a look at thefiltered sbomb for um let's see this wasimage if I take a look at the filteredsbomb for this this sample server it'snot going to be much different becausethis sample server is um using gold it'sa go a go uh binary so there's onlythere's only one thing in in the wholewhole sbomb so the filtered one ispretty much the same um but I I did Idid import a um filtered sbomb on a on aDebianbased image fromcubescape and let's run the query onthatone and so I we see that yeah there'sDebian packages here lib SSL uh libsLinux etc but if if you look throughthis whole thing it's not very long Sothere's only uh only the stuff that'sactually running is coming in thefilteredsbomb uh and then before getting back tothe slides I just wanted to take a quickpeek at the visualizer this is the thevisual visualization of the graph it'skind of hard to look at all the thelittle dots and things in the visualizerso it's neat to show and say like heythis is the um the image and then here'sall of this is all of the sbombstogether so this isn't filtered thisisn't um just one looking at one sbombso this is the image and all thedependencies between say like gopackages and um these are the wolfiepackages and then the uh the theoccurrences of the of artifacts and etcetc all right so I'm going to jump backto the slides and by exploring all ofthose sbombs there are some takeawayshere there are some things that thatthat we've learned so again uh sourceand repository sbombs are basicallyparsing the lock file they're gettingyou everything in the repository so thatmight be also stuff that's important toyou like uh dev and test uh packdependencies um but maybe they're not ashigh priority to you right so thinkthings to think about with with stuff inthere um your build sbomb limits onlythe stuff that's being built into thatexecutable you you something that youhave that's being either during thebuild of your software um whether it'sbeing compiled or if it's uh interpretedlanguage maybe there's some kind ofdependency resolution that happensduring the build that is not reflectedin the source so um but also if you havea build that's like maybe part of yourCI for testing that may not be this thatmay not be the um exact same buildthat's what you're put shipping toproduction so that's something to keepin mind when you do when you take a lookat buildsbombs and then image sbombs scanningthe the container again you're gettingum you're not getting your dev and testuh uh type of libraries or stuff that'sbeing built that's in the repo that'snot being built into the image and thenyou're also getting all of the umoperating system level stuff and thenthis is much closer to what you'rerunning in production but again it'sit's everything and then the filteredsbomb um you know has been covered is uhonly the the the the binaries that aregetting loaded into memory so the veryvery most important thing to take a lookat um and so there's I don't want to saypros and cons but there's just differentthings in different sbombs and it'sthere are things that you have to knowand and when using something like guaccan help you kind of explore the sbombsand see which packages have the rightrelationships to which sbombs or the theones that you're looking for so maintakeaways there um and that brings us tothe end yeah first of all thank you forall for attending you know I know thisis the last uh presentation for you atthis CubeCon and I was really impressedby you know the turnout so thank youvery much for everyone and uh we'llreally hope to see you also in nextCubeCons and uh maybe also you knoweither joining either of these projectsas contributors or or just like givingus feedback uh we'll be happy to uh bein touch and we'll be happy to takequestions yeah there's a mic in themiddle uh if you want to speak in themic to ask questions um I also have guacstickers for uh after the afterwards ifyou want one go ahead uh hi thank youfor the talk um I have a question doesit matter when you um query the filteredsom so is there a chance that the filteris missing somevulnerability so uh I can talk on thecubescape part so in the cubescape uh bydefault we have default setting of forhow long uh we are monitoring acontainer so when you start a containerit takes us like two three minutes tocreate the first filter des bomb and weare like every 10 minutes afterwards weare if we see something new we areupdating but after by default setting Ithink is around three hours or 6 hourswe are just stopping monitoring in orderto reduce the CPU you know uh um loadthough although this is not a big loadbut having said that if mostly in fromour most of our experience after thingwe didn't see in the first few hoursthey are usually not really uh uh rerelevant but but again uh this issomething that you can customizethank youhi I just have a question on the uhruntime package detection is that 100%accurate or does it depend like is italways definitely accurate that theseare the only ones that are loaded duringruntime or is there some inaccuracy tothat so um okay so let me I'm putting myengineering hat on it so there are liketwo inputs here right so one is thesbomb the second is the BPF data so inthe sbomb we are just as accurate as thesbomb generator is so we are using siftsift is rather I I really like theproject it's it's a it's a great projectand it's one of the you know top-notchuh uh you know sbomb generators fromimages so we might have like uh thingsthat we don't discover during thecreation of the sbomb from the image andthe other inaccuracy could come intheory from the ebpf data stream so inthe ebpf data streamuh we'll see that which files aretouched during the runtime so eitherthere is an open sys call or an exiscall on themum in this case um there is a way thatmaybe the CPU isoverloaded and then we are missing someebpf events and missing ebpf eventscould lead to inaccuracy and we are notmarking some packages as loaded whilethey should be now in that case if Iremember correctly we stop generatingthe if we are missing ebpf events we arestopping the filtering process becauseof inaccuracy and we don't want tocreate inaccurate objects but thisbarely happens like it it has to be likesome really crazy thing going on inorder to uh to not be able to processall the data and just like running outof the CPU limits we have and missinginformation but there is like thisdistant you know um possibility and youwould get a log uh uh log informationand you would be missing the filtereddesk bomb in that case thank youall right thank you very much thank youvery much everyone have a great weekend2025-04-15 21:58:06.614463 ��jM#��Ax5qguW0SF_Ihello everyone thanks for being heretoday uh my name is Jeff Mendoza heyeveryone I'm Ben i good to have everyonehere today yeah and and we're here todayto talk about um both build time runtimesbomb supply chain information in yourclusters um using both Kubcape andGuac uh okay a little background aboutmyself um I'm a software engineer atKusari which is a software supply chainuh security startup and I'm a maintainerof Guac which is an open SSF project uhlicensed Apache 2.0 um so I'm alsoinvolved in a couple other open SSFprojects and also the clearly definedprojectso I said before as I said before I'mBen uh maintainer one of the maintainersof uh Cubscape which is a we'll talk alittle bit later about that uh but it'sa Kubernetes security uh platform uh andthe CNCF incubating project um and I'mcoming mostly from software engineeringand the security product side of theindustryuh just a brief overview of the the talkwe're going to talk a little bit aboutthe two projects and then um someexciting new features that have justlaunched and ademo all right so I'm gonna um bring youall up to speed on GUK uh GUAK standsfor it's a backronym you know we likeavocados and and guacamole but it standsfor graph for understanding artifactuh now I I forgot what the C wascomposition um so what it is is a systemthat you you bring up and it's uh at thecore of it is a GraphQL API um but it itpulls in all kinds of supply chain dataum from all kinds of sources or types ofdocuments primarily software bill ofmaterials sbomb but also um salsaaddestations scorecard scores uh vexdocuments things like that and it bringsit all into a a system that you're ableto use uh those GraphQL queries to driveinsights about your entire supply chainso if an SBOM just contains um the thecontents of a single piece of softwareyou get all of those into one big graphand you start seeing the interconnectionbetween all of your software um the wayyou typically use it is at least upuntil now is you would have like a stepin your build that would generate ansbomb and then um ingest or upload thatsbomb into the guac system that you'rerunning um to dive a little bit moreinto um the different components thatmake up Guac uh what you see here ingreen is the core core project of Guacum the the database and assembler is uhwhat runs the API um but there's a a lotof other pieces that sit around it sofor example the ingesttor is what takesum documents in and then hydrates thethe graph with all the you know turnsthe documents into um nodes and edgesthat go into the graph um and then wehave these things called collectors andcertifiers so a collector would besomething that um either watches for newdata or um goes out and like grabs grabsdata so a typical example is the filecollector you know that's for sbombfiles but other ones are um collectorsthat can sit and look at like yourGitHub repo for when you have a newrelease oh I'm going to check to see ifthat has an sbomb download that and thenput that into guac um we also have uhcollectors for a number of otherintegrations jqr resetting the GPUthis helps you in identifying issues andtaking mitigation steps but it still hasquestions on how you can handle theapplication uh efficiently because youyou face issues around things likescheduling dead time and also slow partstarts because when you have a new nodethat needs to be provisioned that takesa while and if you have a large modelthat takes several GBs or sometimesmultiple hundred GBs it's going to takea long time to load we talked about someaspects from the infra providerperspective and also from a userperspective in this CubeCon Salt LakeCity talk but now we're going to addressthe other side of how we make it easierfrom the application perspective suchthat they don't have to think aboutthese problems on theirown another approach that's commonlydone is uh model checkpointing duringtraining it's done frequently and thenif a node goes down you just restorefrom um from the registry or anotherrepository where you have the previouscheckpoints of the model where it'sstored and then you can continue thetraining process uh some approachesinclude basically interrupting yourworkload as well and that's a naiveapproach and then just migrating to anew node if there's issues you couldalso use tools like Q and Volcano tomake this process easier along with someGPU strategies like multi-instanceGPUs but all of these uh addressdifferent aspects of the problem theystill face some common challenges likeslow loading time and recovery time foryour end to-end workload they might youmight need to configure yourapplications as well to handle thingslike recovery from the last checkpointit could also lead to higher costs andwe think that this could be made moreresilient so uh what is infrale transferand checkpointing this is the the aspectthat we want to talk about today and wewant to define it in uh what phrase byphrase so checkpoint here representscapturing the entire state of therunning application with the point intime memory and all the associated filesso that you can basically pause migrateand hot restart itlater transparent refers to theapplication code or frameworks notneeding to be changed so the applicationdoes not need to be aware of uh thischeckpointingprocess then intra level refers to thiscompletely being handled byorchestrators or schedulers and theplatform itself so that uh users don'tneed to worry about it i I want toreally emphasize this point that this isnot modelcheckpointing it's not just modelcheckpointing because you're capturingthe entire application state in yourcontainerand from the usage perspective it's thepeople at the platform layer who wouldreally care about the solutionuh from from the implementation aspectand this is as you know uh an emergingtopic and we are in the emerging tracktoo so not everything is going to befitting perfectly but so we'll have openuh open discussions at the endso if you're new to this you might havemanyquestions you know how production readyis it can you restore it on differentGPUs what type of different machines canyou restore it on how do you handlenetwork state how much time does it taketo do all these additionalcomputations and we'll be addressingsome of these questions but I would alsohighly recommend uh multiple other talksthat you could look at for more detailsso for a high leveloverview think of yourself as a neuralnetwork that is running on adouble-decker bus in London taking theviews of the city learning about it andupdating your neural networkweights unfortunately your bus breaksdown so your application containercannot start running you cannot continuetraining what happens nextwell without you knowing it almost asnapshot of view from 2 minutes ago wastaken and kept in the British Museum nowwe want to be able to restore that andcontinue running it so this is thecheckpointing uh phase and that that waskept in the museum and then now you spinup a new node where you can continuerunning your application but instead ofgoing uh through some applicationspecific logic you can recover to whereyou were from just 2 minutes ago andcontinue as if nothing happenerd prettymuch so you're going to be like almostlike the cyber people in the showseverance or like unlike them you'regoing to remember what you did so allthe training the neural network weightsthat you updated are still going to bethere uh and you can continue running itso because we've captured the entirememory state that will also be restoreduh in thisprocess and this will help in improvingresilience and we'll also talk about howit improvesutilization so there's a host ofuh challenges that this can address sothere's three categories one which wecall scheduling optimization another forhandling node downtime and then thethird one for pod start timeimprovements for scheduling if you havea higher priority workload that comes inor if you have idle resources that areyou know scheduled but actually notbeing utilized or if you have spotinstances that can go away anytime youcould use this checkpoint restoreapproach to be able to migrate yourworkloads or you could also use it tohandle it efficiently and thenespecially when you have longunning jobsthat need to be taken over by anotherjob we need to have a seamless processfor uh capturing the state and beingable to save it somewhere and thenrestore it so this helps there and thenfor node downtime like the originalchart that I showed with uh GPUissues you can add to that end to endflow to be able to checkpoint rightbefore the GPU goes down or takingdraining the node and then restore itwhen you go to a new nodeuh and a lot of these use cases areunlocked by the faster time it takes touh load the new container image and theapplication and uh many companies havestarting to already use this forspeeding up their pod starttimes so what are the other like MLspecific use cases that we may be ableto address withthis for inference many of you mightassume that infrance can be consideredas a stateless application but is itreallystateless well according to onedefinition stateful applications savepast and present information while youknow stateless applicationsdon't the for inference of let's say forLLMs you you tend to store values likethe key value cache values in memory andthat could be both in GPU memory and inCPU memory and if you lose the inferenceworkload and the KV cache is onlypresent on your node then that memorywould have that would have to berecomputedlater uh with most currentapproaches but with transparentcheckpointing it might be possible torestore that state and continue runningyour application without uh much needfor additional recomputationbut one requirement there is the time toreload the KV cache must be less thanthe time needed to recomputeeit then for distributed ML training uhin this in this time chart you seecheckpoints being done at a regularcadence uh this is from a Microsoft uhresearch paper this graph and they weretalking about how to optimizecheckpointing from the model layer sothe model it I I I need to mention thatthe model checkpointing will stillhappen even with infra levelcheckpointing because model levelcheckpointing is useful uh forexperimentation and if you want to goback to a previous state so we will becomplementing this model levelcheckpointing with infrale checkpointingso we take perhaps more frequentcheckpoints and then restore it if thereis an issue in between two modelcheckpoints so in this time chart yousee that if there's a failure in betweencheckpoint 3 and checkpoint 4 you'd haveto retrain uh from checkpoint 3 thatwastes additional cycles basically forGPU and CPU but also because when youhave a distributed training case manyother nodes in your entire cluster willalso have to wait for your one node tobe restored and then the progress to besynced because at the end of eachcheckpoint typicallythey will be syncing their uh gradientsand the weightsso but this is not a solution toeverything there are some limitationsthat we need to be aware of and addressone being that the checkpointing theinfra layer is not aware of theapplication state itself which meansthat the checkpoint could besignificantly larger compared to justapp level checkpointingso let's sasy for inference for the modelitself will take up most of the spacebut you also have additional librariesand helper functions that will be loadedonto memory that would also be copiedover and depending on the applicationthe amount of memory you uh thedifference between the two checkpointscan be u significant as well thenthere's rigid requirements forrestoration sometimes where you mighthave to match the machine configurationsbut also this varies on the type ofcheckpointing implementation you use andthen you would want to have restore in amachine which obviously has like thesame or highermemory there's also some resource uhconsiderations like how much computetime it takes and that can vary betweenperiodic and event- driven checkpointingso this these are pretty much two typesof checkpointing you can do so periodicis you like the the word you keepcheckpointing every you know few minutesor a few seconds based on how youconfigure it uh and then uh event drivenis if you know that there's going to beafailure then you can checkpoint it uhand then keep the latest checkpointthis graph there is sort of a highlegraph on the trade-offs that areinvolved in figuring out what type ofcheckpointing you want we want to do soon the y-axis you see the total cost ofoperating a workload and then on the xyou see the checkpoint frequency so ifyour checkpoint frequency is really lowyou're not uh you're going to checkpointafter a very long period of time yourthe additional compute overhead forcheckpointing is going to be low butyour resilience to errors and recoverytime will be high because you might needto recomputee it if you uh error out andthen on the other extreme if you'recheckpointing really frequently yourcompute uh consumption is going to besignificantly higher but your overalluh resilience might might be better butthe total cost of operating your uh jobwould be higher as well so we need tofind a sweet spot for how frequently uhyou need to checkpoint and that willvary also depending on the type ofapplication and also the type of GPUsthat you have the error rate and otherfactors now Bernie will continue andhe'll share about how it works and showsome cool demos okaygreat thanks Ganesh uh yeah so as asGanesh uh uh discussed he's now definedwhat infrar transparent checkpointing ishe's described some use cases uh that wesee are possibilities and uh uh he'salso described the limitations ofcheckpointing and so now what I want todo is just deep dive a little bit moreinto how we're trying to implementtransparent checkpointing for uhKubernetes and everything is centeredaround an open source project called uhCreo creo stands for checkpoint restoreand user space uh and uh Creo wasstarted in 2012 it's it's designed tocheckpoint uh applications running onLinux uh platforms and uh it's uh inwide production with uh other uh virtualmachine uh uh platforms for doing livemigrations and then uh my companymembers has been using uh Creo as partof a spot instance migration oflongunning HPC batch workloads on onpublic clouds uh in addition to thatCreo is is incorporated if if you don'tknow already into Kubernetes the the 130release uh supports forensic examinationof containers those containers arecheckpointed by uh Creo and uh inaddition to that in uh CubeCon Paris uhlast year I gave a brief talk on uh uhsome work that uh uh we uh done withNvidia to introduce GPU uh levelcheckpointing uh to uh uh to the thiscommunity and so I demonstrated anoperator that could do a GPU CPUcheckpoint of a single uh workloadrunning on a pod uh and successfully rerestarted elsewhere uh there's ongoingwork by the community to uh and they'vecreated a very nice uh GPU plug-in forCreo that also will support the uh AMDGPU and I want to give a shout out to uhexcuse me Rison and Adrian here in thefront row and their uh colleagueVictoria they presented yesterday and soif you want to read more about thehistory of Creo and where the Creoleproject actually stands those are thefolks to talk to uh in addition to allthat work uh what we did was we uh uh wehave to to to make this ready for uhprtoduction level workloads as uh Ganeshareferred to we we need to do somethingabout the checkpoint uh overhead and soin that U-shaped curve we showed in thein the previous slide what we're tryingto do is lower that curve and flatten itout uh so we get a very highly efficientuh checkpointing technology and uh to dothat uh uh from our experience workingwith HPC batch uh work uh loads uh we uhare focused primarily on reducing thecheckpoint interruption time productioninterruption time uh and then also thespace consumption by the checkpoints uhand then also computer and memoryresources that are being uh used to uhuh execute these checkpoints and so uhwe've got been successful with HPC uhworkloads doing that uh using acombination of what we call asynchronouscheckpointing uh uh compressiontechnologies and then also evenincremental checkpointing and uh in caseof asynchronous uh types ofcheckpointing we've been able to reducethe window the checkpoint productioninterruption window from up fromanywhere between 30 and 100x uh 100times better than just uh running it uhwithout any kind of asynchronous andthen uh with compression we've been ableto achieve up to five to one levels ofcompression uh and in incremental uhsnapshots also for uh situations wherewe have very very narrow uh uh times touh handle uh uh preeemption uh signalsso uh we're hoping to bring all thosekinds of capabilities to bear on on theKubernetes problem and uh and so part ofthat is we also have a lot of experienceworking with uh external events uh likeuh uh spot instance preeemption signalsworking with schedulers uh so we've doneintegrations already with on the HPCside with slurm lsf and htc condor anduh we're hoping to bring that kind ofknowhow to bear also so we just startedworking with Q and the job the the jobset uh projects to uh integrate there uhanother consideration for uhoperationalizing uh this kind ofcheckpointing is security issues as asyou guys know the uh uh as you may knowthe creo actually has to uh run in in aa privileged state because it it needsto checkpoint all the different uh uh uhpods within that particular node uh sothere's some security things we have toaddress there and then also uh sometimesthere's third party license managersthat need to be uh taken care of andwhen they're checkpointed and things arebeing moved around uh and then uhephemeral files is another area uh uh Ithink uh right now uh you're allowed tospill memory into uh onto a disk andyou're also uh allowed to have ephemeralfiles as part of your uh workloads andso those also need to be checkpointedand and and moved elsewhere if you'regoing to do a hot restart somewhere elseuh and then lastly we're we'recontinuing to work with Nvidia to uhimprove the uh the checkpointing uhfeatures uh in terms of fractional GPUsand things like that uh and then alsoimprove the uh overall performance uhand uh flexibility uh to be checkpointedand migrate to different types ofenvironments so those are all ongoingefforts to operationalize checkpointinghere and just to give you a quick ideaof what's going on uh right now the theGPU checkpointing is currently atwo-stage process the first stage is touh checkpoint uh to actually haltfurther job submissions to the to theGPU that uh uh and this is done on a ona a process ID basis halt allsubmissions and then secondly wait forthat uh uh particular process to uhcomplete and then we take that memorydump it to system memory so that's thefirst stage and then from system memorywe're also checkpointing the systemmemory plus the GPU memory that's justbeen dumped there along with any kindsof uh ephemeral files and things likethat taking that all and and dumpingthat onto a persistent volume somewhereuh or or some sort of directory and therestoration process is simply thereverse of that a two-stage processcurrently uh now uh last year again Ishowed a single node being uh uh uhcheckpointed and hot restarted we wantedto extend that to a distributedarchitecture uh and uh so the keycomponents of that are first of allthere's a high level coordinator uh thiscoordinator's ruole is to uh first of alldiscover membership uh of who whobelongs in this particular uh uhdistributed cluster uh and we do that byuh mapping out the the networkrelationships between all the workersand and doing a self-discovery but inaddition to that we've also have anoption to uh just basically query thejob set API and just find out where arethe workers uh and then that's so that'sone component another component is asynchronizer synchronizer's job as youcan see on the on the right side of thisthing is to make sure all the workers uhwhen they're invoking this uh Creooperation are all uh doing it uhconcurrently uh so there's this uh placewhere uh in the in the Creole designwhere you can put an action script hookcalled at the pre-dump stage to makesure that all the uh uh the thecheckpoints are occurring simultaneouslyand then uh conversely on the other sideof the checkpoint there's a post dump uhuh uh barrier there where we can makesure everything is also releasedsimultaneously and the purpose of thatis to make sure there's no uh messagesin transit and things like that thatcould could be uh uh uh lost orcorrupted by uh having thingscheckpointed out of out of sync and thenwe also have a web hook that allows uhus to uh provide a for particularapplications the the a checkpoint pathor or posision or or or to a ppersistent volume or whatever so that weknow where this checkpoint is beingstored and there's a demon set to deploythis on each host uh and we do use a uhmodified version of runc uh and the goalof that is to uh uh uh run C wouldnormally pull off the a cold restart uhcontainer off of the registry but inthis case run C will then go to a adirectory path if there is one for acheckpoint and and bring the uh image inthat way uh so that's that's the basicsof how our clustered uh transparentcheckpointing is uh excuse meimplemented uh and then we to automateall this naturally we got to put this inan operator uh and the goal of thisoperator is to create a gracefulpreeemption and hot restart of of of aof a workload whether it's a anindividual no uh application or adistributed one and so uh we look atthis as a way to reduce what we call thefriction of moving or hibernatingstateful longunning uh workloads and alot of AI ML workloads uh have that kindof a characteristic and u so so this isuh basically there's two ways we canimplement this thing one is to uhimplement as a sidecar or we could alsoimplement this deploy this via dimasetso for the purposes of using this fornode maintenance we're implementingexcuse me implementing this as a as a ademon set deployment and uh pretty muchoff-the-shelf components that are listedthere the only thing that's beenmodified uh offtheshelf Creo version 4for example the only thing that's beenmodified is the the run C so that we candirect it to a a checkpoint uh path toto pull the uh checkpointedimage this is uh the what we're talkingabout the kind of a use case that we'reworking on it's called job set migrationthe first uh type of migration is to uhfor for a scheduler to to use to to takea uh a set of of pods in a in aparticular distributed cluster and movethem in an entirety to some otherlocation and this could be used to helpmaybedefragment a particular uh uhinfrastructure uh and uh and and also touse it for prioritizing higher priorallow higher priority jobs to take overthe resources etc and then the secondone is more for node maintenance uh whatwhat I discovered during this conferencewhich was really interesting is thatseems like almost 90% of these failuresthat you see with GPUs can actually becorrected by just simply uh uhevacuating the node and rebooting it uhso so we see this scenario where we'rewe're actually dealing with again thatdistributed cluster but we're just uhmoving uh one node the failing node orthe one that has the problem over to auh a hot spare and then maybe rebootingthe hot spare so I think there's alsobeen a lot of progress in doingpredictive kinds of of of of uh work onpredicting which nodes may be failingand so this kind of technique I thinkwould help in nodemaintenance vuh so just before I go intothis uh the demo itself I I just want toexplain what we're trying to do here tokeep things simple we have just three uhuh nodes in this three workers in eachdemo the first demo is going to be abare metal demo of a of a two nodepietorch distributed cluster and we'regoing to use our periodic checkpoint sowhen you see that to checkpoint uh andfail uh we'll we'll manually walkthrough that demo so you'll be able tosee every step of this process and thenuh what we'll do is we'll fail over to aspare node uh and then you'll uh uhyou'll see it resume operation nowbecause it's periodic you'll see it rollback to the epic where it actually uh inthis case uh uh was checkpointed so theso when we do the checkpoint itcontinues running but when we uh do therestoration it will roll back to thelast checkpoint the second demo is uh uhwe're going to show more automatedthings so that's going to happen a lotfaster so you got to be alert uh butthat's a job set migration uh so that'smore of an event driven thing the eventwill be we we're typing in there to takethis node out of maintenance willautomatically take that node out ofmaintenance migrate the workload to thespare and resume operation uh that's alldone with a uh with with Q and the jobsets so let me uh now uh switch over tothedemos uhokay so this is the first demo it's a uhthe bare metal demookay it's running good so there's twotwo workers there a master and and and aworker and they're already running thisPyTorch distributed thing so you can seethe Epics and patches going uh they'rethey're scrolling by uh and also uhwe're we're running the GPU so we'repulling up the SMI Nvidia SMI tool toverify the GPUs are up and runningthat's on the bottomthere uh then also I want to show youthat we're showing you the TCP IPconnection so these things are chattinga lot with each other uh and so that'swhat's being highlighted there in the inthe bottom windows there in thedemo again this is going to be a manualdemonstration uh then we pulled up ourin the middle there this is our ourcheckpoint coordinator our synchronizerthat uh uh will make sure that these twonodes when they're checkpointed beloware are aligned with each other and sowe're typing in a manual checkpoint foreach one and uh so the uh so they'reboth being uh checkpointed andcoordinated and the the checkpointing uhyou can start seeing uh on the leftlower left the uh checkpoint dump filesuh now on the on the lower right uh thethe dump files as is being checkpointednow if you notice above the applicationis continuing to run so the checkpointoccurred fairly quickly and uh it wasabout at epic five when the checkpointoccurred but everything is stillprogressing and uh so now what we'redoing is uh we're trying to arsync thelower right uh uh worker to a spare nodewhich we haven't shown yet because it'sit's not I haven't posted it yet on thescreen there but we're doing an arsyncand just copying the files manually andso you can see all the the the fileslisted there on that lower uh right sideof the thewindow uh so it's taking a little whileto arsync it and thenuh then what we're doing is we're goingto kill that worker uh and then uhswitch the monitor on the upper right sothat we can see the sparenode so that's uh being set up rightnow and again this is the manual demo soI just wanted you to see this and uhwhile this thing is running I also wantto give a shout out to my colleagues Ronand Vincent uh back in California thatdid all the hard work of putting thesedemostogether uh and then u yeah so now theuh checkpoint coordinator is theninvoked again to do the the restorationprocess and all the all the files areare being uh uh reloaded and the uhsystem is beingrestarted uh and uh pretty soon you'llstart seeing the uh the checkpoint uh uhlog and I mean the distributed PyTorchlog uh scrolling alongagain uh yeah so there it goes it'sresuming and it actually resumed uh eventhough the the uh job act the the jobcontinued running we actually ended uprolling it back to about epic 5 in thisdemo so that's the manual uh demo of acheckpoint uh and a hot restart uh witha with a with a roll back to the lastperiodic checkpoint and now like I wouldlike to show you the uh uh excuse mehere the uh automated version of this sohere what we have is uh we're going todemonstrate taking a node out ofmaintenance using uh anoperator uh so we have three nodes setup and then a masteruh and uh yeah so and we're using uh Qthe Q project uh to migrate from node 3to node one and then also we're showingthis job set notion so the job set iscalled uh uh pytorch distributedtraining that's what we just highlightedthere that's the job set and then uh uhwhat we're doing here is just going toshow the log so you can watch the epicsand the batches go by and so we'll showyou later on that there's totalcontinuity even though we restarted itelsewhere now we're highlighting nodethree because that's the one we're goingto take down this is a maintenance CRDthat we set up uh in conjunction withour operator uh and uh then we also setup a maintenance shell script and allyou have to do is basically execute themaintenance shell script and just putthe node name in and then everything'sautomated after that and that's whatwe're doing right now we're invoking themaintenance shell script if you lookimmediately above in that kind of panelyou'll you'll see that there's a you'llsee a quick termination and and a startof the of the containers above it andthe other nodes so there there's thetermination there's the new creation ofthe new replacement container uh andthenuh see where are we here okay now nowwe're starting up this log thing againyou can see it pretty much uh resumedwhere it left off it'll scroll up somemore so you'll be able to see that whilewe're waiting for it to scroll up uhyou'll also be able to see thischeckpoint file that's where we we rightnow we're just dumping these into an NFSfile but just to show you where thesecheckpoints are uh we're we're justshowing you uh that in the upper uhright cornerthere so anyway there you go and if youlook at this bottom uh corner uh bottomright corner you'll be able to see yeahthat the uh we resumed right where weleft off basically on epic 9 batch uh uhuh batch 1000 there so uh so that's thedemo i hope you uh enjoyed that and letme uh let me make some concludingremarks hereuhsorry yeah so uh next steps and uh callto action so what I just showed youtoday is uh that we demonstrated thefeasibility of using an operator totransparently checkpoint and hot restarta distributed PyTorch uh a IML workloadand improve its uh resiliency and so ournext steps will be to to continue to uhuh this is kind of a a crawl walk runbut we're going to work on overheadoptimization characterization how toscale this up we believe it is should behighly scalable because everything canbe done parallel on each node uh thecheckpointing process uh and then alsowe we we have a lot of work to do onback-end networks uh we'll probablystart with the RDMA uh kinds ofconnectivity and then also telemetry andintegration with Prometheus uh and thenwe also want to continue to work withthe uh uh CUDA uh and the Creolecommunities to improve uh functionalityand capabilities uh and contribute backthere uh and then my call action for youfolks uh is uh we appreciate anyfeedback evaluation or specific usecases if you want to uh give this a anearly uh try that would be great uhevaluation and uh we we are veryinterested in working with the schedulercommunity and orchestrators and evenapplication frameworks i we seepossibilities of where this could beused in that context to help uh withcheckpointing or or to uh coordinatecheckpointing and then also spotinstances and uh we want to keep rampingthis up and validating things at scaleso with that I'd like to thank you verymuch and have Ganesh come back up here isee do we have any time left oops did weovershoot yeah we are a bit over thankyou i apologize thankyou i think we'll take questions in thehallway later and then uh we'll alsoupload the slide deck for reference andwith additional references to previoustalks too thank you thank you2025-04-15 21:58:07.240206 ff��vN#��#A3oWODC2mdk0great to see you all on a Fridayafternoon towards the end of theconference uh when you could be out inLondon enjoying the beautiful weather hieveryone I'm with uh Bernie here and I'mGanesh and we are going to be talkingabout intralevel transparentcheckpointing for resilient a IMLworkloads uh believe it or not that isme uh without a beard and I am asoftware engineer in the AzureKubernetes service team at Microsoft uhat AKS we run a managed uh Kubernetesservice platform which allows you to runa variety of compute workloads uhincluding lots of ML training andinference workloads and we have usersrunning these workloads at extremelylarge scales i particularly work on GPUprovisioning and making it easier to runa variety of GPU workloads on theplatform i've also previously worked onspeeding up pod start time through afeature called artifact streamingyeah my name is Birdie Woo i'm the VP oftechnology partnerships for a companycalled Meverge mever is a uh SiliconValley based uh startup uh that hassoftware uh that we've developed that uhhelps on memory virtualization and alsocheckpointing software uh and so we'rehoping to bring those two kinds ofcapabilities into the A IML area uh on apersonal note I I live uh part-time inuh Hawaii and I noticed that the cultureof the Kubernetes community in Hawaii isvery similar there's a there's a a veryfriendly culture uh uh one that helpseach other and uh and it's almost likean extended family or what we call Ohanain Hawaii and uh we also in Hawaii welike to welcome new people that's calledthe Aloa spirit so one of these days Ihope we'll have CubeCon over in inHawaii anyway back to you thankyou in today's talk we'll start off witha brief poll then talk about themotivation for why we want to do thisapproach and what are the other currentapproaches that people are taking toaddress these problems then we'll definewhat intralevel transparentcheckpointing is talk about uh use casesand limitations and then Bernie willshare about how it works and show somecool demos to see this in action andwe'll also talk about some open areasthat we could address together as acommunity for this part I would reallyreally love yourparticipation there'll be threequestions first one how many of you haverun GPU workloads in Kubernetes pleaseraise your hands wow that's great almosteveryone here has done that uh asexpected and then uh how many of you arenot satisfied with the existingsolutions for resilience things likehandling GPU failures or node downtimecould you please raise your hands okayalmost all of you who raised previouslytoo and how many of you care aboutresilience more for training orfine-tuningum okay that's about half of you andwhat about forinference all right there's a goodnumber uh but fewer than fortraining that's sort of the mix thatI've heard from conversations with folkstoo and then finally how many of you arefocused on improving the utilization ofyour GPUinfrastructure okay awesome awesomegreat almost all of you raised yourhands which is uh good to know becauseGPUs are expensive and you couldprobably save a few uh several thousanddollars of of money by improving theinfrastructureutilization and this is also validatedby many papers and reports that havecome out um you probably have seen theLlama 3 paper with how frequent GPUerrors are and the different types ofissues that have h that people areencountering especially fortraining and then we we've seen talks inthis conference as well talking aboutthe frequency of GPU errors there's alsosurveys which validate what you'reexperiencing around low GPU utilizationso about a third of users have less thanor equal to 30% of GPU usage accordingto this uh survey and but how do weaddress these challenges um today andhow can we make itbetter today many users are running GPUhealth checks uh that are integrated aspart of node problem detector and thosecan be also combined with a remedycontroller which can take actions likerebooting the node opyady mentioned Kubernetes has theAPI server which essentially manageseverything All the components connect toit While in slurm it's a little bit morecomplicated essentiallyHowever even if the architectures lookquite similar on the component level thehighlevel architectures and goals areactually quite different Kubernetesstrengths lie in its versatility andreconciliation You can deploy Kubernetesbasically anywhere and its self-healingabilities make sure that the systemkeeps running through upgrades andpartial failuresOn the other hand slurm as an HPCulerhas its strengths in schedulinglarge-scale multi-node workloads andoptimally handling the complex hardwaretopology present in modern HPC systemsandsupercomputers On the drawbacks slurmdefinitely exposes a lower levelabstraction compared to Kubernetes whichcurrently has somewhat limitedinterfaces for hardware control And alsowhile the community is hard at work atimproving the ease of deployment settingup a Kubernetes cluster especially onHPC scale is still a somewhat involvedprocess I'll also highlight twoparticular key drawbacks in the slurmecosystem that I'll be focusing on todayBased on the experiences in my researchgroup the HPC software stack is muchmore fragile compared to Kubernetes andfor example software upgrades are notseamlessSo both sides both sides have some keybenefits but unfortunately alsodrawbacks What if we could merge the twocommunities and ecosystems gaining theadvantages of both while minimizing thedisadvantages ofeither around a year ago this was thestarting point for my master's thesisresearchMy quest to map the HPC and cloudintegration landscape started on theLumisupercomputer Lumi is the eighth fastestsupercomputer in the world at the momentas it and is located in my home countryofFinland However through my exploration Iquickly gained the knowledge that Lumiis one of the trickiest supercomputersfor this kind of research None of theexisting solutions that I tried wouldwork So the scope of my thesis suddenlyexpanded quite a lotIntroducing Kubernetes Kubernetes is mymaster's thesis project that aims tobridge the gap between the cloud andHBC Kubernetes is a so-calledtransparent bridge which connectconnects a Kubernetes environment and aslur environment together And cruciallyKubernetes actually works onLumi Here's the architecture ofKubernetes There's a lot to take in Solet's do a step-by-step walkthrough ofhow this worksSo you have a Kubernetes cluster on thetop and an HPC environment in this caseLumi on thebottom On Lumi we have a set of nodesHere I'm representing them as Lumi Gnode one and Lumi G node2 When you deploy it itself consists oftwo components You have the controllerwhich is running in your Kubernetescluster as a pod and then you have theagent which is just a process that youstart on a login node The login node onan on an HPC system is essentially justan endpoint where you SSH into in orderto run workloads on themachine First the agent establishes anMTLS secured gRPC reverse tunnel to thecontroller Then the controller requeststhe agent to discover the nodes on theHPC sidefor each node one to one The controllerthen deploys a virtual cublet instanceThis is essentially a virtualrepresentation of a Kubernetes nodewithout the backendpart Then let's say you deploy apod You can deploy the pod eitherdirectly or through a job or basicallyany other means This is a standard v1/pod from KubernetesThe controller then picks up your podpasses it to the agent which then usesthe SLM commands to dispatch it as a jobIn this case that job is split into twotasks running on two nodes inparallel These new jobs then getobserved and pods will be created andtheir status logs etc All the themetadata birectionally synchronized onthe virtual cublet nodes We need theseso-called shadow pods in since we can'tsplit pods across nodes in Kuberneteslike we can jobs in the HPC systemthrough the use oftasks Finally Kubernetes associates thecreated pods for the job with theoriginal pod you deployed allowing youto see the status and logs like the podwould run natively on Kubernzetes So it'sfullytransparent and naturally any other jobsrunning in the HPC environment will beobserved and reconciled as well So wehave the observer observing and then thereconciliationuh leading to the virtual cublet nodesin Kubernetes getting a complete view ofthe state of the respective nodes in theHPC cluster And this is important if youuse schedulers on the Kubernetes side sothat they can use um the node and podmetrics in order to do smart schedulingdecisions for exampleThis wouldn't have been possible withoutthe great cloudnative githops tool fluxCD or simply flux which I modified andintegrated into Kubernetes to achieveconsistentsynchronization And I'll also highlightall the work put into the virtual cubletproject which is the backbone formirroring the entire HPC system intoKubernetes in the firstplace All right let's jump into a demoBut hopefully this works There's a lotof movingparts Okay it's visibleGreat So on the top left we have aKubernetes cluster This is just a singlenode cluster with a lot of virtualcublet nodes managed by KubernetesSo here we see that there's a 103 nodesThis is1,2 virtual nodes exposed by Kubernetesand one physical node If we scroll allthe way to the bottom here you see thatthis this note here at the bottom is theonly physical one actually running thecluster So let's hop into or actuallyI'll highlight this as well since I justgot this working before the demo Thenode CPU and memory metrics are alsoexposed So we can take a look that thesenodes are around 30% utilized which isnot that great and we have a memoryutilization of 45% as well But don'tread too closely into these numbers SWMis a bit wonky in how it reports thismetric All right let's jump into thisparticular node so we can try spawning adeployment At the bottom here I'mrunning SQ in a loop So this isessentially a view into all the jobsthat are running on Lumi uh with my useraccount The the pod we're going to bedeploying looks like this So it'sessentially just a standard V1 pod I'mdeploying something intentionallysomething very light here so that we canget it scheduled quickly since this islive So we're just going to be pullingthe Alpine container from DockerHub andthen running a loop which just counts to300 in 1 second intervals and we shouldbe able to also observe that in realtime In order to deploy onto aKubernetes node you need to add atoleration so that the port can actuallybe deployed on on one of these nodes Andhere I'm also using node name just justto target this particular node So we cantake a look So when also supportspassing arbitrary options to slurmthrough this label system and here I'mpassing in the time flag in order to addenough time for the job so that it cancompletesuccessfully So let's apply itHere we go And we see that it hasappeared and it's in the pending statusat the moment And at the bottom wealready see that the controller passedit to the agent and the agentimmediately dispatched it and it alsoimmediately started running because thisis such a lightweight job and therethankfully wasn't anyone else hoggingall of the nodes at the moment Buthowever we still see the pending statusup here This is due to Kubernetes beingessentially out of necessity poll basedSo there's a 10-second polling intervalbecause you can't watch resources inslurm So in just a moment we should seethe reconciliation of the pods alsohappening And there we go And we alsosee the shadow pod that was created forthis job So it's this one in a specialname space So just for the for thetracking purposes and superenetesassociates these two together so thatyou can just take a look at at the logsof your pod and see it running in realtime That's thedemo However we're not done yet As I'vehinted Sopenet is not alone in the HBCto cloud bridgelandscape Let's take a broader look atwhat'savailable here I've listed the sixprojects that I aim that aim to bridgethe gap between the cloud and HBCecosystems today The first threeInterlink HPK and Superetes are recentlydeveloped and maintained The next threeare not deployable currently but I'velisted them here for{completeness No is the predecessor ofHPK and is deprecated Then we have Kfoundry which which was presented at aprevious CubeCon but to my knowledge isstill not public And finally we have theSlinky Slurm bridge which is sort of theofficial solution from SCKMD thedevelopers of slurm but this is yet tobe developed and specifically t uh takethe last two with a grain of salt sinceI've needed to make some educatedguesses based on any public informationI could find about themSo first all of these solutions are ableto take workloads defined in Kubernetesin some way or other and deploy themonto an HPCsystem However of these only Kubernetesis able to do the reverse namely alsosynchronize jobs from the HPC systemback to KubernetesKubernetes is also the only solution toprovide a complete node view to theKubernetes node view to Kubernetes inorder to for example use um cloud nativeschedulers here and it can expose theentire node structure of the HPCenvironment to Kubernetes as well andfinally there's a firewall typically inplace on HPC systems and especially onLumi this is a challenging aspect forall of these bridge solutions in generalthe yellow warnings here are notnecessarily deal breakers but mightrequire some elaborate workarounds andfor Slinky this can't yet beevaluated however as great as Kubernetesis it also can't solve the two key HPCenvironment issues I highlightedpreviously namely the securemulti-tenency and high availability onthe HPC side in fact none of the HBCbridge solutions can fix this let meexplain these two issues in in a bitmoredetail So first themulti-tenency You can think of this likerunning a hotel The tenants here are thecustomers that book rooms to do work inand store their stuff in In this contextthe tenants can be for exampledevelopers or teams from various AIcompanies working on different projectsdoing work in training LLMs and storingstuff in the form of training data andresulting modelsAt the moment our hotel of Lumi needs toaccommodate around 3,400 people Let'ssee if that'sfeasible In order to support multi-enyproperly the tenants must be isolatedfrom each other In our hotel we do havewalls between the rooms Directlyaccessing the data from other tenants isprevented through Unix permissionsHowever our hotel also has no locks onthe door doors to the rooms Anothertenant can just walk into the hallway ofnetworking and directly access any othertenants's room where the other tenantmight for example be running a JupyterLabenvironment Through that environmentthey can then access the stuff the firsttenant stored in their room namely theirconfidential data andmodels In technical terms we don't havekernel nor user name spaces Andtechnically since we don't have processname spaces either the walls of ourhotel between the rooms are alsotransparent and you can basically seewhat all the other tenants are currentlydoing On the Kubernetes side there areexisting and performance solutions foraddressing these concerns You caninstall locks for your rooms by takingadvantage of for example networkpolicies spiffy andselium Containers also natively provideus with opaque walls by default sincethey isolate yourprocesses Let's next discuss the highavailabilityconcerns Unlike the cloud hardwarefaults and software upgrades in the HPCspace often involve downtimeunfortunatelyThis is a screenshot from my email inboxlisting all the issues with Lumi overthe past year where red entries areassociated withdowntime In terms of cost efficiencylet's do a bit of napkin math So Lumihas a total cost of €150 million and aplanned lifespan of 5 years We cancalculate an advertised yearly cost of€30 million a year If we count it thisway the cost of one day of downtime isaround €82,000This here represents 28 days ofcontinuous downtime So first of allthink about if your cloud platformswould just go offline for a month Thatsounds a bitcrazy but if we use the same formulashere this works out to 2.3 million eurosofessentially unusable capacity on the HPCsystem while software is being upgradedAnd the successor for Lumi has alreadybeen planned and a|llocated €250 millioneuros And while we don't know thelifespan it's probably under 10 years Sothe cost of downtime will probablyeither stay the same or or increase inthefuture So now that we understand theseissues at hand and that the bridgesolutions also aren't the ultimatesolution is there a way to unite thecommunities and ecosystems in an evenbetter waylet's board the seamless integrationtrain to take a glimpse into the futureWe'll start off at today'sstation Today's station is our base casenamely how Lumi for example is set uptoday Just a single HPC environmentjointly accessed by alltenants So let's throttle up and off wego Mind the gap between the train andthe platformFrom our base case first we move theusers up and then we insert a newKubernetes enabled world in a new cloudenvironment Now play pay close attentionIn order to retain the best parts ofSlurm while avoiding its drawbacks andalso leverage Kubernetes strengths atthe sametime the tenants now operate in isolatedSlurm clusters running inside ofKubernetes as podsThis is nowadays implemented already byat least three projects We have theslinky slurm operator not to be confusedwith the slinky slurmbridge both skdb products and then wehave sunk by coreweave and also operatorby nibbiusHowever obviously we still want toutilize the actual HPC hardware that iscurrently managed by an existing SOMinstallation on the bare metal This iswhere HPC to cloud bridges andespecially Kubernetes come into playKubernetes aims to be able to expose theunderlying hardware to slurm ourunderlying hardware and slurm toKubernetes faithfully enough in order tosupport running HPC schedulers withinthe Kubernetes clusterIt is a key component for essentiallyexposing a sort of large enoughKubernetes cluster to evaluate the newenvironment at top the existing softwarestack And this with this we've arrivedat stopgap station This marks anintermediate checkpoint that isachievable today with minimal disruptionto existing HPC customers and platformsHowever we can still gofurther from a stop cap up station Wecan now do somethinginteresting When we ready to commitsoetes together with the underlying baremetal installation of slurm can actuallybe entirelyeliminated And finally Kubernetes can bedropped down to control the hardwaredirectlyAt this point we've achieved our goal ofcombining the advant advantages of thecloud and HPC while minimizing both oftheir individualdrawbacks But why stop there on the sidewe've also gained the superpower to runthe cloudnative batch ecosystemalternative HPC schedulers such as FluxFramework or even completely custompipelines And thanks to variousmulticluster solutions in the communityspace connecting this new hybridenvironment to an external cloud is alsono longer an issue No more hacky bridgesneeded and no need to reinvent the wheeleither Just a single unified and modernplatform to build the future ofsupercomputingon And with that we've reached the finalstation This architecture has a lot ofpotential It empowers communities fromboth the HPC and cloud ecosystems tounify their efforts benefiting all theprojects in the stack and making themmore accessible to everyone We gainimproved security better automationbroader ecosystem support betterutilization on the supercomputerhardware thanks to the accessibilityimprovements and finally reduceddowntime and costs just to mention a fewadvantagesSo to conclude we as a community need tostart thinking about connecting HPC tothe cloud today Right now we alreadyhave a set of HPC bridge solutionsincluding Kubernetes to act as a stopgap As we transition towards bettersolutions we need to take care toimprove the security posture and highavailability of HPC in the processI also hope to have convinced you todaythat Kubernetes is an excellentcandidate for addressing thesechallenges There is no need to reinventthe wheel It's just a matter of thecommunities coming together for thegreatergood And finally if this presentationpiqu your interest hit me up I'm workingon finishing my master's thesis onSuperetes and I'm looking forinteresting job opportunities to makethis transition a reality Thanks[Applause]anyquestions Okaycoming Thank you Dennis for this veryinteresting talk Uh I have one questionwhich which kind of application aresupported right now because I supposethere should be a gap in the how the podthat are running in slurm are connectingor are sharing a network with akubernetes cluster is that the case yeahat the moment it is very difficult tobridge the networking So the obvioussolution would be to isolate the jobsrunning in slurm into a separate networkname space and then bridge that networkname spaceaces through a proxy orsomething to the kubernetes side Howeversince we don't even have networknamespaces probably the best we can dois to just implement a plain proxy thatthe application needs to be aware of andconnect to if they want to communicatebut that is not yet implemented inKubernetes All right thanks So yeah I Iappreciate also the table and it's alsobrave to to do that because you can missthings but uh overlay network isdefinitely one row that should be shouldbe there and for instance interlink canhave and another solution cannot andyeah yeah okay I will update it Thanksfor the feedback Thank youYou talk about uh bringing slurm HPC andcloud communities together What was thefirst reason or the reasoning uh in yoursuperic center Lumi that started themand you on that path because it requiresa lot of open-mindyes And it's a bit of an uphill battlewith that So I'm trying to get hold ofof the people to motivate in in conconsidering this transition and that'swhy I also presented sort of a couple ofintermediate steps that arenon-disruptive to the HPC environment sothat we can evaluate whether um whetherthis is feasible and also what kind ofchanges we need on both the HP HPC sideas well as on uh the Kubernetes side inorder to make this transition seamlessSo for example at the moment Kubernetesdoesn't necessarily have all of the finegrain control that slurm exposes on HPChardware that are required for someworkloads but h have did your people atLumi had the real use case for it theuse case here comes from essentially theusers of Lumi so this started from myresearch group and the I've also chattedwith a couple of other people who arefrustrated with the difficulties ofconnecting uh theseHPC systems such as Lumi uh to the cloudin order to run for example uh batchoperations like training So they havetheir existing infrastructure customermanagement front ends backends databasesetc on Kubernetes and they want tointegrate their uh AI training andinference backends with the with uh HPCecosystems and that's kind of difficultto do with the way you need to SSH inand run a bit run a lot of scripts atthe momentOkay Okay Thank youUm so I was wondering if there's anothertake on integration with cloud nativeand batch more batchy workloads Um so wehave a bunch of users who are familiarwith slurm and I was sort of thinkingmaybe an sbatch wrapper that couldcreate job sets that can be scheduled byQ might be funI've already thought about this and yesthis is very much an option and also areally fun thing to think about so thatwe don't need to run the entirety ofslurm in a pod like cluster entire slurmclusters but essentially just implementthe bare minimum front end and thendirectly maps into kubernetes nativeresources the users get what they'reused to yes and and I'm hoping that DRRAis going to over time give us a bit moreof the flexibility with absolutely thisheavy deviceuling Yes this heavilyrelies on DRA and also the uh finegrained aspects the fine grain controlaspects of DRA And maybe a a difficultthing with Lumi specifically is alsothat it's AMD based So the AMD GPUoperator is kind of an essential partthat needs to support DRA and all theinfiniband etc that needs to happen inthe Kubernetes space as well to makethis a reality I think there's some sothere's some cool stuff to get involvedin that especially with the nickuling aswell and networking So absolutelygood talk by the wayThanks All right Well if no morequestions then thank you once more2025-04-15 21:58:07.892924 ��&O#��AQbR908kgk1Ythanks everyone for coming to my talkThis is thousands of virtual cubletsonetoone mapping a supercomputer toKubernetes with superetes Let's getstarted So hi everyone I'm Deness I'm asecurity and cloud computing masterstudent at Ald University in Finland aswell as Nineu in Norway My cloudnativecareer started at Weave Works where Ico-authored weave igniteCurrently I'm working on my master'sthesis at the astroinformatics researchgroup at Aldo University with a focus onsupercomputer and cloudintegration So let's start with a quickpoll to see which backgrounds you'recoming from First raise your hand ifyou're familiar with high performancecomputing or supercomputing on anylevel Okay it's a bit hard to see but Isee a lot of hands Nice And then raiseyour hand if you're familiar with cloudcomputing or Kubernetes again on anylevel Yeah Okay A lot more as to beexpected for CubeConNice So let's hop straight into somecomparisons then to get everyone on thesamepage On the highest level the differencebetween cloud computing and highperformance computing is this The cloudassumes infinite resources and finitedemand while high performance computingassumes finite resources and infinitedemand So sort of the oppositeIn practice this means that cloudworkloads are typically bound to a smallset of nodes while there is nodecapacity for running a lot of workloadsin parallel On the high performancecomputing side on the other hand singleworkloads may span across the entiresystem limited only by the number ofnodes intotal Initially this paints a picturethat there are fundamentally differentassumptions at play when building forthe cloud versus high performancecomputing platformsHowever with the advent of artificialintelligence workloads in today's marketthe tables have turned for the cloudNowadays cloud and HPC requirements lookincreasingly similar The demand forresources by workloads is increasingwhile platform limits are starting toshow Last September the EuropeanCommission published a report by theformer European Central Bank PresidentMario Draghi Mario was tasked to preparea report of his personal vision on thefuture of Europeancompetitiveness In the report Marioencouraged opening up HBC capacity tostartups small tomediumsiz enterprisesand the broader AI community as a wholeHe also highlighted the importance ofexpanding Euro HPC to additional cloudand storage capabilities using unifiedpublic and private infrastructure forupcoming AI and cloud servicedeployments To understand what all ofthis is about we first need tounderstand the similarities anddifferences between the cloud and HPCworlds on an architectural levelModern HPC and cloud hardware isactually surprisingly similar So let'sfocus on what differs between them andspecifically it's the software stackMost cloud platforms nowadays runKubernetes as you probably know while alot of high performance computingsystems run something calledslurm As a side note for thispresentation I'm considering highperformance computing its abbreviationHPC and supercomputers all to beessentiallysynonyms So reduced to their simplestforms we can observe that Kubernetes andslurm are fundamentally very similardistributed systems Let's start from theclients These are the users of thesystem who interact with an APIinterfaceNext And there's some differences withthe API interface but in the Kubernetesside it's the API server and on the SLMside it's essentially a bunch ofcommands that you invoke through a bashscript Next in both systems we have aset of controllers that mutate thestate To keep track of thatstate including for example theworkloads deployed by the users Bothsystems rely on a databaseAnd finally both systems have a set ofnode agents one on each node whichhandle the actual execution of theworkloads The main difference betweenthe two lies in which componentscommunicate with which other componentsAs alrextware as interoperable aspossible so you want to reproduce thevery same resource on any kind of backends so the challenge in a nutshell wehave a distributed hogenousresources good and then we need to puton top a commoninterface two problems comes from hereso how we give access to the users tothis platform and how we maintain allour workflows consistent to each otherso for instance starting from thesoftware and the containerology that cancome from different container runtimeinterface plus we know that we have alots of users that actually don't carewhere they are going to run they justwant their jobs to be donefortunately enough in this case but alsoin particle physics we start toconverging to a set of tools that arecloud native so the main interface thatwe face in most cases is Kubernetes sowe left out just with a uh a questionhow we can merge a Kubernetes API accessto different kind of resources and welisten before for quantum computing iskind of the same challengeso uh someone told me that on Tuesdaythere were there was a nice talk aboutuh this technology that is interlinktotally not uh not presented by me sothere is a demo for the solution thatI'm going to briefly introd introducehere so if you're curious just check itout and the idea is to create a plugablesystem where we put adapters on top ofthe remote resources that we want tocontrol through the Kubernetescluster and we try very hard to keep therequirements for the providers the uhless inbases possible so you pluginterlink on top your edge node here andyou're good to go basically you create avirtual node based on virtual cublettechnology that is capable to schedulepod on your zoom supercomput forinstance and yeah so why we choose thisway well in this way we are capable togrant our users access to kubernetes ininterface with no compromise with no cdseverything just run like playingkubernetes so you have pause withannotation you select a virtual note andvoila you go end up into supercomputersthat do your machine learning pipelineor your digital twin and everything isreported back once learn it's acommunity effort of course and thecommunity is very uh interested inunderstanding how they pl can plug theirown providers and in fact we have notonly scientific use case but you canfind here in the list also enterprisethat try to uh plug their container as aservice solution in this way and thelatest news uh is that uh now thisproject is part of the cloud nativesandbox we just uh passed the vote uh acouple of months ago and so uh I takethis occasion to say that it's a reallythin and simple interface that isperfect for uh introduction of newcommitment or wider communities so weare really looking into get more peopleon board onthis so we have an interpace if you wishso a set of frameworks that can run ondifferent Kubernetes cluster that thanksto this adapter can run on on VMs withGPUs on HTC system like HD condo onquantum we are trying uh to see how wecan what we can do with this in with aGalatia supercomputer uh quantumcomputing and then we have HPC centersthat were where we started we need thesekind of centers to create digital twinsso we create a cube ray for instancefrom plain kubernetes andchartthis spoon up different working nodesbut pay attention these working nodesare living inside a supercomputereverything seamless and completely uhthe user are completely unawareeverything is happen below cool now wehave big powers we have also bigresponsibilities becauseuh we need to provide a consistentenvironment for our users we need inother words a spaceship and a spaceshipneeds to provide us with a consistencyuh of workflows we need to check ourworkflows we need reproducible resultsand yes we can do with the CI systemthat we have nowadays but not withoutlocking uh our yourself into a customsolution or reusing and copy pasting uhnightmare that you might know very wellwhat Dagger provides the solution wechoose for for this uh is a runtimebasically that it's meant to createcomposible software and go we scientistslove composible things so we took a lookat that and yeah we got pre�tty prettyfar because we have now repeatability ofour results we can compose all our uhsoftware in an efficient way and we canalso observe what happens when we buildand win test stuff so this is uh allgood but even more we have a universaltype system so we can code our pipelinein a very efficient way we have a netplatform where we have uh caching forour artifacts and we have built-inobservability and finally yeah the newuh the new thing is that we also havenative integration with LLM that ismight be uh of use in the next yearsthis is the situation different modulesthat does different things all togethercan be composed to create differentpipelines that we can run on uh ourmachine first of all so on your laptopyou can run those pipelines but the verysame pipeline we run all right uh we runin the docker engine in your CIsystem and all right so just one lastnews this sandbox that you send aroundwith between your local machine and yourremote machine can also host LLM aent sowe can have some improvements of ourworkflows based on our uh set of piecesthat we put into the puzzle and we shipalso all our software in a resilient andefficient way for more details I leaveall the links during the presentation sonow we have a an interpace and a daggerthat can match provide us the capabilityto run on external resources through aKubernetes cluster but those Kubernetescluster can run also in a sandbox on ourCI pipeline granting us all thecapability to be double checked beforewe push changes to to production andreusing also other modules that we usein other use cases so pretty good so fari would say we are pretty satisfied andwe are all set for the launch now pleasemate introduced to what we did with allthis stuff thanks a lot Diego for theintroduction very nice yeah so let's seein practice uh what we are been doingwith inner link and dagger so first ofall as introduction I'm from CERN openlab which is this um entity in the CERNIT department which is responsible forestablishing collaborations withindustry and academia so here you cansee our partners and I mean of coursefeel free to reach out if you have ideaswould like to collaborate we're alwaysopen to you know new collaborations andum as was saying we are interested indigital twins for instance so we'retaking we are participating into theinner twin project and um in thisproject we are responsible fordeveloping a component for scalable AIworkflows for scientific digital twinapplications this component isimplemented as a Python library which iscalled it twin AI and um it's uh it canyou can imagine that as a toolkit thatprovides scientists with different uhfunctionalities to support distributedmachine learning um training distributedhyperparameter optimization can supportPyTorch and TensorFlow and also has aspec strong focus on the machinelearning tracking so the uh machinelearning metadata data that is generatedduring training for instance alsoallowing you to store the models I meanto connect to a models registry whereyou can store uh the models and versionthem and so on and so forth and for thiswe're using for instance MLflowum when I talk about distributed machinelearning training I'm actually thinkingof two different models one is the puredata parallel distributed training inwhich we have a model which which is youknow replicated on different GPUs ondifferent nodes potentially multiplenodes and the data set is partitioned soeach uh local replica of the model willactually access a subset a specificpartition of the data set uh on theother hand we can also support modelparallel and hybrid model parallel anddata parallel training so in uh which isthe the image on the right in this casethe model is too large to fit on asingle GPU which is pretty commonnowadays for very large language modelsor like transformer based models so inthis case the model is distributed overmultiple GPUs and um to do this we relyon uh popular frameworks such aspytorch ray and the speed another keyfeature as I was saying ishyperparameter optimization so in thiscase you can imagine you have a traininguh tuning confi�guration in which youdefine the ranges for yourhyperparameters and then the itunatrainer will make sure that uh inindividual uh training trials are run inparallel so distributed over HPCinfrastructure over multiple nodes umand uh in in an optimized way um and todo this we rely on ray tune so let's seesome examples uh of digital twin usecases that we have in interwin that havebeen currently integrated in it theseare not the only ones but let's see uhtwo of them so the first one uh it's asit's tackling I mean it's as I it'sfocusing on the on the domain ofenvironmental sciences so onhydraological modeling and developing AImodel to improve early warnings fordroughts the other is from physics um sowe're collaborating with people workingon the data collected at the Virgo ininterpherometer um which is meant to youknow measure gravitational wave signalsand uh what they would like to do is touse AI based model to den noiseise thesignals captured by the detector so herewe can see an example of um um let's sayscalability analysis that we did with itso on the on the left we can see a plotin which we compare differentdistributed frameworks how they scalefor the same model on the same data setof course and uh on the right we can seean example of energy benchmarking sowe're also interested in studying howthe how you know different distributedframeworks for a specific model for aspecific use case and a specific dataset you know how much is the energyconsumption and you know in other wordsto highlight the different trade-offs uhanother example as I was saying in thiscase we used i20ai to performhyperparameter optimization of a of amodel for hydraological modeling andwith i20i we were able to reduce uh thevalidation loss of almost75% but so how do we make sure that thecode that we're developing in aai isalwaysconsistent um so as I was saying it isthis abstraction layer on top of populardistributed machine learning frameworksso uh the users will provide a trainingconfiguration and a tuning configurationand then the trainer will abstract outfrom the complexity uh below which arebasically different you know distributedmachine learning uh uh frameworks andalso from the HPCinfrastructure uh of course we can writesome tests uh and and we do that uh andwe we have of course some unit andintegration tests uh that we use forinstance to test the abstraction layerthat we developed for the loggers or forsome utility functions or some you knowgeneral purpose unit integration testingfor whatever any kind of feature uhthese are classical let's say these aretraditional unit and integration testsand can be run anywhere and these arenot the topic of this talk you alreadyknow them uh the problem is that somefeatures of it AAI are inherentlydistributed so how can we test thesefeatures uh you know everywhere like howcan how can we test these featureswithout having access to GPUs tomultiple nodes an example is the workeruh rank accation so what does it meanwhen we have for instance data paralleltraining each uh process assigned to aspecific GPU is also assigned differentranks and the ranks are needed in thecollective communication in in thecommunication between the workers so howhow can we test this uh on a laptop ofcourse we cannot I mean we we needspecific infrastructure same is for thecollective operations such as all gathergather barrier and so on and so forth sohow can we make sure that our softwareis actually implementing this theseoperations in the right way otherexamples are for instance saving andloading checkpoints in a distributedmachine learning uh setting uh end toend integration testing for distributedmachine learning training and also thepossib also testing that we can rundistributed machine learning trainingunder hyperparameter optimization sothere's like a an hierarchy so we have ahigh level uh distribution which is thehyperparameter optimization and theneach trial is also distributed so howcan we test this um well that's thetopic of the talk of course and um uhwe're also interested in you knowrepeating these tests for all theframeworks tha�t we're we're actuallyusing so you can see that there's a kindof you know matrix that is building uphere oops yeah so how do we how do weactually run the tests in practice so wehave distributed launchers which arecompatible with the uh distributed uhmachine learning frameworks for instancefor PyTorch so for torch DDP there'storch run if you if you know it uh ifyou ever heard of it and this will allowto spawn multiple processes and eachprocess is basically just a pi testcommand so I I I guess you're familiarwith pi test and uh and then each testcase will actually communicate with theother test cases using a collective ucommunication back end which is providedby the distributed machine learningframework uh and the whole thing willthen be possible to be uh tested on HPCso this is an high level representationof our tests now let's put everythingtogether we saw digital twins HPC AI uhI know hydraological modeling uhgravitational waves so great but thenhow do we actually automate our tests onHPC in practice uh we have our code onGitHub on GitHub we also keep our CIworkflows such as software qualityassessment uh unit tests we build acontain the Docker containers on HPCgreat but how do we integrate then theHPC resources into into into GitHubum so first ingredient is duggerpipelines so D already made a very niceintroduction the the the great benefitin my opinion uh what I noticed at leastis that you know you can avoid to pushand prey okay you have a reproducible CIthat you can run on your laptop you canrun it on GitHub on GitLab whatever youwant it will always be the same and isbased on containers which is prettypretty useful for us uh and second uhdagger pipelines allow to span on thelyservices in the pipeline and this willbe pretty useful for the next step whichis putting everything together withinterlink so on the left we have thecloud side so GitHub code CI pipelineDagger pipeline basically and on theright we have the HPC so the remote HPCsupercomputer um they use differenttechnologies so on cloud we can uh onGitHub we can build docker containersbut then we need when we want to run onHPC we actually need singularitycontainers so part of the CI uh let'ssay pipeline also consists in convertingthe Docker containers into singularitycontainers before they're actuallytested on HPC so it's um but let's seemore details what the pipeline lookslike so uh first of all of course webuild a Docker container and we run someuh unit tests like simple CPU only teststhat can be run on GitHub um next weconvert the docker uh image to asingularity image file and we push it tosome let's say singularity registry it'snot it's not shown here but and then weactually deploy inner link on wellactually we deploy k3suh inside the dagger pipeline on the flyand uh on top of k3s we deploy innerlink so now uh by magic we are able touse interlink and submit jobs to HPCwithin this pipeline and so the last thelast step let's let's say the centralstep in this picture is let's actuallyrun the tests on HPC so we using a linkwe submit the jobs uh if the tests arepassing great we can publish the newversion of the container image both to asingularity image um registry and to ado to a docker container registryum yeah so this is actually implementedas a dagger module so you have also alink um yeah you can find it on thedagger verse uh I'm sure everyone canread here but um this is of course justto you know give an overview of the codeso in practice how does it look like umuh we've been defining three daggertypes are called so first one iti sothat's that's you know where we have thelogic to build a container to connect toinner link uh to run the tests on HPC Imean to um to convert to singularityeverything all of this is on GitHub soof course you can always have a look atthat then we have the inner link cont uhthe inner link uh type which you knowallows to bootstrap the inner linkservice and the last one is thesingularity step which actually uh Imean the singularity type which actuallyallows to convert the docker containerto singularity image so here we havesome examples on how to use thispipeline on the left uh if you could seeuh you can actually see some examples ofum how a container I mean how a CI/CDpip I mean CI pipeline can be built in amodular way so if we want we can justbuild a container and publish it orbuild a container and open a terminal init immediately or I don't know build acontainer test it on CPU only let's sayenvironment and publish if you want soit's it's really modular whereas on theright we have some examples on how toactually spawn inner link and uh submitjobs from the same dagger pipeline so wecould you could even just use inner linkwithout caring about all the rest on theright you have some examples on how todo that um so let's see an example sothese are the this is this is the thethe end to end workflow so we launch wepush let's say to GitHub this willtrigger some um let's say GitHub actionsor some some some workflows uh whichunder the hood will run a Daggerpipeline here you can see the the thetrace of the Dagger pipeline on theDagger cloud so uh I mean of course I Icannot you know open the tabs but it'svery it's very uh very nice because youcan navigate inside the whole tray sohere we can see that at the beginning wewere passing some variables as secretsthen uh the the container was built andthe last step is actually the uh releasepipeline so uh you know push I mean thedeploy inner link run the test on HPCand and and push the final images andhere we got the final images so thedocker images will be pushed to thegithub containers registry whereas thesingularity images will be pushed to aharbor registry which is currentlyhosted on cern resources and here youhave the the you know uh complete userstory now let's let's wrap up um whatcan we currently do uh well of coursebuild containers in a reproducible uh CIpipeline thanks to Dagger uh we canintegrate the factor HPC into GitHub andwe can automate the whole thing um sointegrate the CI tests on HPC throughGitHub using GitHub actions Dagger andInterlink next steps um next steps areto scale up so we can create more testsfor instance as I was sharing before weare really interested in studying thescalability of um of our code so wewe're not all we're only not onlyinterested in checking that the code isrunning but we also want to make surethat our trainer I mean when we you knowmake changes to our trainer we're notactually introducing some uminefficiencies so one one set of testscould be you know having some baselinecode some some baseline referenceactually making sure that the code isactually you know scaling is alwaysscaling the same way or even better uhor on the other hand also studying thethe the energy consumption for forsimilar reasons second could beintegrating new uh HPC centers so at themoment we have just been working withVega in Slovenia and the third one it'spretty uh interesting because we couldso the whole CI that we've been writingso far is for our code so for right toAI for the abstraction layer but wecould extend this to any use case thatwould like to you know write dry runstests for their um let's say machinelearning training for instance before uhcommitting to very large jobs on HPCwhich you know take time and maybe youneed to wait a lot in in the queue andit's not very uncommon to you knowallocate resources on HPC for I knowmany hours on know many many many GPUsand then for some mistake you know itjust crashes right so but sometimes theallocation is not always releasedimmediately so you may you may be payingyou know you may be charged let's sayfor comput time anyway so it's it's itcould potentially be uh pretty usefulfor for our users also to extend whatwe've been developing to their uh let'ssay day-to-day development for HPC codefor AI onHPC and uh with this thanks a lot forattending you can you know of courseleave a feedback and uh I left somereferences if you want to know more fromthe slides um but yeah thanks a lot yeahthankyou and if you are interested in in anyof this of course we are here until theend of the day so you can reach out andwe can talk about Yeah yeah2025-04-15 21:58:08.612831 ��?P#��5AbIxw1uK0QRQgood morning everyone uh today we aregoing to talk about something verypeculiar uh how we managed to developand to create a platform to developdigital twins in a hybrid cloud plus HPCscenario rightso first of all presentations i'm DiegoChangotini from INF that is the NationalInstitute for Nuclear Physics in Italyand here with me on the stage there isMatabunino that is from the CERN openlab so you might have heard CERN inafanit's on the same kind of of track we aredoing physics with particles and so whywe are here talking about um digitaltwins well there are several reasons uhand several institute and projects thatare working into whatever we arepresenting today i give you here justthe links so you can check later ifyou'reinterested what are digital twins then ilove this uh sentence that summarizevery well uh what should be uh a digitaltwin so it's a digital representation ofa real world system and yeah it it's notworking like that but kind of you knowso you have projecting wildfire and youwant to uh forecast whatever will happenin in case of this disaster or in thiskind of other disaster you want tounderstand the impact of floating onyour um landscapeand then there are other digital twinsthat can be very useful and for instanceone that was just uh mentioned before uhin the nice panel that was mentionedbefore uh is particle detection so wehave this big camera that are uh trainedand projected to create photos detailedpictures of what happens betweenparticles why not creating a digitalteam of these big cameras and facilitateour life and same for noise we have todo precise measurement we can simulatenoise or teach a machine to simulate thereal world noise we have in ourenvironments from all these use casesthat share very similar uh needs westarted investigating possibility tocreate an engine for digital twins tosomething that will serve multiplecommunities and make them adopt and useresources that are shared acrossdifferent providers so you see intertwinis the project European project uh wherewe are working this out and on one endwe have the providers so that can spanfrom cloud providers to supercomputerseuro HPCsupercomputers and on the other side youhave frameworks that users use to uh dotheir digital twins what are thechallenges there so you have to providea platform that is capable to supportall different use cases all differentframeworks but also you have to mergethat with the offloading capability ofevery task on different kind of backendsthat not necessarily are able to run uhcloud payloadsfinally since you have this multiplicityof back ends you want to uh maintain allyour sof~�ERN umresponsible for the teams uh doingcloudnative deployments as well asmachine learning deployments and uh nowalso starting to look at uh quantumcomputing management awesome thank youand as Nati already said um I also workat Broadcom uh my name is Nikita i'm aprincipal engineer there i've beeninvolved in the Kubernetes space for along timenow quantum space interests me and Ithought why not we talk more about thisat CubeCon so let's get started um we'vetalked a lot about the AI hype so I justwant to circle back and talk about thequantum hype when when's that going tohappenso I I think we're all learning aboutquantum computers and we're stilllooking for real world use cases butthese systems are developing rapidly andwe know that at some point if we takecryptography for example you know thatsome of the mathematical problems whichum those are based on could be attackeduh by quantum computers um but it is itisn't just hype there are actuallyservices out there i I work for IBM forquantum services in the cloud and so itis about then making those available topeople so that they can startexperimenting and start learning um withthemyeah I can say what we are doing at CERNso uh at CERN we have two maininitiatives around quantum computingactually there have been people lookingat this for for a few years now uh thefirst one is uh CERN uh helps leadwhat's called the open quantum institutein Europe uh so we have a a role thereand this project is more about thegovernance and the access to the thistype of technology uh we have anothereffort that's called the quantumtechnology initiative which is actuallynow in phase two which means uh therewas a phase one before so it's it'salready progressing and here it's moreabout the technology aspects and notonly the development of the algorithmsand identifying the use cases wherewhere things can match but also ensuringthat uh we can manage these workloadsand deploy them uh uh efficiently andcostefficive as as well uh so yeah we'requite involved internallyand working in uh in PIuh postcontography is one of the hottesttopics since NIS standardized the firstquantum safe algorithms in August lastyear so it consumes a lot of our timesand it's a big topic on all kind ofindustry focused or cyber securityspecific focused uh conferencesdo you want to go ahead um so one of thethings that kind of got me into thisparticular space is we were talkingabout cryptographic uh agility so howcan you quickly change cryptographickeys and it sounds like you are alsoworking on something very similar tothat one of the things that had sort ofdriven the projects that I was workingon because I used to work in research isum the US government had basically madean announcement in 2022 about having tosupport quantum computing basically by2035 and because the field is so nentthere's so like like everyone's kind ofsaying here there's a lot that we don'tknow there's a lot that's still going onthere's a lot of research that's stilloccurring and one of the things that'sreally important is try to figure outlike how can we get there because we'restill pretty far behind there arecompanies such as drug companies likeFizer that are using this to kind oftest out molecules so that's like one ofthe more practical ways if you will ofworking on it and then you also havelogistic opportunities as well that kindof plays in with AI workloads as well asquantum computing where they're tryingto figure out best ways to um identifydifferent routes and uh and and ways toget aroundso I think you all raised a really greatpoint especially around cryptographythat there's a lot of changes that weneed to do so c can we like talk alittle bit more about that so in say forexample in this whole cloud nativeecosystem or just kubernetes to startwith what changes do you think we'llneed to bring ini'm a lot of almost all security inKubernetes be it MTLS or uh digitalsignatures JSON web tokens uh codesigning you know everything relies onasymmetric cryptography and that is asNigel said you know the algorithm RSAelliptic curve diff those algorithmswill be broken w�hen a large enoughquantum computer sees the you know thetime of light and shores algorithms canbe run on it so that's and of courseeverything not only kubernetus buteverything in society relies on thesealgorithms that it means yeah we have toupdate things or what we today think issecure will not be secure anymore i Ithink another part of this as well is weneed to understand what cryptographywe're actually using so getting acryptographic in um inventory you knowwe've talked about sbombs for um bill ofmaterials there's a standard calledseabbombs which also kind of augmentsthat with information about yourcryptography usage and and that needs tobe you know it's not just about what'sin Kubernetes and how we buildKubernetes but it's also that awarenessand that usage you know far beyond thatum so yeah I think uh understanding whatyou've got prioritizing as well so wehear of this you know harvest nowdecrypt later so the idea here is we cancapture traffic at the moment save itand then come back to it when quantumcomputers are able to break it thatinvolves looking at the risk it's goingto be relevant perhaps for your veryhigh value data that has a long lifetimethat's still going to be valuable in 510 years time it may not matter for yourshorter lived data so I thinkunderstanding what you've got and thenworking out how you kind of prioritizeis is really really important you won'tfix everything at once and we in theopen source community have to help youknow things again like the S bombs and Cbombs may be a small part of that do youalready see work happening in any of theprojects or anywhere in the communityso I'm actually a TSC lead for a projectthat's within something called the PQCApostcontent cryptography association wehave implementations of some of thesestandard track algorithms like MLDDSA MLChem um and are working towards makingsure those are high assurance so that wecan start including them in stacksthere's other efforts going around inother projects openssl for example 3.5 Ithink is coming out soon so that's goingto add support for some of thesealgorithms and we'll see that filter upthrough the stack so I I think for allof the project maintainers working onother components it's about um beingaware of what's happening with thosedependencies and and and making thatavailable through your um particularprojectsthis is not to put you on the spotbut um IBM's working on an open sourceuh Kiskit SDK right so that's prettyavailable as well for people to kind oflook at and play with but I don't knowtoo much about it i don't know if you dothat's why I said I don't have tools soso yeah Kiss KissKit is an open sourcetoolkit for developing quantumapplications um that absolutely is outthere it's it's been there for a whileand and people can take a look at itthat itself is open source will workwith multiple backends but yeah I wouldI would definitely say if you'reinterested in quantum computing uh lotsof companies have education materialaround there so I think it is somethingthat um is good to start learning aboutand understanding and understanding whyit's different it's it's not thatquantum computing is going to replaceclassical computing it's actually how itaugments augments it and how you knowsome of those little um functions thatjust happen to work well on quantum canbe part of our standard businessprocessesyeahum I I just wanted to the one thing thatI found interesting was you said sothere are changes that are already beingmade in certain projects and that isthat's something that we're going to seeup the stack going forward so from aKubernetes or just someone an platformengineer perspective right so when whenI talk about migrating clusters to aquantum safe future what does thatreallymean i mean there's a couple ofuh two a couple of things one is youknow modern have making sure theinfrastructure is modern so oneprerequisite you could say is TLS 103 ithink u in most kubernetus cases it'sprobably modern enough so ts 103 is usedbut other legacy organizations have ajust a huge uplift to move from TLS 10.2two to TAS 1.3 and things like th�at whenthat hap after that is done you alsoneed to make sure you all the componentsare upgradable easily right you can't bestuck with legacy for years by now youhave to be more agile and updatable andand keep track and then it comes intothe crypto agility aspects uh last rightso projects can't hardcode for RSA or ECanymore it has to be configurable soit's easy to update you know once theuh development reaches productionmaturity so you're not uh stuck thereeither you don't have to redevelop toomuch of the components so those are Ithink uh three key aspects that thecommunity really has to step up to I Ithink one point also when we talk aboutum making use of these postquantumalgorithms within our software stacks issometimes times that you know the keysizes may be bigger right the packetsizes may be bigger uh in many cases youmay be using hybrid schemes where you'recombining um traditional encryption withsay elliptic curves together with uhquantum safe encryption and again thatcan increase uh resource um sizes andCPU maybe especially important with umvery small um sort of transactions orvery high volume yeah and for someapplications it won't matter at all butfor some applications it will may havehuge impact So we don't know until westart testing so kind of starting toplay around early I think is importantso let's switch gears a little bit uhwe've talked a lot about security andthe cryptographic side of things butwhat about running quantum workloads onKubernetes what gaps do you see rightnow and where are we atokay I can try that one so thecryptography is is not the main focus uhuh where I work for for uh looking atquantum computing um there there areclear use cases where we already haveworkloads that have been seeing uhbenefits from this so I I'll give twoexamples um one of them is beamcalibration so we we have large particleaccelerator we have proton beams goingaround calibrating these beams isactually a very hard task um so peoplehave been looking at quantum uhalgorithms to to help out with this todo this live they actually validatedthis algorithms with a live proton beamwhich is uh quite interesting uh thesecond one is a lot of effort uh aroundquantum machine learning um so we talkedabout how what's the complication thatquantum computing is bringing to to thecloud native area after AI it's actuallyvery tightly related in some of theworkloads so even things like the Higsanalysis that the Hig boson which is alot of what we do is just analyzing thedata coming from the detectors andtrying to discover things uh there arequantum machine learning uh algorithmsthat will help us with this um they dohave issues right now because of the theproblem being um the dimensionality ofthe problem not being adapted to thecurrent hardware we have for quantumcomputers and this is where things getuh uh interesting because you can doreduction of the problem usingtraditional machine learning algorithmsand then do the second step usingquantum algorithms and this is the mainchallenge we have on the platform sideis that we have to integrate this kindof hybrid uh scenarios where we havemore classical and and quantum workloadsin the same stack uh so this is what weare uh investigating from from aplatform perspective is how we can havethis very complex workloads uh how tomanage them in a hybrid uh world whichwill stay there for quite a while thereare other challenges uh which are moreon the infrastructure side which is wevery likely will not have quantumcomputers uh on site anytime soon sothey are remote so we need to integratethem in much more of a HPC like waywhere you send workloads to the cloud orremote and fetch the results and thenintegrate with the rest of your analysisa lot more than than the tightly coupleduh infrastructure you would see on atraditional data centeri I think also and the other side ofthat if you like from a providerperspective um is that Kubernetes iscritical to us when we offer quantumservices um through the cloud becauseyou know the actual computation thathappens on the quantum comput is just asmall part of that there's a� lot ofpre-processing there's a lot ofpost-processing that occurs there's alot of control um that's involved aroundhow you manage these systems and thenthere's all the usual kind of boringstuff right whether it's um you knowCI/CD process whether it's login whetherit's authentication and all of thatagain is is based on Kubernetes soKubernetes is a critical part of makingthis utility available to people and Ithink the parallel with AI is also verygood in this case because the when whenthe AI hype started a couple years agothere was this notion of what'shappening to cloud native where is AIgoing but all of us that were doingcloud native for a long time like therethere's not a lot of motivation to goand reinvent the whole stack and redoall the tools that we all uh rely on forseveral years uh it's it's there's astrong motivation just to integrate thistype of new workloads maybe adapt theexisting systems to accommodate properlyand I think the same will happen uh forquantum computing as wellm do you want to add something oh no nono i was just going to say I was theyboth kind of already said what I wasgoing tosay basically I was just going to saythe similar thing whereas in terms ofhow Kubernetes is currently interactingwith quantum computing and of coursewith AI workloads right now is it isacting as of course the orchestrationcomponent aspect of it but everythingright now in that space is hybrid but Ithink u the thing that Ricardo was kindof mentioning that I thought was reallyinteresting is like where's it going togo from here and what does it mean whenit's not hybridso we we talked a lot about like youtouched base on quantum machine learningand such um so if someone's hereinterested in the audience about it whatshould they do about it like how canthey what's the next step for them ifthey want to tap inand get quantum machine learning yeah souh I think the first uh the first uhstep is to actually have a use case thatcan benefit uh machine learning isactually a pretty good uh use case forquantum computing uh the the I don'tknow if you if you listen to peopletalking about quantum computing is whatwas mentioned before a lot of peoplethink this will be the next generationof computing this is clearly not thecase there will be a hybrid world so Iwould say the first step is to reallyidentify a good use case uh the secondthing is that uh what was mentionedbefore which is there are there aretoolkits that can simulate quantumcomputers and this is a really good stepto introduce yourself and and your toolsto to this type of workload like KisKitis a very good example the third thingis that quantum computers are real andyou can actually get access to throughthem to them through even your publiccloud providers that you probablyalready have access some of them offerservices in their catalog that give youaccess to quantum computers uh I'm notsaying it's easy then to start usingthem that's why the integration withKubernetes is is interesting there arechallenges i'm happy to expand a bit onsome of them but uh but you can you cando it uh it's it's it's real it's notcoming it's there yeahdon't youalso just discussion here i mean I thinkthe quantum algorithms they work quitedifferently from you know what we'reused to from classic computing so isn'tthere a quite steep learning curve inorder to kind of understand the actualutilities for uh these new machinesyeah oh this this is luckily not my jobwhere I work i work on the platform sidebut we do have the advantage of being aphysics laboratory so a lot of peopleunderstand this technology really reallywell even the theory behind it and theythey are pushing the boundaries thisthis is why CERN has a lead role inthings like the open quantum instituteand the quantum technology initiative isthat we have the knowledge in house ofhow these things can have an impact forus and potentially for other use casesas well and I think the other thing inthis space as well is as we we learnmore about these algorithms more of themare developed then they themselvesbecome offered as a service that otherpeople can just co�nsume and so you'redoing you have a certain financialworkload and you're looking at someoptimization problem you're going tocall out to this algorithm the otherthing that then becomes interesting froman AI perspective is your AI model cancall tools your AI model tool AI modelcan call these services so it itself canhave access to these uh quantum basedalgorithms as well as then AI being usedto help you um develop those algorithmsand write your software just like youwrite your normal software um so I thinkthere's a a lot of potential with withAI on multiple sides of this that thatmakes sense i just out of curiosity sowho here has played with quantumcomputers or just has some experience inthis area be it the cryptographic sideokay just two three four maybe and I getwho here is interested or maybe five whohere is interested to get into thisspace okay that's abunch so for folks who are interestedwhat more can they do like what are thetangible things for instance when I waslearning more about it I was like okaythis is cool but this is it looks likesomething that I cannot take an actionupon right now so for instance there isthere are some maintainers of projectsor people working on certain projectsinternally in their companies whatshould they be doing what's the nextaction item for themi'll give a first shot i mean from a uhI think myself you know when thepostconic photography came along havingbeen worked with this a long time youknow it's it's fun it's new things newalgorithms uh applied in similar waysbut it's it's a lot of fun so you canhave a lot of fun with this and but yeahyou have to you know learn how the whatthe new algorithms are how they work howyou apply them so there's a lot ofthings you can uh study up on then youknow how will the new uh hybrid keyexchange work in in TLS for example andthen yeah crypto agility as well there'shas been a lot depend regardless ofwhich industry you work in be itfinancial or government or whatever nowthere's a huge flurry of you know newwhite papers coming out how to applycryptogility that just came out a coupleof weeks ago one from from NIST which isa real good you know almost educationalwhat are the things you have to thinkabout when it comes to crypto agility ifyou're developing stuff if you'redeploying stuff in your platform whatare different aspects you have toconsider so there's a uh great lot ofgood material out there to to read up onyeah I'd echo that I think there's a lotof material out there whether it's onthe cryptography side or whether it's onthe quantum computing side and I guessone thing for anyone who's new to anarea is it is I mean I know this as amaintain maintainer on projects rightit's such a great thing when new peoplecome along and have new ideas or say whyis this so complicated or why is thisdocumentation poor and that's where newcontributors I think are just soimmensely valuable you think you comealong to a new topic area and you don'tknow a thing but actually you'rebringing that extra insight um and thenhelping to spread the word um morebroadly so I I think I would encouragepeople to try and get involvedUh I would I would also add this is avery new area um I will give an exampleof how early it is there are concreteuses but it's quite early so uh forexample we started uh procuring foraccess to quantum computers becausethere are a few around the world uhthere is no standard to define uh howyou measure the workloads or how youcost or you do the costing of theworkloads now to procure resources whenyou don't know how to actually definethe cost of what you're using is reallyreally hard and this is where uh notonly the the this area needs quantumcomputing experts and IT experts itneeds all sorts of people it needs thebusiness people to actually start makingsomething more concrete uh from this uhthis uh services uh it's it's reallyfunny when you when you start lookinginto it how how different uh eachquantum computer is is defining how yousubmit a workload how you measure theefficiency how you you can actually askfor access and time uh so it's quiteearly which makes� it really interestingin all sorts of uh areas so even ifyou're not like a a quantum expert uhthere there's a lot to contribute to uhso there's a lot of resources it lookslike um so while I was putting togetherthis panel so I would probably try tokeep this as the last question and havesome have some time for audiencequestions so while we were putting thispanel together um there was one morepanelist who's going to join us Paulfrom IBM so when we were discussingabout this we realized that there reallyisn't a space in the CNCF communitywhere we can a forum where we can talkabout this so Ricardo I'm going to putyou on the spot you're in the TOC thetab uh so where is this forum now whatshould we do yeah I I think it's good tolook uh there I think there are twoareas one one is to look at what we didwith AI um there was a lot of uh demandto have cloudnative uh infrastructureevolve to accommodate AI uh and thisneeds two two parts one is to to do thegovernance of all these projects uh andto integrate them in the ecosystem so Ithink that that roles uh belongs a lotin the what we have in the technicaloversight committee so we created thisAI working group that has a lot of thepeople doing the work uh gettingtogether frequently uh they published awhite paper that kind of defines thestate of the ecosystem and gives somedirections of where where we should gothe other part is the technical moretechnical evolution of the platform andwe learned from from AI that Kubernetesactually is extremely flexible inaccommodating workloads that are notjust managing nodes or containers uh itbecame an orchestrator for a bunch ofstuff and this is the area where we canlearn a lot from what we did for AI orHPC all the new scheduling primitivesall the evolution we did in this partthis is essential to accommodate newthings like quantum computing and thefact that we were successful doing thisfor AI really gives a lot of optimismthat we'll be able to do the same forfor for this new era as well nigel doyou want to talk about the PQCAsorry the PQCA uh the PQC uh yeah soagain as I said there is some work goingon in something called the postquantumcryptography association that's arounduh curating some of the work in thisspace it's part of the Linux foundationit's not part of CNCF uh and I think youknow our our groups at like PCA and CNCFneed to work closer together on this andalso other groups like there's open SSFof course as well uh when we're lookingat the code analysis side yeah we we hada workshop on Monday at the maintainerssummit about what we call the tag rebootthe tag is the technical advisory groupsin the in the CNCF uh and we propose tohave more flexible way of startinginitiatives so I think launching whoeveris interested in this area we shouldpush to launch an initiative to to startgetting people and discussing moreclosely uh to not only have like a panelin during CubeCon but have this going uhsteadily during the year as well yep soif anyone's interested please findRicardo after the session and maybe wecan spin something up um so to summarizeuh the key takeaways right and tosimplify itreally I I see mainly three takeawaysone is that quantum computing isn'tgoing to replace the classical systemsbut in fact it's going to augment themthe second one we've talked a lot aboutcryptography so there is a hugetransition that is coming to quantumsafe cryptography there's a ton of workthat we need to do and we need to startright away and like like how we talkedabout AI Kubernetes is going to play anintegral role in quantum workloads andwe need to start looking at fillingthose gaps and doing all the work behindthem um so with that I'll let's let'send the panel here but we'll open up forquestions from the audience uh there's amicrophone with a stand over there ifanyone's interested[Applause]so we we saw five hands raised before socome forward and ask questionlike five hands okay so I guess noquestions so you get some time back butif anyone's interest Oh I see I see afew questions over there okayhi uh this is mainly for Ricardo but canI open to everyone uh you mentioned someof the challenges you faced whenintegrating quantum workloads withKubernetes i'd be interested to hear abit more about these right so uh thechallenges are still there i'm not goingto claim that we sorted but I think thesummary is uh that we have this um theworkloads the quantum workloadsintegrate with with the classicalworkloads so we have this hybridscenario where we need to delegate apart of the analysis or the process toquantum computers fetch the data backthe results back and continue uh thefact that this uh uh infrastructure isnot where the rest of the infrastructureis poses a problem on itself so it's thesame issue we discuss constantly in thecommunity about multicluster hybriddeployments this challenge is even morevisible in quantum computers because theinterfaces are not necessarily what weexpect and this is where I think if theproviders of these quantum computersstart offering the APIs that we expectuh it benefits a lot everyone the secondone uh is what I mentioned before isthat we when we want to procure oraccess this quantum computers the way wedefine the workloads is very differentbetween the different uh devices we haveaccess to there is no standard acrossfor defining uh units of computationacross the different quantum computersthis is also a challenge um and the lastone is that a lot of the algorithms arealso suited to specific devices uh sothat there's a lot less because of thele lack of standardization some of theworkloads are fitted to a specific typeof quantum computer or a specificimplementation which also poses quite alot of uh um like uh challenges not onlyin in orchestrating these things uh butalso making sure we have time availablein that specific device and this is uh Idon't know we all have this issue withAI currently with the lack of thescarcity of GPUs uh it's much worse interms of quantum computers like if youhave a lot of people interested thereare not not a lot of devices availableto actually uh move forward so yeah Iwould say this is what we what Icurrently face i'm sure if you go up theup the stack people are facing otherchallenges as well thank you very muchhi thank you for the panel that wasreally great um so in February Microsoftum unveiled their major 101 um quantumuh processing unit um so my question isa little bit more about um ethicalityand is there any consideration aboutethicality to um releasing them to asort of wider audience uh before theystart you know breaking cryptography umyou know on a wider scale thank youthat is that is a that is a hardquestionah how do we answer that the ethics theethics[Music]um I guess it's like any service what gogo yeah i mean it's security has beenhaving this kind of history a long timecoming with CVS and exploits and thingslike that so uh probably when it comesto you know when finding acryptographically relevant quantumcomputer comes along well it's might notbe from an ethical hacking group or anethical organization who knows so Iwell hopefully it's going to be verygradually and obvious when it happens soit doesn't you know the one of the kindof joking things is how do you know acryptographically relevant quantumcomputers is there well that's when allbitcoin are are gone or all yourbitcoins are stolen right but we'll hopeit it doesn't come to that so yeahdefinitely the organizations like IBMthough who builds thesethings probably thinks along these linesright not going to throw out somethingthat does it easily and it's going to beexpensive as hell and and I think thisis also goes back to that point aboutprioritization and what is your highvalue data and and focusing on knowingwhat cryptography you you use what theimpact of that is deciding where youneed to put those protections in placenow um don't try and fix everything butbut focus on that priority because wemay not know uh when it happens untilafterwardsthank you very muchright um I think we're at time uh if youhave more questions please find us nearthe stage and if you're interested instarting more discussions around it alsoplease find us near the stage2025-04-15 21:58:09.255500 � ��Q#��cAOAb54JRIS6Mhello everyone welcome to the panelabout quantum computing and Kuberneteswe've talked a lot about how Kubernetesis complicated now we make it even morecomplicated by throwing quantumcomputing into the mix um and so most ofus have heard the term quantum computingmostly as a theoretical pursuit we'venot really seen what the real worldapplications around it are and what doesit really mean to run quantum workloadson Kubernetes and how we can make theKubernetes and cloudnative ecosystemquantum safe we've talked a lot about AIthen why are we even talking aboutquantum now because AI is already socomplicated because if we don't starttalking about quantum now we're going torun into trouble later on so we need tomake progress while we still have timeso since we're short on time we've gotjust got 30 minutes and I'll try toleave some time for questions let's diveright into it so we've got a wonderfulset of panelists here uh why don't youall go and introduce yourselvesso my name is Natalie Fischer i work forVMware by Broadcom with Nikita on theKubernetes um area stack so uh I work asa product manager therehi there everyone i'm Nigel Jones i workfor IBM research um so I'm involved inpostquantum cryptography and uh quantumand our services there and now doingsome work with AI so it's an interestingintersection to talk about todayyeah and I'm Thomas Koson i'm chief PKIofficer at key factor and I've beenworking with cyber security and publickey infrastructure PI and opensource forthe past 30 yearshi I'm Ricardo uh I lead the platformsinfrastructure uh team at C��have a missioncritical AI system right uh nowadaysyou're just also powering our criticalbusiness operation also relying some LLMapplication either internally orexternally so we have observability gapso our observability just monitor thesesystems but they're just ls behind theircomplexity and impact there's a controlproblem so when system fails part may beoutside your control so third party compcomponents just basically creating ablindspot so those are the four criticalobservability dimension safe i'm justgoing to deep dive on each oneso model performance degragation so youcan see just silent decline models justdecay in unpredictable patterns uh andthey just often fails without anyexplicit errors there might be someconcept drift so real world data thatjust diverges from this trainingdistribution and gap just widens overtime there's a threshold creep soperformance metrics fluctuate withacceptable ranges but just failureshappen sometimes you don't even justdeductthat there's a prompt ranking challengeser for example minor prompt variationjust create different output smallchanges cause big impacts and also opacrelationship like prompt responseconnection remain invisible totraditional tools and version cast soyou're just uh constantly uh deployingnew models and also resource consumptionpatterns are different than thetraditional oneuh you can justuh consume more resource for the samepromptuh with differentvalues and there's also complexdeployment topologies like heterogenousenvironments your old Kubernetes clusternot same there are some third partydependencies there's some inter uhservice chains there's some crossboundary flow so data crossesorganizationallines and why traditional monitoringjust uh fall short so what are we doingwe are just checking simple up and downsignals and this is just you can justmiss the gradual degation if you'rechecking that or point in time metricsso you can justuh capture distribution shift but trainsremain remain invisible and also missingthe semantic context so if you're usingAPM tools there's just lag ofunderstanding modelbehavior and there's also impact on thebusiness side so let's say you have arecommendation engineuh and just gradual relevance declinelist to decrease clickthrough rates andif you're LLM hallucination uhhallucinating there's a just apossibility of uh incorrect responsesand you you should also just I think itwas the most voted uh item so uh youmight just experience aexplosion so what is the path formatforward for the new observability sodistribution aware metrics semanticunderstanding contextual correlation andbusiness alignment so uh today I am hereto just uh propose you or present youhow you can do that technicallyuh so my approach is fulland bit andopen telemetrytogether and combine solutions so whatis open telemetry and what is fullandbit so open telemetry is open sourceobservability framework so it is vendorneutral it just provides you vendorneutral collection it is just firstclass kubernetes integration and strongcommunity integration adoption probablyyou heard a lot of open telemetry duringthis conference and fluent bit it's justend to end observability pipeline it'sjust lightweight uh if you're using onkubernetesuh it has some powerful transformationcapabilities and it has a seamlessplatformintegration so open telemetry providesan open-source standard for logs metricsand tracingSo we just uh going to talk about thisstandard so it has a schema for logsmetrics andtraces and it also provide transportlayer we are just calling itOTLP and it has a very goodinstrumentation SDK you can just autoinstrument some libraries or you canjust uh use the instrumentation SDK andyou can just expand your applicationusingthat so we have a schema we have atransport OTP layer and we haveinstrumentation SDK with the opentelemetry so how fluent with an opentelemetry works togetherso basically fluent bit collecttransform enrich and deliver this datato your selectedtarget you can just consume logs tracesmetrics from open telemetry and send itto open telemetry via OTLP protocolyou can just consume multiple OTLPuh endpoints and send it to multipleOTLPendpoints and let's sayuh our application doesn't just uh giveus a logs in uh open telemetry schema sowe have so basically flant bit has opentelemetry envelope so you can just sendyour log record and fluent bit is justuh converting itto open telemetryformat it is same for matrix let's sayyou have some Prometheus endpoints orsty or other metrics to be justconverted to open telemetry complianceschema and like I said it has some sortof uh enrichment filtering capabilitiesalso one of them uh you're justconsuming uh a lot of traces and ifyou're justuh collecting the traces from LLMapplication it's so chatty so it justgenerate a lot of trace and spans youcan just use head sampling what is headsampling it's a uh probabilisticapproach it just take the headspan andapply your filter top on it if itapplies then send it to destination notthen drop the whole trace and you caneven do tail samplingUh so it wait for all spans and checkfor all spans if they apply theconditions it passes not it justdrops let's say you have uh blocks youcan just apply conditional filters sofor example in this exampleuh there's a conditions you can just useconditional operators like and and oryou can use comparison operators likeAQL or not AQL or greater or less thanuh if your conditionals are good for thelogs so it can pass not thendrops and that was the presentation partso I havea quick demo foryou okay I havea Okay so this is just an uh EKS clusterit has two GPU nodes on it uh so I justdeploy two LLM models on it and I use uhsomething called cubei it's very coolproject i just recommend you to checkalso you can just easily those are my mymodels so it is a llama 8 billioninstruct and llama 8 billion tulipthis gives me uh using VLM it gives meuh OpenAI compliant uh endpoints and I Ihave a Python application running uhwith these twomodels i am just goingto so this is basically Pythonapplication it's just sending incomingchat request to uh my LLM that's it i Ididn't use any uh metric or log or uhtracing in my code it's just using theopen telemetry uh open AI openinstrumentation and just creatinguh metrics traces and logs for all myapplications and it is sending it tofluent bit uh OTLPendpoint i'm going to show you the fullbit partokay this is the full end bit part so Ihave open telemetry input plug-inconfigured it just exposesuh OTLP endpoint for traces logs andmetrics you can just either send gRPCusing gRPC or you can just use uhprotobuff over uhHTTP uh and here we have uh output it isjust sending to my demo environment uhso it it is again uh OTLP endpoint i'mjust saying this is the matrix URI thisis the logs URI and this is the tracesURI and I'm just uh giving my API totoken for my application and that's itand I'm also as you can see I'm justusing the output as set out uh let mealso show youquickly so at the moment I'm just usingsome load testers uh so these are thegenerated matrix traces and logs sincethis is so chatty I'm just going to showyou how it is looks like ontheUI i'm just using Chronosphere platformbut uh it is another observabilityplatform so you can just use graphanadata do reel datress or any open sourcesolution but since I had an environmentjust I had used that so those are thematrix parts as you can see I justconfigureduh I don't know maybe 10 line of uh yl Ididn't do any instrumentation I just useopen telemetry autoinstrumentation and this is the resultit's just flowing on my environmentso this is the P99 client operationlatency as you can see it just show methe two models Ideployed this is the token usage sothese labels are just saying seconds butit is justuh 950 tokens so it is theirtokens those are the matrix parts let'ssay I need the traces it's also flowinghere so you can see your uh traces sentto your observplatform and we have also generated logshere you can just useuh ID to correlate them uh and viceversaokay yep that was the everything Iwanted to show you today and if you havequestions uh I'm happy to answer since Ithink we have a quite good time2025-04-15 21:58:09.787865 ��)R#�� ADVFQ20OrEFkokay yeah so hello everyone so welcometo last session of CubeCon i hope it wasa good experience for you so today justwe are going to talk about how you canjust supercharge your AIM MLobservability uh using open telemetryand fluently so uh for the first talkwe're just going to talk about uhpractical strategies for monitoring aIML workloads on Kubernetes environmentsand at the end I'm going to show you apretty quickdemo so a little bit about me so uh forthe last six years uh I'm just focusingon Kubernetes reliability and cloudinfrastructure so I'm working atChronosphere uh so Chronosphere isobservable platform so I'm justcontributinguh that observes the platform i am alsoopen source contributor uh so Fluent Bbeat CI/CD maintainer and you can justsee me on the fluent bit with selectchannels and my personal mission is likemaking invisible intelligence visibleand action actionable foreveryone so there are a couple ofchallenges with the Kubernetes so forexample ephemeral compute so ports comeand go quickly they they just basicallytake their logs and context with them uhresource organiz orchestration sothere's a dynamic scheduling shiftworkloads across nodes constantly andautoscaling uh so training and inferenceworkloads just scale differently basedon their patterns andmulti-tenency you can just deploymultiple ML models and share sameinfrastructure with differentpriorities i think we already have asolution for them so you can justpurchase your logs to some external logstorage you can just detect for examplelemon nodes if you have issues with thenodes you can just do predictive scalingyou can just forecast your traffic andjust scale based on that earlier or youcan do workload isolation those are theknown issuesbut there are some technical gaps uhwith the aiml application that I amgoing to mention more so for exampleunified telemetry collection serve uhcollecting telemetry across heterogenouscomponents is really hard and there's anissue with the kubernetes a context prouh propagation uh maintaining context asrequests travel through the services youhave multiple services and there arevery different ML framework specificinstrumentation so specializedmonitoring for machine learningcomponents and you have to just connectyour infrastructure metrics with themachine learning outcomes so these gapsare basically create create a blind spotin ML operations team just struggling totroubleshoot modelissues so I'm just going to do a quicksurvey so uh those are the that I'mhearing most so just please raise yourhands uh so do you have any issues withthe modelperformance with inputtracking so resourceoptimization so cross componenttracing I think uh most voted for theresource optimizationpart so we are just going to deep diveinto monitoring uh complex intelligencesystemsso what is in invisible intelligencechallenge sir you �� the score thiswill cause the performance of theKubernetes scheduling is very poorthat's that's a problem needs to solveand one kind of solution is DIA this iscalled dynamic resource allocation it isan API for requesting and sharingresources between ports and consentersinside a pod it is stabled in 1.32version in Kubernetes and it is um andyou need to set a resource claim and theresource class and every device vendorneed to implement their own resource DRAdriver and it communicate with Kublet todo the uh device sharing and uh uhscheduling andallocating but it has many restrictionsthe first is it has it it requires theKubernetes version must be the latest itmeans 132i uh Kubernetes and it has toand it has happens to me that not manydevice vendors implements DIA driver fornow the Nvidia DIA driver is underconstruction and and it's not productionlevel ready so you have to maybe use usethis feature in the uh in the future forit to enter the productionenvironment and also it needs to createresource claim and resource class if youyou if want to if you want to share theGPUs inside the Kubernetes you have toconfigure this hole you need to use theresource claims here which is definedhere and the corresponding resourceclass defined here it all it has it hasall be need to be applied in Kubernetesyes and still this this feature is notenabled automatically you need to enenableexplicitly and another solution is uhwhat we brought here it's called Hamihami is a hogenous AI computing virtualvirtualization middleware it is uh it isused to provide CPU sharing and managemulti multiple uh multi heterogeneous AIcomputing devices from multiple devicevendors it composed of a mutating webhook a scheduling extender and thecorresponding corresponding deviceplugins from for each of the devicevendors and uh we have an additionallyincontainer resource control for each ofthese device yes and it is very it is aplugable non-intrusive standard andlightweight which means you can helminstall and helm uninstall very easilyand it is a CNCF sandbox projectand the key feature of HAMI is thedevice share advanced scheduling andunified monitoring device share is ourkey feature let's show you here if youhave a node composed of four GPUs andyou have two users each of them submit atask which uses two GPUs this without hiyou need to allocate you you have to usefour of them all and the overallutilization is less than 50% and withham it can be observed that these twotask can be shared on two GPUs and leavethe rest to for other task to use so itcan improve the GPU utilization tonearly 100 yes and it is very and it istransparent to task you don't need tomodify the task you don't need to uhmodify the image or the source code oranything just specify the device memoryyou need to use is okay and the this isthe device share inside this is how wecontrol the device limit inside thecontainer we inject a library calledhumor inside this invocation line ithijacks calls from the CUDA runtime toCUDA driver um so we so we can do thecounting here we know how exactly arethe uh device memory allocation insideeach of the container if it pass thelimitation you set in this your insidethis task it will return an OM error yesit applies to wider range of Avidia anduh Kubernetes the only the onlyrequirement is your CUDA version isgreater than 10.2 and the the Avidiadriver version is greater than 440yes and this is how we use Hammy yousimply need to specify the number ofGPUs you wish to see in this containerand and along with the GPU memory youwish to cut for this container uh inthis example you uh you said this thistask needs two GPUs and each CPU use 10GB device memory and the scheduleulerknows sees that and it will cut the 10 Gdevice memory for this task and leavethe uh the other 22 for other for othertask to share and we provide theincontainer resource control you can seethe Nvidia SMI inside the container theupper limit of device memory is limitedto 10G device 10 yes you can't usebeyond that it is according to this yeswe we guarantee that the upper limit iscontained in this container ye�s andother features include that we cansupport the device specify you canspecify the type of GPU you wish to useif if you want only want to use A100then you can set the annotation of taskhere with the use GPU type annotationwhichmeans if it assigned to A100 then youcan only be applied to A100 cards or youcan avoid be apply applied to A100 byset this blacklist no use your CPU typehere and we have another feature calledtask priority and the the method To useit it's very easy you simply uh specifyan environment variable here called CUDAtask priority uh for now we support twotypes of priority zero is the highpriority and one is the low priority thedifference is that as long as the highpriority pod is submitting kernel to thecertain GPU the low priority pod will bewill be temporarily suspended and waitfor this long uh high priority pod tostop submitting new kernels to GPU andthe low priority pod will be resumedrunning it is all transparent to thetask and it is very automaticallyyes and we also support dynamic MIGfeature uh you can use it like the uhexamples uh in this page you simply needto specify the number of GPU you wish touse along along with the uh devicememory you wish to allocate you you setthe uh number and memory here and wewill search according to the templateshere and we will find the most f fittingmake instance for you for you to use andwe use make via make parted todynamically generate the mega instancefor certain card so the users doesn'tneed to really know the make instancename like 1g 10b because it is differentfor different different types of Nvidiacard the user only need to concern howmany GPUs it wishes to use inside thecontainer along with the device memoryit wishes to use and leave the rest forus yes and we can apply this devicememory control to other um devices otherthan Nvidia like Accent yes this is aHuawei uh manufacturer uh AI chips yesif it it is 64 GB in total and in thisexample we limited to 16 G and it youcan see here it has been limited and itit also apply to it and camreen devicesjust like Huawei and Nvidia yes and thisis other scheduling features we uh weham introduces like the new and topologyaware if you want to allocate more thanone GPUs like you want to deploy an AItraining job across multiple nodes andmultiple GPUs they wish to they probablywish to minimize the communicationcommunication cost between multiple GPUsand for that reason we can observe thetopology inside between each GPUs alongwith the uh network topology so by doingthat we can uh allocate the uh nearestGPU for for this AI training job tominimize the communication cost yes ithas been applied to Nvidia accent andthismetex device yes and we have anotherbeam pack and spread uh schedule policyit is it is for each task each taskercan specify their own schedule policyand the beammeans it it wishes to allocate to a GPUwhich has already have the task runningon that GPU to minimize fragmentationcaused by GPU share and the spread is onthe contrary wish it wishes to allocateto a GPU where no port is running onthat CPU to maximize the performance soeach port may have their own request toit may have their own schedule policy tobest meet their fits yes we we providethe beam pack and spread scatter policyfor GPU GPU level and node level yes andwe uh after we introduced the GPUsharing there are many things you needto monitor other than the DCGM exportermatrix used by used to you by the DCGMexporter like how many device memory hasbeen allocated for a certain GPU howmany device memory are still free stillavailable for other task to use and howmany workloads are running on that CPUand how many workloads and theircorresponding PIP name container name and etc they areall contained in this form of matrixporter which can be easily integratedinto the premises and later bedemonstrated by the graphana dashboardyes and the volcano vgpo is supported byhami if you will if you use volcano ifyou go go to their PLA project and theirtheir slice has a has a pages about thevolcano VGP which is contributed by usand the hammy community is responsiblefor the incontainer resource control andleave the rest scheduling process forthe volcano you can easily find there uhthe volcano vpu document on the volcanovpu project on the volcano project andit's the same about the coordinator wealso integrated GPU sharing GPU sharingmechanism into the coordinator you canfind their um document in on theirwebsiteyes okay uh allow me to pass my phone tothe to my colleague to introduce theadopters the road maps and the other thesummary yeshello hellookay uh thank you Min to introduce theHami architecture and the GPU sharing uhand the uh advanced schedule let meintroduce Hami ecosystem and theadopters uh from now on uh HA support inadditional to newia we also support suchas the Isula andcompreen met the other sex AI chips andalso we want to uh support more AI chipsand from the the AI chips where the thekilling OSSE operator system also supported thehammy to building in the AI uh systemand uh in China and around the worldmany vendors also building hammy totheir product such as the docloud andthe silicon cloud and the Ucloud theUcloud is the China biggest the naturalcloudprovider and some virtual users also useHammy to solve the GPU utilization theirsituation is the GPU utilization verysmall such as the the slow the southcarol uh company and use ham to uhcombine train and influence in theirproduction situation and uh such asthe the the travel company and uh somekey users such as the PN security andthe SEBC also some the bankand business company also use hammy toin to maximize the GPU utilization andunified management the heterogeneous airchips so uh from now uh we have nearly100 uh and user from the around theworld okay uh so uh this this year uhham also becomes the same safe sandboxbut we also have a clean road map in the2025 uh firstly we will support more theheterogeneous AI chips such as the quonand we also want to support the MD interor the AWS so any any other any anyhelps we we willwelcome and furthermore we also want tosupport the DR but it's it's morechallenges how to have a compatible wayto in integrated with Dr and the Hammybecause um many of the users use ham totheir production environment but um butfrom on and more many AI chips companyhavedon't implement so also we we will uhcreate a hammy web UI for easy to useand uh maybe in the end of the year wewill propose it to the incubating proproject so yeah and the dynamic MPS forNvidiauh and if if you want to join us we willvery very welcome yeah this is our slackand the GitHub GitHub repo yeah uh weall finished our talk uh any questionsis we're welcomeokay any questionsokaywhat are the challengesum the biggest challenges have is thatthe communication we can't reach itthrough the AMD yes that's the biggestif we have the if we can reach AMD Ithink this can be easily implementedyes we want to keep in touch with theAMD open source strategy or the softwaredevelopment but it's difficult toask yeahcould you could you say a bit more abouthow you do the scheduling maybe so howthings like pod migration or you knowwhat happens if a node goes down that apod is running on these kind of uhquestions okay the scheduling part is weimplementedour we implementment ourown scheduleuler extender here and thehammock is composed of a mutating webhook schedule extender and we do thescheduling here we uh we implement thethe filterand the filter and score process yes weand we do the uh additional u GPUfiltering and GPU node scoring here inthe schedule extender yeah so is it aregular scheduleuler plug-in as as inthe plug-in framework or is it acompletely separate scheduleuler uh itis a scheduleuler extender is not ascheduleuler framework because if you ifwe adopt the architect of thescheduleuler framework you have tocompare every uh every kubernetesschedule version from one from 116 to132 it is difficult for open sourceproject like us so we use the scheduleextender to to for for using that uh itcan be easily inserted in into every uhscheduleuler version from 116 to 132yes okay any any otherquestions have a last day bye-bye have anice day2025-04-15 21:58:10.280007 ��S#��mAVAWw5CujiR8helloeveryone welcome to our session and andthis session is about unlocking how toefficiently flexibility and manage theseven AI chips in Kubernetes it is arather abstract title but uh in one inone sentence it can be sound like how toimprove your GPU utilization inKubernetes yes this session is broughtis brought to you by two softwareengineers my name is Lemon this is mycolleague Junga we are both from a newfounded company called Dynamia PointAI and here are the Let me let me firstintroduce to you the background thefirst background is the burstrequirement for computing power theglobal GPU marketing growth is over60% than last year the mo majority ismajority growth is Nvidia and theheterogeneous is over 20% as you can seein in this figure it has been ratherboost after the emergence of largelanguagemodels and uh you may guess we are fromthe mainland China so so uh in ourcountry the highspec Nvidia can't can'tbe imported very easily so we have touse uh alternative cards like thesedevice vendors they are they are allsome alternative plan for Nvidia cardsand of course they are not as highspecas Nvidia and the user pre userexperiences may not be as good as Nvidiabutthey but it is very cheap so it so wecan use them in the production level aswell and it has a decentperformance yes but we here we meet achallenge is that the GPU can't beshared in a traditional Kubernetes andsuppose you have five GPUs each with thecapacity of 40 40G device memory and itis all running uh 2G little mode smallmodel and it will cause the all thedevice be not not be able to fit moreports other than this one so the otherpart which uses CPU are in a pendingstate which cause the utilization of GPUin the cluster is verylow and another challenge is themanagement of heterogeneous clusters asyou can see there are multiple devicevendors in China and many of them impleimplement their own scheduleulerextender and it hijacked the filterhijack the score process and you haveand you have a and if you have a clustercomposed of multiore AI cards then youhave to install their schedule extendersand you have a more sc schedulingpipeline and if you enter the filter youhave first go through the multipleextenders and go back to the filtercomputer and then into the score processyou you still need to go through all theextenders and return to�� forexample private properties so if we wantto go and inspect them visually we needto ask permission for these people to goin their private property and inspecttheir power lines this is extremely timeconsuming and extremely costly for uhpower utilitycompanies and this is why overtory hasstarted to think in a different approachwhat if instead of instead of going andinspecting those power line visuallywhat if we do it through satelliteimages specifically very high resolutionsatellite imagery and machine learningso that we can get insights from the skypretty much on how our infrastructure isdoing if it needs maintenance if thereis a damaged line forexample we do that with a simple processwell simple it will be simple becausei'm explaining to you in these threedifferent steps but actually the wholeprocess is a lot more complicated theidea is first of all that we getvegetation data from different sourcessome of those are satellite images someof those are aerial images so we haveproviders that we query and they give ushigh resolution satellite images thinkthat the highest resolution that we useis 15cm then we combine this data withinformation that we get from the energyprovider so they give us all thecoordinates of the poles the lines thekind of territory where theirinfrastructure is standing on and so onand basically what we do is that wecombine these two different types ofdata to create a map a risk map so thatthe utility company can go and see wherethere is an elevated risk of wildfirethe idea is that whenever the vegetationgets too close to the power lines thereis an elevated risk of wildfire and inthis case the provider can go and trimthe vegetation before a wildfire spreadsout so we make it easy for thoseutilities to look at the map to look atour product at the results of our scanand basically understand where they needto go and act faster in order to preventthe nextwildfire but how are we doing that andhow are we doing that using kubernetesand cloudnative solutions before we moveinto what we currently have right now iwant to take a step back and think abouthow we got started we are a startup wheni joined over story we were around 20people right now we're over 100 when wegot started we need to break things andmoves fast and move fast this is youknow what a lot of people say aboutstartup we need to figure out things youknow on right on the ground so when westarted we actually had a stack that wasextremely simple we just used kubernetesbecause it was providing us the rightbalance between flexibility andstability you can think that ourworkload can take anything from one cpuand 4 gigs of memory to 72 cpus andalpha terabyte of memory and stabilityso we had that withkubernetes and most of our processingwas happening with jupyter hub i'm sureyou're familiar already with jupyternotebooks jupyter hub is just a toolthat allows you to create differentjupyter notebooks using different typesof resources and in the beginning thatwas all we had we started with this andeverything was a jupyter notebook so wewould create all our processingpipelines we would create all our dataanalysis in jupyter notebook we builtsome custom integration to it so thatpeople could go and select the containerimage that they wanted to spin selectthe machine where they wanted to run itand basically they would have thepossibility of running their jupyternotebook on a machine which no muchhigher power compared to their laptopeventually a bigger gpu or a largeamount of memory orcpu so data scientists could just go onthis platform select the machine thatthey wanted select the image that theywanted and run this experiment in ajupyter notebook as you can imaginejupyter notebooks are great forexperimenting but this became anightmare pretty early reputability andtracking in particular was extremelydifficult and as we have to repeat thesekinds over time every time we had thiskind of messages all over zlack hey doyou remember which version of pandas weused a year ago the new one breakseverything and the whole process ingeneral was just very manual it wasreally we al�ways needed to assign a datascientist with every delivery that wewere doing for a client and is the datascientist would have to go through thejupyter notebook step by step fix theproblem that they would encounter on theway and pretty much deliver the resultsto the final client it was a veryonetoone binding one request from theclient one data scientist assigned toit so after a while it was just time toautomate and move forwardwe set to ourself two main objectivesthe first one getting rid of jupyternotebook we tried different combinationof things we tried using a tool callednbde that allows you to take yourjupyter notebook and convert it into apython package we were not verysuccessful with that i have to say ingeneral what we experienced that jupyternotebook are difficult to maintaindifficult to test and tracking with gitwas very complicated if you don't knowabout it a jupyter notebook is prettymuch a giant json file so you canimagine what that means whenever youneed to track that in git and the secondand probably the most importantobjective was that we wanted to automatethose workflow runs we wanted to breakthis relationship between the number ofdata scientists we have and the numberof projects that we were running so weneeded for a data and workfloworchestrator in here we stumbled upon aworkflow called daxter this is a projecti'm a fan of i've been contributed tothe open source project for quite sometime we started using it since version 0something and now there are like1.10 i personally like daxter becausefirst of all it's an open source projectis backed by a company in the us withtheir own cloud offering but we've beenusing their open source version verysuccessfully it gave us immediately acouple of things that we like first offirst of all fast development if youhave experience with data and workfloworchestrator you know that justunderstanding what is the right way ofsetting up your python project can be apain daxter makes it very easy daxterscaffold project name and you get theproject structure out of the box withall the best use cases from theirmaintainers it was very easy to testlocally it was very easy to get startedreally really fast it all works withpython modules which our data scientistsand data engineers like and this issomething that where i'm a little bitbiased as a cloud engineer it's verycloud native you can really see thatpeople built this tool with cloud nativein mind so what we do at our store isthat we run daxter in our kubernetescluster and the setup if you look atthis slide it's a classical microsservice architecture setup so our userthat you can see on the right side ofthe slide accesses the web interface ofdagster which is called dagit you knowas a normal web application basicallywhat they get is an overview of thesystem which pipelines are failing whichpipelines are successful which pipelinesneed attentionthen there is another component calledthe dster demon which takes care ofrunning the pipelines based on a certainschedule or on certain condition andmonitor pipelines while they're runningso that if there is a step that needs tobe retrieded it gets retrieded and ifthe pipeline fails it notifies the finaluser and all of these all of these boxesthat you see they are all deploymentsall this they are all statelessapplication and then you can see in thelower part of the slide a set of boxesthat i called the deployment ment daxtercalls them code locations basically eachone of those is a grpc server it's agrpc server that exposes basicallyadvertise to the dster demon and the webinterface which pipelines are present inthat code location which schedulessensor assets and so on so each one ofthose deployments is independent meaningthat each team that we have inside ofour story is responsible for one of theor more of these code locations and theycan do whatever they want with them theycan create a new pipeline remove it editit modify it and so on so on and the bigadvantage here is that whenever a teambreaks one of these deployment all theother deployments are not affected wethere is very high isolation b�etweenthem and that's extremely important asthe company started to growup when i said before that daxter isreally something that has been builtwith cloud native in mind i wanted tobring you a couple of example as wellone of the feature that i like the mostabout daxter and that's something wherei had experience with other tools andhad a lot of trouble with others toolsis the idea behind io managers in othertools whenever you need to build apipeline that needs to run locally andin the cloud i always found it a littlebit complicated very often you need toput some logic in the code that you'vewrite that says if i'm running in thecloud i want my data to be saved in s3if i'm running it locally i want my datato be saved locally you don't need to dothat with daxter daxter framework doesthis for you so you can focus on thelogic of your data of your applicationof your workflow in your python code anddaxter abstracts all the rest in thisway the project stay the same the codestays the same but simply based on theenvironment if daxter is running inkubernetes it will save its output in s3on google cloud storage if it's runninglocally it will run on your daxter demonand save the file locally which is areally neat feature because it allows usto really you know have a clear cutbetween the data part of things and theimplementation partanother task that daxter does reallywell is that in the age of cloud datahas many different forms right when wetalk about data processing in the pastusually you had a database a couple offiles here and there the data that youwere processing was more or less uniformbut now we live in the cloud right datacan be a big query table can be a possqldatabase or your snowflake data store ora file on gcs data can have a lot ofdifferent formats and tools very oftenthey focus on how data is built no noton how data is connected with each otherit is more and more difficult with withother tools to understand what is therelationship within between my postsqldatabase table and my big query tableand that's because the tool focuses onhow data is built not on how it'sconnected a really neat concept thatdaxter introduced on top of the usualjobs and pipelines and step is the ideabehind assetsan asset in daxter can be anything canbe any data any type of data againbigquery table it can be an asset adatabase can be an asset a file can beanasset and what daxter is very good at ismaking explicit where the connectionbetween those asset is which assetsdepends on each other and so on in herei just put a couple of example i pulledthem from their documentation you cansee very clearly that there are two airbite ingestion pipelines that then getsprocessed with you know three differentdbt projects and finally i'm running myforecast using python simple easy tovisualize i know where the data is goingi know where how the data depends oneach other example here on the rightside ingestion is done using five trendi'm then using dbt to modify the dataand finally i'm predicting the new orderusing my tensorflow model it's easy tounderstand how the data flows throughthe system because we're focusing on howdata is related not on how data isbuilt as i mentioned in daxter theydon't want you to go all in on thisapproach it's a kind of optin situationyou can even mix them but you still haveall the possibilities of runningpipelines you know in the traditionalway with jobs pipelines steps and so onover time as we started to get more andmore experience with daxter we alsostarted to build our own libraries inparticular we have a library internallibrary called maple you know we're acompany that cares about trees so ourcluster libraries and projects veryoften have name of trees in this case inparticular i'm showing you a maple opyou can think of op as a pipeline stepsthink of an op as a step pretty muchwhat i'm putting doing here is adding adecorator and saying okay this functionis not an actual fi p python functionthis one is a step of my pipeline and ican customize this step however i wantin this case i'm specifying that i wantthis function this particular s�tep torun on a node that has a gpu and thisare high memory node a node that has atleast 32 cpus available and around 390gigs of memory and the last two linesare also very important this is afeature of kubernetes that is relativelyrecent but we use extensively this iscalled ephemeral volumes basically thereare volumes that can be createddynamically and have the same lifetimeas the pod so as the pod gets created avolume gets attached to it it staysconnected to the pod for the wholelifetime of the pod and when the podgets terminated the volume also getsdeleted and this is very useful forscratch space so what i'm specifying inthis maple op is that i also want ascratch space i want 150 gb of volumethat is temporary to the lifetime of thepod and this is is very used for exampleto store temporaryfiles right now this is the situationthat we have with maple we have plans toimprove it because i feel like this isstill too kubernetes specific like ourdevelopers need to know too much aboutkubernetes to know about this so in thenew version of maple the idea is to hidethat evenmore as daxter step can be mapped intodifferent pods daxter gives us thepossibility of playing with it indifferent ways in particular in two waysthat i find quiteinteresting the first one is that we canhave a pipeline where we have a singlepod and a single node so i can have apipeline with multiple steps but all ofthem will be run in one node with thesame amount of allocated resources so atthe beginning of my pipeline i can saywell this whole pipeline is going totake four cpus and gigs of memory anddaxter will run that on a single pod soall the step will will be run as part ofa single pod or i can go the other wayaround and say look instead of having asingle pod let's do it in multiple podswhere every step has a sing differentamount of cpus and resources and so forexample i can have a multiple podsmultiple nodes pipeline where every stephas a different type of requirements inthe example below the first one takesfour cpu and 8 gigs of memory the secondone requires a lot more cpu a lot moregpu and a lot more memory and finallythe third one uses ephemeral storagethis allows us to not reserve a gpu forthe whole duration of a pipeline butjust for the steps that uses thatparticular feature if you have apipeline that lasts eight hours youdon't want to reserve a gpu for eighthours and use it just for one you wantto reserve a you want to reserve the gpujust for the hour where you're using itright as gpus are really expensivewhat is happening now so at this pointwe already became pretty you know dsteryou know good users we were open we werecontributing back to the open sourceproject we were working with it veryextensively we started to build more andmore pipelines and more and more youknow assets into it but now we wanted totake the next stepone thing we noticed with daxter onething we struggle a little bit with umis observability in particular thedaxter web interface is veryself-explanatory by opening it i can getlike a gist immediately of how mypipelines are doing and how my system isgoing but the problem is that it's verydifficult to get aggregate statistic ifi want to know something like what isthe failure rate of this pipeline overthe past 30 days that's something verydifficult toget also some of the dependencies thatwe have while we build this process arereally difficult to model to model usingdaxter native uh the using the daxtermodel so what we decided to do is tobuild a platform on top of it meaningthat daxter will continue to stay ourworkflow engine but the logic on how wetrigger those pipelines the parametersand the configuration will happenexternally so we built an interface todaxter that talks with a daxter api andthat triggers jobs in a certain order orwith a certain settings depending on thedeliver that we need to do depending onthe data that we need toanalyze and this is how it looks like soin the lower part of this image you cansee the usual gke cluster and we havedaxter installed in there and that's alland good on the top right side we seewhat we call� the platform the deliveryplatform these are differentmicroservices starter router trigger andwatcher that are all running in googlecloudron we're also big fans of googlecloud and all their communication happenusing google popsub so a classic likemessaging queuing system basically whathappens is that whenever a new eventgets triggered in the starter the routerthe trigger and the watcher basicallykeep an eye on their execution in daxterand make sure to trigger jobs in acertain order and make sure to triggerthem with the right setting so thetrigger and the washer are not doingnothing more than interacting with thedx api and making sure the flows uhhappens correctly and then what we getas an output is the possibility ofseeing a delivery end to end this wholeprocess that you see at the beginningfrom image acquisition to like uhinfrastructure you know getting the postonline from the customer up to the finaldelivery it's something that happens allin ourplatform since the communication happensvia pubsub between all these differentmicroservices we are able to export allthose messages to be query we have opentelemetry that exports data as well togoogle cloud metrics and then wevisualize it using graphana graphana isa great tool it has multiple differentdata sources so we can pull data frombigquery we can pull data from googlecloud metrics and display them in oneunique dashboard to give full visibilityto our team and this is what it lookslike our orchestration platformdashboard allows us to see somethinglike this where for every deliver thatwe have we can see step by step how it'sgoing if there are some steps that needattention if there are some steps thatare failing more than usual and if somesteps need manual intervention thatunfortunately still that still happenssometime so how do we connect all thepieces how did what are we doing rightnow and what is the current state ofthings well this is a screenshot that itook just uh a couple of weeks ago forthe first time we managed to run ourfirst delivery end to end zero touch in30 minutes this was not a new deliverythis was a delivery that we are alreadydone in the past that we wererecomputing and comparing the outputsjust to make sure the outputs were thesame and this was a great result thiswas the culmination of almost a year ofwork pretty much building our platformfrom the ground up this a team that isseparate from mine so i'm part of the steam this one is the platform team butwe have worked very closely over thepast couple of over the past months tomake sure that this project wassuccessful what's next well the firsttarget is always to deliver results toour customer as hands off as possible wewant to avoid as much as possible manualintervention we want to avoid doingthese results manually because wheneverthat happens the risk of mistakes getsmuch much higher we also plan onbuilding on newer daxter feature to makethe pipeline more reliable to changes weare using daxter extensively but theyimprove very very fast it's a startupthey also need to move fast right wewould like to break the one step onenode or one job per node assumption itwould be very nice to have multiple jobsor steps per node and we are lookinginto both gke node pooloolautoprovisioning but also kubernetesdynamic resource allocation which issomething that has been yeah has beentalked about several times thisweek and finally this one is one of thetrickiest one i have to say avoiding asmuch as possible underprovisioning oroverprovisioning resources the nature ofour work and the data we use is thatdepending on the input image that we getthe same process can take you know verydifferent amount of cpus and memoriesand this is very difficult to predict inadvance we would like first of all tomake the maple api more straightforwardto use so hide even more details fromthe users even more you know abstractkubernetes even more and definitely workon improvements on monitoring andobservability with all the three of thiswe hope we will manage in the future tomake our delivery more reliable ourclient happier and hopefully prevent thenext big wildfire let's put it this waythank you very much for the attentionthis was my talk i'm jardini work forover story in case you're interested inwhat i talk today in case if you'recurious to hear more about it ourwebsite is overstory.com and yeah thereis also a career page we're hiringactively people all over europe and theus thank you very much[Applause]folks any questionsnew daxter feature okay yeah rob thankyourob thank you rob asked me about newdaxter feature that we are looking toimplement daxter has a new feature thathas been well now it's been releasedlike six months ago or something likethis called asset checks basically italked about asset briefly but the ideawith asset checks is that every timeyour asset is materialized so it'srefreshed it will run some checks onyour asset to make sure that you're notbreaking the api downstream so if you'rebuilding yeah a file with some let'slet's call it a csv file it will makesure for example that all the rows arein there so that you're not breaking allthe all the all the pipelines that aredownstream this is a feature that wedon't actively use we have coded somelogic inside our pipelines which isbecause we didn't have the feature yetbut it's something that i feel cangreatly improve the reliability of ourpipelines because you know like sometimeyou remove um you remove a column to afile just because you think nobody'susing it but you never know how manycustomers you have how many you know howmany users you have that are pullingdata from you so that's one of thefeature i'm i'm looking forwardto any other questionyeah yeahthat's that's a very good call so thereare basically the question was we havedaxter running into our impro into ourinfrastructure and daxter alreadyprovides certain feature like triggersand schedulers and so on that are ableto do a lot of this automation for youand the question was why did you decideto reimplement all this outside of daxerusing uh the platform as you correctlysaid the part that you were seeingrunning in cloudr run this is all customdeveloped there's a little bit more ofa use case for overstory let's say thana wide use case the problem that we haveis that our pipeline take a really longtime because from image acquisition tofinal delivery you know they can be dayspretty much and also the decision thatwe make are are difficult to modelnatively in daxter and so at a certainpoint we gave it a try to try toimplement that in daxter but it was justtoo complicated and too custom let's putit this way to use just you know daxternative concept and this is why wedecided to build this system that as youcorrectly say runs outside runs ingoogle cloud platform and thosecomponents are all customly built prettymuch i hope it answers yourquestions anyoneelse yeahah yeah good question so what i've shownyou before is that you can allocatedifferent resources depending on thetask or you can define the resource likepipeline wise basically how and thequestion is how can i do one or theother you can do that basically alwayswith always with u decorators so justuse python decorators if you put thedecorator on the pipeline the resourcesare being allocated for the wholepipeline if you put the decorator stepby step if you select these resourcesstep by step then each step will haveits own resources that's this idea yeahwith uh with code location you candefine some defaults so that you makesure that you know you always have atleast a minimum amount of resources thatare allocated to every step but yeah youcan customize it always using decoratorswhich i find it really nice because youknow you have everything in code youdon't have to yeah you're not doinganything manual any point andclick yeahyeah yeah that's true it's a new featureof daxter i haven't talked about it'scalled daxter pipes i don't have a lotof experience with it i have to say butit's also one of the new features thatwe are we want to look into because itlooks reallypromising okay thank you very much forthe talk will be around in case you haveany other question thank you very much2025-04-15 21:58:10.831219 KK��T#��aA1rtyQaTfbdMgood afternoon thank you very much forcoming to my session today i know it'sone of the last session of cubecorn it'sfriday we're all tired and everythingbut you know if you're here you probablywant to know something or at leastyou're curious about how we do wildfireprevention with kubernetes and ai myname is andre jardini i'm a sensambassador i work as an sre foroverstory and in my free time i also uhorganize community events and yeahcommunity conferences around europethe title of my talk is kubernetes andai to protect our forest a cloudnativeinfrastructure for wildfireprevention like me a couple of years agobefore i started working for overtory ididn't actually know much aboutwildfires wildfires are events that ineurope are a little bit more rarecompared to the us just for because ofthe density because of the geography ofthe us territory but there are eventsthat are extremely destructive forpeople lives community buildingbusinesses and so on these events areoften very sudden they happenunpredictably and they bring a lot ofdestruction to our to ourpeople just to give you an idea just acouple of months ago i'm sure you'reaware i'm sure you're following the newsa big wildfire spread out in southerncalifornia this was the second biggestwildfire that happened in californiaover 29 people died over 200,000 peoplegot evacuated 18,000 homes destroyed andover 57,000 acres of land burned downjust to give you an idea of what 57,000acres of land means this is the areathat would have burned in london thatrepresents 57,000 acres and this was noteven the largest wildfire that happenedin california the 2018campfire was uh spread over an area thatwas almost three times as large153,000 acresanother thing that i didn't know aboutwildfire is that in californiaspecifically um over 60% of the mostdestructive wildfires have been causedby defective power lines i know itsounds curious i know that we expect youknow fire to be you know wildfires to beum caused by you know humans most of thetime just you know a cigarette that isnot turned down correctly or things likethis but power lines are actually a bigresponsible of this kind of wildfireespecially incalifornia and wildfire prevention isnot an easy task for a multitude ofreason the first one in respect to powerlines is that the power infrastructureis very vast it takes a lot of spaceit's very complex it has a big number ofdifferent substation and theinfrastructure is just really really bigto handle and to maintain and if youthink about it power lines are bringingus electricity 365 days a year so it'sthe kind of infrastructure that needs tobe you know as close as possible as atwo a 100%uptime the second cause why wildfireprevention is important but is alsodifficult at the same time is thatfailures are always sudden andcatastrophic you might have heard thatjust a couple of weeks ago a powersubstation near the ephro airport failedand that caused the whole ephrop airportto be shut down for over 24 hours oneover 1,000 flights got disrupted over300,000 passengers got affected and thiswas just a small power substation yousee how the impact of this can be suddenand catastrophic and can cause a lot ofdisruptionfinally if you think about it much ofthese infrastructure much of these powerlines and network are always subject toatmospheric agent so they can getdamaged by all kind of things let it bevegetation or a storm thunders and so oninspecting them is also not easy veryoften those power lines they cross��ormation available uh withthis in mind uh we have the first inline ESA which is European Space Agencyand it's the Europe's gateways to spaceuh ESZA works together with theirnational agencies institutions and theirmember states and they coordinateseveral missions and programs likeGalileo or Copernicus and we also havein the secondlineat which stands for the Europeanorganization for the exploitation ofmethological satellites and it operatesoperates satellites uh for monitoringweather climate and the environment nasaalso works together with theinternational agencies member statesinstitutions and the other um uh spacerelated institutions as well and we havein our online and archive data uh datadata lake we have more than 10 pabytesof data spanning more than 300collections uh I must say I worked atUMass as a cloud computing engineerbetween 2019 and 2022 and now as onlinedata access services expert since July2024 ecmwf stands for the EuropeanCenter for Medium-Range WeatherForecasts and it provides 247operational service producing anddisseminating numerical uh weatherpredictions predictions to the to itsmember states and cooperating statesecmwf also host a high high performancecomputing facility as well as a commoncloud infrastructure related with theirHPC facilities and they also have ametological archival and retrievalsystem MARS uh which is providing accessto the metlogical data that has beencollected or generated at ECMWF over theyears i also want to highlight that Iworked at ECMWF as a cloud computingengineer between 2022 and 2024so uh let me also give you a a quickintroduction to climate data records andit refers to any long-term record of acalibrated data that is useful forclimate scientists or climateapplications and the data from currentor past satellites or archive are usedto generate the CRDs and these archivesuh basically is a development uhthroughout the satellite program or oursatellite application facilities or SAFSfor short and also can include uhdifferent collections like Copernicus orother European Union funded projectsthis is very nice you know but theproblem is it is spanning over severalpabytes of data so it's really hard tomove the data from one place to anotheror download the data to your localcomputer to process um we have multipledata sources and the access methods somost of the time every um new um programhas its own ways of archiving andaccessing the data uh the source datacan come in various different formatsand some of them are binary formats sothere is some kind of a pre-processingrequired to start with working withclimate data and to do that you needmultiple libraries to handle all thesekind of anoperations and we had an idea to make ita little bit simpler for the end user sowe thought if they cannot afford movingthe data let them compute near the dataand this is how European medical cloudwas born and the idea is to move theusers to close proximity to the computeresources that we have and let them havesome computing resources allocated theirtenency this is the infrastructure as aservice as an idea but this is not aprivate cloud because we are servingcloud resources to our member states anduh other institutions with the R&D fundfor example but this is also not apublic cloud so any any random citizencannot go to the web page and sign inand get some resources therefore we arecalling it a community cloud a communitycloud for earth observation andscientists and the key idea might be tohave an ECMWF VM uh that dig digestingdown sampling and transferring only thenecessary data tovm which has some kindof a satellite imagery going on and thatis the idea but the biggest benefit ofhaving European weather cloud was thatit provide us a well established way asa metological society to work togetheruh coming among uswf and our memberstates so we have a common platform thatwe can work together this time and youmay ask why CRDs are important and howwe can make it more accessible so CRDsare they are providing CDRs areproviding uh long-term calibratedinformation on how where and to whatextent the earth's atmosphe�re landcryossphere like ice sheets and oceansare changing over long time periods sothis might be for example um importantto analyze the extent of the sea ice ofthe past decades uh and it can giveimportant clues to the climatologist orthis can be uh an early warning for afog for aviation so this and highquality systematic featureidentification is crucial for thatreason to further develop theapplication and ML algorithms to detectthese kind of anomaliesand as you me said we have mainly twogoals uh to build this earth systemfeature detection one is to spot supportsupport our member states in featuredetection uh from earth observation datato provide early warnings this is the umfog example that I gave you earlier andthe other uh goal is to collect a globaldatabase for a longtime series ofidentified features this is the icesheet example that I gave like for a 40years of data you need to gather thiskind of a features uh over a long timeperiod in order to identify thechanges and the goal and let's say theend goal is to establish a communityplatform to labor these featuresmaintain a database of these featuresincluding the possibilities to explorevisualize modify and export uh all thesefeatures that wegathered to do that we evaluated modelsto identify and categorize the tropicalstorms using pre-trained ML modelsthrough transfer learning i will not gointo the details but if you areinterested with them you can scan the QRcode and and access to our publicationum but TLDDR we were scanning over 2,000images from Japanese material agency uhuh from the satellites from GMA uhcontaining roughly 5,000 tropical stormsyou might ask why u Japanese satellitesbecause Europe doesn't have as manytropical storms as uh as Japanese uh sothat was like a no-brainer for us to usethat instead and we were like usinginternational best track archive forclimate stewardship uh IB tracks forshort database as a ground truth becausethey already identify these things afterit happens and then it they are puttingit into the database so it serves as aas a ground truth uh we were manuallychecking the bounding boxes but we werenotuh fixing the the models based on thatand we were using the war current inintensity DCI for short to use tocategorize the storms the differenttypes of storms and we found out thefaster RCNN provide the best performanceout of the otherthree and in here you can see actuallyin in in the green boxes you see thetropical storms correctly identified bythe model and also present in the IBRXuh database and in the yellow one uhthere are tropical storms detected bythe model but not found in IB tracks butthe problem is the amount of tropicalstorms are exceptionally many so uh asan IT as an IT specialist these are alllooking tropical storms to me at leastfrom my eyes but um the problem with themodel is that we need some kind of adomain expert to check all thesedifferent labels because as you can seein the yellow ones they are notidentified as real storms so this was afalse positive so manual labeling by thedomain experts are clearly needed andyou cannot have auto labels auto labeltropical storms uh it's not thatsimple uh hold on then does this meanthat the labeling has to done manuallylike we have as I said we had likethousands of images that needs to beidentified and we need climatologistsclimate scientists sit on a PC uhmanually labeling all of that in acompletely separate work i I am togetherwith my friends in that project acompletely separate project we wereexploring segment anything model forsatellite imagery again I will not gointo the details you can check thepublication but we kind of figured outthat SAM is actually doing very well onsegmenting the satellite imageryalthough that is not trained on on thatkind of a data set so we were usingEuroset data set in this case but wewere not training SAM on the Euro dataset we were just using a pre-trainedmodel but it still managed to uh segmentthe images quite accuratelyand here you can see a small applicationthat we u developed together mycolleagues at ECMWF at that time andhere you see some kind of �a flot in inthe palm tree area do you see that andthere's a bounding box and you can makethat prediction a little bit more uhpinpoint with the um bound uh with theum key points and when you select thearea of area of interest you can clickthe segment and it will basicallysegment the rest of segment the imagefrom the rest of the image and then youcan export this information to any toolthat you would like to visualize thiscan be a QGIS it's some uh web map uhapplication that I'm scientist using uhso you can export that annotation andyou can use it on your uh on yourresearch so this is something that wewere doing at that time completely uhoutside of of this scope but at it gaveme an idea so maybe we can create anenvironment uh combining all thesedifferent tools and we will startthinking about what exactly is needed tocreate such a collaborative environmentfor the earth earth reservationscientists and this is what we came upwith uh we have ML models we haveKubernetes we have climate scientists wecan work together to build an earthsystem feature detection platform thatcan serve uh climate scientists uh a acollaboration platform on on carrying oncarrying on their research so you seethere are different tools and I will betouching them in my uh future slides butthis is the blueprint for this jointlabeling and model development platformso to say so on the um the starting sideyou have the input label data forexample IB tracks and you have thesource data this can be uh data storefor example but the problem is thesource data comes in various differentbinary formats so you need to do somekind of a data uh preparation forexample you can use libraries likesatpai to generate images from thatearth observation data and then you canput them into an S3 bucket or somethingsomething like that to store and thenyou can parse this IB track labeling asa ground truth and in here you need somekind of an annotation platform like likewe developed it and I showed you in theplay with strike like something likethis and here we can use a communitytool called label studio because it hasan API and everything and it can make iteasier for it observation scientists toimport all these things to to the uhplatform that they are developingsomething and from there we can actuallyutilize SAM to help them labeling the umthe features that they detected andafter that we can export these featuresto a database in this case we are usinga postSQL extension called post uh andstore it in a longer term so that we canvisualize it afterwardsAnd this can be any web mapservice so you might ask what's underthe hood then we have an infrastructureservice infrastructure as a servicecoming from European weather cloud it'sbased on OpenStack and this is for PCreasons so this is a pathfinder as wecall it insat to identify um uh thebenefits and how we can do this in aproduction afterwards so you can see thespecs are not that uh big uh we have inin CPU and RAM and block storage we canalso have a little bit more room forthat but we are especially limited onthe um GPUs we have two physical A6000GPUs presented as VGPU flavors and weare using Kubernetes AR as the flavor uhRancher to to manage and provision theseKubernetes clusters we are usingOpenStack cloud control manager CSI CSCSI plugins uh for read write onlyaccess to the to the storage uh we alsohave long for read write many and we areusing a Nvidia GPU operator to presentthese GPU resources to the to the potswe are also having some instance groupsto uh autoscale uh the worker nodes butas I said the vGPUP instance group isquite much tight in this case because wedon't have many resourcesyes you heard it right we have A6000sand this card is 5 years old but ourinfrastructure is 5 years old as well umand this is a pathfinder so we are stilltrying to figure out what we need in thein the in the next five years so that wecan um build our platform accordinglythese these cards are presented as as Isaid virtual GPUs and we have likecouple of flavors there presenting howmuch memory that we have on the vGPUside and I must say that these resource�sare really scarce so we need to be extracareful on how we spend them um as Ialso pointed out in the previous slidewe are using Nvidia GPU operator topresent all this because it simplifiesthe installation and maintenance and theoperations and it also uh bundles up alot of nice tools from Nvidia and youdon't need to of course you need toconfigure them uh in the in in theinstallation phase but at least you haveone source of ground truth that you canconfigure everything in one place and weare implementing time slicing for betterresource sharing and utilizationuh you might ask time slicing why not XX can be in here MPS or or MC i have ashort short answer for you because wehave to uh our card doesn't supportanything else so we have to use um timeslicing uh and this is our only optionto have vGPU sharing to share someresources but don't worry I also have along answerso um we are using GPUs mostly with SAMto segment images and users are runningrepet repetitive mostly unrelated dataprocessing jobs this is not a parallelprocessing in in sense of MPI so usingsomething like MPS might be an overkillmost of the time we have some kind of aprocessor that is expecting an inputdata and it's generating some kind of anoutput data and if we lose the data insomewhere we can always reprocess it andget the same data out of it anyway sothis is a nice tradeoff in in in interms of memor memory isolation as wellbecause we we can gen regenerate thatoutput data anyway and this is but thisdoesn't mean that we are not settingsome ground rules we do have someresource quotas and we are watching theGPU utilizations anyway but I think thisis a nice trade-off with the limited GPUresources available and we are also inthe future at least we are alsoconsidering implementing retraining ofthe SAM 2 uh using annotated data andthat might change the things a littlebit but for now I think this is a nicetrade-off so um I also would like tohighlight couple of challenges that weidentified during during our platformjourney one was to hiding infrastructurecomplexity from uh EOS scientists sothese people are domain experts and theyare scientists and they are really smartbut not everyone has to know how topresent a VGPU to a pod or this or thatso this needs to be hidden uh from fromthe earth observation scientists in myopinion at least but a challenge risesagain how to integrate all this multipletools how to deploy and manage this toolto tools at scale and we also weremissing a useful documentation in thatwe can actually contribute somethingback to the opensource and to do that we came up withsome design principles first of all useopen source and open license license ofecosystem second we tried to automateeverything with GitHubs andinfrastructure as code we also adopted aKubernetes first approach where we wereusing off official operators as a firstchoice and if it is not mature enough ordoesn't exist we were using wellsupported Helm charts and if nothing isavailable and we have to build ourcustom solution that is the only timethat we are building our customsolution and this is what we end up likethe these are the tools that that I wasmentioning in my previous slides as wellbut let me just also give you a higheroverview Um we try to build a f flexibleand open and machine learning friendlyplatform for earth observationscientists andum to to make it a little bit easier forthem in the beginning we prepare thedata for the end users at least some ofthe datas that are in high demand butthey can still bring their own data ifthey if they like uh the scientists arealready familiar with Jupyter hub thenthey were using it quite extensively sowe provided a hosted hub where they caninteract with the other tools that weare providing in the platform and alsohave some kind of a GPU processing uhcapabilities to to help them uh gettingstarted and of course they they don'tthey don't have to use this hostedJupyter hub they can bring their owntools uh their own uh editor instead ofa Jupyter hub or they can bring theirown Gale server to visualize the thefeatures that we identified in thefea�tures database but the gist of thisslide is that we are maintaining some ofapplications that they are already usingum separately so that we can have acollaborative environment environmentforthem and I would like to have a fewwords on GitHubs here uh in the previousslide I said many platform uh oneplatform many applications one git butit is like also uh one Argo CD to rulethem all as well so we have all thesedifferent applications we have jupyterhub we have label studio we have nucleiowe have functions running on nucleio iwill come to that we have a cloud nativepg and postgus extension and all theseapplications needs to be managed and notinstalled first of all and then managedand then maybe day2 operations to backthe database up etc so if there is anoperator in place we can automate incoordination with githops best practiceswe can automate all these different uhtools in at least one single groundtruth and it gives us consistency on howwe present this platform to the enduser and this is what it how it lookslike in the end i know Jupyter hub canintroduce a little bit of complexitysometimes but providing a hosted hubgives us an opportunity to have some umplugins already in place for them suchas the Jupyter G or Jupyter Lab Git sothese are the things that they don'tneed to worry about anymore and we alsohave uh installed some of the um uhtools and libraries that they would liketo uh they would like to have in the endso they don't need to run manualcommands on on a terminal or somethinglike that and for the labeling part weare using label studio i must say we udid did a huge trade-off betweendifferent tools like DS annotate therewere many there are many tools uh whenit comes to the image uh labeling but wefound label studio is the mostextendable with the API and UI so it wasa nice gist uh nice nice tool that wecan move forward with it and it's quitecustomizable we are also using Nucleioas a as a service framework uh there aremany but this serves the best for theearth observation use case uh with thelower maintenance overhead so Nucleiotreats the code uh as a whole and it canbuild a a container image on the runtimeand then you just need to run this as asa function this was actually calledprocessor in the earlier versions and itfits our purpose really well becausemost of the time what we have in earthobservation science is also processorsthat are deployeddeployed one another so it is not aparallel as I described but concurrentprocesses and nucleio handles this usecase really well and most of the climatescientists are very familiar with pythonand nucleio is very python friendly theonly it can integrate with cubeflowpipelines or any other pipeline uhorchestrator as well but the onlydownside of it let's say is it lacksthis kind of a built-in workloadpipelines like cubeflow pipelines orargod workflows so if you want to do endto end automation uh the um earthobservation scientist or the user needsto bring that logic into Jupyter hub orinto their code um also the functionfunction versioning is not the same asmanaging a model registry but managing amodel registry is also an additionaloverhead and right now we only havecouple of functions so we can bundleeverything up in an image and use thatinstead and debugging the deployfunctions can be a tricky um and theimages tend to be a little bit large soum for example the the thing that I amdoing right now is to build them outsideof nucleio and put it in some kind of acentral registry so that I can justdownload it instead instead of buildingit on the runtime because these pythonlibraries can be big and you are alsoputting the model and the checkpointsinside of it and the other uh difficultywas to integration with the label studiomachine learning back end uh because itwas expecting a docker compost setup uhso we did some something manual there wehad to and documentation uh can be alittle bit tricky as well and this isthe ugly um this is the machine learningback end that that I developedpersonally you can scan the QR code andand and judge my code um as I said thelabel studio was expecting uh a dockercompost setup where where it has an webapplication server uh supervisor D andRQ to schedule the training jobs etc soI thought okay I can replace all thatwith the tools that is already availableon nucleio and then I can use themachine learn uh the um label studiosmachine learning SDK to build the modeland present it in a way that that labelstudio expects fortunately label studiois not that selective on how how you canpresent this kind of an information itjust expect you to have some kind of agson so that is very fortunate in myopinion because then you can actuallyreplace the label studio with somethingelse because at the end of the day theprediction and how this has beenpresented to the end user is just a JSONfile the rest can stay the sameobviously and this is how it looks likethe label studio at least um I wasplanning to do this on live but uh yeahvideo is safer so this is how it lookslike uh th how many images 1,800something like that uh this is alreadyuh uploaded to from an S3 uh bucket yousee here I have the Dorok uh labels thatare coming from IBRA database and I alsohave additional labels down below calledSAM these are using this will be usingthe uh machine learning model toannotate and if you go to the modelmodel is already like integrated withthe system i must say all of this can bedone from an API as well and I also havea separate video to how to do that but Ithink this is an overkill so showing theUI is goodenough um and once you set all thesethings up you have uh all these imagesthat that you can export in the systemand here you see all these differentlabelings coming from the IB trackdatabase and these are different typesof storms differentcategories and in here I would like tohave uh my own so let me just speed thisup a little bitoops so let's go and here if you go forexample on the right hand side I have uhthe SAM labels and I have the autoannotation is is enabled so that I canauto detect the um features and thenwhat I need to do is to select the labelthat I would like to give it it was aDorak tree I think so I put the um uhthe key point and then Sam did thesegmentation and generate the label andyou can see it's very well compatiblewith the um data coming from IV tracksand then you can accept uh the labelingand then this label will be available toextract and then you can extract allthat information to the earth systemfeature database and you see on the lefthand side how we can deploy this withcloud native PG but on the right handside you see a visualization of theatmosph atmospheric motion vectors AMVsfor short the movement of individualcloud or water vapor uh water vaporpatterns in success success successivesatellite satellite images you see inthe previous demo we had like multipleimages of the same system it was not thesame it was actually moving slightly sowhen you export all that labeling into adatabase and try to visualize it againyou will see all these shades comingthrough earth uh visualizing themovement of of that cloud or water vaporwhat's next well we need to worktogether with UMass member states todevelop tools for data preparation wealso need to run joint labelledcampaigns to label more and more dataand we also need to identify the bestavailable methods for each feature andokay of course we need to make more dataavailable for the experts so that theycan do more on the model development andI really like to train the ML modelswith all uh the Nucleio uh SAM 2integration i mean uh with the labelleddata and I really would like to havebetter GPUs to be honest and we are alsoplanning to improve and extend the Earthsystem feature data database with morefeatures and and and and capabilitiesand thank you very much and if you haveany questions I will be pleased toanswer them[Applause]hi hi uh just a single question howlarge is your ML ops team uh not thatbiglike one two three fourokay and it works or is it too much workit works um with a caveat with a with anasterisk of course i would like to havea bigger team okay thank youanyoneelse done thank you very much[Applause]2025-04-15 21:58:11.295641 � ��1U#��ApvTRjsSXMi0and this is me Arman very nice to meetyou all and I work as online cloud dataaccess services expert as UMat butunfortunately my colleague group atHarvard couldn't be able to join us herebut we worked uh together with him alsoa team of climate scientists and twointerns on this project so I want toalso introduce you folks uh to my teamas well and with that in mind I want tointroduce you also to European publicspace sector but let me start by sayingthat I do not hold any official uhposition to represent any of these ininstitutions so this will be like theonline inf��n traffic goes up and scalethem down right usually those GPUs willbe available it's like a whole story iflike provisioning new GPUs or notsometimes it's not it's not possible tospin down GPU GPUs but let's assume thatthose GPUs are out there and and if theyare not used to those to to run thosereplicas they are used for some otherworkloads but spinning up and down thosereplicas can take a long time and userswon't wait for a few minutes until theirqueries are being answered so what wesee usually happening isoverprovisioning those GPUs are beingmore more GPUs are being provisioned torun those servers and it means high costand low GPUutilization another use case it's coldmodels or like multiplemodels that needs to be access needs tobe served to different users butinfrequent load right just infrequentlyusers are accessing one of those modelsso what you will want to do right toscale to zero use scale to zero and putall of those models in some storage andspin a replica up when users are queringthatmodel but if that spinning up thatreplica takes many minutes then youwon't be able to do that if yourapplication needs like low latency sousually what we see is cold models justbeing provisioned on many GPUs just tokeep them warm so latencies will belatency will be low and that's againhigh cost low GPUutilization um last use case forinference offline inference rightrunning a batch job thatprocess a lot of data in a in a batchfashion right so a job starts it startsto to process the data outputs outputsright outputs out there and and when thejob finishes those replicas spawn downbut you're paying for the time that ittakes to spin up those replicas rightyou're paying for those GPUs And it canif it's lasts for several minutes thenit can be significant so reducing thatcold start problem reducing the timethat it takes to load models into yourGPU can be significant and that's thegoal of of this project and AK willexplain more about lo model loading andwhat we did with the runi streamer yeslet's let's zoom into those containersand the model loading process a bit sowe have a storage that can be a localstorage that can be an object uh storagedoesn't matter you have your model thereand first of all you need to move yourmodel to your uh CPU memory um and inthe CPU memory there are also somedifferent things that might be happeninglike sharding the model quantization etcthat will add up on time as well andthen after you load the um model to theCPU memory you need to transfer to theuh to those weights to the GPU memorywhich is the second step so these thesesteps are happening sequentially thereis no paralleliz parallelizationuh and it takes a lot of time so when westarted checking out the problem uh werealized that we needed some specificthings uh before creating the theproject so first of all the sesequential loading will not work for usso we need something that's working uhin a parallel manner um second of all weneed a library that supports multiplestorage types uh for example for safetanzers live loader um S3 is not uhsupported and you might need to changeyour either storage or your uh code basein order to make your uh storage typework um so that's something that wewanted to avoid um third we want to uhbe able to be compatible with safetensorers uh safe tensor format formodel weights is becoming thestate-of-the-art and it's very safe umso we wanted to work with safe tensorspecifically and we want um this projectto be um integrated easily withdifferent inference engines we don'twant to uh keep pushing a singleinference engine uh that you should usein order to leverage the project itselfso this brought us to runai modelstreamer so uh we created this PythonSDK with a C++ implementation and wedesigned it in a way that it acceleratesthe uh model loading times onto GPUsfrom various types of storage networkfile systems S3 disk doesn't matter sothe the two key things that we are doinghere is reading tensors concurrentlyfrom the storage while transferring themto GPU and we also divide the tensorinto equal parts uh for saturating thebandwidth while rea�ding because we wantto saturate uh storage bandwidth uh uhwith ourloader um a bit more information aboutwhat is so special about runi modelstreamer so as I said we haveconcurrency so we use multiple readinguh requests to read model tensorersconcurrently from storage and we streamthose to the GPU at the same time um wehave some adjustable parameters um soyou can adjust the level of concurrencydepending on your storage type you canchoose um how you want to divide thesafe tensor like the the data chunk sizefor each thread um and you can alsodefine CPU memory usage for example youmight have a limited CPU memory um oryou might have a gigantic CPU memorythat you want to leverage you canbasically adjust those parameters withruni model streamer and we will use thatum balance workload for reading that'ssuper important um tensor uh in AI comein different sizes so we divide thosetensors into equal parts uh for readingso that we can separate the the storagebandwidth we support multiple storagetypes local file systems cloud-basedobject storage uh we support both um youdon't need to uh convert any uh tensorum uh from safe tensor on and then uhother way around so we support uh widelyadapted safe tensorers format no need toconvert them and uh um um store themseparately and um we created the runimodel streamer in a way that it isactually a safe tensor iterator so it'svery similar how safe tensor iteratorthe traditional uh version of uh loadingmodels works um so you can easilyintegrate it with uh inference enginessuch as VLM TGI etc whatever you areusing and speaking of VLM uh if you areusing a version uh um that's higher than0.66 we are also coming out of the boxnow uh uh with like in the in the VLMcontainers and VLM uh versions uh so youcan uh try it out uh with VLM as well umand now I want to talk a bit about ourbenchmarkings like how does it actuallyperform in in uh uh in practice right umso I will not go through the wholesoftware stack I will leave the QR codehere later on so that you can uh readthe benchmarking white paper uh but weessentially used a meta model metal lamauh 8 billion uh it was f 15 gigs and westored it in a single safe tanzers fileAs hardware we used a single GPU A10G umon AWS um and uh we chose threedifferent storage types here um first ofall local SSDs one is GP3 SSD and theother one is uh IO2 SSD the importantthing that you should know about thesetwo is IoT SSD has a higher throughputum and uh we use Amazon S3 uh which islocated in the same region with our umwith ourinstance a little sneak peek to thewhole benchmarking experiment uh you cansee here and QR code is here uh feelfree to scan it uh to uh check out thewhole benchmarking study so the two mainthings that we checked is standaloneloaders safe tensor loader run modelstreamer and tensorizer they how long ittakes for them to load a model fromstorage to uh GPU and then we alsochecked how much time it takes to bootthe engine and load the model with VLMso these are the two things that we uhchecked out and as you can see we seequite some amount uh improvements uhwith runi model streamer um and I wantto share the key takeaways right nowwith you that we learned throughout thisuh these experiments so first of all umconc concurrency drives speed uh up to apoint so concurrency is amazing wechecked out also different levels ofconcurrency uh from different storagetypes um and we see a lot of improvementwhen we increase the concurrency um butwhen you saturate your storage bandwidthyou are not going to uh get more uhimprovement because laws of physics youcan't do more um so we we uh realizethat we are hitting the limit after uhincreasing concurrency uh uhenough um the second key takeaway is asI said balanced workload distribution iscrucial tensors vary in size umpartitioning work into equal chunks uhamong those threads help um optimizingbandwidth saturation um and uh makingthe streamer process faster becauseimagine like one one tensor beinggigabytes and one is megabytes so thenwe need to wait for the for the uhreading process that's happening withthe gigabytes of uh of tensors first umand the third one is storage bandwidthit matters a lot um so for deploymentsthat you are going to uh go for uh ifthey demand fast model access thinkabout investing in high performancestorage um because we realize that it itmatters a lot the bandwidth itself uh ifyou are dealing with cold start problemum it reduces the load times a lot uhespecially in on-prem and hybridenvironmentsum and as we uh talked before streamerhas some parameters so you shouldactually check out uh the concurrencylike the optimal uh optimal amount ofconcurrency that you need specificallyfor your storage type and uh also um ifyou have uh any specific requirementsfor CPU memory etc you should tune thethe uh streamers parameters uh becauseit's it affects the um cold start timesquite a lot um and with S3 we we gotvery good results um this is because uhour approach with S uh S3 so creating anAWS S3 client per thread with eachthread sending uh multiple asynchronousrequests uh to the back end um affectsthe the um performance quite a lot it'sit's super powerful we see results uhthat's under five seconds um I actuallyhad uh difficulties to put them in thein the uh chart uh in the earlier daysso this was quite amazing if you areusing S3 give it ago and some things that we noticed inthe cloud um which is uh a bitindependent of uh RNAi streamer um so wewere checking out the um theoreticalthroughput that we are going to get uhstorage bandwidthwe are going to get with the um SSDs umeven if it's written four gigs persecond in the documentation on the uhwebsites we could only see up to twogigs per second so keep in mind thatthere might be some practical limits uhthat needs a bitplanning um and also if you want to runyour own benchmarking uh with thestreamer or with some something else ummake sure that you are accounting for S3caching effects that's something that werealize um probably in the S3 back endthere is some caching happening um soafter running some experiments one aftereach other uh in the S3 setup uh we sawum quite some acceleration uh which isnot good if you're trying to calculate acold start performance uh so if you wantto give it a go and check out some uhsome benchmarkings uh make sure that youum add a cooldown period between cloudtestsum yeah and some some exciting newsthat's that's coming up and that'salready here so we have a new version uh0.13 uh it has full AWS S3 nativeauthentication uh now we also support GSuh GCS um and we have some usabilityimprovements and in the road map uh wehave some very cool stuff coming up weare going to support sharded models weare going to optimize uh multiGPU modelloading uh we are going to have parallelmulti file loading and we are alsoplanning to support GPU direct storageand uh we are especially excited aboutthe G uh GCS uh support uh we areworking with the G GKE team uh and uhthe first impressions from the G GKEteam is amazing they see uh96% model load time reduction with themodel streamer when used with VLM uh incompare in compared compared to uhdownloading the model from the cloudobject storage directly which makes ushappy um yeah uh so here are some QRcodes for you the first one is uh to theGitHub repo and this is a QR code forthe benchmarking white paper we have alot of results if you want to have alook and if you have any questions weare going to be at the booth uh afterthis talk we are very happy to um answerany questions chat about uh how youexperience this uh uh this challenge anduh one more thing we are very excitedabout this specific cubecon uh becauseum the the core of the runai platform uhour runaiuler is now available uh underApache uh license um so if you arescheduling AI workloads uh we are veryhappy to hear your feedback uh any anyfeedback any contribution any um earlyadopter uh adopters are very muchwelcome and you can also come to boothif you want to talk about theuler we arevery happy to hear your um uh hear yourideas and opinions um and we call it kaischeduleuler okayyes whatever you call it kai Ki Ki umyes that's that's it for for today fromus2025-04-15 21:58:11.921442 ��jV#��AqH5djJlbodYokay cool so this talk will be aboutoptimizing model serving inference on aKubernetesum with model weightstreaming um soI'm happy to be here um I was I'm fromthe Ranai team i was I was theco-founder and CTO of Ranai we startedin 2018 um RAI we're doing what we callAI infrastructureorchestration runai was acquired byNvidia last year and now we're part ofthe Nvidia team before that I did my PhDand and postto in information theory andAken is here with me yes so I am thedeveloper advocate from the runai teamnow in media uh I also did my masters inrobotics cognition intelligence uh inMunich uh so today we are going to talkabout uh model inference yes so modelinference and and we got this uh um ranimodel weight streamer it's an opensource project that we published lastyear and this is the work of Noah andOmar who couldn't be here but we'representing their work so great work bythem um so let's start so inference intheory right there is a training dataset and a model is being trained on thatdata set it's being evaluated after thetraining and if results are good then itit moves to inference right andinference new data or users queries aregetting into the model being processedand new predictions orgenerated right and this is what's thethe the topic of this talk right aboutinference how how models can serve newrequests in acloudnative space on on GPUsso in traditional applications with webweb applications common practice is touse autoscalers right and there areusers queries getting in and there areinstances replicas of of of applicationsrunning on maybe on in a container on aCPU and when traffic goes up new uhreplicas are spawn up right um to servethose requests and when traffic goesdown those replicas are being spawneddown right to save costsSo that's with traditional applicationsbut with AI it's much much moredifficult uh because of few reasons butone of the biggest reasons is the coldstart problem what we call the coldstart problem it takes a long time tospin up a new replica right so first ofall the process of spinning up a newreplica it involves provisioning a newmachine right but with GPU machines ittakes longer than provisioning a CPUmachine usually because there are lessGPUs out there in the cloud and becausethere are software libraries like CUDAdrivers and and and and CUDA kernels andlibraries that that wait a lot and ittakes time to install them second thingis the container image just loading thecontainer image usually within in AIthose images are big it involves a lotof Pythonlibraries booting up the inferenceengine usually also takes not aconsiderable time but this talk is goingto be about the last part about loadingthe model weights right just downloadingthe model weights from a storagelocation into the GPU and it can take along time right it can be reallysignificant for example with llama 3 8billion parameters usually it's beingit's being deployed with 16 bits perweight and then the weights are around15 gigabytes so that's a lot right uhllama 3 with 70 billion parameters itwill be more than 100 gigabyte just theweights themselves to download from astorage and and load into a GPU deepseekR1 right more than one terabyte sothat's huge that's huge and it can takemany minutes can take sometimes morethan 10 minutes just to download the theweights and load into the GPU so the ourgoal is to right to accelerate thatprocess and why is it that importantjust a few slides about that so a fewuse cases is for inference one one isreal time inference where there is onemodel and a high load right many usersquering just one model and maybe thereare multiple replicas running on a GPUand as we said right we want to scale upreplicas whe��want to make surethat your test environment is close toproduction as possiblei know in my career there's beensituations where my test environment maymaybe may be running the latest andgreatest but my production environmentis more stable so it's you're not makinga lot of changes to that one you want tomake sure that the both of environmentsare the same or at least as close aspossible just like our little hero herehe sees that ramp and he's like I gotthis i'm going to hit that we'll be finei can get through this course easy wheninreality it looks more something likethis our users have possible pitfallsmemory issues maybe uh lag or latencythings that could come up in productionthat aren't coming up in test so we wantto keep our environments very verysimilar but isn't that the promise ofcontainers isn't that why we havecontainers we have application code anddependencies all wrapped into one nicelittle container and then now I can takemy container and share it with you onyour system and share it with you onyour system and it should be the same itshould be running thesame but what about the configurationaround that container and how thatcontainer is being run are they thesame think aboutthat so I wantto share a demo that I built on asubject that's near and dear to my hearti love hamburgers who here loveshamburgers and so do I all the timeeight days a week I could have ahamburger and they're fantastic so Ibuilt this application to track all thehamburger joints that I go to and that Iwant to keep remember but also to havemy friends vote see what they like seeif they like them too and so I have thisand I wanted to show a picture of itbecause I want you to see when we'rewriting tests for this interface okaythis is the interface that we're dealingwith has a list of hamburger joints hasa ad button has some voting mechanism soyou can visualize that me I'm a visuallearner i need to see this i need to seediagrams but also I want to see code soI'm going to try to show all three ofthose to you today so that you can eachof those different types of learning youcan kind of see what's going on and thehope is as I'm sharing these ideassharing these tools you can take themback to your team the tools can beswapped out right i might show you atesting framework or other types oftools in this discussion or in this demoand you say "Yeah well I don't use thatbut I I like this other one better." Andthat's great hopefully you can take theprinciples I'm talking about and stillapply them to your team and make yourtests a little bitbetter so speaking of testing I'm goingto use Cypress cypress is an open sourceframework for JavaScript that allows youto write a JavaScript test inside ofyour uh or for your application thattests your application and so Cypressallows you to essentially assert thingsand check different aspects of your UIfind selectors in your DOM manipulatesthem a little bit and then sees if thosechanges are what you expect in factlet's look a little bit at our test codecome out here to myproject and my application that I havefairly straightforward i have a backenda front end and my ET directory each ofthese directories are going to be theirown image so they'll be their owncontainers inside of myapplication and so my test code lookssomething like this come here to CyprusE toE and spec there we goin here I set a couple of differentvalues first I get a front-end URL thatfront end URL is going to test howCypress is going to access myapplication because I want to test theentire flow not just local host on asome port i want to actually test theingress that's happening and then I'mgoing to have some data that I'm goingto be passing in to my application soI'm going to put a new burger joint inthere and then I'm now I'm going to stepthrough each of those tests the first istesting to see if the page even loadsbecause if the page doesn't load none ofthe rest of the tests are going to workright so you want to check that firstbefore each function is checking to seeif the page even loads and so we saywe're visiting the burger placessite then we find an� element on thescreen or on the on the page there onthe window and we say let's find thisanchor tag that has an ad on it that'sour ad button then in the next test I'mgoing to click on that ad button becauseI'm assuming I found it and so you stepthrough your UI you step through yourprocess building on your flow as you goso I'm going to click on it then after Iclick on it it's going to open I hopeit's going to opena form for me where I'm going to add myburgerjoint and then I'm going to save it andthen check mylist so this is what my test is going todo for my application my application asI mentioned is going to contain twocontainersthey arerunning here in a cluster that's definedlike so i have a deployment for mybackend that's going to refer to mybackend image my burger back image it'sgoing to be tagged with an image labelthat will be defined during our buildprocessthen I have a frontend very similarlyokay and then I have ingress that'sdefined here at the bottom there's lotsof ingress providers out there today I'mgoing to be usingEnro and we'll discuss why as I getthrough my my uh my processhere so that's the codeand so our environment is going to looksomething like this again I like to seecode i like to see diagrams that way Ican visually see how things worktogether i'm trying to make myproduction and my test site as as evenas possible as similar as possible bothof them are running front end andbackend pods and then I have my testcode container that's going to beinterfacing with my test clustertheflow we're going to see that in a secondwe're going to be using GitHub actionsas our platform for running our CIprocess our process to essentially buildtest and then deploy our applicationinside of our GitHub action we're goingto be using K3S super lightweight niceKubernetes distribution that allows youto spin up a Kubernetes cluster and thenspin it down as you're running yourGitHubaction and so our flow looks somethinglike this as a developer I'm going tohave an app code repo that's going tohave my GitHub action in it running myK3S cluster that's also going to be thecluster or the area where I'm going tohave my Cypress code running andthen I'm going to have my config repothis touches on some of the GitOpsprinciples I want to talk about in thissituation the GitOps communityrecommends that you have yourapplication code and your configurationcode in two different places in twodifferent repositories because both ofthem have different life cycles you'regoing to be making changes at differenttimes to your application code than youwill be to your configuration code evendifferent teams are going to haveprobably have access to the twodifferent repositories and so in thisdemonstration I wanted to separate themout just to show what that what thatmight look like and so in this situationwe have our app repo which is going tobe checking out that configuration codeduring our GitHub action it's going tobe making some changes because it'sgoing to be building our images andputting labels on them right so and soour configuration repository needs toknow about those changes so it'll becommitting those code those changes backto our configrepo allright sonow let's look at our GitHub actionand I'm going to be flying through thisa little quickly there's a lot going onin our GitHub action uh rest assuredI'll have a link at the end to show youhow to get to all these resources boththe repositories the GitHub action allthe resources I talk about it'll beavailable uh at theend so my GitHub actionhere I'm going to have it run on pullrequests every time you open a pullrequest not only when you open a pullrequest but also what they say when yousynchronize a pull request so in GitHubor in any Git repository that you'll beusing you open a pull request or a mergerequest but then commonly you're goingto be making commits to it after youopen it that's what synchronizing isdoing it's going to be running the testevery time you make another commit onthat on that uh pullrequest it's going to be running on anypull request that's trying to merge withmainand then we get� into our dependencieswe're we're going to say that we wantthe GitHub action running onYUbuntu we're going to check out thecurrent repositories code so the thepull request is comingin here's where we're going to specifythat we're using K3Sthen we're going to log into Docker sothat when we deal with our images andpush them around and we want to pushthem up to Docker Hub to a Dockerregistry remote from where we're workingso that it can be deployed and run inproduction uh so we log into Docker andthen here we startbuilding our Docker images we're goingto build our backend and our front endwe're going to tag them with the Shaw ofthe commit another great practice tohave you always want to have yourcontainers and your code commits to havethe same identification that way you cansay "Oh wait this particular image thisparticular container is running withthis particular revision of code let'smake sure that those are lined up andthesame we push those images up toDockerHub and then now we're going to bechecking out our configuration code." Sowe can check out code from another repoinside of our GitHub action so I'm goingto check it out from my config coderepo and then when I get that I give ita path i give it config repo path therein the middle where I that way I canidentify that the the the code that I'mchecking out from thatrepository inside of config repo it's alittle big because I want to make it bigso you can see it so I'm going to scrollto the right here just for a secondi'm going to reference configrepo and inside the config directory Ihave my Kubernetes manifest my Katesmanifest that we showed you before withmy couple of deployments my couple ofservices and my ingress i want to beable to perform a find and replace sortof action on this file so I'm going tolook for my placeholder values thethings in all caps and then I'm going tostick values for my Shaw in there for myimagelabels and then for my front end URLvarss do all caps these are values thatyou are variables that you save insideof your GitHub repository that you havethere and in this case we're using thesaid command it's command line commandthat allows you to do this type of findand replace inside of a file in thissituation you could also use atemplating engine something likecustomize or ginga depends on what youwant to use again it goes back to thatconversation about there are manydifferent tools for each step and theseare just the tools I'm usingtoday next in our GitHub action we havewe're going to be installing the Helmcharts that we need or in thisparticular case we need a Helm chart weneed the Enro operator so that we canuse that to provide our ingressthe reason why we went with Enro isbecause Enro allows you to set up yourURL to pass to the Enro service that'sbeing that's referring to that URL uhwithout having to know anything aboutthat cluster aside from the fact that uhaside from the fact that it's tied to itusing the this angro operator so youdon't need to know any IP addresses ofof services or anything like that sothat's that's why we went with Enrotodaythen weapply our Katesfile and with that cluster running we'regoing to build our test container andthen we're going to run it we're goingto pass in that front-end test URL thatwe have we set that front-end test URLinside of our you saw that inside of ourtest code but we also gave it inside ofour GitHub uh GitHub repository as wellgithub allows you to store variables andsecrets and things and that those arethe values that you're seeing with thevarss inside of thisfile all right in fact we'll go to thatpoint inour GitHubaction and I recorded going through thiswhat's the nice thing about video isthat we can step through it i can pauseit i can go I can skip through thingsbad part is I can't really changeresolution so sorry about that but I'llI'll get this going here and essentiallywhat I'm doing is I see a change that Iwant to make in my website i want tochange the title because I'm realizingthis is not a burger joints list for allburger joints in the world just inSeattle Washington which is where I'mfrom and so� I make the change to sayburger places ofSeattle and then I'm going to be uhpushing that code up so I'll commit thatAnd we can assume we know what what thatwhat's going on there so I'm going toskip through that a little bit and it'sgoing to push that code up to GitHub forus and now I open a pullrequest that pull request then launchesinto our GitHubaction what's nice about watching thisI'm not going to make us watch the wholething but it's going to start giving itor building and installing all of ourdependencies we we saw at the beginningthere before it started spinning awaythat it was grabbing K3S so that wecould use that and now it's going tobuild the Docker images of our back endand our front end we're going to skipacrossthose because that takestime again it goes back to that wholeslide remember with the weightlifter itdoes take timeand thennow we'll skip hereto ourtest and then righthere our testran it's always harder doing somethingin front of people there we go we'regoing to watch we're going to see justthree tests pass all four tests pass ipromise i just don't want to spend timetrying to scrub to it but we see how ourtests pass and then it can move on tothe next steps in our GitHub action ifour tests don'tpass frameworks like Cypress can provideresources like giving you a screenshotof what things looked like and in thisfor an example here I can come out to mytest code and say "Oh what does it looklike when it failed?" And I say "Ohhere's a screenshot when it couldn'tload that very first test it said "Thisis what I saw." That's what Cypress issaying so you can have that as aresource while you're running yourtests so our tests pass and we skip intothe next portion of our GitHub actionwhich is going to be doing a deploy andthis is where I want to talk a littlebit more about our GitOps principlesgitops is a philosophy of essentiallysaying our infrastructure is beingdefined as code we should use git asthat source of truth git is where weshould have everything all ourconfiguration stored you don't makechanges by hand you make changes bycommitting to our git repository andthen having a list that way now you havea history of every change that you'vemade you know exactly what changed whochanged it and when they changed it it'sall happening inside of git but thenalso you have these processes like theone we just saw the GitHub action whichis happening with our git process wherewe're pushing the code up and we canstart running automation on that code sothat when we're pushing ourconfiguration changes we can test tomake sure our configuration actuallyworks and our application actuallyworks git acts as our source of truthand then what I'm going to talk aboutnext is an idea again in the GitOpscommunity where they talk about therendered manifest pattern this is apattern where you have your manifestsour arcades config files that areactually going to be fully rendered withthe values that we want to use to deploythat are sitting waiting inside of arepository so that our CD process canpick those up it's not going to that CDprocess is not going to be finding atemplate and then swapping out valueswe're actually going to have therendered version of that of thatmanifest so in our case here during ourGitHub action below the tests after thetests pass we had the GitHub actioncommit our changes back to the configrepository but instead of putting it inmain it actually created there arebranches for testing for render test andrender prod and in this case we madethose branches that are never going tomerge back to main they're going to betheir own spots that our CD process willwatch for our main branch will keephaving that template the templatedvalues so we have the placeholder valuesinside there inside of here inside ofthe render branches we have our fullyrendered configurations the nice thingabout this I'll let this run while Italk the nice thing about this is you'llhave a full history again of theconfigurations of your application notonly for production but also for testsfor tests those are the things that justran inside of the GitHub action but thennow you have that test if you wanted torun additional tests while yourapplication was out in the world you canalso do that something called smokescreening to make sure that you can findthings before your users find them youcan find them firstso in this case here we see thatrendered manifest where we have the theimage labels we have the URL value herein our testbranch and then we alsohave thewh yay for videos all rightright here we're not going to do thistoday are we so you also saw that we hadimage labels and they had the shaws thatwere from the git commit and inside ofour of ouruh our manifestthere so we'll skip on to our next stepnow that we have rendered manifests andthey're sitting in the branches ready tobe rendered ready to be deployed we'regoing to introduce our last tool of theday and that is Argo CD argo CD is goingto be watching our specific branches andsaying "All right I'm going to watch thespecific branch and when there's changesto that branch I'm going to deploy it."In our case here we'll look at therender prodbranch i'll have this video run you'llwant to watch you see those turquoisesort of boxes down in the lower righthand corner that's where you want to seethe changes when somethingsyncs that's going to be processing ourchanges to our cluster before we pushout to productionwe see we check make sure that ouringress is set our deployments areset and then we can go and see that ourproduction site the change has been madeso I made that change to the title andI'm confident that I made changes thatdidn't adversely affect my system andthen also did exactly what I wanted todo now to the keen eye you'll noticethat this list is zero before theystarted filling out the the form hereand the previous one was not well that'sbecause this one this application isrunning with local storage so in realenvir real life you'll want to have apersistent volume some persistentstorage so that you don't uh you knowlose data every time you make a newimageso you might be thinking Scott that'snice my application's a little bitbigger than two deployments or twoservices i have a lot more CRDs in myapplication this is going to take a lotlonger to build and a lot longer to testand you're not wrong i actually recentlylearned of a new project not new I'm newto the project uh VClustervcluster is a Kubernetes distributionthat allows you to have a cluster insideof a cluster so in a sense you couldhave an ephemeral temporary Kubernetescluster that's running inside of a hostcluster now how would this help withtesting imagine this for a second youhave your cluster that's sitting on youryour cloud or bare metal sitting in theworld and it's running a Vcluster ofyour production environment and that'sfine but then it has all this the CRDsset up and installed inside of the hostcluster and you want to test with thosesame CRDs the same versions of thoseCRDs with vCluster you're able to spinup a new vCluster use all those sameCRDs test the thing out and then spin itdown you can do all that actually from aGitHub action i joined the team lastweek so I don't have anything ready toshow you but what I do have is I have alink to some resources that my colleagueSiam has made that is also availablethat I'll show you here in a QR code injust a second you can see how to dothat in fact I do believe it is that QRcode i createda channel on our community git ourcommunity slacks workspace it's forgeneral discussion of testing but alsohas a link to the slides uh that that Isaw it's actually an extended version ofthe slides because I had to chop out awhole bunch of slides and then also haslinks to the repositories links toSiam's article on how to uh essentiallyhow to spin up an ephemeral environmentin a pull request in a pull requestsimilar similar to what I just did butwith vcluster insteadi'm Scott Mallister a developer advocatewho recently joined Loth Labs the makersof VCluster i want to thank you all somuch for coming to my talk today andhope that you have a wonderful rest ofyour day and safe travels after the nexttalk[Applause]2025-04-15 21:58:12.407354 DD�� W#��wAemjrmJZR-ZIthank you all for comingtoday i need your help i usually do athing at the beginning of my talks toprove to my bosses that I went to alocation and it actually did work didn'tcome to England just to watch socceralthough I did and it was fantastic ineed you all to help me out can yousmile andwave and this room is so big we're goingto do smile and wavethere awesome thank youall put that away so we don't getdistracted if you are still here atCubeCon and you're in this room you'regoing to learn a little bit abouttesting specifically endtoend testingand now I know each ofyou like me because we're softwarepeople we're detail oriented you readthedescription facidiously you kneweverything that I'm going to talk abouttoday so of course not only we going totalk about testing we're also going totalk a little bit about GitOps not a lotbut I do want to talk about the wholeprocess right where we take our code andwe're going to build test and thendeploy our application and then we'regoing to talk about a few GitOpsprinciples in there too that I wanted toshare so testing why do we do it we doit to evaluate and verify that our codeis going to act asexpected because honestly two of themost important things in softwaredevelopment is that we make sure thatour code works and that we can pushchanges out quickly because if our codedoes something unexpected and if ourusers hit something unexpected that'susually called a bug right and we don'twant that so with testing we want tomake sure we have our tests running butwe also have our tests automated so thatwe can push our changes out quickly andmake sure that our software is runningthe latest things that we want thelatest features that we want ourcustomers touse our good friend Adler also said thisabout testing it can work asdocumentation i know for me I've learnedprojects by getting into the tests firstrunning those seeing how they how theyoperate how it affects the code eventweaking them a little bit to make surethat I understand what's happening butthen also it can make it can make forclear design and encourage productivitywe have confidence that our changes arenot going to adversely affect oursystems as we push them outso endtoend testing is that type oftesting where we're going to be testingthe entire flow you have all differenttypes of testing you could be doing inyour software development you have unittesting where you're testing the mostsmall objects possible or even just thefunctionality on those objects thenintegration testing where we startsnapping those things together andendtoend testing we're going to beexperiencing what the user experiencesso if you have a web application amobile application we're going to betesting out that UI that's going to beinterfacing with that back end which isgoing to be interfacing with our datastore so we want the whole thing to becovered and you might be sitting herethinking "Yeah Scott that's a lotthough." So if you think about it thesetests you have to write them that takestime you have to think about how they'regoing to affect your system or at leasthow you're going to take that flow thattakes time how to set up a developmentor a test environment that takes timeand itdoes but like weightlifting good testsmake our applicationsstronger not only end to end testing butwe should also keep the other types oftesting in mind whether unit testing andintegration testing we want to make sureeach of those different aspects arebeing covered the internet reminds uswhy a real world example this lock wasunit tested it totally works or itdoesn't me too you wrote it to do onething and then oh wait it's doing thisthing i didn't want it to do that so youwant to make sure that you have testscovering each of yoursituations you also ��make the change to the simple code so weaccept that we have a lot more code aslong as each individual file is simplerand self-contained um and it is okay tocontinue to have magic to create thatlots and lots of simple code that is anan asset and now we can leverage themagic machine for our own efficiency butas long as we aren't shipping it andrequiring customers to run that magicmachine we feel like we're in a muchbetter place and the the important thinghere is um we recognize also that thenature of the magic machine will improveover time like we know that LLMs aregoing to get better tomorrow we knowthat you know we will write bettertooling to create particular problems wehave and we we have people that aregoing to fix problems also um both youknow ourselves on the config connectorteam and the community here right and wewant to make sure that everyone machineshumans tooling can all collaborate andwork together on that codebase so a bigrequirement is that all of those toolsand particularly the AI tools must beable to work with the code as it existstoday they must not assume that theycreated all the code but they must justbe able to make small incrementalimprovements to the code as it existsit's the idea that the code is theprimary artifact the code is whatdefines our product and that is what wemust work with not some abstraction onthe codeso we've mentioned before this isn't thefirst time and we actually came up withthis idea of pull the magic machine tobuild time from run from productionruntime before we even got to LLMs andso the first thing I tried to build wasan abased tooling right that thatstandard classical approach but that ais complicated the processes arecomplicated and it takes a huge hugeamount of humaninvestment the generated code that wegot out of it was better than with it'snice simple code that we get inproduction but it's a very complicatedbuild system to try and maintain andevery time something new happens you youhave to spend the time to make it betterit was an improvement but it wasn'tenough of an improvement to meet thatthousanduh controller goal that wehad and coincidentally or happily enoughthis was actually around the time whenLLM started appearing in the wild as itwere the chat GPT moment coincided witha lot of this uh exploration that wewere doing um and one of the greatthings about LLMs is most of them happento be pretty good at manipulating codesort of straight out of the box so wedon't need to do a lot of that heavylifting that we needed to do with theclassical tooling um another thing youknow they're they're good um but we alsorecognized that we didn't need the 100%reliability that we get with basedtooling we can accept the idea that youknow this this can sometimes produceabsolute garbage and sometimes producewonderful results we just run it twiceright uh and that is a great assetbecause we aren't running that magicmachine in production it's it's onlyourselves that has to deal with the youknow moments of brilliance and momentsof lessbrilliance so that sounds great problemsolved right well there are someinteresting problems we got to a littlebit of it i mean the first thing and itsome of the problems are the LLMs andsome of it is actually our assumptionsour fears our interpretations of theLLMs llms are non-deterministic we canmake them deterministic but you lose alot of the power when you do it and sofor those of us with a very classicalbackground in computer science we'reused to you know the A or the otherdeterministic tools it's kind of scarygoing to a non-deterministic solution alot of the ourassumptions just get thrown out thewindow it's like if I have this new LLMtooling and I run a test and the testfails does that mean my LLM is wrong ordoes it just mean that I happen to hit abad path through the LLM and if I justrun it again I'll get something betterhow many more times do I run it you knowum excuse me if I make a change you knowand I run the test and the testfails did my fix not work i mean is iswas that prompt change not good how manytimes do I have to run it right sothat's a that's� a problem that we needto start learning how to deal with andit's it's not just a problem on how todeal with it but when you do when youhave an engineering team or teammatesthat are trying to use this tool andthey try it and the immediate responseis well I tried it once and it didn'twork well okay but you need to try itagain right and so there is humanelements to this as well our intuitionof how it works is also an interestingproblem right we think well it didn'twork this one time so I'm going to buildthis heavy solution on how to parse YAMLwell I don't know we did this right andthen I pulled all that YAML parsing outand I went I'm just going to pass theYAML in and see whether or not the LLMcan deal with it and surprise it canright so don't assume that it can't dothings even if you try it once and itdoesn't and and start simple see howmuch of the heavy lifting you can makethe LLM do for you also remember thatyou know the the these models keepworking and they keep the peoplebuilding them keep making them better ifyou wait you will get better resultsjust by going to the newer model and youhear this again and again but there areother things you can do while you'rewaiting you can come up with the rightanswer right you can iterate you cangenerate the controller that is of theform that you want as a result that'sthen can be go back into the learningthat reinforces and that helps make thatnext model better i will also say thereis something you may try today and itdoesn'twork that doesn't mean it won't work onthe next model right so one of thethings we found is it's very useful toflag our experiments so that they'reeasier to rerun on that next modelbecause we've found things that didn'twork on earlier models do work on latermodels it's also important that youunderstand your requirements right we'vetalked a little about this for our usecase do you need predictable results inproduction you know it's possible togenerate an LLM that will give youpredictable results in in production itwon't run over the nice kitten um butthat's a lot of extra work and do youneed to run the LLM at runtime so wedon't right and so we can get predictmore predictable results at productiontime by running the LLM at build timeand that you know and then have a mucheasier time determining which were theright results do you need the LLM tomake a rapid decision we don't we dothings at build time that means it'sfine for our build system to go awaywhile I'm sleeping and give me theresults back when I get up right do weneed to generate predictable resultswell yeah I do right i mean and but itdoesn't have to be every time so to toJustin's earlier point I can run ittwice i can run it thrice as long as Ihave some way to validate which ones ofthe controllers that it wrote were theright onesi think one of the most interestingthings I've read about AI is is uh RichSutton's The Bitter Lesson it is a greatessay we've linked it there it's you canGoogle it um it's mostly about AIresearch but it is this idea that Ithink applies generally like to softwareengineering and our usage of AI also andthe idea is um you as a researcher orsomeone wanting to use AI as a personwill invest you know your your knowledgeand try to get better results out oftoday's AI system and you will get goodresults um however uh you will beovertaken by the next generation of LLMsby someone that puts more GPU use in abox or uses the next generation ofhardware and essentially what we haveinternalized here is the need to acceptthat we have to accept the bitter lessonand we have to adopt strategies that donot fall down in the face of the bitterlesson and in general what that means isum basically avoiding any investment insort of model specific optimization youknow you have to do your prompt tweakingbut we're not going to spend weeks ormonths tweaking the prompts because weknow there's a new LLM coming and thatnew LLM will probably responddifferently to the prompts and all yourwork will then have to be thrown away soinstead focus on techniques things likehow do we get the right information intothe context� like LM are still obey thebasic laws of of information theory andcomputer science you know garbage ingarbage out so if you want something toappear in the output it sure is a loteasier to make sure that informationappears in the input to the LLM right itis much easier to do that than to relyon the training um that will be true ofevery foreseeable generation of LLMs tocome um can we break down big problemsinto lots of little problems right thatsmaller problems should be easierproblems and easier problems should beable to be solved more readily um can weget the feedback loops going um so thatwe can improve um both each the way wedo those strategies and sort of overallunderstand where we're going and how howwe're doing and where we need to usemore people and where we need to usemore LLMs can we you know use the LLMfor what they do today what they're goodat like today that's exposing tools orfunctions to the LLMs can we basicallydo that and and have it work the idea iswe want to be in a situation where whenthat new model is released we see itpurely as a win and not something thatinvalidates all our work we use thebroad techniques we used like with thelast generation maybe we tweak theprompts a little but we find that ourresults just get better and we aren'tsort of clinging to the work we did inthepast so jigs andinterlocks we're going to build lots oflittle tools we're going to work out howto make those tools work together andthen we're going to work out how to dothese things again and again as quicklyas possible as reliably as possible it'svery similar for those of you who maybedo something like woodworking you buildtools that are just there to help youbuild other tools and to to to do someone operation really easily reallyquicklyand so one of the first jigs this is asimple one um we basically inject uhvariables substitute values into aprompt um we have a couple of functionsthat we've exposed to the LLM like writea file and we basically say hey LLM godo this relatively simple task um you'realso allowed it's in the the three dotsbut it's also allowed to run G-Cloudhelp for example here so this is like asimple task where it reads the helpfiles for G-Cloud and tries to constructa test case right this is a simple taskso we think we can use simple toolinglike basic prompt injection or prompttemplating uh with uh with tool usage umthat works well for simple tasks um butI think one of the one of the thingsthat we learned is it doesn't work wellfor more complicated tasks so this islike more of the sort of vibe codingtype task can you actually write a heyLLM can you write a fuzzer can you writea controller can you write a CRD rightpeople are getting good results but ingeneral we find that the morecomplicated tasks need a bit morestructure to get the best results fromand so what we do is this we call itinduction is our little nickname for itbut if you look at this is actual codefrom the KCC project this is a fuzzer umfor a TPU node or TPU virtual machineand you can see the important bit isthese three bits at the three lines atthe top where we basically have somestructured input data um we're saying wewant to run a fuzz gen it's just a nameof a tool it doesn't really matter umand we are fuzzing uh a resource thathas a proto representation as thisparticular kind or type and the CRD kindis TPU virtual machine and what we do iswe take these as examples uh from thecodebase and we sort of think of them asinputs and outputs and what we're goingto do is we're going to use the LLM togive it the input and ask it for theoutput for the next case so in this caseyou know we can degrade it into just theannotations and the file the file whichis theoutput and then the way we feed that totoday's LLM is by wrapping them in XMLwhich of course makes us sad but uh youknow we move past that quickly um andthen we come to the actual principlewhich is uh we write a couple of thesefuzzers by hand uh we add thoseannotations to those first files that'ssort of the the first step of theinductive loop and then now when we wantto do the next one uh we write thea�nnotations for that n plus1 case wehave from our existing codebase we scanthe codebase construct those inputs andoutputs and we put those into thecontext and then we say all right here'sthe prompt here's the input we take thatoutput from the LLM uh we correct anyissues that might exist or we choose torun it again or whatever it is we mighthandcode it if it's that bad um but wecommit that into the codebase with codereview that becomes the n plus1 case sothis is a very powerful way to get fromyou know 2 to 3 to 4 to 5 to a th00andum for all of these sort of sub problemsthat we are facing and it works muchbetter than simple vibecoding so that sounds great um how manysteps are there and the answer is Ithink right now about 12 15 such stepsum which sounds like a lot for creatinga controller but you know there arepre-steps for some of the steps one ofthe first things you saw from Justin'sstuff was this G-Cloud well what was thepoint of doing that G-Cloud it turns outthe point of doing that G-Cloud was sothat we would make HTTP requests see theHTTP response capture that HTTP logthat's one step the next step is nowthat we understand what a requestresponse looks like for that resourcenow we can tell have the LLM read thetemplatized version and sample versionof the induct of the the induction loopand the logs and use those two pieces ofinformation to then create the mockright and so what we have is breakingdown and we we can do things like whenwe look generate that HTTP log there arebasic checks we can make on it todetermine if there are a bunch of 404sthen something went wrong and we shouldprobably stop here and so when we geteither we need to regenerate the theG-Cloud commands or we need to rerun thethe the the actual test that generatedthose things and that's where we getthis whole interlock right so that thatcheck for 404 is one of the interlocksand we can run this stuff in paralleland the the the requests that get a 404are paused we capture the status we goand investigate and other of theparallel change which didn't hit a 404get toproceed yeah and I mean I think this isthis is a hypothesis but I think it'squite important like we're not justbreaking down the the tasks because thesmaller tasks are easier it is the ideathat you know we go over here with thistask we go over here with the other taskif there is a hallucination on one ofthe two we hypothesize that it isunlikely that these two different taskswill both fail in a way that's cometogether in a way that still works rightso the hallucination on one is chopefully caught by the correct outputon the other or at least we think thereare many ways for the LLM to hallucinatebut ideally there is only one way forthem to to get it right and so that isthe that is the core of the hypothesisof this interlock idea break down theproblems lots of independent steps ifthey can mesh together hopefully they ifthey mesh together that is evidence thatthey are meshingcorrectlyso iterative improvement toolsum there are varying things we can dothere are varying things we've builtalong the way you know we have examplesof things like well I have an LLM i wantit to be able to generate a CR for mythis newly generated CRD type so webuild we build something that can gotalk to Kubernetes download the C theopen API schema and make that availableto the LLM i can then have it look atthe mocks make the mocks available tothe LLM and when we start building youknow I can building these steps thehere's the build here's the build outputeach of these things that we makeavailable to the LLM is one more thingthat the LLM can then use to nothallucinate to understand when it had aproblemand I think one of the interestingthings is I think this is you know thisis the cursor workflow this is the agentagentic workflow we've actually foundthis one hasn't worked as well for usyet as some of the other uh workflowslike the iterative loop um this is theone we sort of expect to we we we shouldbe able to give the LLM lots of toolsask it to fix a compiler error and haveit work in our experience with the waywe've built it it d�oesn't which is allcredit to other tools right of course umwe haven't had as good results and sothis is where like humans typically haveto get involved today we obviously hopethat you know we will figure it out orthe LLMs will figure it out and thatwill go away or get lessened but todaylike a lot of the basic fixing ofcompiler errors uh has to be fixed by aperson going in and fixing those thingsso yeah so we have um thinking alsoabout the way cursor works right that isa a UI sidebar and we don't want ourengineers to have to go and type in intoa little uh sidebar every time we havebuilt a automated pipeline to basicallycombine those complicated or large largenumber of steps into a repeatablepattern which Walter willdescribe so decompose combinedstrategies um we've kind of gotten abreakdown hereso we've said there are 12 15 steps thatneed to be run through to generate onecontroller a lot of this needs to bemetadata driven is actually probablywhat I would say is there was a one-timestep before we even get here which wasasking the LLM to generate the metadatafor a thousand resources then I may havespent more hours than I want torecollect actually getting rid of thehallucinations in that metadata uh butwe get the metadata andnow we have a description of you knowvarying things it can be the uh gitbranch that we're going to make eachresource in the name of the resourcewhere it can find the protoile for theresourceuh all of these sort of things and nowwe can basically just say hey it takesmaybe 10 minutes to run one step on oneresource but I have a thousand of themto do and if we've written that firststep jig correctly I can just go to bedand come back the following morning andsee that it's run that first jig acrossa thousand resources and maybe 600 ofthem were successful right and now I cankick it off running the second step onthe 600 that were successful while Istart debugging went what went wrong onthe 400 where it didn't work and in thisway we can get much betterthroughput to get things actuallyworking and the other thing is not allof these steps have to be LLM basedright we can talk about hybrid solutionsso if I've written something and Ican like the the rest or the gostructure for what I want my CRD to looklike i don't need the LLM to generatethe open API schema for me i can just goand have CRD generators there areseveral of them out there that will dothat next step forme uh and you can see here there are alot of steps i'm not going tonecessarily go through all of them butsome of these steps are we we've kind ofcoded as they're ones that the LLM doeswe've already talked a bit about thecompile errors there are classes ofcompile errors that we're actuallygotten the LLM to do a good job fixingif it's a missing import the LLM does agreat job uh there are certainly otherclasses of errors that it doesn't do agreat job on and you know this is why wehave two passes we also have the compileas the validator for those sort of stepsand one of the things to remember iseach of these steps is some actualaction and then some validation todetermine whether that action wassuccessful or notso how do we scale like we've we've nowwritten a large number of controllersand we know that sometimes LLMhallucinate so how do we create trustright one option reviewing code right ifI can review the code and the code looksgood that's great but reviewing athousand controllers is a lot of code toreview right it may be faster thanwriting the code but we had the LLMwrite the code because it was going totake so long to write the code so wehave a new bottleneckright excuse me um and there are varyingthings we can do um there are members ofthis community who are doing things likecreating llinters for CRDs uh we havellinters for code that that catches acertain number of the problemsuh if we can have a way to generatetests that we trust and one of thethings we have in this particular domainspace is if I have a a known input and aknownoutput then at some level I can kind oftrust that that test works by justchecking that the input and the outputmake sense and even if I'm generatingmocks I can I can have an AB switchwhere I run the test with the mocks Irun the test with the live system and ifthe everything lines up then I'vegenerated a much higher confidence in myoutput um I can also use a different LLMthat knows how to find particularclasses of problems right and this is agreat way um and we can use humans rightbut the key is work out what the humanis good at right so if what I want to dois determine my CRD is an API I'm goingto have to support for forever and Iwant to make sure that that API seemslike one that my team is willing tosupport then I want my best APIreviewers to go and gener to review thatAPI because that's a very hard thing toactually get an LLM today to to to sayis this a good API or notso automating I mean we talked aboutthese jigs their automation what we'vebeen talking about is then automatingthat automation right it's how can I getthe computer to do all the heavy liftinguh and we have a we'll we'll send thecode the code is available it's inGitHub it's open to anyone to review butI will say a couple of key takeaways areyou So we're trying to work out how toscale up controlling you know creatingcontrollers make sure you're solving theproblem you find uh we have re we havemade the mistake on multiple occasionsof thinking oh I need to be able to runthis thing in massive project in massiveparallel so I can return it quickly do Ior can I just let it run overnight umvalidation validation is key right likevalidation is the way that I know thatI'm not building on top of a car a a ahouse of cards that's going to fall downand recording intention and results ifsomething went wrong I want to know whatI asked the LLM to do i want to knowwhat the feedback was and what wentwrong and so it's really important thatyou record intention and results alsothere may not be one true solution rightwe have to deal with both the greenfield new resources that we've never hadto write before and so we can say whatwe generate is the right behavior aslong as it does what a human expects butwe have some of these old resources thatwe had built with Terraform and so weneed to be backward compatible and so weactually have two different pathsthrough our generator because we havetwo different problems and they need tobe a little bitdifferent so OSS parallels you knowoptimize for more simpler code lots oftests you know when we did the whenJustin and I did the initial analysisbefore we even got to LLMs of the Asolution versus the nonAST solution ourrealization was the A solution was goingto be 10 times more code than the magicmodule but the key is that the magicmodule solution all of that code you wasprocessed on everyrequest so even though there's a lotmore code for the a or in this case theLLM solution the actual executing codeis 100 times smaller so when you arelooking at an individual resource thereis much less code for you to understandwhen you're trying to work out what'sgoing on there are a lot more resourcesso there's a lot more code but the theactual code for any given request ismuch smaller and and that is sort ofsomething we've worked very hard in theKubernetes project also like we havetried to make sure that we can mergecontributions from the wonderfulcommunity um and have confidence indoing so and we try to merge thosethings quickly right this is this is anopen-source problem and so a AI and opensource are not actually these completelydifferent things we have a huge headstarthere oh so uh just to conclude quicklybecause I know we're over time uh havewe finished sorry um we're not claimingto solve all the problems uh we thinkthere's a lot of applicability of ourtechnique techniques that we'vedescribed here i won't read out theslide because uh we're out of time umbut if you would like to talk to us moreabout you know how we've uh approachedthis problem in our domain and you knowhow you can solve it or ideas you mighthave that we can better solve it in ourdomain or from your domain please comeand find us afterwards for questions andthank you very much[Applause]2025-04-15 21:58:12.924834 @@�� X#��wA_oIoaW5i-xEhello everybody goodmorning uh my name is Justin SantaBarbara i am a software engineer atGoogle i work on a number of open sourceprojects related to Kubernetes uh one ofwhich is Config Connector and we aregoing to be talking to you today aboutum how we are using AI to generate uhthose thousand controllers that make upConfig Connector this is Walter helloi'm Walter Fender uh I also work on uhseveral open source Kubernetes projectsuh I'm both the software engineer andthe EM on the config connector projectas well as a fewothers awesome so we're not really goingto dwell on config connector too muchthis is really about the AI andKubernetes controllers but we are goingto give a little bit of uh sort of ourdue diligence about you know how did wecome to a place where we ended up havingto write a thousand Kubernetescontrollersso the basic idea of config connector iswe have rest GCP APIs to manage all theGoogle resources uh it's very simplesimilar to what you would do for many ofthe other cloud providers but those RESTAPIs are not inherently Kubernetesnative and so config connector is ourattempt to write a KRM Kubernetesresource model that allows you tocontrol all the Google APIs and thereare similar projects like ACK for AWSand ASO for Azure that have the sameproblem and this means that if there areand there are a thousand different RESTAPIs to control various things we needto have a thousand controllersuh each of which have their individualbusiness logic that is going to to allowyou to do this so you know we have oneCRD and controller for SQL instance wehave another one for IM service accountso then the question is well why notbuild on Terraform and we tried let mestart with we tried uh and the problemwe get is it ends up being a magicmachine you have one core place that isthe guts of your controller and anytimeyou make changes to one resource to makesomething work some other resourcebreaks and that magic machine is reallyhard and the more resources you try todo it this way the more things breakevery time you make a change and itbecomes just too complicated having thisone very huge intricate magic machinethat is the heart of your controller andand we want to emphasize that is not aknock on Terraform this was true also ofother things we tried as well it is itis the nature of the magic machine thatwe become keepers of the magic machine acustomer will say "Please fix thissimple problem which I could fix in oneline." And we're like "Well first wehave to go ask the magic machine to dothis and this and this." And it becomesa week-long uh escapade as itwereso how do we go from magic machines toLLMs and you know as many of us who'veused LLMs like aren't they just theultimate magic machine i ask itsomething i have no idea how it'sprocessing it and then it gives me aresult isn't that the worst observationof what a magic machine isand I think the key here the the bigbreakthrough that enabled us to leveragethe ultimate magic machine of LLMs isthis idea uh Walter actually created itof sort of code as the artifact um theproblem with the Terraform or DCL magicmachine was that that complexity thecomplexity of the magic machine existedat runtime we would ship it andcustomers would run that verycomplicated machine um and that runtimecomplexity made for project complexityand what we wanted instead was we wantedlots of simple code it's okay to havelots of code as long as each individualpiece is isolated and simple when wewant to make a simple change we simply��initiativesawesome um so why non-cord contributionmatters we are going to talk about thatuh but before we start I just want toask you how many of you are strugglingwith documentation bad documentation andyou want someone to fixit Please don't be shy I see a lot ofhands Um so yeah that's that's thestarting of a talk Um so I collectedthese things uh from the people of CNCFcommunity and open source community Andone thing which I really liked at thetop is known code contributors to opensource projects are like multipliers forcord contributions Uh you don't have towrite code to contribute to open sourceUh then Moi says uh that she wasoverwhelmed Uh I felt intimidatedbecause I thought I had to write code orbuild complex software uh to fit in Butthen she came across one of the greatarticle that I'm proud to be a non-codeopensource contributor and that's whatmotivated her and that's the that's thewhole idea for today's talk We wouldlike to motivate you uh to enable morenon-code contributions Um so notnon-code contributions are secret toopen source success They enable opensource projects in many ways Be itdocumentationuh be it someone suggesting better UX uhbe it someone suggesting someone givingthe feedback uh there are so manynon-code contributions which we're goingto talk more about and in today's talkand maybe you can pick up which you wantto contribute to So let's look at onecommon example and this is somethingrelated to me um to me as well Um sowhen I started my career at blinkit uhwe used to graphfana it was back in 20192018 and uh we used to use graphana opensource version and then uh we used touse graphana and we had like a lot offeedbacks uh to graphfana and then wewent to kubernetes forum delhi we hadgraphfana team and we said that let'ssit on the table and let's talk about itwe want to give you feedbacks uh can youcan you please fix this in your uhcodebase can you have more features umso yes This is this is what I want totalk about This is also a contributionSo users ask questions or providefeedback that hey I tried out yourproduct I tried out this thing Maybeonboarding could be better Documentationwas not good Uh maybe uh there are somany open issues Can you can you havefirst timer issues there are so manyways in which you can give feedback Uhmaintainers receive and listen to thatinput Uh do we have any projectmaintainers in the audience by the waywow Oh we have um so yes um I I'm prettysure you you might feel like thefeedback is super important Um then theygain insights and ideas They improve thefeatures and experiment uh deliverbetter software and create somethingwhich users love So this is one of thenon code contribution providingfeedback So uh I just collected somedata So this is composition of peopleattending CubeCon 2024 Paris CubeConParis And uh we have business operationspeople we have executive people we havesales and marketing we have productmanagers we also have professors andacademic uh we have students So you seethose percentage uh is a huge number Uhand these are like those people who aredirectly related to product but they arenot developers So yes this is the dataand um why non-code contribution mattercommunity is more than code for anyopensource project be it Kuberneteswe're going to touch upon that like howyou can start contributing to Kubernetesbe it Kubernetes there are so manypeople behind that project there are somany people working on documentation somany people working on the release teamsand communications so many peopleworking on the issue triage so we havemore we have very uh we have a lot ofpeople who are behind the success ofopen source project Uh known codersdrive impact Um be it the feedbackcollecting the feedback be it uh be itconnecting with the community like weare we all are doing in this room Uhknown coders definitely drive impact Sorecognition isn't just for coders Therethere's a space for everyone in tech Soyes that's why non-code contributionsmatter Yeah Erh well I think uh here wehave a list a huge list that I think isif you can identify with any of thisrole because you have� many like we saydocumentations that maybe you would liketo share your knowledge creating a blogor helping with the translation in yourlanguage uh communication is like youcan see Slack is a very important in alluh open source projects to er likehelping there is very important and ifyou think about it you have thelandscape of the CNCF that is like morethan 1,000 projects we we will focus onKubernetes and open telemetry becausethere is a big project that they have abig thing in this communications but ifyou think about you can apply all thatwe will we will talk in the thousands ofthe CNCF projects that you have andreally they need attentions in is issuetriage uh content moderationCI/CD back fix it and yeah it's uh somany things uh here is an example aboutuh you have the contributor guide thatis very common in the open sourceprojects even if it's a open sourceproject that is in the sandbox We have atemplate that they have thiscontributing guide that they have waysto contribute find an issue testingyou can find it and you can help and youhave also uh more advances maybeoverwhelming contributor guides that isin Kubernetes and open telemetry and Ithink a good starting is try to read allthese read me or all these contributorguides even if it's overwhelming becausesometimes it's like uh somerepresentation that you are going in thehouse of someone else don't you have toread the rules of the house and beforeto try only maybe open an issue and saywow why is this is not working and yeahand also uh some relationships also tryto met the people that is doing the themaintainers or changing to reach thesepeople and that will be a good starterthat you you can see if yourcontributions could be approved it andand advance it because this is like thefirst step now the people try to reachthem try to understand and also read thethe rules of the house and in basis ofthat ask questions on the slack thatthat will be like the first approachthat everyone that want to start uh tocontribute in theproject okay and yeah here I we have anabstraction that maybe focus a littlemore in documentation because maybe it'sa bigtopic and if you think about we have forone side the technical writers that areexperts in this field and for other theproduct manager but uh like maybe in thedebops space there sometimes they arenot speaking each other sometimes that'sthe reason that documentations could betricky because sometimes is thedistinguished engineer that is writingand is very hard to read it and thenthey have this uh usability or userexperience to uh createum uh a quick start Don't that's that'scould be a our contributions Don't weneed more people that from ourperspective try to create a bettercontent thinking on all the the the kindof users that could use the open sourceproject and in open source we don't havethese roles dedicated Don't everyone islikea people that is donating the their timeDon't you you have to think about thatmaybe this is the timing because theytake at times when they the peopleapprove your your request or give incontact but uh yeah uh is is a littletricky but it's very I think the mostimportant is uh you get in touch and youhave this uh learningway and another uh point that I like islocalizations because for people that wedon't speak English we are from othercountries Erh this is this will help toreach more people like maybe we don'thave to translate everything but atleast the basic like like kubernetes andopen telemetry that they have complexuh complex knowledge complex you cantranslate this we have like for examplenice project like the glossaryy thatthey translate the basicsconcepts and uh it's not only aboutsometimes at the open source project forexample we have the that is some samplethat I really like is the deaf and hardhearing group that they already arecreating a resource and for example wecould help them translating to reachmore people that they have h that isbelong from these minority groups uhlike translate white papers uh that isbuilding in other ts it's a lot ofthings that you can do when you thinkabout localiz ations not on�ly translateddocumentations and know only otherdocuments that is across all the taxacross all theCNCF and here we can see about thecontributor ladder but if you think youstarted as a contributor that try totranslate try to help with some issuesissues triage or try to manage in theslack or help with the block content therelease and you start as a contributorand after you h show the resilience tokeep in time because I think the mosttricky is that you have to uh m try tobe on timelike in months one year becausesometimes it's tricky to beconsistent after that you could be amaintainer that uh and also here is agood sample that um we have good leadsinside of the open source community thatthinking about how to grow up thecommunity like for example in Kuberneteswe have building the bro release thatthey try to create a blocks only to showuh what is the caps the caps that itwill be more features that is developinside of kubernetes thinking about thepeople that is new beginnings or thecontributor experience that you will seeit don't uh when you are growing up inthe contributor ladder there also youcan create more projects that can reachmore people about how to help eachother Uh yeah here is like I put thisslide because if you this uh kubernetescommunity is very big and I think it'sone of the best samples about how can beorganizing non-coding experience So yousee the contributor experience is uhthey have this jit hub that is in theslide but each of this communitycommunities management depat event is aproject it's a complete jitab projectthat you can go there if you identify uhmaybe with statistics or try to the helpwith the mentoring mentoring is like youhave to manage reach the people thatmaybe could helping each other it's it'sa lot of project management that youhave to uh go and help and reach peopleuh contributors blogs Yeah it's it'sreally like for example we have uhKubernetes CV and the release blocks andall these manage it's a lot of work thatyou have a lot of people that is workingbehind of that that is adifferent different groups differentthat you can participate it and also youcan replicate this in the otheropensource project even they have aYouTube I put it in the Kubernetes newcontributor orientation that each weekis recording how to start how to be anew contributor inside of Kubernetes Uhuh the last week that is a blog that isa a project that try to uh to list allthe caps all all the new news aboutKubernetes for the people that is uh newuserUm that's amazing information Carol Umso so Kubernetes community has done alot of hard work creating this websitewhich is about non code contributions Somaybe you can scan the QR code and umit's quite detailed and you want to getstarted with any of the known codecontributions We're going to touch uponthat it could be release teams it couldbe documentation you can take part inthat Um the next one is testing which Itouched upon initially as well uh youneed to test to ensure that everythingworks as expected Um and everyone'ssetup is different and that's a strengthYou give the feedback and there arediverse testers they find uniqueusability issues So there are differentways you can help uh the open sourceprojects any open source projects uhunder CNCF or any any any thing you cantest the features you can report thebugs you can verify the fixes you cantry the issues uh you just run theproject you are stuck you can just raisethe issue that this is the this issomething which is not working on that'stesting the features so I'm going totouch upon the issue triage it's aprocess by which it takes and uh itreviews the new GitHub issues andrequest We have a lot of GitHub issuesand request and if you go throughKubernetes project or maybe open totelemetry or any other big project theyhave like multiple issues and therequest So uh you can basically become atriagger and you can organize them to beactioned uh and you can take participyou can you can basically contribute inthat way Uh so it basically categorizesthe issues and pull requests based onthe factors such as priority or urgencySo this gives �you an idea about andinsights about the Kubernetes project orany other project So this is one goodway to get started or acknowledgingyourself with the environment of theproject and how things work across thatopensource project Um so again this is awebsite uh the community has been doinggreat work definitely we have so manycontributors so you can scan this andyou can also you can get to know thathow you can get started with thisUm so we have release teams Uh so withevery with every Kubernetes release uhwe every with every version we have therelease teams who are working hard Umand uh I'm going to ask Carol a questionbecause because I participated inrelease team under coms So I was a comshadow but Carol has been in a lot ofthings Uh she has been in the she hasbeen the docs one Uh she has also beenin the security I guess So Carol hasbetter experience But what I want to sayis release teams has uh a very goodopportunity for you to get acquaintedwith uh the the Kubernetes project andthe whole ecosystem how things work Wehave uh enhancements lead we have comslead release signal lead docs leadrelease notes lead so you can take partin different ways Uh Carol would youlike to share your experience uh workingas a Yeah I think in any release ofsoftware you have a lot of projectsmanagement that you have to announce ablog you have to put you have tocommunicate with other groups especiallyin kubernetes we have a lot of six thespecialist groups but in other projectsyou can have uh the same process uh yeahI think we can that's great Um I knowthis is maybe this is not visible muchbut like this is a visual release pathUh it's on GitHub as well like this ifyou scan this QR code basically it takesyou to the uh GitHub repo of releaseteam Um so it has so many parts Itbasically shows that when uh when thereis a release of Kubernetes project anyversion So there is there are so manyparts related to it dogs coms everythinguh the CI/CD pipelines So yes definitelywe need more people uh on those aspectswho can contribute to these partsUh just one word before I think maybeKubernetes could be a little scarythat's my recommendation maybe startwith a project that you have moreaffinity like if you like a storage orsecurity try to go to this project opensource project have a thousands of opensource project you don't need to focuson kubernetes I think we are justshowing the experience of the peoplethat organized very well incommunications but not in all all theprojects we have souh so many groups that is exist inkubernetes uh another project like opentelemetry that is a project that is uhwith a lot of contributors you can seethat also communications you can go tothe open telemetry community and it'ssimilar to other projects they will havea jitabproject communic uh communicationsexperience or community with this kindof name that you can go to learn moreabout these non-coding roles like hereis the community and we have the enduser sick that is very interesting ifyou think about in open telemetry theend user they are trying touh they are trying to discuss how to bea vendor agnostic don't they do alots and it's is a it's a lot of workthat is not related with coding Um soall of these groups communicationscontributor experience developerexperience they need help and is this isjust a sample that you can apply andreplicate to other projects becauseeveryone needs a user experience intheir projectsOkay Um so I'm going to touch upontechnical advisory group Um so with thename we think that it's going to be moretechnical and we just have the codecontribution but that's not true I'mgoing to share my story So it's theseare communityled groups with within CNCFWe have different groups uh environmentsustainability app delivery networksecurity observability network storageIf you're interested in any of thesetopic you can take part in that Uh itprovides the expert guidance and drivecollaboration across the focus areas Umit is made up of maintainers uh usersand ecosystem contributors Uh there aredifferent activities in which you canget yourself included like white papersbest practices we released the whitepaper regarding the AI uh recently andthere there was there there are so manywhite papers I've worked in environmenttag environment sustainability so thoseare the different areas where yournon-code contributions can really helpthose technical advisory groups uh youcan host the working groups you can hostthe discussions um so so there aredifferent ways in which you cancontribute to technical advisory group Iwas part of technical environmentssustainability uh group and And then uhin that tag I once so recently like thisyear I led sustainability initiative inwhich we had to conduct uh meetupsacross the globe So we had like 22 plusmeetups across the globe Uh so manypeople organizing this meetups meetupson sustainability So many people talkingabout different projects which are undersustainability like Kepler cube greenand I was basically maintaining thatproject So basically I was working as aproject maintainer and that's a knowncode skill which was required for thesuccess of this initiative So this isone of the example in which you cancontribute uh in technical advisorygroups and I'm pretty sure like eachtechnical advisory group will have uh aneed of known coding skills So you canand the best part is it's open to all Soyou can any anytime you can join themeetings you can contribute or you canlearn So where do you find the scheduleuh if you just Google uh you are goingto find the schedule of the meetingswhen the meetings happen for each tagsyou can also join the CNCFs slack groupand you can uh basically look for thesetags and you will get to know uh the youwill get the notifications about themeetings so yes and then we havecommunity building so that's also one ofthe way in which you can participate umso community building is important whenyou have meetups across when you havemeetups you meet like you meetlike-minded people you're working onsome stuff and you discuss with them ormaybe sometimes you have like you havesimilar feedback you want to discuss Sothat's that's part of community buildingUm so you can check this website thethere are community groups across theglobe and you can take part you can meetlike-minded people So that's also one ofthe known coding skillsOkay Yeah We are just ending and youthis is like some tips because we try totalk a lot of people that is contributeswith non-coding and it's really a niceadvice to be successful that you cancontribute and keep in there and maybeuh upper in the ladder as a maintaineror something and lead some group orcreate some nice initiativeinside of CN NCF or other open sourcecommunity and yeah I think the the mainthing is like uh try to choose some sometopic that is related with your job orsomething that you really like itbecause if not it will be very trickythat you maintain your work with somesomething different don't the I thinkthis is like personal advice that I Itry to do I try to join if I am workingon security try to focus in some topicthat is related to my job to be like uhuh helping me also with my career and Iam learning also with the translationssome and also a nice advice is aboutcollaborations and not to be acompetitions and at the end this try tohelp each other when you are inside ateam that you you feel that is helpingeach there is more nice and also you metmore people and like you started to bemore motivated to continue to do itDon't try to be help uh try to help eachother and here are some of the namesthat is uh really like a key figuresthat uh is contributing non-coding hmainly is from the kubernetes and opentelemetry but is so many uh people thatis doing this work behind I don't knowif you want to say moreokay oh this is should very quickly Wehad few resources We are going to uploadthe slides Okay Awesome So we have theseresources You can check these out Uhit's mainly related to Kubernetes uhproject Uh but these resources can helpyou with uh gettingstarted And yeah thank you everyone Uhif you have any questions you can reachout to us on yeah on our socials Sothank you so much everyone for attending2025-04-15 21:58:13.627438 XIX��lZ#��Ah1AyaAIf3HAgood afternoon everybody and thank youso much for joining Rodrigo and myselfthis late on a Friday uh we realizewe're one of the last things between youand your weekend so we really appreciatethat you signed up for our three-hourlecture onwebsockets i'm I'm kidding it's justit's just 30 minutes first allow me tostart with a little intro of the twopeople that's daring to address youright now my co-speaker Rodrigo he's thereason we're here and the visionarybehind Miro's ��Y#��mApPKuJg_6A3kthank you so much everyone Um so that'sGiblified me going with the trend Uh I'mNancy I love products I work as engineerand developer advocate Currently I'mdoing my masters uh from CornellUniversity and previously I have workedat Blinket which is an e-commercestartup I have worked with Gitboard andlocal stack as a developer advocate I ama CNCF ambassador I also founded womenin cloudnative community and that'swhere I met uh Carol and she became myvery good friend and we are giving atalk So I'm super excited for giving thefirst talk with you Uh I also led uhCNCF sustainability week So I am aco-chair of tag environmentsustainability uh for all those peoplewho are concerned about sustainabilityin tech Um I love nature I love cats andI love to travel And if you want to knowmore about me you can scan the QR codeOkay Well here uh my name is KarValencia for uh nice to meet youeveryone Uh I like cloud native securityI used to work in aqua security It isvery security company Now I am inelastic but uh I really like the devopsand cloud native topics Uh I am CNCFambassador also with like Nancy and alsoI am cloud native chapter organizer forLatin America like it will be in thecity that I live that is Pablo and alsoI help with Lima Peru because I am fromPeru don't I trying to help my localcommunity and I am doing also helpingsome KCDs don't I think that's give mean overview how is the community how canWe help each other especially with opensource and on this tech topics and alsoI participate in the kubernetes releaseteam thing to also give me an a viewthat what is some uh of the topics ormaybe how can we start tocontribute even for a people that don'tspeak the language is a foreigner andthen from uh another country that is notnative and yeah that's the reason thateven I help with the Spanishlocalization mainly that in the projectsof kubernetes and open telemetry and ifyou have any doubt I reach my contact tohelp you in any of these ��evolution into cloudnativetechnologies our Kubernetes platform andCNCFadoption aws has published case studieson his work and uh we use them as acheat as a cheat code at Miro since he'smore effective than mostLLMs my myself Andre I'm a newish S surcompared to this titan next to me umI've joined I've I've dared to ventureinto infrastructure and platformingafter a decade of being a productengineer and I will still writefront-end code when indanger like any good story we'd like tooffer a little bit of history set thestage so to speak rodrigo is going towalk us through a brief history of Meroour product our infrastructure where wecame from to where we are today we'llaim to illuminate why long live statefulconnections are so important to ourproduct next we'll take a look at how weused a platform built on Kubernetes tomigrate Miro's websocket manager intoKubernetes from a stateful EC2 serviceto a stateless service on EKS reshapingour edge routing for websocketconnections reducing operational costall while embracing cloudnativetechnologies we're going to dive a biton some of the mistakes we made thelessons we got so that nobody in thisroom has to repeat them and finallywe're going to brag a bit with theresults and hopefully do all of thisfast enough so that we're out of herebefore midnight just kidding it's still30 minutes uh thank you Andrea for thesweet introduction and thank you all formaking this very last CubeCon U 2025 uhtalk uh I would like to kick things offwith a little bit of history about Mir'sproduct architecture and infrastructureuh these diagrams shows a simplifiedoverview of one of our most importantparts of the product and this is howMiru was born a few years ago uh you canimagine Miro as a gaming engine forenterprise collaboration uh currentlyAndrea and I we are here uh in what wecall the mirror board running thispresentation but if we were to inviteyou all into this boards in in your owndevices our back end would be workingtirelessly uhhandling real-time collision detectionlocking and everything that makes uhreal time collaboration and seamlessreal-time collaboration possibleat the heart of this engine sits ourboard server a stateful service crucialfor performance and lowlatence every use every user on a boardneeds to be connected to the same boardserver in our back end over a web over awebsocket connection the stateful natureof this service or this the statefulnature here introduces a significantrouting challenge how do clients knowwhich server to connectto historically we rely on Fabio B aopensource HTTP and TCP reverseproxy integrated with hash corp consolefor service discovery when a boardserver is started it integrate itregistered itself with console and Fabiodynamically update update its routingtable with a pathbasedrouting from the edge to from the clfrom the client edge to a board serverin our uh in our infrastructure as youcan see in this diagram the last part ofthe URL literally represented the nameof EC2instance at at infrastructure side thispathbased routing while functional andintroduced additional challengespecifically clients would have to renrenegotiate with the backend server iffor example the existing the existingserver that was rambling that that's uhthat board went down adding complexityto the connection management there wasalso potential split brain issues thatmight arise on infrastructure side asour internal board registering theapplication logic could go out of syncwith the console service discovery oneon infrastructure side while weacknowledge that Fabio B and console hasbrought us a long way it was time for achangebut before we get there let me tell whatwas happening in parallel at ourorganization as you can imagine ourlegacy architecture was not really cloudnative recognizing this back in 10 2021we embarked in a transformative journeyrebuilding our compute platform from theground up we committed to Kubernetes andAmazon EKS as the next generationapplications launching what we calledour compute platform powered byKubernetes and enhanced by thebest-in-class operators and controllersth�is platform provided a featurerichenvironment for our developers torapidly build and iterate over newfeatures primarily using a microserarchitecture as you can see in thisslide we adopted many technologies thatwe that we unfortunately won't have timeto talk here today for example Carter asour cluster of scaler on AWS and Kyvernuas our dynamic admission controllersitting as a brain for every requestcoming and going through the KubernetesAPIserver but back to our topic here fastforward to 2024 it was time to extendthe benefit of this compute platform toour most criticalworkloads but let's start with an I onehere why if it end broke did we fix itcanvas2024 our mural launch event was comingup we wanted to show some really coolnew features of our product that wouldmake new realtime collaborationspossible forexampledocuments and data tables on top of ourintelligentuhcanvas but clients would have varietyhard limits on the amount ofsimultaneous open connections to thesamedomain mostly because we are talkingabout browsers herebreaking away from a monolithic designmeans clients should be able to openwebsockets to any number of statusworkloads in our infrastructure or inour back end and not all one all to onelike we hadbefore this spawned the need for asmarter multiplexed websocket routingseveral logic connections at the backend multiplexed over a single one asingle physical connection to theclient now we were a collection ofengineers from three different teamsthat came together to create this piecewhile uh Rodrigo and I are the ones infront of you we're by no means the onlyones that deserve credit so firstly wehad our cloud networking team they wereresponsible for existing Fabio LBconsole and the load balancerconfigurations then we had the teamRodrigo and I are part of called computewe were responsible for the EKS clustersthe operators that power power ourmicroser platform and finally the brainsbehind our new smart websocket proxy andour board servers our collaborationruntime team now I really hope thiscontext that was driving the need forchange and the actors and the tools thatwe're wielding made a little bit ofsense because it laid the importantgroundwork for our descent into thedepths of stateful connections inKubernetes and the scaling secretsnobody talks about but before we moveinto our implementation let me unpackthe game you are ch we are playing uh tounderstand the challenge we face isimportant to grasp the fundamentals ofstateful connections and to be honesteach one of the topics that I'm going toto cover now could be having could behaving a talk on its own and they werenot always unique to stateful workloadsas they can also be important forstateless scenario but they will forsure beat usmuch m much early on when longconnections are in place so please bearwith me because we are going to get alittle bit in the indepuh the first thing before moving to anystateful uhimplementation uh we should actuallyunderstand the nature of the protocol weare using each choose each choose casemight differ and might require differentconsiderations regarding to the wireprotocol use like for example whenestablishing a secure connection to adatabase have you ever considered if thehandshake process of this connection isidentical to the one for let's say doneby a web browser while both might beusing TLS the underlying protocol andconnectionmanagement can introduce nuances in ourcase since we are using websockets andthat's pretty much what we are enforcedto use on a browseruh which in the case of websockets is anextension of the HTTP 1.1 protocol thecommunication flow would go first theover the process of establishing anencrypted and secure HTP connectionbetween the client and the server andthen only after that we'll be able toupgrade this connection to a websocketone meaning that we have two handshakesin this process one for the TLS part andthe other for the HTTP upgrade and as Isaid only after that the birectionalmessage that are being exchanged betweenthe clients and the server will be ableto be sent and received which makes atquering new �connections even moreexpensive in our scen in our scenarioincreasing resource consumption and taillatence whenoverload this leads us to the next topicrelated to keep our lives unfortunatelykeep our lives are a necessary evil thatwe must embrace here but while doing sowe need to be carefuland proper configure the idle timeoutsbetween each hop in the communicationflow which may might sounds really uhalready well known right but so we needto make sure that each part in thiscommunication flow uh we reuse aconnection that is never actually closedby the other party upstreamfinally in the in in the Linux level weencounter infemoralports these are the temporary portsassigned by the operational system forclient connections while essential forcommunication they come with inheritedlimits for instance the upper limits ofa given connection tle which is composedby the source IP and the destination IPand port is a bit over 65,000in Linux the range of availableephemeral parts is defined by anamespaced CCTL config and when runningit in a Kubernetespod it's configured by default at around28K as you can see here you can alsoincrease and turn that out based on yournecessities by changing and configuringit under the security context of yourpod spec but certainly back to theephemeral port limits if this range isexhausted new connections can't beestablished leading to connectionfailures that can manifest as droprequests and poor and frustrating userexperience therefore we must account forthose port limits and design forscaling our applications proper originalscaling of components is crucial heresometimes deploying smaller instancescan be more effective than relyingsolely on vertical scaling no matter howmuch you invest in vertical scaling theephemeral ports limits will always betherenow alongside ephemeral ports we alsoencounter another crucial Linuxcomponent to consider contracts orconnection tracking which is a corekernel Linux kernel feature thatmaintains a table of all active networkconnections this table is essential forstateful fires network adderstranslation and other network functionshowever the contract also has a sizewhen this table becomes full newconnections attempts will be silentlydropped impacting applicationavailability and loading and leading todifficult to diagnose issues and onceagain it's important to know thoselimitations and proper size and properaccount for for example for node sharingand neighborhoods on your nodes whichwill lead for distributing the properload asneeded now Rodrigo has drawn circlesaround the constraints we're operatingunder with websockets new connectionsare expensive the timing between thehops needs to be tuned and there arehard limits that prevents us fromscaling verticallyindefinitely given this context let'sdrill into how the first nerves wereconnected we'll be taking a look at ournew in-houseuilt websocket manager howwe routed the traffic there how we keptthe traffic secure for those sweetenterprise dealswhat we have here is an illustration ofour system that will help to highlightthe components as we talk through themand the problems they solved for us solet's start with the star of our showthe real time collaboration commandgateway or RTC gateway our replacementfor Fabio LB and console now the workthe team did deserves a talk all on itsown but a couple of highlights aboutthis application it uses threads equalto the number of vCPUs to reduce contextswitching additionally all inbound andoutbound connections are handled in thesame thread which significantlyincreases performance they also swappedout the default memory allocator for JMalik uh which helps prevents memoryfragmentation that comes from needing tohandle packet sizes of wildly diffingfrequenciesum now that we had our deployment therehow do we get traffic from our usersdevices onto our shiny new RTC gatewaypods the AWS load balancer controllerswoops in to take all the glory here ithas native support for Kubernetes APIslike ingress and services and also whichis what we used in the end a customresource definition called the targetgroup bin�ding that allows us toconfigure how the ALB sends traffic toour Kubernetes bods for those unfamiliarwith what a target group is or the AWSconcepts it can be summed up as aresource that allows us to configure howthe load balancing and and trafficprotocols work to the podsnext now we had our pods deployed we hadthe traffic flowing in how do we ensureall of that data is encrypted in transitwhile preventing our S sur from exitingthe building via windows or the fireescape because they have to rotatethousands of certificatesfrequently here we have manager thatpopped up another CNCF controller thatis invaluable it allows us to abstractaway the majority of the work ofmaintaining the PKIs for our cluster inthis case we could provisioncertificates automatically that is validfor a year they are rotated monthly andthat combined with our uh node lifetimeof max node lifetime of 30 days meantevery pod had a crisp certificate whenit came alivenow if you remember from the previousdiagram Rodrigo showed we moved from astateful service in EC2 to a statelessservice in EKS and one of the firstimportant things that we need to thinkabout is how are we going to handle ourshutdowns how do we gracefully drainconnections so we're going to be zoominginto the configuration between the ALBtarget group and our pod shutdown lifecycle but first of all what doesgraceful shutdown look like for our newRTC gateway ports if we try to bring itback to the user experience we don'twant users to go into loops withstuttering connections where theconnections are opened and closedfrequently while we are scaling down ordoing rollingdeployments and the RTC gateway didexactly this from the application sideby implementing a protocol where theycould send close events to all connectedclients that would then transparently tothe user acquire new connectionsu to existing pots so first we made surethe ALB gave plenty of time for the RTCgateway to do its thing we figured itshould be able to drain all existingconnections in about 2 minutes so wedoubled that as our dregistration delayon theALB next we didn't want these closedconnections to open to an ex to theexact same pod again so using ourpre-top hook we ensured the pod was nolonger in the load balancing pool beforewe started our connection drainingsequences now that our pods areterminating smoothly we could startpreparing for our expected load nowRodrigo has pointed out previously withlimitations like the port the ephemeralports and contracting um how importantit is to lean towards horizontalautoscaling instead of vert verticalautoscaling andtraditionally scaling based on resourceusage works really well for us in about80% of our cases um however since thisRTC gateway handles connections withvarying packet sizes and frequenciesimagine a workshop like this whereeverybody is on the same board versusone person designing a spaceship in hishouse resource-based scaling might alonemight not have beensufficient luckily another way to managesaturation effectively for a proxy is tolimit the amount of concurrent activeconnections per portthe team then ran extensive performancetesting to understand the resource ratioprofiles here like think how much CPUand RAM we need for how many openconnections they were then optimized insize to handle around 8,000 connectionsper pod while they could peak well pastthat uh without seeing any degradedperformance enter Kada the operator thatallowed us to scale based on the amountof open connections now here we can seean example of what a kada scaled objectlooks like um also how we configured itwe optimized for scaling up quicklynotice there's no limit on the up searchscaling down slowly at max one pot every5 minutes also a rolling window sampleto prevent scaling down prematurelyduring metric dips and a cool downperiod of 5 minutes to prevent flappingnow all of this you may say comesstandard with the Kubernetes HPA whatKada gives us it allows us to scale oncustom metrics like we have there underour triggers in our case activeconnections with which allowed us tokeep our deployment well undersaturationle�vels once we had our gracefulshutdowns and autoscaling configured thenext logical step was was load balancingyes how do we make sure every pod waspulling its weight and not sitting thereyawning while all the others weresweatingwe're considering the interaction andlevers available again between our ALBand the RTC gateway and the two viablecontending load balancing algorithms forwebsockets that we considered were roundrobin the classic and least activeconnectionsafter doing research going into my cavecoming back out and feeling like I had agreat understanding of how they'llbehave with websockets I proudlysuggested to the team why we should useleast active connections because theyunlike round robin they prevented theold pods from beingoverloaded can anybody guess what wentwrong coldstarts angry oneslike I made this graph intentionallyvague so the people in the back can'tsee my shame but we were flooding thenew pods with a barrage of websocketconnections i did not anticipate all theCPUbound work like the TLS handshakesthat we needed to do on mass as well asthe latency spikes that we'd need toendure the risk to the customer'sexperience was not acceptable here soback to the lab we wentwe needed to reanalyze both optionsroundroin and least active connectionsto see how we could mitigate thedrawbacks of either to mitigate the coldstarts of least active connections wecould decrease the connection rate butthis isn't simple because slow startdoesn't work with uh least activeconnections and another thing we couldtry was to warm the application but thiswas also not very simple because itwould need us to figure out which codesites are responsible and whichconnections may need to be pre-openedyeah pretty hardso we turned our gaze back to roundrobin how do we stop the old pods frombeing overloaded now if we take a lookat these two players we realize that asingle readiness endpoint re usuallyworks really well for normal HTTPtraffic however the behaviors are quitedifferent and it becomes very visiblewithwebsockets if we're focusing on an oldpod on an old pod when we fail readinesstwo things happened in our currentstate the ALB probe failed which meantno new connections were opened to thispod since it was saturated that was goodhowever if the Kubernetes probe failedthe the pod would be removed from theservice it would be removed from theendpoint slices the AWS load balancercontroller would observe these changesand start the draining sequence on theload balancer the pod would eventuallyexceed the graceful draining period andall existing uh connections would beterminated and since Rodrigo pointed outestablishing new connections areexpensive and the user experience hereis bad of losing the connections againthis wasbad so we split out another endpointspecially for the target group so thatwe could individually signal how wehandle new connections and existing onesthe team then updated the RTC gateway toreport not ready to the to the targetgroup when it went above 10,000connections and ready again when below9,000 while continuously tellingKubernetes that it is ready so we don'ttouch those existingconnectionssuccess no more cold starts loadbalanced and to finish our journey hererelated to HPA uh I would like to guideyou through uh uh through twointeresting uh experiences we have withit the first one is related to theKubernetes HPA algorithm itselfuh the algorithm has a built-intoleranceuh which is by default defined as 10%and in our case as we are using AmazonES we cannot really control or changethat because we don't have you don'thave access to the cube controllermanager settingsuh and perhaps in many cases uh this isnot really important but when we aretalking about scaling based on openconnections and talk about saturationand avoid uh things go wrong in thatscenario that becomes a big deal uh soin our case the developers really wantto scale based on the defined thresholdsor at least have a predictableunderstanding of where what thatthreshold will be uh so in this caseit's pretty much about understanding uhunderstanding this tolerance andaccounting for that when defining yourthreshold in this cable objects uh andthe second issue here uh shows howsensitive the HPA uh can be regarding tometrics that would sharply flap and whyyou should actually define scale scaleup behaviors on your scale object or inyour HPA itself as Andre pointed out inthe beginning when he showed uh the uhthe scaled object uh definition wedidn't have a scale up policy in thebeginning uh and interesting enoughthere was a talk back on Wednesday fromone of the maintainers and they werediscussing best practics in terms ofscaling in in Kubernetes and their firstgood practice advised there was to usescale uh scale up policies which wedidn't do at that time uh but specificabout our pro problem here uh after alittle bit of B blame gaming between uhour observability uh components and KAitself went through the code of the cubecontroller manager uh the code for theKDA uh Prometheus uh scaler and alsoadding a proxy between KDA and um andour Prometheus read end point we uhactually uh find out and were able tonarrow down that our monitoring systemwas actually misbehaving here and givingaway wrong data points for the queryeven though if you were quering thathistorically in graphs or or indashboards later on that misbehavinguh misleading information was not thereanymore uh but this is just brings evenmore visibility into how uh the HPAsystem would blindly and happily followthat misbehaving uh data uh and then thesolution is pretty simple here uh it'spretty much about defining uh scale uppolicies uh which by default it comeswith 100% of scaling alone 100% ofscaling up that's why uh we got thedouble of of of of pods in a given uhscale scale event and here we mitigatethat by controlling and and and reducingthe amount of pods that could be uhcould be created in a given minute uhtimeframefinally now we've rushed through thisjourney on how we moved our websocketmanager into Kubernetes we'vehighlighted some of the operators thatmade us possible as well as some of thescaling and balancing errors we I meanthe secrets we found let's take a lookat ourresults here we have active connectionsper pod over a day in one of our regionsthere are three events I would like todraw your attention to the yellow linemarks when reinforcement starts beingcalled in when our averages startsgetting around 8.8K the red line markswhen the older pod stops reporting readyto the target group the blue line markswhen they start reporting ready againwhen this happens we can see how therate of new connection sharply increaseson the newer pods but by now they'reready warm and basically shouting tauntsat the ALBour in initial connection latency wentdown 10x allowing us to keep it downkeep it real realtime and also moving from the statefulsetup on EC2 to a stateless setup on EKSallowed us to save around 40k per yearwhich made our PHOPS team uh smile andwink at us and now that I've rounded outthe results I'd like to hand back toRodrigo for a sneak peek into Miro'snext steps in our cloud platform and thetheory goodbye thank yougoodbye[Applause]well here uh just to tell you that uh weare happy to share that after a fewyears running containerized envircontainerized workloads in production ontop of Kubernetes we are ready as anorganization to take one giant stepforward and for this specific part ofarchitecture here covered here today itmeans that we are also moving our boardservers to kubernetes uh which is adirection that we are having as anorganization in terms of compute andcloud uhconsolidation uh and by the way if youare ever uh find yourself wondering byAmsterdam please come by to our officewe will be happy to host you there uhand now goodbye thank you yeah thank you2025-04-15 21:58:14.165596�omers uhwe usually find outside in the internetand you know we have been doing this fora long time uh so there are like two bigfaces like the inner developer loop andthe outer developer loop today we are alittle bit more focused on the innerloop because again we wanted to talk todevelopers or try to talk with platformengineers about what developers aredoing so uh in general when we talkabout the inner loop it's all about thathow a developer reads a requirementunderstand the code changes that theywill need to do in their systems andthey are working on and then we'll gothrough this loop as many times as theyneed to make sure that they areimplementing what is required there forfor the work that they are doing andthey it all starts with code right likeyou need source code somewhere and thenyou need to be able to compile or justrun your code if it's not the compiledlanguage and then write unit tests tomake sure that you are doing whateveryou're supposed to be doing and thenwhen things gets complicated doing adebug session might be needed and theseare things that developers do every dayand they need to be able to do this asfast as possible but we are at CubeConso we know that when we add Kubernetesinto this mix things get a little bittricky rightso who are you uh that's a difficultquestion so my name is Mara Salatino iam a software engineer and a CNCFambassador i work for a company that'scalled Diagrid we are working with theDapper project and I joined that companyspecifically because of that this isright between platform engineers anddevelopers i wrote this book platformengineering on Kubernetes consideringmyself a developer who has been doing alot of Kubernetes over the last 10 yearsso if you're interested in the platformengineering side check the book there isa a repository with a lot of hands-ontutorials that you can follow withdifferent CNCF projects who are you myfriend yes I'm Thomas Vitali i work at acompany called systematic in Denmarkreally passionate about anything cloudnative and Java related i also wrote abook it's called cloud native spring inaction with Spring Boot and Kubernetesand with Mauricio we figured out that wehave this relationship between platformengineers and application developers sohow can we help each other improve bothsides of the equation so that we candeliver value to the customers fasterand better and that's why we would liketo share that we are working together ona new book it's called DeveloperExperience on Kubernetes right we justannounced it a few days ago the firstfew chapters are out for in early accessso if you'd like to read it and sharefeedback we'd be happy to receive thatwith this code you also get a 45%discount yeah that's it and again Ithink that like uh kind of likeimportant thing to mention there is thatagain because we have been in in thisspace and in the developer space forsome time we keep talking to companiesthat are doing different thingssometimes those things are not optimalsometimes those things are a little bitbetter for the developer and sometimesthey are just making the developers lifevery very complicated if you're here inthe Kubernetes space my experience mypersonal experience is to talk tocompanies that are pretty much slowingthat slowing down developersproductivity and we want to fix thatright yeah so let's start with uhdefining a development environment so wewant to start working a new feature howdo we do that now basic thing sincewe're working with Kubernetes so we workwith containers is having some kind ofcontainer runtime so today we're goingto demonstrate Podman desktop it's anopen source uh project uh under uhdonation right now as a CNCF uh sandboxproject and once we have a containerruntime uh we can start building up ourdevelopment environment so first of allI have my uh podman desktop applicationI can run containers so just like anyOCI runtime like docker I can runcontainers from the CLI uh but sincewe're talking about Kubernetes at somepoint I'm going to need a cluster nowwith Podman desktop I can create acluster very easily in two differentways out of �the box i can use twoprojects one is called uh mini cube andthe other one is called kind i havealready created a cluster using kind ican show you how it works i can justgive it a name i can specify a provideris it podman is it docker and I can evenmake sure that I have an ingresscontroller so every uh service that Irun on the cluster are also exposed onmy local computer which is great fordeveloperexperience all right we have a containerruntime we have a Kubernetes cluster ifwe want to we can uh uh visualize theresources in different ways onKubernetes podman comes with a very nicedashboard out of the box that you canuse to visualize all the resources onKubernetes you can even use a headlampthat was uh uh announced the other dayduring the keynotes so you can have aheadlamp directly in podman desktop andwe visualize all the differentKubernetes resourceshowever once we have this environment weneed to open up our project andtypically we have an issue becausewhenever we uh define a developmentenvironment we we have to also thinkabout how we share it with the team andmaybe we have some documentation we haveinstructions on how to set up adevelopment environment which tools toinstall which versions and what happensis every developer will have a verydifferent development environment ontheir local machine so how can we solvethat problem we want to have a baselinewhere all developers in the team havethe same uh tooling right the same codethe same approach so we can perhapsdefine our development environment ascode and we have uh different options wewant to mention two uh of them today oneis development containers it's an openspecification that is used by differentplatforms for example GitHub code spacesuses it defbot you can run it from yourfavorite IDE maybe Visual Studio Code orJet Brains IDE or you can even usesomething called dev file def file alsouh provides this specification as codeand you can uh define a developmentenvironment directly on Kubernetesthat's right in this case I want to showa development environment based on devcontainers i have Podman already as acontainer runtime and I'm going to use atool called devpod devpod is reallygreat it supports this uh standardformat for defining uh developmentenvironment but what is cool about it isthatbesides pointing to some kind of gitrepositories or even a local folder todefine this environment it's reallyportable across any kind of environmentso of course I can run it locally onDocker or on Podman but I can choose torun it in the cloud or maybe I have aKubernetes cluster so it's really reallyconvenient no matter the infrastructureyou have the same exact developmentenvironment portable across all thesedifferent types of environments and Ithink that this kind like makes a lot ofsense when you have like tons ofdevelopers working in different teamsand you need to make sure that they areal all using the same tools that reducelike you know Yeah differences betweenenvironments but also it makes sure ithelps to for developers just to getstarted faster yeah and we also want tomake sure that developers use theirfavorite IDs the ones that they are themost productive with and that's also whyyou can choose your favorite ID maybeyou're working with a AI agent so youcan use cursor or you can use VisualStudio Code or IntelliJ uh in this caseI have defined a workspace for our demoapplication and it's running in uhVisual Studio Code directly in thebrowser so I don't even need to installmy IDE and then from here this is a Javaproject i don't have to install Java idon't have to install any toolingrelated to this application so what Ican do directly is run my spring bootapplication and that's it so as adeveloper it's great maybe I just joineda new project i do a git clone operationon my uh repository for the project Ineed to work on and then using devpod orusing any of the other tools thatsupport dev containers I can just getstarted working on a feature that's itthat's great yeah that's good andremember like this what we are showinghere is just starting an application butthis is where you sort o�ut things likecredentials to internal repositories uhsecrets to access like the you know likethe shared company internal system thatyou need to see the requirements and allthese kind of things that for adeveloper joining a team it's just sucha pain you know like it can takes twoweeks to figure it out yeah exactly andalso to use this at scale maybeproviding uh this kind of service uhfrom the platform you can think of uhembedding this dev containerspecification as part of a template inbackstage if you're using backstage oranother uh developer portal so wheneveryou bootstrap a new project youautomatically automatically get all theconfiguration needed to start working onthat project without having to installany tool the only dependency I have onmy machine right now is a containerruntime that that's all I need as aminimumbut okay we got the project now we needto start working right we need to startneed to start working on the realfeature value doing some stuff right andagain like as I mentioned before uhremember like the inner developer loopand the outer developer loop so if wethink about like you know removingKubernetes from the development tasksset of tasks that the developer is doingthen we just need to make sure that weautomate all the way to running thatapplication that the developer isworking outside containers outsideKubernetes to a Kubernetes environmentthe other approach and things thatcompanies are doing is like okay let'smake sure that developers can have alocal environment where they can runthings in containers and in a localKubernetes cluster or maybe like in aremote development environment but againadding Kubernetes to the inner loop willcomplicate things in some different waysi'm not saying that that's a bad thinguh it will reduce the differencesbetween how you run the applicationlocally and remotely but it will addsome uh complexity and because again weare writing a book that it's aboutdeveloper experience this session isabout developer experience what we havetrying been trying to do is to look outthere see different communitiesdifferent languages different tools thatwill help you to go faster uh and themain problem that we are talking abouthere is that when you are working on anapplication as a developer right like aJava developer a Go developer you onlycare about your application code and theruntime that you need like the Goruntime the Java runtime the noderuntime uh if you're adding containerson top of it now you need to understandhow containers work how do I create acontainer how do I run it how do Idistribute a container is also animportant part and finally again if youadd kubernetes on top of it now you needto understand how kubernetes work whatcubectl or cube control is and how touse it how to create the jamos for itagain we are adding things on top of theinner loop that will make the life ofdevelopers complicated this is for me inmy personal experience learningkubernetes had taught me a lot aboutapplication architecture and how to dothings uh but when you want to scale upa large organizations of you knowthousands of developers this becomesquite an important problem to solve souh when I talk when I think about likeuh transforming source code tocontainers there is like this CNCFproject that is called Pilpax thatbasically takes any language and usingthe CLI you can create a container uhthat is easy to use there you go sorrysome lagah where are we there yes so the idea islike you have a CLI the CLI willunderstand which language are you usingand it will create a container image foryou uh removing the need for thedeveloper to create a docker file tocreate that container image this againmakes a lot of sense when you'rethinking about like a platform teamenabling developers to go faster becauseagain you remove something from them butalso you have the tools from theplatform side to create the right imageshere right that can run in a productionenvironment and when I think about likedeveloper experience uh on the innerloop with Kubernetes I do think aboutGoogle co uh which was created by Googlenow it's part of� CNCF how many peopleknow about this tool likego oh just a few okay how many Godevelopers do we have in the room notthat not that many that's good anyways Ithink the point of showing this in at atCubeCon I thought that most Godevelopers will know about this but theidea of showing this is to show otherlanguages developers what's theexperience that they built here veryearly on in the Kubernetes journeyremember that most controllers were uhwere built using uh Go so they needed toiterate fast right so what I wanted toshow here is I have a Go applicationhere so you can see just a normal Go uhservice i can do go run and get theservice up and running go run uhappointments go if I run it there youcan see that the service is startingthat basically means that I have the goruntime but it's failing because itrequires a postgrql database to runalongside this right uh developers ingeneral will do something like you knowum podman compos app or docker composapp just to get the infrastructure inthis case what I will get running is apossql database locally that I can uhlet me dothis app dash d there you go and now thepostquestql uh database is runningbasically what I do can what I can donow is connect to the database right sonow I have my service it's connecting toa database all good i can start makingchanges and it will work but whathappens if I want to run this inKubernetes first I I need two thingsbasically i need a container i need topush it to a registry and then I need acluster where I will be running thiscontainer right a bunch of other thingsuh to connect to so that's where cobecomes really useful so you can use uhsomething likecoield right and look at this so thefirst thing that I'm doing here is uh cothe name of the go application and thend-platform all this is super importantdevelopers needs to create a containeryou want to make sure that you don'tcreate a container that is specific to asingle platform because when you want torun it in a cluster that cluster needsto be running in the same platform to beable to work so you want to make surethat the image that you produce it'slike available for all platforms okaythat will not work that's not set upwith the project but that's fine thatwill be the container right the nextstep is to make sure that you can deploythat and again co makes it very veryeasy so you have a deployment file inthis case you will need to find a way togenerate this for your developers if youwant to make them go faster rightbackstage is one option where you canhave that as part of your template butif you see here the name of the imagethat is being used it has this uh coprefix that basically will help go tofind and scan all the jam files andreplace this reference with the imagethat is published by co- build so when Irun co it will build and publish theimage when I run co apply like ubectlapply but instead of qctl it'sco right I'm applying my jaml files hereto a cluster so I have a kind clusterthat it's running this is going to buildthe image and push kind like the thefile here into the cluster thedefinition but before doing that it willreplace that with the hash of the imagethat I'm just building right now andagain it's not going to work becauseit's not the docker variable can youexplore that quickly yeah let's doThere you go so basically the the onlything that it's missing here is the youknow the the repository the Dockerrepository where it's going to send thatthere you go and now this is greatbecause we are working on an Applesilicon computer but the image generatedwill also work on AMD uh architecturesnot just ARM 64 so if I'm connected tolike a remote Kubernetes cluster that isrunning on AMD 64 the same image willwork that's the whole idea right so nowwe have an image but again there's noneed to build the image if I don't wantto do like that step I can just run thiscommand and this will automaticallyidentify if there are changes if thereare changes it will build the image pushthe image and then apply the changesinto the cluster right so if I do nowubectl get bots you can see that I havethe appointment appl�ication up andrunning this is pretty cool again itsimplifies all the steps that I need todo to get something running inKubernetes and creating containers anddistributing containers and there aretons of options that you can tweak inthere uh which is amazing i can go andchange now the application and pushagain and get it up and running yeah butwe can do even better right we can doeven better and that's where a scaffoldcomes into play right how many peopleknow about the scaffold here yeah so wehave some folks that's good so here likeagain this is great but I need to dolike you know go apply and send that andevery time that I make a change I can dobetter i can create have somethinglooking at my files and every timesomething changes it will just push andbuild the container and that's withscaffold scaffold dev so what this isdoing again is uh just deploying thechanges there looking at my files as youcan see it's even you know tailing thelogs for me to see if something breaksyeah so all those uh commands that yourun manually now they're all automatedbut scaffold can be used for uh more uhvarious types of applications so in thiscase we are using co for golang but youcan use it with build packs you can useit with a docker file if you havehelmchars it also works with that orwith customize so it's a really uhversatile tool that can be used acrossdifferent language stacks which isreally great and then you just runscuffle dev and you focus on the codeyou don't have to worry aboutcontainerizing and deploying doing allthat stuff yep that's that's a very goodstep forward all rightimportant to mention yeah again like youcan use scaffold with any language whichI think it's great so basically itbrings that developer experience acrossplatforms yeah that's great it'sautomated we are kind of mitigating someof the complexity here both thecognitive load and the slow feedbackloop but it's not always necessary tohave Kubernetes running locally so do wereally need it for local developmentwhat do you think folks do we needKubernetes for local development orwould you rather not so I see somethumbs down yeah but what's the questionyes or no yeah yeah up or no it's finelet's answer both options are great butbased on the context we don'tnecessarily need Kubernetes right wehave lots of different ecosystems outthere already with a great developerexperience in Java in Python in Nood.jseven in Go we can establish this verynice automated development workflows sothat where you start a command and thenyou just focus on the code thateverything will be updated automaticallyunder the hood and then we can combinethat with some tooling that helpsintegrating with different services sofor example the go applications you justshowed requires posgress and we havethis separate command to spin up aposgress uh database but we could usetest containers in order to uh provisionall these uh dependencies as part of theapplication life cycle so let's have alook we can use test containers acrossdifferent languages maybe in Java I wantto define a rabbit and Q container or inPython I have keylo in go I have apostgrql container what happens is thatwhenever I start the application or Irun integration test even then I getthis container up and running so I don'thave to worry about running a composfile beforehand and remembering also todo a unprovision a clean unprovisionoperation afterwards i I think thatthat's the difference like here you getlike a life cycle management on whenthese containers are created so ifyou're running different kind ofintegration tests you may need differentcombination of containers to start andstop and I think that test container istrying to solve specifically thatproblem yeah so I have here a jarapplication right now that also uh needsa posgress i don't have to uh install itum let's see so I don't have to run itum I want to do it from here it's goingto look nicer so I have a Javaapplication demo application i can runit so I can see now it's starting aPostgresQL database is provisioned forme i can show you that I'm not lying uhlet's go to Podman and we ca�n see thatthere's a a PostgresQL uh container thatyou created earlier from compos but wealso have a posgrql uh container downhere the flavor is pg vector it's anextension to posgress for doing uhvector store operations so if you'reusing AI uh it's quite nice uh but thenI can just call the application so I'lldo ithere let's see uh London baby now let'smake some changes there you gokubernetes baby and spelled correctlyand I don't have to restart anything butI can just send uh the request right andit will be updated automatically so Ican just enter into this uh continuousdevelopment loop and focus on my codeand everything will be updated under thehood for me including the dependenciesbut that works great with Postgress orother third party dependencies but howabout the APIs that you might integratewith because maybe your teammates areworking on a different application andyou want to integrate withSo Micros provides this capability oftesting against APIs doing this contracttesting so whenever you have anapplication integrating with a rest APIor a messaging API you can actually uhinclude micro via test containers aspart of your application life cycle soyou can really focus on developing yourapplication in isolation you don't needall these extra dependencies that justmakes the developer experience worse andthe feedback loop slower so in this casemy application is calling an an externalAPI it's called the friends API i evenhave an open API definition for this APIthat the other team working on thisapplication provided me with so just byusing this open API definition now I canrely on Micros to run automated test formy application against that API and makesure that my application is compatiblewith that API when I call it so I canshow you in the test that I'm using testcontainers to define a micro service andthis micro service will provide all themocks based on my friends open APIspecification so I have now a solutionworking both with third partydependencies but also with other type ofservices that uh other teams in myorganization might be working with yeahand this is pretty interesting becauseagain what we are trying to do here iswe are trying to give different teamsthe tools that they need so they don'tneed to kind like wait tons of time justto run integration testing right in thiscase you are running contract testing tomake sure that your service is workingagainst a very specific version ofanother service that maybe other team isbuilding without the need of even juststarting an instance of that serviceyeah but I mean we are focusing on theinner loop but eventually we really haveto hit production that's where thesoftware will provide value right so howcan we deal with these serviceintegrations in production yeah I thinkthat like service integration is a veryinteresting topic in the sense of againwhen you rely on external services thirdparty services or even managed servicefrom cloud providers you need a way toyou need a way to actually simplify theaccess to these APIs to developers rightyou don't want them to start learningabout each new service that they aregoing to use and even for simple thingslike how many people is using Kafka herein the audience okay so we have someKafka users or Rabbit MQ or ActiveMQ youknow like there are tons of thesemessaging broker systems that areamazing at doing what they do they do itat a scale but the main problem is as adeveloper if I don't use it I haven'tused it before I need to learn about allthe details on how to use it and I thinkthat that's where the Dapper project uhthat I'm working for is extremely usefulbecause it actually provides APIs fordevelopers to do things that they needto do in their cloud native applicationshow does it work so imagine that youhave applications in any language rightdapper is this blue box in the middlethat it's between your applications andthe infrastructure that they might wantto access and that diagram I understandthat's complicated it has a lot ofdifferent logos with a lot of differenticons as well so let's make it extremelysimple you have applications thatprovide APIs to do stuff and then youhave a bunch of complex infrastructureover there you don't want all yourdevelopers to learn about how thatcomplex infrastructure works so you canjust provide them APIs for them toimplement things and let's use thatmessage broker idea like Kafka again isamazing of doing what it does it it doesbut sometimes it take me so much time tolearn that I will spend two weeksfiguring out the right parameters tocreate a connection to the cluster inthis case what Dapper does it actuallyexpose these two things that are downbelow right publish events and consumingevents as a developer I can understandthose APIs I want to publish an event Iwant to consume an event right so that'skind of like what I use from myapplication and then I use Dapper in mycluster to connect to the Kafka clusterso it can actually move messages betweenapplications so how does it look thefirst example is with Java you can seeDapper client publish event i need abroker name and then a topic name andthen I just send an object there and itwill be serialized and sent to whateverinfrastructure is configured behind theAPIs the example in go is down thereit's the same thing dapper uh Dapperclient publish event now you'republishing event from Go applicationsagain we have some examples in arepository i don't think it's worthrunning it and we're running out of timeuh so let's close it up yeah sodeveloper experience uh is really allabout enjoying what we do as applicationdevelopers right so we really want toget the best tools the best experiencebecause it's our daily job we spendevery day doing that mhm uh what we uhare identifying with Mauricio is kind oflike a map of all the 10 main frictionpoints of the developer experienceacross the entire path to productionbecause each of these point isassociated with challenges so we reallyI'm trying to aim at having some kind ofuh um map or guide that can also help usprioritize how to improve the developerexperience based on what is the most uhmaybe critical challenge because thatalso varies every organization isdifferent and so there's no one singlesolution we show some tools today theymight work for you or they might notthere are other tools that might uh fityour purpose better so it's all aboutreally understanding what are theproblems as application developers andthen finding the right solution based onthat and today we have covered the innerloop right but actually the outer loopis also something that we need to fix ifwe want to improve developer experienceif you are pushing code and waiting forintegration pipelines to run and you'renot getting the right feedback or evenif you don't have access to the rightinformation about how your applicationsare running in production then you areactually slowing down your developersyeah and within the CNCF we reallytrying to get some more uh applicationdeveloper view or point of view in theCNCF landscape as you know is very bigbut not everything is relevant fordevelopers so we are co-chairing a agroup together with our friend Daniel Oabout application development uh withinthe tag app delivery in the CNCF so ifyou'd like to join uh we are really uhlooking forward to get more support andhelp from developers to reallyunderstand the problems like this verytrack this uh presentation is part of anew track at KubeCon about applicationdevelopment this is the first time thatKubeCon has a track dedicated toapplication development so we hope tosee more and more developers joiningthese events joining CNCF events uh andif you want to uh collaborate uh help uswe are currently working on uh differentuh initiatives right now to bring moreuh developer perspective into the CNCFprojects join us you can find us onSlack uh we meet every two weeks yep sowe are really open to any kind ofcontribution we want to hear from you ifyou're application developers if you'replatform engineers as well uh we want toreally help uh bridging the gap betweenplatform engineering and applicationdevelopment and thank you so much forstaying thank you so much thank you somuch2025-04-15 21:58:14.701918 ��][#��qAnvKpg3JgSjshello everyone hello everyone thank youso much for staying this long with us uhwe were know when we're planning forthis presentation we thought okay peoplewill be leaving to the airport look atthat we have a lot of people here sothank you so much for thanks for joiningcoming uh presentation so thispresentation is about you know developerexperience at the end of the day it'stitled breaking barriers that soundsvery AI chatgbt generated it kind of isbut at the end of the day what we wantto talk about it's like you know uhapplication developers how do we maketheir life easier the things that we cando with uh CNCF projects and what'sthere in the cloud native space andbasically the gaps that we are seeing inthe ecosystem yeah let's ask first sowho is a developer here raise your handapplication developers amazing yes thisis good okay who is a platform engineerokay you know it's mixed yeah audienceyes yes good stuff so again as like uhmy previous presentation I think thatThomas your presentation was prettysimilar uh we are very interested ingetting developers talking to platformengineers so you will see a lot of thisuh content around that what do platformengineers needs to know about developersand what developers needs to know aboutthe platform side of things so when youare in a developers role youconceptualize what you need to do usingkind like three building blocks in thiscase you have ideas or requirements thethings that you need to change in yoursoftware you will go and do some workthat's translating the requirements intosomething that will run somewhere andthat somewhere it's going to be usuallya production environment where yourchanges needs to be actually propagatedto or promoted to right that's going tobe basically in front of your customerand that's going to make your companyget some money uh and produce some valueright or just help someone withsomething that they are trying to do ifyou drill down into actually how do youtransform requirements into somethingthat runs in front of your cust��rce technologies thatthat serve those platforms and Nuno andI found common ground around animportant concept and technology that wefelt was missing in this landscape andthat's the notion of an applicationmodel or an application platform Thatgap matters because developers sort oflive and breathe applications it's whatthey design and build and support umit's what they reason about day-to-daybut in the world of distributed systemsand and cloudnative technologies it'sgotten increasingly difficult um toreason about uh an application as anentity and even in a lot of cases agreeon what an application is um and thatambiguityuh in addition to all the otherchallenges around cloudnativedevelopment just adds that the cognitiveload that developers struggle with andit's exactly that load that the IDPs areare designed and built um to address soyou know we felt that that was animportant gap to fillum and so platform engineers will havean open- source project to draw fromjust like they draw from so many othertechnologies on this slide um and onethat allows them to add an applicationmodel to their IDP um so developers havea more application ccentric kind ofexperience and that's where um whereRadius comes inso radius you can think of at thehighest level as a cloudnativeapplication platform it allows you todefine an application one time and thendeploy it across on premise AWS andAzure you can use it standalone or withits companion technology Dapper which Imentioned earlier but more importantlywhat we're seeing most early adopters ofRadius do is integrate it um as uh Nunoand team are doing into their existinginternal developer platform so they canoffer that application model and enablethose kind of application centricscenarios and we've we've worked with alot of uh platform engineering teamsover the last several months and thefeedback we're getting is reallyconsistent so our strategy from thebeginning with Radius was uh for it tobe a C CNCF project open source andcloud agnostic and the feedback that weget from platform engineering teams thatthat is total alignment there likethat's table stakes pretty much forevery team we talk to and then there's afew other killer sort of baselinefeatures that platform engineering teamshave kind of have brought us along aboutthat we weren't as as focused on one ofthose is GitOps integration um Flux isthe first GitOps integration we haveparticularly for Nuno's team it was apriority argo will be coming soon afterand then the second big feature thatplatform engineering teams educated usabout was the requirement for Radius tobe extensible at the resource level sothe ability to produce custom uhapplication resource types that allow aplatform engineering team to exposeresources in exactly the way theirdevelopers need to use them so if it's aa cloud storage resource for example theability for a developer to just ask fora particular type of storage and say Ineed small medium or large instead oflots of detailed configuration so thatthat level of customization is really isreally critical as well and there's fourways you can think about how Radiusmight help your application teams andyour internal developer platform effortsone is that Radius improves enterpriseapplication team collaboration part ofthat is by enabling developers to reallyfocus on their app and a lot less um uhconcern with with infrastructureconfiguration and deployment now at thesame time Radius offers a feature tooperators called infrastructure recipesthat allows operators to in advancedefine exactly how infrastructure willbe defined uh if if an app is deployedon premise or if it's deployed on AWS oron Azure they define that up front andthen developers get very much kind of aselfserve uh access to those uh uhinfrastructure resources and we'll see ademo of that in a second then theapplication graph is also a feature thathelps teams collaborate and it's superuh powerful feature every time youdeploy a Radius application it creates agraph that shows every aspect of theapplication so every container databasecache frontend backend and how they'reconnected� so it makes it trivial foreverybody in the team SRRES developersarchitects um operators to know exactlywhat's been deployed into production umand then lastly it's cloud neutral as Isaid before so you have a consistentexperience deploying across on premiseAWS and Azure and integration with yourexisting GitOps workflowsso I'll makethis more concrete with a couple ofdemosum and in the first demo I'll show howradius can be used to deployapplications unchanged first we'll goacross on premise and AWS um what you'llsee is uh on premise the applicationdeploys and leverages a reddis cache ona Kubernetes cluster and AWS a recipeautomatically deploys uh memory DB uh inthe AWS environment so we'll walkthrough thatum and you'll see me running the demothrough the Radius CLI um but obviouslyin a production environment those wouldbe GitOps workflows so let's kick thisoff so I've already set up a radiuscontrol plane on premise and in AWS andyou see I have two workspaces one foreach of the environments and as ImentionedRadius sorry aboutthat radiusum me start that overso Radius knows how to deployresources using recipes and what you'llsee here is onpremise I have a um a biceprecipe and that's going to deploy Reddusto a Kubernetes cache or a Kubernetescluster excuse me and then on AWS I'vegot a um a Terraform recipe that willdeploy uh the memoryDB and so I'll put on my developerhat and actually go look at theapplication definition and you'll seethe application's simple it's got afront-end container and a Reddus cacheand those are both native resourcessupported uh in in Radius the front-enduh container has an image a port andthen this really powerful feature uhthat is a connection connections enableRadius to do a lot of work behind thescenes for the developer includingmaking those explicit connections thatshow up in the application graph butalso injecting details like connectionstrings and credentials uh into thecontainer as environment variables sothat that tells the container how toconnect to Reddus no matter where uhReddus isdeployed so back in the terminal we'llrun raduh deploy or RAD run the app and that'sgoing to deploy theapp and give us a port forward so we canactually see the application working andyou can see the deployment's inprogress the cache is deployed the frontend is deployingokay so the app's running now and we'llnavigate to the port forward tolocalhost3000 and when the app opens we'll be ina in a container info tab and that'sgoing to show me all the environmentvariables that Radius automaticallycreated in the context of thatconnection that we were just talkingabout then we can click on the to-dolist tab and just see the applicationactually working right so we're going touh enter a to-do item integrate radiusinto your IDP everybody should have thaton their to-do list um and then it'sjust basic functionality right i cancomplete the task and delete the taskand so just so you see that theapplication's up and running then we'lllook at this Radius dashboard which iscool this is a backstagebased UI thatallows you to see everything that's inyour Radius environment yourapplications the recipes that you'reusing to deploy the app and then thisguey um uh representation of theapplication graph that I mentioned nowthis is a super simple graph right it'sjust two nodes but you see that explicitconnection between the cache we deployedand the front-end container that wedeployed um in a more complexapplication that's when you really seethe power of the graph if you thinkabout multiple team members working on abig complicated application seeing itrepresented in this way is super helpfulso that was the local deployment that wewere just checking out now let's deployto AWS we'll we'll deploy it to AWSusing the the terminal on the left andthen on the right we'll deploy redeployto that local environment we're going touse raddeploy command here and that justdoesn't give us the port forward butwe'll look at the application graph inthe context of the CLI this time becausewe want to I want you to see thecomparison of what gets deployed on AWSon �the left and what gets dep deployedlocally on the rightum so we the deployment's done we'regoing to run rad app graph and what youwhat you'll notice with AWS on the leftlocal on the right the front-endconfiguration or deployment is the sameon both sides what's different is theinfrastructure that got deployed so umyou can see in the bottom left hand umpane AWS specific resources memory DBcluster and subnet group for example onthe right is just the Kubernetesresources for running Reddus locally sothat shows you hopefully how easy it isto use Radius toseparate applications from theinfrastructure that's deployed to enablethose applications okay sothat's that's one quickdemo and what we saw in that demoremember there was a container resourceuh and a Reddus cache those are nativeresources that are supported by Radiusand those are helpful they get you upand running there's also databasesmessage cues and the like supported inthe box with Radius but what platformengineering teams really need is theability to create custom applicationresources and cataloges for thoseresources and that kind of customresource enables platform teams to buildreally customized developerexperiences and custom applicationresource cataloges with browsableintegrated documentationum and model sort of any type ofresource it could be abstract like a webservice or something really concretelike a document uh DB and those thecustom resources function really as acontract between the developers that areconsuming these things and the platformthat's um that's providing them so thenext quick demo um shows uh how wecreate a custom re radius resource anduse it to add a new feature to the sameto-do app that we've been showing andobviously it'sIt's 2025 so we our demo app has to haveAI so that's what that's what we'll addthe feature we'll add using a customresource so what I'll show is how toextend radius using this custom open AIresource um how to define it and thendeploy it with a recipe um and then as adeveloper how I would use that resourcein the applicationso here's the same to-do listapplicationand you see there's a feedback buttonbeside the add button now and so we askcopilot for feedback but it fails rightbecause there's no we haven't added thatsupport yet so the first step is tocreate a new resource type definitionand it's called open API or open AAIexcuse me my company.app app is thenamespace and then we have someboilerplate codeum plus a capacity property this iswhere I define that all my developershave to indicate is a s a t-shirt sizingof a small medium or large which istheir preference for interacting withwith a custom resource like thisso now that we have that de thedefinition in that file we're just goingto upload that file to create to letRadius know that this new resourceexists so we'll use the resource typecreate command here and this just takesthat file that we were just editing anduploads it uh to Radius and thatbasically extends the Radius API so itknows about this new resource andsupports it just like a native resourcethat we saw in the earlierdemo so then we'll register a recipe forthis open AI resource just like we usedrecipes in the previous uh demo um andI've got a template that deploys a GPTturbo model using AzureOpenAI and we'll use that templateso there's the recipe so we go back intothe app now so as again with the withthe developers hat onum we'll we'll add this resource to theapplication and you can see VS code andcopilot now understand this resourceright it's it's treated as a first classcitizen uh by the the developer toolingexperience and it's prompting us for allthe required properties that weredefined in that resource filenow thecapacity was not required so thedevelopers adding that capacity valuenow and they get a nice tool tip rightthat says you can choose a t-shirt sizefor thatresource so capacity set the last thingas we saw before right we add aconnection we add a connection betweenuh this container and the new AIresource so we get that the co-pilotintegration that we're looking for inthat front-endcontainer so we'll do rad j�ust like wedid in the previous demo it's going togive us that port forward we'll go seethe new application installed this isrunning on Azure so Radius will createthe model using Azure OpenAI right it'sAzure specific infrastructure for thisdeployment the cache will be Azure cacheforReddus the front-end container getsdeployed on a Kubernetes cluster radiuscreates a service account exposes thatservice then here back in the UI on thecontainer info we see that new resourcewe see the environment variables thatwere injected based on the creation ofthatconnection then we can go into the appand test that co-pilot integrationso again we're integrating Radius in theIDP everybody put that on your listwe'll ask for feedback this is funny tome uh C-pilot says we've got a complextask ahead i we're finding it's prettyeasy uh to integrate Radius so I thinkthere's some model training that wouldbenefit Copilot but but you see how easyit is to use Radius both to deployacross different environments and toextend it for use in a custom IDP thanksman cool stuffright as yousee on the bottomright our story didn't start with radiuswe have a lot more boxes there on thatdiagram so our storystarts in 2021 we started aprogram that we ambitiously called from8 days to 8 minutes the ideawas all of the build and deploy to allenvironments of a micros serviceshouldn't take more than eight minutesand the driver there was reallyaccelerating servicedelivery we needed to push away fromthis guys building uh some infraresource for that guy to deploy the appbut still we needed to respect the factthat those are two different lifecycles and not only that we wanted towell it was ambitious and it isambitious we wanted to cover not onlyinfra stuff but the whole IT life cycleso registering the app in the CMDB forinstance and all of the stuff you canimagine we as a regulated industrycompany need to do for everything werun but again given that time and moneyhave limitations we did not want toreinvent the wheel we did not want tosay to people tomorrow you'll use thisbrand new thing goodluck and we had stuff in place like wehad and still have a big Terraformmodule library everything isTerraform orwasso the most important delivery weidentified at time was sure we know howto do infrastructure as code we knowthere are many tool sets todo continuous compliance driftreconciliationuh all of the typical IT resource lifecycle management things what we'remissing is application as a first classentity application as an API and becausenow we've we allknow how softwaredevelopment should kind of work we knowversioning we know deployment patternswe know all of that stuff we wanted todo the same forinfraso back in21 this was what we started buildingi want to do a new micros service for mybusiness domain sure you've gottemplates on a web UI you pick atemplate you fill out the name of thethe name of the thing the uh businessdomain it's going to run under topologytechnology dependencies yeah it's goingto need cache it's going to need arelational database it's going to domessaging you know whatever and besidesall of the cool stuff we give thedevelopment team automatically when thisis filled out repos pipelines securitystuff all ofthat we capture that in afile json no everything and everyoneknows JSON thisfile which sits on the code repo for theapp is editable anytime by thedevelopers says these are my app detailsmy app can be exposed to the internet ithas these operations optionally you canput uh SLOs's under each one of thoseoperations you have messaging detailsschemas the the app roles it it has anadmin role it has a reader role it hasyou know you name it does it need radiscache persistency details whatever it'sinJSON whybecause after we have thatfile back in 21 we would convert a partof that to open application modeldefinitionOAM and pass it along to GitOps FluxCLLA and infrastructure as code toolsets and we were almost therestill the gap was enough for us to gettogether and do a lot of whiteboardingfor almost three yearsand say "Okay now we know the gap weneed to cover the gap between thisdesign and this one." And well andbasically the the gap is what Jonathanwas showing that that sort of userexperience and userdefined types aseasily as you saw today that needed tobe brought into thismodel so today what we're doing isbringing peopleover i've shown you an old JSON filenow this is what we're trying to do surewe have a CLI that converts JSON toYAML you by the way did not see YAML onthe previous demos you saw bicep rightand some people may think oh yeah bicepanother Microsoftformat we do YAML samething and if you look at it it's astandard Kubernetes CRD it's it's calleddeployment templatesand this deployment template is justpicking up the parametersfor the recipe in this case this is justa ReactJS basic app so the parametersare things like it's an extra smallapp image repo is over there it doesobservability with Prometheus and acouple of other thingsand that bit the the so line 29 thetemplate is the JSON representation ofthe bison you saw earlier the same sortofthing for those of you that are kind offamiliar with Azure you might recognizethis is a an ARM template withextensions now it's importing theradius and this file YAML file GitHubsprocess and so on does what you just sawin the demo spins infra infrastructureup wires it to the app well and thingsrunso if you're a developer on your localmachine you just register the version ofthe static front end recipe that runs itlocally even if you're calling out toOpenAI the Radius control plane is localto you so you run it locally if you aredeploying it to shared infra sure use arecipe that takes advantage of thatshare infra crossplane terraform all ofthathowever you use it so if you do cubectdlon a pipeline if you use githops if youuse customize to specify values for eachenvironment if you have a helm chart foryour app just drop the deploymenttemplate into the helm chart you versioninfrastructure as part of theapplication that's the deliverableso where's thebox it's just anotherCRD but you get the infra for thatversion of yourapp what we've learned so far from thisenable the right conversations you nolonger talk about do I have the databaseschema that I need for this version ofthe app you talk about feature testingprogressive rollout and all of thatstuff you should be talking about myopinion and you look at best of readimplementations you so you have fiveteams doing this is how I spin up mydatabases pick one and get everyone onthat pattern put that pattern to yourrecipeand because everyone's doing it the sameway with as much knobs as you want themtohave predefined SKUs predefined SLAs'sand independent life cycle so infro guysknow what they need to support it's inthe SLAs in thecontractsdevs i just want a medium-sizedcache you know what this throughputmemory size all of that needs to benow do whatever you want with thatinformation but do it the rightway and we all know how to do it forapps do it forinfra solve the right problems with theknowledge that you have for the app appsapp definitions push ifra aligned withthesame maturity that you have to pushcodes that's what we've been trying todo and that's what we're doing more andmore you want to wrap it up sure thanksman thanks everybody for the time so oneof the key kind of messages that thatNuno and I wanted to share is that wesee this this application model kind ofcapabilities that Radius brings to theopen source community as a generalpurpose requirement for most IDPS if notall IDPs so it it's a plea to join thecommunity we've seen a lot of companiesgo out and kind of roll their own typeof abstraction similar to Radius in alot of ways we can avoid a lot ofredundancy and get a lot of value out ofcollaborating together um so you can getto the radius documentation from this QRcode that also has links to our GitHubto our community activity um informationabout monthly community calls so pleasejoin us if you have questions reach outon Discord we got a very active supportchannel and we'd love to see more folksuh from this room uh involved so thankseverybody for the time thanks Nuno thankyou2025-04-15 21:58:15.228698 ��\#��AAZmcZlDCYDgEhello everyone welcome to how MillenniumBCP leverages radius to empowerdeveloper and operatorcollaboration i'mNun i work at well Millennium BCP andI'm the head of public cloud and with meI have I am Jonathan Smith i'm the headof product management uh for the Azureopen source incubations team atMicrosoft and honored to be on stagewith my friend and colleague Nuno sofirst of all show of hands are youfamiliar with Radius besides the networkprotocoloh good nice that's good that's a starwe'll increase that number over the next30 minutesdo you want to share a bit about whatthis is sure sure so we will we'll do acouple of things here i'll start off andgive some basic context about whatRadius is some demos of how it works andthen Nuno will walk us through some ofthe cool work he's done at NBCP uh toadopt Radius as part of their internaldeveloper platform including gettingworkloads to production on Radiusstarting in last December um so so herewe go with some context then so so as Imentioned I'm in the Azure open sourceincubations team at Microsoft and we'vedonated um several projects to CNCFtoday we're going to focus as we said onRadius um but hopefully you'll have achance to take a look at some of theother projects that are listed here uhDapper and Kada there's a lot of folksthat I talked to at CubeCon that arefamiliar with those they're alreadygraduated CNCF projects there's newerprojects I won't touch on here too muchbut just take a look at Copacetic yousee in the lower left corner and DSI twonew super interesting projects likeeverything else on the list here theyare available uh freely on GitHub uhunder the Apache 2.0 their license so ifyou get a chance check themout so in the context of Radius Nuno andI will talk mostly about Radius featuresthat help platform engineers like histeam uh build IDPs that better servetheir their developer and operatorcustomersum and I get to work with a lot ofplatform engineers and they've taught mea ton about like the challenges theyface and the platforms that they'rebuilding um and the way different opensource technologies help accelerate thatproduct development and I think this uhgraphic from Canoe um does a great jobsort of laying out a canonical umreference architecture showing a lot ofthe common elements in an IDP um and alot of the open sou��reis is to allow for progre moreprogressive rollouts of feature flags Soit allows more granularity that you youcouldn't do with a binary release Youknow you could slowly slowly slowlygradually enable a new feature featureor functionality of your code And withthat kind of comes the risk reduction aswell You know you have a more safer wayof enabling them and even quicker wayusually of disabling If you figure outsomething goes wrong into productionjust you know quickly disable thefeature ramp it down and the problem isis is mitigated And you know usuallythat takes a bit less time than it wouldwith a normal binary release And lastlya very very big advantage is the allowallowing you to experiment right so avery very common use case of featureflags are uh is AB testing Um you knowhow do we use them at Google i think youknow we use them more or less the sameway as anybody would do it But I wantedto point out that you know we startedusing at Google uh this for a long longtime So way back since 2009 feature flagit was the first time that Google was ggiving it a try at feature flags Uh Iwas at that time I was very very youngjust caring about video games and stuffSo uh you know just Google was like yeahlet's do feature flags So uh you knowbut here I am to talk about you knowsomething that happened you know it'sbeen happening for a while is what I'mtrying to say Um and this has prettymuch become the de facto way of you knowof introducing new functionalities andreleases U most of our developers usefeature flags So it's about 70% of themyou know use it at least like on aregular basis and I would bet that youknow most of them have used it at leastonce you know at least once And withthat kind of comes also like the hugelarge code base of feature flags thatthat we have So as you see there is morethan 150k uh but with that you know Iwant to clarify what active featureflags means is how many of them exist inthe code base not how many are in a inthe middle of a rollout because one ofthe things people like to do withfeature flags is introduce them One ofthe things people don't like to do withfeature flags is clean them up So uhthat is the least favorite thing of anydeveloper to to do the clean up part ofafterwards And um one of the interestingthing or tidbits that I can share fromuh from Google is that YouTube a whileback had this very very funny policythat if you wanted to introduce a newfeature flag you had to clean up twoother feature flags in uh in in returnto kind of just to to start cleaning upthe the code base and keep it like uhmore uh well-maintained sort of to sayUm so yeah and the way we use it ispretty much the way that I justexplained before gradual rollouts ABtesting uh and it allows us for a moreway of safer safely you know changing awhat is like pretty much a very verylarge code base So I think that's areally crucial thing for us to make surethat uh we prevent outages and we canmitigate them very very very quickly Umso let's take a look next at like howyou might use uh a particular uh featureflag in your in your code like how youmay have come to know it or befamiliarized with it right so let's saywe have this react example over here youknow we want to introduce a feature flagwhich you know will show a new messagein our uh you know in our homepage orsomething like this and you can see thatwe are using this flag called newmessage you see there on the second lineit's use flag new message and if theflag is enabled then you know we'regoing to show So the new message elsejust show the classical old way of theold way of themessage except um what if you know whatif somewhere in the flag managementsystem where we initially introducedthis feature we call it in a completelydifferent name maybe the developer wasout of sync with the PM somethinghappened and it's just like a completelydifferent name and you know when tryingto query the value for this flag wecannot find it in our flag managementsystem then what well in that case u youknow you would default to the codedefault value which is true And in thiscase what happened is just beca�use wewere let's say we weren't payingattention we completely enabled thefeature overnight We enabled it for allof the users instead of like gettingthat functionality of gradually enablingthe enabling the feature And that kindof begs the question when you have aflag service in an app you know who'sright you know somebody has a value oftrue and then in the you know your flagservice the flag is kind of disabled Soyeah who is right in that case who hasthe absolute source of truth of what thevalue of the flag should be uh evaluatedtowards And with that kind of comesthose multiple sources of truth that Iwant to talk about Uh we have this youknow we have this drawing over here I'mgoing only to focus on the green partand the blue part and Mike will go alittle bit in depth into like what isinside the this very nice blue box Soyou have kind of like the service andthe app and that's kind of the wholetrick with feature flags that it allowsyou to pretty much change something inthe service change something you know inin the flag service which in turn willimpact the behavior of your host of yourhost application So your app is queryingthe flag service and trying to figureout like what is the value for yourfeature flag And as far as the flagservice which is serving the the theflag to the host application is usuallyupdated very very frequently or orrather should I say propagated right wewant as soon as we make the changes ofour flag to be you know to you knowenable it or ramp it up or you know rampit down We want that those changes to bepropagated pretty quickly which usuallyis the case with the with the flagservice that the changes are propagatedin a pretty quick manner Uh then we haveyou know our our app which you knowusually is updated on a on a more lessfrequent cycle at least less frequentthan you know we would think about thethe flag service right you might haveyour uh weekly or you know few otherdays you know release cycle for uh youknow for for your app and that's kind ofinteresting because the flag is kind ofreferenced in both of those places andyet it is being rolled out in adifferent release you know in adifferent release cycle So that kind ofputs us in this space where you end upwith maybe the flag existing in the inthe in in in your flag service but it'snot yet in your host app or maybe theflag is in your host app but it's not inyour uh in in your you know flag serviceor what have you So then you know whatshould the value be or what is the theactual perspective of what we shouldevaluate the value towards and you knowwith that kind of comes you know theidea to really properly think about likehow do you you know how do you actuallyintroduce how do you roll out the flagand the proper life cycle of a of afeature flag So I've put here like a fewa few of these steps So you create theyou create the flag right and you youknow you introduce it in your flagservice and it's going to be in yourhost application as we discussed you'regoing to manage it you know slowly rampit up the feature Maybe quickly ramp itup and then nothing We stop therebecause we've established that peopledon't like to deprecate and clean up theand and clean up the flags No I'mkidding If we do our due diligence andwe try to do our job to the very veryend we deprecate it We make sure that weremove all of those references from theflag from the code And afterwards whenthe flag is no longer present in the inthe code we're going to go ahead anddelete it from our flag service as wellBut where should we update it first likewhere should the flag first be presentshould it first go into the code shouldit be in the in the flag service andwhat we've come to realize at at Googleis that um it's very very you know it'svery very probable that if you introduceit into the code first you know youmight evaluate it to some default valuewhich is not really really uh reallyreally safe But if you introduce it intothe flag service first then uh whathappens is usually you know the flag ispresent there but it's not affectinganything So the general perspective thatwe have is that the flag� must first bedefined into the flag management systembefore it's even allowed to exist intothe host application So we completelydisallow a flag existing into the hostapplication uh if it's not present inthe in in in in the flag service and howto check that right because you knowthis can kind of be a pretty tediousthing You know you as a developer mustcheck you know if it's there even theflag service if it's in your app or whathave you And uh we kind of have to stillsomehow reach this eventual consistencyuh when introducing the flags like whatis present in the flag service should bepresent also like in your hostapplication or what have you Um and wein with those problems also came likewith all those challenges not justsomething like you know what if the youknow these are things which we actuallycame across Uh so you know just to showyou like a few outages which you knowthey are just human errors and canhappen to uh to all of us I will not saywhether one of those three examples wascaused by me but uh you know uh it couldhappen to it it could have been me forall I know So one time like somebodyjust had like a flag name and they justaccidentally added like added like atrailing whites space and that prettymuch caused as you've seen in theprevious example a few slides back therejust was an unexpected flag value andthat just had like a you know droppedthe usage of a particular feature andthen there was like a mismatch likesomebody was defining like what thedefault should be in in the code butthen that didn't really match with whatwas the default in the flag manage manamanagement system So overnight a featurewas like fully launched and there was anoutage related related to that Andlastly what I was explaining before likesomebody was in their flag service Oh Ithink I I think we've cleaned up all thereferences to this particular featureflag We are now ready to remove it fromour from our flag service And they justremoved the flag service And whathappened to the flag it defaulted to theto the code default you know defaultvalue and you know like it just prettymuch caused another outage and it youknow pretty much disabled that u youknow that feature because we weren'treally careful but when we're talkingabout careful I think as developers wealways think like yeah it's good to becareful but it's good to even just tryto not even consider carefulness we justhow can we help as developers to to justsolve the problem before we even need tocheck into the code of all the flagreferences make sure all these thingsare in sync it should just be smoothRight So the perspective of like uh whywe're here and this talk and Googlewhat's the perspective of it is you knowthat we thought we came up with you knowwith the idea to introduce the codegenuh so we're generating all of these flagaccessors uh type we're generating typesafe flag accessors based on u based onthe flag configurations and that kind ofreally helps out uh really helps out ina multitude number of reasons you knowone is that the fact that it allows youto you know to mitigate all of thosemistakes of you know putting the wrongflag name and things like that and italso offers you a direct source of truthof what the flags which are able to bereferenced in the in the in the binaryuh are So at Google what happened whenwe introduced the codegen uh wecompletely disallowed reflective accesstowards the you know to towards the flexSo the only way to access that is withthe is is with the codegen tooling andthe type safe accessors and you knowthat that's the you know complete andexhaustive list of flags that you canactually access in your binary and thatpretty much allows you to kind of checklike what is present in the binary andmore easily uh validate that check andalso when you're deprecating the flagthat's also very cool because it stopsbeing codegen So if there's any sort oflike stale reference to the flag thatyou might have missed or you might haveforgotten about you know your costapplication will not compile and youknow that that would be like a greatthing and with all of these learnings wewant�ed to take all of this stuff to theyou know to the open source that I thinkit's a really beneficial uh reallybeneficial tooling So you know for thepast you know half of a year or so youknow u I me and like Mike we've beentalking about like how to develop like acore something which could fit into thecontext of of open source and you knowgladly like the uh open featurecommunity was welcoming me with likevery open uh very open arms and I wasjust having a great time and to to learnabout what we've been working on in openfeature I'm going to hand it over toMike who is going to talk to you aboutthe amazing thing that we worked on uhfor the past uh for the past months Allright thanks Lauren Um so yeah I'm goingto talk about open feature real quick Itis a CNCF project we're incubating nowUm and it's basically an openspecification for feature flagging It'sreally trying to uh unify the SDKs theway developers interact with featureflags and it works across you know manydifferent vendors You can hook it intoyour homemade solution as well Um andit's even a way to like wire in multipleuh providers simultaneously So reallypowerful um abstractions and it's reallyan ideal way to use feature flagging ifif you're just gettingstarted Um why would you even want tostandardize though so one of it is toavoid vendor locking You basically havereferences your code all over the placeand it can be quite challenging to uhlike migrate depending on your businesscase Um the other one is just building acommunity around this Like I I get theopportunity to work with Floren nowbecause we're doing this all together Umit's it's actually quite exciting to seehow many people care about feature flagsUm and it's quite fun to like learnother use cases and and challengespeople are running into And then basedon that community we're building nicetooling that we can share with everyoneUm and that's what we'll talk abouttoday Um as you can see the open featureecosystem is quite large at this pointUm hopefully we have your tech stackcovered If not uh let us know Um we'readding these all the time Um and yeahreally uh really happy to see thisecosystem expand Um so let's dive intothe architecture of open feature This isgoing back to uh the the slide we sawearlier Um on your right the flagmanagement system can basically beanything Um on the left is yourapplication of course but we'll dive inand look at the orange section So thisis kind of where open feature fits in Onthe left side here is our SDK and on theright is what we call the provider Andthe provider is basically the interfacethat will talk to your other systems Umyou can easily build your own if you'dlike but we also have like hundreds atthis point in ourecosystem Uh let's look at a Noode.jsexample So at the top you see wherewe're registering the provider Thisshould happen once in your applicationUh we can create a client then and thenuse the client to to basically interactwith your feature flag In this case it'sa silly example of course but we'reusing the with cows flag and thencontrolling the the council output basedon that But one thing you'll probablynotice is we're using hard-coded stringsThis has been basically the status quofor years Uh most people do it like thisUm some ways to work around this is tobasically hardcode your own const filethat does this and so you can do it in amore unified way But you know afterchatting with Floren and others itbecome very very clear that this is notexactly what you probably want Uh sothat's where the open feature CLI comesin This is a brand new initiative Wejust started the SIG probably threemonths ago or so Um so it's somethingwe're meeting weekly Um we're activelydeveloping this stuff and it's I thinkit's going to be the obvious way tostart interacting with feature flags inthe future Uh the open feature CLI is acommand line tool We're really trying toimprove the developer experience Wethink there's a big opportunity thereand there's really like three corecomponents to it You know first is likehooking into the flag management systemWe're going to make that vendor agnostic�So it's one that should work with anytool including hopefully your ownhomemade solution if you have one Um wealso have this concept of like a localmanifest So we want to be able to fetchfrom that flag management system andkeep a reference of the flags in your uhrepo And then using that we can do codegeneration There's also a lot of otheropportunities that we won't touch todaybut I think uh the future is quitebright with this relatively simpleconcept Uh let's take take a look at thethe user flow So usually what you woulddo is you'd create a feature flag Inthis case we're just adding it into thislike hypothetical management system Uhgoing back to the with cows example Umyou can see here this is the key thatyou would typically reference in yourcode This is a thing that maybe a PMwould create the feature flag in theirtheir tool and then maybe an Ajuraticket or something put the link to theuh flag that you need to reference inyour code It's right there Um and thentypically you would add some kind ofnice description but again that's inthat management toolThen we basically would fetch that lastknown state So we would pull from thatflag management system And this is uhjust a JSON file that represents likethe base of the flag Um it doesn't haveany sophisticated targeting or anythinglike that It's just basically the flagkey thing that we just saw a second agoAnd then other useful metadata like whatwould what happens if something fails Umand and this is the source of truth foryou know the behavior that would beexpected in that caseUm then we can go ahead and generate aclient So in this case we're justrunning the CLI We're running thegenerate Node.js command And you can seethat we use that manifest to spit out inthis case an open feature TSfile And all that changes with all thatwork is basically the one line here Soyou see it's it's basically identicalbut instead of calling the kind ofunfriendly boolean value and passing astring we have the the width cow methodnow So really like it's a it's a smallchange It is but it improves thedeveloper experience quite significantlySo you can see at the top that's what itused to look like and then down below isis the new stuff that we can do with thecodegen and it effectively eliminateshuman error Um it also provides somereally really rich IDE integrations andthen as Floren mentioned it basicallylets the compiler fight some of ourbattles in terms of code cleanup andthings likethat So let's give it a quick demo Uhit's usually risky to do a demo at aconference but we'll uh we'll see how itgoesUm all right Sohere there we go Off to a good start Allright Um here's our demo application Umyou can find it on uh the open featureor if you're interested It's a toggleshop and it is a convenient place to buyswitches and toggles Um as you can seehere uh we have a little banner hereIt's teasing you know free shipping onorders over $50 And this is controlledby a feature flag Uh if we dive into thecode real quick it's uh hopefullyeveryone can see this Um this is thecode for the landing page And we can seehere we're using kind of the old styleuh of of you know interacting with afeature flag And and just to show youhow it would work maybe we remove partof thisuh uh the the flag key here So I'll justremove that And then if we go back andgive it a quick refresh you'll see thatthe banner is gone now And that'sbecause we're basically defaulting backto the old functionality or the defaultvalue because the key no longer existsAnd so feature flags typically do notimpact at least they're designed to notimpact runtime behavior negatively ifsomething goes wrong Um so you'resupposed to have good defaults but thismay not be what you want to serve tocustomers Um so that just shows how easyit could be to have typos Um going backto here uh if we look at our flagmanifest file we can see we actuallyhave a reference now to like this uhoffer free shipping feature flag And ifI go ahead and just open up ourterminal and generate in this case we'lldo the the React generation You can seethat we're spitting out some uh Reacthooks here And I'll just show you thediff real quick This is the generatedcode for this new like React accessorAnd because we have all that additionalmetadata we can generate some prettynice uh in this case JS docs to get moreadditional information that I thinksignificantly helps uh improve thedeveloperexperience So going back to here let'sgo ahead and just replace thisoneAnd so it's the use offer free shippingBut the nice part is if you go hereuh offer free shipping It even likeautocompletes and you can see it righthere Imports it for us And if you hoverover it we can see there's again the JSdocs We have all the reference We havethe reason for why this exists Um and wedidn't have to change anything else Itjust basically ties in is is exactlylike you'dexpect If I were to go ahead and save ithopefully yeah the banner is alreadyback so everything's working And just toyou know prove that this isn't some kindof demo magic we'll go ahead and turnthis feature flag back off again Andhopefully if we did everything right isgone again So you can see that webasically just adding this nice you knowdeveloper experience using the code genusing some of the lessons learned youknow from from what Floren was mentionedat Google and hopefully applying thesebest practices in a way that bas youknow anyone could use without having todevelop itthemselves Moving back to thepresentation just just to kind of bringit home Sobasically once this thing shows againThere we go So yeah today we basicallytalked about what feature flags areHopefully to you know get everyone up tospeed on what we're talking about hereUm how Google uses feature flags andsome of the lessons learned You knowsome of the pains that they'veexperienced over their 10 plus years canbe applied to open source software andhopefully everyone can benefit from thatAnd then basically how that all cameinto what we're seeing here with theopen feature CLI Uh and really wanted toinvite people if you're interested thisis our repo Um you can take a look atwhat we have so far Um definitely youknow give it a star add some comments uhand then join us Really like this is anactive community Can you move it backcuz I think people are trying to YeahSorry about that I'm moving too fastYeah Yeah please Yeah So you know whilewhile you're scanning and everythingjust to just to kind of to elaborate alittle bit like we're looking for youknow two things basically like you knowwith if you'd like to try and integratethis and play out and give give it a trywith the cogen and how it works we'dlove to hear some feedback on like justto understand also like the user journeyand you know maybe if we're missingsomething or there's something which wekind of missed we'd love to just workand understand like how we can make itsuch that you would hopefully want tointegrate this into the future And thesecond thing is yeah like we're alsolike very very open to for people to tocontribute uh you know hopefully me andMike was working in such a way that itshould be pretty easy for for otherdevelopers to uh to come on board and ifyou see that we're maybe uh you knowmissing support for for your favoritelanguage or framework or uh what haveyou like again we have those things uhon a weekly on a weekly basis right nowSo we just really love it for you tocome with us So uh make sure you scanthat Yeah Yeah Absolutely Yeah Yeah Andthen just like there's a lot of otherresources if if you're interested injust learning more about open feature Wehave openfeature.dev We have a GitHub orthat has like 60 repos at this pointIt's it's growing quite uh quiteaggressively And then we're very activeon the open feature or the CNCF SlackYou can find us under open feature oropen feature dash like whatever your ffavorite uh technology or uh orframework is Um yeah thank you very muchYou've been a great audience Please scanthe QR code if you'd like to providefeedback Um and we'll be around here fora little bit and we'll also be in theproject pavilion um until about 2 So ifyou have any questions uh love to hearfrom you Thank you very much2025-04-15 21:58:15.763957 ss��m]#��AmewXGSwDCE4hello everybody Uh welcome to our talkUh today we are going to talk about typesafe feature flagging in open featureand kind of the lessons that we werelearning across at Google Uh my name isFloren and I've been a software engineerat Google for three years and my workwas kind of revolving around featureflags and making rollouts across GoogleCloud Platform safer And for a while nowI've been an active open sourcecontributor in this project that we'regoing to talk about today Hi there UhI'm Mike Beamer I'm a product manager atDinatrace I've been you know in softwarefor quite a while now with a number ofdifferent roles I'm a active open sourcecontributor I'm also a co-founder ofopen feature and on the governancecommittee That's very cool So I'm goingto start off with one question towardsall of you Uh I'm wondering who here isusing feature flags in their code uh andwho's here has used a feature flag intheircode Why no I'm kidding Feature flagsare really really cool and hopefully ifyou're using them you get to keep usingthem and uh yeah so you know just towalk you a little bit through the agendaI'm going to talk about for those of youwho don't know what feature flags areI'm going to talk a little bit aboutthem I'm going to talk about how we areusing feature flags at Google uh ourapproach and a few lessons that we'velearned across the way and some findingsthat you know we thought would be usefulto share with everybody uh and then Mikeis going to walk you through the codegeneration open feature CLI the thingwe're here to show you today and youknow stay tuned until to the very end tofind out how you can use it and moreeven importantly how you can alsocontribute to it if you like what yousee here Um cool So what is a featureflag um feature flags kind of representthis uh very very cool way to uh youknow kind of dynamically change thebehavior of your binary without creatinga new binary release every time Right weare allowed to dynamically enable ordisable a particular a particular uh aparticular feature and you don't have tomodify the the source code Um so what isthe advantage of something like this uhI think the biggest advantage that the��erwith your name written onit well they are issued by recognizedauthorities and include uniqueidentifiers such as physical attributesand a passport number they also usewatermarks and laser perforation toproveauthenticity this same concept appliesin the digital world instead ofpassports we use digital identitydocuments like x509 certificates toauthenticate humans andworkloads instead of governmentauthorities we have certificateauthorities subject names and the serialnumber serve as unique identifiers andinstead of watermarks and laserperforation cryptographic signatures areused machine identity management iscomplicated let's see why that ismachine identities greatly outnumberhuman identities about 45 to1 and theycan't rely on traditional security likemfa also their creation is oftenspontaneous anddecentralized they tend to accumulateexcessive permissions and outlive theirpurpose finally a small change inpermissions could break critical systemsbut machine identities aren't alone intheir complexity today's infrastructureas a whole presents another set ofchallenges organizations are jugglingmultiple technology stacks they're alsomanaging a mix of cloud services andon-prem systems as these systems growmore interconnected and developmentaccelerates one thing becomes clear weneed these different systems to worktogetherseamlessly that is why spiffy secureproduction identity framework foreveryone was created the cncf graduatedproject is an open-source frameworkdesigned to provide a secure andstandardized way to manage identity incloudnativeenvironments a uri which serves as thespiffy id is crucial for uniquelyidentifying your workloads in yourinfrastructure there are some specificconstraints in place for example the adsymbol cannot be part of the uri themeaning behind the path is leftopen-ended and is the responsibility ofthe implement todefine at shopify this is how westructure the uri to give us enoughmetadata the trust domain is in purpleand the subject path is in yellow thisstructure captures our workload identitypool the identity of the google serviceaccount and the corresponding projectall within the subject alternative nameof thecertificate we use common expressionlanguage to format the path to ourdesired structure danny tell us aboutspire yeah thanks so to implement spiffyyou can use spire it's also a cncfgraduated project stands for spiffyruntime environment you get agentsrunning on every node managing the fullidentity life cycle for you while aspireis the reference implementation spiffyconcepts can be implemented throughvarious platforms and services googlecloud is one of the officiallyrecognized integrations at shopify weuse google cloud certificate authorityservice for our private certificateauthorities and we leverage identityreflection for federated workloads toobtain a spiffy id this means we don'trun spire agents ourselves but we getall the benefits of standardizedidentities across ourinfrastructure when adopting spiffy idsthough we did run into some challengesone notable example has to do with ourkafkainfrastructure by default kafka requiresthe use of the distinguished name formanaging access control lists as you cansee in the highlighted documentationhowever when obtaining a certificatefrom google's ca service using reflectedspiffy mode we face a mismatch becausethe distinguished name is mutable andcan't be verified google ignores it andall we get back is the sendi with aspiffy id and this creates a challengefor cfka authorization which expects touse the jn for access controldecisions the way we solved it was bybuilding a custom principle builder inkafka there are some open sourcesolutions as you can see here on thisgithub repo and this kafka improvementproposal but we have our ownimplementation our custom build builderparses the san uri containing the spiffyid format and maps these identities toappropriate cafkaacl speaking of challenges with workloadidentity let's explore some broaderchallenges we faced when implementingmtls at scaleat shopify we have millions of podsthousands of nodes and hundreds ofclust�ers this scale makes internalservice-to-service communicationchallenging to address this we requirerobust solutions for service discoveryload balancing and traffic managementthat ensure performance and resilienceacross our distributed system in theindustry there are several establishedapproaches to serviceto-serviceauthentication these include servicemesh solutions like istto cloud providersolutions like google's application loadbalancers and kubernetes nativeapproaches with ingress controllers eachof these approaches represents differenttrade-offs between security complexityand operational overhead let's examinewhy these trade-offs matter at scalestarting with perhaps the most talkedabout solution service meshservice mesh technologies like isttooffer an appealing solution forserviceto-service authentication themesh provides strong workload identitythrough x059 certificates automaticcertificate rotation and mtls betweenservices all managed through sidecarproxies on paper it's a comprehensivesecurity layer that handles thecomplexity of service authentication foryou but let's talk about some consyou're adding significant complexitybecause each pod needs a sidecarcontainer consuming extra cpu and memoryat our scale of millions of pods thisoverhead becomes substantial there'salso the operational burden you need tomanage the mesh itself it's not justanother layer but a whole new controlplane to maintain alongside kuberneteswhile we do use endway proxies forspecific use cases for example inelastic search for the majority of ourservices we've opted for a differentapproach there are also some cloudprovider solutions for service-to-erviceauthentication google cloud like manyother providers offers internalapplication load balancers mtls can beconfigured on the target https proxyresource with the solution you areprovided custom mtls headers by defaultthat can be passed to the backend this solution does have trade-offs amanaged service means there is no needto manage replica sets or worry aboutnode failures there is seamlessintegration with other cloud serviceslike am or cloud monitoring solutionsand you get built-in security featureslike dos protection however vendorlockin is a con cons requiring a massiveinfrastructure overhaul when changingproviders you are also restricted tofeatures provided by the cloud providerand costs can quickly escalate with hightraffic danny what are some otheroptions yeah um ingress engine x youmight be thinking nightmare luckily wedidn't get affected no public admissionweb hooks anywayum it has been a core component of ourinfrastructure since 2018 when weadopted kubernetes and some teams arecurrently exploring the gateway api butingress engine x is still a primary toolfor servicetoervice authentication imean internal one um it supportsmultiple authentication methodsincluding basic o basic map ooth andmtls well um i have a confession to makewe might have some internal servicesauthenticating using basic o but we'vebeen on a continuous journey to get ridof basic o everywhere in favor of mtlsbut let's see why basic o could beproblematic from a securitystandpoint username is our only way toidentify clients and it's not verifiedmultiple services often share the samecredentials and there's no automatic keyrotation which means we have longivedcredentials by the way the lack ofautomated rotation isn't just a basicoff problem developers often forget torotate or simply don't spend time doingit because it's working until it's notanymore some keys might expire orsomeone might delete them accidentallywith automatic key rotation teams don'tneed to worry it's way less errorpronehere's a quick story once upon a timethere was a cloudflare key with adminpowers and used by a service it was setto expire after 4 months the team incharge didn't know when it's time torotate and it resulted in a partialoutage i'm sure this is a story thatyou've heard many many times before uhmaybe some admin uh got an email warningfrom cloudflare about the expiry or theysaw it on some dashboard but thedevelopers maintaining the service hadn�o idea until end users reportedproblems and came complaining on slackum so yeah storing keys with noautomatic rotation is a common issueacross the industry here we have benerren from teleport i saw their booththere uh he's vouching for zero trustaccess for github they mention foreverliving ssh keys or personal accesstokens used to access github repost umonce the ssh key key pair is generateddo developers actually often rotate themor they do so only when they have towhen they're assigned newlaptops um how about personal accesstokens do they always set them toexpire anyway um the solution here isalso sslerts which auto expire give usbetter auditability etc now back toingress engineext and our belovedauthentication method nts um here's anexample of configuration you can usespecific annotations they can all befound uh in the official ingressengineext docs i'm just going to go overthe key ones used for thedemo oftls secret points to thekubernetes secret containing the clientcerts verify depth sets the verificationdepth for the certificate chain forexample you can set it to two if youhave a root and a subordinate ca or onefor self-signed searchs in our demo wehave otls verify client set to on whichmeans that the client search must besigned by a certificate that is includedin the secret keyca.rert have um tls pass certificate toupstream it passes the entire raw clientcertificate to the server this is theonly way we can access the spiffy uriwhich we're using for our aclunfortunately ing grass engine x doesnot forward this value alone as arequestheader and this is what the content ofthe header looks like in our experienceit was a hard cell getting teams parsingthis we get this ugly url encoded stringuh a string notice the percent 20 forspaces and stuff basically we have toclean the pam uh it would be so muchnicer if we could just get a header witha sand ui field only just like whatgoogle offers as michelle showed beforeum but at shopify we have somethingcalled hack days we have three days tohack on whatever we want and experimentso we created a project to dynamicallyextract the subject alternative namespiffy from the client certificates andforward it as a request header but thatinvolves writing lua scripts and no oneon our team knows lua so we resorted toour friend gpt this code here takes thepancer it converts it to dur format passpar pass par pass par pass par pass parpass par pass par pass par pass par passpar pass par pass par pass par pass parpass par pass par pass par pass par passpar pass par passes the asn it does abunch of things um but we never open thepr upstream as we thought it wouldn't beaccepted and we can't really attach tothe code quality has a bunch of nestedloops but it works kind of there was alot of back and forth with the botmeaning there's considerable craft inthis diff anyway uh there's an option tocreate some helper uh function or youcan use some library to help with theparsing for example in golang url queryon escape is yourfriend well we all agree that mtls isgreat but with mtls certificatemanagement grows more complex right yesmanaging multiple certificateauthorities can quickly becomechallenging to simplify this at shopifywe chose to implement a shared privateca for all of our internalserviceto-service mtlscommunication this is how we set up ourpki we went with a three tier approachfor our certificate hierarchy firstthere's our root ca we keep this onelocked down it's basically just there tosign our intermediate ca theintermediate ca does all the heavylifting this middle layer gives us anextra buffer of security so we're notputting all our eggs in one basketfinally we've got the leaf certificatesthat our services actually useday-to-day for mtls these are what ourservices present to each other to provetheir identity adding that intermediatelayer might seem like extra work but itgives you additional guarantees ifsomething goes wrong with theintermediate ca we can revoke and rotatewithout having to touch our root oftrust talking about rotation one of themost difficult aspects of certificatemanagement� is rotation how do we rotatecertificateauthorities we begin the certificaterotation process for our shared ca wellin advance of expiry to ensure a smoothtransition the rotation process for theroot ca begins one year before expiryand for subordinate cas it begins sixmonths beforeexpiry at shopify we use an alert rulelike this to notify when cas are nearingexpiry to rotate root cas we followthese steps first create a new root cathen add the new rootert from the new cato the trust stores next create newsubordinate cas and lastly delete theold root ca for rotating subordinate casfirst we create a new subordinate cathen we disable existing subordinate casand finally we delete the expiredsubordinateca after ca rotation we test with ourcertificate management tool locally weverify it can successfully request andreceive certificates from the new ca sothat's for certificate authoritymanagement how about client and servercertificates here's another cncfgraduated project a wellestablished toolfor managingerts in kubernetes certmanager we use cert manager to automatethe issuance and renewal oferts for manyof our gke workloads as a team we loveusing it because its open- source naturemeans we don't have to worry aboutmaintenance cert manager seamlesslyintegrates with multiple cloud providersand services initially we were usinghashi cororps vault but thentransitioned to using google certificateauthority service and the transition waseasy thanks to google ca's issuer itserves as an external issuer for searchmanager that uses google ca service toissue searchs through manage private casyeah manager is great when you'retalking about kubernetes environment buthow about non- kubernetesenvironment we have this in-house toolwritten in golang which allows engineersto mint assert using their owncredentials for local development it'sbeen in use for four years now it's abinary that can be installed withhomebrew and there have been use caseswhere this is needed for vms as well wealso have some resources running inclusters uh without search managerinstalled and for those the generalrecommendation is to run the tool as ascikar container it's lightweight andprovides a flexible method to getcertificates it can be run as abackground process or as a one-timejob we also have some serverlessapplications more specifically we usegoogle cloud run when we talk aboutcontainers running on demand we can'trun our tool as a demon if the instancedoesn't get requested it shuts down andonly reboots when a request is sent sofor this we use uh cloud run schedulejobs um the job is set to run severaltimes during the day and takes care ofrenewal in case of upcoming expiry andthe certificates are stored in googlesecret manager exposed to the instancesas environment variables talking aboutsecret manager uh this in-house tooloffers four storage options catering todifferent use cases uh local file systemgoogle cloud storage bucket googlesecret manager and kubernetes secret itdoes require specific im permissions forthe google services and uh to store itas a kubernetes secret it needs specificrole binding here's an example of theconfigurationuh we created for the demo first we havea specific service account for the jobwe give a create to create the secretinitially and get an update for therenewal logicgiven the rise in interest in serverlessapplications internally and alsonoticing a rise in the use of ourcertificate loading tool at some pointwe had a lack of clarity and consistencyin our certificate managementinfrastructure so it also became kind oflike hard to maintain both manager andthe tool to do the same thing and aftersome research and talking with otherteams we decided on using our customtool to manage all client certificateswe decided to follow a pattern similarto the one we were using for serverlessapplications where jobs handlecertificate creation and updates onecritical consideration with thisapproach is the order of operationscertificate data must be availablebefore a service container initializesto address this we deploy an initial jobthat runs once to mount the� kubernetessecret making the tls searchertimmediately available for theapplication at bootstrap time and thensubsequent uh certificate issuance andupdates are handled by the crown jobrunning on a scheduledbasis to guarantee that the certificateloader job completes before thedeployment is created we use carval kappwith kap we can wait for specificresources types to reach a ready or uhready or completed state beforeproceeding in our demo though we choseto implement argo cd sync waves whichalso effectively manage deploymentsequencing michelle will demonstratethis later in our demo session here wehave a demo job deployed to our servicea name space similar to the one we haveinternally the job runs our binary whenwe run it we call the new store functionand we know we're running in akubernetes environment by checking ifthe namespace information is mountedmeaning we're going to store thecertificate data as asecret we create a secret calledmtlserts containing tls.ert which hasthe subordinate and leave certificatebundle tls.key with a private key andca.ert with the shared root ca which istrusted by both the client and theserver let's see this in action wedeploy the initial job and then we waitfor the job to finish running and afterit's done we can see the secret createdthere when we describe the secret we seethat it correctly injects the data asexpected um and when we decode theclient certificate we see the spiffy urirightthereyeah our deployment resource mounts thesearch volume and creates files usingthe secret mtlsert which was created bythe initial job run and our client findsthe bundle with the subordinate and leafas well as the key to make the requestto the server it also uses the ca doserto validate the servicecertificate for subsequent renewals weuse chrome jobs which run every sixhours four times a day however if everyservice uses the same chron expressionwe might run into an issue where we haveseveral containers making thousands ofconcurrent requests and then so weaddress this uh by using a customizedfunction to dynamic dynamically generatethe schedule for each servicerandomizing minutes and hoursthis is done in a deterministic way sothat every time someone renders theconfiguration it doesn't generate abrand new schedule for the same servicetriggering redeployments of the crownjob all the time so we use the name ofthe service to create this uniquefingerprint by using a saltfunction so how often do renewals occurthe cert tool checks the time to live ofthe certificate and when a cert reaches50% of its lifetime the tool run as acrown job will attempt to renew it let'simagine here a c a certificate that'svalid for 10 days and let's imaginethere's a failure of some sort saysomeone accidentally deleted impermissions since the chrome job runsfour times a day any error in issuingthe certificate will result in fouralerts per day um for the next 5 daysuntil expiry that's a total of 20 alertsuh this should provide service ownerswith enough time to take action sotalking about alerts and monitoringlet's take a closer look at ourobservability stack there yougo another key part of the certificatemanagement life cycle is collectingmetrics to better understand potentialfailures our observability stack usesgraphfana prometheus tempo and gcp logsexplorer graphfana is a web-basedplatform for visualizing metrics logsand traces through customizabledashboards prometheus is a time seriesdatabase that collects and queriesmetrics and we get application tracesfrom tempo and ca logs from gcp's logsexplorer this is an example of the typesof logs we ingest fromgcp if you're interested in learningmore about observability at shopifycheck out this recent talk by sebastianand mattcool uh let's go over some metrics ifyou're using sr manager you get metricsexposed in the prometheus formatsalready from the controller web hook andca injector component they're availableat the standard metrics endpoint on port9402 of each component pod searchmanager exposes metrics such as the dateafter with which the search expires thenumber of seconds within which thecert�ificate should renew it also exportsmetrics about the health of thecertificate object and sync error countfor example um they allow us to capturefailed certificate issuance and renewalproblems being critical for preventingoutages related to search expiry ourin-house tool emits metrics similar tothe way manager does however since thistool focuses on client certificates formcls we don't use uh we don't have acmerelated uh metrics some of the metricswe expose are sync success sync failuresearch expires in seconds we use stats dfor these metrics and we use tags forbetter error classification sync failureuses error type tags to provide moreinformation about issuance or storagerelated errors we mentioned ourunorthodox approach where we have jobsand chron jobs managing certificatesunlike longunning services whereprometheus can regularly scrape themetrics jobs are ephemeral by naturethey start execute their task andterminate often before prometheus has achance to collect metrics so how do weensure continuous monitoring one way toget indicators that there's somethingwrong when issuing the certificates isby observing the job status by usingcube state metrics directly um if you'reusing cube prometheus stack you get itout of the box you can see it in actionhere the key metric we're looking at iscube job status failed why is thisuseful uh it's an early warning systemit won't tell you exactly why the jobfailed or give you business specificmetrics but it's an excellent first lineof defensebut if you want to reliably capturemetrics like issuance failure or requestduration one option is to use prometheuspush gateway which acts as anintermediary the job actively pushes itthe metrics before termination and thepush gateway exposes them toprometheus we chose to instrument ourin-house tool with stats d with stats dwe have immediate metric pushing throughudp before the pod terminates we alsodeploy stats uh exporters that convertthese metrics to prometheus format whichare then consumed in our graphanainterface allowing us to set alerts anddashboards well so as i i i said beforewe've been showing bits of the demothroughout the talk so far a series ofuh short films or reals and it's timefor a feature length version where weget to see all the pieces together butlet's take a look at the syn take a lookat the synopsis firstwe have two services a and service brunning in separate name spaces in thelocal kind cluster they communicate viaingress and genx controller using mtlsthe controller handles tls terminationfor us the ingress controller validatesthe client certificate and then therequest is forwarded to service bservice b performs additional identityverification and access controlthe search tool manages the certificatesthat both services use for theirauthentication the process begins withthe generation of a key pair by thesearch tool which will be used for thecertificate signingrequestum the search tool authenticates withgoogle's am workload identity federationit obtains an id token which isexchanged for an access token providedby the security token service and thenwith uh the access token the search toolmakes a csr request in reflected spiffymode to get the spiffy uri which acts asa workload identity finally the searchtool stores the certificate data in akubernetessecret and the s tool is going to bepushing metrics using prometheus g pushgateway giving prometheus enough time toscript them prometheus rules evaluatethe metrics and generate alerts forissues like failed requests the metricsand alerts are visualized in graphanashow timeyeahcool to start the demo i want to walkthrough a common problem i have adirectory here with many kubernetesmanifest files that i want to apply atonce a standard apply should do thetrick but as we see that didn't workresources were applied in the wrongorder resulting in errors doing anotherapply will fix this but it is not anideal flow there has to be a better wayto conveniently conveniently deployeverything with one command in thecorrect order well the answer to that isargo i'm going to begin with ensuringthat the argo ui is up and running andto do that i'm just going to simply portforward the argo service and we shouldsee that the ui is available so i'm inthe same directory as the previousscenario but this time i defined an argoapplication and added the necessary syncwave annotations to my manifest filesnow all i have to do is apply the argoapplication manifest file let's go tothe ui and see the syncing in actionawesome all my kubernetes resources havesuccessfully deployed in the correctorder i get an amazing overview of allthe objects with the built-in ui and ican easily see the health of everythingand here we have some information aboutthe latest commit from our repo thatargo is choosing to sync from to betterunderstand how this works let's take alook at the argo annotations present onthe manifest files the name space isconfigured with a sync wave annotationvalue of negative one lower numbersindicate earlier creation while highernumbers trigger later creation bysetting it to negative one this namespace is prioritized to be createdbefore other resources in the clusterfor the job we have the sync wave set toone meaning that it will be deployedafter the namespace our deploymentsfirst need the job to run to mounttheerterts which is why for them thesync wave is set to the number twosimilar logic is applied to services andingresses where services have to becreated before ingresses now let'sdemonstrate how access control listswork with mtls on the top left terminalwe have service a and on the top rightterminal we have service b first watchwhat happens when service a tries toaccess the /in internal/50ippy endpointas you can see service a successfullyauthenticates using its certificate andgets access to the endpoint i'm going tozoom in on the spiffy id that identifiedservice a to service b during the mtlsflow now let's test our acl in adifferent way when service a tries toaccess /internal/z something different shouldhappen even though service a is stillusing the same valid certificate forauthentication notice the red errormessage access denied this is our acl inaction service a is authenticated butnot authorized to access this specificendpoint now for our failure scenariosome users are reporting issues with theapplicationto find out why i'm going to head overto graphana and then go under the alertrule section and then i'm going tofilter the alerts to those that arefiring as i i notice at the bottom anerror coming from theert tool as iexpand i see that it's an error withcertificateissuance i will further expand and get alot more information about the rule thatis firing my hunch is that the job thatis responsible for issuing certificatesto our pods seems to have encountered anissue let's go to the terminal toconfirm this i'm going to look at thestatus of the job and it looks like itnever completed we should dive deeperand take a look at the logs from the podof thejob and there we go we found an errormessage the service account didn't havethe correct permissions which caused thealert rule to fire and this marks theend of ourdemo so what are the big ideas youshould walk away with zero trustadoption continues to accelerate makingmtls not just important but essentialfor secure service-to-servicecommunication when it comes to workloadidentity spiffy stands out as theindustry standard solution offering arobust framework for serviceidentification andauthentication remember thatauthentication isn'tone-sizefits-all your solution mustscale with your organization and addressyour unique security requirementsfor certificate management you haveoptions while proven open-source toolsexist don't rule out building a customsolution if your needs demand it carotations deserve special attention theyrequire thorough planning and preciseexecution finally comprehensivemonitoring and alerting are critical youneed to catch certificate issues earlybefore they impact your services andusersthank you for taking the time to attendour talk these qr codes link to oursocials where you can connect with us umi'll stay on this slide for a moment2025-04-15 21:58:16.397501 ��P_#��WAfznzH-gf9h8all right I welcome everyone to Vsession Thanks for staying with us allthe way until the end of the CubeCon Youare some of the brave souls who areremaining today for the the last slot Sothank you Uh I am Vincent Uh I'm the CTOfor Asia Pacific at Red Hat based inSingapore and I have with me TamarHi So I'm Tamar I'm from IBM ResearchI'm an IBM fellow and a chief scientistfor sustainable computingAll right So what are we going to covertoday uh we'll start with a bit ofinہ�O^#��UAT-nN86wTebMgood morningtoday we're pulling back the curtain onshopify's mtls journey it's been quitethe ride and we've got some fascinatinginsights and lessons toshare but first some quick introductionsi'm michelle mali i've been at shopifyfor nearly 5 years and joinedinf i was a contributor on the mtlsproject and continue to work on securingshopify's infrastructureand hey i'm dennis santos i'm a seniorinfosac engineer and i joined the teamin shopify back in 2020 my recentprojects involve increasing the adoptionof mtls for internal serviceauthentication and using attestedidentities foracl here's what we're going to becovering today uh we're going to startwith a brief refresher on zero trust andthen we move on to talk about workloadidentity and adopting spiffy then wecover some options for service toservice authentication we also talkabout how we do certificate managementobserv observability and we finish witha demo followed by key takeaways we'regoing to be showing snippets of the demothroughout the slides but we will have afinal session with a video showing allthe pieces togetherbut first things first why does zerotrustmatter zero trust is a concept foundedby this fine gentleman john kindervagit's been around since 2009 and itcenters around the belief that trustingis a vulnerability and security must bedesigned with the strategy never trustalwaysverify we have seen a growing push forthe zero trust model across companiesand government agencies the latesttechnology trends for 2025 published byo'reilly states that there was a 13%rise in interest in the topic now let'stake a step back and try to understandwhy this is thecase oopsyou remember in 2020 we had the pandemicand it triggered an unprecedented shiftin how organizations operate and thisglobal crisisaccelerated the consumption of toiletpaper and also accelerated the adoptionof cloud infrastructure as companiesrapidly adapted to support remote workand digital operations also bring yourown device policies fundamentallychanged how organizations approachsecurity with scattered workforce acrossvarious locations and devices relying onthe traditional perimeter security modelor castle emote became problematic tosay the least um in zero trust we alwaysverify the identity of every entityrequesting access both human andnon-human non-human entities like vmscontainers applications and services arecalled workloads this verificationprocess can be achieved through variousmechanisms but because mutual tls ormtls ensures that both parties at eachend of the network connection are whatthey claim to be it is one of theprimary and recommended mechanisms forservicetoervice authentication this talkis going to focus on how we implementmtls at shopify to verify theseidentitiesokay so what makes up a workloadidentity what are some of the challengesorganizations such as shopify face whenimplementing and managing them toillustrate how workload identities workconsider how we use passports asidentity documents what makes a passportdifferent from any other piece of pap��troduction about green AI why itmatters Uh we've talked a lot duringKCON about optimizing AI system Soeffectively we are going to talk a lotabout this but really from the angle ofenergy efficiency Uh we'll uh talk aboutthe key challenges and opportunitiesspecific to cloud native AI uh and todaywe've made a choice to uh really focuson the optimization of AI in France andwe'll speak about why AI in France is soimportant and then some of what'shappening in the community right now uhand the possible next steps So let'sstart by framing a bit uh the debate Uhwe are actually at the very uh uhbeginning of an AI sustainability crisisuh but the the challenge that we have infront of us is not new So what I want toshare is the explosion that you seetoday in aid man uh essentially startedas early as 2010uh with the growing huge use of deepneural network technology essentially Sosince 2010 which is the deep learningera that you see pictured here we'veseen on average a growth of four to fivetimes a year of the energy consumptionto trainmodels So we are very much in a 15 yearsperiod already of this growth and thisconstant increase in number of resourceused to train modelUh notsurprisingly this has led to a hugesurge in energy demand for data centersUh and this is an interesting statisticsthat we expect by 2028 so we are justthree years short of that Uh 19% of datacenter power demand will come from AI Soabout almost 20% of it will be purelyused for AIAs a result of this we see the emergenceof regulations to try to contain uh uhthe use of energy Uh we are in Europehere So you may be familiar with the uhthe EUA act although it's actually forthe European Union but you are veryclose in UK Uh this is actually thefirst regulation globally that is goingto mandate the disclosure of energyconsumption for AI systems uh that isprobably the first of many regulationsto come I am personally based in AsiaPacific and I'm actually working with anumber of government entity that arealready elaborating their ownregulations So this is going to be uh uhprobably a mandatory requirement forbusiness to understand their consumptionNow ironically uh today if you look atenterprise and this is actually astatistics that was published by thestate of AI infrastructure last year 74%of companies that use AI today areactually struggling to leverage theircompute and acceleration infrastructureSo that means there are lot ofoperational problem in optimizing theuse of resourcesHowever it's not uh you know only a sadpicture because AI also helps us with alot of improvements uh and in particularin sustainability as well I'm justsharing a few use case here Uh they arevery close to us because they they areuse case that uh my my friends at IBMresearch have all contributed to in theclimate science space uh we'vecollaborated with NASA to take 250,000terabyte of earth observation data andbuild foundation model to basically readuh satellite imaging in a much moreefficient way Uh we see result of aboutyou know four times improvement in speedof interpretation of of satelliteimaging with this type of foundationmodel So they allow us to do someprediction of extreme climate events forexample and obviously this is a hugebenefit to humanity in material scienceuh you know foundation model have beenincreasingly used to uh simulate andmodel new uh material structure and thatis a huge problem as well for humankinduh I'll take the example of US There'sactually uh uh close to 800 substancesthat are toxic that are being monitoredby EPA in consumer product and there'shundreds of thousand of consumer productthat now need to beretrofitted with kind of green and safealternative So that's a huge problem tosolve Um in environmental science wehave this problem with what we callthese forever chemicals uh those arechemicals that take thousand of years toclear in the environment And uh lastyear actually IBM research had apartnership partnership with thenational science agency to look at theidentification and remediation of thismaterial Uh and last but not least inhealth sciencesuh I would say AI has lite�rally changedthe game in terms of early detection ofdisease particularly when applied tomedical imaging So those are just a fewexample Uh so we definitely want to keepAI and make use of it especially if it'snot to generate funny pictures uhthrough a chatbot but to actually youknow build real value uh with theapplication and so we contend that inorder to do this we should be looking atthe whole life cycle and supply chain ofAI and the system to actually improvefrom top to bottomSo there's a few way of looking at itlike the first way is through themachine learning life cycle uh a bit youknow when we discuss about security ordevops and the shift left actually Ifeel this applies to AI when you look atthe machine learning life cycle fromdata preparation to feature engineeringmodel development all the way to servingthe earlier you actually start to uh uhimprove and simplify your flow then themore results you have So an examplewould be from a data perspective if youcan actually make more meaningful dataset through data distillation you couldreduce your model training by up to70% Because you basically have toprocess less data Yeah Then the wholelife cycle is actually impacted by thisbecause you have smaller model fastertesting less resources used in theprocess Um so we want to optimizethrough the machine learning life cycleand then the other angle is more interms of pillars We have a dataoptimization pillar We have a modeloptimization How do we make the model assmall as possible and then we have asystem centric pillar which is more theefficient operation So those are threelevers three types of uh potentialoptimization we can do uh pretty muchacross the system At the end of the daywe are trying to reduce the use ofcompute accelerator networking and thenstorage Now I'll cometo why uh the focus of inference Uhobviously the whole life cycle isimportant but inference actually has avery special place right now in terms ofthe overall consumption I'm actuallyusing some figures published uh publiclyby Facebook AI and uh they areinteresting in the sense that I'm goingto say for some of you are enterprisehere whatever you see here publishedfrom Facebook you are probably wayhigher than this in terms of the theplace of inference in your consumption Imean let's let's be reminded thatFacebook really has a business based ondata and they actually tend to retrainand retune their model on a veryfrequent basisMost of you and and us in the audiencewho actually use more classical use caseof data we tend not to do this on such afrequent basis Yeah But so Facebookpublished this study and it's very easyfor them to actually measure where knowthe different part of the life cycle andtheir contribution because they are theyhave specialized fleet of of hardwareand infrastructure dealing with thedifferent part of the processing Uh soin their infrastructure data and modeltuning and inference are literallydifferent fleet of servers and differenttype of optimization Uh so what theyactually showed is close to 65% of an AIsystem operational carbon uh footprintactually is spent on inferenceOperational means the energy spend uh uhto basically run the inference of themodel So that's almost you know likeit's around two3 of pretty much theoverall total Uh at the same time wealso demonstrated that optimizing thethe the platform level Uh so that's theoperation can lead to as much as 800times improvement in consumption Sothat's like a huge reduction If youactually take care of optimizing yourinference you can get fantastic resultsUm and let's not forget something to tofinish on this topic A big part of thecarbon consumption of AI is alsoactually embodied in the hardware Sowhen for example you buy uh a twocluster ofH200 when you actually effectively onlyusingone obviously you are you actually havea lot of embodied carbon in the processwhich is coming from the manufacturingof the hardware the shipping and so onSo the other interesting aspect is 50%of the embodied carbon comes from thehardware manufacturing So when youoptimize inference you basically needless hardware an�d you are also getting a50% reduction in overall carbon So thatis super interesting So that's what Iwanted to share really on explaining whywe have this focus on efficientinference and I'll let Tamar now explainsome of the work we are doing in thisspaceAll right All right Can Okay So my firststatement here is that green AI isefficient AI And why is that when youwork on your efficiency you saveresources If you use less resources bydefinition you use less energy Energy isa resource So you can also think aboutit like this So it's all about runningyour work um meeting your SLOs'srespecting your SLOs's while making thebest use of resources using lessresources and using them moreefficiently Now there are lots ofdifferent techniques for efficient AIacross the life cycle and I'm going tomention a few of them So you see on thepurple the purple boxes there uh thereare techniques that have to do withmodel architecture So this is the AIscientist who is working on newarchitectures such as Laura adaptersWith Laura adapters you freeze some ofthe weights uh when you're doing finetuning for particular tasks and then youget these adapters that are just beingused for that particular task basicallybreaking breaking the monolith Ummixtures of experts that's anotherarchitectural model architecturalprinciple that can help us have moreefficient inference and more efficientfine-tuning Quantization that's anotherthing Speculative decoding This is aruntime inference technique where youare guessing ahead of time likeoptimistically a a a number of tokenstogether using a smaller model These areall techniques which are used in orderto optimize a single model Okay Now thesecond category is what you see with theblue rectangles here We're talking aboutwhat's going on in deployment andoperations Completely different story Sowhat's going on there is that you havemultiple different models that areserving multiple different users Uhthese models have very differentcharacteristics and the requests arealso could be of very different natureSome of them are latency bound Some ofthem are more about throughput Okay Andall of this is running on a cluster withheterogeneous hardware Heterogeneoushardware is really important And I willtell you why We believe in fitforpurposeaccelerators You don't want to run yourentire workload on GPUs because you'regoing to waste a lot of cost and a lotof energy So if you have a large modelin your training and you have a trainingjob obviously you want to use GPUs Ifyou have inference requests or if youhave smaller models then you may bebetter off with other types ofaccelerators Okay So this is really therole of the platform The role of theplatform is to bridge between the modelsand the different jobs and requests andthe infrastructure that is aheterogeneous infrastructure with fitfor purpose accelerators and to do thatin the most effectiveway All right So going a little bit intothe challenges of that platform So youreally want to you have really multipledifferent patterns that are emerging inthat landscape of AI support chat box Soyou have low latency This is where yougo to the bank and you can't find wherecan I find my big number or somethinglike that and you're chatting with an AIchatbot and you can be tolerant becauseyou're used to getting support frompeople and people are really slow So youcan be tolerant but you still want lowlatency Okay So then you go to codeassist Code assist is like you're codingand you want autocomp completion forexample here You also want low latencybut you want much more much lowerlatency because if it's not fast enoughyou're going to complete the codeyourself right developers have nopatience We know that Um and the contextwindow here is much bigger because youyou need to understand the entirecodebase in order to help you in orderto help the AI needs to understand theentire codebase in order to help youdebug do autocomplete and so on Then youhave LLM service Here we're talkingabout high volume multiple tenants lotsof interactive requests all comingtogether and you need to support thatDocument processin�g completely differentidea here It's about throughput Documentprocessing is all about throughput LLMpowered search This is a hybridreal-time LLM and rag which istransactional database kind of workloadand identic completely different storywhere you have you now need to worryabout the end to end latency the contextacross all of these transactionstopology awareness and so on So all ofthese are different patterns and whatyou see on the right is you see a bunchof metrics throughput which is token persecond time to first token time betweentokens blah blah blah all these metricsand you see some selected strategiesthere are lots of different strategiesthe point here is for each one of thesepatterns you're going to need otherstrategies for chat bots you need stickymanagement for example for code assistyou want to uh context truncation so youwant to eliminate the code that is notimportant and keep the code that is isimportant in order to reduce the contextfor LLM service You need you need um SLObased routing and queuing because you'regoing to have requests with differentSLOs's and and and so on and so forth Sothe really the variation in context landdegree of similarity across querieslatency and throughput trade-offs and soon willaffect the strategy that you want to useand that's what makes it so complicatedSo in really just to summarize this is asummary chart which says look based onthe use case you may want to usedifferent techniques in order to achieveyour goals and your highle goals at theend of the day is you know like I'mshowing here 30% reduction but maybeit's more you know a a significantreduction or a significant increase inyour effective throughputuh which is a successfully processedtoken sometime we call it good put youknow a a a significant reduction of costreduction of energy and uh and stillwhile still maintaining user perceivedlatency and so on That's what you wantat the end of the day The challenge hereis how to match the patterns that youhave with the strategy that you use Sothis is where the platform comes intoplay The goal is to maximize the overalloutput while meeting your SLOs's andminimizing your cost bridging betweenthese heterogeneous LLMs and adaptersand these heterogeneous infrastructurewith fitforpurpose accelerators And howdo you do that so you need to integratebest of breed optimization techniques ina coherent fashion And what I mean bycoherent fashion is that you want all ofthese optimization techniques to workwith each other rather than against eachother And this is very important Soyou're going to see that lots oftechniques are being introduced Thisarea is evolving really really quicklySo we're going to go quickly throughsome of them Right sizing and GPUslicing Why do you want to do GPUslicing of course you want smaller LLMmodels to use just a slice of a GPU Theydon't need the entire thing So you wantto share GPUs across models You can useMC partitions or MPS And you will needright sizing techniques What is rightsizing techniques is techniques thatallow you either through profiling or byanalytic methods to assess the size ofthe of the slice that you need for aparticular model And by the way I'mmentioning here under sources you'regoing to see the links to some work thatwe did in IBM research in collaborationwith um Red Hat and Nvidia on InstaSlice which is an open-source projectthat allows you to dynamically sliceGPUs And there is another work that wedid here that is really showing you howto right size When you combine the twothen you can get really really goodresults and save a lot of energy and alot of cost So then we go to the otherone routing and queuing Why do you needrouting and queuing why is it not justload balancing the good old loadbalancing well that's because you havedifferent requests with very differentnature Batch versus interactive rightand it's very unpredictable to know howlong it's going to take a to process asingle request Why because you don'tknow how many tokens are going to begenerated So you need intelligent loadbalancing and intelligent queuemanagement You want to have you� want toavoid the head of the line problem whereone request is blocking everything elseuh you want to uh potentially introduceeviction and techniques like that Youmay want to reorder the queue or theeven the way you arrange the cues matterUm so we did some work in collaborationwith UIC So this is IBM research and UICUm very interesting work I I recommendto read this paper because it's goinginto all these policies and a lot ofwork that already went into uh the VLMopen sourceproject All right So the point here isthat these things depend on one anotherSo as you do um routing and queuing youalso want to do caching So one of themost important things here is that whenyou KV cache management is a veryimportant techniques that is being usedand this is because of the um autoreggressive nature of LLMs So you wantto reuse it's it works in iterations andevery iteration generates the next tokenBut what you want is you don't want torecomputee everything again and againThat's why you use KV cache However inyour load balancing or in your routingyou better do it based on where thecache is right otherwise you're going toget suboptimal results So you see thatthere's some interdependencies that arestarting to appear here Um so we talkedabout caching and loading and why youneed them right auto reggressive natureH that by itself is a very complicatedproblem because not only that you need aKV cache in every node it's also how doyou share KVK cache across nodes becauseyou're not always going to be able tosend all of the u prompts that are uhthat relate to the same user or to thesame session or that are similar to thesame node because of load balancingissues Okay so you need to share the KVcache across nodes How do you do thatlots of ideas Um project moon cake is agood example that is really really sortof stretchingum the the possible here and looking attechniques such as this aggregatedprefill and decoding and soon Um and then Laura management Uh soremember that we discussed the Lauraadapters That's another thing here um wecan have thousands of adapters and theseadapters could be dynamic So where do wecache eachadapter so that it's going to beavailable for the requests that need itbecause not all requests are equalThey're going to need differentadapters And then how do routing andqueuing actually factor it in in thedecision making All right So here wecreated another dependency and thenfinallyautoscaling Autoscaling is all about thenumber of VLM instances So you have amodel which is running on three or fiveVLM instances and you need to grow thenumber of VLM instances based on yourload and shrink it based on your load Inthat process you also need to makeplacement decisions So where am Istarting that new VM instance that alsodepends on all of these other elementsSo one thing is you want to look at thesize of the queue in order to know howmany instances you need for a VM requestand the other thing is you need tounderstand what is guest where in orderto place the new VLM server for aparticular model in the optimal place Sowe're introducing all of theseinterdependencies and I think that thisis more more than just dependenciesDependencies is like when you need alibrary in order to run something andit's easy to handle I call itintersectionality intersectionality inthe sense that there needs to be somesort of of a very intelligence awarenessso that these control mechanisms aregoing to work together so that 1 + 1 isgoing to equal three not minus one Sothe question is how to keep the balanceon one way we see a huge like a lightspeeded evolution of the technology andI bet you probably in one month I'mgoing to have to have another pink boxhere because there is going to be a newtechnique but I can bet you 100% thatthey're going to be more algorithms todo better distributed cache managementand so on and better routing and betterqueuing and so on So the area evolvesreally quickly and we really want tomaximize a combined outcome of these ofthese uh that we're getting from all ofthese optimization techniques right sohow do we do that so these are somearchitectural principles first of allmodularity and separation of concerns wereally have to have well- definfinedAPIs well defined control uh controlflow and data flow what does the routingneed to know about caching so that therouting could be optimal right and um wereally want datadriven optimization Sowe need benchmarking tools that allowsus to understand what are the benefitsthat we're going to get from particulartechniques but also when we combinetechniques together right uh we want toleverage best of breed open sourcetechnologies and we want to support thisrapid evolution working together as acommunity so that we can test newalgorithms quickly So if I'm using myfavorite router but I have a newdistributed KV cache algorithm I cantest the two together and I don't needto reinvent the wheel every time that anew technology comes into play andobviously support atrogenity becausethat's also good for efficiency and forcost reduction Um so now I'm going topass it back to Vincent to talk aboutthe role of the CNCF in supporting thecommunity effort around this article andaround this technologyAll right So we we've seen extremelycomplex problem to actually solve theplatform efficiency problem uh andactually that opens really an avenue forthe CNCF to really help the industry tolook at how we standardize some of thesepractices and efforts Uh so you probablyhave heard this morning during one ofthe keynote uh examples such as thegateway API extension for LLM in FranceThis is an example of standard that theCNCF as a group can promote fortechnology vendors and user to reallystandardize the way they are going tolook at optimizing inference 2 model Uhyou may have heard as well uh in thetelco session about a project calledKepler Kepler essentially is a standardfor energy observabilityuh uh in workload and obviously we areusing it to evaluate the energy span ofAI workload in a cloud native context Sothe CNCF is has this huge role to playto help us actually standardize ourtechnology contribution to some of theplatform capabilityNow the challenge though is havingtechnology and standard is good but it'sactually usually difficult for anyplayer to build a consistent platform Sowhat we started about a year ago is ajoint uh partnership between the theCNCF uh AI work group uh as well as thetag environmental sustainability to lookat best practices of architectingsustainable platform So we started towrite a white paper uh and this whitepaper essentially is is uh looking atfour dimension which influence theapproach to sustainable AI which is atype of deployment environment that youactually uh deploy the technology on Itcould be anything from a public cloudfootprint to an edge computing footprintuh it looks like the type of AI systemTamar has explained earlier that thisliterally drives the type ofoptimization uh you can leverageobviously the AI life cycle are youoptimizing the data the model theoperation and then of course the personaso who is actually operating theplatform and can take active steps tooptimize so this white paper is meant touh provide a collection of bestpractices in uh uh building andoperating sustainable AI platformAnd so I'll finish with this Uh if thistopic is of interest uh I would inviteyou to find out more about the work bythe cloud native AI working group whichI'm part of We did a lot with AI ingeneral But as you've probably inferredfrom your time in the past three day atKubeCon a lot of the focus right now isoperating AI efficiently Uh the secondlink is really uh on the working groupon the sustainable AI uh uh white paperand so we are very close to publish anopen draft for consultation but rightnow you can find about 10 to 15different best practices and techniquealready in the white paper uh that areshared across different organization andwe very much welcome more technicalcontribution or reviews Thank you verymuch for your attention Uh and have agreat weekend ahead[Applause]2025-04-15 21:58:17.061834�del again this is very simpleexample just using MLP model for um uhimage classification uh and then Idefining my loss function my step and mytraining code here uh so this is againnative MLX code without any modificationat the very end we export model back todisk because I want to do some evalsafter this example is complete so whatis the next step next step for me isactually uh get available runtimes soremember we talked previously we have aruntime component so runtime you canthink about like a template or blueprintthat means that data scientist can useto play with this uh so this is MLXruntime MLX distributed runtime which asan entry point of API run and to usethis runtime like what I just need to doI just have a one single simple APIcalled train and this API so this APIactually comes from Cubeflow SDK so weas a community kind of created this newSDK called Qflow SDK and they provide aPython interface for me as a datascientist to interact with Kubernetescluster so no Kubernetes at all like Ijust you know I just say I have myfunction called ML MLX train I have myarguments like here I'm passing my modelpath and I specify how many nodes I wantto use to scale my MLX code also I can Ican install some packages in the runtimeit could be cycl or pandas whatever Iwant to install to evaluate it and thenthis is like runtime reference to um tomy runtime here so this extra APIgenerates some job id uh also we canlist all the jobs so you can see some ofthe jobs being created before this isthe jobs we just started we can get allthe steps from jobs as you can see usingthree nodes so this is one of thelauncher nodes to the worker nodes andall of them using like two CPUs devicesuh then we can get the locks from ourcode so right now because we dodistribute training here uh actuallywe're running uh 60,000 samples 60,000images which is distributed across threenodes so in total we perform DDP here sowe're running as you can see uh 20,000images uh pair like working pair like uhworker node uh so we is a very simple asyou can see training is happening um soit generates accuracy it's actuallyrunning across all the available devicesuh for me to work with also like I thinkimportant to mention if I have more morelike resources I can you know changenumber of nodes to 100 to 1,000 and itwill automatically distribute myfunction across all the availableresources so let's see if training iscomplete all right so it's seven appyeah so training is complete model hasbeen saved to the disk let's try to seeif this is available so this is my modelit's sitting right here uh okay so thenyou know what what data scientistsusually do evals right evalations uh Ican pass some images from the testbatches and I can see what kind of mymodel returns so this is like you knowour model pretty I would say right so wehave like 40% of accuracy because we youknow we you know don't train like a lotof data but you know this model actuallygenerates some outputs here you know Ican do some relations and I can see howit actually um been um calculated sovery simple just writing native codescale it uh but again it's very powerfulbecause the next notebook what we'regoing to do we're going to fine-tunetune T5 model using deep speed uh so forthose who don't know deep speed is aframework built on top of torch tooptimize this for distributed trainingand here we're going to use uh eightvaried GPUs uh on two nodes so it'susing GPUs to fine-tune T5 transformeruh first step similar to previous one wejust initializing the client passing thecontext to my GPU cluster uh specifyingmy distributed environment specifying mydata set here we're using VikiHow dataset uh tokenized data set downloadingmodel tokenizer uh defining deep speedconfig so for those familiar with deepspeed deep speed provide a very nice wayfor data scientists to configureoptimizer scheduulers uh microbatches soI think they really familiar with thoseAPIs again native native deep speed codehere and okay in the very end in thevery end we exporting model back to S3so this is my training function that Idefine uh saving checkpoint to t�he S3getting available runtimes so because wehave GPU cluster here I have anotherruntimes so the first runtime is deepspeed distributed second runtime istorch distributed and as you can seethis runtime has a four accelerators forGPUs for me also we have this API calledruntime packages so sometimes I want toknow like what is inside the runtime andwhat kind of packages it actuallycontains so this actually should uh tellme I hope this will tell me or no Idon't know let's see it should tell mewhat is instal inst installed inside theruntime allright right all right all right allright all right all right all right allright all right all right all rightright all rightright sometime it takes time you know toto process an image and get the data butusually like you know this API shouldreturn me the the training job let meactually show you all right you knowit's always great to have a live demobut sometime you need to have arecording you know so all right so letme let me show you what's actually itshould do um basically this API so Ihope you can see it here all right notnot here so okay train custom trainertrainjob yeah this one all right so nevermind live demo you know sometimes itdoesn't work so this API should actuallyreturn a list of available packages inruntimeuh so you can see this open API 4.1Python version available packages at thebottom you can see the the deep speedshould be installed there we installedsome packages here so me as you canunderstand what version of deep speed isusing uh what I can use to evaluate mymodel and you can see this demo wasrecorded in 4 a.m nice time to record ademo so yeah so this is actually alsoavailable GPU devices for me so it saysfour Tesla V100 for me to play with andthe next step similar to previousexamples what we need to do we need totake the function and call the one APIcalled train uh which basically passingthis deep speed function we definedbefore how many nodes I want to use whatis my arguments to my function what isthe input arguments to my trainingfunction like data set URL model name ofthe bucket uh like T5 base um and uh umyeah then we just you know me as a datascientist just need to execute this codethis you know Python simple simplefunction similar to previous example wecan configure like you know version ofthe runtime packages like B3 or anyother packages I want to install on topon what is already exist in the runtimeso this return the random job ID uh thenwe can get list of all of our jobs wecan actually get the the nodes so herebecause we're using two GPUs we returntwo different nodes here and uh here itsay four GPUs for everynodeum and yeah so we're also using GPUshere we have the graphana dashboardavailable so this is the actuallyutilization dashboard you can thinkabout using DCGM exporter for this toshow me like utilization of my uhtraining uh so when yeah we can see somespikes here uh we can see like number ofGPUs so this is like very usefuldashboard for data scientists tounderstand what's happening uh behindthe job um so right now because we'redoing like tokenization we can see likeon the right side it's not been stilllike yet utilize the GPU resourceshere um so this like I just run the jobbefore we start this demo basically uhso we get again get job locks API returnthe locks for my for my trainingfunction so because here we're using uhtwo nodes in four GPUs we do multiode uhmultiode multiGPU so in total we'reusing eight devices which means like intotal we're using 160 samples across allthose accelerators uh here we can seelike discover API settingsum we should see like deep speedconfiguration at the very bottom hereyeah like this so similar configurationwe said beforeum and then at the very end we train themodel so here we do fine-tuning of T5and we just run this on a few appex andI think the training takes around youknow 60 seconds um and every node everyworking node process 160 uh uh 160samples so right now if you go back tothe graph dashboard here we can see thespikes because GPU utilization go up andwe can see like eight Nvidia GPU devicesthat we use to fine-tu�ne thisLLM so and then I guess the next stepfor us is to perform evaluations thevery end um total training time is 69seconds um and then we again we candownload model back from S3 to thenotebook i didn't do this here becausethe model quite large it's like threethree GB and we preform somewhere elseso T5 is very good for textsummarization so it can do somesummarize your data so we're taking theinformation from um Qflow documentationqflow documentation which actually uhsummarize I ask my LLM to summarize thetext at the bottom and performing outputlike what is the Qflow trainer projectisabout yeah I think like right now yeahwe're just loading the model wefine-tune to the hugging facetransformer and asking the transformerto actually uh produce theoutput it takes some time for my machineto process the model and actuallygenerate the output from here um itshouldbe yeah it should take like a fewseconds afterwards so uh yeah so Qflowtrainer is a comparency project designedfor large language models so very simpleright um but again like I think the goalwhat we try to say here and let me goback to my official slideso like what what what we try to sayhere um you see two examples right oneis using MLX second is using deep speeduh no Kubernetes at all I didn't even goto coupubectl like I don't want to knowwhat is coupl like I'm a data scientistright I forally scale MLX and deep speedusing Qflow train job I didn't even knowanything about API because I just say Iwant to use you know API is acommunication library but I don't reallyuh configure it and this consistencybetween other environments like my localenvironment my cloud environment mentall the infrastructure complexity ishidden which allows me to do rapiditeration so if you want to check thisnotebook this is the QR code for forthem i just want to spend maybe um a fewminutes for quickly like QFL trainer V2so this is the overall introduction bythe way thank you CN for working on thisnew logo with this new logo for ourproject and the goal is for this projectis to allow data scientists to dotraining on Kubernetes in a very verysimple way so we connect all theselibraries on top of cube on top ofkubernetes with additional features likemultiode training fine-tuning elastictraining gang scheduling without evenworrying about kubernetes complexitiestoday we support um three runtimes whichis torch deep speed and mallex we'reworking on jacks and tensorflow supportas well so folks can also leverage thoseruntimes for those frameworks and if youwant to learn about this project I justleave with the QR code for you as wellso before I pass it to you to speak moreabout HPC uh this is actually what'shappening behind the scenes so demolooks extremely simple but behind thescenes we orchestrating the entire APIcluster to perform distributedcommunication so NPI require host fileSSH keys to create those training nodesand as you can see here we have job setwhich generates two jobs we have alauncher and the node and the launchercreates the the launcher node which arerunning the training as well and then itexecute the NPI run command on theworker node so worker node can alsoperform the the communication and in theNPI world every GPU is a slot uh so youcan see we have a total of eight GPUsand the variant we exporting world backto S back to S3 so data science canperform evals so it's very complex likelooking in this diagram but from thescience perspective it just one API oneAPI one function and that's it so withthat I will pass it to Ricky we willspeak more about how we see thistransition from HPC to AI um movingforwardokay um thank you for uh describingAndre uh in my section uh we are goingto talk about what is uh transition fromHPC to AI workload in my talk title andwhat is a problem for that um today'smodel development and deliveringworkflow has many steps uh as you cansee in this slide uh it is from datapreparation to modelserving however uh SPC focuses mostlyonly on model development tuning andtraining in this case uh if we constructHPC dedicated infrastructureorchestrator uh we probably need tomultiple� work scheduleuler andinfrastructure or orchestratorssomethinglike Kubernetes or uh some else uh thatwill bring us increasing maintainingcosts and ML engineers should understandmultiple userinterfaces so we sometimes consider towant to construct construct entire MLcycle on top of Kubernetes instead ofdifferent work or scheduleuler anduh infrastructureorchestration we re leberagingKubernetes allows us to automated uhinfrastructure management andcomprehensive user interface powered byKubernetes uh something like uhmitigating management cost andleveraging self-heering or and more uhhowever uh for this transition uh wehave some problem so in the next sectionuh let me describe what is problem andhow to solve it uh by uh cube fortrainer first problem uhis uh we face we face uh migrating toKubernetes uh one day DevOps engineer ummigrated job execution platform toKubernetes uh from scrum or some elseand um they told it to data scientist uhas you can see here uh at the at at thattime uh data scientists are worried howcan they perform their jobs by same waysas previously in this case uh deopdevops engineers tells that they need tochange their training code so that itcan adapt for the crowd nativeenvironment however uh as Andre uhdescribed uh in the demo partuh data scientists just want to performtraining code uh they do not want tocare any infrastructure specificationchange in their training codeso why does thishappen let's consider why such problemhappen uh we can consider uh threelayers uh for ML training job submissionenvironments uh data scientists areresponsible for training calls and howto use ML frameworks however wheninfrastructure layer ischanged these rayers got affectionsbecause those rayers fully rely oninfrastructure ray and a couple oftraining code executing specificationsare depends on each job schedulerspecifications so uh those uh gaps andoverhead for datascientist so when DevOps engineers uhmigrate their training environment toKubernetes they should consider the uhanother userinterface actually even if they canmitigate those problems by providingcomprehensive user interface or DevOpsengineer manually adapts data scientisttraining code to Kubernetes we will faceother problems so let's see anotherproblem by concrete use userstory another problem is um uh we faceafter they succeeded to migrate toKubernetes one day data scientistsconsider to want to use another newstateofthe-art library or uh mechanismfor their training cause uh however uhas we face uh in previous program thetraining code should be adapted to cubarenvironments it's crowd native uh sowhen data scientists consideruh sorry uh however as we face inprevious uh problem uh uhuh okay so the their training codeshould be adapt and so we so when datascientist uh consideruh interris new algorithm uh or sort ofrivalries uh uh DevOps engineer shouldsupport uh the mechanism ininfrastructure layer in kubernous layeruh that's not easy uh to address everytime every library every frameworks andso uh this is secondproblem yes let's consider why does thishappen so the key problem is a differentcycle between uh providingstate-ofthe-art ML framework andinfrastructure changes the sort of MLframework rivalry are rapidlydevelopment but it's not easy toconstruct new infrastructure radar sohow to migrate those how to mitigate dohow to mitigate to problems let medescribe in the nextslide um the mitigate solution is uhrearing MPI uh like open MPI uhinterface uh the NPI can filling up gapsuh between job schedulers theseadvantage can help to mitigate problemone which means data scientists do notneed to mostly uh care infrastructurechange additionally NPI run or NPI execor some NPI tools can perform arbitrarycommands which allows us to easilyinfrastructure problem debuging becausewhen they want to debug uh where comefrom the air in their training code uhthey can perform infrastructure airerror verification commands like NvidiaNCCL test or some toolsuh in same interfaceMPI this allows us to decouple oftraining code errors and infrastructureerror erroreasily however uh as you know um indistributed training frameworks uh wehave another mechanism like torch thetorch is native PyTorch commands fordistributed training uh that isdifferent approach uh opposed toMPI and actually Q for trainer supportsboth tools in B2API mpi is basically rancher basedmechanism aka uh center oriented buttoran like um to torture take advantageuh rancher rest mechanism akadistributingoriented both tools are helpful toperform executing uh PyTorch baseddistributed training on the other handthey both have pros and cons somethinglike debugability and fortrainers however uh in the typical Rajmodel training environment we often faceinfrastructure errors related tocomputing system and device uh duringtraining executionso I think the debugability is betterthan for triants since if we cannot uhresolve those infrastructure errors emerengineer uh uh never execute theirtrainingcode actually we can mitigate uh this uhlack of for tolerancyuh for MPI by horot or some toolsuh this is the reason why we considerMPI is useful for model training andfinetuning for ranch uh trainingenvironment however uh when we constructNPI environment the DevOps engineer needto manually set up NPI worker nodes andinsurance part communication and morethis is not tribal efforts and it's hardto address those for all no all jobs inthis case how to construct suchenvironment on top ofKubernetes uh actually uh we canconsider uh two approaches um one is acube control execution pattern second isuh SSH pattern uh cube control patternexecute MPI initializationum throughout cube API server whichmeans easily set up due to unnecessaryadditional setup however this typicallyhappens cubernetes control planeperformance issues since this tries tooccupy an entire control planeprocessingabilities opposed to cube controlpattern we can consider SSH solutionwhich is a slightly hard to construct umdue to necessary unnecessary additionalstate setups related to SSH uh setupinitialization but uh this SSH solutionis safer for cubar controlplanes so this is why we use cube fortrainer um in the cube for trainer isresponsible for uh this SSH based MPIenvironment setup which allows DevOpsengineer to mitigate complicated setupsan alternative as uh as an alternativeuh introducing QR uh MPR runtime uh wecan consider leveraging directly usecubernetes job or job set but those donot automatically set up NPIenvironments which means data scientistsneed to manually specify infrastructureparameters for NPI it's obviously notgood user experienceso as Andre mentioned in uh what is cubefor trainer section uh cube for traineris ro oriented resource model andautomatically set up npi environmentswhich indicates data scientist can focusonly on training code and uh trainingparameter related uh infrastructureparameter like number of node uh numberof processes uhsomething uh this is actual range ofYAML configuration as you can see uhthey do not need to specifyinfrastructure parameters they justspecify trainingparameters additionally uh we havePython SDK as Andre uh showed in demo uhthis allows data scientists morePythonic style uh job submission wayopposed to YAML manifest styleokay uh in conclusion uhuh we introduced uh cube for trainer MPruntime uh this allows data scientistsand infrastructure engineers uh easilyscale up MPI environments or easily setup NPI environments and data scientistscan focus on training codesso this is future work uh we planning toimplement uh uh some feature uh so ifyou have any interested uh you can checkum this QRcode uh finally uh we have uh QR automand training working group and cubressbatch working group uh those workinggroup uh are working tightly uhuh typically uh so uh if you areinterested in those problem or newfeature uh you can join SH channel or uhcommittee meetingsome okay uh thank you for uh our talkuh thank you thanks everyoneuh We don't have a lot of time forquestions we happy to answer questionsafterwards uh right do we have time orno we don't have time all right so feelfree to reach out to us happy to chat abit more thanks around for time2025-04-15 21:58:17.808801 PP��`#��_AFnb1a5Kaxgosimple to use API and various tooptimize distributed workloads so firstof all let me speak about about datascientists uh so what we hear from themwe've been working there for the last 7to 8 years so what they really wantsthey want quick ability to write ML codeusing native libraries like torch speedjacks and scaleit tricky part because it's kind of liketo do multiple times sometimesassociated with the process so theproblem is like they kind of like wantto do multiple times without evenknowing it's running but the trick islike in reality there are a lot ofthings associated with this process forexample like they told us they need toknow how to start environment how toconfigure Docker images how to configurecompute resources or data accesssometimes they also need to learn moreabout API maybe they want to know aboutGaling HPC technologies or maybe evenknow about resource cues so all this wekind of talk infrastructure claim andthe goal of us as a community how we canactually remove it because datascientist stuck behind this group andright now we also live in a world ofgeni you know models become much morecomplex we have like huge data set hugeamount of data we have like alldifferent frameworks the ecosystem is umis just a lot technology API nickel bluehow we can provide a unified way to givethem you know ability to interact withall those diversity of frameworksand in the same way adopting newtechnologies like uh uh API and otherthings so what actually the end userswant they just want a simplicityflexibility and scalability and theydon't want to learn anything aboutkubernetes so how we can abstractkubernetes um complexity from them sofor this we actually do this projectcalled cubeflow trainer uh we introducedthis project in the last coupon in2024 so this project was kind of like anext generation of training operatorwhich we started in 2017 right 2017 andbasically this is we kind of separate uhdifferent services so we have a servicefor data science called train and wehave a service for devops engineerscalled training runtime so thedifference is is training runtime islike a blueprint that platform engineerscan use to configure those uh thoseconfigurations and data science can workwith the one Python interface tointeract with the jobs and then we'reusing the job set to actually performdistributed communications uh so withthat let me actually give a new thingthat we're going to introduce in thissession which is actually API runtimeand I think like uh the best thing toknow about API is actually doing thedemo about this new frameworks with MLXand GP um so with that I'm going to showyou actually two notebooks uh the firstone uh called distribute with MLX so forthose who not know MLX is the umframework specifically designed by we'regoing through why we actually want toleverage it here because MLX using APIfor distributed communication and we tryto see how we can make it easier forfolks to actually for scientists whowant to run those code actually scale iton top of um so the first thing I hopethe demo will go well live let's see ifthe internet is working actually so thefirst thing right what I need to do uhis I'm playing the role data scientist Iwant to train so this is the simple CNNmodel which image classification examplethe first thing I need to do I just needto initialize my trainer client and bypoint my trainer client to my local miniplatform so this is running locally inmy mini cube on my machine without anyyou know other you know advanced computeso next step is defining MLX code sothis is pretty simple like if you knowMLX API they have API for communicationlike world size rank then I getting mydata from amnest I distribute it acrossmultiple partitions uh then I definingmy mo�� working with MLtoday So we won't dwell here for toolong In short at the beginning we hadalmost no support for our datascientists It led to a scatteredtechnology landscapePeople were working either individuallyor in small teams often solving the sameproblem over and overagain Without a common developmentprocess we risked having a lack ofreproducibility and a lack oftraceability from ML model back to thecode data and parameters used to createitWhile the undertaking to build a commoncentral platform in a large enterprisesuch as ours is not always the rightsolution in this case we think that bydoing so we've been able to bring ourdiverse community of data scientists andanalysts together We have um built anactive community around the platformWe've led people into common practiceswithout being toorigid and we are able to solve commonproblems once such as the integrationinto the company's network while umfollowing the prescribed securitystandards So let's have a look at howwe've approachedit This is the big picture of Abacusthat we're going to talk abouttoday Previously the users had to startwith a blank canvasThis made it very difficult for them toknow exactly how to get started andthat's what led to the fragmentation andthe siloed way ofworking There was a lack of commonpractices and there were no templates tohelp them getstarted Experience also told us that ifthey were able to build an ML model thengoing into production was really reallydifficult and there was no supportavailable for the feedback loop used touh create successive iterations of themodelSo we divided the life cycle into threestages and we we talk about threeseparate user journeys The first milewhich involves on boarding to theplatform and creating new projectsduring day-to-day usage which is eitherproducing insights or training ML modelsand then the last mile in which peoplecan take their trained ML model intoproduction and then continuously monitorits performanceSo we'll talk more about each one ofthese three stages in the next 20something minutes but first we'd like toreflect a little bit on the day-to-daylife of us platformengineers So at the start we saw theplatform as a a large tech stack stylepicture representing a subset of theCNCFlandscape Here we show how each of theseuh individual projects is linkedtogether to solve some some taskUm but if we take away these arrows thenwe can see the same set of projects wesaw in the beginning and whether weinstall these projects with githtopswith helm install with cube cuttle applywe believe that this is the easier partof the this process When we firststarted weaving installed many otherprojects such as the volcano batchuler along time before we even knew whether weneeded them ornot But if we only look at the arrowsthis shows the integration of all ofthose software projects This is used toaccomplish a task or even build thedevelopment process around which ourdata scientists work And this is reallywhat helps us solve the illities that wesaw on the common challenges slideearlier Reproducibility traceability andvisibility This is where we introducesome control while at the same timeallowing for the flexibility so thatwe're not being too rigidSo if we now go and zoom into the firstmile this is the onboarding phase andthis is where we have the first touchpoint with the user So we want thisstage to be as frictionless and seamlessas possibleWe direct all users to a single URL andthat's where we have the Qflow dashboardand we've configured Qflow so thatwhenever they visit the dashboard forthe first time they're on boarded to theplatform Here we've also patched theQflow profile management components sothat we can integrate our own onboarding application into the flow andthis application creates all of theresources that our users use So itcreates the Qflow profile and theKubernetesnamespace It creates a GitHub repothat's generated from a template that'sset up well for datascience We create a container imageregistry a vault secret store and arepository in Lake FS which is an objectstore that provides good support forversio�ning of dataWe also provision the CI infrastructureused to build the Qflow pipelinecomponents and we use Tecton forthat Finally we also create an Azor ADgroup which is how we integrate into thecorporation'sidentity and access management systemand that's used to gain access to all ofthese servicesThe result of this process is that theuser receives an email which is above myhead here and that contains all of thelinks to those services so they caneasily get started and at this pointthey can launch a notebook server in apersonalsandbox and from there they can start toexplore the the platformUm so from the users's perspective whatthey see here is that they're able tomanage their profile either first-timeon boarding or creating subsequentprojects and to get access to theservice However from our perspective asthe engineering team what we see is thisis the tip of the iceberg If we lookbelow the waterline we're handling allof the enterprise integration We providecomprehensive documentation includingtutorials and getting started guidesAnd we ensure multi-tenant isolation Sowe deploy the network policy and the STOauthorization policies that mean thatusers can't see each other's namespaces And we deploy this infrastructurewithGitOps Finally we also enable FinOpsright from the start as we'll see now Sothis is the QFlow central dashboard andwe've tailored this to match our colorscheme We've included links to ourdocumentation and to our Slack supportchannel We've also added a card hereright in the UI that shows thecumulative costs for the last 30 days Soby surfacing this information in the UIthe the users can be a lot more costconscious We also added the ability tocreate a new project from right herefrom the UI So let's have a look at thatSo this is the new project page If weclick start we can enter a name We cansay cubecon EU2025 Uh we can set a Python package nameif needed or skip it We can select thetier which George is going to talk aboutin a minute and the minimum Pythonpackage version that's used to uh set inthe in the generated source uh templateWe won't wait here until the 40 secondsor so that it requires to create thenamespaceSo to summarize what we've learned bybuilding this on boardingflow at the start um we see that peopleact responsibly whenever they see thecosts So we're it's really nice to seethat our colleagues are reallycostconscious and um even if the monthlycost is less than a cup of coffee insome cases they often ask us to offboardfrom the platform in order to save thatcostUm and we think that that's becausewe've made the onboarding flow veryseamless and frictionless so that somepeople will say that they want tooffboard and then they'll come backagain whenever they want to to use theplatform We also learned that we shouldhave contributed our patches to CubeFlowWe were aware of this at the time butthe time pressures of getting intoproduction we decided that thecontribution process and reworking ourpatches to be more generic would havetaken too much time given that wealready have workloads running indevelopment Now we decided to prioritizegoing toprod However several years later wethink it's high time to pick up on thisand to start to contribute them back Wehave contributed several small patchesto the project and we found that theprocess was both welcoming and seamlessSo we definitely recommendit Andfinally we uh built our on boardingapplication as a service calling out toother services to create those resourcesIf we were to do that today we wouldhave chosen to use the operator patternso that we have the reconciliation tokeep those servicesrunning We were aware of that at thetime But again we decided to prioritizegoing into production and given theexperience we had in the team at thattime having a simple application callingthose API endpoints to set up theservices we believe that was the rightcall at that timeSo with that I'll hand over to GeorgeHe's going to continue with theday-to-day work ThanksSteve Okay so now that we've passed thefirst mile and we on boarded to theplatform it's time for day-to-dayoperations �And this is where the mostwork is happening for our users and forour team as well If we take a look atthe diagram we see two boxes thereannotated with insights and ML productSo what are thoseuh early in our journey we've discoveredthat you know if you add too muchfunctionality and give you you know toomuch to the user they find itoverwhelming and especially yesterdaythey were working in the comfort of thelocal environment with the notebookservers and the next day they werethrown into Kubernetes world with youknow deployments ports authorizationpolicies and whatnot and it was a littlebit too much We did have users you knowearly adopters as well of course thatloved this and loved experimentation butin general there was some resistantresistance So to address that we try tofirst of all abstract away KubernetesAPIs as much as possible but also tointroduce this functionality to them ina in a gradual manner So we've split theways of working into two tiers So wecall them insights and ML product So weprovide a subset of functionality andfeatures depending on the goals they'retrying to achieve But in both cases theyget uh some baseline to get started withlike for instance repository from atemplate and the UI via CubeFlow UI Nowlet's take a look at the HTRindividually First one is the insightsIt's the more lightweight alternativeout of two Everything we do starts witha source code versioning So in theonboarding we're creating a GitHub repofrom start template which structurestheir code after Python package and italso adds some boiler plate for unittest and integration tests Uh everyproject that onboards to the platformhas the same structure So it makessupport troubleshooting down line muchmuch easierusers then clone their source code intothe notebook servers and um yeah theystart working the same way they wereworking before uh on the laptops uh wepre-built notebook servers withdifferent u packages you know CLIconnectors depending on what we'retrying to do uh which are you knowcommon to to most of the users but someusers also cate their own notebookimages and contribute back to theplatform form So the way they do it isbasically a pull request with a dockerfile and then CI Docker file to ourrepository and then CI takes care of itIt builds the image it tax it it uploadsit to harbor image registry and thenmakes it available to everyoneelse They also get a spark cluster uhand they can schedule spark jobs fromnotebook servers But we find that mostof the time you know they get away withuh much lightweight tools with betterAPIs like u doug DB for instance Uhwe've done a lot of leg work initiallyto make sure that the platform is wellintegrated within the network at Volvocars and has access to most of the datasources uh but at the same time we alsoprovide and the gitlike data versioningwith uh lake FS which we integratedreally nicely or natively one might saywithin our platform So this tierprovides you know balance of simplicityand just enough reproducibility andfunctionality to do the exploratory dataanalysisum I don't know some other kind ofanalytics or you know reporting or adashboard and that's what is uh used formost of thetimes Uh yeah so now the idea has beenvalidated Uh you know we are past theADA stage and it's time to add someautomation uh and give it a bit more ofa production look and feel Uh here userscan upgrade seamlessly the project thatthey already had with the one pullrequest again or they can start a newproject the way Steve demonstrated in aUI and have a you know project on the MLtier ML producttier Uh it is also worth mentioning thatthis tier extends the previous one Soeverything you had before you had herehave here as well we didn't add the youknow the icons to to make it a bitcleaner but uh then we built on the ontop of the existing functionality Sofirst of all we upgrade the repositoryuh to hold manifest for arbitraryapplicationsuh of cubeflow pipelines and cubeflowcomponents So whenever they make acommit to their repo it kick it it sendsthe uh payload from GitHub web hook toTCON and in the CI cycle then we builtuh �the components for them We tag themwith commit sha we upload them to harborimage registry Then we also buildcubeflow components and compile cubeflowpipelines again tag them upload tocubeflowUh and yeah from this point forward theyare working in already now familiarenvironment of cubeflow where they useuh training operators hyperparametertuning with katip of course cubeflowpipelines where they do a lot of uh uhdata processing work and then eventuallysome of them deploy models as well fromthe pipelines as a case of inferenceservice and here we would like toemphasize the importance of the CIbecause if you're just starting you knowyou read the documentation and likelet's say cubeflow docs and uh you havea lot of this nice small examples of youknow creating container from function orusing components but you know how tobuild those components is left up forthe user So if you build a component youneed to make sure you tag it You need toupload it to the registry You need toupdate the references of the componentin your cubeflow pipeline You need tocompile it You also need to upload thattoo So it becomes very cumbersome andobviously very errorprone So that's whyit's very important to have a first ofall a standard structure that everyproject can reuse and also this rigorousapproach to to do all these steps inautomated and standardized fashion andthat's exactly what we provide in thistier We provide the standardization andwe support everything from the firstcommit down to the inferenceservice and uh yeah there are fewlearnings we can share here and first ofall it's very important to start simpleright do not over complicate things youknow don't try to provide all the bellsand whistles from the day one becauseit's just it will become a little bitoverwhelming to the user and yeah theadoption of the platform will also dropor will be slow then You need to designfor different personas You need to keepthings simple and add functionalitygradually Right not everyone will have aproject with a with the model inproduction Many projects are you knowabout the hypothesis testing or aboutthe idea validation or you knowreporting and so on And you also havedifferent users from different parts ofthe organization So they might have alittle bit different approach to howthey workYeah And uh it's also you know when whenimplementing this functionality it'svery important to balance the freedomhow much freedom you give on theplatform versus how much you knowrestrictions you impose on it becauseagain freedom to do whatever you want isuh will leave your user in the state ofuh black canvas as Steve pointed out andit kind of defeats the purpose of havingthe standardization in the first placeuh and also it can uh you know post allsorts of uh issues like for instanceunintentional network network openingright so you need to be opinionated inthe way you do things like in our caseif the user wants to create a serviceentry they need to create a pull requeston our repository and then some of oneof our engineers will review it and thenit's merged and uh you know since wehave a like pretty good support in Slackchannel this takes you just a couple ofminutes from request to to to themerge And last but not least the CI isreally more difficult than it seems Uhbecause given the multi-dimensionalityof ML space right you need to keep trackof your code changes changes and but youalso need to keep track of uh dataversion of your model version and youknow any artifacts that that you mightproduce or might use So you really needto take a step back and think about howyou want to implement the CI to keeptrack of all these things And I knowmaybe the the first good step is to havea dot file in the repository where youkeep the references of the versions ofthese artifacts and you know you updatethem in the CIcycles All right So now we proceed tothe final stage So we have the modelswe're happy with and we are ready todeploy them in production I'm sureeveryone is aware and I hope agrees thatyou know handovers are not great rightso that's why it was very important tous to make this transi�tion from previoustier to production as simple as possibleas seamless as possible so that userscan truly take the end to end ownershipof the entire life cycle of theirprojectuh and yeah and that's why the changefrom um from previous stage is reallyvery minimal There are two things thatthat stand out here like first of allthere are no inference services deployedvia cubeflow pipelines anymore All ofthe manifests they have for theapplications or for the inferenceservices are already in their GitHubrepositories from the previous stagesand now everything is deployed via ArgoCD When they want to roll out newversion of the model or the applicationthey just you know uh bump the the tagSo we make a pull request on the repomerge it and that's itUh yeah we also learned that you knowmany users deploy both ksurf uhinferences and uh like fast APIalongside it They do some pre-processingbefore they send payload to the model orthey simply add the UI layer toit The second noticeable difference hereis the the monitoring is the the thebottom box there Uh one of therequirements we've put on ourselves whenuh is that you know this system had tobe very flexible and extendable andthat's because many teams already havetheir own uh subscriptions like storagesubscriptions right so we wanted toprovide this option for them to bringtheir own storage for monitoring as wellso they would be able you know to justplug plug it in so to say right so we'vecreated a login service that receivescloud events from KServethen correlates inputs and outputs ofthese events with the X request IDheader and then ingest them into thestorage of theirchoice And uh yeah this approach hasproven to be very useful uh becauseusers have full control over how theywant to share the data and with whom andthey also have the ability to bring anyanalytics tools if they want to and workon the data directlyAnd of course we also have uh alertingin place So from Argo CD if the servicesgo out of sync we send Slacknotification to their Slack channel andalso from the monitoring service uh ifthere is a model drift for instance wesend notification to their Slack channelWe have the usual suspects uh with thePrometheus and Graphfana to knowvisualize metrics and on top of that wehave u teams get the ingress and the endto encryption TLS certificate renewrenewal out of the box right so theydon't need to think about it and theyprobably don't even know this ishappening it's just there in thebackground with uhSTTO so what have we learned from thisstage like first of all few ML productsget into production or through ML modelsget into production right and it's it'squite natural like uh first of all tobegin with not every project has the thegoal to have a model in production rightwe've already talked about this but theones they have they have you know theADA state stage first and havehypothesis validation they need to testit and then you know it doesn't itdoesn't make the the cutuh yeah the githubs is very importantand you know it was not news to us butuhyeah the moment users see it in actionit completely changes the way they theysee things you know first of all itremoves the overhead for them ofdeploying anything manually but theyalso love the reassurance of you knowthings um kind of out of healing rightor spinning up if something goessouth and production is is an iterativeprocess right it's not the stage thatyou reach each and then you call it aday and it becomes someone else'sproblem It's an iterative process whereusers need to maintain the model theyneed to monitor them retrain themredeploy them And that's why it is againvery important to have the seamlesstransition between these stages so thatyou don't have this I don't know bigbang handover orwhatever Yes And users need to have anenvironment where they can test theirmodels safely from you know further fromthe day-to-day work or like from theproductionworkloads But um you also need to havean environment where you can safely trynew features upgrade components upgradecluster without interfering with userworkloads So there are a couple of waysto do it right like first of all if wethink of user they can have you can havetwo namespaces for them one name spacefor production workloads and inferencesthe other one for day-to-day work andthey can just you know release makereleases to the production namespace Theother option is to have a dev clusterwhere you they have disparity they runthe same things in both clusters andthen you can make upgrades and observeif it causes any breakages for the userand if it's not then obviously youproceed to productioncluster and or you can have both rightso in our case we have both we havedefro namespaces and defro cluster aswell so to conclude our talk uh I wouldlike to emphasize and reiterate over acouple of points points like first ofall if it was not clear enough andexplicit enough you know integrationtakes much much more time and effort andenergy than installation You can helminstall keep cuddle apply new tools andyou know have a nice example up andrunning fairly quickly but it's nowherenear when the when you need to dosomething in production and have aproduction ready system because like youneed to think a lot how these componentsinteract and work together within yourplatform But you also need to think howyou integrate the platform with the restof the company's ecosystem with all therestrictions that might be imposed onyouThen once you have the platform you needto know how you need to think how youmaintain it and how how you grow it Youmight have all sorts of projects withdifferent use cases with differentrequirements that you have not thoughtof And then you need to you need to beable to add new functionality you knowswap the components that you alreadyhave Oh it's uh we're running out oftimeYes Yeah Uh I'll wrap it up thenBasically you need to build a communityaround yourself and you need to treatyour platform as a product because atthe end of the day that's what it isright you have a platform as a servicewhere you have users running theirworkloads So you need to do a productwork a little bit and you have thiscapacity with your engineering team thenthat's excellent because you haveengineers that know the product know theuser you know understand the pain pointsknow exactly what needs to be done tomake it uh make it better And if youhave a good community around you thenyou can all grow together You can learnfrom each other and you know just helpeach other out With that I would like tothank you and I hope you enjoyed thetalk Good YeahWell thank you very much for today'stalk I think you left the best for thelast day So very very nice I uh I'vebeen working with Cubeflow in my companyfor four years and I have this uh addedresponsibility of building the pipelinesof uh building the inference services Sowe have a little bit less time to workon only the platform This is like partof the stuff we do Uh and I found thispart that you did at the beginning whereyou like you know you customize even thethe creation of the name of the of theproject sending them an email like thisthese GitLab or GitHub templates Is thatsomething you're thinking of like opensourcing now that you're looking intoum you know more into the communitybecause that will be something thatwould be amazing for us And uh evenchanging the color seems silly but itgives us this personal you knowpersonalization aspect and I think it'suh pretty inspiring by the way andthanks thanks a lot Yeah we've beenthinking about it and we would love todo it but uh what Steve pointed out isthat you know it's not asu it's it kind of tightly integratedwith the onboarding application that wehave right but we maybe can you know tryto open source both right and then itwill be like a package of you knowcreating the project from the UI but youalso have a back end to to do all thesethings for you Yeah Or or even like someblog post that like gives the the recipeof like this is how we did it because Iyou have this knowledge andum I don't think anybody I haven't seenthis uh to-do like how the steps so thatwould be awesome So yeah Yeah Of courseThanks a lot Yeah Thank you Thank you2025-04-15 21:58:18.340645 ��(a#��Afnt3f8sWJLAhello Can you guys hear us yeah AllrightGreat Yeah Hi everyone My name is GeorgeI'm engineering manager in ML platformteam at Volc Cars This is my colleagueSteve Hello Uh today we're going to talkabout Abacus which is the platform we'vebuilt internally at VolvoSo today anyone just within a couple ofminutes can onboard to ML platform spinup notebook server start validating theidea and uh without you know approvalswithout any waiting times without theman in the middle and uh since we'vebuilt it on top of Kubernetes and cloudnative stack we thought this would be aperfect venue to share our learningswith you and u I'll hand it over toSteve to talk a little bit more aboutabout the platform Okay thanks George Sothis is our technology stack Um theplatform itself Abacus is the top layerhere and these are the components thatwe're going to talk more abouttoday However before we start we'd liketo acknowledge the other developerplatforms at Volvo Cars on which we'vebuilt Most notably as a smallteam we benefit greatly from the commonenterprise container platformThis is a common codebase which deploysa number of Volvo cars Kubernetesclusters running a myriad of thecompany'sworkloads This allows us to focus on ourprimary concern which isML As you can see we're fond of opensource and cloudnativetechnologies And Abacus is built on theQflow ecosystemUh we've added other products here toeither complement Qflow or to integratewith the corporate network atVolvo So let's set some context first bylooking at somenumbers This gives an overview of thetype of scale at which we're working inour company The highlights here are thatwe have approximately 200 monthly activeusers We have been running productionworkloads for around three years nowincluding some at the start on anon-production cluster which wasn'tgreat But the figures that we are mostproud of here are those around thecommunity So we use Slack for bothannouncements and for support We believein transparency and so we solve oursupport issues out in the open and it'sthe engineers who build the platform whoare talking to the users So we don'thave any first or second line supportUm this is really good because itcreates a bond between the engineers andbetween the users So we see that someusers they even help out with thesupport requests ofothers We also have 48 contributors toour our inner sourcedrepo and we promote that anyone withinthe company can suggest a change simplyby uploading aPR So that's a little bit about where weare todayUh let's take a look at where did westartfrom So this picture is probably fairlyfamiliar to many people��ing and readingcomprehension in more and more beingadded to the list So in fact it's amust-win battle for us to efficientlyadopt and scale out AI to the entirecompanyAnd AI of course exists on a spectrumfrom very commonly available or citizenAI things like chat GBT or co-pilot thatwe all love working with daily for textsummarization code generation and muchmore other end of the spectrum howeverin research we have very hetrogeneousdata and due to this very varying natureof the data we often need much morebespoke or custom AI models that can beknowledge graphs it can be things likealpha fold where based on an inputsequence of amino acids we can predictthe three-dimensional structure of aprotein and then there are things inbetween such as forecasting models orchatbots and each of the steps in thespectrum comes with its own uniquechallenges in terms of hosting addingconnectivity compute dataetc we can apply AI through our entirevalue chain from how we best communicatewith doctors or other healthare careprofessionals to how we predict andprevent stockouts through mixedmarketing or sales uplift models We canuse image recognition models inmanufacturing to detect broken vials andpull out these defective vials early inthe process or in development tooptimize clinical trials However some ofthe most exciting use cases can be foundwithin the discovery phase Generative AIcan scan enormous amounts of potentialtargets finding leads with just theright attributes producing a betterqualified sample of leads forpreclinical and clinicaldevelopment How do we then get there toscaled AI we are working with thisequation where we backtrack from thebusiness value we want to obtain We arestarting from a foundation of fair dataSo data that is findable accessibleinteroperable and reproducible Then wesit together with people with the rightmindset that they are willing to adoptAI in the daily way of working thesecrossf functional teams We work withthem typically in shorter twoehackathons or rapid uh iteration sprintsand deliver for them thestate-of-the-art infrastructure to scaleout AINext we will talk about container imagesbecause they are a little bit specialwhen it comes to ML training andinference as they can often be quitelarge and sometimes contain data aswell ML images can be really large 30 GBor even much more and that poses achallenge due to latency So our firststep we take is often to try to slimdown the size of the image However weoften still left with really largeimagesOn top of that there are several stepsin the containerized applicationdeployment process from part creation tonode selection to then checking if animage is already available or it wouldotherwise need to be pulled And all ofthese steps comes with their own latencyand overhead So our solution to that isto use Harbor as a proxy and cachesitting side by side with our compute Soright next to the Geon cluster andintegrated with JRock Artifactory SASsolution that fits with our corporatestrategy of being software as a servicefirst and if that's not immediatelyavailable we build our own solution incloud otherwise onpremise To get anything meaningful outof your models after they are trainedyou need to of course train them on highquality and correct data And while youcan get your data to say pretty muchanything you want that can of courselead to some really bizarre outcomesThis is a early version of Google'sGemini model it was trained on prettypoor data coming from a lot of Redditpost and when probed for what's the bestpizza topping it would suggest glue asthat gives your pizza a nice texture Weof course want to avoid that and ensurewe have the right correctness andquality on top of just the fair dataprinciples Similarly important dataaspect is that notion of data gravitycoined by Dave McCroy back when he wasat VMware around 2010 He's now CEO ofdigital realy That's the data centercompany that's hosting Geian Heintroduced this concept of data gravitysimilar to regular gravity It has thisintrinsic pool of other related dataapplication services and we want toallow for that because it �can greatlysimplify security aspects and boostperformance as well Some of you mightrecognize this formula is similar inshape to Newtonian gravity I'm a nerdyphysicist so I enjoy equations like thatBut um how does it actually lookglobally where we want to make the dataavailable for all of our users well weintroducing this three tier datastrategy It's based on the data usagefrequency So on top we have a hot tierthat's for data that needs to beimmediately available There we are usingVea storage We then have a warm tier forfrequently accessed data There will gowith cumulu That data will have a littlebit of redundancy to it and it will havea slightly smaller data gravity poolthan the hot tier Finally there's a coldtier for historical compliance data Wejust need to keep there We'll have anarchival solution and that of course uhdoesn't need to be uh accessed thatfrequently It has the lowest datagravity pool and it can be fewer morecentralizedlocations Here we'll hand over toMario's telling you more about userjourneys and related Thank you Gustav Umso right now we're reaching the middleof this presentation So um now you haveyou have at least two major informationThe first one is uh is of course the umwhy AI is important for noises about allthis data getting more and more and weneed tools to process them better andthe second one is a technical foundationthat we are also having uh but you'regoing to tell me what about the journeywhat about the users sorry who are theywho are our users this is exactly whatwe're going to cover now so I would likeyou to to to to join me and uh and thisis a she's a she's a researcher at noonnoisesk she's um looking for a for theperfect AI platform where she can um umenable AI to empower drug discoveryRight so she breathe science she sheeats um chemistry and then uh she uh shedrinks machine learning right okay Sohow does it look like so is looking fora platform where she can authenticate ina compliant way right um dep use herfavorite tool that's out there It can bea Jupyter notebook for instance um trainor or deploy large language model umbased on specific data sets track theexperiment of course and then monitorthe the execution on on real time Whenyou take that kind of problematic andyou scale it on on hundreds of userssome key features of the platforms somekey features of the platform becomesvery critical right so what if theplatform can seamlessly display umhardware and then provide small smartresource allocation helper help heractually in a in a workflowum but you're going to think about thatis a she is a a researcher so we havebeen talking about resource hardware umscheduling and stuff it sounds likesomething that she doesn't know so hermain focus is is contribute to drugdiscovery and she cannot be a DevOps orplatform engineer It just doesn't workHe end upsups being something cute rightbut it it it doesn't work very well Sobut you might ask if she's in the rightpath but she's definitely is becausenowadays we have tools for that We we wehave tool to make sure that she stillfound the best tools that to work withintegrated all together and she can andshe can nowadays hope for self-serviceand innovate faster right then that canbe Domino ML ML MLflowetc Now that you have an understandingof the journey of a let's go togetherand see how we can help her withinfrastructure Have anyone in this roomheard about Gizion it maybe run the handOkay not too much Good Well Gishon issitting in Denmark as Gustaf said is the21st um largest superco computer in theworld in the second happiest country inthe world So yeah maybe there's somecoloration Maybe at some point theywanted to process all of that happythoughts But Gon is basically a DGXsuper pod DJX stand for deep GPUaccelerated is basically a uh asupercomputers meant for AI at scale Ithas almost 200 of H100 nodes and that'seach of them has like eight core of GPUsalmost 700 of memory random accessmemory and all interconnected in with awith infinity band quantum 2 which we'regoing to cover later on So uh in NovaNordisk because Gifion has been built byuh DCI Danis�h center of uh AI innovationand with Nova Nordisk foundation ourimmediate u objective in Nova Nisk is tobuild an inference cluster sitting closeto that hardware and we can we can seethat we're going to have inferenceworkload sitting on the top and as AI isbecoming more and more um famous withnew trends and so on they they cannotjust wait to have access to those GPUsright but as you can see that there'salready a gap is it's is simply too farThey won't go there and plug the thecomputer to the to Gion and expect tohave the workloadrunning So before we get into how wesolve that that uh proximity problemlet's dive a bit more into uh thespecificity of Geishium Why why why doeshe exist why this hardware and actuallyyou heard about H100 H200 H stands forhopper not hell Hopper architecture Funfact about about Nvidia They like toname their infrastructure after umscientists You know one of this we useNordic vlogist I don't know if you heardabout by frost at the observabilitycommon last year um is one of themHandle I mean the team might be hereWell anyway the whole architecture comewith three major feature The first oneis confidential compute Second one ismulti- instance GPU very good forinference and the last one is scalableinterconnect very good for training Andwe're just going to focus today on thelast two onesIn that note we try to simplify the uhthe the scalable interconnect intosomething that you might understand Solet's take this example this analogythat we have two teams one sitting herein in UK the other one sitting in Franceand uh because and because they are verygood at um at um finding I meanbasically I want to what I want to sayis that these teams are very good atfinding new molecules and they have tostart by processingdata and in order to make sure becausethey have to collaborate and they haveto collaborate together but the bummeris that they have there's a coordinatorsitting somewhere somewhere where itonly accept fusion protocol That guy isa whole school guy right so here you cansee that no matter how good they are atbeing scientists researchers it alreadyrepresent a bottleneck issue What ifthey could communicate instantlybasically it's kind of the same thingwhen we're talking about um GPU GPU toGPU peering Um on the legacy system youhad what was called the PCI expresswhere um if you have to have one GPUcommunicating to another one in anothernode it will go through the PCI expressetc And that was quite limited in termsof speed and today thanks to NV link NVswitch um technology you can achieve atremendous amount ofspeed Let's look also at the uh otherfeature MIG multi- instance GPU Nowthese two teams are sitting in DenmarkOkay not in France not in UK anymore Uhand it's not because we don't like thesetwo country Maybe we just wanted them topay taxes in Denmark right but anywaythey got into a uh they they got intothe same building They still have tofind molecule but this time they theydon't want to collaborate with eachother So each of them get their dataanalyze it but they have like thisamazing lights in the same office inDenmark And when one team books the thethe the lab the other one cannot bookthe lab But what if the lab can be canbe fragmented in smaller chunks insmaller partitions of the uh of ofsmaller rooms where they can both worktogether based on a capacity That'sbasically what the multi- instance GPUis solving based on any any um anyrequirement of a workloads you canfragment your GPU and uh and and andallocate it to a specific workload andum and that's very actually very goodfor inference and it's uh it solve theproblem that we have with uh timeslicing where um it's more difficult toshareright all right now that you understoodthe specifics of the uh hopperarchitecture let's go back to how do wehelp her well heard know that there'ssome scheduling monitoring she doesn'tshe doesn't really feel happy aboutdealing with those She know that there'sa tool that might help with those butshe just don't know how to name itThankfully we know how to name it It'scalled Kubernetes right and it's goodthat we talk about it� now because Ithink it's the 29th slide It's been 18minutes and we're Kubernetes conferenceI think So um so that's exactly what weare going to do um on this specifichardware um on um H200 nodes and CPUnodes We're going to build a Kubernetescluster Houston mentioned thecriticality of having a the modeldelivered fastly So our running as aproxy close by and because we are a verylarge company and we have many differenttype of profiles we need to have somekind of workload managers that we'regoing to cover a bit later but not toomuch in details because has been coveredby by professional uh throughout this uhcubecon and uh of course because we dealwith Nvidia hardware we need to have theNvidia's operator the GPU one thenetworking one you know what itis so the importance is very importantto kind of remember why we we needworkload manager because when you whenyou have to deal with a lot a lot of HPChigh performance compute you don't comewith those with those critical featureof scheduling in Kubernetes um so youwon't have multiple queuing system um umit's kind of difficult right so that'sthat's why I will recommend you to towatch the um the the talk from team twofrom sketch MD two days ago I think umabout slurm getting is explaining whyit's very important to have a strongerworkload manager when you doHPC All right so now let's let's go intothe demo Uh we prepared something foryou Uh it's been very difficult becausewe changed it at the very last timeThere's been amazing demo so far It'sthe last day So we're not trying to wowyou today We're just trying to help Edayou know something very simple Uh sowhat we're going to do um we're going tohave a short overview of the networkinguh the the Nvidia operator sub sorry theGPU one specifically deploy LLM on usethat use GPUs and have a a little bit ofuh the awesome platform that might mightI might be speaking of So I'll just exitand all right so this is what weprepared So I'm just going to show you abit theinfrastructure So we have few um umdifferent type of node pools This is theAKS cluster Asia Kubernetes servicecluster We have two node of uh of GPUand several of CPU and um all arealready called on in the cluster So theyare working everything is ready And nowwe're going to look at the specific umGPU nodes Um they the networkingoperator is running on those on thiscluster So it automatically does what'scalled a node feature discovery labelingthe nodes and and and then identifyingthe hardware and such like and puttingthat at at label in the notes Perfect Sohere we have v 100 nodes scrolling a bitdown So we're going to deploy I thinkyou also know a bit what it is um fromuh there's been a talk about it two daysago as also I think um is going to useuh um these values helmshot values to bedeployed Uh we deployed a shot I thinkit's already been deployed but let'sjust have a new revision revision six Wehave this endpoint It's working So let'ssee if you have a model I think Ideployed a llama tree Um that's the caseWe have the latestone And let's seeif using GPU We have as uh as putting inthe hand values one core of GPU is usingthat And let's just run one question toLama 3 That's what we did That's what wewant to do So I'm running I just ask aquestion I want to deploy LLMs inKubernetes using awesome platformbecause we're supposed to help EDA Whatare the option out there processes givesme back an impute output sorry andsomehow I can't uh scroll down but uhhopefully I don't think there's uhthere's any mention of the platform thatwe prepared for EDA which is called runibut the importance of this demo is thatRita is not a platform engineer socubicial access forget about it noteforget about it all this kind of stuffforget about it so she's looking for asomething more user friendly and then II think we this is where we have thisplatform for her that's called runaiThis platform has been um has been uhacquired by Nvidia Um so she canauthenticate like shewanted Yeah Perfect So she see thatthere's two GPUs on the node We canallocate some fractions to there to herShe can see all the works running Andthen if she wants to also deploy LLMlike we did she doesn't have to gothrough whatever She goes to what'scalled a workload manager and she selectthe workload She if you want toinference on Melm she just clickupsEveryone loves clickups It's easy Andshe select her model and and then bam isdeployed Something that I would like toshow you uh is that um uh it emphasizedthe uh what Gusa was saying earlier onabout the um about the importance ofhaving something sitting as close to thesomething as a as a as artifactorysolution sitting close to the to the tothe hardware because on this schedulingI don't know if you can see it correctlybut it took like 20 minutes or more toget just the image So if you'rereleasing new image all the time you youreally want to wait 20 minutes all thetime So if the image is already cachedon the star in the cache on thesomewhere you fetch it instantly Rightso that's also one of the technicalchallenges that Gusta was talking aboutAll right So that's that's ends the demoLet's go back to thepresentation Uh prefer slide Perfect Allright I have to click here and we go Allright So perfect We already saw that Butof course we did when of course we arealso supposed to support all theworkload managers such as uh becauseworking also with HPC so with many teamsum we uh we have we have better waystoday to to do to not do staticallocation of GPUs they are they arethere I won't go into details is alsobeing covered um um yesterday I think byuh by um two people from uh one fromyeah IBM and the other one from Apple Ithink um sorry I don't remember thenames that's fine Uh but where you canbasically do what's dynamic resourceallocation is basically the words like apersistent volume you claim an amountand when it's not used etc All right Sothat's basically how we help EDA intheir journey providing this part thisum AI awesome AI platform that we callrun AI for in for us It might change inthe future but um that's how we have todo that and I will give back to toGustaf talking about the f the futuredirection the last chapter Thank you somuch Marios So let's round off with aquick look into the future and um nowthat the models have been trained Weneed of course a place where we can hostthem and people can access them as wellSo we are building this inferencecluster named escard right next togithion On top of that we also havetools kubernetes cluster that hosts thisnice run.ai user interface we just sawto give the users a a better experienceSo let's finally take a quick look atthis hybrid cluster architecture wherewe have hetrogenous bare metal mix nodesin the bottom where some are morefocused on GPU while others are moreoriented towards CPU compute On top weare hosting control plane differentsystem tools amenities controllers andthis spreads over a range of operatingsystems due to us getting some oss fromNvidia DGX Ubuntu and then we are usingUbuntu for our own VMs Then when itcomes to Kubernetes this can be acombination of container runtime in cubeadmin or for more hosted platformsomething like rancher kubernetes engineRK When we need to do development incloud we can of course use Asia's AKS orAWSEKS Then on top we host different systemtools as Mario mentions That's thingslike Harbor for the images Argo CD forGitHubs etc Because many of our usersthey are of course domain experts intheir respective research fields butthey might not be as familiar withKubernetes So we want to make sure ourclusters are as nice and easy to use forfor them to get started aspossible This opens up a lot ofpossibilities for us either hostinglarge language models directly or thosewe cannot host we can add connectivityto For example through run.ai AI We canalso add connectivity to hogging faceand a lot of other options Then we'llalso focus of course a lot of these uhspecialized medicinal models that areneeded in inresearch Uh with that we want to thankyou and give a special thanks to ourproduct owner Simon He was a huge partin helping us prepare for this talkThank you for that And uh we are readyto take any questions if you might have2025-04-15 21:58:18.887405 YY��b#��MAFC5TAGsBbRQhello and welcome to our presentationabout extending kubernetus with AIcapabilities My name is Costto Rasmusenand I'm joined by my colleague MarioTanava He's our senior platform engineerand I'm tech lead of ourcontainerization team at Novanoriskresearch anddevelopment We'll begin with a quicklook at the agenda I will introduce youto the company and how AI is relevantfor us Then dive a bit into ourtechnical foundation of container imagesand some dataperspectives Then I will hand over toMarios who will tell you more about Geonthe new exciting AI supercomputer basedin Denmark that's currently ranked the21st in the world based on the global500 ranking Marios will tell you aboutuser journeys challenges we encounteredand solutions we came up with And thenwe'll round off taking a quick look intothe future and some questions at theend So who are we and how can we benefitfromAI we are a large pharmaceutical companythat specialize in treating diabetes andobesity but also other therapy areassuch as non-alcoholic fatty liverchronic kidney disease cardiovasculardisease and some brain disordersParkinson and Alzheimer We also treatsome more rare endocrine and bleedingdisorders such as hemophiliaAnd as IT professionals we are thenbuilding the technological platforms toenable our researchers to innovate andimprove on these uhtreatments We have a global organizationwith expertise and knowledge across theentire farmer value chain Several of oursites operate as transformationalresearch units being semiautonomous inrelation to the global R&D organizationThis allows them to make datadrivendecisions quickly and focus theirefforts onresearch Data is growing rapidlyglobally at an exponential pace It'sshown here annually insettans to go through Luckily AI is verycapable of scanning massive amounts ofdata and it's starting to overtake thebest human performance across a numberof areas such as image recognitionlanguage understand��plicationlevel metrics following the opentelemetry specification about differentservices your application may carry likeHTTP gRPC Kafka etc we also provide uhtraces at uh following the opentelemetry specification also applicationleveltraces uh the network as I said is notinstrumented as a single program thatdoes everything you need to installmultiple programs in different multiplesmall programs in different parts ofyour uh of your systemuh those programs goes from programs inthe traffic control of the of or theexpress data path to get informationfrom from IPs ports packets Ethernetaddresses interfaces um they will giveyou information about L3 and L4 layersuh but this is very limited toconnectionsuh with K probes in the kernel you canget more information about IPSconnection uh host names and so on andthen if you go to the application levelwith probes you can get richerinformation like which protocols arethey using and even for each method callarguments payloads return codes SQL etcanything that is visible through theinternallyuh but instrumentation even this is notuniform I said it's not just installinga program and do it but it's not alsoinstalling four programs and you haveeverything because as we said it isplatform dependent you need to know thebinary uh organization you need to knowthe runtime organization and you mighthave different applications in yoursystem using different technologies sofor example for C rust and python PythonPython I I mention it because even ifit's an is interpreted it interacts uhin in with the operating system uh inbinary level with sys calls you mightneed a set of probes but if you want toinstrument go and java you might need tohook your probes in in other points orin in in other parts of yourexecutable because for example some goand java handle some libraries that uhother other languages rely on on theoperating system while Go and Java forexample for TLS or secure connectionsbundle their their ownlibraries basically with all these setof programs what we have is a bunch ofpelpieces but we need to join them to givesome sense to them for network metricswhat we what you get is every time forevery packet a program is is is run andextracts some information from yourconnections source IPs ports destinationIPs destination ports also otherinformation like payload size in uhsystem interface and so on if you wantfor example to know how many bandwidthis is uh you have between two endpointsin your system is relatively easybecause you you have source IP sourceport destination IP destination port soyou can groupuh you can you can you you can group allyour uh packets network packets andaggregate the this databut uh and you uh one example of metricwe have is BA provises network bailnetwork flow bytes that for twoendpoints it provides how many bytes aregoing in a given direction but forhigher level applications it's a bit orhigher higher layers application levellayers it's a bit more difficult becausewhat you see is a bunch of or or a aof events coming from different programsthat means coming from a socket comingfrom the TLS library coming from theATPS library coming from a Golangexecutable and you get a bunch of eventsbut what you don't really care about theevents themselves are not so useful arelike individual puzzle pieces what youknow is what you want is to know thewhole cycle of a for example an HTTPrequest when it starts when itends what method has called uh what itreturn and the those areuh pieces that come individually youneed to merge them to jointhem in classic webservers it's there is an easy way it'srelatively easy when I say easy is likeokay it's not not not rocket scienceuh you have a thread ID an apparentthread so you can relate you can getthis information from from EVPF so evenif you get events from different sourcesyou store them in a in an EVPF map andthen from the user space you can readDCVPF map uh DCVPF map and join them bythe thread for example classic classicold style web servers accept aconnection create a new thread for thatconnection and and then uh per carriestheir request from �that thread so usingthe thread ID even if you have multipleevents in the in the queue using thethread ID you can group which events uhbelong to a single HTTP request so herewe might know that this HTTP request isa client side request because uh we wesee this HTTP payload in the socketsubmission so not in the socketreceiving but in the socket submissionso we know it's at client side requisitewe know it'sHTTP just inspecting the payload that isextracted from Ebpf you can even knowthe the method the HTTP method get usersuh you can you may know the payload sizeyou may know the response code becausefor that same HTTP for that same threadID you you you have another event lateruh at time 143 is just a random time uhwith the HTTP status and you can knowthe payload time and you can even knowthe total transaction time since theconnection is created until theconnection is is closedbut uh luckily but unluckily for us uhmodern web servers uh don't work thatway they use an even loop they have alotof they have a they have a small pool ofthreads and they carry request from thatsmall pool of of threadsso in uh we need to also apart thethread ID is not bell anymore because asthe same thread might carry multipleuh concurrent requests uh so we need tohook also in some implementationdepending functions so we need to addextra EVPF probes for example for a gostandard HTTP application we need tomaintain a tree of parent child goroutines so we know uh uh we we canrelate the different events and can jointhe pieces according to the parent childgo routines in NodeJS you need you havethis a sync ID which is a a similar IDin Kafka you need to maintain a tree ofpointers to the different messagehandlers so uh things start becomingmore plat uh more implementationdependent we need to support or or toexplicitly add support to new or toother uh to to new frame frameworks uhalso this has a problem that is if youhave a library if the library orframework internally is updated forbecause they optimize it they find abetter a more performant way to do ityour instrumentation might be broken soyou need to send us a GitHub issue andwe will fix it as soon as we canso uh summarizing what we providenetwork metrics network level metrics L3L4 are robust because they are based ona stable APIs uh basically the Linuxkernel TCP IP library and the binaryimplementation is a standard uh so it'sit's robust but it provides very basicinformation basically source and destinhow many flow how many bytes are flowingbetween two endpoints in your clusteruh application level metrics providesricher information like methods or mightgive better insights what's going oninside your application but the h theinconvenience that it needs to implementexplicit support to any newimplementation or library of a protocoland it's relying on internalimplementation details that can changewith time so it's broken it's alsoimportant to provideuh trace context propagation this isthis is an example of a single tracethis means that for example if a if aservice gets a request and under theunder the the context of that requestinvokes requests to other services uh wewant to get traces for example here wesee a Chrome client that is invoking afront end proxy this front end proxy isinvoking a front end and you can get abetterview this is implemented using atransparent heerthe this transparent heater is providedby multiple instrumentation SDKs or fromthe libraries that basically specify thehave good knowledge at the at theplatform level or at the code level havegood knowledge of which request isrelated with which other client uh subrequests the problem is doing that inEVPF since EVPF as I say is doesn't havehas limited application level uhinformation one way to do it is uh usingan EVPF program in the network stackthat when it detects for example in thatcase an HTTP request it will punch ahole in the in the HTTP buffer giving aspace for adding a transparent heaterthis transparin is known is alreadycaptured in another probes for from fromthe request and using thread informationand so on can relate that but this wo�n'twork in TLSuh because in TLS we cannot modify a aseiffort secure payload so we areproviding a a novel approach based on IPcontext propagation based on IP packetsso when we have an IP packetuh we punch a hole in that packet andadd uh an IPv4 options uh it's it's it'sa part of the packet we expand thepacket and we want to we pass BA passesthe context in that packet however thisis a trace parent ID uh and this doesn'tfit in this this IPv4 options is verylimited a trace parent this is what ispassed in ITP header for example has atrace ID ID and a span ID the trace IDis the group identifies the group ofrequests and the span ID is theindividual request so what we dois we remove all the information that isnot the trace ID and uh pass this traceID as an IPv4 in the IPv4 optionsuhpacket and then in the this is in theaddress when the request goes out butthen in the ingress we reconstruct thetrace ID but instead of inh since wecannot have we cannot inherit the spanID we we relate it with a given requestusing the span ID as the TCP len and TCPact which we expect to be unique so thatway we can overcome secure connectionsand pass context the inconvenience ofthis is thatuh this works with this allow bailinteracting with other SDKs all opentelemetry SDKs while this requires thatall the services are instrumented withbailthanks Mario for the detailedexplanation now I going to proceed totalk about how uh we make all of theseum network monitoring sense in thecontext of Kubernetes because that's whywe are here for CubeConuh so what BA is at EBPF level as Mariomentioned is that it's like very basicinformation at network level we workwith addresses and ports either forsource and destination so we have thiswith graph and for um applicationmetrics in this case for example we haveHTTP server request duration uh we areable to trackuh how how is the latency for this routebut in this case uh most of uh as aservice name we capture Java which isthe command line uh for this um for thisapplication but this is not enough forus what a Kubernetes user needs uh inthis in this case for for the case of umnetwork metrics is which Kubernetesservices are calling uh which so uh forexample we have um um a service frontend in the name space app that iscalling a database service in thestorage uh name and space and for thecase of application metrics we want toknow that that um service that it wasrunning in Java was actually theinventory service running in the backendname space how do how do we handle allof this well uh Mario mentioned this inin the previous session but I just goingto recap uh we we use a Kubernetesabstraction called informers which uh uhis provided in the Kubernetes API goclient and basically uh it allows us tosubscribe to events happening in theKubernetes cluster in this case we aresubscribed to pots services and nodesand the cublet is going to respond usevery time there is updates in when youcreate update or delete any of these umentities then in bailout we maintain u amap of IP addresses and and resourcesand process ID and and resources andlike be careful with this because it'sthat uh like if you have a cluster witha few nodes it's fine but if you have acluster like us with thousand nodes wewe can create an outage so it's betterto use the cache that Mario mentionedbefore so yeah so how do we match all ofthis uh process process information withKubernetes information so what with APFsees at BA uh instrumenting anapplication is very basic it's just theP and the command as we've seen beforeuh and but in Kubernetes world we havewhat we want to see which is the namethe name space and all of this stuff sothankfully we we are able to thanks tothe process file system go to uh withthe P to fetch um specifically the croupand with the croup we are able to relatethanks the container ID um the processinformation with thebot okay uh we can also do uh someinteresting things like track in zonetraffic that means that uh we can uhthanks to this uh network observabilityuh inspect which zones are callingothers in in in the different clustersand this is uh very interesting becausewe can u calculate how much is the colike we can infer the costum betweenzones yeah so this is basically uh wejust go to the port and we are able toto get the node name and in the node uhthere is this label topology kuberneteszoneand fetching this data we are able toenrich our bail flow network flow byteswith a source and destinationuh zone and we can even create a newmetric which is the the total of bytesbetweenzones uh we are also able to uh trackexternal traffic for now we were justlike uh tracking like traffic fromservices to service inside the the samecluster or uh crow zones but we are alsoable to track external traffic forexample here we have a ping togolang.org uh it's going to return anaddress and we can do an S lookup to getthe the host name for that IP in thecontext of bail let's imagine that wehave an instrumented application hereand this uh we are doing cool request orHTTP request whatever to golang.org andthis goes to the DNS serveruh this is going to return someinformation like the the host name andthanks to BA and EVPF we're going to tapinto this uh call and we going to reachour network flow bytes with uhdestination name instead of an IP we'regoing to say the the DNS like the hostname here yeah the the good part here isthat using EVPF to sniff onto the DNStraffic allows you to get the actualname if we were using a name space lookup we will re we will decorate insteadof goland.org we will decorate it withthis give matt whatever whatever that iswhat is actually returned so thanks toevpf we can provide a more accuratelocal reverseDNS uh this is uh very cool we aregetting um a lot of information thanksto the network uh monitoring but uh ifyou enable bailine yourcluster is going to start to explode thecardality because there's a lot oflabels and a lot of metrics that we'restarting to enable but thankfully uh weprovide in in in in our config like away to for a specific metric indicatewhich labels we want to include or wewant to exclude from the collection ofmetrics and uh you maybe are wonderingwhy what's the like uh utility of all ofthis so thanks to uh BA and the networkmonitoring we're like providing the likethe the data to uh one of uh othergraphana products which is graphanasearch and one of the the features thatuh search has is this uh entity explorerwhich uh allows you to see in all theentities of um of your cluster how areconnected each other in this case uh wewe deployed uh in our cluster uh thisproject from open telemetry called opentelemetry demo we showcase ane-commerce where there are many uhmicroservices calling each other so wedeployed open telemetry demo in our in ain a cluster we enable BA and we havethis so you can see how the differentservices are calling each other and thenyou can have a a better understandinghow is the topology of the network inyourcluster uh this is like very importantbut thanks uh thanks to the shoulder ofgiants like that's how we build bailinitially bailout was a fork of opentelemetry go instrumentationuh that was covering the applicationlayer um and with time we evolve it towe added more features and we wentbeyond go and now we are able to doapplication monitoring for any kind ofprogramming language and for the networkpark we uh thanks to uh we did asuccessful proof of concept after ahackathon umuh taking inspiration from this projectcalled net observe IO and we basicallyBA is doing the combination of these twouh amazing projects and Graphana has hasa core values open source and communityso we were like very uh keen to likedonate BA back to open telemetry and weinitiated a donation like in uh inOctober last year and it's been ongoingfor a couple of months and thankfully weare reaching to the that the donation isvery well accepted and in the upcomingmonths we plan to uh finally set up thewhole uh community around what would belike EBPF auto instrumentation opentelemetryuh yeahproject and that's it and thanks a lotfor your attention and if you havequestions[Applause]please questionsandahno okay thank you very much[Applause]2025-04-15 21:58:19.470672 3$3��Fd#��CAp52nxvo6hXktoday we're going to tell a story aboutsome pods running in Kubernetes thatwere crushing a little too hard on theKubernetes API Um I'm Terara Tori I'm astaff software engineer at GraphfanaLabs and I'm joined today by myco-presenter Mario Matias uh also staffsoftware engineer at Graphfana LabsSo today you're going to learn a littlebit about ebpf We're just going to gointo the basics of ebpf If you want toknow more and the nitty-gritty detailsMario has a talk just after��Dc#��?AHV3Nb_wUro4thank you everybody to come to the verylast session of CubeCon in this roomwe're really happy that that you havethe patience to stay here until the lastmoment uh my name is uh Mark Tudori i'ma software engineer in Graphana Labs andwe're going to talk today about usingEVPF for non-invasive invasive instantnetwork monitoring and here's Mariohello i'm Mario Matias also from GrafanaLab and uh I work in the same team as asMark uh we work mainly on EVPF and weare going to present how we use EVPF toprovideuh monitor to monitor your network yourand your application at different layersof the network stack i know many of youalready knowEVPF so but I'd like to do a smallintroduction uh how we are using ebpf inin in the graphana beta team basicallyyou your application run on top of orusing uh a set of libraries and it runson top of the Linux kernel using it itinteracts to it with sys call and theLinux kernel has a runtime thatallocates the resources and operatesyour applicationan EVPFbased solution like Grafana Bailauh to provide observability or securityor networking any not concretelyGraphana but any EVPF based solutionalso runs asa as a user level application in theuser space application but it interactswith the EVP the the EVPF implementationthat runs in the kernel that providesIt's verification and just in timecompilation for safe access to theresources it provides some maps tocommunicate the user and the kernelspace and some kernel helper helper APIto interact and to load theprograms we are using many types of EVPFprograms evpf is not something like okayI run a I deploy a single program and itdoes everything evpf runs or injectsmultiple small unlimited programs intodifferent parts of your uh stack in theoperating system runtime you can loadnetwork programs that in your networkstack you can run K probes that will betriggeredupon concrete or or given uh kernelevents but also you can load you probesyou can load you can add probes into theapplication or the user space levelapplication this is in libraries or inthe actual applicationexecutable good part of EVPF is that ifyou want to run EVPF or for exampleinstrument an application you you don'tneed to rebuild the code of yourapplication to inject the probes youdon't need to redeploy your serviceswith with the EPF agent it will be EB itwill be the EVPF solution who injectsdirectly and transparently to theinstrumented application it has goodnative performance because it has a justin time compiled and it also uh addssafety it uh it evpf preverifies yourprogram to guarantee that the programwill end and won't do some uh operationsthat could hang your kernel but ebpf isnot magic it's not an SQL or is not a ina file system interface in where you canhave uh information in a well definfinedstable formatuh it requires AP API level knowledge ofthe instrumented targets and when I sayAPI level I mean binary level knowledgeof the data not not not you don't haveto think in a code level but how theinformation is structured in a b binarylevel and also programs are limited insize andfunctionality so instrumenting anapplication and the network levelrequires ers multiple small programs tobe coordinated in a common effortgraphana bail is the solution we provideuh or is is our approach for zero codeautomatic instrumentation and networkmonitoring by zero code I mean withouthaving to change or redeploy yourapplication it provides metrics atdifferent levels of your network stacklike uh levels about uh metrics aboutnetwork connections uh also ap�� this in thisauditorium that you should definitelycome and check out Uh we're going totalk about how in in enrichingKuberneting eBPF data we broke theKubernetes API Uh how we uh why weneeded to enrich that data and what wedo to to to do that in BA and then howwe ended up scaling that uh enrichmentprocessSo uh oh uh before I go on uh can I geta show of hands who here is familiarwithebpf okay so about half the people Umwho has written an ebpfprogram okay fewer hands And who hasdeployed that ebpf program intoproduction in Kubernetes okay awesomeThank you Um so before I go on I need toset the stage and start and tell you alittle bit aboutebpf So ebpf if you don't know itenables safe and efficient extension ofthe Linux kernel Now you might say wealready have kernel modules Why do weneed ebpf well if you've ever run akernel module you know that if thekernel module crashes crashes the wholesystem Everything goes down Um what ebpfdoes is it introduces a virtual machinethe ebpf virtual machine and thatenforces uh halting as well as a fewother safety properties that enables youto run these kernel modu these kernelextensions in a safe way Um now the forI think a lot of people when they thinkof ebpf they think it's sort of a a newfangled technology Um but actually thestuff that we're talking about was addedto the Linux kernel in 2015 by DanielBorman We're talking about the TC egressingress hooks and that is what allows usto snoop on network traffic Lets us getour grubby little paws on all thosenetwork calls so that we can constructthese amazing service graphs that we usein other products Um but kernel data byitself especially in Kubernetes isreally not that useful Um when if anyanyone who's sed into a container knowsthat if it all all it has is cat and shit's it's really annoying to go intoslrock and look at every single detailabout every container And especiallywhen you're doing this at scale acrossan entire fleet of machines it gets verytedious So as I said kernel data is rawright and in in Kubernetes every singlenode in a standard deployment has itsown kernel And that kernel that kerneldata is giving you information about thesource port source IP destination portdestination IP But that's not what wecare about in Kubernetes right inKubernetes we care about Kubernetesobjects like services and pods nodes Sohow can we take this raw data and turnit into something that is actuallyuseful for humans that care aboutKubernetes well um as I mentioned everynode has a kernel Kubernetes has aconstruct called a Damon set that allowsyou to run one pod per node So thateffectively gives you one pod per kernelSo that's how we solved kind of thefirst problem of running ebpf inkubernetes is um we we have now one podrunning on every kernel and so we cangather that information Um still we needto get this metadata because we want podIPs are constantly changing We need toknow about pods and services so that wecan provide useful information to peopleKubernetes has this incredible API whichis arguably what makes Kubernetes sopowerful right you can set the state inthat API and Kubernetes will just go andreconcile your system such that whatyou've what you've represented in theAPI becomes reality Um and so we thoughtwell there's this very tempting API thatknows everything about everything that'shappening in the cluster Why don't wejust ask that and so why don't we justhave every single Damon or every singlepod in the Damon set ask the KubernetesAPI what could go wrongright thank you TerraUh so uh BA is built using a pipelinearchitecture in which data is extractedin early stages is joined is transformedin in other stagesuh in in one of those stages we want todecorate metrics with Kubernetesmetadata to make those metrics moremeaningful for the final user We mighthave for example for network levelmetricsuh source and destination IPs of all thepots that you are connected from theEVPF side we have access uh to thesystem level metrics at the networkpackages or row rowconnections and from EVPF we can seesource and destinationIPs but uh for the final user thoseme�trics are not usually so meaningfulbecause we don't really me memorize orwe don't have a mental map of wipes IPpiece belong to to which node So we wantto decorate those metrics in a givenstage with Kubernetes metadata This ispot name spacesuh other levels and so on So in one ofthose stages of the BApipeline we get metrics decorated withsource and destinations IPs and we sendout metrics decorated with Kubernetesmetadata This Kubernetes metadata isacquired using a Go library BA is mostlywritten in Go Uh there is a Go librarynamed Kubernetes informersThis library connects to the cube APIand on one side it retrieves a wholesnapshot of your cluster I I means allthe parts of your cluster nodesservices names metadata IP addresses andso on And also it watches for any updateinto into your Kubernetes cluster Thisis a new pot is created A pot haschanged the IP address or has new IPaddresses A pot is destroyed So givingas input this constant flow ofKubernetes object BA uh maintains aninmemory map that for each IP addressalso other other it it can also useother keys like container ID P ID and soon for other kind of metrics butbasically we maintain an accessible mapfor a given IP which Kubernetes objectit belongs to So in that pipeline stagewe can provide extra information thatmight be meaningful for the for thefinaluser Uh BA runs one as Terra said oneinstance on each on each node Uh andthat works that worked pretty prettygood atfirst until we deployed in production ina largecluster Uh we already had BA deployed inproduction Nobody complained Everythingwas okay But one day they thought hey sowe deployed and happily we say okaylet's go home Uh but someone elsesomeone else say eh this cluster is notworking isis whatever is happening is is notworking and that is because we passedfrom the Kubernetes API server uh wentfrom t few ten of requests per second tomore than two million requests persecond Alsothe also the error rates of those APIrequests went from zero to uh 10,000 anduhand yeah we we also even lost the somemetrics because that start working thatcow said that notonly not only the decoration of metadatastop working in BA but also some otheruh kubernetes objects were not properlyreconciled So basically we took down thethe cluster with withBA Some users also report it but we Ihave to be honest we ignore them Uh sayokay maybe maybe you are doing somesomething weird They even provide somesome public they shared in our GitHub uhrequest some memory information actuallyuh enabling BA in their cluster doubledthe memory consumption of the cubeAPI So we saidokay our early users are complaining wecannot deploy it internally so we needto do somethingwhat was happening behind the scenes UhBA is runs as a demo set This is we haveone BA on each node and uh to to getmetadata from the Kubernetes objects itcontacts to the cubelet that exposes alocal endpoint in which we can ask foruh in that case both nodes and serviceswhich is anything that can have an IP wewant it But the cubelet only maintainslocal information about the ports No uhabout the ports especially Uh but sincewe are tracking network metrics we alsowant to know the source IP which islocal but the destination IP we alsowant to know for external port So weneed each bail instance need to maintaina snapshot of the whole cluster So whenBA this informer's API contacts thelocalcublet each localcublet this is a simplification so it'sprobably I mean it's might be morecomplex but just to understand eachcublet sends a request to the cube APIand for each request request the cubeAPI sends back a request for the othercublets So if we have three nodes whenthen when this cubelet sends a asubscribe cube API will send it to theother port So it will each node willreceive two more in that case only threenodes to twomore two more requests just to send thepod information to share the the localpod information with the other with theother nodes for few nodes that's okaybut this this cube API is handling uh anumber of sub of of subscriptions uhwith a an order of complexity of powerof two it's fine for it's �fine for fewnodes but given uh when you reach acritical number then uh memory CPU uh isis too high you take down the clusterSo uh we were discussing how can we fixthis because uh not decorating ourmetrics with Kubernetes metadata was notan option because the theinstrumentation will become mostlyuseless in that case So first we thoughtabout replacing the subscription modeljust by individual requests because evenat the end if we if we store a snapshotof the of all of all all all the pots wedon't really need or we we mostly bailmostly manage information from all theall the pots So what if you we just askit won't work well because the the APIis not well designed it to efficientlyquery by IP but just by name and alsoeven during the during the deploymentsthe initial uh at the initialization ofeach instance the cube API will getanyway a stampede of requests we werethinking about okay why does just wedon't get uh the information from thelocal information from the cullet APIwhich is somewhat undocumented is not somuch uh designed for the final user butfor internalbehavior but we don't get global objectsto it I mean we can only get uh localpots so we were thinking on clusteringthis this creating a clustering acluster catch but that we we will getending up getting similar are problemwith a gossip protocol to share metadataUh we will add some network traffic andcomplexity So we decide to move to acentralized cache deployment So insteadof integrating this informer's code oneach bail instance which run as a demoset we will create an externalpot thatuh the it's is deployed few it deploysfew few replicas it can be one replicatwo replicas I don't know it depends onon the size of yourcluster and this uh cache pod willimplement the go informer's code tosubscribe subscribe to the KubernetesAPI from freeu instances and get acomplete snapshot of all the potmetadata which is later shared with eachbaila instance that connects to thiscatch service at the end what we have isa two-level cache even if we saycentralized cache it can be distributedbecause you can have multiple instancesand thenthe each bail will be will connect todifferent instances uh using some loadbalancing So this catch instance potconnects to the cube API gets all theinformant metadata If you request for apod metadata for example you will see abig JAML file or a big JSON file with alot ofmetadata that actually we don't reallywe don't really need all that metadataSo that cache instance just stores theminimal needed snapshot of of therequired Kubernetes object that is namename space kind the owner and somelabels and annotations which is themetadata we really care for fordecorating metrics and bail is connectedto that uh cache instance using aminimal protobuff definition uh via agRPC stream So the the network and theresources required for the fortransferring all this information fromthe cache to bail is is much lower andat the end bay also once receives theinformation it it's it keeps updated aninternal an internal database mappingIPs container ids qualified names to tothe metadata objectDecentralized cache is a storage list Wewanted to make it as easy to maintain aspossible So we didn't want it to requireum a storage system We didn't want it torequire some extra communication like amessage queue It's just plain gRPC Thatmeans that if by some reason this bot uhcrashes maybe a back or maybe it getstoo many resources and and gets allkilledUh while when this when a pot restartsthe cache pot restarts another pot willbe will be started will get a new willreload the snapshot and BA will connectto that new pot in that time BA willstill maintain its local copy of themetadata So BA will miss some updatesbut most of the metrics it gets uh andneeds to to decorate BA will find uh thethe metadata for most of those metricsgiven that this cache pod instance takesfew seconds to to torestart So some issues that we ran intowith this approach the centralized cachesolved the primary problem but as weknow in tech everything is a trade-offUm the main reason why we reallyevaluated a number of options was wedidn't want to introduce anothercomponent for our users to manageEverything that you run in Kubernetesadds toil You got to upgrade it You gotto check the resource utilization Um andin this also uh the the resourceutilization especially during startupbecause the cache is empty when the podsfirst start It has to refresh its brainabout everything that's happening insidethe cluster because the caches maintaina state of of everything that's going onAnd so during startup there's this kindof uhthundering like storm of requests thatit makes to the Kubernetes API to getinformation about everything whole bunchof des serialization that needs tohappen there and then also serializationthat happens when the Damon set requestsuh information from the cache Um and soyou can see here for the first minute orso when the cache is starting resourceutilization is almost double the steadystate So that's a very important thingMake sure you're leaving headroom whenyou're deploying this cache That'ssomething you got to think about thereUm and so when you're setting yourrequest and limits obviously in the podum and uh we can see here the uhKubernetes app showing that yeah the theresource utilization is is uh twice whatit would be normally Uh this here wehave some uh Pyroscope profiles showinguh this is how we kind of dove into whatwas causing the space and timecomplexity and we found that it wasmostly from from d serialization Uh andthat's what was was causing a lot of uhmemory utilization on startup So um wewe solved that by just storing uh asMario mentioned just storing thespecific information that is necessarynot the whole Kubernetes object but justthe stuff that you've configured thatyou want to have decorated Um andanother thing that we did too was on theuh client side we changed that to be agRPC subscription So that way it'sbinary and the binary encodings uhreduce the memory uh utilization of ofserialization to those those requestsUm so I think in summary here theKubernetes API can handle a lot but likeeverything in life it haslimits and our solution was to introducea centralized cache as as we've kind ofdescribed and I think like a generaltakeaway that you could look at here isanytime you're deploying a Damon setinto Kubernetes you really need to thinkabout performance and when you're doingthat capacity planning It's not just thetime and space complexity of thealgorithms in that specific deploymentbut you also need to think about thedownstream impact on the variouscomponents like the Kubernetes API Uhand think about you know how is thisgoing to how is this going to affect usum that's the talk Thank you so muchDoes anyone have any questions[Applause]okay Uh thank you for all your work Hugefan Um Baylor took one of our open shiftclusters down too Um when did youintroduce this change we started usingwith Baylor 2.0 I think M uh this thiswas I think this change uh was beforewas released in BA 2.9 I 1.9 or 2.0 butyou need to en enable it uh on purposeSo by default BA will still run in inthe old mode But uh if you are forexample using the helm chart uh there isan option uh that is bail catch yourinstances or so on that if if that ifthat number provided is different thanzero it willautomatically deploy that cache instanceand configure BA to connect to it ThankyouYeah Okay In the meanwhile if there isanother question uh this cache is is anindependent component We designing itfor BA but I think it covers some usecases from other teams that I' I've methere other people having the sameproblem with informers and and and APIstarvation So it's it's it's an openservice with a an open protocol withlibraries So it it can be also useful inaddition to baila even if you are notusing it for baila foryou Okay So thank you for your attentionThank you so much2025-04-15 21:58:20.172296�'m joined at the stage uh by mycolleague tazik hey i'm tazik and i workas a senior software engineer at newrelic in the same team as javicool so we'll start outlining somecontext some numbers uh about our scalethe the problem that we want to solve umalso why we move to a cellulararchitectureum librarian also cluster api in orderto implement uh this uh kubernetesinfrastructure in multiple cloudproviders um additionally we are goingto showcase how we added some layers ontop of this to easy uh consumption foruh instances and the the differentofferings and nuances that the multi thecloud providers offer usnow what we do on new relic uh weprovide um intelligent observabilityplatform that empowers developers toenhance uh digital experiences uh wehave more than 85,000 active customersuh we process more than 400 400 millionqueries per day uh we ingest aroundseven pabytes per day that makes uh atthe end of the year around 3 exabytesand with that we process 12 billion ofevents perminute now how does thisum translate into uh kubernetes uh so weoperate all of these uh on top of uh 280kubernetes clusters um we operate we runmore than uh 5,000 uh pots uh on on overthe uh21,000 nodes um and we run all of thesein multiple cloud providers uh andmultiple regions uh and our averagecluster has uh between 300 and 500 uhnodes and we run on top of them on eachof the of our clusters um around 5,000and 7,000pots so one of the key points here isthat for us we have two different uhdata data flow paths one for inest andone for the quering now uh the provingcontext about all of this is that formany years we were running most of ourservices uh into a controller workloadsbut running on top of a dcos single dcoscluster and regarding our data pipelinesuh they leverage kafka we are heavy cfkausers and we were running also a uniqueuh kafka clusters uh in order toaccomplish that um because of this kindum monolithic infrastructure it was veryhard to scale um update so any operationregarding adding nodes or upgradingum it could be very risky um frequentlywe were running into incidents with ahuge blast radius basically when we havean incident on this uh we were affectingall of our customers right so at somepoint uh we start solving uh started tosolve the problem are to thinking aboutsolving the problem so we initiated uhback in 2020 a program a multi-yearprogram in order to first migrate to thecloud um in order to have morescalability capabilities but also wewanted to isolate uh the blast radius tolimit the blast radius in case ofincidents so we also align this programuh to shift uh to a cellbasedarchitecturenow which are cells basically uh in abiological context a cell is the smalleunit that can live on its own so inorder to accomplish that it should haveall the resources inside uh all thecomponents necessary ne necessary for uhaccomplish a a specific function rightit should be independent let's say umanother characteristic uh is that theyexchange energy and matter soeffectively the the cells areinterconnected each other uh in order toprovide uh more complex functions rightnow talking about a cell-basedarchitecture this architecture alignswith this definition regarding cellsbecause um a workload in thisarchitecture is is decomposed inself-contained installations that shouldsatisfy operations for a shard so whenwe talk about a shard we are talkingabout a subset of a large larger datadata set for instance uh a subset of ofour users right so this makes uh a cellan independent unit of a scale uh it'salso limit the blast radius because uhif you have an an issue on a cell that'sum limited to that specific cell so inconsequence it's limited to a specificsubset of your dataright and and in order to scale out asthis is a a repeatable pattern what whatyou need to add is more cell instancesfor that workload and that's kind of oneof the hardest parts because um of thisarchitecture because you need to findout a logic um a thin layer you put overthe the cells in order to share thattraffic between the cells you need toshare the data you need to manage thattraffic flowi�ng to the to the differentcell instances so that makes the thecell router and as we were working uhtalking about workloads um you need toidentify the compose your infrastructureinto those isolated repeatable patternsum that's also a a critical um momentfor going to a soulbased architecturebecause you need to kind of um implementsome mechanism that help you on this forinstance a domain driven design so youidentify workloads and create new celltypesum so that was a kind of genericdefinition for a cell right for us atnew relic we are implementing cell's uhscope to a aws account as yoursubscription or gcp project so a cell isliving in an specific unit of of thiskind we added uh only one kubernetescluster inside we added one caferacluster and we added only uh uh onebinet or vpc in order to give thenetworking for that specific cellinstance um depending on the cell typewe could should be adding also someother resources like data stores loadbalancers whatever that cell type needsto work as an isolation and independentunit additionally they are going to bepure with other cell types if they needto exchange information like thebiological uh cells and in our case wedeploy it in a multi- availability uhzonesetup umum another special characteristic weadded is that we wanted to be them to befemoral so we wanted them to be uhdestroyed and replaceduh each time so we link this with thekubernetes life cycle of 90 days or ifversion is something we are stillworking on it but it's a charact that wewant to to achieve and regarding fromthe previous um for the former diagramif we are talking about the composing uha monolithic infrastructure in severalworkloads it means that you are going tohave different cell types each cell typeit would it would be representing aspecific domain right and also anotherimportant point is that this is anliving architecture it's continuouslyevolving maybe you set up a specificcell type you determine oh this is a toobig we are putting too many resourcesand you want to de compose that celltype into different smaller cell typesnow from our former diagram this is howit looks adding turning it into uh cellsuh operating being independent units andin our case as we have two differentdata paths we need different cellrouters with different logic uh in orderto accomplish the the sh of the data forthe but also for on the querypath now this is a a picture that ithink uh it represents visually veryvery great the um the achievement ofchar jumping from a monolithicinfrastructure into a cell architecturethis is um an snapshot from the firstyear of our program um here we are weare seeing the um telemetry data fromour customers flowing into ourenvironment and you can notice that atthe beginning there's this big chunkblue blue chunk of data flowing thatrepresents our former data centers alongtime that chunk is decreasing while youcan notice several smaller chunks ofdata flow um uh that represent dataflowing to a specific uh cell instancesnow you can see how we were sharing thattraffic from our customers intoisolating uh isolatedunits now this look cool but of coursechanging from a monolithicinfrastructure and into a highlydistributed environment has a lot ofchallenge for instance the uh the assetmanagement cell inventory on kubernetescluster life cycle if we were talkingabout that we run uh more than uh 280kubernetes clusters and each clusterlives inside a cell it means that wehave more than 280uh cells right also uh we run this in amulti cloud implementation so it makethis harder and because of that and thedifferent compute uh offering each uhcloud and the different nuances eachcloud provides we need to put someinstruction layers for our c for ourinternal customers for our developers inorder to consume all of this in aseamless way so the scheduling offeringuh in an efficient way uh it's veryimportantnow why we jump into cluster api whycluster api so in order to tackle someof these challenge uh cluster api it ishelp us the with the life cycle of themanagement of of clusters right uh itgives you the abstractions to to do thisin a multi cloud environment it givesyou the declarative specification so yourun uh and operate kubernetes managingotherkubernetes and also you can streamlineuh in a seamless coherent appearanceoper operational way uh from a centralcentralized point to any cloud to anyinfrastructure now here again we made itslightly different uh we have what wecall a common and control cluster wherewe uh extending the kubernetes api andwe create crds that model our cellulararchitecture in order to uh manage uhthe life cycle of cells uh but also ofcourse we run uh and bootstrap the thelife cycle management of kubernetes webootstrap the kubernetes from here uh wehave a streamlined bootstrapping processthat help us doing the same way for anycloud now the different thing that wedid here is that as we want the cells uhto be an isolated units uh what we uhdetermine is to after the bootstrappingprocess move the we move the ummanagement copy objects into eachdestination cluster so each uhmanagement sorry each targeted clustereach worker cluster is at the same timea control plane uh a management clusterand worker cluster at the same time wemade it isolated while we maintain acentralized point in order to streamlineoperations let's say and with that i'llhand over to my colleague tastic to godeeper thanks aigiven the numerous moving parts andteams involved and the frequency inwhich we were creating anddecommissioning a cell we need a solidrepeatable and reliable way to maintainand decommission kus clusters so thisprocess also needs to be easy to debugmaintain and extend and we should alsobe able to checkpoint the variousstates let's see how do we do this on avery high levelwe have a command and control cell whichoperates by running different homegrowncluster controllers which in turnmonitors the creation of a homegrowncluster crd object which gets createdafter the prerequisite cell buildautomation gets run once this object iscreated the cluster controller specificto the cloud provider manages thereconciliation and during thisreconciliation process the kus job isinitiated that creates a kind cluster indocker and docker modethis client cluster is used to installcappy and the capy cloud providerinstallation and necessary clusterdependencies these objects and necessarycluster dependencies which objectsrequired to create the we also createthe objects required to create createthe control plane and some initialworker nodes finally these resultingobjects are transferred to the targetgators cluster facilitatingself-management of the cluster by itselfand the reconciliation continuesfurther let's take the example of azurecloud provider and its bootstrappingprocess on the very left side you willnotice that we install inside the kindcluster which is running inside the kusjob the cappy and capsi controller andafter that we create the cubadm controlplane object to create the control planemachine deployments are also created tohouse applications which are deploydeployed during the clusterbootstrapping process and once thecontrol plane is ready we startinstalling cluster dependenciesthis is followed by a cluster ctl in iton the target kis cluster to deploy thenecessary controllers and crds for capyandcaps subsequently a cluster ctl moveoperation is executed to make the targetcluster self-managed along with all thedependencies which were installedthe cater's job concludes allowingreconciliation to proceed further withadditional cluster operations likesyncing waves of argo applications inorder to make the cluster ready runningfinally a set of test suits from ourside to make sure the cluster is readyto receive live deployments fromteams also noted here are the groupingsof different cap and cabzy objects whichare required to create the control planeand the worker nodes for other cloudproviders like aws and gcp the sameprocess is followed for bootstrappingwith the difference being that thecontrol plane could either be hosted orself-managed now that the control planeis ready we would want the developers tostart using the cell let's take a lookat how developers provision nodes fortheir applicationswe leverage machine pools very heavilyfor this which is the primary underlyingconstruct exposed to developers forcreation and management of kus nodessince this would be a very frequentoperation we would want to get out ofthe way of developers as much as we canin the process of creation of machinepools and for this what we have done iswe have exposed a generic machine poolhelmchart which the developers used tocreate an ago application deployed viaour internal deployment platform in thetarget cells now as this ago applicationis deployed underneath the helm chart istemplated with a mix of global helmvalues and same defaults injected via akus web hook to create the necessaryobjects for node creation with thefeatures requested by the developer inthe cloudproviders as users create these machinepools for their applications let's talka little bit about the highle overviewof nodes in each cell take for examplein aws cell here we have differentgroups of nodes we have the general poolwhich is multi-tenant for teams to useand a user application will land here bydefault if they don't specify anyspecial requirements for theirnode dedicated pools are created byteams when they want specific nodefeatures present which are not presentin the general pool or when they don'twant to be affected by noisy neighborsby getting deployed to the general poolpools are also differentiated by thenode feature of the architecture whenteams want their applications to land upon a specific architecture is when weallow them to use the machine pool chartto simply create a node or a group ofnodes with a specific architectureintended bythem now you might notice that nodes ineach of these pools are provided by twocompute providers cappy and carpentercarpenter being more efficient in binpacking of application pods inside nodesutilizing groupless autoscaling andoptimizing for cost by trying to findthe cheapest node to run from theprovided configuration at any givenpoint as of now we only have carpenternodes in aws cells and we will beexpanding it further to the other cloudproviders now that we have created thecontrol plane for the kus cluster andthe necessary node automation creationprocess is in place for developers tostart creating their nodes let's take adeeper look at scheduling workloads tothese nodesto shed some more light with theincreasing adoption of kubernetes weobserved a growing variety of schedulingrequirements among different teams andapplications which made it hard to trackof and ultimately impaired our abilityto introduce changes to the underlyingcompute platform in a safe and agilemanner our team aims to solve this issuewith scheduling classes a new relicspecific construct which our teamprovides to all users running onkubernetes and what is it it's adeclarative way to express schedulingrequirements for applications withoutthe user knowing all the necessary nodelabels and taints which the nodes haveand how does it work at the heart of itit's an admission controller in the formof a mutating web hook running onspecific resources inside the cell andsome of the design goals which we had inmind was this should be cloud exhausticgiven that we run on multiple clouds andit should not be reinventing the wheeland should build on top of the kusprimitives scheduling primitivesprovided by kus should have samedefaults deterministic and users shouldbe able to chain these schedulingclasses on top of each other so as toget multipleoutputs on a very high level when a userdeploys an application the web hook runsits validation if it needs to do anymutation and if it and if a mutation isrequired to be done the application getsmuted with affinity and tolerations oncethis application has this affinity andtolerations the k scheduleuler takesover and then tries finding the bestnode to schedule the ports for theapplication and this construct allows usto also run carpenter as well as machinepools together on the same cell forexample take this application which istrying to pass feature fu as arequirement via scheduling class ourscheduling class engine then defaults tocarpenter scheduling class if it's anaws cell since the user hasn't mentionedopting out of it the scheduling classweb hooks then add the required affinityand tolerations and then thescheduleuler just takes takes it overfrom there to schedule it to a carpenternode created by the note poolconfiguration which is present insidethe cellsimilarly when the application specifiesthe scheduling class to opt out ofcarpenter they specify copy as ascheduling class annotation and then therequired scheduling constraints getadded which would make the scheduleulerschedule the application in a copypool now as part of streamlining theupgrade process and management ofcontrol control planes each cell has itsown ago cd application where all therelated control plane objects aremanaged we have a homegrown crd calledcluster life cycle which targets groupsof cells based on environment labelswhich the custom cluster crd object istracking as we change the kus version onthis cluster life cycle crd one of ourcommand command and control controllersreconcile on this object introducing adiff in the kus version attributes ofthe upstream objects of cubadium controltrain object for example and then theupstream controllers take over thereconciliation and upgrade processsimilar to the control plane upgradestandardization we also need to do noderefreshes inside our kus clusters whichcould be due to a variety of reasons andnot limited to node upgrades but alsopatches and what we do is we our twocompute providers are cluster api andcarpenter in general inside all ourcells and to do upgrades what we do iswe have a worker configuration crd whichtracks attributes like ami versionversion inside each cell and what wehave done is we have standardized on topof capy and capy cloud provider apiswhere we allow where we enable drift ontop of objects of machine deploymentsaws machine pools machine pools we alsofurther build on top of the carpenters'snode drift feature to not redo whatcarpenter already allows us to do bydefault out of thebox we have been running the setup forseveral years and adding improvingfeatures to it here are a couple ofthings which we learned which we wouldlike to sharedifferent crappy cloud providerimplementations have different versionsof cappy referenced and this is hard tomanage on top of managing folks for somebespoke features and fixes which wemanage and the challenge of keeping thisfolks synced with upstream is definitelynot an easy task this also brings us tothe point of how it's challenging tomaintain the automation when differentclusters are using different apiversions of cappy and its cloud providerobjectswhen it comes to self-managed and hostedon self-hosted clusters it's easier tomaintain the k address version paritysince you can control the upgradecadence whereas in a hosted provider youare tied to the upgrade charter of thevendor which is different across thevendors making it again hard to be onthe same version across different cloudproviders and this becomes especiallychallenging if you have components whichrequire a specific version of the cl ofthe kus version deployed across thesedifferent cells so you can feel howchallenging this can get overtime there's also lesser control on themanagement of the control planecomponents in a vendor envir environmentfor example if you wanted to pre-warmthe control plane components thisoperation would simply not be possiblein some situations and the flexibilitythat is provided when you're hosting thecontrol plane yourself you can pre-warmdepending on when you want to shift thetraffic inside a cell almost before thetraffic shift is happening right rightso this is something which you cancontrol pretty much byyourself more standardization ofautomation is also possible when theclusters are managed by cubeadm managedclusters across cloud providers becausethe apis are simply standardized and youare not building on top of cloudprovider specific apis which would meanthat you have to manage and maintainthem differently bespoke solutionsacross these cloud providers so you havechances of more automation and morestandardization the caveat here beingthat if you're on self-hosted you aresigning up for more work which would inotherwise be taken care of by the cloudprovider for example if you want to douh backups cd cd management and so onand soforth furthermore we wanted to just alsoadd about a little bit on our carpenteradoption we have benefited from itsgroupless autoscaling feature itautomatically handling insufficientcapacity errors and its ability to doefficient bin packing it also allows usto choose it also chooses the cheapestnode possible from a set ofconfiguration which we provide in eachcell and to keep the costs lowin cases when the teams are not able tohandle the consolidation rate ofcarpenter we allow them to opt out viaour scheduling class construct and theycan easily opt out balancing the act ofreliability with cost management at thesametime we plan to further expand oncarpenter on the other cloud providersandwe this is something which we want to doinfuture with that thanks and we wouldlove to hear questions from you and takecareafter we miss this uh we would like tomention that um there's a t contributorstrategy launching a new mentorshipprogram so if you are interested inmentoring people uh from underrepresenting groups please sign upokay uh hi hey um i very much appreciateuh the talk it was an eye openener forme uh i have plenty questions but maybetwo of them now um you talked about thisbootstrapperuh component in the management clusterwhat is behind that like you mean thebootstrapping of the kus cluster uh wellthe the stuff you have to do before youcan initialize cluster api i guess it isit was like in the middle of your graphall right so uh during the cell buildprocess there are a couple ofinitialization steps inside the accountof the cloud provider which we need todo thisis dependent on what type of cell it isand what the requirement of that cell isas jav mentioned there are differentcell types we have a variety of celltypes and we need certain things to bethere before even the initialization ofthe kadis cluster and the reconciliationprocess starts soum i mean to summarize i think it's itdepends on what cloud provider it is andwhat cell type it is and what is the useof that cell type because depending onthat there are there are other teamswhich try doing a lot of other thingsbefore this for example aws what is thetechnology behind that is it like yourown controller is it i don't knowcrossplay terraform this is homegrownyeah okay you you mean before beforewhat we do the kus bootstrapping i thinkyou were asking what we do before that'shomegrown yeah okayokay any other question and you docluster cuttle move of the copyresources the kappa resources into theyou know workflow cluster do you do thatfor any other resources as well like doyou modify cluster cutle move to extendit make it move even more customresources such as cluster life cycleconfiguration or something so uh when wedo the cluster ctl move operationeverything which has so there are twosteps to reconciliation one thing whichi didn't mention too much due to thislack of time was um during thecheckpoint process of kus clustercreation and the point at which the jobsucceeds there is another phase wherereconciliation of other steps starthappening via the command and controlcell controllers but just before themove operation we have everything whichis then required for example say thecluster api and the cloud providerinstallation and the necessary thingswhich we deem necessary for the clusterto function at this point andreconciliation to start happeningfurthermore all those objects are movedaway and moved to the target cell andthe job concludes um so at this pointthe one checkpoint has finished and thenext checkpoint starts over if thatmakes sense and we track this inside uha specific crd which we have on thedifferent uh states which the clusterbuild processes at in this point i don'tknow if that answers your question nothat's fine thank you very much noproblemhello there uh from what i understood acell is declared to be um ephemeral ifyou don't have control about whichapplication is running in the cell howdo you migrate data if a cell vanishesand a new one is created that's a goodquestion so and that's a tough oneactually well you you control what whatwhich are the applications the servicethat you are going to deploy in thatcell because you need to define thatcell type now it could be uh a statelesscell type that's the that the best uhoption right and then you don't need toworry about that but in our case wefirst during the journey we have somestateful cells we put some automationtooling for uh when it was the point todecomomish that the commission in thatcell and migrate the data to anothercell we put some automation to do thatthat's tough so along the way and thisis you know uh across several iterationsyou uh try to decouple that statefulpart into an a a different specific celland maybe instead being ephemeral youcan put it in a more stateful permanentcell for us we have some special cellswe call them in clay um we made themvery resilient in terms of verymoving few moving components let's sayuh instead of having a huge amount ofservices running there um but basicallyyeah we we're taking out that stful toanother cell type and then we make theoriginal cell more stateless and thatway when it comes to uh thecommissioning and uh yeah build newcells it's easierthank you it's it's not a it's not a ananswer that fits any situation so youneed to take your answers along the wayit's the same with the cell router sohow do you do the cell router it dependsfor on your workloads and your trafficuh characteristics let's saythank youthis on hi um that was my questionactually i was going to ask if you couldexplain a little bit more about how yourcell routting works cuz i mean i don'tknow if i understood this properly butit seems like how i mean how do youdecide when a user hits your api whichcell they should be rooted to if eachcell hold like is owns their own datacan you hear the can you repeat a bit uhwith more volume sorry i can yeah sorryum so when a user hits your api y um howdo you decide could you just talk alittle bit more about how the cellroutting works like how do you decidewhich cell a user request should bedirected to ah cool about the cellrouting you meansure so in our case we have a mechanismum analyzing the heers of traffic fromcustomersuh based on the data type and on thecustomer id and api key uh and with thatwe decide to which cell we shouldredirect uh that traffic coming from ourcustomer basically uh so that means thatfor instance for a specific customer weare able to shift traffic uh dependingon the data type to different cells youcould we could have been migrating umrouting traffic into for metrics forcell a while we could be uh redirectingtraffic from logs to cell b and we do itthat with that that tapple of data uhdata type and customer id in a sense uhit could bea an presentation on its own but that'sin a sense the the mechanismthanks um so is that more is thatlike so if it's done with the user idsay for example if you're rooting basedon id or something to oversimplify yesbut it's not user id it's customer idand we do some more uh operations therebut in a sense that's the idea yeah okayso i mean like you have some id thattells you where it should go is thatdoes each of the cells likepublish what what ids they'reresponsible for or do you record it likewhen no we actually save that on aspecific repository let's say so when itcomes to uh reach out to that data uhfor the query part first you need toknow uh regarding that query where isthe data located could be in severalpoints right uh and we do that with aspecific um database internally uh thatsaves all those uh all that meta datalet'sokay thank you you'rewelcome what elsethank you a lot thanks thanks2025-04-15 21:58:20.814922 ��ef#��AWaDSASWA2z4hey good morning thank you all formaking the hike up to this verydifficult to find place in order to joinus for the six security maintainer tracktalk succession planting for a floweringfuture um I'm Tabitha Sable i'm one ofthe co-chairs and I'm really glad that Iget to help us make this space forourselves and each other together andgive the uh mic here to Kayn to do therest hi I'm Kayn i am also a co-chair ofKubernetes SIG security i'm the newestfreshest uh greenest SIG securityco-chair and during my day job I dosecurity things at Ozero by Octahello I'm Ian Smart i'm a consultant atAmberwolf and I'm one of the co- projectleads for SIG uh third party audithey all uh I'm Rory i do security stuffat data dog and I am one of the co-leadsof SIG security dogsand hello I'm I work at ISA at Cisco i'ma software engineer and I will berepresenting six security tooling todayso what really is SIG security um ingeneral we are a group within Kubernetesthat takes a community-based approach inorder to improve security for theKubernetes user base and for the projectitself but like specifically what doesthat mean you know it means that weprovide a place where folks within thecommunity and within the project cancome together to share their interestand their concerns for how to how tomaintain and improve security and thenwe can organize co-working with thevarious SIGs within Kubernetes in orderto make improvements in those areas thatthey maintain so like forexample Mahi recently led the work toremove the the security context denyadmission controller that was in the uhin in KKK for a very long time and itprovided some bit of a attractivenuisance because if you turned it on youcouldn't run a modern cluster but it wasthere and it was nominally a securityfeature therefore it was referenced inuh compliance standards things like thatand so a lot of folks had to write a lotof paperwork justifying why they weren'tusing this outdated thing that was stillin Kubernetes and so you know Mah alongwith several other SIG secur��Ce#��=A3KLsfEyNKrYhello everyone and thanks for joining usuh to our session at at the end of thisintense week uh we're going to showcasehow a new relic we uh manage a multicloud kubernetes infrastructure leveringcluster api on top of a cellarchitectureum in order to scale out uh workloadsbut also uh while limiting the blastradius uh forincidents uh my name is javier moscasanchez i'm working as a so principalsoftware engineer um i'm a kubernetesand multicloud architect at new relicand i�ity folksand SIG Ofolks went the long road down theKubernetes change control process inorder to get that removed and now it'snot there to poke anybody anymorewe also maintain some security relatedtools and processes used by the projectoverall like we maintain the officialCVE feed which is a a web page and anRSS feed that you can subscribe to andsee the newest results and we have a subproject that uh runs and coordinatesthirdpartyaudits since the beginning we have takenthe approach that no one person ororganization can adopt a position ofauthority and demand or dictate securityon behalf of the organization and thiscomes from the idea that Kubernetes orany sort of community is large and it iswhat the members make it so if you thinkabout contributing some code or somedocumentation to Kubernetes everyone whoworks on Kubernetes has the ability tomake Kubernetesinsecure anyone can write a bug anyonecan write bad advice on a blog postanyone can write a design that hasfailed to take adversarial nature intoaccount and if it's the case that we allhave the power to make Kubernetesinsecure then it seems obvious to methat also we all have the power tocontribute to making it secure and sothat is the approach that we take wegather folks together with variousbackgrounds various levels of experienceand we learn and grow together and wework together with the rest of thecommunity to make these sorts ofimprovementson the subject of learning and growingtogether I'm going to hand it off tonewest subchair here Kayn in order totalk about thefuture woo yeah so um as Tabitha said uhher and Ian have done an amazing job ofsewing the earth and making a reallyripe environment for plants and peopleto flourish um I have spent much time asa well-tended little seedling and now Iget to move out into the garden properwhere I'm still very well tended to behonest we take care of each other buthere I am growing and I'm so happy toget to learn from this incrediblecommunity and get to be a part of it umas as we've gone over our project ismade up of sub projects so you can seethem here represented in their ownlittle plants and microcosms there isthe documentation project the toolingproject and the third party audit andI'm not going to talk about them becausewe have experts that will cover them uhbetter than I ever could and the beesare me Ian and Tabby just buzzing aroundhelping where we can and pollinating theideas bringing things from fields afarfor the team to work on um and it's justgreat it's great you should come join usit's a good spot to be and so what do wehave planned for the future a lot umwe're just going to keep on doing thethings we do well we're going to keep ontrying to be an open and welcome spot inthe Kubernetes community in the opensource community um we're going tocontinue with the sub project maininitiatives again I don't want to gointo any detail um but we have a thirdparty audit that's occurring now we aregoing to continue to work on theKubernetes doc site making sure that uhwe're covering the security essentialsand that we're removing any staleinformation and making it as easy aspossible for users to know how to useKubernetes securely uh we're going towork on strengthening relationships withother SIGs we have because we makeourselves so approachable we getapproached and um there are a lot offolks who want help and opinions and wewant to give them and we want to haveconversations we want to make sure thatuh the resources here are available tothe greater Kubernetes community whenmaking security decisions or justwanting to hang out um and then finallyuh at our project booth we had beenapproached by a few people who had whitepapers or had found information onlinethat wasn't necessarily reflective ofthe current state of security forKubernetes so this year we're going toreally try and get out there a littlebit more and make sure that not only inthe Kubernetes sphere but in the greaterinternet sphere um we're trying toremove a little bit of that mess and oneof one such opportunity is going to berevamping the OAS Kubernetes top 10 soif that is appealing to you which itshould be because that's cool um pleasecome join our meeting we have one notnext week but the week after um and atthe end of the docu at the end of thepresentation we'll share some links soyou can get on our mailing list umthat's that's some of it and we're goingto play it by ear whatever comes upwe're going to tackle it if you haveideas please come talk to us talk to usnow talk to us later find us on theinternet um we'd love to we'd love tohear what you think we should dogood clicker still works so uh I'm Ianstill hasn't changed i'm four letter Ianin SIG security to save for a conflictwith the wonderful Ian Coldwater who ishiding from the stage um I along withRay Laano uh help run the third partyaudit sub project so everybody loves anaudit i'm sure we've all been involvedin many and had great fun with all ofthem uh what we're doing as a project istrying to make sure we have externalparties perform a dedicated securityreview of the Kubernetes project and thecodebase we as much as we can thinkadversarially and defensively absolutelycan't review everything that'sintroduced into the project it'smultiple millions of lines of Go andvarious other programming languages forall the extra projects out there so weuse this project as a chance to getdedicated security time from externalvendors to come in and actually providesome effort and I'm trying to see myspeaker notes to see if there's anythingelse I meant to say so I'm going toawkwardly move over here and it'll belike we planned it so yeah we're tryingto get extra eyes on the codebase justto help us find any extra securityissues out there we do have a hacker oneprogram as well this is a separate pieceof work so if anyone here is sitting onan O day in Kubernetes and has foundsome mad hacks that let you mine Moneroin every single cluster first of allplease let me know because that soundsfun but before you do that not first ofall please email security@kubernetes.iothis is a separate piece of work to theongoing bug bounty programs and securityresponsibledisclosure as a little bit of historywe've run two third party audits in thepast the first one was in 2018 with atrade and trail of bits if anyone reallywants to go and read that report now theuh issue for the findings there is 81146on the Kubernetes repo if anyone reallywants to memorize that number um therewas also another review in 2021 and inthe spirit of being a flowering seed whocan't do the flowery language that Kalindid i was actually on the delivery teamfor that engagement and have now foundmyself on the vendor side for thecurrent audit so NCC Group did an auditin 2021 uh again the findings for thatare on GitHub under 118980 i thoughtthat was worth telling everyone so ifanyone wants to go and look at thosefindings please do one of our ongoingpieces of work as well as running a newaudit is going through the findings ofthe previous ones we have a number offindings that that pesky autoclosingissue robot has marked as stale so thereis an ongoing bit of work to make surethat all of those issues are addressed anumber of them are marked as closed andfixed so security improvement hasdemonstrabably happened a number of themhas been closed as we won't fix thisthis is intentional by design so anotherthing we do as a SIG is make sure thatanything that is by design but we thinkis a pointy edge that we don't wantpeople to cut themselves on is make surethat these are documented elsewhere inthe Kubernetes documentation aswell and we also have a number offindings which have been marked as theyneed a ke so anyone here who's notfamiliar with a kept Kubernetesenhancement proposal uh we have a numberof findings which do need somesignificant effort to actually make afix if anyone wants to try and getinvolved in the Kubernetes codebase butwants a clear steer on where they canstart we have a number of known securityissues which are requiring somesignificant effort to avoid breakingchanges in Kubernetes that aren'tterrible if they were terrible scarysecurity things they would have beenfixed already but we have some that needa little bit of effort so if you want toget started uh speak to me after we'remore than happy to point you in therightdirection and for the future plans whichhave become the current plans due towonderful scheduling we are currentlyrunning the 2025 Kubernetes audit ifanyone wants to read the backstory ofthis there is a folder in the SIGsecurity GitHub repo that documents theRFP we issued instead of reviewing theentire codebase which I can tell youfrom experience is not much fun we'reworking through different projects fromvarious different uh SIGs so weapproached the SIGs at the tail end oflast year and said does anybody have acomponent you would like to be reviewedcompiled all of those and put a bit outto vendors we are working with the open-source technology improvement fund OIFthis year they recommended a vendor wellthey recommended a few vendors weselected one and the project is ongoingso the chosen vendor shielder arecurrently reviewing quite a lot of thesub projects i can't remember the wholelist but it is documented um and theyare starting to report findings so weare working with them directly everycouple of weeks we get an update we'rehappy to say there are a couple offindings coming out of that which aregoing through responsible disclosure andwe will obviously make sure they arefixed before they go public uh but yeswe are starting to get some findings outof this audit and we are hoping that atthe tail end of this year we should havea report that can be made publicand with that I will hand overawesome uh so I'm going to talk to you alittle bit about SIG security docs uhwhat do we do well it kind of says onthe slide we do a couple of things theone is we try and work across SIGs toimprove the security content of theKubernetes website obviously when peopleare configuring or managing clustersthey're going to go to the website as aprimary resource so it's reallyimportant that we have good securityinformation there we also do someadditional work where we have somethinglike a threat model uh which wouldn'tnecessarily fit as part of docs and wewant to actually have that written aswell so we have some white papersavailable inside our GitHub repositorybut in general what we try and do isimprove the overall documentation ofKubernetes security and what I wouldlike to do is convince you that you toowould like to be involved in documentingthe security of Kubernetes because Ithink there's a number of really goodreasons genuinely think this is a reallygood thing to do so why would you do ityou will understand more about theproject i write quite a lot aboutKubernetes and container security andwhat I find is the act of writingsomething down the act of working outwhat you need to say will lead you tobetter understand something whenever I'mwriting a blog I often find outsomething where I had made an assumptionabout how something worked or howsomething was operating and I'm wrong soI start writing it down documenting ohthat's actually incorrect you improveyour learning a good example of this asan opportunity at the moment is we havegot a long running project to try anddevelop a hardening guide for Kubernetesand we've split that into a number ofsections that focus on different aspectsof theproject a good example the scheduleulerone has just finished up and an helpingus out with that is um delving he delvedright into the scheduleuler andunderstood more about what the securityimplications of different parameters anddifferent features of the Kubernetesscheduleuler are and I learned a lotjust by reading and reviewing it andtrying to help out getting the thingready for the website and I'm sure helearned a lot in actually designing itso if you get involved in hardeningguide as a good example you will learnthings about specific aspects ofKubernetes but there's more um oh almosttoo much more we missed the good bit youwill learn things cool things aboutKubernetes i promise you you will learnthings cool things about Kubernetes byinvolving yourself with security docs wehave another piece of work we juststarted doing which is looking at thewebsite and looking all of the openissues that relate to security andseeing if there's any we can contributeto getting those issues fixed and closedand in doing so I learned things aboutKubernetes I never knew because theseare quite kind of in the depth areassometimes so one things I learned wasthe Kubernetes API server has aself-signed certificate that it usesonly for loop back calls uh and it has a12-month lifespan if your API server isup for more than 12 months it will crashand it will crash because thatcertificate becomes invalid and it isnot reissued until the API serverrestarts so if you ever try and leave aKubernetes cluster running for more than12 months that's what's going to happenyour API server will crash and now Iknow why and now so do you know why ifyou read these issues you too will findthings you never knew about Kubernetes ipromise um so that's a great way reasonto get involved in docs right you willlearn things cool things that are maybethat one's only good for trivia um butalso if you ever have a cluster crashingcrashing after 12 months you can makeyourself look so clever by saying I knowwhat happened it is the internalcertificate you also get to work withother parts of the projects that's verytrue as well what because I think wementioned it earlier on is that we arewe don't own the code as a SIG so whatwe do is we have to work with the SIGswho do own the code if you involveyourself with writing docs you will workwith other SIGs and you will meet othercool people who are involved in theproject and who will help you learnabout the things you need to know towrite those documentation you'll workwith SIG docs because SIG docs help usget things into the state and um stylethat is needed for the website you willwork with other SIGs like SIG O SIG nodewho own the code that we are writingdocs about so it's a great opportunityto learn about and meet other people aswell and with that I shall hand you onto Mahi to talk about toolsthank you Rory so now I will talk aboutSIG security tooling um so here are thethe few goals we have uh basically wetry to build and improve the security ofKubernetes by writing code and workingacross SIGs um and I I think we havebeen able to create like a very nicespace for new contributors in the pastto share and learn so uh basically thisSIG is organized like this we meet everyother Friday and um on one side you canpropose like new learning sessions ontoolings and work related to securitythat you've been working on or you canjust join the working sessions where wetry to actually make progress on theissues so uh I wanted to mention some ofthe work that has been done uh we havebeen running the sneak uh scanner on therelease image of Kubernetes and latelythere have been like a little bit ofcross SIG collaboration to move uh thescripts from the test infra to the SIPsecurity repository and and move um thejobs to less trusted clusters because weare like seek security we want toenforce that on us as wellum another one is the official CV feedso Tabby uh talked about that justbefore uh we created this uh autorefreshing list of CV feed it'savailable on the website you can seethere's like a dedicated web page andyou can see we have like this JSON feedand RSS feed that you can consume uh forall your needs um oh sorry uhwhoyeah all right please all right it'sgood um so the the initial list of uhtask we needed to do to make the uh CVum u feed GA is kind of finished we onlymostly have to do the last one now whichis like try to update the CV feed nearreal time so the the only issue so faris that the CV feed can be a bit late inthe worst case it can be like 12 hourslate if the website is not rebuilt bysome PR that has been merged uh in thewebsite repository so yeah the idea isjust to use like a web hook to try torebuild the websites but um anyway thething is that we wanted to make this uhfeed G and uh recently we've beentalking to some people about the waythey use the CV feed and now we want toinclude even more uh task in thisproject but uh again can you go back ohit's okay um so if you want to join thiseffort uh we've been trying to writesome documentation about how this thingwork uh it's a fairly trivial piece ofdocumentation and it's like 50 lines soif you want to get involved like now youcan check this PR and and review it andand and say if it makes sense or ornot um yeah I wanted to talk about thatas well we we have uh used this we havelike started this initiative like awhile ago trying to run the go checkproject on Kubernetes it's been uhupdated like recently but so far wedon't do anything with the results so ifanybody's interested about joining anduh and helping us on on using this uhscanning results that that would be niceum and I discovered very recentlyactually that uh some people at Aquawere like creating this new projectwhich is called the CV feed OSV so theywere like consuming the CV feed I justpresented before uh for their securityuh scanner project but they neededactually this OSV format so um likelately the initiative has been mostly totry to move on to this new format forthe official CV feed try to merge theefforts and maybe kindly ask to thecubernetes SRC to issue the initial uhCV as an OSV format because like rightnow we consume the CVS from the KKrepository issues uh yeah issues and uhthat's like in a very like free form sothat would be nice to have likesomething nicely formated that we canconsume distribute uh along the JSON andRSSfeed so with that toKen so if you're dying to get involvedwhich I assume you are because we justtalked about a whole bunch of reallyamazing things and great opportunitiesum there's some information up here onthe slides uh the darker blue one thatsays SIG security above it is a link toour GitHub repo that has our readmewhich will get you into the mailing listand you can read about any of thewonderful things that we've talked abouttoday the one on the right will get youinto Kubernetes Slack and you can findus at SIG Security and I mean if youcome to these talks we're all allbegging you to join us but just for anexample um my first ever KubeCon wasValencia and I attended this talk with asome of these still here some of themnot gone on to new things um and I feltlike I had seen my people and I had seena spot where I could just exist safelyin the community um and so I decidedactually with an audience member Dannythat we were going to try and go fromKubeCon attendee to maintainer in uhbetween KubeCons basically we appliedfor a talk the talk got rejected but wedid the work we went to the meetings weshowed up we became known and uh nowhere today I'm a co-chair which is wildpeople ask me "How did you become acoach?" I was like "Well sir they'regoing to let you grow they're going tofoster you they're going to fertilizeyou are going to help you becomewhatever you want to be if you don'twant to be up here you don't have to beyou can do all sorts of stuff so pleasejoin us um we'd love to have you we areWe don't have any requirements at thedoor except be kind and welcoming um aswe will be to you um and we'll help youout and then finally um we have sometime for questions comments concerns youcan ask us now or if you'd like to youcan um come find us later or find us ontheinternet ohmike is runningi'm a brand new person who doesn'treally have a lot of experience insecurity and I find the idea ofcontributing to SIG security to bereally intimidating what would you sayto somebody like mei would say within SIG security we havea fair number of folks with a lot ofexperience but not a lot of time andwith kindness in their hearts and sosomebody like you who would like tobecome involved would like to make somecontribution but is afraid ofit we are ready for you because if youwould like to try something you can havethe expertise of the whole group backingyou up to help you to make sure that youare on the right track to provide youwith encouragement and so I would sayplease come we would we would love tomeetyouany more questionswell I think then we've accomplishedwhat we all came here for thank you somuch for coming and hopefully we willsee you on Slack[Applause]2025-04-15 21:58:21.615808e out why it failed um it is veryexpected that you're going to requiremanual fixes to generated files thathumans should not be touching becausethey're generated files and there is nodocumentation to guide you uh this isessentially only doable by one or twopeople the one or two people who builtand maintained this tool uh Xander and Ican't do it we've tried and we'retechnical leads we should be able to doit but we can't because all of thesefailure modes are undocumented which isuh not great the code itself is alsoalmost entirely undocumented uh this isobviously not at all in line with ourideals as a project or especially as aSIG seeing as once again we are SIG docsand we don't havedocs all right so now that youunderstand that this is a perfectprocess 10 out of 10 no notes um I'mgoing to talk about what some of thegoals look like going forward and andhow we'd like to ideally change thingsum so when we think about you know howwe operate as a SIG um we would love fornew contributors to be able to get inand play with like the reference docs umwe often get a lot of new contributorsin SIG docs it's like one of the sigsthat contributors often start with andum it would be cool if um this codewasn't like a rat king that theycouldn't interact with at all and wassomething that we could actually use tobring people into the project with andget them up and runningwith um in the past these docs hadalways been generated by the releaseteam on uh a release and um as you cansee that that process as it's broken umcan't be done by the release teamanymore um can't can't be done by me andcat um there is one maybe two peoplethat can do it um depending on the dayor uh availability um yeah it's it'sdeep magic i don't entirely know um butwe would this is the goal for us is toget it back to the release team beingable to do thisgeneration um this is I don't have a lotto say here uh this is like bar on theground um as Cat said we're SIG docs umwe should have it documented um thereshould be some comments in the code anduh yeah that's that's that um how canyou help um so this is actually themoment when you find out that you've allbeen lured into a cry for help um it'sYeah umyeah um that's what this talk is um so Ithink a good wayfor contributors to like start helpingout with this is something as simple aslike improving theexisting thing that we have and that'sas simple as going through and addingcomments to the existing code orcleaning up things where they could bebetter maybe reducing the ways in whichbash calls Python calls Go um you know Ithink there's a lot of room to take whatwe have and make it a little bit cleanerum and yeah we would we would love tohave some new contributors to the SIGexplorethat um and then this is the idealfuture state hi apple pie in the skygoal um we would love to actuallyproceed with a full rewrite at somepoint in the future um and you know thisis something that we would love to getsome contributors jumping in to maybestart thinking about the design for thisand what it could look like um and so iffolks are interested um you know it'd bea good time to join the SIG meetings umand start talking aboutthis um and then lastly you know thisthe the reference doc generation toolcurrently is entirely customwritten umbut you know the Kubernetes API is it'sit'sopen API spec so there is tooling outthere for this um so if anyone isfamiliar with the open API ecosystem umideally when we move forward with arewrite we'd be able to utilize existingtooling um to make you know this thisgeneration a bit easier um rather thanhaving it built completely from theground up um yeah please help um we doknow that the the reference docs are ahugely utilized part of thedocumentation like we have the analyticsto see on the website how much they arelooked at and um it's you know much likeopen source it can blow over with astiff breeze um and it doesn't even needa stiff breeze for this part of it so umthat's where we are i probably undertime but thank you we can take questionsum helpmicrophoneprobablynot sick all right photo home2025-04-15 21:58:22.074151 <<�9g#�+ARdT6P5x_fDMwe were betting on how many people wouldcome to this talk because nobody comesto maintainer track talks um and Xanderbet uh less than seven and we haveunfortunately beat that number thank youthank youuh welcome to SIG Docs and youmodernizing API reference generation myname is Cat Cosgrove and I work forWaylandUtani uh I am Xander Jerbinsky and Icurrently work for Shinraoh I guess I should mention that we'reboth uh SIG docs technical leads whichis obviously why we're giving a talk onthe maintainer track but uh we're goingto talk about the process for generatingthe API reference docs for Kubernetesand cube control um first I'm going todescribe the current process note thateverything I'm going to go through hereis something that has to be done twiceonce for Kubernetes and once for cubecontrol which is lovely and not at allinconvenient so uh the first thing wehave to do is create a local workspaceand set our go paths and then get alocal clone of several repositories umthis starts out pretty pretty easypretty straightforward this is alldocumentation from the uh Kuberneteswebsiteyou also need the K website repo and theKK repo which you have to rename forsome reason that remains a mystery to mebecause I did not build thistool and then we have to set someconfusing build variables that willrepeatedly be a problem for you whileyou're trying to do this entire processum it's it's worth noting that this toolis actually uh a bash script that callsa Python script that calls several Goscripts which is normal and sensible wayto uh buildsoftware this has to be done for everysingle release twice so we need aversioned directory and we've got tofetch the open API spec this isalso pretty easy at this point thingsare going well um you're not going torun into any problems just yet all ofthis is going to go smoothly and but itit will go off the rails prettyrapidly this is where things start to gooff the rails um I am not quite sure whythis tends to fail so hard um but itdoes every single time um all we'redoing is making the copy API andgenerating those two files but this isusually going tobarf um this is this is straightforwardwe're just modifying some markdown umbut this does have to be done for everysingle release it is manual it isannoying it is consistent acrossreleases you always have to do this soit does seem like uh a thing we couldautomate or a thing that could behandled by like a flag when you'rerunning these operations the first timebut instead we're opening a markdownfile and changing the exact same thingevery timetwice uh this will fail every singletime you need to locally test the APIreference um this is not going to workreliably i think it used to um a longlong time ago before things got morecomplex this this process used to workwell um but it it no longer does um assoon as you try to actually update andbuild all of this and then locally testthe API reference it starts to fallapart pretty uhirreversibly so the process as it'sdocumented makes it look like it's veryeasy um it it makes it look like it'svery clear gives you expected inputs andexpected outputs and you you would thinkthat that's the way things actually workbecause we're SIG docs right we'reresponsible for the docs so surely ourown docs are good um alas that isuntrue so this thing fails regularly itdoes not have helpful or coherent errormessages you are almost certainly goingto have to read a stack trace to findout where something actually failed andGod help you when you're trying tofigur let's talk about differentfeatures of Kerno and what they cancurrently do for you and this will alsoact as a summary of current capabilitiesand use cases so starting withvalidation so validation as the namesuggests validates a resource so it's astandard yes or no policy where wheneveryou give this policy a resource it willrun the validation checks as you havedefined in the policy against theresource it will give you a yes or noresponse and you can do whatever youwant based on the response that you getso if you want to outright block theresource when it fails the policy youcan do that or you can just like createa report uh with all the results in itto use it for compliance reasons rightum this is the most common policy typethat we have it can be used for variouscases like ensuring that a label ispresent on your resource or somethinglike the the resource requirements likethe CPU and memory limits are setproperly you can use it for that toocurrently we supports validation usingpatterns which is a declarative way ofspecifying validation logic so as youcan see in on the screen there's anexample where you are where we'reverifying that the label name is setproperly on your resource if you have acomplicated condition you can use thatour condition block which uses Jamespath and James path functions uh for profor writing complicated conditions wealso recently added support for cell andwe also support port security policiesum besides this you can also runbackground scans against your resourcesso um you can just periodically runthose policy against all the resourcesin your cluster it will create policyreports for you and you will be aware ofif every resource in the cluster iscurrently um compliant with yourstandards and this is great for timeproofing and preventingmisconfiguration moving on so as uhlet's say in case you uh instead ofwanting to validate resource you justwant to mutate the resource tweak it upa bit um to meet your requirementsinstead of just blocking it you can usea mutation policy so as the namesuggests it will mutate the resource anduh and mutation always occurs beforevalidation in the admission chain whichmeans that you cannot bypass anyvalidation logic just by using mutationpolicy um currently we support bothstrategic merge patch and JSON patch sostrategic merge patch would be yourdeclarative way of specifying how youwant to mutate the resource you can seean example on the screen where uh we aremutating the security context and youcan also just specify the patchesdirectly using the patch uh JSON patchformat right you can just specify theoperation path and value it will do itfor you and even if you have installedkivero later um on cluster and you wantto mutate resources that are alreadypresent in the cluster you can do thatusing kivero as well we also havesupport for mutating existing resourcescoming to the case where you'd want tolet's say generate or create a newresource when a condition is fulfilledyou can do that using kernel generationpolicy a use case for that would belet's say you just created a newnamespace and you want to ensure thatthat name space has a network policy init by default you can do that usingkuberno generate policy um how you woulddefine that um you can define thatdefine the object you want to createusing um uh either you can specify aclone a clone object a source which canbe an object present in the cluster soyou can just use let's say an existingnetwork policy as a source or you canjust directly define the YAML of theobject you want to create in the clusterpolicy itself and when you're using asource um Kerno make sure that theobject is the new object that is createdis synchronized with the one you use asa source so if you make any change inlike the source resource the cloneresource will also have that change sothis is very good if you have a secretthat you modify if every single secretthat's created from that will also bemodified and it is good it providestamper resistance and it is ideal formulti-entworkloads next um a good use case thatyou have seen from the community isresource optimization where you cancreate a policy which will matchstateful sets and deployments andwhenever those are created it willcreate a vertical bot autoscaler forthem and this is a policy that was usedby Adidas to reduce the cost of runningtheir cluster by 50% and this was amajor part of how they did that next oneokay um coming to cleaning up resourcesso we have this thing called a kernocleanup policy which will periodicallylook for resources in your cluster andit will um find them and clean them upfor you on a regular basis and this islike ideal for removing unused resourcesor resources that violate yourconditions um we also support anotherbehavior for cleanup where you can likeset up a label called TTL it'scleanup.io/TL and you can let's say setthat on a board with a duration so youcan create a pod with a cleanup TTLlabel set to one day and after one dayKerno will automatically like deletethat resource for you so this is alsovery good for setting expiration date onresources uh one more use case for thiswould be to cleaning up pod disruptionbudget so sometimes what we have seen isthat um some pod disruption budgets theydo not get deleted and sometimes theyblock nodes from being turned off whichcan which can cause like extra cost sowhat you can do is you can just create acluster cleanup policy which will findthose pod disruption budgets andautomatically delete them like and helpyou in resourceoptimization the last one would be imageverification so since Kubernetes um isbased on containers so it is paramountthat you only allow trusted containersin your cluster right and one way to doit would be to making sure that you onlyuse trusted images and you can do thatby using image verification and signinglogic so you can verify images that weresigned by either no notary and cosignboth of the major um image verificationsolution and you can verify thesignature on your container images youcan also verify the signatures onattestations like esper or vulnerabilityscan reports on those images you canlet's say if you have a sb and you wantto verify that it has um the rightlicenses or like the right dependencyyou can also do that using kerno weallow you to um check the payload objectand make sure it has specific conditionson it um we support cosign keyless andcertificates and recently we also addedsupport for the new GitHub artifact atastation so these are all the kernofeatures and we can use them for anycompliance or security or automation usecase might you might have of going onthanks so yeah now that we know whatKaberno can do and what different typesof policies we have why do we want tochange Kyerno in its way it's workingtodayso um many of you might heard thatKubernetes nowadays has an inbuiltadmission policy feature and provides anadmission uh policy type for validatingas well as mutating and Kubernetesdecided to go with cell as a solution todeclare expressions for this kind ofpolicy types on the other hand we haveKyerno which also evolves over time andbrings today many different ways toachieve the same or similar logic toyeah declare how or what your policyshould do we have for example patternmatching as described we have assertiontrees we have a basic cell supportalready but this brings also somedifficulties for us as a maintainer andfor you as a user because we providingin the end one really large APIum and you have one single C for eachtype of policy so you might not knowwhat is the best solution for me shouldI go with James path should I maybe gowith cell will uh pattern matching dothe trick for me it's really hard todecide and also it's sometimes hard toknow what fields or what configurationsI need for my type of policy and how itwill affect how my policy is executed inthe cluster for us as maintainers andalso for the community to support it'snot easy to say you should go with oneor another solution as it's reallydepending on the user on the environmentand on the scaleso um let's take a deeper look on cellwhat kerno kubernetes choose for theirsolution and what makes it different soon the first point we already have areally large feature set we have mayofeatures as as James pass alreadywithout any custom um extextensions we have a really largecommunity which already provides a largelibrary of different um features you caneasily add to your cellenvironment this makes it really simpleand powerful and it is side effect freeso it doesn't um mutate or change theinput in any wayso Kyerno always had a goal to becloudnative as possible and to make iteasy for you as a user to switch um fromother solutions or from the Kyernoadmission validation policy to Kyerno touse the extended feature set so it'smore like natural that we now take alook on what Kubernetes does and how wecould utilizing it to make Kyvernobetter for you as auser and improvements we wanted toachieve as a list we want to be simplerand expressivewe also use this as a chance to improveourperformance but also retain our currentfeature set so it was not an option togo with cell now but limiting thefeatures Kyverno already provides to theuser because it doesn't help if the userhas another way to write policies butcan no longer achieve what it uh whatthe user achieved beforeand as mentioned Kubernetes native we umdecided to go with the standardizedKubernetes API as much as possible so webasically reused the API for validationadmission policies and just extended itwith um additional features Kyernoprovides for you as auser in the last releases we left theKubernetes space and are also be able tovalidate or gener uh in general operateon resources outside of Kubernetes andwe also want to keep this supported andthis means we have now one single API orC in case of validation for example tovalidate all kind of JSON based payloadsas well as your Kubernetes resources soyou no longer have to learn differentthings if you want to use Kyerno JSONfor external resources or if you stickto Kubernetes resources and use thingslike patternmatching um yeah as mentioned we want tokeep our functionality this means for usthat we decided to extend the cell umimplementation we are using with ourfeature set and adding libraries thatyeah implementing uh given features likeum resource lookups external servicecalls config map lookups and otherfeatures you know from the existingAPI and yeah in the end we have a betterbalance between be declarative and havean imperative syntax with the cellapproachyeah yeah so based on the limitationsthat we just identified and the featuresthat we want we decided to create abrand new policy types and these policytypes will be simple and they are veryexpressive and they will retain all thecurrent kerno features as well as wewill add new features like support forany GSON payload right so here's what wedid we created five different policytypes so previously we only had onecluster policy but we decided to splitthem into all of the five differentrules that they have which creates avery simpler API and we created thevalidating policy image validatingpolicy mutating policy generating policyand the cleanup policy the first twowill be available right now in 114release and the next three will be addedum in the future releases based on thefeedback and what community wants solet's get into the first one thevalidating policy if you look at thepolicy um from the overview of thepolicy it looks very similar to thecubernetes VAP API because that's theAPI we based this thing on um but if youlook closely you will see that we haveadded some um small things like theevaluation config and you will see anaudit annotation at the bottom so whatwe have basically done here is we havetaken the VAP API and extended it withfeatures that cannot be added in the APIserver and VIAP because of performanceconstraints so the features that weadded can be some of them are let's sayif you want to make an API call um to anexternal server you cannot do it in theAPI server because they have timeconstraint but you can do it here youcan use our own http.get library to makean API call you can also fetch anyresource that's pres present in thecluster you could already do it do do itin a cluster policy that feature is alsoretained here and let's say if you wantto create a report and you want to makesure that the report has some customproperties you can also add them usingthe audit annotation thing so we stillhave the um evaluation config you canseeinspect.ealuation you will see there's abackground enabled true so the thosethings replicate the background scanningbehavior that we previously had and wealso have the admission disabling soevery single thing that you know fromcurrent verno will still be available inthese but they are we just created asimpler and like more expressive API sotalking about expressive API we saidthat we would support JSON payload sowhat we have done here is we have addedsupport for JSON so you can use the sameAPI to create a policy that will be youwill be able to use on any JSON payloadso what you have to do is you'll have tojust set the evaluation mode to JSON andthen you can just write any validationlogic in cell so in this example we arebasically um verifying the JSON um theparse JSON value of a docker file andwe're running some checks on it todisable um use of curl command right soyou can do all of these um with this APIthe same API you won't have to usenojson or any different um solution thatwe have we just have one single API thatwill be able to satisfy all your needsin cub in Kubernetes and outside ofKubernetes as well so yep based on thewell-known betting emission policy APIwe extended with kerno features that youalready know and love the policyboundaries have been made clearer byseparating the policy into their ownCRDs and we have also added support forany JSON payload now so the other policythat we we're going to talk about herewas the image validating policy this isalso based on the VAP API so just likethe VAP it still has the validationblock where you specify the expressionbut if you look closely you will see ithas some special things like you willsee there's an images variable here andthere's this thing called attesters andthere are functions that are not presentin anywhere else so we have added thesethings ourselves to facilitate imageverification in these different type ofpolicy and so like we retain everysingle behavior that we currently hadand we added some more here as well soone thing that we added for imageverification wasat basically any trusted authority rightso you can have an attestation of typenotary or cosign all the current cosignfeatures like keyless or key kms thatyou already know they're already presenthere um you can also specify notaryadister using notary certificates thatyou might create and this this variablecan you can specify these and thenaccess them in the validation block andyou can run like set expression on themto do image verification other featuresthat we had to add was an imageextraction logic because we want tosupport image verification on any JSONpayload um we'll have we will need yourhelp to um tell us where the images areright so if you have a generic JSON youwill have to tell us where the imagesare and for that what we have done is wehave created an images variable you canjust specify the location of the imagesusing cell and we will find those imagesand then you can access them to runimage verification against them youwon't have to define these if you'reusing just a pod or pod controllersthey're autofilled for you in those casebut for any custom resource or JSON youwill have to specify these yourself umat aist stations as I previouslydiscussed J um um it can be sbombs orvulnerability scan reports you can umyou can already you can still use thecosign into addestation and the OCIrefers API attentation you can definethem in your policy and then refer andthey can refer to them in the validationblock using cell so here's a sampleimage validating policy so you can seethere's the attesttor block there's anattestation block and there's averifications block since this policyonly matches Paul you won't need todefine the images variable here and ifyou look at u verifications block youwill see that um there is a expressionand in the expression you we are loopingover all the images in the containersand then we're using the verify imagesignature function and we verifying themagainst the notary address that wedefined here right and the second oneonce that once the first one passes itwill go to the second one and it willverify the signatures on thevulnerability scans and once that isdone then you can specify a conditionthat will find the payload of thevulnerability scan report and then itwill make sure that that payload doesn'thave any critical or high vulnerabilityscan um vulnerability um in it right sothis gives you a more expressive syntaxfor writing complicated condition youcan um write it on JSON now as well sothat's that's a very nifty feature thatpeople wanted and yeah so as I said allwe can add complexation using cell allthe features from image verify fe verifyimage rules are already present and nowyou can just use it on any JSON payloadso we will have other features like ummutating policy and generate and otherones in the future releases right now wehave these two and you can try it out inthe new one release i think the RC forthat is out already and you'll be ableto access it um there so let's move tothe demoum yeah so before I go strict into thedemo I prepared a QA code to arepository which has all the uhresources I will showcase in the demo aswell as uh links to the demon uhdemonstrated demonstration setups uh toourplayground um so you will be ablewithout installing any LC orum test build of Cavono to try it outand experiments with it and yeah everyfeedback will be uh verywelcome so let me quickly I can show theQ code later again if anyoneum was not able to take a picture sothis is how the repository will looklike you have the different links to ourum playground i will showcase this in aminute so every demo we will presentingum are here and you can try it out andmodifyit uh so let me start so this is our umplayground where youcan try things out um this is uh wealready have a preview build for 113 um40 where the new policy types areavailable so as you already saw this isa very basic uh no let me go with avalidating policy um which checks uh adeployment label so we using a matchcondition to targeting deployments inthis case um as in previous or for ourcurrent policy uh C you can also justtargeting pots and we still uhautogenerate our rules for othercontrollers like replica set deploymentstateful sets and so on so this featurewill also be available for um the newpolicy types and you can still configureit over anannotation and yeah now we have avariable block where we basicallyusing an end label from the resource wewe are getting and checking if it's ifthe value is pro and in the actualvalidation we're just checking if thevalue of the variable is true if so umwe will get a success if not we will getan error message and on the right sidewe see um two example resources so wehave a deployment with the expectedlabel on it and we have a bad deploymentwith the wrong label on it and you canrun it and as you might guess the gooddeployment passed the evaluation and thebad oneum provided our defined error message sothis is a very basicum policy and as I mentioned weimplemented different ways toum reimplement the features Kao alreadyhad one of them are the resource lookupso we implemented a resource librarywhich makes it possible for you toaccess other resourcing uh resources inyour policy so for this example I havea simple config map with a single itemit's a list of name spaces I want toallow in mypolicy and in the policy we use avariableagain to yeah get config map from thedefault name space and the given name ofour config map and in our validation wecan access this variableum the data and splitting the list so itconverts from a string to a list andthen we checks via the in operator ifour nameace of the object is part of thelist we defined in our quantific map ornot and also in this one I have anexample with a test which was part ofthe uh quantic maplist and a second oneum which is not partof the list so and also in this case yousee it's passing and failing asexpected and in our error message inthis case I used an message expressionso you can also use cell in for exampleyour message or audit annotations todynamicallyum access variables values of yourresource in the express in the messageand yeah in this case we see that onlythe dev test and stage name space isallowed but in this case we had a uh pronamespace um the last thing I want to showis uh also is a is a helper um functionwe providing um in this case it's theimage data helper so it passes yourimage string in your deployment potwhatever and you can do easily checksfor example in this we are checking thatall containersum inour potum have the GitHub container registry asits image registry so when we have adeployment with GitHub registry itshould work if we have one without thenit should fail and also in this case umwe see that the expected resource passesand the other onenot so yeah this are my examples for thevalidating policy and now we continuewith some examples for the imagevalidationpolicy okay so let's load a policy toverify image signature so you can see wehave a image validating policy which mamatches on all the parts in the clusterand then um this is the match conditionto verify that the prod is true thatdoesn't matter here and we are onlyverifying images that are present in thegcr.io registry so you can specify aglob or a cell expression here if youonly want to verify certain images onthat resource and in the addistersection which is the trusted authoritywe are creating a notary adister u andwe're passing a cert value will beverified against the signature on theimage andthen you turn thaton and then we are running somevalidation checks here so in thevalidation check what we're doing is wehave a images variable and we'rechecking all the images on the containerwe map we are mapping against all ofthem and then we are running the verifyimage signature so this is the functionthat takes the image as the firstargument and a list of atttor you wantto verify it against and then in thereturn value it will give you the countof that passed the image so what we'redoing here is we are passing the singleaddress that we had and then we arechecking that the value of uh the returnvalue is more than zero so one of theaddress you provided verified the imageso if we run this since this making athis makes a network call it might takesome time let's see oh quick so thispassed the image because we're using thesigned one and if I use the unsignedone it failed because that one did nothave a signature that can be verifiedagainst this policy nice so that's theexample for image verification now let'sjust look at um atestation verificationso here we have a similar policy butinstead of verifying the image we'reverifying the sbomb attestation so it isa refer API attestation which means itwas um attested to the image using theOCI referers API and what we doing is weare using the verify at stationationsignature function we're passing theimage we're passing the atestation andthen we're passing the array of atesterswe want to verify it against and we'redoing the same thing we're making surethat the value is greater than zero andif we run ithere this one um and if you this passesbecause the image has the add a stationand quickly let's look at one more umhere's one where you want to verify thatthe payload of the addestation is theright one um again similar policy butwe're just checking in the secondcondition that we're fetching thepayload and verifying that the bombformat is set to cyclone DX and if yourunit this should pass and if I just changeit from cyclone DX to say um yeah tempsomethingthis fails because this is not the notthe L bomb format that was defined inthe um in the payload right so this isthese are the two policies that we haveavailable at the moment you can just trythem out on the playground on the linkthat we have which is where's thethis one right thisone just give me a second should be hereyeah so you can just check out the umthe examples in the repository that wehave provided and yeah that's all we hadto talk about so do you have anyquestions[Applause]2025-04-15 21:58:22.646415 ��j#��EAaYGGnDDGX-Qall right welcome everyone Hi thank youfor coming This is the uh linkeryproject update Um my name is Link Myname is Linkerty My name is Alex LeangI'm a project maintainer on linkerty I'ma software engineer at Buoyant and I'mhappy to be giving this update on kindof all the things that are new and goingon with the link project It's been areally busy year for us Uh the pace ofexecution on this project has just beenlike through the roof So there's a lotto talk aboutUm the��=i#��1A7U6nAxUxG6call right I think we're actually livenow and I don't have to yell quite somuch So this is the Emissary Ingressmaintainer talk where we will be talkingabout version 4 in the road ahead Umquick show of hands How many of you haveworked with Emissary or running it okaya much higher percentage than usualwhich is interesting Um I do have someslides and I'm Flynn You can reach me asFlynn at.io Everybody here probablyalready knowsthat the uh the takeaway from theseslides real��_h#��uAL13y_-zLin4so welcome everybody to our talkunlocking the future of kubernetespolicy as code with kaberno so I'm FrankI'm a senior software engineer atpneumatada and I I'm visual and I'm akiburn maintainerum yeah let's start with the first topicum or our short agenda so first we talkabout what Cavverno is for this folkswho doesn't work with it yet um what itdoes today and how you can achieve umyour goals with it then we will go intothe next features of our upcomingrelease which will change the way Kernoworked before a bit and yeah at the endwe showcasing these new features withsomedemonstrations and uh we have preparedsome stuff that you can try it outyourself if you like to so let's gettingstarted with what is Calono um who ofour audience already using it in someway nice that's a lot thank you so whatis Kyvernoum Kyerno is a CNCF incubating projectthe name comes from the Greek word uhword to govern which means basically uhgovernance so it's a policy engine builtfor Kubernetes in a Kubernetes nativeway to help to achieve what yourcompliance uh or what your workloadsneeds to be compliant um Kubernetesnative means that it's only usesKubernetes native um logic or tools towrite policiesum like or it's basically relying onlyon YAML so you don't need to write anyother programming languages in or umwhat is needed in other similar toolslike OPA or Cuborn for example so youonly need YAML to achieve or to writeyour policies it's working as anadmission controller so it alreadyreviews and validates your workloadsdoing the admission review it also worksas a scanner so you can um validateresources which are already applied in acluster so when you have an alreadyrunning cluster for a while and decideto introduce Kyerno you can just runyour policies in a background scanningmanner and already check how yourexisting workloads are compliant withyour given rule setwe also providing uh features forauditing and reporting so that you canuh see in an easy way how well yourresourcesum yeah work with your current ruleset um so why Kyerno as I mentioned as aKubernetes native tool it's easy to useso folks which had no experience withother Kubernetes or policy engines inthe past can easily start adopting it wehave a very large library of predefinedum policies for different use cases umyeah we have a really active communityso witharound 3,000 users in our Kyvernos Slackchannel you get really fast help if youstuck somehow and um yeah nowadays wealso support payloads outside ofKubernetes as long as it's based on aJSON formatokay so ly boils downto yeah we don't have enough people toget get done what we want to getdone So now is an excellent time to comeandhelp For the rest of this bit we willtalk about the purpose of the projectfor people who are new to it We'll talkabout the past the present and thefuture And then there will be a bit forI think I labeled it discussion becauseusually in this particular crowd is onethat wants a little bit more than justasking questions and having answersshouted at them from thestage For anyone who's not familiar withthis the purpose of emissary is fairlyeasy to state If you have a cluster youhave things running inside the clusterand you have users outside the clusterYou would like your users outside thecluster to be able to use things thatare inside the cluster However clustersdon't likethat This is what we basically call theingress problem How do you arrange it sothat it's possible for people to use thethings in your cluster safely fromoutside the cluster and the usual way wedeal with this is just to put somethinghere at the edge that can then go andmediate all of your requests safelyEmissary is a thing to do thatUm this class of thing is basicallycalled an Ingress controller named afterthe old Ingress resource and the Ingressproblem because the old Ingress resourcewas a way to try to talk about solvingthe Ingressproblem Uh emissary doesn't really havea lot to do with the ingress resourceAs a class ingress controllers includingemissary are always able to do basicrouting saying "Hey take this piece ofthe URL space and send it over to thisservice." Emissary can also doauthentication andauthorization and traffic splitting andcanaries and AB testing which is reallythe same thing as canaries just slightlydifferent logicuhretries circuit breaking rate limitinglots of otherthings Because emissary is an Englishcontroller that can do all these thingswe tend to refer to it as an APIgateway Specifically MSER is a developercentricself-service role-based opinionated CNCFincubating open-source API serverUm I'm sure I could find some morebuzzwords to throw up there if anybodyreallywants The bits of that that are to mevery very important are actually all ofthem but developer centric bit tends tocome first The point here is that Janethe applicationdeveloper is a person who tends to thinkof Kubernetes as pure frictionShe does not want to have to wrangleKubernetes She just wants to be able torun things and let people use them SoJane ideally will be able to do this ina way that makes sense to her and fitsher mindset which the mapping resourcetries to enable that And we will comeback later to whether or not itsucceeds Jane can go and do all of thison her own herself It is entirelypossible as an application developer toinstall an emissary configure it on yourown use it and not involve any otherperson ever We know people who do thingsthis way But it is alsopossible this is the role-based bit tosplit out the more infrastructure bitsof this into a separate role filled bypossibly a separate person and then letyour application developers only worryabout the bit that they care aboutspecificallyThe opinionated bit comes in becausethere are a lot of really cool featuresthat we could throw into things thatwouldn't get a lot ofuse Um these esoteric but kind of coolthings tend not to be present inemissaries configuration languageThis brings usto kind of an interesting point aboutall this which is that the modern IPAgateways if you look at what they can dothey're mostlyequivalent except for their inputlanguages I personally tend to thinkthat emissaries input language is abetter fit for a busy applicationdeveloper than anything else that's outthere I would like you all to recognizethat I am extremely biased when it comesto thisAny questions sofar allright On Well onward seems odd whenwe're about to talk about the past butlet's talk a little bit about the pastOnly a littlebit Emissary has been around since2017 One of the interestingramifications of that is that emissarypredates CRDsUm this led to some interesting designdecisions in the original API where wedid things that simply don't work withCRDs today The uh canonical example ofthis one is that this thing calledambassador ID was either a string or alist of strings and you can't do that ina structural CRD definitionSo V1 and V2 simply don't obey the rulesof modern CRDs which meant that when wefixed all of that in V3 alpha 1 we hadto use conversion web hooks to deal withthe breaking changes And this wasremarkably unpleasant and nobody reallywants to do thatanymore By nobody I mean none of thedevelopers and none of theusersUm the trick is that getting rid of theweb hook for good is complexKubernetes versioning doesn't actuallywork the way most people think it worksThe API version field that you hand overto the API server in a transaction ismuch more about the version of theschema you want to use for thatparticular transaction than it is aboutthe version that's actually storedanywhere And so trying to properlymanage versioning while not usingconversion web hooks while supportingCRDs that had breaking changesuh is reallydifficult Another fun bit here isthat if you're using the V3 Alpha 1 CRDsthe API group has to get ambassador.ioin it Um Emissary the project doesn'town thatdomain and Ambassador the company isn'treally doing Emissary anymore So we needto get rid of that domain which meanschanging the API group and the wholeconversion web hook thing doesn'tsupport that atall As in apparently this is a thingthat nobody in Kubernetes has ever donebefore It's probably a little strong butthere's no support for it in the APIserverThere are actually opportunities here aswell and we'll come back to that in amomenttoo All right the present Um this is aslide that I put about put up about thepresent in Salt Lake We can shrink thatdown quite a bit We pretty much havefigured out how to be a communityproject We got pretty much completelydivorced from Ambassador Labs Um it wasamicable worked out okay The tricky bitis figuring out how to get things donewith only volunteers which is like Isaid in the f in the beginning of thisYeah we really need more help with allthis Salt Lake had three goals listed Ofthese threeum we were able to do basically one ofthem By which I mean I tagged 3.10.0right before walking into this talk Soit is real now It exists It is a finaldevelopment final release thing Uh itwill be the last of the Emissary 3releases and there are some caveatsaround that in that we still need tostrongarm GHCR into having the artifactspushed to the correct place So watch theSlack channel to get notified when thathappens It'll be soon next couple ofdays Not quite one out of threeis a little unpleasant It's better thannone out of three but not what we wouldhave hoped So yeah for the love of allthat is holy people come help Comeon For thefuture Like I mentioned there will notbe any further Emissary 3 builds unlessone of you want to do them You don'tLet's just be clear about that bitUm goal number one for Atlanta is to dothe thing we originally set in Salt Lakeand bloody well get rid of the getambassador.io docs and have them all atemissary.dev There's an enormous amountof chop wood carry water type of workwith all this If you are interested ingetting involved this is a great placeto get involvedUm it willbe probably a bit tediousand oh my god so very muchappreciated Goal number two for Atlantais yes Emissary4.0.0 I would actually like for that tobe 4.0.0 a real thing but I'll settlefor a development build Now we have atest build of this already It runs onARM 64 It was very cathartic to take thechainsaw to a bunch of code in it thatdidn't need to be there anymore I reallyenjoyed that partUm the next step is to rip out a bunchof CRD conversion code that's completelyuseless right now and that will also bea lot of fun and I'm looking forward toitUm it's a little challenging and one ofthe things here is yes we will need todo this new CRD group the new API groupWe don't really have a choice in that Ithas tohappen The current plan is for this tobe a migration not for us to supportboth V3 alpha 1 andV4 Um the opportunity buried in here isfirst it's a mechanical trans mechanicaltranslation sort of migration So we cando tooling that will do that for you andyou will be able to run emissary 3 andemissary 4 side by side in the samecluster at the same time with no chancethat one of them will break theother except for port numbers You knowyou will have to have separate servicesfor them and things like that Um that'sactually kind of cool We did this withthe M em m em m em m em m em m em m em mem m em m em ms 1 to m emissary 2 and memissary 2 to msary 3 transitions and wewere mostly able to do this but the factthat it's a separate API group meansthat it's really much cleaner to getthat completely right with fouruh we also get to fix somethings like getting rid of underscoresand putting units on durations andthings likethatUm this is actually mostlydone We made a lot of progress on thisone before deciding that it wasimportant to go back and do 3.10 whichsucked up more time than I would havecared to admitbut this is mostly picking up thingsthat we've already done and continuingforward withthem Now raise your hand if you've heardof Ingate or actually raise your hand ifyou've heard of Ingress EngineXRaise your hand if you've heard of someof the recent going ons going on withIngressEngineX No Okay So Ingress EngineX iswindingdown Um they are to be replaced by athing called Ingate which is anotherproject that is a part of SIGnetworks Uh endgate will be built ongateway APIThe transition from Ingress EngineX toIngate is kind of an urgentthing for reasons that I will let peoplewho are part of that project describe Umthe opportunity here that's interestingfrom emissary's point of view is thatgateway API still does not support abunch of tablestakes features for APIgateways and needing to allow thistransition may end up being a forcingfunction forthat which means that it may be possibleto actually let emissary become a thintranslation layer over gateway API So weget to both support emissaries inputlanguage and not have to maintain thingslike envoy at all which I think would bereally cool And it will be interestingto hear what y'all think about whetheror not that's cool or run flea you knowflea screaming or somewhere inbetween Last slide Like I said beforewhat we really need are people from thecommunity to get involved I'm looking ateach and every one of you in this roombecause that would be a lovely thing tosee Uh you can reach me on the CNCFSlack as alwaysand there may still be people trying touse the old Ambassador Lab Slack foremissary emissary related things I'vetalked to at least one person at KubeConwho said they were doing that Uh I don'thave access to that Slack anymore sothat's not going to work out very wellButdiscussion I would be obviously veryinterested in hearing what you all haveto say We can do this with me talkingthrough the mic so that it's recorded orwe can just drop the mic and I can sitdown and we can chat like civilizedhuman beings Got a mic here sir i'msorry Say again Oh yeah Yeah Great Ifanybody has a question there's amicrophone over here for youYou look familiar This probably isn'tgoing to be a fun questionHey I guess it's me who was using thewrong Slack yesterday I'm sorry Sayagain It's me who was using the wrongSlack yesterday with Okay Uh so I justlearned that uh V1 endpoint were goingto be duplicatedUm I do know that the new version ofEmissary has added support for inputslice but I'm guessing there might bemore uh code cleanup to do or somethingelse to do so that MS handles uh v1endpointsdisappearing I guess And now I've justlearned also that you do not plan tohave any more build for version three Uhactually 3.10 10 supports endpointslices It supports endpoint slices butdoes it support not having endpointsyeah Oh wait Sorry Does it support nothaving end points or does it That's myquestion because they are going away atsome point Um if I remember that changecorrectly it will do the right thingwhether you have endpoints or endpointslices or a combination of the two Eventhough I don't actually no it won't do acombination of the two If endpointslices are present it will use themOtherwise it'll look for endpointsRight Does that make sense yeah Okay Ihope I I hope it won't crash if thisyear for endpoint does not exist anymoreThat's what I'm talking about Oh Oh Iget it Umso essentially I'm asking if this Idon't I don't think it will I thinkit'll do the right thing if there's noendpoint resource Okay Because I'mslightly afraid that an upgrade to V4will become required soonishYeah Okayit's certainly the casethat if we find that 3.10 does the wrongthing if there's no endpoint resourcecorrect my answer will be that soundslike a lovely reason to go to emissary 4Um and I apologize for that other thanthe fact thatI really really do not want to buildanother Emissary 3 We had to reresurrect a bunch of build pipelinestuff to make that happen I don't yuckYes So you had a question over here HeyYeah Thank you for the great talk Uh canyou tell us something about the licensemodel uh because when I remember rightthere's a community license model orcommunity key or enterprise key and soon Is it still present uh will this inthe future still presentor is there something changing and whenI can or how far I can use the communitykey or the community license and whenthe enterprise license starts So thecommunity license versus enterpriselicense thing is a fe a characteristicof ambassador edge stack not of emissaryEmissary is just a normal just an opensource license You can run it forwhatever kind of traffic you want YeahUm as far as I know at present there'snobody you can buy support from for itOkay So the license is only the edgestack for Yes And Okay Thank you Also asfar as I know edge stack still existsbut I don't know Yeah like documentationyou need something CRDs from the edge orsomething like this I do know right Andyeah it's literallybeen like 3 years since I looked at edgestack at allPretty confident the last time I lookedat edge stack was shortly before 2000 orlike mid2022 Anyone else i mean obviously I'mgoing to be here to talk to you for abit after this anyway So if you don'twant to ask your question on mic that'sfine Um but if you do now's your chanceYep Got one over hereUh first of all thank you for for for agreat talkUm we've been using emissary since umback when it was part of Ambassador Labsand installed it using the data wirerepo Uh wow And thenuh and then we I think that got updatedto 3 3.9 and then all of a sudden inJanuary this year it got updated to3.12.2 Uh and I just I got a bitconfused in in terms of what's thedifference between 3.12.2 I guess that'sambassador labs You're not alone in yourconfusionUm this is a question that's a littlebit interesting to me because the honestanswer is going to sound like I'm tryingto throw Ambassador Labs under the busand that is not my intention But thehonest answer is that Ambassador Labsdid a 3.12.2 release of Edge Stack uh ofAmbassador Edge Stack and something inthat machinery ended up dumping anartifact named 3.12.2 twointo some bits of emissariesartifacts So that is actually not anemissary artifact and it won't work Umwhich is part of the reason why we'relooking at let's just use GHCR foremissary going forward to get rid ofthat confusionThank you I uh I should also point outthat when we first heard about that thenI went back to Ambassador Labs actuallyI think they're just called Ambassadornow but I went back to them and said"What what's going on with this?" Andthe reaction that I got was universally"Oh my god that's not supposed tohappen." So they they were not trying tobreak anything Um and they have as Iunderstand it they've gone through tofigure out what happened and make itnever happen againSo yeah that that confused a lot ofpeople as it turnsout Anythingelse all right I think we can close downthe mics and such I'll be here for a bitif you want to ask further questions orcome by and say hello or whatever Thankyou very much Appreciate it2025-04-15 21:58:23.242518re have been in fact a lot ofthings uh going on with linker at thisuh KubeCon So we're kind of towards theend of the week now but we've had awhole bunch of talks all through theweek We had a great uh linker day uhback on Tuesday Uh a lot of reallyreally cool talks there Um on Wednesdaythere were a couple of talks I gave oneUh William Morgan gave the other Uhthose recordings if you weren't able tomake it to those I highly recommend youcheck those out when the recordings areavailable Um and then here we are onThursday This is the linkery updatewhere I'm just going to give kind of aoverview of where the project is atwhat's new what's going on what's comingup Um of course we're also in the uh theexpo hall So if you have more questionsabout LinkerDert come find us there anduh happy to chat about all all thingsservicemesh Um if you don't already know whatlink is um it is a service mesh So it'sultra light ultra fast security first isour positioning Um it's builtspecifically for Kubernetes So it'sKubernetes native Um it's been inproduction for eight years now at avariety of different companies atvarious different scales and in variousdifferent environments Um so it's it'sdefinitely battle tested and battlehardened Um we're a CNCF project We'regraduated and uh I'm really proud of thework that we do on itUm and to kind of summarize like in anutshell where we think uh linkd ispositioned as a service mesh we wantedto give every platform engineer in theworld the tools they need to create asecure reliable and observablecloudnative platform And all three ofthose pieces are kind of critical Youcan't compromise on one uh to getanother So the design philosophy forlink is that it should just work um youshould just be able to install itwithout using your brain and uh and itshould just kind of work right rightaway Um there's a lot of tools out therethat require a lot of configuration andlicority can be one of them Uh but youshould kind of have to scale up theconfig that you uh is required based onwhat your needs are and easy thingsshould be easy More complex things arethere when you're ready for them Umanother philosophy is to not be greedyright when we're talking about a servicemesh uh in the case of linkerty we havea sidecar proxy And that means thatproxy is in every single pod in yourapplication in your cluster And so youknow at that scale it's very easy youknow as software developers to be likewell computers are powerful memory ischeap you know who cares let's just dowhatever But in the sidecar environmentevery bite matters And when you'retalking about intercepting uh trafficand all your requests are flowingthrough that proxy then everymillisecond every microscond matters Sowe really are always conscious of thatand trying to uh use the minimum amountof resources that we possibly possiblycan Um I think maybe the most importantof these is is removing operationalcomplexity So any kind of cloudnativeapplication is you know certainly goingto be complex There are a lot of movingpieces There are a lot of differenttechnologies in play There's a lot foryou as a developer or as an operator toto be aware of And we don't want linkdto be kind of an extra thing that makesyour life harder We want it to besomething that makes your life easier Soto whatever degree possible we want tomake that easy to operate have it justkind of uh do what you expect and bepredictable Um and security has to bethere out of the box That'snon-negotiable you know we don't wantinsecure traffic out there and we don'twant it to be difficult to secure yourtraffic So if you install linkerd youget MTLS by default No configurationnecessary Um it it's justthere Um and so what kind of makes linkdunique in this space is that we have ourown proxy that we wrote specifically forlinkd So this was written in Rust Uh wecall it a micro proxy because it isdesigned to be as lightweight and out ofthe way as possible Um so we don't useEnvoy Um we use this this Rust proxy Umand it was written to be security firstThat was the reason or one of thereasons we chose Rust as theimplementation language is because wewanted to make sure uh to whateverdegree possible that we didn't have uhmemory safety issues that we could bereally uh deliberate about the memory wewere using and and using it in a safeand efficientway Um this is built on state-of-the-artnetworking stack So we use librariesfrom Rust like Tokyo Hyper H2 and TowerWe use state-of-the-art crypto librariesto do a MTLS and uh it's a very uh it'sa very modern and um I think cuttingedge project Um but this is all to saythe proxy isn't something that anyonewho uses link should ever have to worryabout You know this should be animplementation detail The service meshshould do its thing and you shouldn'thave to care that the proxy is writtenin Rust or written in Go or written in Cor written in whatever It should justget out of the way consume minimalresources and do itsjob Okay so linkerty year in review whathas happened in the last year sinceKubeCon Paris it's alot Uh so we released linkerd 216 Umthis release had a bunch of stuff in itUh maybe most notable was that we took alot of the functionality that was inlinkerd for per routeout behavior uhlike per routeout retries per routetimeouts per route metrics um all of thestuff that was previously configured inlinkd using something called serviceprofiles Those were trying to deprecateand move away from and move everythingonto uh a more commonly used API whichis the gateway API So we want thegateway AP API to be the common way toconfigure a lot of these things inlinkerd so that you can get per HTTProute per gpc route or even TCP TLSroutes um and have all thatconfiguration live on the gateway API sowe can kind of integrate with the restof the ecosystem Uh so that was added inin 216 We also added support for IPv6 Sothat's uh part of linkery Now um weadded a policy audit mode So linkerd hasauthorization policies So you canconfigure traffic to be allowed ordenied But these are very scary toimplement because if you you know wantto switch your default policy to defaultdeny which is a good thing to do from asecurity posture position uh it's veryscary because you haven't set up thosepolicies correctly you're going to startdenying traffic that you shouldn't bedenying Your application will break Umso we've added this uh audit policy modewhere you can say "Well let me set up myaudit my my my policy my security policybut instead of denying things insteadjust let them through but tell me aboutit so that you can make sure that yourauthorization policies are configuredcorrectly Once you're satisfied withthat then you can go ahead and switch todefault deny and you'll have confidencethat you're not going to break yourapplication in doing so."Um and since we were moving retries overonto HTTP route and gRPC route anyway wekind of did a little bit of clean up andrestructured the way that's configuredto make it more in line with with howthe proxy isarchitected Um in linkd 217 we addedegress rate limiting and federatedservices So this is a big release withthree very large featuresUm so we and I'm so I'm going to spendsome more time in in later slidestalking about each of these in moredetail but each one of these is is isverylarge Uh linkerd 218 is not released yetWe're we're still putting the finishingtouches on this but it should bereleased very very soon We kind of hadto put some of the the work here on holdin order to come to KubeCon and talkhere So as soon as as soon as I'm backhome this is this is going to go outvery quickly Um but linkerty 218 addeduh a more githops friendly to way to dolinkerty multicluster So there were someaspects of linkerty multicluster thatwere not uh compatible with githops andwe wanted to remedy that Um we did somemore work on the way that linkdinteracts with the gateway API and I'lltalk more about that uh in a moment andwe added protocol declarations to linkdso that you can annotate your traffic asa specific protocol in order to avoidprotocol detection I'll talk more aboutthat too Um so this this release will beout very verysoon Okay So to dive a little bit deeperinto egress metrics uh so we added thisnew resource type to capture trafficwhich is leaving the mesh which is goingfrom inside the your meshed workloads touh something outside And so we have thisnew CRD this new resource type calledegress network And this uh allows you toto describe that traffic And so uh withan egress network you can attach routesto it and you can get metrics on all thetraffic that's leaving your clusterright so this is uh a degree ofobservability that you didn't havebefore and you can capture um forexample like TLS host names in thosemetrics you can see which host nameswhich TLS names is my applicationtalking to outside the mesh and you cando an audit of that for exampleum it also allows you to control thattraffic that's leaving the mesh so youcan disallow certain traffic Um so youcan have this for example in this exampor here we have traffic policy denywhich means we're going to deny all ofthe traffic leaving the mesh but then wecan add exceptions to that So we added aTLS route here on the right hand sidethat allows traffic to httpbin.org ifit's TLS for example So it gives youthat egress access control um and so fortraffic leaving the mesh you can decidewhat's allowed to go out and what's notUm we added some uh serverside ratelimiting So uh this is a new resourcetype called the HTTP local rate limitpolicy These policies let you uh specifyrate limits either global rate limits orper client rate limits that let you uhprotect your service from from undueload by saying hey this is the maximumamount of requests per second that wewant to let in And if more than thatcome in then linkerd can kind of act asa barrier there and and reject thoserequests before they get to yourapplication and protect you from thatthat heavyload Um we also added these thingscalled federated services Uh I gave atalk about federated services doing adeep dive on this on uh Tuesday atlinker day So if you want more detailsabout this well there's lots of docsonline but you can also go and find thatrecording um where I kind of dive reallydeep into this feature and exactly howit works and why uh why why it wasnecessary what motivated it Uh but thebasic idea here is that if you've got aservice that's deployed to multipledifferent clusters multiple differentKubernetes clusters Um and that servicehas the same name and the same namespace across all of those clusters youcan merge those together into what'scalled a federated service Uh and afederated service is then just going tosimply load balance over all of theendpoints of all of those services inall of those remote clusters So this isa multicluster feature that lets you uhkind of interact with the service in acluster agnostic way So you're saying Iwant to talk to this service I don'tcare which cluster is running in in factit's running in multiple clusters and Ijust want to access that service as awhole And then if you know one of thoseclusters goes down or if one of thoseclusters is slow that load balancingwill kick in and that service willremain available even if clusters joinclusters leave Um it's it's dynamic inthat wayuh and we added protocol declarations Sothis is actually really cool So one ofthe things that linkerd uh needs to knowwhen it's routing traffic is it needs toknow what protocol the traffic is And sotypically it does this with somethingcalled protocol detection where it willjust look at the uh bytes that arecoming in on that connection and say heydo these bytes look like HTTP becauseHTTP requests always start a certain wayUm and if they do then okay it's HTTPwe'll treat it as such We'll do HTTPbased routing We'll collect HTTP basedmetrics We'll do all that layer 7 goodstuff Um and if it's not then we'lltreat it as TCP We'll say okay well thisis some other protocol It's not HTTP Sowe're going to instead just gather morebasic L4 type of uh metrics So justbytes and so on Um and this is reallygreat It works very well97% of the time Uh but there are a bunchof edge cases where this this doesn'tAnd if you've ever been if you've everrun into protocol detection timeouts andseen weirdness as a result of that youknow what I'm talking about Um and soone of the ways that people have workedaround this issue is we have thisfeature called opaque ports And sothere's a configuration where you cansay you know this port here uh justtreat it as opaque Don't try and doprotocol detection on it Don't worryabout what protocol it is Just justforward it through Just treat it as TCPDon't worry about it Don't don't do allthat stuff Um and that works reallygreat But until now we didn't have anyway to do the opposite of that We didn'thave a way to mark a port as hey this isHTTP Don't try and detect it Justbelieve me Um I'm telling you Uh and andso please do all that layer 7 stuff forthis connection It's HTTP Um don't dothe detection Um so that's what we'veadded here We've added protocoldeclarations These are uh they make useof the app protocol field in the serviceresource So for any Kubernetes servicethere is a app protocol field and nowyou can specify in that field what theprotocol is and if you specify HTTP oropaque then we will not do protocoldetection We'll just use whatever's inthat field Uh and so this is especiallyuseful for uh workloads that run rightat the edge of your resource limitsbecause uh when resources are veryconstrained is kind of one of the commoncases that protocol detection can failUh we also added GitOps compatiblemulticluster linking So if you've usedthe linkd multicluster the way that itworks is that step one is you installthe linkerd multicluster extension andthen step two is you have to linktogether all the clusters that you wantto be able to talk to each other Um andthis is very straightforward but itinvolves running this command linkerdmulticluster link And so you have toestablish all these links if you havethree different clusters that all talkto each other that's you know one linkin each direction between all thedifferent clustersum and you know if you're just doingthis it works great it's fine but if youwant a git ops compatible setup then youwant to somehow specify thatdeclaratively you don't want to have torun the install and then this separateuh linkingstep and um it's kind of even worse thanthat because when you upgrade linkerdyou also have to upgrade the linkerymulticluster extension and then you alsohave to upgrade each one of these linksindividually So we've rearranged the waythat this is structured so that thoselinks are now part of the linkermulticluster chart itself Uh so you haveto specify those links as values in thelinkery multicluster Helm chart Um butnow that's all kind of uh part of thatchart and so when that chart getsupgraded all of the links get upgradeduh it's much more git ops friendly Soit's going to work with with whatevergit ops workflow you have Um so we wethink about this as being declarativeinstead of instead of imperative Soinstead of doing the second step oflinking that's just all specified in thevalues Um and we've done a lot of workon the gateway API to bring it up to amodern version So previously we were umreading fairly old versions of thegateway API resources Uh I think we wereusing like the v0.7 release of ofgateway API Um and whatever resource uhversions were in that We've upgraded tonow read the v1 versions of HTTP routeand gRPC route Um so that's compatiblewith a wider range of more modernversions of the gateway API So if youhave anywhere between version one andversion 121 of gateway API that's nowgoing to be compatible with linkerty Umand we're also compatible with both thestandard and the experimental channelsThe experimental channels have TCP routeand TLS route uh in them So if you wantto make use of those types you need touse the experimental channel Otherwiseyou can use standard Um and we're alsokind of shifting our philosophy on howthe gateway API is installed Sopreviously uh whenever you installlinkerty by default it would come withthe gateway API CRDs we would bringthose along uh and install them for youUm and increasingly we've heard feedbackthat that's not what people want becausethey either already have these CRDsinstalled on their cluster or they'rebeing provided by some other project bytheir by their ingress or by theirgateway or by something else Um and sowe really are kind of moving to thisworld where the gateway APIs are a uhprerequisite they're something thatalready exists on the cluster and and wecan just interact with them and and wedon't have different projects kind offighting over who should be installingthese and and making sure they're thereSo we're kind of going through thistransition period right now where wewant those to be provided already We'rejust going to kind of assume they existWe'll do some checks to make sure andwarn you if they're not Um but we wantto treat those as a as an externaldependency Um and we can like kind ofhelp you through that process and andand give some helpful error messagesthat say "Hey we detected the gatewayAPI resources that we need are notavailable Here's how you install themHere's where you get them from." Um andso by the next version of Linkerty whichis Linkerty 219 we're no longer going tobe installing those by default So wewant to kind of roll this out in a waythat is safe and is not going to breakanyone on upgrade Um but that's kind ofwhat we're moving towardsUm and then just kind of talking aboutthe Linkerty project as a whole on a ona somewhat more meta level Um what'sreally exciting is that Linkerty is nowwe say is a sustainable project SoLinkerty is developed primarily by onecompany which is Buoyant That is whosigns my paychecks and funds all of mywork on Lingard Uh and Buoyant is now asof fairly recently a profitable companyUm and so I think that gives the projecta lot more confidence in itssustainability and its longevity uhbecause there's a sustainable companyfunding it Um and I think we've alreadyseen like dividends on that We've seenthe pace of execution on liquidityreally really increase as a result Buyit has had more resources in order tohire maintainers and increase projectoutput Um and so it's really great tohave uh a project that has dollars andcents behind it Um you know linkert hasbeen around for almost 10 years I thinkthis was uh William Morgan our CEO addedthis line We're aiming for another 90 Umso yeah I mean I think it's it's uh aproject that we want people to haveconfidence is going to be around It'snot going to go away tomorrow when uhall of the funding goes away or all ofthe maintainers don't have a job anymoreYou know that's not a situation we're inand I'm really grateful for thatUm Linkerd does weekly releases So ifyou've uh ever seen this kind of formatedge-25-.4.1 this is our uh releasenaming format for our weekly edges Uhthat means the 25th year the fourthmonth and the first release of thatmonth This would be the first release inApril2025 Um when we release these edgesthey're released off of Maine So all ofthe latest bug fixes all of the latestfeatures all of the latest securitypatches everything that's uh kind of onmain at that point in time is isreleased in those edges Uh and the ideais that every edge release is intendedto be production ready These are notkind of experimental releases or or oror beta or you know unstable Umeverything that gets merged to main issupposed to be of production levelquality So this is all code that hasbeen battle tested and and we'reconfident in we stand behind Um we alsokind of bundle up features into thesereleases So we have these 2 uh whateverreleases 2.18 for example that kind ofrepresents a finished feature or notfinished but a cohesive uh feature andwe kind of iterate on these as you sawin uh the examples Sometimes we'llrelease a feature in one release andkind of iterate and improve it in thenext Um but each of these is meant to besomewhat self-contained and cohesive Umand that's also kind of the the timewhen we take a pause on on future workin order to document what we have andmake sure that you know it's usable andthat it uh is documented and and um andpeople know how to how to consume it Umso every version has a correspondingedge release and also has a git tag Sothere's uh kind of a record through thegit history of of where all the releasesare and and what edges they correspondto Okay Okay So what's coming next forlinking that's that's where we are nowThat's all the million things we did inthe last year Um and like there'sthere's so much on the horizon Uh andthere's so much work to do Um so uh oneof the biggest things on our road map isto to continue to flesh out Windowssupport So right now uh this has been inprogress for a little while The linkproxy uh runs on Windows Uh that tooksome doing Um but of course the proxy isnot the entire service mesh There's moreto the service mesh than just the proxySo there's still more work to do to makesure that we can say confidently linkerdthe service mesh will run in Windowsenvironments Um so that's in progress Uhwe uh want linkerd to do more uh withthe ingress use case So right now foringress what a lot of people do isthey'll have some kind of ingresscontroller like engine x or somethingelse Um and then they'll also have uhthat is a meshed workload So they'llalso have a linkery proxy runningalongside it and so incoming connectionsinto that ingress get you know have TLSterminated by engine X or whatever andthen uh for the east west traffic goingout of there uh linkerty kind of takesover and does like workload based MTLSfrom there on out and so the question iswell why do you need two proxies therethat one hands off to the other whycan't linkerty do everything um somoving more into that ingress use caseis definitely something on our road mapthat we would like toexplore Uh we want to do egress TLSorigination So whenever you have uh aworkload that's talking to some kind ofexternal egress um TLS uh identity sosay you're talking to you know get httpsuh github.com or whatever Um that is TLSthat's originated by your applicationAnd so when the linker proxy interceptsthat connection it's encrypted right andit should be because that's how DLSworks Um which means we can't read thespecifics of those those bytes We don'tknow what's going on there We can't dolayer 7 um metrics collection layer 7load balancing or or anything reallyintelligent with that traffic Um so wekind of want the ability to say well ifyou just send a plain unencryptedrequest to us we will originate the TLSto to that external uh host And then wekind of because we can see thoserequests we can see that traffic we cando all that layer 7 load balancing layer7 metrics layer 7 um routing all of thatgood stuff Um and then but it's stillyou know encrypted to to thedestination Um we have a feature calledmesh expansion um which I didn't talkabout because it was uh not added thisyear uh which allows you to haveworkloads that run outside of Kubernetesbe part of the mesh Um that's a veryexciting piece of work that we'recontinuing to improve and iterate on Umwe want to add support for privatenetworks to that so that if you've gotworkloads that are running on privatenetworks outside of Kubernetes those canbe part of the meshtoo Um there's still more to do on linkmanagement I think you know in in 218we're doing a lot to make that moreGitOps friendly There's still more to doto make that even easier to work withUm we want to do more with using Spiffyfor identity inside the cluster So ifyou're using mesh expansion uh Spiffy isthe identity provider we use for thoseworkloads that are running outside ofKubernetes Uh but we also want toexplore if we can use Spiffy foridentity inside the cluster as well andkind of bring bring thosetogether Um and then to kind of uh addmore flexibility and configuration tothe way federated services work Sofederated services are very powerful butthey're also a little bit rigid in uhthe way that they operate Every servicethat joins the federated service has tohave the same name and namespace inorder to be merged together They uh whenthey do that the service that's createdthe federated service that's created asa result of that is called that servicename-federated Um and that's kind ofhard-coded uh name So we want to add alittle bit more flexibility there so youcan kind of control how that works andand make sure that it suits your usecase Um so that's kind of the the thingsthat are immediately on our road map butof course you know that road map is verydynamic It's always in flux and it'salways uh adapted to what users aretelling us they want So this is you knowdefinitely uh a two-way street wherewe're always interested in hearing whatpeople are doing with linking what theywant to do with linking that they can'tUh and that definitely informs andadjusts our road map as time goes on Soif there's anything on this slide oranything at all that is of interest toyou that you really want to come talkabout please come to the link booth cometo the buoyant booth come tell us whatyour use case is uh so we can have thatinformation when we're planning ourroadmap or you know open a ticket on onGitHub come to the link slack uh justget in touch with us anyway you can andand be part of this projectum if you want more uh information aboutuh linkerd we have a service meshacademy which is a self-paced onlinecourse you can get certified in linkarduh there's some really good educationalmaterials there it's a really nicecourse Um you'll definitely learn a lotno matter where you are in your servicematch journey If you go tobuoyant.iosma you can sign upAnd uh I'm happy to take any questionsyou might have right nowYeah there's a mic there If you canrepeat YeahOkay I I heard you so I'll repeat yourquestion The question is is there aperformance advantage to using protocolsorry it's allthanksYeah essentially you know can we shavelike five milliseconds off of everyrequest across the cluster if we enableprotocol declarations or is the thetrade-off with the complexity notworthwhile uh yeah this is a greatquestion I think the answer is it itdepends a little bit So in the in thegood case in the ideal case uh nobecause protocol um detection happensbasically instantly as as we read thosebites we immediately know that what theprotocol is So in the in the golden pathuh there's no difference Where thingsstart to matter is is where things arenot in that golden path And so where uhfor whatever reason those bytes don'tget written to the connectionimmediately We're not able to detectthat connection Um and so linky willactually have a 10-second timeout wherewe wait for bytes so that we know whatthis protocol is And if we don't getanything in those 10 seconds we'll giveup and say "Okay well it's just TCP Iguess." Um and so when we're in thosecases we actually incur a 10-seconddelay on connection establishment Um andso that that delay is what we can avoidby just saying "Hey don't try and dodetection on this Just believe me it'syou know it's this protocol." Got itOkayThanks ThanksThank you Great talk I have a questionrelated to multicluster and federatedservices So in our case uh we would haveup to hundreds of devices and eachdevice is a m is a cluster and how youensure that this functionality isbackward compatible because if wecombine all of these clusters and havewe cannot update them all at once YeahSo they have different versions oflinkadia and um that means thismulticluster still has to work Yeahthat's that's a great question So andit's something that we always thinkabout when we're adding features uh thatexist in one version and don't exist inin another is kind of what is theupgrade story between those and inparticular like what happens when someof those clusters have been upgraded andsome have not SorryUm so so yes that's something that'sthat's always on our mind We know thatuh people will be running in these umtransitory states where where part someclusters are up and some are not andthey have to be able to continue to worktogetherAnyoneelse all right Thank you very much2025-04-15 21:58:23.826863 SIG offfolks went the long road down theKubernetes change control process inorder to get that removed and now it'snot there to poke anybody anymorewe also maintain some security relatedtools and processes used by the projectoverall like we maintain the officialCVE feed which is a a web page and anRSS feed that you can subscribe to andsee the newest results and we have a subproject that uh runs and coordinatesthirdpartyaudits since the beginning we have takenthe approach that no one person ororganization can adopt a position ofauthority and demand or dictate securityon behalf of the organization and thiscomes from the idea that Kubernetes orany sort of community is large and it iswhat the members make it so if you thinkabout contributing some code or somedocumentation to Kubernetes everyone whoworks on Kubernetes has the ability tomake Kubernetesinsecure anyone can write a bug anyonecan write bad advice on a blog postanyone can write a design that hasfailed to take adversarial nature intoaccount and if it's the case that we allhave the power to make Kubernetesinsecure then it seems obvious to methat also we all have the power tocontribute to making it secure and sothat is the approach that we take wegather folks together with variousbackgrounds various levels of experienceand we learn and grow together and wework together with the rest of thecommunity to make these sorts ofimprovementson the subject of learning and growingtogether I'm going to hand it off tonewest subchair here Kayn in order totalk about thefuture woo yeah so um as Tabitha said uhher and Ian have done an amazing job ofsewing the earth and making a reallyripe environment for plants and peopleto flourish um I have spent much time asa well-tended little seedling and now Iget to move out into the garden properwhere I'm still very well tended to behonest we take care of each other buthere I am growing and I'm so happy toget to learn from this incrediblecommunity and get to be a part of it umas as we've gone over our project ismade up of sub projects so you can seethem here represented in their ownlittle plants and microcosms there isthe documentation project the toolingproject and the third party audit andI'm not going to talk about them becausewe have experts that will cover them uhbetter than I ever could and the beesare me Ian and Tabby just buzzing aroundhelping where we can and pollinating theideas bringing things from fields afarfor the team to work on um and it's justgreat it's great you should come join usit's a good spot to be and so what do wehave planned for the future a lot umwe're just going to keep on doing thethings we do well we're going to keep ontrying to be an open and welcome spot inthe Kubernetes community in the opensource community um we're going tocontinue with the sub project maininitiatives again I don't want to gointo any detail um but we have a thirdparty audit that's occurring now we aregoing to continue to work on theKubernetes doc site making sure that uhwe're covering the security essentialsand that we're removing any staleinformation and making it as easy aspossible for users to know how to useKubernetes securely uh we're going towork on strengthening relationships withother SIGs we have because we makeourselves so approachable we getapproached and um there are a lot offolks who want help and opinions and wewant to give them and we want to haveconversations we want to make sure thatuh the resources here are available tothe greater Kubernetes community whenmaking security decisions or justwanting to hang out um and then finallyuh at our project booth we had beenapproached by a few people who had whitepapers or had found information onlinethat wasn't necessarily reflective ofthe current state of security forKubernetes so this year we're going toreally try and get out there a littlebit more and make sure that not only inthe Kubernetes sphere but in the greaterinternet sphere um we're trying toremove a little bit of that mess and oneof one such opportunity is going to berevamping the OAS Kubernetes top 10 soif that is appealin g to you which itshould be because that's cool um pleasecome join our meeting we have one notnext week but the week after um and atthe end of the do at the end of thepresentation we'll share some links soyou can get on our mailing list umthat's that's some of it and we're goingto play it by ear whatever comes upwe're going to tackle it if you haveideas please come talk to us talk to usnow talk to us later find us on theinternet um we'd love to we'd love tohear what you think we shoulddo good clicker still works so uh I'mIan still hasn't changed i'm four-letterIan in SIG security to save for aconflict with the wonderful Ian ColdWater who is hiding from the stage um Ialong with Ray Laano uh help run thethird party audit sub project soeverybody loves an audit i'm sure we'veall been involved in many and had greatfun with all of them uh what we're doingas a project is trying to make sure wehave external parties perform adedicated security review of theKubernetes project and thecodebase we as much as we can thinkadversarially and defensively absolutelycan't review everything that'sintroduced into the project it'smultiple millions of lines of Go andvarious other programming languages forall the extra projects out there so weuse this project as a chance to getdedicated security time from externalvendors to come in and actually providesome effort and I'm trying to see myspeaker notes to see if there's anythingelse I meant to say so I'm going toawkwardly move over here and it'll belike we planned it so yeah we're tryingto get extra eyes on the codebase justto help us find any extra securityissues out therewe do have a hacker one program as wellthis is a separate piece of work so ifanyone here is sitting on an ode inKubernetes and has found some mad hacksthat let you mine Monero in every singlecluster first of all please let me knowbecause that sounds fun but before youdo that not first of all please emailsecurity@kubernetes.io this is aseparate piece of work to the ongoingbug bounty programs and securityresponsibledisclosure as a little bit of historywe've run two third party audits in thepast the first one was in 2018 with atread and trail of bits if anyone reallywants to go and read that report now theuh issue for the findings there is 81146on the Kubernetes repo if anyone reallywants to memorize that number um therewas also another review in 2021 and inthe spirit of being a flowering seed whocan't do the flowery language that Kalindid I was actually on the delivery teamfor that engagement and have now foundmyself on the vendor side for thecurrent audit so NCC Group did an auditin 2021 uh again the findings for thatare on GitHub under 118980 i thoughtthat was worth telling everyone so ifanyone wants to go and look at thosefindings please do one of our ongoingpieces of work as well as running a newaudit is going through the findings ofthe previous ones we have a number offindings that that pesky autoclosingissue robot has marked as stale so thereis an ongoing bit of work to make surethat all of those issues are addressed anumber of them are marked as closed andfixed so security improvement hasdemonstrabably happened a number of themhas been closed as we won't fix thisthis is intentional by design so anotherthing we do as a SIG is make sure thatanything that is by design but we thinkis a pointy edge that we don't wantpeople to cut themselves on is make surethat these are documented elsewhere inthe Kubernetes documentation aswell and we also have a number offindings which have been marked as theyneed a KE so anyone here who's notfamiliar with a kept Kubernetesenhancement proposal uh we have a numberof findings which do need somesignificant effort to actually make afix if anyone wants to try and getinvolved in the Kubernetes codebase butwants a clear steer on where they canstart we have a number of known securityissues which are requiring somesignificant effort to avoid breakingchanges in Kubernetes that aren'tterrible if they were terrible scarysecurity things they would have beenfixed already but we have some that needa little! bit of effort so if you want toget started uh speak to me after we'remore than happy to point you in therightdirection and for the future plans whichhave become the current plans due towonderful scheduling we are currentlyrunning the 2025 Kubernetes audit ifanyone wants to read the backstory ofthis there is a folder in the SIGsecurity GitHub repo that documents theRFP we issued instead of reviewing theentire codebase which I can tell youfrom experience is not much fun we'reworking through different projects fromvarious different uh SIGs so weapproached the SIGs at the tail end oflast year and said does anybody have acomponent you would like to be reviewedcompiled all of those and put a bit outto vendors we are working with the open-source technology improvement fund thisyear they recommended a vendor well theyrecommended a few vendors we selectedone and the project is ongoing so thechosen vendor shielder are currentlyreviewing quite a lot of the subprojects i can't remember the whole listbut it is documented um and they arestarting to report findings so we areworking with them directly every coupleof weeks we get an update we're happy tosay there are a couple of findingscoming out of that which are goingthrough responsible disclosure and wewill obviously make sure they are fixedbefore they go public uh but yes we arestarting to get some findings out ofthis audit and we are hoping that at thetail end of this year we should have areport that can be made publicand with that I will hand overawesome uh so I'm going to talk to you alittle bit about SIG security docs uhwhat do we do well it kind of says onthe slide we do a couple of things theone is we try and work across SIGs toimprove the security content of theKubernetes website obviously when peopleare configuring or managing clustersthey're going to go to the website as aprimary resource so it's reallyimportant that we have good securityinformation there we also do someadditional work where we have somethinglike a threat model uh which wouldn'tnecessarily fit as part of docs and wewant to actually have that written aswell so we have some white papersavailable inside our GitHub repositorybut in general what we try and do isimprove the overall documentation ofKubernetes security and what I wouldlike to do is convince you that you toowould like to be involved in documentingthe security of Kubernetes because Ithink there's a number of really goodreasons genuinely think this is a reallygood thing to do so why would you do ityou will understand more about theproject i write quite a lot aboutKubernetes and container security andwhat I find is the act of writingsomething down the act of working outwhat you need to say will lead you tobetter understand something whenever I'mwriting a blog I often find outsomething where I had made an assumptionabout how something worked or howsomething was operating and I'm wrong soI start writing it down documenting ohthat's actually incorrect you improveyour learning a good example of this asan opportunity at the moment is we havegot a longunning project to try anddevelop a hardening guide for Kubernetesand we've split that into a number ofsections that focus on different aspectsof the projecta good example the scheduleuler one hasjust finished up and Anaman has beenhelping us out with that is um delvinghe delved right into the scheduleulerand understood more about what thesecurity implications of differentparameters and different features of theKubernetes scheduleuler are and Ilearned a lot just by reading andreviewing it and trying to help outgetting the thing ready for the websiteand I'm sure he learned a lot inactually designing it so if you getinvolved in hardening guide as a goodexample you will learn things aboutspecific aspects of Kubernetes butthere's more um oh almost too much morewe missed the good bit you will learnthings cool things about Kubernetes ipromise you you will learn things coolthings about Kubernetes by involvingyourself with security docs we haveanother piece of work we've just starteddoing which is looking at the we"bsiteand looking all of the open issues thatrelate to security and seeing if there'sany we can contribute to getting thoseissues fixed and closed and in doing soI learned things about Kubernetes Inever knew because these are quite kindof in the depths areas sometimes so onethings I learned was the Kubernetes APIserver has a self-signed certificatethat it uses only for loop back calls uhand it has a 12-month lifespan if yourAPI server is up for more than 12 monthsit will crash and it will crash becausethat certificate becomes invalid and itis not reissued until the API serverrestarts so if you ever try and leave aKubernetes cluster running for more than12 months that's what's going to happenyour API server will crash and now Iknow why and now so do you know why ifyou read these issues you too will findthings you never knew about Kubernetes ipromise um so that's a great way reasonto get involved in docs right you willlearn things cool things that are maybethat one's only good for trivia um butalso if you ever have a cluster crashingcrashing after 12 months you can makeyourself look so clever by saying I knowwhat happened it is the internalcertificate you also get to work withother parts of the projects that's verytrue as well what because I think wementioned it earlier on is that we arewe don't own the code as a SIG so whatwe do is we have to work with the SIGswho do own the code if you involveyourself with writing docs you will workwith other SIGs and you will meet othercool people who are involved in theproject and who will help you learnabout the things you need to know towrite those documentation you'll workwith SIG docs because SIG docs help usget things into the state and um stylethat is needed for the website you willwork with other SIGs like SIG O sig nodewho own the code that we are writingdocs about so it's a great opportunityto learn about and meet other people aswell and with that I shall hand you onto Mahi to talk about toolsthank you Rory so now I will talk aboutSIG security tooling um so here are thethe few goals we have uh basically wetry to build and improve the security ofKinesis by writing code and workingacross SIGs um and I I think we havebeen able to create like a very nicespace for new contributors in the pastto share and learn so uh basically thisSIG is organized like this we meet everyother Friday and um on one side you canpropose like new learning sessions ontoolings and work related to securitythat you've been working on or you canjust join the working sessions where wetry to actually make progress on theissues so uh I wanted to mention some ofthe work that has been done uh we havebeen running the sneak uh scanner on therelease image of Kubernetes and latelythere have been like a little bit ofcross SIG collaboration to move uh thescripts from the test infra to the SIsecurity repository and and move um thejobs to less trusted clusters because weare like six security we want to enforcethat on us as wellum another one is the official CV feedso Tabby uh talked about that justbefore uh we created this uh autorefreshing list of CV feed it'savailable on the website you can seethere's like a dedicated web page andyou can see we have like this JSON feedand RSS feed that you can consume uh forall your needs um oh sorry uhwhoa yeah yeah yeah allright please all right it's good um sothe the initial list of uh task weneeded to do to make the uh CV um uhfeed ga is kind of finished we onlymostly have to do the last one now whichis like try to update the CV feed nearreal time so the the only issue so faris that the CV feed can be a bit late inthe worst case it can be like 12 hourslate if the website is not rebuilt bysome PR that has been merged uh in thewebsite repository so yeah the idea isjust to use like a web hook to try torebuild the websites but um anyway thething is that we wanted to make this uhfeed G and uh recently we've beentalking to some people about the waythey use the CV feed and now we want toinclude even more uh task in thisproject but uh again can you go back ohit's okay um so if you want to join thiseffort uh we've been trying to writesome documentation about how this thingwork uh it's a fairly trivial piece ofdocumentation it's like 50 lines so ifyou want to get involved like now youcan check this PR and and review it andand and say if it makes sense or ornot um yeah I wanted to talk about thatas well we we have uh used this we havelike started this initiative like awhile ago trying to run the go checkproject on Kubernetes it's been uhupdated like recently but so far wedon't do anything with the results so ifanybody's interested about joining anduh and helping us on on using this uhscanning results that that would be niceum and I discovered very recentlyactually that uh some people at Aquawere like creating this new projectwhich is called the CV feed OSV so theywere like consuming the CV feed I justpresented before uh for their securityuh scanner project but they neededactually this OSV format so um likelately the initiative has been mostly totry to move on to this new format forthe official CV feed try to merge theefforts and maybe kindly ask to thecubernetes SRC to issue the initial uhCV as an OSV format because like rightnow we consumed the CVS from the KKrepository issues uh yeah issues and uhthat's like in a very like free form sothat would be nice to have likesomething nicely formated that we canconsume distribute uh along the JSON andRSSfeed so with that toKen so if you're dying to get involvedwhich I assume you are because we justtalked about a whole bunch of reallyamazing things and great opportunitiesum there's some information up here onthe slides uh the darker blue one thatsays SIG security above it is a link toour GitHub repo that has our readmewhich will get you into the mailing listand you can read about any of thewonderful things that we've talked abouttoday the one on the right will get youinto Kubernetes Slack and you can findus at SIG Security and I mean if youcome to these talks we're all allbegging you to join us but just for anexample um my first ever KubeCon wasValencia and I attended this talk with asome of these still here some of themnot gone on to new things um and I feltlike I had seen my people and I had seena spot where I could just exist safelyin the community um and so I decidedactually with an audience member Dannythat we were going to try and go fromKubeCon attendee to maintainer in uhbetween KubeCons basically we appliedfor a talk the talk got rejected but wedid the work we went to the meetings weshowed up we became known and uh nowhere today I'm a co-chair which is wildpeople ask me "How did you become acoach?" I was like "Well sir they'regoing to let you grow they're going tofoster you they're going to fertilizeyou are going to help you becomewhatever you want to be if you don'twant to be up here you don't have to beyou can do all sorts of stuff so pleasejoin us um we'd love to have you we areWe don't have any requirements at thedoor except be kind and welcoming um aswe will be to you um and we'll help youout and then finally um we have sometime for questions comments concerns youcan ask us now or if you'd like to youcan um come find us later or find us ontheinternet ohmike is runningi'm a brand new person who doesn'treally have a lot of experience insecurity and I find the idea ofcontributing to SIG security to bereally intimidating what would you sayto somebody like mei would say within SIG security we havea fair number of folks with a lot ofexperience but not a lot of time andwith kindness in their hearts and sosomebody like you who would like tobecome involved would like to make somecontribution but is afraid of itwe are ready for you because if youwould like to try something you can havethe expertise of the whole group backingyou up to help you to make sure that youare on the right track to provide youwith encouragement and so I would sayplease come we would we would love tomeetyouany more questionswell I think then we've accomplishedwhat we all came here for thank you somuch for coming and hopefully we willsee you on Slack[Applause]2025-04-15 21:58:24.471153 ��l#��qAMbk6FY_9FKMwelcome everyone Um welcome to oursession the immediate and lastingbenefits of tag security assessments I'mBrian Keller I'm an open sourcemaintainer focusing on air gapKubernetes and a tech lead for tagsecurity Good morning everyone Uh I hopeyou have a great CubeCon here I am Benum maintainer of the cubecaping projectwhich is an incubating CNCF project andalso CTO at Armo which is a cloudsecuritystartup All right So I get to stand uphere and talk to you about security$��ok#��A0p-sZT0LWOghey good morning thank you all formaking the hike up to this verydifficult to find place in order to joinus for the six security maintainer tracktalk succession planting for a floweringfuture um I'm Tabitha Sable i'm one ofthe co-chairs and I'm really glad that Iget to help us make this space forourselves and each other together andgive the uh mic here to Kayn to do therest hi I'm Kayn i am also a co-chair ofKubernetes SIG security i'm the newestfreshest uh greenest SIG securityco-chair and during my day job I dosecurity things at Ozero by Octahello I'm Ian Smart i'm a consultant atAmberwolf and I'm one of the co- projectleads for SIG uh third partyaudit hey all uh I'm Rory i do securitystuff at Data Dog and I am one of theco-leads of SIG security docsand hello I'm I work at is at Cisco i'ma software engineer and I will berepresenting six security tooling todayso what really is SIG security um ingeneral we are a group within Kubernetesthat takes a community-based approach inorder to improve security for theKubernetes user base and for the projectitself but like specifically what doesthat mean you know it means that weprovide a place where folks within thecommunity and within the project cancome together to share their interestand their concerns for how to how tomaintain and improve security and thenwe can organize co-working with thevarious SIGs within Kubernetes in orderto make improvements in those areas thatthey maintain so like forexample Mah recently led the work toremove the the security context denyadmission controller that was in the uhin in KKK for a very long time and itprovided some bit of a attractivenuisance because if you turned it on youcouldn't run a modern cluster but it wasthere and it was nominally a securityfeature therefore it was referenced inuh compliance standards things like thatand so a lot of folks had to write a lotof paperwork justifying why they weren'tusing this outdated thing that was stillin Kubernetes and so you know Mah alongwith several other SIG security folksand% Uh Ican imagine what everyone thinks whenthat happens is great Somebody's goingto talk at me about security Uh verymuch not the intent very much want thisto be more of a like how do we improvethe landscape how do we um providesupport and Oh yeah thatoffYeah Oh no lost itUm and most importantly kind of likewhat are the things that we want tobegin to focus on as far as what thatsupport looks like and what are thepaths forward for end users and think ofconsumers of everything on the landscapeand also producers projects uh havevarying levels of security personnel uhto support their project and it could beanywhere from the project is meant toprovide security and so there aresecurity personnel that help with thearchitecture and the security and or aproject that is not security related andhas no maintainers that can provide forkind of that security relatedperspective Um and so uh the technicaladvisor group for security exists withinthe TOC under the TOC uh within the CNCFand really what we provide is kind of avariety of different functions if youwill um most importantly advising kindof in the name uh advising standardprocesses So for things like thegraduation process if you're moving fromsandbox to incubating incubating tograduation there are things that we wantto look at There are things that the TOCneeds for those who have been on aproject going through that process andyou know what is what is the baselinewhat are the expectations um we try tofigure out what that baseline shouldlook like and then clearly document themUm assessments is something we're goingto talk about today uh but very much uhfocusing on the self- assessment processthe joint assessment process and whatthat looks like and we'll dig in more tothat and then research for those who arearen't aware um I think across the boardfrom end user and and enterprisesadopting cloudnative technologies to uhyou know other uh related efforts ofsupply chain security uh white papersthat kind of apply broadly to projects uwhether they be open source or closedsource Um so what we're going to talkabout with the assessments piece isthere's kind of two layers for tagsecurity that we focus on One is theself assessment and it u there's a linkthere for those who have not seen thetag security repository u where we keepthe assessment templates but theassessment the self- assessment templateis meant to be that self-service layerWe want to enable um the group to uh theproject if you will or the consumerWe'll get that in a sec Um to basicallyhave a a baseline What are theexpectations when the TOC or anybody uhcomes to the projects and says are yousecure that's such a a broad spectrumright it's notfair andor prove your security uh thosethings like how do we start to qualifythem in a way that can be managed withsome expectations And so the selfassessment is meant to kind of establisha minimum baseline Do you have XYZ umwhat things can you be proactive aboutdoing that will make it easier to sayhere is how I've met this criteria Umand you can be up like front leaninginto that You can produce it You canhave contributors uh work on this umalongside you in your projects u or ifyou are a consumer of that project youmight be looking for this sort ofdocumentation to exist and so wecontinue to cater and evolve that andthen jumping over to the jointassessment u particularly this is forprojects that are moving from incubatingto graduated and this is a more of acomprehensive collaborative experienceWe get a group of people um contributorsvolunteers from tag security to workwith the project handinand dive into a back and forth of likehelp us understand the architectureMaybe there are patterns we've seenbefore that have led to security We havea list of compromises sitting in therepository of things that have beenexperienced in the past that have led tovulnerability or susceptibility of somecompromise and we've learned from themand how can we continue to try and spotthose patterns in the wild as they existum and inform projects of things thatthey probably need to meet at a minimumuh before moving from i&ncubating tograduated And that is something that theTOC expects uh particularly fromsecurity relatedprojects And so we talked about the whatthe assessments themselves The why is ummore of a a different layer right we wewhy do we conduct these assessments andand what's the value of them is kind ofdown to one lowering barriers uh toaccessibility and more diversity of umrole and thought within contribution Uhwe'll talk about towards the end of whatthat what in getting involved looks likeBut um how do we how do we bringsecurity personnel uh how do we bringothers who are well-versed in securityand architectures and decision-m intothe process to really try and enhancesecurity um how do we scale securityacross the landscape uh if we thinkabout everybody every single project inthe landscape doing security bythemselves it's going to be ch it'sgoing to be changing it's going to bewhat we mentioned earlier about thenuance of different projects havingdifferent layers of security uhpersonnel to kind of assist with thoseefforts and that doesn't always scalepeople don't always scale in thatprocess so how do we lift the baselinehow we how do we lift all ships in orderto you know kind of pro provide the bestreturn return on investment with regardsto time spent right maintainers ofprojects are already busy people Uh endusers consuming projects for variousreasons are are already busy Um so howdo we cut down on that time and providemore clear concise expectations of wherethings are today and also kind ofdocument that asa addestation if you will of hey this isthis is where we are These are theconsiderations we've made These are alsothe constraints of the security of thisproject Um which I think can beimportant for outlining um for everyoneinvolved of why is this decision madeWhy does this project require certainprivileges uh and and annotating that upfront as opposed to finding out later onwhich could you know kind of derail someof the uh I'd say adoption processSoum I'm here in this talk on behalf ofCNCF project Cubscape as I told youbeforeUm and I'm going to talk tell you likeour story throughout the self assessmentwe did So those uh if you haven't heardCubcape went into incubationuh publicly as per uh this cubeconuh we've been working on this for sometime but even before um so cubecape is asecurity project I will tell you in aminuteum more about the project what we aredoing uh but as a security project wealways had um you know as people whohave been around in different you knowuh uh tasks around uh CNCF we've been intouch with tech security uh more or lessfrom the beginning of the projectumnow a few words about what cube cubecapedoes so cubescape started as a CLI tooluh assessing uh the security um andcompliance of uh of Kubernetes clustersUm checking the configurations the MLfiles the CLI turned into not justscanning clusters but also scanning uhum scanning YAML files Helm charts ummade with the ability to hook up in yourin your CI/CD and so on And the projectthen evolved into a full-fledged uhKubernetes open source Kubernetessecurity platform So as of todayCubscape is covering a lot of lot offeatures that someone who has aKubernetes cluster and wants to have anopen source solution covering uh uh umits security from many perspectives uhhave a one-stop shop and you can installCubescape as an operatorum in the cluster which we'll come backa little bit later because it isimportant part in our story Um it doesvulnerability scanning uh as I saidbefore uh configuration scanning uh it'sable to sub do like uh policies uharound the Kubernetes cluster and in thepast year or so we've added capabilitiesto detect uh uh runtime uh incidentsdetect uh um effective attacks on aKubernetes cluster and not justum postureissues Um so it became to somewhat acomplex project because since we aresupporting a lot of features and a lotof components uh relatively uh um ayoung project uh we've faced like a lotof questions around our own securitySo we've entered uh we contributed theproject into the sandbox phase in uh2022 if I recall Um during uh thesandboxing p'rocess we had to go to uh uhuh the tech security as uh I would saythe security experts of the TOC uh uh toget an uh opinion about whether thisproject worth uh um you know to be partof the CNCF uh family and we getapproved to sandboxing and then you knowas the adoption grow the feature grownit was like really clear for us that aspart of the journey we are going to moveinto uh into incubation and be afull-fledged CNCF project Um which tookus some uh took us took us some time Uhwe got actually we got a broad adoptionaround the project uh that wasn't reallyan issue like but on the other handsometimes you know when you're a projectand you have like early success with abroad adoption sometimes you are likelacking many things uh documentation uhuh uh security or left leaving behindmany things which are which areimportant as for you as a projectqualitySo as part of the incubation processwhere we're required to do many manythings um one of the things was the selfassessment and um well you know I myselfcoming from the security industry formore time than I care to admit uh umwhich is like something around 20 yearsuh and um and you know most of theproject maintainers are also coming fromthe security industry and we said Oh youknow security assessment is is theeasiest thing to do right like we knowwe know our stuff we don't need likesecurity self assessment why it isinteresting like let's leave it as thelast thing to do in the whole prochecklist and um and we really reallyhad to notice around October uhsomething around last CubeConuh in Utah when uh uh we were told thatlike look guys if by December uh you arenot getting all the things that you haveto do then we won't be able to TOC won'tbe able to assessuh uh tech security won't be able toassess you and then we'll miss the uhthe window where we can go intoincubation because every incubateproject is is published in cubecons Sothen we'll delay the whole incubationprocess by half year So uh you know Imyself and Matias who's anothermaintainer you can meet him at theCubscape kiosk um you know got into avirtual room because he's living in inuh in near Lake Geneva in Switzerlandand France Uh I'm living in Jerusalem Uhwe got into a virtual room with a lot ofcoffee and started to work on the selfassessmentAnd um in general the self assessmentwas um at the worst blink it looked likeI'm sorry to say the very bureaucraticprocess like we have this template wehave this like need to go through andlike just fill in all thesequestionnaires and and create all thesediagrams and and more or less todocument a lot of things and to docu toanswer a lot of uh questions we wethought we know the answer for and theyare like super obvious and why do wethereUm but it made us uh uh you know gothrough a process of like reiteratingeverything that we were like superobvious for us And you know sometimesyou know all these cases when you knowsomething really well and you havesomeone else sitting next to you andstarting to ask questions and they seemlike super uh obvious you know to answerbut it makes you rethink everything andthen it it causes great gives you greatvalue So this is what uh uh uh um whathappened to us So we rethoughtuh uh all the security side of theproject uh despite being ourselves asecurity provider into some extent andit it made us work and it made us dosome goodthings So the one of the example issuesthat was raised during the review that Itold you that cubescape can be installedas an operator in Kubernetes cluster SoCubscape has these multiplemicroservices uh uh doing like differentfeatures uh in the cluster Uh I will dolike just a little uh go through Uh wehave uh uh an operator component whichlike uh manages all the other uhcomponents in the environment We have uhuh the crial cubecape microser which isdoes the configuration scanning throughagainst the API server the the cubulanwhich is the vulner our vulnerabilityscanner micros service which can be uhscaled separately We have our nodeagents u our node agents are in chargeof uh ebpf data collection so enrichingvulnerability and configuration sca(nningand also detecting uh uh runtime uh uhincidents Uh we have another interestingcomponent which is like used withexternal system called the synchronizerwhich is there to uh synchronize objectswe are creating in the cluster toexternal uh servers Um it's a great subproject but uh this is not our subjecttoday but the interesting thing is thatcubescape was built in for kubernetesand kubernetes only and we wanted to fiteverything we do into the kubernetesecosystem This means that every objectuh uh which we create every APIcommunication we do we wanted to do itthrough the API server So any kind ofexistingsystem that uh uh that um communicateswith us is able to tap into every datawe produce or every configuration wehave So we wanted to and one of thethings we are creating is image scansand sbombs Um now I don't know if howgood you know sbombs but sbombs are bigobjects like they are they can besometimes from even 50 or 100 megabytesof data Does Kubernetes API server likesobjects at thissizeno Uh so this was a problem for us Theway we solved it is there is a lesserknown uh feature in the in the APIserver called uh aggregated API serverextensions which enables you to createuh uh another API server in theKubernetes cluster only serving specificobjects and the main API server is justonly rerouting requests to that APIserver uh so from the outside user theylook like yet just another API serverobjects but at the end theimplementation is residing in adifferent component is this is what ourstorage is uh uh uh our storagecomponent is actually that now the wayit works that that the API ser betweenthe API server which is actually in thiscase a proxy with our storage for ourstorageum have the same connection rules justlike any other extensionfeature in the API server They havemutual TLS and a and uh certificaterequests and all these kind of securityin order to secure the communicationbetween the API server and thissubservice Now in our case when wedeveloped this it actually took us sometime It was like uh uh it was achallenge uh to implement on this serverwe have like a lot of issues at thebeginning and obviously like solvingMTLS wasn't the most interesting partand therefore we said okay like there isthis option to just like skip TLS uh uhuh in the communication just for for thesake of the development but as up untilthe self assessment uh this flag turnwas turned on so up until we got intothe self assessment uh we forgot thatactually the storage component and theAPI server wasn't talking in TLS andthere was no authentication andencryption in between them which is kindof a problem because the storage itselfcontains all your vulnerability data oryour configuration scans all your sbombsall your security configurations so uhyou know sometimes as security personyou are sorry say bad things about allthe stupid developers but in this verycase we were the stupid developersourselves Soum so we as it turned out during thereview we just found out that we neverturned this on never enabled this So weobviously we uh uh we handled it and wereleased the version uh uh of the Helmchart enabling this and solved theproblem uh as of today So I want to takea few words about you know the value forus So this whole process although likeseems like a burden and like nothing notsomething you would like to have as likean open source project because likeevery in open source project most of thetime the people love to do like newfeatures implement uh implement newstuff and just like go to CubeCon andtalk about it and like you know handgoing through and doing review codereviews and architecture reviews andcreate diagrams is like usually not thebest thing to do in an open sourceproject you would thinkBut in general this whole selfassessment really made us uh um you knowreview what we've done up until now Umand despite like really this wassomething we like to push out for a longtime and didn't want to uh to attack itreally brought us a lot of value of likecreatingdocumentation reviewing the securitypart of our stuff and like making usremind making us you know uh remin)d usto to turn on TLS when we can Uh and andwe have like a few other lessinteresting minor issues we've found butin general it's like you know sometimesyou need to have those things that whenyou get a mirror and just like look intothat and you know going back to theground truth and and the self assessmentwas really a great opportunity to to dothat So really we really thank the allthe tech security people who areinvolved helping us throughout this uhprocess and looking forward for for thenext oneAll right So lasting impacts right uhyou do these assessments hopefully it'snot one and done Hopefully the even theself assessment itself can be somethingthat we want to evolve into some sort ofregular periodic review right um maybethat's annually uh something thatprovides for more confidence for the endconsumers is really what we want how dowe increase adoption I think that enduser confidence goes a really long wayum and in doing so what we're seeinghere is also the the fact that thelandscape is growing extensivelyUm and in doing in looking at that likewhat does that mean well it means thatone there's lots more projects Um buttwo there's lots more projects that arelooking to make their way through thegraduation process And you know if aslike as things continue to evolve wewant that we want that process to havemore defined expectations What thingscan projects be proactive about i Ithink trying been hinting at that now uhahead of time as opposed to kind ofgoing through the process blind you knowand that's not to say anything againstthe process but the process itself hasbeen learning and growing And if we canif we can really start to figure outwhat the baseline expectations are wecan continue to improve upon them butalso projects can proactively um leaninto you know meeting the criteria umconsolidated security review is reallytrying to provide for those expectationsand make it a lot easier for uh you knowchecking checking the boxes that arenecessary I don't like I don't likecheckbox activities for security Uh butnonetheless it it gets a lot of thebaseline criteria out of the way so thatpeople can focus on the more importantelements Are there constraints are thererestrictions are there areas that doneed further development and that's fineif there is part of the architecture ofyour project that is like this is knowninsecure We don't like this but wehaven't figured out a way around it Umthere is no way around it currently Uhand it's a known constraint and here'swhat you do to mitigate uh risk Uh otherconsiderations need to be made there Andso uh those things are always importantThe baselines themselves need to beupdated Right the security assessmenttemplate for self-service should bethings we learn from right it's a mostlyself-service template Let's get it toall the way self-service template sothat we can scale that effort and removeas much the human element as possiblewhere required Maybe it's confusionduring the process If there's a lot ofconfusion and back and forth and reviewthat is getting in the way of gettingthe activity done and getting it donewith confidence then let's improve theprocess and let's continue to improve itUm so that people who are doing it todayhave a a better time but also on thenext periodic review Um let's let's makethat better Uh and so all of that saidone of the other things that I reallywant to continue to hit on and I putthis in my lightning talk earlier thisweek was contributor paths Um if youhave personnel who have the securitymanager role within your GitHuborganization I think that's a great wayto kind of like influence what kind ofactivities they can do that is differentfrom say a maintainer or contributor Umthese are things that we can look at andkind of different roles across anyplatform that you use for managing yourproject that can help influence thebaseline Uh what are their what's theirrole what's their purpose when do theyget involved how do they do things andhow do we get people involved forprojects that don't have those personnelhow dowe advocate for you know recruitingthose people from open source who wantto volunteer their time who may not havethe expertise to dive into a project whohas a let's say even a high level ofmaturity and you know trying to join thecontributor ranks and find something towork on that can be a barrier to entryfor some maybe they don't have theskills for the language set or thearchitecture or the technologiesinvolved But um there is a wide expanseof security personnel who might want toget involved with a project and say heybefore it's even required let's startworking on this um self assessment andstart looking at what things can be donethe security.mmd file the you know whathow do you address a security report oradvisory or vulnerability what do we dowho does that go to all of that requiredkind of criteria IA for u a project andits governance uh I think is reallyimportant and and what what does this dois it enhances security uh it's notgoing to be perfect it doesn't need tobe perfect you know we don't live in aperfect world but uh if we can take thislandscape and enhance security by 1%today then you know hopefully over timethis continues to compound and we figureout okay this is this is known bad thisis something that is hard to do or thisis something where a a project needssupport What do they do when they get aa security report advisory and they'relooking at it and like what do I do withthis i've never experienced this beforeWho do I go to who do I ask who do I getadvice from um that's I think a reallygood place for tag security to reallysupport the landscape Um as well asworking with other projects right otherprojects may have experienced thisbefore and that's probably typicalacross the ecosystem uh to say hey hasanybody else done this because I couldreally use some help Um and so gettinginvolved just real quick uh I thinkthere's a couple there's many layers ofgetting involved The two that we wantedto call out uh of relevance to thisdiscussion is security tag one Um thatlast graph of how things are growingthey're scaling We need to automate whatwe can and we need people to help uswith that automation because that's alandscapewide automation That's not justone single project Uh conducting theassessments is still a human process Atthe end of the day we need people toreview that kind of next bullet pointthere Um need people to be involved withthat process And it is inherently meantto bea we don't we don't care about yourlevel of expertise If you want to comeand join it and observe please sign upfor that That is something that we do WeI I in particular have learned from thatprocess as well of just seeing how otherpeople review um these these assessmentshow they work through a project whatthings they like to understand whatpatterns they've seen before Um andincreasing the baseline Uh so the selfassessment itself is something that wework on on those templates but alsowe're trying to make this more genericeven outside of the CNCF there's the umbaseline now being moved over to theopen SSF to try and make this more of alarger ecosystem And then projects Ithink projects would are not going tosay no to um getting involved withreviewing the security of their projecthelping with the assessments umreviewing things that are necessary Andso with all that said if you areinterested please come by the kiosks Ifyou're interested in CubeCape and youwant to hear more um both of the boothsare pretty close to each other If youwant to hear more about tag security andget involved or you have questions orconcerns or need support please come bythose and ask uh ask questions as wellas come find us up here if you haveanything you want to askThank you2025-04-15 21:58:25.096013+ pay attention to thesevulnerabilities um you know prior tothat people were like it's open sourcewith all eyes bugs are shallow you knowthe whole the statement there um and Ithink this kind of disproved it and atthis point the the race betweenattackers and defenders was largelyabout when something was known and zeroday appropriately disclosed how quicklycould you patch and update your softwarebefore the attackers jumped in right inthe Struts cases and and theHeartbleleed cases we saw the attackersfast following these disclosures so thiswasn't sort of a prezeroday attack itwas immediately following it within daysand weeks at that time um you know acouple years later of course there wasthe Equifax a different Strutsvulnerability this is the first time youstarted to really see these things popup on the nightly news and people reallyasking about it um and then certainly uhyou know 2020 and then 2021 you seethere with log forj those kind ofhappened in quick succession during thepandemic when we were all sitting athome you know baking bread in the statesI don't know what you guys are doingelsewhere well I I I was actually justuh reminiscing that around 2017 withEquifax I was already doing supply chainwork and uh I was in another securityconference and I was like oh supplychain it's going to be really big it'sgoing to be really bad And uh anotherattendee turned to me and he was like"Actually no that's that's a spy storyno nothing will ever happen i thinkBluetooth vulnerabilities is what weshould be concerned about." Uh and wellhere we are here we are uh a questionthat I have for the audience though isby looking at thistimeline would everybody agree that allof these are software supply chainattacks or is there something else goingoninterestingwho says it's a supply chain attackraise your hand all of these are supplychain attacksokay just a handful just who says uhthere's something else on this supplychain sort of like incident uh it'sconvinced that there's somethingelse okay so uh very littleparticipation so something I wanted tosomething I wanted to point out here isuh I believe that there's really twoproblems to the supply chain and I thinkthat sets up the debate uh there aresupply chain attacks uh say solar windsand cuails undisputably somebodyattacked the supply chain to uh causeharm and then there is supply chainvulnerabilities which are closer to ourEquifaxes or log forjs that areindisputably something spread throughthe chain in such a way that it uh madeus manage risk uh in different ways thatalso had incredible impact in thesoftware supply chain and in oursoftware and then everyday basis but Ipause it and I think this is a littlebit too uh what I want to to set off thedebate here or well not a debate a panelconversation is without being able tomake that distinction we're not going tobe able to mitigate both supply chainattacks and supply chain vulnerabilitiesi don't know what what do you guys thinkwell I think sometimes we're also rushedwe're rushed in determining if it is avulnerability or if it's really anattack so it is because of legislationit is because reporting it is all thosesort of things we need to have a rootcause right away so in really doing thatroot cause investigation and taking thelearnings that you get from that um Ithink we're being rushed sometimes toreally determine what to do how tomitigate and then how to follow up onthat yeah I think you're right strictlyspeaking these first three things theyweren't really attacks or at leastcertainly the first two weren't attacksand log forj wasn't attack althoughattackers exploited them after the factthat's really the point log Solar Windsand then later XZ the attempted umattack right um but as you'll as we'lltalk about in a moment you know thereare a significant number of theseattacks but I think you're right topoint out strictly speaking those werefailures of the software supply chain uhvulnerabilities if you will that led upto it and it did show as well themassive impact that it can have howwidely it can be spread how manyorganizations can be impacted and ,howeasily that can be done so impact wasmassivealso curious what kind of USB B is goingin there and what that will do to thepresentation but we just got ourhardware supply chain hacked somehowmaybe it's okay and I think as well itcomes from curiosity so if you reallylook back at the first hacks or thefirst explo exploits whatsoever itactually came from researchers and fromacademics that were interested toexploiting what is possible so the firstincident that we saw comes from ccuriosity and what I find interesting aswell is still nowadays when we look atbehavior or when we look at attacks orwhat we're doing exploiting maliciouscode whatsoever that stems from behaviorwhen we look at mitigation or when welook at our security posture or securitystrategy we always look at techtechnical capabilities or technology wecompletely forget where the tech stemsfrom and thatcurious behavior as it has been sincethe 1980s since the Arponetso uh I'm going to hand it back to youuh I think this sort of you youpreviewed this right there main threatsfacing open source you have the classicvulnerability which is really just anunintentional bug that can in certaincases be exploited you know the thestruts the log forj log forj log forshell as big and profound as that waswas kind of an old school boring kind ofbug um didn't even apply in all cases itwas sort of the intersection of what uhsomedeserialization and um an issue in theJava runtime so you had to have both ofthose which was a very common overlapbut it it wasn't a bug in 100% casesthat's how most vulnerabilities unfoldum but the since around 2017 we've seenthis massive rise of intentionalmalicious components malware um you knowSolar Winds was a very sophisticatedattack um it's a little bit differentthan some of the other ones I'm about totalk about but it it was spec in yourpoint it was an attack right um you knowthe XZ utils where somebody tried to youknow they they actually got commit bitsand publish bits on a very popularcompression algorithm used by basicallyall the Linux distros did actuallysubmit malicious code but it only gotinto what the alpha right it didn'tactually make it into the publishedversions of that so we narrowly avoideda real catastrophe in that instance yeahthere's there's actually another examplethat uh really made me think and I thinkwe we may get there as well luckily uhit also stayed on the nightly build butTensorFlow was backed i don't know ifeverybody uh was aware of that they weretrying to introduce back doors inmachine learning models by compromisingthe software supply chain and thenretraining model uh like infrastructureto actually introduce veryI think we're getting hackedit's that USB yeah it's the USB um yeahvery sophisticated uh vectors that areharder to to assess that's right um myWi-Fi is off by the way so I did notjust just for the record um so so thisone here this is uh something that SodaType has been tracking since 2017 uhproviding data and and solutions to thisnumber is recent as of about a week agoso we've tracked and logged over828,000 intentionally maliciouscomponents these are basically fakecomponents that have no purpose otherthan to attack developers and thedevelopment infrastructure so anotherway to think about this it's a spearfishing attack on your developers comingfrom the supply chain right and uh we'reseeing about 18,000 new packages everysingle week falling into this categorythe majority of them are in npm the nextbiggest about 50% of them are npm thenext biggest by the way is Python ohwhich in the world of AI is very scarythe TensorFlow thing you're talkingabout right so many of these things arethe classic typo squatting type ofattack they publish a component that hasa similar name they fake the downloadsthey fake the stars they fake thecomments and your developers aresearching for a thing and they're like"Oh that looks like the right one." Theydownload it and as soon as it hits theirmachine it executes the code becausethere these npm and Python they have preand post install scripts so they executethe code right away and many of th-eselook like a smash and grab type ofattack they'll steal environmentvariables they'll steal whatever thedeveloper has access to and post itsomewhere in the world you wish itdidn't go to right some of them are moresophisticated trying to drop back doorsand other types of things but a lot ofthem are just playing the numbers gamegrabbing the data they can and hopingthat they can come back and leveragethem and this happens right on thedevelopment machine so if you don't haveinfrastructure to block it you'reunlikely to be able to detect it afterthe fact because most of it uh doesn'teven compile and so the developer wouldprobably fix the mistake find the realcomponent and only then does it enterinto your GitHub actions and your CI/CDso your traditional vulnerability flowis completely blind to this type ofattack uh I really like that you use thethe term spear fishing because I thinkit's also useful for us to understandthis like smashing graph versus likenation state a style supply chain attackin the world of fishing you have fishingas in somebody says they're Bill Gatesand if you reply to them they're goingto give you a scholarship to da da dathat's just fishing right you also havespear fishing when somebody learns thepatterns of their victim they tailor theemail in such a way that uh that it'smore credible they uh probably learn thename of people inside of a company totry to pretend to be somebody inside ofa company and those are a little bitmore devastating but it's because thestakes are higher as well myunderstanding is that for both fishingand spear fishing you take differentstrategies to try to mitigate the riskuh you being a risk analyst I don't knowif I'm lying no no no it is and it is inrisk quantification you have tomake a balanced decision on that and howto mitigate and also in depending whatyou're actually going after you have todetermine if I mitigate in this way whatkind of percentage of the risk does ittake away and you can even do yourbudget quantification in that senseyeah and another interesting part ofthis it's a little bit trouble some whenwe use the word malware in this casecertainly from from my perspectivebecause I talk to people and they'relike listen I have malware defense andso I I'm I'm covered here and it's likeno you really don't because what a lotof these uh implementations look likeit's more like open- source malware itnever existed there is not a fingerprintnecessarily a back door or somethingelse that one of these malware toolswould recognize as like oh that's a badbinary I'm going to block it no it'sliterally custom code that would looklike any other piece of open sourcesoftware that's doing normal things butit happens to be designed for mal intentwhich is why strictly speaking it'smalware but your traditional malwaretools are not going to detect and pickup on it and it and it's unfortunatethat we don't have a a industry standardterm for that you know I tend to call itmalicious open- source malicious uhintentionally malicious packages to tryto differentiate from your traditionalworms and and root kits and things likethat and it gets even more complicatedwhen it's iterated on andbefore before it's been seen it usuallyis iterated on because it isoverdeveloped it is over uh shared umoverused so those iterations make iteven more difficult to see what'sactually going on which component wherehowI I would say that uh and I may havesome uh graphs about this on the slidesomewhere later uh if you take uh allthe supply chain sort of like typossquatted packages back doors and so onso forth and you run offtheshelf antivirus software on it will have anincredibly bad performance and myintuition is that most of the timemalware say it's an email like a fishinguh sort of you get you use the emailtool to send a PowerPoint presentationthat executed some code obviously wrongnow if you think about open sourcemalware it's you use the tool to installcode to execute code that execute itsown code well that's a little bit harderto tell why is it why is it doing whatit's doing and is that part of thecommon golden path. of a piece ofsoftware and how it's distributed rightyeah and the the biggest challenge thatI'm finding is not enough people evenunderstand that this is a vector so whenthis happens they don't know thedevelopers don't realize wait a minutethat's I need to raise my hand and havesomebody investigate what just happenedbecause that was unusual they go oh Igrabbed the wrong component i don't knowwhat's up i found the right one and I'mjust moving on with my job meanwhile thedata has been shipped off and you knowand and and that's the major problemthat I see right now is the lack ofawareness and understanding i I thinkthat goes to both of you's point whichis a supply chain is meant to propagatethe artifacts so if it gets into thesupply chain it's just going to gothrough the pipes and it's just going tobe pushed down uh I I would actuallywonder I don't think I've seen any anysamples of this but I wonder if there'sa multi-stage sort of like supply chainrewriting process where I infect adeveloper machine actually there thereis one that I know of uh on Eclipse uhOctopus i was just going to mention thati don't know why frankly all the badguys in the room close your ears i don'tknow why they haven't done that becausethat happens that was definitely duringa pandemic so that was around 2021 therewas somebody literally did create a wormin Java that leveraged uh Eclipse and Ithink Intelligj and so literally if youpulled down that component that thingwould worm style infect every jar thatthat that developer machine had accessto and it would repeat and it lookedlike it was a proof of concept becauseit didn't do anything other thanreplicate itself i and so it's beenthree or four years now why we haven'tseen that happen in mass effect I don'tknow but there's no way it just wentaway that's why I forgot about itbecause I actually I went in tried tolook through devs.dev and see like whereit was uh infecting and it was like abunch of student uh sort of like jarsthat were like pretty pretty much likestudents were playing with this so itdidn't actually made it up to a majorpiece of software but maybe maybe it ismaybe it's one down like out there thatwe just haven't seen that'seffectively propagating in ways that wecannot understand it is also becausewhat is being done is already highlyeffective so what you see is when thereare more measures to mitigate you seethat attacks become more multi-stage forinstance easyaccess lease time so probably inevolution you will see that coming up ormore and more but it's like saving thegood antibiotics in the background untilyou really need it right now they don'tneed it because they're hammering all ofthese things 18,000 of them out thereinto npm and Python every week and it'shighly effective because they keep doingit only when we get better at defendingagainst it then yeah some of these moreadvanced techniques might pop up rightand I am wondering what we are going tosee when for instance governments basedon all the information that they have onsbombs or all the information that theyget from components libraries in thecyber resilience act I don't know who itis another hacking on the on the audiopartthey tried to do something no but I amwondering if you have that openness oncomponents on libraries etcthere will be governments that will bancertain components or thatwill give a need for um software toto have functionality is replaced i'mwondering what kind of effect that willhave because on the one hand that isagain going to have a large effect onour innovation but on the other hand itis going to force a mechanism where weneed to look closer at what is in our inour software what are those componentsso this uh this stat here is the answerit was mine doing it cut me off um thisthis stat here I think is reallyprofound right so last year in the 10thannual state of the software supplychain report that we publish um we webasically analyzed thousands andthousands of enterprise applications welooked at downloads both from MavenCentral but also from other ecosystemsand we kind of said of all the at thetime 7 million or s/o open sourcecomponents that exist out there how manyare commonly used it was about762,000 so about 10% are typically usedfeels small but also right there's a lotof projects out there that have come andgone and never got traction but aroundthe same time the number of fakecomponents that the bad guys have putout there was equal at that time andit's now as you could see evensignificantly larger so the point beingthere's more noise out there than actualcomponents being used across the entireecosystem and that's really bad um andbecause this number is growing seriouslyfaster right so probably next year itmight be double exactly and this is theknown number that's the known number theknown number so imagine there's a lotunknown still that's right that's rightand and the the deeper moresophisticated ones like the octopus likethe XZ those are much harder to figureout it's hard to know that they're notout out there and we just haven'ttripped over them yetokay so this touches on what I was uhtalking about earlier which is uh Itried to investigate can we useofftheshelf sort of like well-known ITtools that we use to manage other typesof risks uh onyour right on your right left on yourleft uh this is an ROC curve i'm notgoing to explain the the sort of likethe science behind it but really youwould want that line on a good uhdetector to be touching on the rightmostand the topmost axis of the of the plotthat's a good detector say like afingerprint sensor would be behavinglike that in this case it's almost likea diagonal so it's almost like a malwarescanner on open source software malwarewill basically flip a coin and tell youoh yeah this is good or no this is badwith very little regard to is itactually malware or not uh this is thisis a hint that we actually need todevelop new tools that are able toreason about the fundamental propertiesof the software that's written and themalware that's written for that uhsupply chain setting on the left handside is uh another study that we did totry to understand how many files areexisting on your containers and how manyare tracked by SCA tools when yougenerate an SBOM for your container howmany files did the sbomb not report uhto my surprise some for some containersyou will see 50% of the files are notreported this includes secrets but thisalso includes malware like maliciouscode that's embedded on the containerand distributed that your sbomb didn'teven see because it was not a part of apackage or it didn't hash the whole filesystem so it really does hint to me thatwe need to develop something else weneed to start thinking about how do wegate the trust of a product that's beingdistributed throughout the pipeline anddo we need to have a sort of like supplychain aware scanners that understand howthis malware is being developed andhow's how is it being distributed andmaybe supply chain sort of like malwareaware SDA tools like the other side ofthe coin that take a look at a containerand they say hey look I have no clue ifthis is malware but this is not part ofa known piece of software and you couldbe taking different steps to appro to toapproach this i don't know what you WellI think what you're visualizing here iskind of what I was getting at beforethat the traditional malware techniquesthey don't work in this case exactly umand in terms of we need to develop athing I feel like I already did thatthat's how we gotthe and and just to give a sense of howthat works what we recognized way backwhen was that this was a cat-and- mousegame every single one of these did notlook like the previous ones there wasnot an easy way to fingerprint andrecognize it but what we figured out wasthat the problem space looks an awfullot like credit card fraud detection andso what credit card companies do some ofyou probably here they buildstereotypical models of all of us asconsumers they understand what wenormally do and what we don't normallydo and if you do some transaction that'sa little weird uh I travel all over theworld my credit card gets used all overthe place i don't go to uh a departmentstore and buy TVs in London when I livein the US that would be weird right andso when those things happen theyrecognize that that's an abnormalbehavior and they block it they send youa text all these kinds of things westarting back in around2018 we started building models of theopen source projects so we understandwhere are they typically released fromin the world who typically works on themwhat time of day are they done what typeof commit changes you know for exampleit's rare for a very popular project tobe the first one to use anotherdependency that nobody else does butthat's how attackers bury their thingsit's you might remember the uhdependency confusion research that wasdone years ago our prototype system waspicking that up before it was publishedactually and the reason for that isbecause that that white hat at the timeresearch he was putting out componentsthat had the same name as internalcomponents at companies with a very highversion number so their build tool wouldprefer the latest and grab it well guesswhat that's not a normal projectbehavior projects don't publish theirfirst version as version 1000 they startat 0.1 or 1.0 but the the system oncetrained was able to instantly recognizethis is an abnormal behavior and flag itas suspicious so that 800 and some oddthousand number that's how that systemworks that's what we're doing to detectthose and I I do want to hinge at thatbehavior elementbecause a supply chain is a it's aseries of processes of people actingthroughthe through a distribution and creationof software so it makes sense that inthe same way that we control the qualityof regularuh physical goods we may want to controlthe quality of software by making surethat all of these processes uh followcertain uh uh certain sort of likesecure baselines uh known good processesnon- good approaches i wanted to alsoput a parenthetical here on the on themalware graph that I put this is part ofa research we did with uh with Linux DROdevelopers trying to understand do theyactually scan for malware uh and atfirst we approached them and we werelike do you scan for malware when youpull code into your DRO and they said nobecause it it doesn't do anything andwe're like wow okay are you sure and wewe we turned up with this plot so andkind of to segue to the next uh slide Ithink part of what we also should bethinking about is when we think aboutthe supply chain as a process withmultiple stakeholders collaboratingtogether we should also be looking atthis tireless work of open sourcedevelopers and uh communicating withthem better contributing back workingclosely with Linux DRO developers kernelhackers there's there's really a worldout there of people that are trying tomake this uh this more secure andsometimes we tend to forget that thatreally is it's a network of people ofvolunteers and that um we can use a lotof different tools but it's also amatter of uh building a better worldwith collaboration and uh andintegrations um yeah fullythis is I thinkI'm sorry I said we have one about oneminute left oh wow already time flieswhen you're having fun fast well I I Ithink if anything within cyber securitydoesn't matter if it's this topic or oranother it is about information sharingand it is about collaboration becauseadversaries will work on each other'swork we'll innovate we'll do anythingwe'll look at behavior anything you cancrack you will crack so I thinkinformation sharing and thencollaborating together um is one of thecritical components if it's in researchif it's in within organizations or withvendors that's the thing that we need todo so right so uh actually before weleave tell me the name of your tool iwant to I I want to play with it becauseuh it really does touch in a lot of thethe topics that we talked about the SCAtool the malware scanner is not workingmaybe we need to Yeah we we call itrepository firewall or sonite malwaredefense we have a a big game with theducks and you can go fish forvulnerabilities or malware over in thethe the South Hall and if you want totalk to us more about it but that's whatit's called yeah2025-04-15 21:58:25.713725 ��m#��OAkyLdmGYZ6BQall right thanks everybody for joiningus we have a very interesting panel hereto talk about uh some pressing topics ihave uh two colleagues here joining meso we'll start with uh someintroductions and then uh we'll dive insound good yeah perfect uh Matt Laneri'm a senior industry analyst cybersecurity and risk at Forester researchum and within for research I researcheddifferent domains of cyber security uhAPI security cyber consulting servicestrends threats um and obviouslylegislation as well so for this topicCRA and sbombs will definitely bepresented on um yeah so we fill a couplepanels just on those two things aloneyeah definitely uh so I am Santiago i ama professor at Purdue University uh theshortest way I can condense what I do isI do research on how to securely developsoftware and how to check that softwareis securely developed that has beenkeeping me busy foruh almost 20 years by now it's it'sincredible how time passes uh I developa a couple of projects within the CNCFand I also work very closely with theopen source community to try to gaugerisk and upcoming threats andmitigations and strategies to minimizethe effect of malicious actors allthroughout the software supply chaingreat and uh my name is Brian Fox i'mthe co-founder and CTO at Sonatype umfor the last uh 17 years or so I've beenfocused on helping enterprises of allsizes manage their software dependenciesum Sonotype runs the Maven Centralrepository where all the world getstheir open source Java components fromyou're probably using it even if youdidn't know we run it um and uh I alsosit on the board of the opensourcesecurity foundation the open SSF andalso the Finnos financial open sourcefoundation at Linux Foundation and uhI'm also a longtime member of the ApacheSoftware Foundation so a lot of lot offoundation supports uh here um in thisso uh let's uh let's divein of course clicker was workingokay well we'll do it by hand that'sokay all right so these are the thingsthat we're going to cover uh you knowwhat is a software supply chain attackhow are how is it evolved we have acouple slides on that the variousthreats that are facing open source umhow we think this might evolve hasevolved and um and and talk a little bitabout the different uh differenttakeaways and and how traditionalsolutions aren't keeping up anything youwant to add to this nono all right let's do itstand up here then all right so uh whatis a software supply chain attack soreally we've seen this evolve uh quite abit over the last uh decade 11 years youknow it really goes back further thanthis but I think the when we talk aboutsupply chain attacks and certainlythings inside of open source people tendto remember back to the first struts andheartbleleed and shell shock right in2014 this is when we started to see sortof the logos and the names and themarketing campaigns aroundvulnerabilities um this particularstruts this is not the one you may haveheard uh Equifax uh a lot of a lot ofdata got stolen this is not the one thiswas one before that what was interestingabout this was this is really whenpeople started to*2rom fromfrom this initial uh discovery that wehad to head for relatively shortlifetime of the certificateso to do the authentication we of as yousay we we add wanted to add MTLSauthenticationum for since we had a lot ofmicroservices we were sure that wereally had to automate this there was noway we're going to going to handle uhthe numbers here with uh the number ofapplication and also the short lifetimeFor the same reason we uh we wanted uhour users to be self-provisioningprovision because they they wanted towork independently and be able to dothis without having to go through somekind of manual process askingum forcertificates so first of all we did someproof of conceptuh you probably have heard of this umopen- source project Spiffy and Spir uhthey look pretty interesting since theyreally tailor for what we're trying todo here uh about workloadidentities so we decided to try thatfirst um so we tried to set up Spy butwe had some challenges running it on ourtarget Kubernetes clusters which areOpen Shiftuh this is caused by uh open shiftenforcing s Linux and some of the spircomponents uh are conflicting with uhsome some of some as Linux policieswhich means you have to do additionalwork uh most of these issues are solvednow there's still one open there's alink there but there is a even a workararound available for it so now we we runspy in uh our devcluster but at the time we did wecouldn't make it work so we had to lookfor something else uh so we found searchmanager it looked prettyinteresting since we are uh uh workingin an airgap environment we cannot usesearch manager for the most typical usecase which is to get certificates fromlet'sencrypt um after the work with Spiffyand Spy we wanted to take something outof it uh the X509 SVIDS so we set a goalof of trying to use SER manager tocreate certificate that look like theycould come from Spir that means uh thecertificate should contain a URIattribute and uh here is one example onon how a a spiffy ID might look like fora Kubernetesworkload in our in our case we decidedto define the trust domain which is kindof the host name if you look at this asa URL so you know which which clusterthe workload lives in uh and it alsovital for us to know which name spacethe workload is running in uh because weuse a multi-tenant model where thetenants share the cluster so one tenantwill have one name space or multiplenamespace and other tenants will haveother namespaces so that was veryimportant for us we didn't care too muchabout the service account at the pointbecause that's inside the same namespaceso it's it's governed by the same tenantbut there is a huge challenge with thewith search manager in itself becausesearch manager is made for making iteasy to to to issuecertificates so considering themulti-tenency we first of all had toprevent their tenants to tip on eachother toes you should only get acertificate which the tenant is entitledto receiveand we also wanted some way of enforcingthe uh xfag svids because we're going touse that information on theauthorization layer in Kafka with theKafkaACL so for the v1 version um which wehad a very short uh deadline for about 6months we had to solve this really fastand cut some corners i'm going to sayhow we explain how we did itthe first one was to try to set upsearch manager formulti-tenency uh we use private PImeaning we have a shared common uh rootCA and uh we are supposed to use it wecould of course create our own PKI butthat was not an option that we thoughtwas very interestingso um we decided to set up a clusterissuer and since a cluster issuer isavailable throughout the cluster weneeded a way to to control the access tothis search managerissuer so the first thing we did was todisable the arbback permissions for uhaccess to the search manager customresourcesbut that only solves part of the problembecause the search manager also have theingress shim enabled bydefault so we had to disable it becauseuh if the ingression is enabled you canjust get a certificate by annotating aningress and our users are prettyself-provisioned for other resou3rces inthe cluster and we didn't want to changethat at the time we could uh also notconsider any CSI drivers for the samereasons because they're they all end upin the certificate request that wewanted to block our user from gettingaccess to so now we had everythinglockeddown but how would our users get accessto to thisfeature controllers ofcourse can you see this it might be abit small no it's okay on the top um wehave the green box uh saying applicationoperator this was an operator we alreadyhad in our clusters uh with it the maingoal of this operator is to provide asimplified way for our users to schedulea workload it it basically creates adeployment uh as a minimum and and thenby by adjusting the the specspecification on the applicationresource you can enable more featurewithout having to create all theunderlying resources so it's veryopinionated applicationresource so the first thing this mustmust be available for the users that areusing application resource but we didn'twe we didn't want to force our users touse applications because we have otherusers that are using other types ofworkloads like a stateful set or doingother stuff so we decided in this caseto use the service account as thecontroller for the featureso by just adding a simple annotation tothe service accountuh we created a new controller identityprovider which watches the serviceaccounts and uh interacts with sortmanager creating a resource matchingthat serviceaccount um and then sort manager kicksin and and does it its usual stuffyou're probably aware of this uh itissues the certificate and returns thecertificate inside a secretuh which can then can be mounted by bythe by the workloadpods we're going to look a bit more intodetails here uh you're probably familiarwith this but uh maybe not all ofyouso manager uh returns a certificate inin a secret together with a private keythis allows a workload like in this casethe Kafka client to mount TLS.CRT andTLS.key and by this way it's possible toenable an MTLS to the Kafka clusterwhich has is uh certificate and privatekey but there isa minor challenge here we also need uhsome trustestablishments but in our case we arekind of lucky because we're using ourinternal PI and the all certificates arerooted in our shared companyCA that's why I say on the top heredon't do this um but I said we we justhave six months so we had to cut somecorners this was one of them we knew itwas wrong s manager provides also CA crtinside the certificate secret and sinceit's actually contains the certificatewe were looking for company CA wedecided to mountCA.CRT inside the pod and use that asthe trust anchor for the Kafkaclient and this is obviously wrong andTim will explain morewhy yes so if you're doing this pleaseum pay attention because like Eric saidthis is not what you should be doing umactually the first resource you can kindofum look at if you are in this situationis our website we have a page therewhere weum give a couple of bullet points on whyyou shouldn't do this uh the one thatapplies here is that by directlymounting CA.CRT CRT it's very hard orit's impossible basically to do a um CAroot rotationif or it's impossible to do a CA rootrotation if you don't want to have anydowntime so any rotation will willresult in downtime because there isactually a proper process that you haveto go through to do root rotation whichis impossible with this CA crt um and inthis example there's actually anotherproblem um and that is that we are usingthe CA of the the Kafka client u chaincertificate chain while what we actuallywant to verify is of course the identityof the Kafka server so for example umone case in which this will go wrong isif the cuffer client is suddenly rootedin a different CA in the new companyCA so in this case what will happen isbecause we are mounting the CA crt inthe client in the trust store the clientwill suddenly only start trusting the umcertificates rooted in the new companyCA but like we said before the Kafka'scluster is still using the old companyCA and so this is like this weirdsituation where because the Kafk4a clientactually changed root CA is also startedtrusting a different uh CA so what weshould do instead is we should actuallytrust the company CA um the root of thiscluster CA uh chain so this is actuallythe right arrow that we should draw umbut this is not that easy because thatisn't normally available in the clustereric like he said he had that CA crtavailable so he just used it um and alsoin his case this wasn't like a differentCA so he didn't see any problems butonce you're kind of changing your setupthen you get into these problems andthisis like an instant way to get uh youinto some trouble and to get somedowntimeum there's actually a good solution forthis um something that Eric I thinkimplemented in his next version of histool or his setup yeah just showing thisuh short title slide uh there were someissues in V1 uh it's it was running inproduction there were no real issueswith it uh and it was a huge success butuh we started knowing that a rootrotation was approaching so we needed tosolve this some way because we didn'twant our workloads to break break downuh and we also had requests for usingsearch manager for other use cases sothe way we had to lock down searchmanager was really unfortunate so we'relooking for ways to to solve this andTim will explain more about how we wesolved the first thingso in this slide you can actually seewhat you should do instead of what Ericdid with the c.crtum field of the certificate of thesecrets instead what you should use istrust manager another project that wehave um I'm a great salesman I know uhso this is uh another controller thatyou install on the cluster and youconfigure it through these bundleresources which is a custom resourcethat you can uh use once you installtrust manager and basically trustmanager kind of allows you to configurea set of sources containing a set of uhcertificates that you want to trust andthen as outputuh of this bundle actually new configmaps are created across differentnamespaces that you can then mount inyour uh container where you actuallywanting to have those certificates inyourtrust bundle um so in this case we havea bundle that contains company CA andthe cafe client chain is has migrated tonew company but that's not a problembecause the trust bundle is still uhcontaining company C instead so the cafclient will actually be able to do MTLSwithout any issuesso what happens now if we also decidethat the Kafka server or the Kafkacluster um wants to migrate to the newCA right then in order to do thatwithout any downtime the first thing youhave to do before you do anything is youactually want to make sure that all theclients also start trusting the newcompany CA and the easy way to do thatwith this with this trust manager bundleand that this wasn't be possible beforewith with the CARTT field is youactually just add another CA to thistrust bundle so you add the new companyCA to the trust bundle and now theclient just supportsboth and once you do this and once yourestart all clients make sure that theyall load in this new um trustbundle you can actually start portingover all the Kafka clusters to the newcompany CA to identities that root inthis new company CA because those willbe trusted by the client now too so youcan have this mix of new company Cidentities and old company C8 identitiesum and once you fully migrate to all newcompany CA identities you can actuallyuntrust the old companyCA so that's the full migration kind ofexplained and that's why you should usetrust manager because this is notpossible to do um without any downtimeif you're just using CA.RT then it willjust instantaneously switch from companyCA to new company CA and like half ofyour applications will breakuh this is what the resource looks likein YAML so you have this bundleum with a set of sources defined thatlive in config maps and then you havethe target which is also config map anamespace selector which says in whichconf in which name spaces these configmaps should be created so that the therethey are available to be mounted thepots and then Eric will now explain inmore detail t5he second problem he solvedin V2 yeah the second problem was thatwe wanted to unlock search manager againbut we still had the same requirementsthat I introduced with um after sometime a new project popped up calledapprove a policy um that is a a aplug-in for search manager which allowsyou to define policiesuh of what how the certificate shouldlook like and where they should beavailable uh in our case they werealmost feature complete for what we trytrying to solve at least on the um onsome parts of it but what we weremissing was to kind of uh attach thepolicy to the name space because sincewe again are running multiple tenants inin a cluster we want to ensure that uhthe certificates somehow is linked towhere the namespace of the certificateisrequested but that was solved bycontributing the cell support which is areally cool feature uh I thinkpersonally so it can be have a lot ofother use cases you probably know cellfrom from upstream kubernetesuh now Tim is going to explain a bitmore on the details of approval policyyes so by sorry sorry um so byintroducing approval policy we couldreverse all the things that we have doneuh previously so now certificates andcertificate requests were available andalso in grion we want to experiment withAnd we are also considering uh the CSIdriver sorry about that yes so as Ericexplained he opened up the all theseresources again to his tenants but nowwe need a different way to kind ofprotect uh which certificates are beingissued and which are not and so the wayyou do that is through these approvalpluginsso let's go through the whole flow hereokay so a tenant creates a certificatethat results in a certificate requestbeing created automatically or they canalso directly create a certificaterequest automatic uh manually and then acertificate request is really theresource that will be considered by asub manager issuer but before thathappens the issuer will actually checkdoes this certificate request have a aum a condition a approval condition thatis set to approved if it doesn't havethat it will actually wait if it has onethat's set to denied it will actuallysay "Okay I'll not do anything with thisuh certificate request." So by defaultwhen you install S manager there's thisapprove all approver which approves allthe certific requests it sees which isgreat to get started but what Ericneeded is some more controlum and for that you can actually disablethe default approver and instead useapproved policy an approved policy isthis plug-in that you can configurethrough these certificate request policyresources this is basically adescription of what certificate requestsyou want to approveum when you see them so in this casethis is a policy that will approveclientcertificates and the one check that itwill perform here is that all the URIsin the client certificates actually uhstarts with this spiffy prefix and thespiffy prefix in this case contains thename space of this certificate requestresource so that way we are sure thatwhen a certificate request is createdum before it gets issued we actuallymake sure that the uh URI in that inthat certificate request actuallymatches the name space in which thecertificate request lives and that wayuh we make sure that there's like nocertific request created for anothertenant or another namespace um in Eric'ssetup the same you can also use toconfigure servercertificates and here instead weactually do checks on the DNS namesinstead also check that they this casehave a suffix with uh the namespace so to to conclude when you want toscale your TLS setup there are threethings that are very important but firstscale is kind of a very hard problem tosolve um because you have manycertificates you also have many tenantswhich different requirements withdifferent requirementsum also many clusters possible so thefirst thing you want to solve is youwant to automate like the very basicsyou want to automate the certificatelife cycle um and also the distributionso for that use search manager trustmanager the second thing that we saw isyou want to make sure that once that isautomated you can give the tenants inyour setup the ability to kind ofself-provision everything um but youwant them to be able to do that withoutbreaking all the tenants so you set uppolicies um and these policies you canconfigure through approval policy andthen the last one we didn't really talkabout this um but it's also veryimportant you have to make sure that allyour automation is monitored so thatwhen something goes wrong whenautomation breaks you get an alert orwhen a policy is not adhered to uh youalso are notified of that so that's theconclusion thank you for listening iwould like to hear if there are anyquestions and any kind of feedback youcan leave through this QR code thank youmicrophone there if uh questionshow are you doing the monitoringum well there are multiple differentsolutions for that i think I'm not surewhich one Eric is using i think maybesome matrix metrics prometrius metricsyes and like the cube states matrix orYeah we use some of this i'm looking inthis direction my clustering is herealso um we also set up some custommonitoring of the CAum because we want to know what's theearliest certificate that expires inyour chain and get an early warningabout that one so we we typically usePrometheusGrafana there also very various uhcommercial solutions available for thatmount the trust bundle in the pod do youmount it the normal way because thetrust bundle is delivered as a in aconfig mapif it changes then how oh that's a goodquestion I had actually in my recentpresentation prepared we use a communityproject called Reloader some of youmight have heard of it so that way it itwatches a config map or a secret andwhenever it changes uh it will trigger arolling update of your workloadthat's also very important for thecertificates themselves when theychange the trust store probably won'tchange that much but the certificateitself when it changes you must makesure that the application loads a newcertificate in time otherwise you'llhave uh downtime basically first overthere hi when you were initially doingthe migrations and when you had thethousand services um and using the trustbundle uh did you have any issues likefor example when I when I when I gotused There was an issue where some ofthe services were using like somecertain libraries which all which wereonly checking the first certificate inthe bundle uh no looking at the changewe didn't have a problem with that thereare no change in the CI bundle there'sonly typically only root certificateslike self-signed certificates in thebundle you have the root yeah butthey're bothself-signed so they're not chains reallythey're they're form differentchains there yeah don't use thatum do you have an issue have youcompared issuers with your private CA umor you have a signing certificate inclusteruh we had different solution over timewe started out with a CA issueruh we we so we had to order aintermediate or subordinate CA insideour private PI which we initiallyinstalled as a CIA issure it's notrecommended it's it's not that secure sonow we're running it externally so werun it in Hashik of vault using the PIengine so manager just sends thecertificate signing request to vault andyou get the signed certificate back somanager takes care of the of theautomation and and all this renewal andI think that allows us also to set upextrauh security in vault because we alsohave the the policy again repeated invault so as an extra layer yep thank youthe one thing that you cannot do orthat's harder to do is um in fault youcannot say for example that the namespace has to be in the URI so thereforeyou need approved policy cuz it isn'taware of what namespace is so you couldalso solve that by having like differentissuers in each name space that makes itso that you have you have to have very alot of uh different issuers basicallybut we also we we used the the policiesinvolved to kind of we have one issueper cluster so we ensure the first partof the spiffy id so we know that uhwe're not having environments goingacrossAny more questionsokay thank you for listening thank you2025-04-15 21:58:26.350751 <<��(n#��AgWgagjHtnlEso in this presentation we will beexplaining how to do or how to solve TLSat scale more specifically we'llactually be going into um Eric setup thesetup that Eric created for one of hisclients uh where he's actually usingmanager in a very advanced way um he setit up such that it's usable within amulti-tenant environment and such thatall these tenants can self-service theirowncertificates my name is Tim i work forCyber Arc um I'm a search managementmaintainer and so is Eric and I'm Ericuh I work as a contractor but for manyyears I've been working for the startnetthe Norwegian TSO which is responsiblefor the transmission of energy in theNorwegian powergrid okay so a bit more about managerthird manager like probably most of youalready know is a CNCF graduated projectand we are very interested in solvingX509on Kubernetes and Open Shift um so ifyou look at search manager on GitHubthis organization we have a coupleprojects that are very interesting andthe most important one of course issearch manager itself you have trustmanager approve a policy uh and a fewothers we'll actually go into those uhlater uh we're a fairly popular projecti think we have almost 12,000 GitHubstars uh 400 contributors and we arebeing downloaded many many times per dayum and today we will actually be talkingabout how Eric is using Manager um butthat wasn't always the case I thinkso when Eric starteduh with this client he actually didn'tuse any search manager at all so he wentfrom zero to the point where he is atright now which is actually being asearch manager maintainer it's like aamazing story basicallyum so in this presentation we'll try toactually uh take you through the journeythat Eric made and kind of show also howhe set up search manager for his clientand how he um really got into this veryadvanced use caseyeah so first uh a bit of context hereum I'm used to work as an applicationdeveloper for many years uh also forstartnet um so before joining the ourplatform team uh this was a situation wehave uh multiple Kubernetes clusters andmultiple Kafka clusters running on VM sonot running on on Kubernetes uh I wasusing this environment uh as a developerbutuh after some time I I joined theapplication teamum this setup was created by a projectthatuh had a lot of applications spin up ina cluster doing a lot ofstuff so but there was somechallenges because of time constraintskafka was uh set up without anyauthentication at all meaning that datawas available for anyone with access tothe cluster inside our networksumwe the uh we have a lot of tenants ummeaning independent teams that that workon their own schedule uh they typicallyuse micros service architecture meaningwe have a thousand of workloads runningin our Kubernetes clusterssoum at some point there was a new majorproject on the horizon which wassupposed to uh take care of automaticbalancing of the market of energy umand it was really no option to keep thissituation without uh any authenticationwe had to secure thedata so we started looking into this andum we first we decided we wanted to usethe native authorization mechanism inKafka which is called Kafka ACL but weneeded to find a way to authenticatetheseworkloadsuhso just going to touch into what optionsyou have with Kafka you basically havetwo options you can either use uh whatKafka call susleuh that there exists some variants of itbut the in general it requires you to beintegrated with an identity provider andalso requiring some kind of registrationprocess in the identity provideruh in addition you have the option ofusing SSLuh which is also the Kafka name for itwhich means you can authenticate usingMTLS umwe work in an organization with uhworking in a traditional way so a lot ofmanual processes so we wanted toinvestigate the NTLS option firstum in order to not be dependent on anyregistration process in in any identityprovider uh so we looked at how it worksin Kafka and we soon find out that Kafkadoesn't support certificate revocationsso that was an important take f18 it in demo after languageseparation uh all next steps can be runindependently and execute uh for eachlanguageso um as Alexi talked about right likeyou have lots of pre-processing stagesin your pipeline these stages may or maynot be linked to each other and they canoperate on data the very wide variety ofscales right it could be at the megabytegigabyte level just for like localdevelopment and testing or at actualterabyte scale for when you have youknow lots of data and you're doingactual data transformationsone of the issues that we were runninginto was how do we make it easy for ourdevelopers to like do quick local likedevelopment cycles on their laptops andthen scale up to the cloud right um thisthe solution we elected to use for themost part was Ray and Cubray um there'sobviously options like Spark Daskk andall of that stuff we we likedCubray so Cubray is an operator thatbrings core Ray concepts such as Rayclusters Ray jobs and Ray serve toKubernetes since it's an operator itrequires custom resources and YAML whichmost of our users actually hated we gotaround that by using something calledthe Cubray API server and all that doesis it allows your users to make directAPI requests and then converts them intothe corresponding YAML objects on theclusterthis is acted upon by the cube operatorto do whatever you need it to be doingall our array based transforms arefollows the driver worker paradigm anduh this uh the driver when it startreads all the input files names orobject names in the case of S3 and uhdispatch uh tasks toworkers which are implemented likearray actors sorry unlike Spark we don'thave apartitions here because uh when a workerfinishes proceeding the a file it asksthe driver for the next one and uh thisapproach helped us to prevent uh slowdown when uh file sizes uh varysignificantlywe usea separate cluster for each uh differentfor different data processing task anduh which allow us to create taskspecific images and uh avoiddependencies uh conflicts and includesupport for legacy libraries for exampleeven Java models each worker reads andwrites data separately and uh thisspread of network load between actors orin the end of the day array in the endof the day Kubernetes ports andnodes before going to KFP automation andyou know to summarize very shortintroduction to to Ray we want to sharewith you Our most uh extended task thatwe executed it was a dduplication taskfing duplication task with almost eightand a half billiondocumentswith compressed storage 23terabytes the task reduce the documentsand storage something like 33 3540% but the most interesting part hereis the right cluster configurationit included7500 CPU cores and 56 terabyte of RAMand the task spann almost 40hours um so I just noticed that theslides look a little funny we'll makesure to upload slides with the correcttemplate afterwards i don't know whathappened um anyway so we just talkedabout how Cubray can be used to scale upyour workloads but how do youorchestrate these longunning ETL jobsrightumso so um your first option obviously isjust running things locally but that'snot very resilient to like PTO andvacationsum the next thing we looked at wassomething like Kubernetes jobs which gotus most of the way there but the userexperience wasn't quite friendly enoughfor like a data scientist to really diveinto right like Kubernetes like jobsthey didn't have a very great UI forthem this is why we started using cubespipelines um while this talk is focusedmainly on data engineering I do want tolike reiterate that or iterate that umcubeflow pipelines can be used fororchestrating your entire MLOps lifecycle um there's three main ways tocreate steps or components as they callthem in cubeflow pipelines um you canhave normal like Python decoratedcomponents very easy to get started withand um you can have very complicatedlike completely custom containerizedcomponents as well right so if you haveJava code that you need to be runningfor some you know unfortunate reason youcan dothat um well that looks good enough umso here's a picture of the UI um hereagain the sp9ecifics of this DAG doesn'tactually matter right it's just like abunch of steps they're all connectedtogether um we have some users of Cubapipelines who have hundreds or eventhousands of components in a singlepipeline why they do that you know it'sfor flexibility right you can dowhatever you need to be doing the otherthing I want to call attention to iswithin the UI again you can see you haveinput parameters you can look at thelogs while it's running all this kind ofstuff right yes you can do this from theterminal but it makes it much easier forpeople who aren't familiar withKubernetes to kind of orchestrate andbuild these jobsout now um say I say I want to explorecubo pipelines right um maybe I don'twant to learn a new Python SDK for itwell the answer for that is Elra elraoffers a pretty easy to use drag anddrop interface for creating umcomplicated DAGs um you just put themtogether you can specify any inputparameters you need outputs sequence allof that fun stuff so and if you'renervous about trying it out you know useaLyra so I want to briefly sum up some ofthe key benefits we had from using Cubopipelines for this group um the first upfirst one wasmodularity our workflows were broken upinto um like smaller reusable chunksthat we could chain together to buildvery complicated pipelines this madedebugging very easy since there wasindividual pieces of work and whensomething was breaking you knew what wasgoingon um it also introduced um a lot ofnicities around reproducibilitykfp uh runs are kept around you know fordays weeks months you could see whatinput parameters went into it what theoutput artifacts for view all of that inthe UI so as a data scientist again whenyou want to see how something hasshifted over time it's very easy to dothat without having to go anywhere elselastly again I keep talking about the UIright visualization capabilities in KFPmade it very easy to build out neuronsto track experiments and to justtroubleshoot any issues we were havingall of these came together to allow usto replace the data scientists withoperators this way data scientists couldfocus on problems that they're reallygood at solving and like that bringmaximum value for them and the operatorscould do what they're good at like justrunning stuffnext up Alexi is going to be talkingabout some of the pipelines they'vebuiltokay so Anish talked about modularityand in order to provide uh reusablecomponents and modularity we implementedall by steps like a simple pipeline anduh we have first uh step is argumentpreparation because uh some uh argumentswe need to proceed during the runtimeinstead of compilation time after thatuh we start ray cluster and it's not weactually it's a KFP component and uhwhen the the ray cluster is ready KFPsubmits the job to it and uh when job isfinished uh the cluster is stopped orundepundeployed the undeployed component isimplemented like exit handler it can besimilar to in the in the code for likelike try try and uh finalize final toguarantee that cluster is undeployedindependently on the status of executionof job if error orsuccess and all these free component uhlike deploy array cluster execute joband uh destroys the array cluster orshared component are the same for alldata prep-processing steps thedifference only in arguments that weprovide tothem we can see here example of uhsimple run run and with all the threecomponents like we mentioned uh here anduh how we create thecomponents okay we created simplepipelines but uh our target is toautomate the entire uh process maybeseveral uh data prep-processing steps orall the the steps so we createmulti-steppipelines or super pipelines like wecall them where each step is a nestedpipeline or nested simple pipeline forspecific data prep-processing step andyou can ask um how we implement nestedpipeline if we work for example with KFPv1 because uh nested pipeline wereintroduced in KFP only in versiontwo so we use a KFPS SDK to executepre-installed uh simple pipeline and asa result uh on with KFP1 we have uh nplus one runs where n is the number ofsteps in in the multi-step p:ipeline forexample here we will get four runs andin the demo we are going to show you howwe doit and it's a perfect time to go to thedemo i think it's this oneokay maybe let me stop it for amoment we can see here we have severalpipelines deployed on KFP server it'sKFP v1 the first one it's a multi-steppipeline just in a moment I will show itand all others are simple pipelines forspecific uh step in the dataprep-processingso we gohere anduh we can see here that uh first we runuh document docu document identificationafter that exact dduplication and afterthatlanguage separation and filtering andthen uh we have uh annotated transformsfor uh document qualityannotations and uh we runuh uh three differentuh sub pipelines for different languagesfor English uh Japanese and uhFrench okaysookay and uh we can see here for each uhsimple pipeline we have the same patternwith or witharba withfour steps like we discussed before andwe didn't have any anyruns now we execute the the superpipeline we start therun okay anduh we can see first first run it's therun of a multisteppipeline in a minute uh we will haveanotherone okayit's meantime I will demonstrate you theKFPV2it's exactly the same super pipeline butuh here we have uh KFPv2 supports nestedpipeline here the real nested pipipelines we can go into the the UI tosee the parameters and uh we can executestart the run and uh another examplehere that we demonstrated like Anishsaid that we had previous run and wejust clone it to run it again so we havea history of uhexecution and we can see the first stepis is running nowgo back to KFPv1 refresh and we see that uh inadditional to super pipeline uh docudocu document ID is running and almosthere it's done and uh we goback to runs and we see that exactdduplication uh isstarting and we can see go intothe super pipeline multi-step pipelineAnd we can see that the first step isdone exactly duplication is done and nowthe language identification is startingand we can see for each step thedifferent runs on the on thedashboard for KFP V2 it's simpler wehave only single run and we can uh seethe status of each steps on the same uhviewso it'sum language and notification is finishedand now we have a split for eachlanguages for uh English Japanese andFr the filtering the filtering is doneand now we runit'sdocument qualityannotators the same onKFPV2 just refresh the screenand uh we have language identificationis done the Japanese filter is startingthe first after that others anduh meantime we return back to KFPV1 wesee that the process is done and if wego to therun and we can see that instead of asingle run we have a lot of differentruns for each step we have a separaterun and we can go into the if you wantto runs and checkit on KFPV2 the Japanese filter startedfirst so now the docu quality forJapanese is started and uhother quality docu document annotatorsare running and it's done and if you goto the now toruns just to demonstrate that all thepicture anduh justmoment wehave only two runs here because one it'sthe previous run and the second oneit's that we executed i think we donewith thedemo andum we go to discussums the history of our work when IBMstarted work uh working on um datapreparation for large language modelseach data scientist work independentlythey created proprietary Python scriptsexecuted them on Ray on Spark okay withKFP we automated the process of uh startand run butthe data processing scripts stayproprietary and uh it's create a lot ofproblems for example for newcomers orsomebody developers wanted to create anew transformer he has toknow he had to know how to work with theinfrastructure how to work with ray howto work withspark anduh in order to resolve this uh problemuh last year with submitted uh orpublisheduh data prep kit opensourceproject that designed to provideuh to help to implement data processingin consistent wayit rem removed uh need for mo mobiledevelopers for model developers toimplement common task like access to S3storage generate metadatauh unified uh shared parameters and themost important uh it wraps um runrunning uh frame framework so thedevelopers now don't have to know deepknowledge about rail sparkat simplify add newmodels and of course uh we have KFPautomation for the models on the sameumproject DPKwas checked in production it was usedfor IBM granite LLM's creations and umtwo weeks ago uh DPK joinedLinux Foundation data and AIcommunity this picture is fromumour paper that was published last yearon data in big data confidence and uh wecan seehere list of different transformanceit's not all of them that uh we have inDPK we have much more but most importantpast part here is that we have differentruntimes we have we can run on pi pythonjust on the laptop we can uh run it onspark we can run it on array and uh KFPautomation foruh automated process and DPK is not onlyfor finetuning code data preparation itcan be used for ra we have some exampleson the on the p intheproject andum actually yeah sure um so here's someQR codes you can scan or just Google forthese uh the first one takes you to thedata prep kit project and the second oneis a link to the cubeflow community umagain both communities we're reallytrying to grow them out um so you knowwe welcome contributionsum so let's get in a few seconds forthat um I'd also like to call out thisis Alexis's first time giving apresentation so like[Applause]congratulations all right we have sevenminutes left so anyquestions it looks like there's a mic inthe middle I thinkhello all right um yes in CubeFlowupstream we have experimentalintegration for Ray as well so did youuse that or did you build your own Rayi'm sorry could you repeat the secondbit a bitdid you use the experimental integrationin upstream cubeflow for ray or did youbuild it yourself the cube rayintegration so um this was using vanillacubray like there's noadditional extensions really on cube umthere were some features that were addedto the cube API server and cubri as partof some of this work but that's allupstreamactually we have addition but not on KFPnot on Cubray but the connection betweenKFP and Cubray how to execute the thefree components that I mentioned here tostartthe array to to submit the job and todestroy the cluster yeah for example thepermissions to do that I think they'realso upstream so that you could maybereuse and the second question isregarding Spark did you try somethingsimilar with the Spark operator as wellum I'll let you answer that one yes wewe had discussions about the sparkintegration with KFP and actuallychecked the spark operator thatcurrently part of aKFP project but meantime we use only RAis KFP so the spark operator is a littlemore recent to the cubeflow project sowhen we started on it cube seemed to bein a better spot so something we'llprobably revisit in the futurethank you thank youhi thank you for the presentation umbeing new to the um CubeFlow um I have aquestion um and it makes an impressionthat the maintaining this framework onKubernetes is pretty large effort andrequires you know significant investmentof skills so I'm wondering for thespecific um pre-training task that youdemoed today with the cubeflow pipelineswhich is obviously working could it besort of implemented with a morelightweight components withoutdeployment the whole cubeflow so make itmore sort of smaller footprint moreaccessible uh or it has to be full-blowndeployment of cubeflow pipelinesyou can use DPK withwithout KFP you can run it on just inPython or spark or with array but itwill be single step and in the do in theexamples we have integrationwith with the Jupiter notebooks it's akind of pipelines to connect the stepstogether and uh currently we check if itcan be maybe with langraphlraph and the len len flow so the samescript could be run from the Jupyternotebook essentially and that would workas well as long as it uses your plug-inright correct correct okay yes thank youall right uh well we're basically attime um if anyone wants to talk in moredetail about any other stuff you knowfeel free to come by 30 minutes is not alot of time to go into tons of detail soyeah thank you for attendingthank you2025-04-15 21:58:26.918239 �s��`p#��wAYvXCcSjXKEQall right folks before we hand let'splay this game all together i knowyou're you're doing it and you have allthe time to finish the lab the whole daythe cluster is going to be up andrunning the wall day so take your timedon't worry instruction are therecluster is there but now I like to playwith all of you and in order to playwith all of you I I will use it's goingto be me against you Natalie exactlyexactly so we also set up the gameourself and on on anotherclusters and uh and I think it's herehere wego let mecheck and it's overthereright so this is thegame and we made it uh multi-user rightlike in theinstruction so now I like you touh access this QRcode and it's create QR code if you canplease try toaccess check I want to do alsomyself go so this is the app running inproduction and you are the user right umthe as user you will see this thisinterface now when you click start gameyou should see something like grantcamera access and when you grant cameraaccess you are asking permission to themobile or your device so you might needto grant this permission but when theirpermission is is is uh is is given thenyou are able to join the game and jointhe game means that you are joining thegameuh that we have to start so rememberthere's a front end and there's a backend the front end will join the game andthe back end will start the game nowyou'll be assigned to team one or teamtwo how many of you are in teamone how many on team two oh wow and whenI click here next round you shoul<��uo#��!ACzdX5qDgQ2Uhi everyone my name is Anish Astana i'man engineering manager with the openshift AI group at Red Hat um I've beenworking in the cloud native AI space forthe last seven years or so now hello anduh I'm Royal Alex i work for IBMresearch something like 25 years startedwith IBM middleware after that uh cloudinfrastructure cloud solutions andcurrently my focus on data processingautomation and um today we'll be talkingto you about um foundation datafoundation model data engineering uhusingkubernetes so to give a brief overviewof the agenda uh we'll be introducingwhat data what these data processingworkflows look like first then we'lltalk about how projects such as cube rayand cubeflow pipelines can be used toscale up and productionize yourworkflows then we'll kick off the demoand then introduce another projectcalled um data preparation kit whichmakes it much easier to build thesesystemsout okay so let's talk about dataprep-processing workflows the for workworkflows usually starts from datacorpse access and in our case we useparet files with error tablesformat and uh when we work with largedata sets duplication orsemi-duplications are often present anduh therefore the first uh steps in thedata processing it's uh working with thedduplication which can be exactduplication or combination of exact ractor fitduplication after that usually or mightbe language separation and filteringwhich allowsto proceed with next steps based on aspecificlanguage depends on uh the data inputsome annotated transformers can existfor example to check that data doesn'tinclude personal identifiableinformation PII or hateabuse profanity language or just uhcheck that quality of documents thistransforms at least in our case thereare mutualuh they do do not depends one to anotherand we can execute them uh in any orderor even in parallel because they writedata in separate uhcolumns we will see that in the nextslide and usually the process endswith filtering andtokenization like I said before the onesome uh steps can can run in paralleland merge at the endanother type of parallelism can beapplied based on natural languagebecause after uh and we're going todemonstrate7=d seethe camera we're using the front camerabecause this the model was training withthe front camera so now you should doyou know the picture yep and it shouldrecognize your sign and if it's wrongit's the model i don't know it's it'sthe background and so on so the the gameis like uh you know now team one winbecause it's this sum of all the signand let's say it was a paper againstShisur and then and then Shisur wonright so that's the game um so it was around test so let's do it together againi wanna I want to make sure you areaccessing the app and you just refreshthe page on on your mobile or I can Ican do it again i can show you uh thethe code againi wanted to play as well yeah pleaseplease so it's uh see code if you wantto play again you have the quick code oryou can just refresh the page and youwill have the the the the page so sincewe said Roberto in and everyone we saythat there was a competition now it'stime to play there are three rounds andto the first three in the rank we have arank also in the game we will give thosefantastic fedoras so now it's time toplay art please um if you can access theapp can you access the appwaiting waiting because we need to startthe game okay joining the game join thegame please and be ready to to win somegadget so let's see let's do somepicture let's see lots of paper and rocklet me know if your sign was wellrecognized right depends rock rock itwas uh well recognized okay cool by sidewe're using front camera because wetrain the model with front camera in theold model we were using the back camerabecause we were training with the backcamera so another round i'm team one whoare you team one again it was a tiebefore let's see nowpaper well detectedhow many of you got a a correctdetection and now again yeah so he'swalking of course cheers to the engineerand let'ssee how many wehave rock mycase okay okay so team one woncongratulations team one yes congrat toZip and cluster man who isZip zip you have your nickname on thephone who isZip who isHolly holly you have your fantastic reddotcongratulations Hollyand who is what's the other nameclusterman hey Clusterman congrats butwe don't have the first so who isTeslook teslook here we go mate congratscongrats you're welcome so that was thef fun part of of the lab and please takeyour time to complete the lab create thesame instance as we did again this is aproduction grade applicationmicroserbased model at scale using CPUinference uh we shown a complete exampleon how to do all of this yep and uh youhave the slides and I will update theslides also with the all the links ofthe source code of this lab so you willhave it um now I think we can uh givesome time for for you to finish the lababsolutely and I don't know I don't knowif there's another session taking placeafter this one uh typically this onefinishes at half past three but yeahplease also if you can rate the sessionif you like it the idea the flow thepassion the swag and everything fromthis session please give it a give itgive us a thumbs up uh it will be veryimportant for us and I thank you forjoining us today and I wish you a greatCubeCon thanks thank you enjoy your timetake care enjoyall right folks before we hand let'splay this game all together i knowyou're you're doing it and you have allthe time to finish the lab the whole daythe cluster is going to be up andrunning the wall day so take your timedon't worry instruction are therecluster is there but now I'd like toplay with all of you all right folksbefore we hand let's play this game alltogether I know you're you're doing itand you have all the time to finish thelab all the day the cluster is going tobe up and running the wall day so takeyour time don't worry instruction arethere cluster is there but now I like toplay with all of you and in order toplay with all of you I I will use it'sgoing to be me against you Nataliexactly exactly so we also set up thegame ourself and uh on on anotherclusters and uh and I think it's herehere wego let mecheck and it's overthereright so this is thegame and we made it uh multi-user rightlike in theinstruction so now I like youto access this QRcode and it's create QR code if you canplease try toaccess check I want to do alsomyself we go so this is the app runningin production and you are the user rightumthe as user you will see this thisinterface now when you click start gameyou should see something like grantcamera access and when you grant cameraaccess you are asking permission to themobile or your device so you might needto grant this permission but when theirpermission is is is uh is is given thenyou are able to join the game and jointhe game means that you are joining thegameuh that we have to start so rememberthere's a front end and there's a backend the front end will join the game andthe back end will start the game nowyou'll be assigned to team one or teamtwo how many of you are in teamone how many on team two oh wow and whenI click here next round you should seethe camera we're using the front camerabecause this the model was training withthe front camera so now you should dothe picture yep and it should recognizeyour sign and if it's wrong it's themodel i don't know it's it's thebackground and so on so the the game islike uh you know now team one winbecause it's this sum of all the signand let's say it was a paper againstShisur and then and then Shisur wonright so that's the game um so it was around test so let's do it together againi wanna I want to make sure you areaccessing the app and you just refreshthe page on on your mobile or I can Ican do it again i can show you uh thethe code againi wanted to play as well yeah pleaseplease so it's uh see code if you wantto play again you have the quick code oryou can just refresh the page and youwill have the the the the page so sincewe said Roberto in and everyone we saythat there was a competition now it'stime to play there are three rounds andto the first three in the rank we have arank also in the game we will give thosefantastic fedoras so now it's time toplay art please um if you can access theapp can you access the appwaiting waiting because we need to startthe game okay joining the game join thegame please and be ready to to win somegadget so let's see let's do somepicture let's see lots of paper and rocklet me know if your sign was wellrecognized right depends rock rock wasuh well recognized okay cool my sidewe're using front camera because wetrained the model with front camera inthe old model we were using the backcamera because we were training with thebackcamera so another round i'm team one whoare you team one again it was a tiebefore let's see nowpaper well detected how many of you gota a correct detection and now again yeahso he's walking of course cheers to theengineer and let'ssee how many wehave rock mycase okay so team one woncongratulations team one yes congrat toZip we did it and Clusterman who isZip zip you have your nickname on thephone who isZip who isHolly holly you have your fantastic reddotcongratulations Hollyand who is what's the other nameclusterman hey Clusterman congrats butwe don't have the first so who isTeslook let's look here we go matecongratscongrats you're welcome so that was thef fun part of of the lab and please takeyour time to complete the lab create thesame instance as we did again this is aproduction grade applicationmicroserbased model at scale using CPUinference uh we shown a complete exampleon how to do all of this yep and uh youhave the slides and I will update theslides also with the all the links ofthe source code of this lab so you willhave it um now I think we can uh givesome time for you to finish the lababsolutely and I don't know I don't knowif there's another session taking placeafter this one uh typically this onefinishes at half past three but yeahplease also if you can rate the sessionif you like it the idea the flow theinstruction the passion the swag andeverything from this session please giveit a give it give us a thumbs up uh itwill be very important for us and Ithank you for joining us today and Iwish you a great CubeCon thanks thankyou enjoy your time take care enjoy2025-04-15 21:58:27.437348?er it'll just be that you'llhave uh pod IP pods with same IPs acrossdifferent networks so yeah sorry Ididn't understand the question but thankyou for the question yes please maybeyou need to use the mic if everybodyneeds to hear you unfortunatelyhello okay um so you have cluster userdefined and you have um userdefined umUDN right yep so let's say you definetheclustered network rightand a user defined a userdefined networkfor the same name space do we thenoverwrite one of them or how does thatwork yeah so if the admin has created acluster network and the tenant alsocreates one i think that was thequestion right yes yeah the tenants onewill not come up so in the status youwill see an error for the tenant andyou'll have an alert or something thatsays there is already an existingnetwork in this name space that has beendefined by the admin so contact youradministrator or the vice versa if thetenant creates it first and so so forprimary networks you cannot have morethan one network for your pod right itdoesn't make any sense to have like thewhole concept of primaries all yourdefault traffic should go through thatnetwork so you can only have one primarynetwork in your namespace so if thecluster administrator creates it firstand then the tenant creates it or viceversa one of them will be stuck in errorstate like the CRD will have a statusthat says network creation unsuccessfulbecause there's an overlapping networkokay so if we use it the OVN then thenit just ignores the network networkpolicies on the cluster thenum not sure I followed the secondquestion so if you create like the wayyou created the green and the yellow umnetwork for example I create a networkpolicy stating the port in the blue onecan go to the yellow one it will justignore it right because it doesn't sohow do network policies work withsegmentation yes does it just ignore itbecause you don't use the defaultnetwork again you use the OVN one rightyes so you're going to get the answer tothat question in a while it's comingokay thank youyeah pleasethank you in the case of creating auserdefined network or cluster creatingnetwork if my pot needs to talk with theKubernetes API server it's possible it'scoming it will be I'll be showing it inthe workshop also okay the next questionif I have two uh warloads each one in adifferent name space each one with theirown network can I have IP connectivitybetween these two workloadsright now okay the right answer is as ofnow no the whole point is to keep themseparate so no IP connectivity completesegmentation complete isolation forcompliance reasons if you want to haveconnectivity expose them using a loadbalancer and then through the serviceyou can have connectivity but pods bydefault is a default posture of no butwe have something on the road map calledinterconnecting UDNS and usually the usecase is maybe your side range is fullmaybe your cluster doesn't have any moreresources you want to expand your UDNthen you can have another UDN andinterconnect them but we are designingthis still right because it's kind of acomplex problem do we want partialconnectivity do we want fullconnectivity etc so um yeah okay andlast questions uh in the same space canI have two user defined networks orcluster defined networks i mean I have apot uh talking into different networkplanes management and signaling orwhatever yes that is possible only oneof them will be your primary network theothers will be secondary you can haveany number of networks in a given namespace only one of them will have roleprimary every other network will besecondary so your management network canbe primary so where whichever networkyou want the default to go through thatwill be your primary but you can have Idon't know five 10 more name spaces thepod will have 10 interfaces and they'llall be plugged to different networks andif for whatever reason I need internallyto have connectivity between these twonetworks can I create a port as avirtual router in order to forwardtraffic between the two networks that isnot possible today but it is going to bepossible using what I mentione@d which isinterconnecting UDN so you don't have tohave a pod that's a router we will dothe virtual router that sits betweenthese two networks that will connectthem and then maybe we'll make it moreconfigurable saying maybe it's notreally secure to do parttood there maybeit's part through cluster IP service orsomething right like so that's on theroad map so coming soon thanks a lotgreat so my pods are up so we'll try tocontinue with the workshop uh this isjust the first blue pod that I'm tryingto describe here so it's the let me showyou the command that I did oops it's abig on so it's the app blue zero pod inthe bluenetwork and there's a bunch ofannotations which is basically the IPIPAM don't worry about it what I want toshow is that the pod is healthy and ithas an IP1024419 which is bad right because wecreated a created a network with 103 103cider but nothing to be surprised aboutthe reason why the status.pod ips areshowing you this IP is becauseKubernetes doesn't know about multiplenetworks cublet still needs an IP toprobe at otherwise your health checkprobes are going to fail so that is theIP that you see here but this IP is onlyused for probing that's it so the pod isstill connected oh well that's it as insomewhat but the pod is still connectedto your default Kubernetes network andthis will be true until the coreKubernetes community fixes things likeprobes or DNS or reachability to KPIservice and DNS which needs that infranetwork attachment so all your UDN podswill always be able to access DNSservice and cube API service which isrunning in your infra network and cubletprobes will be working through thedefault network interface on the podwhich is the default Kubernetes networkthat's the only use case for needingthis pod IP nothing else so looking atthe interfaces inside your pod right sothis is an example of a blue pod you cansee three interfaces one is the localhost which we don't care about theETH0 the1024419 is the IP that is used forprobing your pod nothing else is on thatinterface basically everything else isblocked only cublet probes work thereeverything else is going to use your UDNinterface which is the last interfaceand that is the IP that this pod hasactually today1031032.5 right so this is the interfacethrough which all the pod traffic isgoing to go out for your blue pod in thecluster these are the routes inside thepod and the 1096 that you see is yourcluster cider so services are going towork through your UDN1 interface rightso that is the only takeaway out of allthese routes so to answer one of thequestions that was being asked rightwhat can I have more than one networkyes you And that will have like a routespecific to that secondary network it'snot present here but then you can seethe ETH0 the UDN1 all these interfacesand routes but at least the two keythings here is that all default trafficfor a pod will go out its UDN interfacethe services and everything will also bethrough the UDN interface the onlyreason why there's an ETH0 interfacehooked to Kubernetes default network isfor cubelet probes to workand this is like a nice view of what arethe networks on this pod so inspect thepod blue right you can see that ro isprimary for the first so look at thefirst one i'm talking about the firstone so there's ro primary the IP is 2.5and the MAC address is there for thatspecific interface which is the UDN0interface setup and all the routes inthat pod are also visiblethere and then you can see that there'sblue one and blue one parts that it wastwo replicas right so there's blue zeroand blue one and you can see that thethere is the second part of it isinfrastructure locked network which iswhat I was trying to tell you about the10 244 range is for cublet probes it'slocked basically and only probes workthere yeah I'll come in a moment to yourquestion just going to get through someof these in le oftime and I'm going to try to connectbetween my blue pods within the bluenetwork you can see that that works verywell and it's using the primary pod IPprimary pod IP is your 103 pod IP so 1.5is the curl that I did so betweeAn yourblue network things are working fine andI'm going to try to use the 10244 IPlike I said right and you can see thatthat ping doesn't work and the reason itdoesn't work is obviously because onlyprobes work there and this is the greenpod again the green pod has another youknow the pod status it looks like thisbut you don't have to worry about thisIP interfaces inside the green pod youcan see another UDN interface with theIP 2032032.5 you can see the routes inside thegreen pod and all the interfaces so sameeverything I showed for blue isapplicable to green just two differentnetworks so we can also see the twogreen pods talking to each othersuccessfully and you can see that thegreen pods using the kubernetes IP IPKubernetes aware IP will not workbecause only probes work there nowthere's no admin network policies in mycluster no network policies in mycluster no baseline policies in mycluster nothing and this is the setupthat I have on my cluster right now so Ihave a blue network a green network redand yellow are connected to each otherso I have the 103 203 and the 192 rangeright so these are the pod IPs that Ihave and we tried to talk between blueit worked we tried to talk between greenit worked and now we're going to try totalk between blue and green which is thebeauty of this workshop so let's see ifthat worksnow so a pod in blue is trying to talkto pod in green doesn't work right soblue is trying to talk to 203 networkit's isolated by default there's no waythat thisworks what about red to yellow thatshould work right because they're partof the same colored enterprise networkand this was the admin created networkand so that looks good forus and services uh I think we are kindof over time with the pods and I shouldhave given it off to the virtualizationside of things but I do want to at leastdo the theory behind services andnetwork policies right do we have timefor that Miguelokay so let me get back to so any thatwas like very simple right like wecreate networks we get pods we try toping between them if they're part ofdifferent networks they won't talk toeach other they're part of same networkthey talk to each other and now I knowthere's a lot of questions for servicesso let's get back toour menty meter so if you're on quizplease try to log into the website andenter the menty meter code miguel canyou read out the code do you have it theor I can try to doit it's[Music]617692 yep i see a lot of thumbs upthere's 47 people so that's pretty cooland now let's try to see services i'mgoing to try to present the slideshowso this will be our first question butbefore we do that let me just like tryto share some context right so clusterservices are of different types you havecluster IPs node ports load balancersand let's take cluster IPs and simplifyit cluster IP the cider that you getwithin your cluster and it's usedusually for connection from pods withinyour cluster forget load balancers andnode ports to keep things simpleRight now imagine you have a blueservice in your blue namespace a greenservice in your green namespace and youhave a red service in your red namespacea yellow service in your yellownamespace right so we've done with podsnow let's look at how services work withisolation how do you think the servicewhich is your cluster IP range in yourcluster should work with UDNS right solet's go back to your mentimeter i thinkthis is the first question that you seei see people answering this live on thequiz so I'm going to read out theanswers which which one do you thinkshould be each UDN should have its ownservice cider which seems to be trendinghigh or should all your UDN's be sharingthe same clusterwide service cider whichseems to be a little bit on the on thedown part where we have five people nowtalking about this and neither I have abetter idea that I want to share there'stwo people who have said this raise yourhands no don't be shy is it because youdidn't agree with one and two that youknow was it youMikuelkeith somebody said this somebody has abetter idea than the the two options Ihad no okay so how many of you actuBallysaid all UDs will share the sameclusterwide service cider raise yourhands i see a few of youcongratulations that was the rightanswer actually I should have pressedenter here so sorry majority of youactually thought that each UD shouldhave the same service like each UDshould have its own service setter butactually all UDs are going to have thesame service setter in the setup i knowthat sounds confusing but we'll explainto you in a moment why that's the caseif we have the time but if not comecatch me after the workshop session butit's it's still cool the isolation stillworks right you're going to have oneservice cider and and I get whyeverybody gave that option because youhave a different pod cider obviously sowhy not have a different service ciderbut you know because service cider andthe whole control plane is againcontrolled by core kubernetes it's alittle bit like IPAM is done by the CNIso we have more freedom there for theservices endpoint slice controller andthe service controller it's all bakedinto core kubernetes so it's much easierif you actually have a clusterwideservice cider that is shared by all yourUDNs and then it's actually even moreeasier to isolate your cluster IPservices in that way so now if you goback to our slideshow and keep goinghere imagine that now you have thisright so a blue pod is trying to talk toyour green service that's the gray arrowthat I see here which I think is yournext question on menty meter so keepgoing on your quiz to the next page ithink I should probably change it hereyeah what color do you think that shouldbe so a blue pod is trying to talk toyour green serviceshould it be green it works should it bered no it shouldn't work because youwant it isolated neither it should beuserconfigurable or maybe it's becoming toocomplex that's alsofine i see more people giving thumbs upi'll wait for a few secondsjust to repeat the question the blue podis trying to talk to your green clusterIP so cluster IP the green cluster IP isin your green network the blue pod isobviously inside the blue network soit's cross network communication overservices oh I still see a lot of peopleworkingit's actually kind of close are you Areyou trying to make sure that it's 50/50now with thevoting what What do you think what doyou think Miguel should it be red orgreen blue pod is talking to greenservice so Miguel saying it's green ithink it's red because red is the answerbecause I implemented it right so you'regoing to have isolation for yourservices also intact until youinterconnect your clusters orinterconnect your UDNS together so untilyou interconnect blue with green whichis what I said is on the road map youwill have isolation actually so it won'twork but this is actually a goodfeedback and this is exactly why I'mdoing the workshop because I would loveto understand why there's a section thatalso thinks the other way around becauseit's an opinionated implementation rightand we are we would love to hear moreuse cases and and think why the audiencethinks differently but I know we'realmost at time Miguel I'll I'll be donein a moment so give me a fewminutes so coming back to our slide deckso this is actually going to be redso You can see it's red and now I have afew more arrows right if the red pod istrying so so whatever is within the samenetwork it's the same principle whateverapplies to pods applies to cluster IPservices so if the service is in adifferent network you cannot reach itit's not accessible for pods and theother networks it's completely isolatedfor cluster IPs so only the pods withinthe colored enterprise network can reachthe red service or the yellow serviceonly the pods within the blue networkcan reach the blue service and only thepods within the green network can reachthe green service that is exactly how wehave implemented it but if you beg todiffer and want to have this exposedthat is what is on the road map calledinterconnecting UDNS so we'll have a CRDthat explicitly gets that request fromthe user that says I want to expose myservice to other networks so you'll beable to do it and we'll bCe able to getto that in a moment i had slice thenetwork policies but I can't do it nowanymore because I have to give it overto Miguel for virtualization so I thinkI overestimated myself in how much I cancover in 35 minutes but there was aquestion on network policies andUDNS and uh I will just do a raise ofhands because we don't have time to getthrough it right how many of you thinkif you have an allow rule that lets yourblue pod talk to your green pod itshouldwork so the people raising your handsare saying network policies should havea higher precedence than the networksegmentationcorrect well how many of you think theother way around that it should not workthat's the majority the majority isright the network segmentation is whattakes precedence network policies arealso namespace scoped so network scopedthis is how it would look like if youhave a network policy it will only workwithin that network if you have a bluenetwork if you have a network policy inthe colored enterprise namespace therules there will only dictate and willbe subjected to pods within that networkcross networks it's very clear it'salways isolated always segmentedcompletely compliant and with that I'mgoing to give it to Miguel who will betalking about VMs plus UDMs and thankyou everyoneokay so let's begin the Can you hear mein the back okay thank you uh so let'sbegin the virtualization section of theworkshop and we'll pretty much be seeingthis but for virtualizationworkloads this is a brief agenda of whatwe'll be what we will be doing it'spretty standard I guess so we will bejust creating a cluster UDN we will becreating a VM in each of the name spacesand we will do a couple of things firstwe will ensure that egress to theinternet works from within the VMwe will ensure that east west trafficworks as you expect and then we willkind of uh showcase uh virtualization'suh bread and butter which is we willmigrate a VM from one node to anotherwhile it has an established TCPconnection and we will ensure that thatconnection is not broken while migratinguh I will first just uh create the VMsbecause this can take I don't know howlong and I will stop Sura's part of thedemo uh so I need to go back[Music]here go intovert and um that's it right so I willjust dothe I have ascript i don't remember what it does andI'm struggling with the character set ofthe keyboarduh vertworkshop so this is just to show that wehave like uvert installed uh probablywe'll just skip this but yeah this the Idon't have time to explain what theseare so I've created two name spaces asSuya showed before the exact same thingthere is nothing special aboutthis and we have all our name spacesshowing here and I'm showingthe the cluster UDN here as you can seeit was created on the blue namespace andon the red namespace again nothingspecial we've seen this for pods alreadynow this is the thing I wanted to startwhich is creating the virtual machinesand now we'll wait for them to be readyand while they're ready or not I willshow like show visually what we'retrying to do here the thing is we haveagain likeum two types of we have two namespaceswhich are interconnected by the appnetwork and what you can do is you seethat the pods or the VMs that arerunning on these two namespaces caninterconnect themselves if we had it ona different uh attached to differentnetwork they would not be able tocommunicate between themselves like forinstance something on the greennamespace would never be able to accessanything on the blue or red name spacesbut the red and blue name spaces cancommunicate between themselves uh theexample we have just focuses on the partabove so we only are provisioning thehappy network and we only have the redand the blue name spaces because we justwant to show like this happy path let ussee what happens in thebackground still waiting this is notlookingamazing but still let us plow throughand Iwill since we don't have a lot of timeleft I willuh kind of entertain you with a coupleof thoughts so when you migrate avirtual machine uh on again we're usingQvert right so you have a pod which isuh whicDh has a virtual machine in it uhfor instance here you have like podnumber two and the thing is we willmigrate this virtual machine to adifferent node right so this virtualmachine will be running inside of adifferent pod and we are in Kubernetesso we have IPAM in the cluster so myquestion to you is what do you thinkwillhappen when the VM migrates the VM willhave an IPbut will it be the same IP it had beforeso again uh I hope I know how to dealwith this mentimeterthing how do I do that okay here it isso next question and you can vote onthis thought exercise okay so it's hereso please start voting again we're onKubernetes uh we have IPAM on thisnetwork and we are migrating a VM to adifferent node which means that the VMwill be running on site of a differentpod so what's going to happenhere okay prettyclose it's fun to see thischange okay I'll wait a little bit morein the meanwhile I'll check what'shappening here this poor VM is notstarting is the other one oh okay so oneof the VMs is uh running but I guesswhat's happening is it's failing to pullthe image on the for the red one butwe'll give it a little bit moretime and uh okay this is very close butI will kind of disclose the the answerand I just have to press enter Iguess tada and it stays the same so thething is for you to expect that theestablishedum TCP connection to survive themigration to a different node yeah theIP must stay the same otherwise thefivele will changed and you'll get aconnection reset and uh that's not whatwe want to haveso wow oh the red VM is not ready it'snot making me look good but still uhlet's see what's happeningon our cluster uh getVMI where's the dash minus OYAML whatokay both of them are running which isgood and they have IP addresses assignedlet's just take a very quick look atthe at the cluster UDN we have createdmaster userdefinenetworkokay okay and we want to check the happytenant and dashyyaml okay as you can see the subnet ithas is on the 1921680016 and fun thing here topologylayer 2 previously sa showed to us likea layer three and here we have layer twowhat this means and again let's show avisual representation of uhthis essentially what we have is like aclusterwide network like a big switchand everything on the cluster is beingconnected to this logical network whichspans across all the nodes in thesystem And uh as we've seenthe what's happeninghere okayuhhuh our VMI's got our virtual machineshave IPs on on the subnet and now I willstart a demox[Music]session uh where's thequotes well I couldn't find the quotesso it's going to be like this so I'mgoing to to uh enter console into eachof the virtual machinesi'm going to change the Wi-Fi to aprivate one to Oh yeah hey check justquick question can you create the uh UDNfor the name space without setting a CRlike I can't hear it okayis it better now yes it is okay so canyou create the UDM for each nameacewithout explicitly setting a C so likeis it possible to let the Kubernetescluster pick out the Cdo you want a default C is that thequestion or do you want to be able tospecify the C i I want to be able to letthe cluster set a C ide for me insteadof having to explicitly set it for aname spacei don't think the question is no IPAM Ithink so instead of explicitly settingthe CR you want the user to be able todo theIPAM is did I get the question correctum no oh well not sure um let let merephrase so in your example you'resetting explicitly the C4 name space yesbut let's say I I don't care which IP Iget uh yeah you're going to get adefault okay a default which will besomething like let's say the default isuh 10 to 44 a private range but if youdon't care it's totally fine the problemis that every if you have if you arecreating one network per namespace andif you don't set the value and if youget the default which is 10244 right soif the user doesn't set anything we givea cider it's not random so everynamespace will have the same 10244 soyou will have an overlapping part IPwhich is totally fine as long as yourworkloads don't need to talk to eachother ever you're fine but you know whatI mean like tEhe blue and the green bothwill have 103 103 103 103 so if you havepods created in both namespaces they'llall have the same IPs it's actually oneof the neat use cases of wanting networksegmentation right so because if youdon't care about what IPs you want toget you just get that default range andbut the range is not going to be uniqueacross your cluster it'll just be thatyou'll have uh pod IP pods with same IPsacross different networks so yeah sorryI didn't understand the question butthank you for the question yes pleasemaybe you need to use the mic ifeverybody needs to hear youunfortunatelyhello okay um so you have cluster userdefined and you have um userdefined umUDN right yep so let's say you definethe clustered network rightand a user defined a user definednetwork for the same name space do wethen overwrite one of them or how doesthat work yeah so if the admin hascreated a cluster network and the tenantalso creates one i think that was thequestion right yes yeah the tenants onewill not come up so in the status youwill see an error for the tenant andyou'll have an alert or something thatsays there is already an existingnetwork in this name space that has beendefined by the admin so contact youradministrator or the vice versa if thetenant creates it first and so so forprimary networks you cannot have morethan one network for your pod right itdoesn't make any sense to have like thewhole concept of primaries all yourdefault traffic should go through thatnetwork so you can only have one primarynetwork in your namespace so if thecluster administrator creates it firstand then the tenant creates it or viceversa one of them will be stuck in errorstate like the CRD will have a statusthat says network creation unsuccessfulbecause there's an overlapping networkokay so if we use it the OVN then thenit just ignores the network networkpolicies on the cluster thenum not sure I followed the secondquestion so if you create like the wayyou created the green and the yellow umnetwork for example I create a networkpolicy stating the port in the blue onecan go to the yellow one it will justignore it right because it doesn't sohow do network policies work withsegmentation yes does it just ignore itbecause you don't use the defaultnetwork again you use the OVN one rightyes so you're going to get the answer tothat question in a while it's comingokay thank youyeah pleasethank you in the case of creating auserdefined network or cluster creatingnetwork if my pot needs to talk with theKubernetes API server it's possible it'scoming how it'll be I'll be showing itin the workshop also okay the nextquestion if I have two uh workloads eachone in a different name space each onewith their own network can I have IPconnectivity between these two workloadsright now okay the right answer is as ofnow no the whole point is to keep themseparate so no IP connectivity completesegmentation complete isolation forcompliance reasons if you want to haveconnectivity expose them using a loadbalancer and then through the serviceyou can have connectivity but pods bydefault is a default posture of no butwe have something on the road map calledinterconnecting UDNS and usually the usecase is maybe your side range is fullmaybe your cluster doesn't have any moreresources you want to expand your UDNthen you can have another UDN andinterconnect them but we are designingthis still right because it's kind of acomplex problem do we want partialconnectivity do we want fullconnectivity etc so um yeah okay andlast questions uh in the same space canI have two user defined networks orcluster defined networks i mean I have apot uh talking into different networkplanes management and signaling orwhatever yes that is possible only oneof them will be your primary network theothers will be secondary you can haveany number of networks in a given namespace only one of them will have roleprimary every other network will besecondary so your management network canbe primary so where whichever networkyou want the default to go through thatwill be your primary but you can have Idon't know five 1F0 more name spaces thepod will have 10 interfaces and they'llall be plugged to different networks andif for whatever reason I need internallyto have connectivity between these twonetworks can I create a port as avirtual router in order to forwardtraffic between the two networks that isnot possible today but it is going to bepossible using what I mentioned which isinterconnecting UDN so you don't have tohave a pod that's a router we will dothe virtual router that sits betweenthese two networks that will connectthem and then maybe we'll make it moreconfigurable saying maybe it's notreally secure to do podtood there maybeit's part through cluster IP service orsomething right like so that's on theroad map coming soon thanks alot greatso my pods are up so we'll try tocontinue with the workshop uh this isjust the first blue pod that I'm tryingto describe here so it's the let me showyou the command that I did oops it's abig one so it's the app blue zero pod inthe bluenetwork and there's a bunch ofannotations which is basically the IPIPAM don't worry about it what I want toshow is that the pod is healthy and ithas an IP 1024419 which is bad right because wecreated a created a network with 103 103cider but nothing to be surprised aboutthe reason why the status.pod ips areshowing you this IP is becauseKubernetes doesn't know about multiplenetworks cublet still needs an IP toprobe at otherwise your health checkprobes are going to fail so that is theIP that you see here but this IP is onlyused for probing that's it so the pod isstill connected oh well well that's itas in somewhat but the pod is stillconnected to your default Kubernetesnetwork and this will be true until thecore Kubernetes community fixes thingslike probes or DNS or reachability toKPI service and DNS which needs thatinfra network attachment so all your UDNpods will always be able to access DNSservice and cube API service which isrunning in your infra network and cubletprobes will be working through thedefault network interface on the podwhich is the default Kubernetes networkthat's the only use case for needingthis pod IP nothing else so looking atthe interfaces inside your pod right sothis is an example of a blue pod you cansee three interfaces one is the localhost which we don't care about theETH0 the102441 9 is the IP that is used forprobing your pod nothing else is on thatinterface basically everything else isblocked only cublet probes work thereeverything else is going to use your UDNinterface which is the last interfaceand that is the IP that this pod hasactually today1031032.5 right so this is the interfacethrough which all the pod traffic isgoing to go out for your blue pod in thecluster these are the routes inside thepod and the 1096 that you see is yourcluster cider so services are going towork through your UDN1 interface rightso that is the only takeaway out of allthese routes so to answer one of thequestions that was being asked rightwhat can I have more than one networkyes you and that will have like a routespecific to that secondary network it'snot present here but then you can seethe ETH0 the UDN1 all these interfacesand routes but any at least the two keythings here is that all default trafficfor a pod will go out it's UDN interfacethe services and everything will also bethrough the UDN interface the onlyreason why there's an ETH0 interfacehooked to Kubernetes default network isfor cubelet probes towork and this is like a nice view ofwhat are the networks on this pod soinspect the pod blue right you can seethat RO is primary for the first so lookat the first one i'm talking about thefirst one so there's RO primary the IPis 2.5 and the MAC address is there forthat specific interface which is theUDN0 interface setup and all the routesin that pod are also visiblethere and then you can see that there'sblue one and blue one pods that it wastwo replicas right so there's blue zeroand blue one and you can see that thethere is the second part of it isinfrastructure locked network which iswhat I was trying to tell you about the10244 range is for cublet probGes it'slocked basically and only probes workthere yeah I'll come in a moment to yourquestion just going to get through someof these in lie oftime and I'm going to try to connectbetween my blue pods within the bluenetwork you can see that that works verywell and it's using the primary pod IPprimary pod IP is your 103 pod IP so 1.5is the curl that I did so between yourblue network things are working fine andI'm going to try to use the 10244 IPlike I said right and you can see thatthat ping doesn't work and the reason itdoesn't work is obviously because onlyprobes workthere and this is the green pod againthe green pod has another you know thepod status it looks like this but youdon't have to worry about this IPinterfaces inside the green pod you cansee another UDN interface with the IP2032032.5 you can see the routes inside thegreen pod and all the interfaces so sameeverything I showed for blue isapplicable to green just two differentnetworks so we can also see the twogreen pods talking to each othersuccessfully and you can see that thegreen pods using the Kubernetes IPKubernetes aware IP will not workbecause only probes work there nowthere's no admin network policies on mycluster no network policies in mycluster no baseline policies in mycluster nothing and this is the setupthat I have on my cluster right now so Ihave a blue network a green network redand yellow are connected to each otherso I have the 103 203 and the 192 rangeright so these are the pod IPs that Ihave and we tried to talk between blueit worked we tried to talk between greenit worked and now we're going to try totalk between blue and green which is thebeauty of this workshop so let's see ifthat worksnow so a pod in blue is trying to talkto pod in green doesn't work right soblue is trying to talk to 203 networkit's isolated by default there's no waythat thisworks what about red to yellow thatshould work right because they're partof the same colored enterprise networkand this was the admin created networkand so that looks good forus and services uh I think we are kindof over time with the pods and I shouldhave given it off to the virtualizationside of things but I do want to at leastdo the theory behind services andnetwork policies right do we have timefor that Miguelokay so let me get back to so any thatwas like very simple right like wecreate networks we get pods we try toping between them if they're part ofdifferent networks they won't talk toeach other if they're part of samenetwork they talk to each other and nowI know there's a lot of questions forservices so let's get back toour menty meter so if you're on quizplease try to log into the website andenter the mentyter code miguel can youread out the code do you have it the orI can try to do itit's[Music]617692 yep I see a lot of thumbs upthere's 47 people so that's pretty cooland now let's try to see services i'mgoing to try to present theslideshow so this will be our firstquestion but before we do that let mejust like try to share some contextright so cluster services are ofdifferent types you have cluster IPsnode ports load balancers and let's takecluster IPs and simplify it cluster IPsthe cider that you get within yourcluster and it's used usually forconnection from pods within your clusterforget load balancers and node ports tokeep thingssimple right now imagine you have a blueservice in your blue namespace a greenservice in your green namespace and youhave a red service in your red namespacea yellow service in your yellownamespace right so we've done with partsnow let's look at how services work withisolation how do you think the servicewhich is your cluster IP range in yourcluster should work with UDNS right solet's go back to your mentter i thinkthis is the first question that you seei see people answering this live on thequiz so I'm going to read out theanswers which which one do you thinkshould be each UDN should have its ownservice cider which seems to be trendinghigh or should all your UDN's be sharingthe same clusterwide service cider whichseems to be a little bit on the on thedown part wHhere we have five people nowtalking about this and neither I have abetter idea that I want to share there'stwo people who have said this raise yourhands no don't be shy is it because youdidn't agree with one and two that youknow was it youMikuelkeith somebody said this somebody has abetter idea than the the two options Ihad no okay so how many of you actuallysaid all UDs will share the sameclusterwide service cider raise yourhands i see a few of youcongratulations that was the rightanswer actually I should have pressedenter here so sorry majority of youactually thought that each UD shouldhave the same service like each UDshould have its own service seter butactually all UDs are going to have thesame service cider in the setup i knowthat sounds confusing but we'll explainto you in a moment why that's the caseif we have the time but if not comecatch me after the workshop session butit's it's still cool the isolation stillworks right you're going to have oneservice cider and I and I get whyeverybody gave that option because youhave a different pod cider obviously sowhy not have a different servicebut you know because service cider andthe whole control plane is againcontrolled by core kubernetes it's alittle bit like IPAM is done by the CNIso we have more freedom there for theservices endpoint slice controller andthe service controller it's all bakedinto core kubernetes so it's much easierif you actually have a clusterwideservice cider that is shared by all yourUDNs and then it's actually even moreeasier to isolate your cluster IPservices in that way so now if you goback to our slideshow and keep goinghere imagine that now you have thisright so a blue pod is trying to talk toyour green service that's the gray arrowthat I see here which I think is yournext question on menty meter so keepgoing on your quiz to the next page ithink I should probably change it hereyeah what color do you think that shouldbe so a blue pod is trying to talk toyour greenservice should it be green it worksshould it be red no it shouldn't workbecause you want it isolated neither itshould be userconfigurable or maybe it's becoming toocomplex that's alsofine i see more people giving thumbs upi'll wait for a few secondsjust to repeat the question the blue podis trying to talk to your green clusterIP so cluster IP the green cluster IP isin your green network the blue pod isobviously inside the blue network soit's cross network communication overservices oh I still see a lot of peoplehurtingit's actually kind of close are you Areyou trying to make sure that it's 50/50now with thevoting what What do you think what doyou think Miguel should it be red orgreen blue pod is talking to greenservice so Miguel saying it's green ithink it's red because red is the answerbecause I implemented it right so you'regoing to have isolation for yourservices also intact until youinterconnect your clusters orinterconnect your UDNS together so untilyou interconnect blue with green whichis what I said is on the road map youwill have isolation actually so it won'twork but this is actually a goodfeedback and this is exactly why I'mdoing the workshop because I would loveto understand why there's a section thatalso thinks the other way around becauseit's an opinionated implementation rightand we are we would love to hear moreuse cases and and think why the audiencethinks differently but I know we'realmost at time Miguel I I'll be done ina moment so give me a fewminutes so coming back to our slide deckso this is actually going to be redso you can see it's red and now I have afew more arrows right if the red pod istrying so so whatever is within the samenetwork it's the same principle whateverapplies to pods applies to cluster IPservices so if the service is in adifferent network you cannot reach itit's not accessible for pods in theother networks it's completely isolatedfor cluster IPs so only the pods withinthe colored enterprise network can reachthe red service or the yellow serviceonly the pods within the blue networkcan reach the blue service and only thepods within the green Inetwork can reachthe green service that is exactly how wehave implemented it but if you beg todiffer and want to have this exposedthat is what is on the road map calledinterconnecting UDNS so we'll have a CRDthat explicitly gets that request fromthe user that says I want to expose myservice to other networks so you'll beable to do it and we'll be able to getto that in a moment i had slice thenetwork policies but I can't do it nowanymore because I have to give it overto Miguel for virtualization so I thinkI overestimated myself and how much Ican cover in 35 minutes but there was aquestion on network policies andUDNS and uh I will just do a raise ofhands because we don't have time to getthrough it right how many of you thinkif you have an allow rule that lets yourblue pod talk to your green pod itshould workso the people raising your hands aresaying network policies should have ahigher precedence than the networksegmentationcorrect well how many of you think theother way around that it should not workthat's the majority the majority isright the network segmentation is whattakes precedence network policies arealso namespace scoped so network scopedso this is how it would look like if youhave a network policy it will only workwithin that network if you have a bluenetwork if you have a network policy inthe colored enterprise namespace therules there will only dictate and willbe subjected to pods within that networkcross networks it's very clear it'salways isolated always segmentedcompletely compliant and with that I'mgoing to give it to Miguel who will betalking about VMs plus UDMs and thankyou everyoneokay so let's begin the Can you hear mein the back okay thank you uh so let'sbegin the virtualization section of theworkshop and we'll pretty much be seeingthis but for virtualizationworkloads this is a brief agenda of whatwe'll be what we will be doing it'spretty standard I guess so we will bejust creating a cluster UDN we will becreating a VM in each of the name spacesand we will do a couple of things firstwe will ensure that egress to theinternet works from within the VMwe will ensure that east west trafficworks as you expect and then we willkind of uh showcase uh virtualization'suh bread and butter which is we willmigrate a VM from one node to anotherwhile it has an established TCPconnection and we will ensure that thatconnection is not broken whilemigrating uh I will first just uh createthe VMs because this can take I don'tknow how long and I will stop Sura'spart of the demo uh so I need to go back[Music]here go intovert and um that's it right so I willjust dothe I have ascript i don't remember what it does andI'm struggling with the character set ofthekeyboard uh vertworkshop so this is just to show that wehave like u installed uh probably we'lljust skip this but yeah this the I don'thave time to explain what these are soI've created two namespaces as Suyashowed before the exact same thing thereis nothing special aboutthis and we have all our name spacesshowing here and I'm showingthe the cluster UDN here as you can seeit was created on the blue namespace andon the red namespace again nothingspecial we've seen this for pods alreadynow this is the thing I wanted to startwhich is creating the virtual machinesand now we'll wait for them to be readyand while they're ready or not I willshow like show visually what we'retrying to do here the thing is we haveagain likeum two types of we have two namespaceswhich are interconnected by the appnetwork and what you can do is you seethat the pods or the VMs that arerunning on these two name spaces caninterconnect themselves if we had it ona different uh attached to differentnetwork they would not be able tocommunicate between themselves like forinstance something on the greennamespace would never be able to accessanything on the blue or red name spacesbut the red and blue name spaces cancommunicate between themselves uh theexample we have just focuses on the partabove so we only are provisioning thehappy network and we only have the redand the blue name spaces because we justwant Jto show like this happy path let ussee what happened in thebackground still waiting this is notlookingamazing but still let us plow throughand Iwill since we don't have a lot of timeleft I willum kind of entertain you with a coupleof thoughts so when you migrate avirtual machine uh on again we're usingcubvert right so you have a pod which isuh which has a virtual machine in it uhfor instance here you have like podnumber two and the thing is we willmigrate this virtual machine to adifferent node right so this virtualmachine will be running inside of adifferent pod and we are in Kubernetesso we have IPAM in the cluster so myquestion to you is what do you thinkwillhappen when the VM migrates the VM willhave an IPbut will it be the same IP it had beforeso again uh I hope I know how to dealwith this mentimeterthing how do I do that okay here it isso next question and you can vote onthis thought exercise okay so it's hereso please startvoting again we're on Kubernetes uh wehave IPAM on this network and we aremigrating a VM to a different node whichmeans that the VM will be running onside of a different pod so what's goingto happenhere okay prettyclose it's fun to see thischange okay I'll wait a little bit morein the meanwhile I'll check what'shappening here this poor VM is notstarting is the other one oh okay so oneof the VMs is uh running but I guesswhat's happening is it's failing to pullthe image on the for the red one butwe'll give it a little bit moretime and uh okay this is very close butI will kind of disclose the an answerand I just have to press enter Iguess tada and it stays the same so thething is for you to expect that theestablishedum TCP connection to survive themigration to a different node yeah theIP must stay the same otherwise thefivele will change and you'll get aconnection reset and uh that's not whatwe want to haveso wow oh the red GM is not ready it'snot making me look good but still uhlet's see what's happeningon our cluster uh getVMI where's the dash minus OYAML whatokay both of them are running which isgood and they have IP addresses assignedlet's just take a very quick look atthe at the cluster UDN we havecreated cluster userdefinenetworkokay okay and we want to check the happytenant and dashyaml okay as you can see the subnet ithas is on the 1921680016 and fun thing here topologylayer 2 previously sa showed to us likea layer three and here we have layer twowhat this means and again let's show avisual representation of uh thisessentially what we have is like aclusterwide network like a big switchand everything on the cluster is beingconnected to this logical network whichspans across all the nodes in thesystem and uh as we've seenthe what's happeninghere okayuhhuh our VMI got our virtual machineshave IPs on on the subnets and now Iwill start a demox[Music]session Uh where's thequotes well I couldn't find the quotesso it's going to be like this so I'mgoing to to uh enter console into eachof the virtualmachines uh well this should not behappeningi guess it's the price to pay for a livedemo it's not workingalongconsole i don't know what's happening tobe honestokay i did notuh very amateurish of me thank youSurya uh where is my Okay cool so I willhave to do the export thingagain i always assume this is setup assumptions are the closest path tofailure honestly so yeah let's do thisagain ver console cool so the passwordis not and the user are really not uhvery complex fedora fedorum uh againhere I'm going to do vert cuddle againfor theblue okay fedorafedorum so the IP address on the bluevirtual machine is the docs.6 addressand the IP address on the virtualmachine is the five address payattention to one thing remember the podhas two interfaces but the virtualmachine only has one so we pretty muchkind of discard the cluster defaultnetwork attachment and we just plump itthe um well the UDN attachment into theVM so I'm going to do two things thefirst of which is to curl to theinternet so I don't know let's go withthis one why am I doing thatand we have egress to the internet andI'm going to do a ping betweeKn the twovirtual machines to prove that there isum east west traffic so zero and thisone is 6 the other one is five so we canshow that we have east westconnectivity and now what I'm going todo I'm going to start a night sessionsession in one of our VMs in the blue VMthis is going to be the server and I'mgoing to start this IP session on port9,000 and I wanted the client to breakif this TCP connectionexplodes what's happening this shouldnothappen wait did something wrong minus saha okay so server started and now I'mgoing to doIP3 minus see 192168.0 I think it was 6 and I want thisto last for I don't know3600seconds and nope so it was a five my badwhy is this happening oh I know i didnot seems I was correct the first timeokay finally I got this started so wehave an established connection betweenthese two things these two things whichare VMs and what I'm what I'm going todo now is I will remember toexport and I will migrate one of the VMsuh and for that I have this vert cuddlecommand migrate and I will migrate let'ssay the blue virtual machineuh blue name space and it's calledblue will schedule to migrate and nowI'm going to do cube cuddle get pods andI will check the pods on the blue namespace so we have and I will watch so twothings are going to happen now first theconnection will will survive but I getejected from the console because wellthat pod stopped existing but as youseen here and uh the middle one andlet's make thisbig what happened is everything is okaybut I lost a bunch of packets i had likethis I guess big hiccup here and I losta bunch of packets but it pretty muchsurvived so it went to a different nodeand I still have an establishedconnection which is the exact thing wewanted to show in thisdemo uh this is a demo part we alreadydid this uh there's the answer that youget the exact same uh IP address and nowI'm going to explain to you i don'tremember know how much time I have leftfour minutes a little bit on how is thisdone if you have any questions startqueuing up therebecause I'm just going to show how weprocess the IP address so we have thisis how it looks uh when you when you getthe information from your interfaceyou'll see that you have the five um IPaddress for instance here you get theMAC address and you have the name of theattachment rightthe way we have to persist this IPaddress is we have come up with a newCRD and for each UDN what we do is wepersist something called an IPM claimthis IPM claim is something that willsurvive for as long as the virtualmachine life cycle i mean we we arepretty much uh tying the life cycle ofthe IP address or the life of the IPaddress to the life cycle of the virtualmachine not of the pod where the virtualmachine runs on and we do that usingthis IPM claim uh CRD so when you createa VM we just provision a an IPM claimCR Kubernetes once it assigns an IPaddress for that I for that uh networkinterface it will just update this CRwith the IP address and as you see itjust matches perfectly you have on thegreen boxes you have the IP address onthe uh red boxes you have the name ofthe interface and on the yellow boxesyou have the name of the virtualmachine and uh I think we are a littlebit out of time so uh if you have anyquestion just feel free to make them nowif we have time no so conclusions timeyeah yeah we'll just take a moment toconclude here and Miguel me and Keithwill be here for any further questionswe're almost at at time but Keith yeahjust a just a really brief summary so inthe beginning we saw networksegmentation outside of the Kubernetescluster here we've seen networksegmentation inside the Kubernetescluster separating name spaces from fromeach other in completely separatenetworks and giving us isolated networksjust like we can have you know isolatedCPU memory storage resources in in inthe Kubernetes cluster so this thisgives us another tool in the securitytoolbox um it still works with networkpolicies those still work inside theUDNS so we get all the network policygoodness but we uh cut down a lot of theoverhead of network policies by havingthese these separate netwLorks sohopefully this was a useful session foreveryone yeah and also we have more onthe road map of both of these upstreamprojects for example doing userdefinednetworks and advertising port IPs usingBGP not having an overlay tunnel foryour east west having EVPN so extendingyour segmentation using VRF light tonetworks for your traditional telconetworking virtualization use cases allof these are coming it's just that wehave only 75 minutes there's no way wecan show all of that so if if thoseinterest you if you would love tocontribute to Ovian Kubernetes orCubeword please reach out to me orMiguel we would love to hear from youand contribution bar is really low rightso if you read our docs or if you seethere's no docs even that is goodcontribution so anything that interestsyou or you want to share with us pleaselet us know we'll be here but otherwiseI think we can take questions till theytell us to you know get out of here soyou can use the mic if you have anyfurther questionsgoing once yes please yes yeah you mighthave to use the mic so that everybodycan hear you it's right behind youwhen you're doing your live migrationthis is for Miguel you're doing yourlive migration at uh your IP address isthe same what happens to the MACaddress is it also the same what happensto the Yeah the MAC address mac addressyes the MAC address is also the sameokay so you have no um ARP updates thatyou have your five topple will stay thesame so it's uh it's okay and like Yeahokay that's the main idea there's in theslides there's more information on howto make the gateway work as well so youcan have like uh persist stuff for towhen you're egressing but we we can talkabout that lateryep go ahead okay before you created uhthe green tenant and you gave it anetwork IP block is it possible todivide that block into different subnetso you want your same name space to havefurther fine grained segmentation andhave more than one yes subet as of todayI don't think we support more than onecider unfortunately but we we want tohear your use case so tell us why that'simportant to haveuh an explanation could be a tenantcreates its own network and just want todivide into different subnets forexample uh I don't know maybedifferentiate the environments and um Idon't know also other so I think in thatcase I one of the workarounds I canthink of is to have that tenant have agroup of namespaces and then do the samebut you know have a side up pernamespace but I'm not sure if that fitsyour use case like you don't have tothink of a tenant as just one namespacemaybe three name spaces are part of thesame tenant you can just ID them withthat label and now that tenant has threename spaces A B C A has oneider B hasone side C has one cider and you canconnect them together as same networkbut okay thank you that would requirethe interconnected thing that we don'thave you can ask the admin to create theCDNR Yep next question that has just onesubnet uh yeah so I think it's a kind offollowup so just with theinterconnectionum is there a way to like skip therouting to have a pot living in twociders so that just one part from onename space can directly communicate withanothernamespace it's like skipping routingi'm not sure I understood the questioncan you say that again so when you dothe interconnect there will be routingfrom one name space to the other isthere a way to just skip the routing forone potskip the routing for more for one pot ohum we haven't exactly designedinterconnecting yet one of the ideas wehad was to do it that way but things aretricky when you put BGP in the mix forexample if you now advertise these twoUDNS using BGP you don't even need arouter they'll just be able to talk toeach other with no effort you just needthe FRR Kubernetes or something set upjust outside or or the DN set runningand it's going to take care of it foryou so that's why I haven't reallytouched too much upon interconnectingDNS and even for the previous questionlike Miguel was pointing out tointerconnect your three name spaces withthe same tenant so it it we it'spossible maybe uh definitely reach outto us and maybe write down that as anissue because we are in the design phaseso if that is something that can bechanged it's possible right like it canbe listed as an alternative way of doingit so thank you okay thank you therethere's an interesting thing about thatthat is if you do not use the router uhso let's say you're not defi you willnot be able to have overlapping subnetsright because they will have the samething so if you have the router we wouldprobably be able to allow that to happencorrect right so it's I mean optionsdrawbacks pro costs and we don't knowwe're looking for it hi so thismechanism this UDNuh which is quite interesting uh but tobe able to use it uh is it alreadyavailable and in in which Kubernetesversions and uh does it work only onprem uh or it's on all platforms UDNS orand great question I forgot to mentionthat it's platform agnostic bare metalany cloud it works the same way for bothVMs and pods but uh this is all upstreamso it's supported from1.32 Kubernetes so if you installvanilla OVN Kubernetes on your clusterwith 1.32 it is available and it shouldwork right so just download the thelatest thing and we are trying to getout a release officially so that you'renot having to download our latest soI'll get back to you on which releasethis is available with Ovian Kubernetesspecifically but today if you just goand pull it down it is there for you totry and run and test it okay thank youand the cube v yeah I I just like toclarify one thing further that thisthing is actuallyum well the way we have for a kubernetesto make cubvert work on the cloud it'sthis like uh the the existing masqueradeinterface that u has does not supportlive migrationuh secondary interfaces like of othertypes like bridge and all that will notwork so on cloud this is what we aregoing for so thank you for that questionum one last question um so the way toknow to allow your OVN to know whichname space it needs to use you setlabels on it right on the name spacesi think you're talking about the coloredenterprise example that I showed with uhyeah but for like the blue and yellowname spaces you also added labels to thename space right uh yes the so there isa label to tell that this nameace shouldbe UDN friendly right I want a UDNnetworkde but that label is very muchexactly the same for any namespace youwant now the to create the networkitself you don't need any labels if youcreate the the the UDN is namespacescope scoped so if you create that inthe blue name space that network appliesto the whole name space yes so let's sayfor example someone deletes I don't knowthe label from the name space or deletesthe user defined um CD then the ports inthe nameace will then go backautomatically to the default Kubernetesgreat question unless you delete yourworkloads you can't delete the UDN thereis a finalizer that we set on the APIand UD will be stuck in delete state itwill say go delete your workloads youcan't delete a network because you havea workload attached to your network theworkloads networking is not mutabletoday so if you have a pod attached toyour UDN it cannot now just get migratedover to another network or somethingokay so I hope that answers yourquestion like you have a final umthere's a finalizer concept in theKubernetes API where you can put adependency resource as if you want todelete this you have to have deletedthat and that and that right so untilall your pods are deleted nobody candelete your networkyeah but there are ways to deletefinalizers rightI'm sorry there's there are ways to alsodelete finalizers from custom resourcedefinition so let's just say for exampleI'm really dumb and I delete thefinalizers do my ports thenautomatically restart and then startusing the default network or be stuck inthat will all be stuck in error statethese are operations that shouldn't beperformed it's like shooting yourself inthe foot literally okay no problemthanks yeah and I think we're over timeway over time thank you everybody forbeing patient with us and for joining ushere and have a great lunch thank you2025-04-15 21:58:28.202215 Z��Z��Js#��KA1c2va5nATmQso hello and welcome to the fun side ofthe climate apocalypse um I'm HollyCumins i work for Red Hat uh my day jobis that I help build Quarkus um which iscloudnative Java uh but I have a pastlife as a consultant uh and so this talkis based on some of the things that I'veseen in in when I was a consultant whenI went that doesn't seem like a verygood idea um so hands up if this isfamiliar i get charged $2 a month fromAWS and I'm too scared to turn it offand too lazy to figure out Yep yep yepthere's hands too lazy to figure outwhat's causing the bandwidth or I getemails from an eight-year-old WordPressinstall and I'm pretty sure it's anon-rem server but I don't know where itis and I don't know what it is and Ican't turn it off but that thing isstill emailing me so a few of you putyour hands up but of course losing $2 amonth to an AWS instance that you can'tfind that's that's for amateurs reallythe the leaders in ourindustry do it a bit better so here wehave Twitter who managed to forget about700 GPUs as one does i mean hands up whohasn't lost 700 GPUs because you justmisplaced them and the thing is thething that's so terrible about thisstory is GPUs are in demand we all wantGPUs and there was 700 of them sat therepowered up using power and not doinganything at all and it's easy to laughat this as something that happens toother people that other people do but wehave all done this i in 2018 uh Ilearned Kubernetes and so I did whatanybody learning new technology would doi created a cluster but I had a bit toomuch work in progress so after creatingthe cluster I forgot about the clusterthat I'd creaR��r#��5AUVPe-rdxK7whi everyone Uh really happy to be herewith you today with Kristoff to talk toyou about Decathan's journey intoplatformengineering First we have a question foryou Would like you to take a couple ofseconds to think about what is yourdefinition of platform engineeringNext we have a number for you And thisnumber is 5,000 What do you think thisnumberrepresents you can scream if youwant Igot posters Yeah 5,000 is the number ofteammates in the Kasan digital And thisnumber is important because it representthe scaN��aq#��yAbFKls7IvzNEi'm going to change the Wi-Fi to aprivate one to Oh yeah hey check justquick question can you create the uh UDNfor the name space without setting a CRlike I can't here okay check is itbetter now yes it is okay so can youcreate the UDM for each nameace withoutexplicitly setting a CR so like is itpossible to let the Kubernetes clusterpick out the Cdo you want a default C is that thequestion or do you want to be able tospecify the C i I want to be able to letthe cluster set a C for me instead ofhaving to explicitly set it for a namespacei don't think the question is no IPAM ithink so instead of explicitly settingthe CI you want the user to be able todo theIPAM is did I get the question correctum no oh well not sure um let let merephrase so in your example you'resetting explicitly the C4 namespace yesbut let's say I I don't care which IP Iget uh yeah you're going to get adefault okay a default which will besomething like let's say the default isuh 10 to 444 a private range but if youdon't care it's totally fine the problemis that every if you have if you arecreating one network per namespace andif you don't set the value and if youget the default which is 10244 right soif the user doesn't set anything we givea cider it's not random so everynamespace will have the same 10244 soyou will have an overlapping part IPwhich is totally fine as long as yourworkloads don't need to talk to eachother ever you're fine but you know whatI mean like the blue and the green bothwill have 103 103 103 103 so if you havepods created in both namespaces they'llall have the same IPs it's actually oneof the neat use cases of wanting networksegmentation right because if you don'tcare about what IPs you want to get youjust get that default range and but therange is not going to be unique acrossyour clust>Ole of our platform We built it toaccommodate multiple thousands of usersand ultimately this means that everyplatform needs to fit its ownorganization So a couple of words aboutthe castman So we are the world largestsports good retailer and as I said wehave around 5,000 u digital teammatesto understand our journey u first wemust understand where we come from Socouple of years ago this was kind of ourorganization in a digital entity We haduh big domains each domains had theirown ops team dev teams everyone wasquite independent from each other andthe moto at the time was liberation andliberation meant the freedom to dowhatever we wanted as long as it createdvalue for the company So it was a greattime We started a lot of projects Uh wemade a lot of great things but there wasalso drawbacks There was alsoissuesandand as an example of this issue uh atsome point an internal audit revealedthat we had more than 20 observabilitytools used across the digitalSo that's the kind of inefficiencies I'mtalking about and the kind ofinefficiencies we wanted totackle Amongst all the projects that westarted and created two initiativesreally are important to us in ourjourney The first one is that we addedthe tech governance because for 5,000people yeah we need some governance andand in the end and it's loosely based onCNCF or Kubernetes with technicaloversight committee and special interestgroups and the other thing is that wewanted to provide a standardized way toprovision and useinfrastructure So this is ourorganization todayThe domain organization is basically thesame with ops teams in each domains kindof closer to the dev now And what'simportant to note is we now have thisplatform based layer and it's composedof the business capabilities platformwhich handles the software componentslike checkout invoicing catalogsetc and dataplatform name speaks for itself and ourentity cloud platform engineering or CPwhich handles the network the bestinfrastructure and tries to build thisplatform for all our digital digitalusersWe initiated the work on our uh goldenpass during the last quarter of uh2020 We did not start with a dedicatedportal for developers and uh as far as Iremember we not even thought about doingthat We started with a platform engineto be honest more as an opportunityrather than as an explicit choice uh atthat time umessentially because the initial teamworking on the topic was closer to opsfolks than to uh backends or front- enddevelopers Uh by the way uh this notionof a platform orchestrator uh was onlymentioned way later uh in September2023 in the volume 29 of uh the saltworks uhtechradar What are the problems that wetry to solve um we observed a trend toreinvent the wheel Uh each domain at hadits own way of consuming the buildingblock of the building blocks sorry ofthe infrastructure for example like thepublic cloud envelopes the networkaddressing plans and other service uhproviders And also there was too muchroom available for interpretation whenit uh when they had to implementoperating model or apply securitypracticesAnother bigissue was that the developers were notreally part of uh the infrastructure uhbuildingprocess So what we provideduh we wanted our platform engine tobecome the way to streamline how tobuild the infrastructure inside theketan We did that by supplying anotheruh level ofabstraction It was uh ready to use byboth developers and ops uh so that theycan contribute together at the same timeuh to the definition of theinfrastructure they uhneeded and there were no uhinfrastructure development skillrequired to use this engine It was allinclusiveuh all-incclusiveobservability security credentialmanagement um IM policies uh firewallrules everything was provided without uhany specific stuff to do and it was alsoascod because we are uh believers in theGitHub way So our engine is named uh 3Sand 3S stands for uh self-service stackYou will see that we are not definitelyuh naming engineers We um allowed ourusers to describe what they want at theinfrastructure through YAML descriptorsand they are able to collaborate througha pPull requestuh we are running Terraform for themunder thehood and uh the outcome is uh GCPprojects and all the the service uhneeded for the resources they requestedBut um there is one big assumedtrade-off with this solution It's thefact that we only provided a way toconnect through internet It means thereis no connectivity with uh the decatronprivate and internalnetwork and it worked mostly as expectedWe observed adoption in almost all thedomain at least the domains that did notrequire an connectivity to the internalnetworks uh and we also gathered verygood feedback from the the developersthat started to use uh the engine Webuild this engine thanks to a core teamof uh 15 folks uh coming from the opsworld and from the developer world Wewere also able to on board uh more than50 contributors from several domains andwe also managed to uh enforce prettyhigh quality uh standards uh even ifthey are not perfect uh because of thecomplexity of the engine we needed toprovide But there were also limits likeI said earlier there were noconnectivity to the internalnetworkUm as we are running everything inside asingle terraform run uh we started toface some performance issues uh in someusecases Regarding the contribution thepretty uh the first steps to be able tocontribute was actually pretty high uhfor some folks that were not familiarwith uh the quality standards that wewere enforcing and the broad set oftechnologies that we had to use uh underthe hood And for the release managementit was really done the hard way Uhmainly because of the high number of uhfunctional tests that we had to do inrelation with uh the richness of theservice catalog that we provided Andthese tests were pretty hard to uhautomate and we did not succeeded toautomate them as much as we would havewantedto SorrySo uh here we are three years later Wehad the beginning of2024 and we reflected on the limitsChristoff just just mentioned and we setthree main objective for our nextiteration of3S and they revolve around these three uthingsproducts focus on developer experienceand tackle this uh internal connectivityissue we have for the products our goalis to increase adoption increase ouruser base So we need a a more robust wayto interact with our users a more robustway to sell our product to our users getfeedback etcWe wanted us also to have a moredeveloper focused product because weneeded a better user experience for ourdevelopers because they were our ourprimarily uh users targeted usersand describing infrastructure into YAMLIt's not that bad or maybe but it'sreally not the best way for developersthat are not keen oninfrastructure And the last one in orderto on board all the domains in theKathon digital well we needed to uhtackle this internal connectivityproblem Regarding the product um as Isaid wanted to increase our users and weneeded something better to interact withthemAs for our ex existing users and or allof our users overall we wanted toimprove our commitment have a betterrelease management uh release more oftenum better releases better um planningfor the better road maps um for the thenext releases And we also wanted tocreate a dedicated feedback loop withour users So we added a discoveryprocess in order for our users to submitfeature requests be able to upvote themdownvote them discuss and exchange withus about theirneeds So our product strategyuh evolvedand we switched to a three productstrategy So we have single regionproduct and multi-reion product Uh asChristos mentioned we are platformengineers and not naming engineers Soyeah SRP andMRP Uh these two products uh stemdirectly from the3S And we added a third product ourinternal developer portal in order towork on this uh user developer uhexperience thingSo SRP and MRP I won't talk much aboutMRP because it's basically SRP with amulti-reion as a first citizen So it'skind of uh declare once deploy multipletimes Um and so SRP and MRP they areloosely based on the 3S engine a bitreworked but what's important is thenetwork part in order to accommodate forthe internal connectivity We completelyoverhauled the netwQork on SIP andMRP Uh new addressing plan newsubnetting new VPNs so on soforth As Christo mentioned uh ths arekind of independent bubbles They canonly expose application publicly to theinternet And in order to be able toreach our internal network from SRP weintroduced an in-house component that wecalled platform score And platform scoreis the component that reworked all thenetwork as I mentioned earlierSo it allows us to reach our internalnetwork from uh our SRP and MRP tenantsand it also allows us to internallyexpose application to this internalnetwork So what is platform score soit's basically terafform provider and anAPI And what's interesting about the APIis that it allows us toprovision a lot of network and securityresources and it allows us to provisionthem independently from the network teamIt's really interesting because beforewe had to submit support request etc etcIt was not the most efficient way to doSo now all the security all thecompliance all the standards they arebuilt into the PFC API and we can we arereally autonomous and can provisionthese resources that are kind of heavyondemand And the PFC API was not the onlyAPI we introduced Uh this was kind of acultural change for us having so manyAPIs and developing in-house APIs uh weadded um provisioning API we added asecret management API and this allowedus to really streamline all theconsumption of our products from youknow provisioning to actual workloaddeployments So SRP looks good rightshould be uh should tackle all the thethe issues we have Well it was notwithout changes and it was also not uhwithout the product beingchanged The first issue we met was kindof a dependency h because when weintroduced the platform scorecomponents well all the other componentswell they basically depend on thenetwork So they had to rework to be towork with platform score and this put alot of stress and strain on the passwordscore team because they had to work withall the bugs all the feature request ofall the other teams at the same time Soit was not a great time for us to behonest and this is in addition to um theusual I would say stress that is put onthe engine team because all thecomponent where we worked a bit and wellthis put also some stress at the sametime on the engine teamThe other issue uh the bigger one thatwe faced is that at the beginning of SRPand MRP we really thought that it wouldbe possible for us to migrate from fromfor users to migrate to from 3S toSRP but as we reworked the network weintroduced changes and many changes andmany more changes and at some point werealized that it wouldn't be possible wewon't be able to migrate from 3S to SRPor MRPseamlessly And that was a bummer for ourusers because if our existing userswanted to migrate from 3S to SRP wellthey had to start from scratch They hadto migrate all the data all the workloadfrom CHS to SRP And this would be costlybecause it would take them a lot of timeAs for our new users or potential newusers the good thing was that they couldleverage our Intel network So good Butthe thing is the users that wereskeptical about the value of 3S comparedto what they were already using you knowbuilding things uh their own way withTerafform directly Well they were stillskeptical about SRP and MRPbecause it was just just like a 3S withinternal connectivity and a couple newfeatures It was not a game changer forthem So they didn't see thevalue So now we are in December 2024which is quite funny because actually itwas after wesubmitted call for paper for thisconference So um we convinced ourselveswe had to make a choice and we agreedthat we would have to choose between twooptions The first onewas either we go back to the 3S engineor do we start to build a new enginesomething like a ne hybrid one uhrelying on the foundation of both 3s andMRP project We also agreed on the facton on on our decision drivers Sorry Umthe two main decision main decisiondrivers were um we really wanted tomaximize the the usage of our platformorchestrator and the second one is thatall the development costs related to thechoice that we would be making wouldhave to be as lower as possibleSo which option do you think we didchoose yeah back to the3S back to the 3Sengine Um if we look at the pros andcons of uh the the fact that we madethis choice uh some strengths we weregoing back to a more familiar ground Soit was uh easier for some of us to to towork on that but there is still this bigmajor weakness related to the fact thatwe cannot uh inter we don't have anyconnectivity to the internal network Uhbut we also found some opportunities Uhthe first one was that we decided totackle some technical depth related tothe this four four years old engine thatwas a 3S engine and we were also able toupdate the portfolio of the 3S thanks tothe fact that we decided to backportsome of the features that we uhdeveloped only for the SRP and MRPproduct But there was also another uhthreat which is the fact that we have tofind how to uh cover the blind spots ofthis uh 3S products uh especially forthe users that do not want to use orcannot want to use the 3Sengine What is our future uh we are nowone quarter after this reboot of the 3sengine uh like we said at the verybeginning we started with the platformorchestrator In the meantime the wholecompany became more mature re regardingits um developer portal expectations Weare only at the very beginning of thelink between our developer portal andthis uh platformorchestrator and we really want torefine the link between uh the portaland the platform orchestrator so thatthis link becomes uh a key level to theadoption of uh the platform orchestratorSo uh now a couple of lessons learns andtakeaways from this journey uh intoplatformengineering First uh something good forus that we intend to keep is ourdelivery management process and ourclonification We now have a a dedicateddelivery manager uh we do quarterlyplanning to um to work out all thedependencies between the teams Um wehave clear road maps of where we want togo and it's uh it's it's really smoothon the workloads on all the teams andit's really alleviate a lot of stress Soit's a really goodpoint Next um we should stay flexibleand we should learn and adapt Learningthrough failure is a key value atDecathon So it's okay for us to go oneway fail go back try something new anduntil we find something that works it'sfine And we think that in our everchanging tech world uh it's a good umgood thing to havebecause we need to know when toaccelerate and double down on somethingwhen know when to take a step back andtry something new in order to get wherewewant This one is kind of obvious butit's easily forgottenWe platform engineers uh serves ourserve our users and our users servetheir business So our job is really tomake their job easier uh by alleviatinga lot of uh complexities a lot ofdifficulties and making them be able tofocus on on the business and what makesthe value for the companyAnd one mistake we we did was that uh weput too much complexity on our users uhthinking that they would be able tomigrate or they would want to migratefrom 3S to SRP and our job is to as Isaid absorb this complexity not put themput that on theirshoulder and the last one this is kindof obvious but it's also easilyforgotten and but it's I think is reallyimportant Resistance to change isinevitable and the sooner youacknowledge that the sooner you will youthe better you will be able to deal withit We are backenders If you don't havethe reference I will invite you tosubscribe to this uh awesome uh platformengineering related news uh newsletteruh written byCasparen Uh he gives this name to uh thebackenders to the group of people thatstart the platform engineering journeywith uh with a platform orchestrator Umwe always try to align our goals withthe maturity of uh the moment and webuild this platform orchestrator fordecline users with the Gatlin users uhwhich is really one of the recurringpoints uh you might have heard uhseveral time since you are at this uhinstance of the cubecube So now we would like to invite youagain to think about what is yourdefinition of platformengineering Thank you[Applause]2025-04-15 21:58:28.776853Sted for two months and thenwhen I went back and looked at it Irealized that I'd created a fairlywellsp speced cluster this thing wasa,000 pounds a month and it was just satthere doing nothing while I did my otherjobs so that was a slightly awkwardconversation with my boss and I I didn'tlearn from that experience so while Iwas preparing this talk I had um a Macserver for Mac Stadium that I was usingfor Maxi but I'd kind of not quitemanaged to make it work and then I'd gotdistracted doing other things likewriting a talk and then I went back andlooked at it and it was £150 a monththat again was just sat there burningCPU doingnothing so how how bad is this problemis it just you know a few quite isolatedbut kind of hilarious incidents likeTwitter losing 700 GPUs or is itsomething more systematic obviously asas rational computer people we shouldn'tbe making judgments based on stories onthe internet we should be measuringmeasure don'tguess but it's kind of hard to measurethis it's kind of hard to quantify howmany things you've forgotten aboutbecause by definition you've forgottenabout them so so this is an actualpicture of a zombie um as you can seeyou can't see it that's the whole pointyou don't know it's there however thereis some research that that people havedone to to try and at least surface thisin a in a statistical wayso in 2015 an organization called theAnthesis Institute did a survey of 4,000servers they found 30% of them weredoing no useful work a couple of yearslater they repeated the survey um theyincreased the scale so they did it for16,000 servers and the numbers werepretty much the same a quarter of themwere doing no useful work and when theysay no useful work I should clarify thatthis is a high bar this isn't theservers were mining Bitcoin or showingcat pictures this is they hadn'tdelivered any information however lowthe quality however valuefree orcomputing services for six months ormore nothing was going in nothing wascoming out and so they call theseservers come servers they're they're onlife support they are they are consumingresources but not doing anything atall but there's another category ofunderutilized servers which I think ismaybe more serious becauseit's less obvious and these areunderutilizedservers and again we see this all thetime a while ago I I was talking to acustomer and they were explaining to methat the way they they managed theirworkload was they ran it as a batch jobon weekends but the server stayed up allweek even though nothing was running umI've spoken to other people and theyhave a system that they only use in UKworking hours but they leave it running247 and again you know we we canquantify this so the anthesis institutefound that 29% of servers were activeless than 5% of the time 5%utilization is pathetic you know thisshould make us embarrassed and worriedand if we can generalize this so we cango from that a third of the servers areonly active 5% of the time and we canconclude that the average server in ourindustry is running at about 12 to 18%of capacity we are completely not takingadvantage of the computing power that wehave available to us and this matters alot because if you have a server runningat 12 to 18% of capacity the cost to youfor the hardware for the cloud is prettymuch100% and the cost to the planet ispretty high as well so even thoughsomething is using very little of itscapacity it will be using between 30 and60% of its maximum power so we have thisdisproportionality between how muchvalue we're getting from it and how manyresources we're putting into it andagain you know we can we canquantify how much this is costing us andthe the spoiler is it's a lot so in 2021there was a study and they found 26billion was wasted by always on cloudinstances if you think what you could dowith $26 billion if you could get thatback you could have ice cream for yourwhole team for like till the end of theuniverse there's so much you could dowith this money and the thing is youknow I mentioned already this this isn'tjust money um the money is is a big partof it and it's not even just Ttheelectricity so even if these things arerunning on renewable electricity there'sstill a problem because every piece ofhardware that we'rewasting has what's called embodiedcarbon embodied carbon is the carbonthat it took to manufacture that softthat hardware and it's it's a lot andthen of course when you have a datacenter that data center needs water tocool it and it's a lot and then ofcourse once you're done with yourcomputer that you didn't really need itgets to the end of the life and it getsdecommissioned and that generatese-waste it generates landfill so we haveall of these impacts that come fromcomputation that we didn't evenwant and we can put this intocontext the the green softwarefoundation have an excellent set ofprinciples the green software principlesand there's three basic categories ofthings that we should be doing toimprove the sustainability of oursoftware or of our IT operations thefirst is carbon awareness this is aboutwhere you run your workload and when yourun your workload which affects whatkind of electricity is beingused the next is hardware efficiency sothis is about elasticity and utilizationbut we'll come back to that um the lastone is electricity efficiency so this isreally about your software stack so it'sabout how you coded your algorithms werethey efficient what stack did you chooseare you running on Rust or are yourunning on PHP that's going to make areally big difference but that middleone the hardwareefficiency is what we're going to talkabout today because that's all aboutelasticity andutilization and the zombieproblem you can call it the zombieproblem which sounds really exciting oryou can call it low utilization whichmakes you sound a little bit moreserious it all depends you know whiwhich youwant how why why does this happeni think there's three threereasons the first isforgetfulness the second islaziness and the last one is fear nowthis sounds kind of terrible it soundskind of negative and judgmental i thinkwe honestly we don't need to be thisjudgmental this happens to all of us sowe can reframe these in a more positiveway there is a lack of institutionalmemorythere are competing priorities becausethere is always more to do than there istime and many organizations have a quitesensible risk aversion but thecombination of all of these is thezombieapocalypse and I was surprised aboutforgetfulness but it was something thatthe anthesis institute when they whenthey did that survey you know they werethey sort of asked the same question whywhy does this happen how could thishappen and their best guess for why aquarter of the servers were zombies wasperhaps someone forgot to turn them offand I think this is exactly what happensand again no judgment managing machinesis hard managing machines has alwaysbeen hard in um this isn't urban legendbut it it did actually happen um it'snot a zombie it's the opposite of azombie um many years ago in2001 there was um a university and theyhad thisserver and it was a useful server butwhen they did the audit to try andconnect the physical server to the tothe to the software they couldn't findthe server um and they eventually foundtheserver sealed behind a wall they had tolike follow the cables and theyeventually found it so you kind of thinkif it's possible to accidentally brick aserver into a wall of course it'spossible to forget a server in a datacenter you know it's amazing that it'snot more than25% another thing that quite oftenhappensis we just don't quite get around todecommissioning because decommissioningis kind of boring and quite often thereason that we need to decommission asystem is because a project ended if aproject ended you don't want to investtime tidying up that project you've gotyour new projected which is what you'rebeing measured and quite often as wellbusiness processes change you know weused to use this system now we use thenewsystem but we kind of don't quite getround to to taking down the old systemand that I think one of the reasons thatwe don't take down the old system issort of the the what if the thefear another another thingU that fearcauses isoverprovisioning because no one wants tobe the person who underprovisioned thesystem and caused an outage so we saywell I won't get in trouble if we run at5% utilization i'll get in a lot oftrouble if we run at 120%utilization and there's really solidtechnical reasons as wellso if you if you have sensitive data orif you are security conscious then quiteoften you're working with isolationrequirements you cannot just take all ofyour workloads and put them in the samecluster you can't just put them on thesame machinethere's technical issues with Kubernetesfor example so when I first startedlearning Kubernetes I imagined that theway we were going to deploy these thingswas that we were going to have onecluster for our whole organization andwe were all going to have namespaces butthe problem is that CRDs tend to applyacross namespaces and so then you'regoing to have conflicts and managing asadmin rights is you know a little bitsensitive and so what's ended uphappening is we've gone from this modelwhere we have the cluster as the sort ofthe base layer that everything is builton to the cluster as the unit ofdeployment because that's safesttechnically but it's also prettywasteful and autoscaling is somethingthat really should be helping us withthis but autoscaling algorithms have abias to optimize for availabilitybecause just like no one wants to be theperson who provisions so that you arerunning at 120% utilization no one wantsto be the person who writes theautoscaling algorithm that causes anoutage that's not what we want ourautoscaling algorithms to dofair andutilization and elasticity are reallyquite related and one of the things thatwe find with a lot of our systems isthat they're not quite as elastic as wehoped for so what we should be aimingfor when we figure out how muchutilization we want we should be aimingfor around 70 to 80% that gives you someheadroom but it's also making goodefficient use of your systembut I mentioned that if the workloadgoes up and the capacity doesn'tincrease then you end up withoverutilization this is the catastrophiccase this is the you are getting sackedin the morning case everybody tries toavoid this and the way to avoid this isto go to this which is safe but veryvery wasteful how do we fix it what weneed to do is we need to have theelasticity so that if the workload goesup the capacity of the system can alsoincrease so when we design systems wereally should be optimizing forelasticity at at every level of thesystem not just one or two because if wehave elasticity when the workload goesdown we can scale the capacity down andstay at that goodutilization so how do we how do we solvethe zombie problemwell elasticity is is part of it butit's only part ofit before we can start looking at thatwhat we need to do is we need to dodetection and we need to do destructionand this actually sounds quite fun ifyou look up on the internet how tohandle zombies what you'll find is thetop five ways to kill a zombie andalmost all of them involve flamethrowersnow how many of you work in regulatedindustries about about a third of yourhands yeah so with regulated industriesthings are a little bit more challengingyou cannot go to into your data centerif you have a regulated industry with aflamethrower and start removing theservers it's I think it's oh I forgetthe regulation but you you can look itup at home so we need to be a little bitmore clever and systematic about how wedo this so often what we end up doinginstead is system archaeology but systemarchaeology is a little bit like realarchaeology it can be pretty tedious andit can be pretty hard um one of theleast entertaining four hours of my lifethat I have ever had um was I had ameeting with the CTO of a UK bank andthey'd sort of summon us in to help themwith their system archaeology and theyhad a spreadsheet and they were goingthrough every single thing to try andfigure out what all the workloads wereand this spreadsheet model I'm sorry tosay is very common um this is aspreadsheet that went round my workquite recently and you can see that Vyouknow we have all of the systems andwe're trying to figure out who owns itwhat's it used for and then you getlovely comments turning uplike it was assigned to someone and theycame back and they said "These assetsare unknown to me i do not know who theowner should be how do you what do youdo with that there's you hit a deadend." Um or same thing you know theperson we thought owned these doesn'town them and has no idea who should ownthem what do you do and so then at thispoint if if the spreadsheet isn'tworking and if the email isn't workingthen quite often the sort of the thenext resortis long emails so probably all of youhave gotten some of these right whereyou know you get an organizationwideemail that says "We seem to be paying aquite a large cloud bill we're fairlysure that most of this is useless but wedon't know what it is and we're tryingto avoid the spreadsheet so insteadcould you just please go look at yourstuff and and turn it off if you need toagain you know this is prettycrude so sometimes what organizationstry instead is tags at least with a tagyou have some metadata that hopefullyallows you to connect the server to thefunction but I have found that taggingis better than nothing but it's notreally as reliable as it should bebecause tags tend to go out ofdate so you could decide that this isall too much like hard work and youcould go back to the flamethrowerapproach and so this is called thescreamtestnow if you are practicing chaos testinghow how many of you are practicing chaostesting a a couple of hands um you canextend your chaos testing and those ofyou who aren't practicing chaos testingyet you can introduce it and instead ofthinking about system resiliency and allof those other things that we want toachieve with chaos testing what we canintroduce is the eco monkey so the ecommonkey is a bit like the other chaosmonkeys but what the eco monkey does isit randomly shuts off a server and itsees who screams if no one screamsyou're okayyou can leave it shut off um I shouldwarnyou the scream is real for this um Iheard I heard a story it was a large IToutsourcing organization and they weredoing an audit of their systems andinside their firewall so in in the areathat was reserved just for theirinternal IT they found this servernobody knew what it was it probably wentaround the spreadsheet nobody claimed itin the spreadsheet so doesn't seem tohave a t purpose let's turn itoff awesome half an hour later they gota very very very very angry call fromtheir main customer saying "The backboneof our network has just disappeared whatdid you do?" Um to which the onlyresponse of course is "Oops sorry aboutthat we'll sort it out."Um there this is why risk aversion is athingum now of course you're not even doingit unless you have an ops for it um soall of the opses can actually help a lotwith this um one of the ones I'd like tocall out is green ops um green ops isincreasing in popularity as a term butit hasn't quite made it yet uh so thefirst time I Googled for green ops Ididn't find anything about it andsustainability um what I found was thisthis is a green ops it's a midsizedtrilabyte that is mostly found inOntario inCanada who knew um it will get therethough uh a more popular ops that isreally really helpful is Finnops soFinnops is framed and marketed in termsof money but of course it also has quitea lot of sustainability benefits so finis really about figuring out who in yourcompany forgot to turn off their cloudbecause if you have that real-time flowof financialinformation you can then have optimizeaway the stuff that's nobody isusing and I really like backstage forthis so with backstage um you can havethe cost insights plugin and what thatallows you to do is bring that financialinformation that normally is only at thesort of the CFO level and actually bringit to the engineers who are responsiblefor optimizing away that cost uh you canalso use the cloud carbon footprintplug-in which will allow you to directlylook at the carbon impact of yourworkloads both of those super superuseful but like I mentionedthat detecting Wthese is only half thebattle because once you've decidedsomething isuseless it's pretty scary to to turn itoff you know this the scream is real umand what this can mean is that peopleoften identify zombie servers but it'sjust too much bureaucracy to actuallyget to the point where you can turn themoff so you know bureaucracy is the isthe friend ofzombies and even at a human level aswell there's a lot of things thatprevent us from wanting to turn thesethings off so a lot of us have thatpersonal risk aversion of you know whywould I shut off my server because Imight need it later and there's acognitive bias here as well so it canseem scary to turn off the servers butit also sometimes is just actually kindof painful to turn off the serversthere's a cognitive bias called the IKEAeffect and what the IKEA effect says isif you put work into making somethingyou feel more fond of it which we'vedefinitely all seen with you know thesort ofthe things that our children make andthat kind of thing or you know thingsthat we make that are really not verygood but we made it and and we like itand so we have the same thing with ourservers that once you've put a wholebunch of work into setting up a systemyou feel quite attached to it you don'tyou don't want to shut it off but thisYou know even though these factors arethere we need to move away from that andwe need to move to a model which is muchmore like a light and a light switch alight switch is the ultimate inelasticity we never ever leave a roomand say I could switch the light off butif I switch the lightoff what if it never comes back on againwe have confidence that the light isgoing to come back on again but withservers a lot of us still have this sortof institutional reflex memory that ifwe turn a serveroff it may never be the same againbecause we thought it was stateless andactually there's state somewhere and sowe just don't want to take the chance sowe're stuck we we keep the servers onand and as well quiteoften getting the server set back up inthe same way is just too much work so weleave the server on so we need to moveaway from this what and in order to moveaway from this what we need is we needsome qualities of service so turning thesystem off and on again it has to befast it has the system actually has towork when you bring it back up and sothat means you need item potency itmeans you need the resiliencyum and I started calling this lightswitch ops um and the good news is likein cloud native we already needed thosequalities of service anyway so we'rewe're threequarters of the way there andI started talking about light switch opsand everybody went and then I sort of Inoticed that now um there's an O'Reillybook building green software and lightswitch ops is in that book and I keepsort of seeing it elsewhere and otherpeople are talking about it so it's it'sa very exciting moment because I thoughtleaders um so light switch ops is reallymake servers make turning servers offand on as easy as turning the lights onand off and make it as low risk and asnon-scary so the first thing is you justhave to get rid of that scary state uhyou have to move away from yoursnowflake servers and move towards amore GitOps model where you've got yourinfrastructure as code and you canreliably spin it back up again becauseif you have that then that means you canspin your server down and you can spinit up and it will be in the right placeand it will have the right thingsobviously in the Kubernetes ecosystemhopefully we're doing a pretty good jobof this um so cube control helps withthis anible helps with this lots ofproducts that help with this and thenthe next thing once you have somethingthat you can spin up and down automatethe turning it off and on again so it'sno use just having the scripts youactually have to run the scripts and theway you do these it doesn't have to befancyso one UK bank they just made it so thatwhen you got a self-service instance itwould self-destruct after two weeksunless you renewed the lease so it wassort of like the Bladeunner of serversthey had a 50% reduction in their cloudco costs um you can just time the shutoff and so a Chicago company they got a30% reduction in their cloud bill justfrom doing thisuh a Belgian school got a 12,000 eurosaving a year just from shell scripts toturn their servers off and on again sothese things can be really simple or nowwe're getting open source projects tosupport this as well so there's aproject called Daily Clean that camefrom AXA France and it means that youdon't have to learn the cron syntaxyourself because no one wants to learnthe cron syntax you can have the frontend um and so the way daily clean worksis it just gives you an extra pod inyour Kubernetes cluster um I quite likethis because it relates to my day jobbecause wouldn't it be terrible if thisextra pod ended up consuming moreresources than you were saving byturning the things off the extra poduses Quarkus um so it's super superresource light which is nice uh there'salso commercial products here so I sawrecently there's one called Turn It Offi haven't used it um but I just saw itand I was really happy to see thatthere's there's more more products inthis area um you can also do autotuningyou can do autoscaling you can do binpacking there are some things that seemlike they should help but don't um oneof these is the cloud so in the cloud wehave a bit more elasticity but it's soeasy to forget things in the cloudbecause they're out of sight out of mindyou know you don't even need to brick itup into a wall to forget it umvirtualization again seems like itshould help but you still have to turnthe virtual server off and a lot ofvirtual servers are still running theoperating system and not doing anythingelse so they're still consumingresources you need to turn it off umserverless should helpum not every workload should beconverted to serverless because it it isa big lift to get there and withserverless what you tend to see is thatalthough the workload itself has quite alot of elasticity there is a controlplane to support the serverless systemthat control plane has much lesselasticity and so even though you havethe autoscaling or the the serverlessscaling on your application you do needto factor in the cost of that serverlesscontrolplane um there are some things thatdefinitely don't help even though itreally seems like they should one ofthese is prevention which is kind ofcounterintuitive because surely surelyshutting the barn door before the horsehas escaped is the right thing to do itturns out that if you put really heavybarriers in place to prevent people frombeing able to provision systems oncethey get that system they are neverletting it go so what you need to do tosupport the elasticity is you need tohave the process elasticity as well itneeds to be an easy come easygo processgovernance model for servers and withall of this what I think is kind ofexciting is that this is a problem butwe're we're starting to be able to solveit but we haven't totally solved it yetand that means that there'sopportunities for all of us to innovatethere's opportunities for all of us tomake a difference and if we do thatwe'll be saving the world which is kindof awesome and we'll also be savingmoney which tends to make our employershappy so it's a win-win so if you're auser I would recommend always try andget that utilization up aim forelasticity try and limit your cubesprawl and desify know what you're usingand turn it off when you're not using itif you're a toolcreator make sure that your toolsupports better utilization again byhaving that built-in elasticity and thatsupport for multi-tenency so that peoplecan do the bin packing and supportdesmification so make sure that yourtool makes it easy for people to seewhat they're using and easy to throwthings out um and with that thank youvery much the slides are the left QRcode um and feedback is oh actually no Idon't know what I'm p facing theopposite way one of those QR codes isslides one of those QR codes is feedbackwhich is very helpful to me and to theorganizers um and you can find me at RedHat and on Blue Sky thank you very much[Applause]2025-04-15 21:58:29.595456Yaso been here almost threeyears um prior to that was working atweworks and all of their open sourceprojects and we're going to start with alittle story that's definitely not basedon fact or any of our previousexperiences at all and we've definitelyjust made it up for this conference talkuh there's a new product in the market anew startup called ai i'm a littleteapot and jake and i have been roped inas new members of their infrastructuredevops s sur platform team to startbuilding out ways for their developersto be moreefficient so we start with our teamsthey need a way to deploy workloadsdatabases anything into environments howare they going to do that yeah just asingle kubernetes cluster it's a prettyeasy thing to do when you're gettingstarted maybe you go to the cloudprovider of your choice do some clickops in the ui and get a cluster andyou're good to gookay we quickly realized that onecluster is not really going to do itwe're going to need production we'regoing to need development they're goingto have different requirements differentscales different compute power stillpretty easy to manage right yeah not nottoo difficult just do the same sets asbefore get another cluster it's a bitmore operational overhead but it's nottoo intimidatingokay we've had an outage we've realizedwe need to be a little bit more robustand resilient we want to look intoavailability zones so that we've got areplica of these things across ourdifferent environments now yeah okayyou've just doubled the number ofclusters again bit more operationaloverhead you're starting to feel a bitmore like a platform team looking afterthese clusters maybe you start to thinkabout other tools in the ecosystem youcould use to manage these clusters butyou know it's still early days saleshave been great this year uh we wereoriginally an eu only company now we'vejust sold a massive amount of work tothe us uh they have way different rulesway different regulations we want tokeep this data completely separate let'sput things in different regions now okaythe complexity of the platform isdefinitely creeping up you're definitelygoing to want to start using some toolsto manage the clusters maybe a terraformmaybe uh using some open source projectslike uh cappy or uh you know any of thetools that are available in theecosystem to manage the clustersuh okay there's an ml team that's comeup to me they've got a bunch of freeazure credits uh they've realized thatthey've spun up a bunch of kuberneteslooks pretty much the same as whatyou've done in aws right can you putthat all under the same platform nowyeah it's definitely adding thecomplexity a lot right we know thedifferent clouds all offer kubernetesbut they have different authenticationmodels um they all have subtledifferences in how they do various partsso you know it's not easy managingmultiple clouds definitely thecomplexity of the platform is growingand it grows even further when weacquire a new ml company that are allbased on prem they want complete controlto build out their models they've alsotold us it's just kubernetesyeah this is definitely gettingcomplicated now multiple clouds on premit's just uh it's a lot of complexity tomanage for a platform team and thisproblem grows even further when you getbigger and more effective and moresuccessful as a business so you're notjust acquiring different teams yourteams that are working already are doingweird things with your software they'repulling out lambdas they're working onthat on-prem environment with not justthose kubernetes clusters but you've gotvirtual machines you've got yourdatabases on there you're thinking aboutedge compute now maybe terraform puppetanible different ways of orchestratingthese things that are all being managedin different ways and it goes a littlebit crazyso maybe a quick show of hands haveanybody else in the audience experiencedthis process of you know a slowlybuilding platform or complexity oh shootlike half of you oh awesome i'm sorry soso this is why we built um a frameworkcalled kredics um it's an open sourceproject to try and make it easierZ tobuild better platformsand we've actually learned a lot whiledoing it not just in the kubernetesspace but in the product space and allthe other technologies we work with sowe want to give you some lessons somedos and don'ts so that you uh hopefullydon't make the same mistakes that we doand uh hopefully learn somethingand the first don't that i want to talkabout and it feels very relevant givenwhere we are today is don't reinvent thewheel use theecosystem uh you've probably seen thisabout a dozen times already uh thisslide this is 215 projects i was able tocount and they're across all differenttypes of areas so you've gotobservability things you've gotdeployments all sort of kubernetesrelated or adjacent and they solveproblems that exist already they've donethings for you done it the hard way sothat you don't have to rebuild the samemistakes that people have alreadymade but something to remember given howcomplicated it is is how do you pick thetools to use how do you choose the rightframework and in our case we're aplatform so i want to read this ihaven't actually seen it this week thisone usually comes out all the time andit's the evanbotcher definition ofplatforms and he says that a digitalplatform is a foundation of self-serviceapis tools services knowledges knowledgeand support which are arranged as acompelling internal product so thismeans we should be thinking about thingsa bit more holistically and alsoremembering what are those things thatwe want to provide as a service what arethe tools that make sense for ourplatform users and how do we want toprovide that to people when we pickthose things from the ecosystem thatalreadyexist otherwise you could end up with amodel like this and this is somethingthat i have personally fallen foul ofmyself and you try and do everythingthat a platform provides on your own youtry and build your own guies andinterfaces for your users becauseactually you think you know best aboutwhat they want you try and incorporateall the different novelties of the ci/cdsystems that exist in the ecosystemmaybe not really knowing which one'sbest for you try in use your owninfrastructure as code or even you buy aproduct that claims to do all of thesethings out of the box for you easy peasyand you end up looking like the gremlinafter midnight that's covered inwater so instead of trying to do it allyourself you can innovate on top of thatwheel so we've got all of theseecosystem projects already what are yougoing to add that's new and for us inwe've got this concept of a promise andthis is your way of providing anythingas a service and we think it's got somekey bits in it like the api how dopeople interact with the promise somedependencies workflows and rules aboutwhere it ends up and how we've built ontop of the wheel is this is just a yamljust a crd definition looks veryfamiliar to people in the ecosystem andit means we can fit really nicely sowe've got this specialism that we thinkwe have as a product which is aroundplatform orchestration so how do youstitch all of those lower levelinfrastructure components together andprovide them as a service to your endusers this is where we've got ourspecialties so that's where we want toput all of our energy and we're a nicefluffy gremlin in there while leveragingthe power of backstage port headlamp asgooies while making sure that theorchestration is done by kubernetesbecause that's way better than we coulddo it and has a massive wealth ofknowledge built up already and we'renever going to be a cloud so why wouldwe bother doing something as good as awsazure or gcpalready however we've tried to get ridof some of that complexity how but westill have to think about multiclusterso don't underestimate the complexity ofgoing multicluster because jake's goingto tell you some of the challenges thatcome with that yeah so when buildingthis framework we wanted to supportdeploying workloads to multiple clustersand we started out with a simple usecase we've seen a lot you know oftendevelopers get access to a kubernetescluster they start deploying someoperators to it mayb[e a cubeflow andthen they start using that operator toget some value so you know here to traintheir machine learning um data and thisis pretty great they start consuming theplatform more and more they're a veryhappy uh engineer they start using thecluster even more installing moreoperators getting more value out ofkubernetes and using more things in theecosystem this then grows you havemultiple people across multiple teamsall using kubernetes to get value andit's pretty great but you realize havinga cluster per person doesn't really workso then you start to go multi-tenant youhave different people installingdifferent operators different softwarestacks into kubernetes um and you startto hit some problems you know peoplestart requiring different things in thecluster people want to upgrade thekubernetes cluster um some peopleaccident and delete each other's stuffwhich i'm sure we've all had um and itquickly becomes a bit of a mess so ourmain idea is that you don't want to begiving your developers direct access tokubernetes to do all of these thingsyou'd rather just give them access to aplatform and the platform will take careof it for you so the basic idea is aplatform engineer would want to providesomething as a service on the platformso in this case maybe they want toprovide cubeflow as a service so theywould go to the effort of writing whatdoes it mean to run cubeflow at mycompany and capture that in a promiselike cat was talking about and installit onto the platform so what does thatmean well in this case cubeflow beinginstalled on the platform should go andschedule out qflow to all of mykubernetes clusters where i want qflowto run now there's some requirements tothat right it might be cubeflow onlyruns on a case in a particular versionor only clusters that have particulargpu nodes um but you know that that'swhat we need to support so in a worldwhere this works this will be great acustomer could come to the platform andsay hey train this model for me you theplatform can then decide okay i'm goingto schedule it to this cluster basedupon the requirements that they've setand this is great you can then get loadsof people come into the platform sayinghey do this thing for me give me thisservice and the platform can take careof the complexity of deciding where toschedule it to so this is the world wewanted to get to um and then we startedthinking about okay how do we want to dothis how do we design a framework thathandles the scheduling to multiplekubernetes clusters for you so westarted off with a pretty simple ideait's like okay we want the platform tohave access to these clusters you couldjust do it with the api like give theplatform cluster all of the credentialsto talk to all of the clusters um inyour fleet there are loads of tools inthe ecosystem today to manage this youcould do use things like commada or argocd um but they all have some drawbacksin our case we didn't want to rely onconnectivity to all of the clustersmaybe you have kubernetes clustersrunning in the on the edge maybe they'rerunning in airgapped environments umacross different clouds and having acentral place that has credentials thatcan talk to all of them is not going tobe veryrealistic so then we were like okaygitops is a very powerful uh tool that'sbeing very widely adopted what if wejust said rather than havingconnectivity to these clusters we justsay what we're going to do isorchestrate writing the files to thecorrect um git repositories or s3buckets that can then be synced uh tothe remote clusters by the tool of theirchoice you know they could be runningargo could be running flux they could berunning their own uh loops they know howto read the files from git and convergebut it it's pretty flexible right wedon't have to have connectivity you canimagine somebody literally going intoair gap cluster plugging in a usb stickcopying some files across to the gitdirectory and everythingconverging so we're pretty happy withthis and we realized we need to start tofind a way to distinguish these clustersfrom each other how do i know what thiscluster is goo\d for uh so then we werelike okay this has been solved a littlebit you see nodes in within a clusternodes have labels pods select where theywant to go to based on the labels on anode this is quite a good idea wedecided we would reuse this for ourscheduling so here you could say thiskubernetes cluster belongs in thisregion it has gpu nodes enabled maybeyou say what version it is whatoperators deployed um just a way todescribe the usefulness of theseclusters so coming back to our initialuse case we this was pretty good youknow a platform engineer could come tothe platform and say hey schedule thisoperator so flow as an example to all ofmy clusters that have gpu nodes enabledwe could then filter and go okay i knowthese certain clusters have this labelthat says they have gpu nodes let me goand write to my state store to my gitrepository or my s3 bucket the files torun cubeflow and then the githops toolson the clusters can pull the files andthen run the code and it's good my fleethas been prepared it's now ready toaccept requests from the user so thenthe user can come to the platform andsay hey train this model for me and it'sgreat the platform notices okay i knowwhich clusters um have gpu nodes i knowwhich ones have keyflow installed let mego ahead and take this workload write itto my git repository and have thatsynced down and everybody'shappy but we quickly realized that whilethis is a really great um idea there aremore complicated use cases that start tocome up so let's say you want to trainum you want to train this model and youwant to have a dashboard set up for itbut you don't want to send a dashboardto a cluster that's really expensivethat has gpu nodes you'd rather send itsend it to a separate cluster maybe it'sa bit cheaper um and with the way wedesigned it to begin with that wasn'tpossible because we were only thinkingabout scheduling one document at a timeso then we had to continue our evolutionand go okay actually we want to supportscheduling multiple documents tomultiple different places all at thesame time for those more complicatedworkloads um and you can imagine thisfor a variety of things like running adatabase multicluster you might do a noa like a agent to one cluster an agentto another cluster and it really startedto enable a lot of more complicated usecases and then we realized that actuallythis githubs idea is pretty great um andit's actually been widely adopted byother tools so i think some of theobvious ones like terraform and palumipeople know that you can committerraform files of palumi files torepositories and have them be convergedby other tools um but this actuallyapplies to a lot of things so evenbackstage for example you can declare aplace to go read in git um all the filesfrom and it will show it in backstage orthe same for ansible tower um and werealized this is actually a really goodidea like you don't have to use gitopsjust for kubernetes you could use it forother services so that's what we did wemade our scheduling more open-ended moreflexible and said "okay you can imaginea world in which we send a uh m machinelearning model to one cluster we send adashboard to another cluster or we senda request for an s3 bucket to aterraform git repository and we send arequest for a component in backstage toanother one." and we're getting thisnice flow of everything's being declaredin git i can see everything in git andeverything's being converged by theseremote clusters so that's the sort ofthe journey we went on um we thought itwas going to be really easy we realizedthat actually going multicluster is umprettycomplex so after this we were like greatthe next mistake we made is that wethought um we could reduce ah sorry wethe next uh lesson learned is that twotry to do try to reduce the complexityof things that are in your control so inthis case we built this system you knowwrite files to a git have it be syncedby the remote cluster but how are wegoing to get information back from thatcluster a real simple first pass mightbe okay you deploy an agent to thatcluster you give access to an api that'srun]ning on the platform and it pushesdata back or maybe you have it talkingto a remote database somewhere elsethat's then pushing data back into theplatform but we realized that actuallythis is changing the architecture of ourof our software quite drastically rightbefore we were saying we don't needconnectivity the git repository or thes3 bucket is the only source ofcommunication and doing this wouldintroduce more complexity so we're likeokay how can we reduce the complexity ofthe things that are in our control wellwhy don't we just use the same uh methodfor communicating data to the cluster tocommunicate data back now this ended upbeing quite a simple solution we managedto work on it very quickly get it to ourcustomers um very fast and you knowstart getting value very quickly we'regoing to learn as it scales you knowdoes this continue working but we didn'tspend months developing a feature thatdidn't work we did it in days ratherthan weeks and monthsthere are times where we didn't dosomething so uh efficiently though timeto wear some dirtylaundry this is how we design thingsinternally well used to so we had acustomer come to us and talk to about afeature that they wanted oh i've gotthis promise i need to upgrade it how doi know what the current version is howdo i know which version it's going tohow do i make sure that that's alleffective and we were like okay coolwe've seen this problem beforeversioning can be hard dependencymanagement on those versioning versionscan be hard let us think about it so westarted with one of these white boxesand that was original versioningconversation and then we were like ohwait once we've got versioning how do imake sure that the dependencies areright and that i depend on the rightversion that was white box number twothen once i've got those versionschained up what happen if one upgradesand how do i manage those upgrade cyclesand and if i want to upgrade for a patchis that going to be different from theother one that's white box number threeand that's when we kind of called it andwe were like hang on we could keep goinglike this it could be turtles all theway down boxes and boxes of us thinkingabout something before we deliveranything so we're like right okay let'sgo back and do somethingsimple maybe not so much i went back intime and looked over these boards andsaw how much people were working onthese things we had seven people in ourcompany for two weeks working on thisdesign phase and then another threeweeks working on the execution phase andthat's when i was like "okay five weeksworth of work let's see if this has gotthe results we want." we released thefeature and it was absolute cricketscomplete silence for 12 months and thenwe got an absolute onslaught of feedbackfrom customers who were using thisfeature that we didn't realize at whichpoint none of us could remember or evenread what was on any of these boards sowe had no idea what was goingon that was maybe amistake so what do we learn from that soas opposed to the health checks examplethat jake was talking about where we gotpretty fast feedback what did we learnnever listen to your users um no i'mkidding kidding i'm a product manageri'm not allowed to say that i will belynched by marty kagan if i tell younever to listen to your users not rightreally we should have probably beenlistening a little bit closer because ifwe'd have spoken in a bit more detail toa few more of our different customerswe'd have realized it's not quite theburning issue that it sounded like atthebeginning okay instead do the fastestthing to unblock your users maybe thatwould have worked if we'd have donesomething immediately within two days ofthem asking us maybe they would haveused the feature and given us feedbackthat's also probably not going to workthen you're going to end up with a loadof tiny bitty features that you can'tmaintain and upgrade over time and it'sgoing to be really painful tomanage this was my ceo's secret santapresent in 2024 it is a framed pictureof a quote that he says on probably adaily basis if any of you have hadconversations with him this week he'sprobably said this to you as well he'sknown for it and it's life is a seriesof prioritization exercises there is aninfinite amount of work that you couldbe doing so many improvements you couldbe making to your platform to yourproducts in your life everything but youhave to choose what are the right onesfor you to do at that particular time sohe is constantly reminding us of this oflike how do we make sure we are workingon the most important thing right nowand we do that by prioritizing our userneeds with our wider goals because ouruser needs will tell us things like weneed versioning but our wider goals asan organization will help us figure outis that something we want to do rightnow is that something we want to do intwo months six months and fit in with awider picture about what we are doingand how do you figure that out youremember why you started in the firstplace so that picture that i showed youat the beginning where we were talkingabout the complexity of managing all ofthose different clouds on premiseservices terraform edge computingansible that complex landscape is why westarted our company and why we built inthe first place because we wanted tomake it easier for the application teamsto have a unified interface so that theycould have that interface layer butactually there are a lot of productsthat do that already and we were allplatformengineers so we wanted to create craticsto add add abstraction layer so thatyour platform teams can build somethingfor those users that they can havethrough this unified interface withouthaving that complexity on the user sideor the platform team side so you canboth platforms platform engineers andapplication teams get that uni unifiedexperience oh and also other platformorchestrators are available i'm notgoing to tell you who they arethough but drum roll for the biggestlesson that we've actually learned sofarnever assume your users actually knowkubernetes uh quatics builds uponkubernetes it extends kubernetes it hasoperators it has crds um and we thoughtthat people would just take this runwith it and understand it and werealized that that's not the case loadsof companies are still early in theirjourney on kubernetes loads of peopledon't have that much expertise in theirorganization on kubernetes um so yeahthis is a painful lesson that we learnedbut you need to try and make sure um youdon't bake that into your productokay here's the summary slide don'treinvent the wheel there's a massiveecosystem out there particularly in thekubernetes space but do innovate on topof that wheel add your special source ontop of whatever you're building so thatyou can make something better and buildup all those projects that you'rebuilding on top of don't underestimatethe complexity of going multicluster uhjake's point i've counted it was 30slides so that proves how complicated itcan be to go multicluster but do reducethe complexity of the things that werein your control so if you want to gomulticluster pick the bits of it thatare going to be powerful for yourorganization and then don't just buildthings you think are going to be greatbecause you're probably wrong but doremember why you started building in thefirst place because even in my story wesaid didn't get usage for 12 months butwe did at the end of it people wereusing that functionality and learningfrom it so remember why you started sothat you can build the things thatpeople want just make sure you do it atthe right time and also never assumethat people actually know kubernetes idon't think i can shout this hard enoughit's really really difficult andcomplicated thank you very much forlistening to useverybody so if you want to learn moreabout kratics as a product or just wantto chat to me and jake because we're funpeople we've got a booth we're at 641we've got card games so you can comeplay card games with us and one of ourcolleagues will run you through that andalso if you want a demo you can have ademo are there anyquestions no jake and i will hang aroundif people want to ask us anything we'rehere2025-04-15 21:58:30.071962 A �A�4u#��Am8ZnlZTo1OEwelcome uh today I'm going to take youon a journey that uh we took to buildour AI serving platform using operatorsthe challenges we faced and how weovercome those with a design patternso yeah I love Kubernetes and I guessyou also do so if you also loveKubernetes I guess we all loveoperators and let's see what and if youdon't uh perhaps uh you will love itafter thissession so what's operator in a nutshelloperator extends a Kubernetes core APIand in let's say in more descriptive wayand by Kubernetes documentationoperators are software extension toKubernetes that make use of customresource to make applications and theircomponents uh operators followKubernetes principle notably the controlloop so essentially control loop are theexecution logic and resources are thedata and operators here are theautopilot of your cluster that's coolright so how uh the how controllers anduh um resources are doing that socontrollers are checking uh Kubernetescluster and your desired uh state fromyour resource and reconcile these two sothat's uh simply put so now that we knowwhat's operator what is controller whatis resource how should we start to buildour operators so how it started uh soyou go search for a framework and youwill perhaps go to the Google sayingokay best framework for developingKubernetes operators and you willprobably hit operator SDK at leastthat's uh what I what I did and you willgo to start uh with a tutorial how tobuild your first operators and what'sbetter than the uh original website howcan I build one you will start uhfollowing the quick starts how to buildyour operators and you will perhapsbuild your first uh mimcache uh operatorby following the tutorials and you'rehappy so based on this hype you willstart building your operator on top ofthe tutorial what you learned andcongratulation now you have your first_��Xt#��gAAHY4IDlBhzEso we are here to talk to you a littlebit about building a platform frameworklessons learned from developing amulticlusters kubernetes operator so weare from cintaso and we work on craticsthat was one of our originalstickers let me know if you know thejoke i don't really get it um so i'm catmorris i'm the product manager here atcentasto i've been here for about yearand a half two years now and before thati worked on platform products uh atthoughtworks and other companies workingfor five or six years in this platformkubernetes domain i'm here with jakeyeah hi uh i'm jake uh i'm an engineerhere at centX`operator and you're so happy with thatbut this was me uh two years ago howit's going after that the uh controllerlogic get complex very quickly it's notuh it will be so many things that younever anticipated the first day and theCRDs are are like the a stone and whenyou want to change something in thatit's like carving in a stone and it's init's out there it's a central API youcannot change it easily and above allfor the platform engineers it's theheadache that only the creators maytouch the operator code after two yearsand everybody will come to you sayingokay I want this I want the other thingand you are responsible for all of thatand nobody else can do that so uh thiswas uh when we were at this point of thetime that we developed some of thecontrollers uh and operators and we seethis uh irony that cloudnativedevelopers build monolithicoperators i said this is something thatwe should not do so what should we dowith this monolithic operator with thisbigmonolith so we we all all know theanswer usually it's divide and conquerso on the divide part we can havemodular CRDs so we can split or uh uhCRD into a smaller pieces of the ofspecsso that's the uh divide part and for theconquering part we can havemicrocontrollers which each of thesemicrocontrollers take care of thisspecific uh um CRDs which we identifythat they are for a specifictask so what uh a controller does solet's let's recap so controllerreconciles the cluster and in doing thatit means that it needs to translate yourcustom resource to built-in resources orlet's say other resources which in turnand finally will be uh a built-inresource that Kubernetes knows andoperates with that so we want todecouple CRDtranslation from controllerum so that the logic is not u how tobuild this ones and we already know atool that its work is to translatevalues into manifest by using a templateand I guess you can all agree with meand you may already click what what itis and it's helm it's the main job ofhelm to get some values use and com withthe templates to create manifest so whenthere is such tool there let's use thatin our benefit and put it in our uhdesign so with values and template andnow we have our resource which we wantto make a connection for the connectionwhat we came up with was to come up withsomething called trade system so thedeveloper or the application developerwill care about how to car define thecharacteristics of the resource forexample uh the type is stateless or thesize ismedium so they don't care if you want touh deploy it whatever way and translateit to whatever way in Kubernetes theythey will saying that okay this is astateless that's how a developer careabout this this uhresource and we are linking the valuesand templates uh to the resource by thisannotations so I'm going to get in moreuh technical uh terms right now butbefore that uh we need to first uh seethe um structure of the code base so wewill have uh charts I guess they don'tneed introduction values and templatesplushelpers so for the values if we want togo into that like for example for thetype uh we define a a helm value andit's type and it can have a defaultvalue because sometimes people may notset uh uh the characteristic so we sayokay if it it's not set it's statelessand we have some sort of uh conditionsif it's a stateless we know that we needto use a deployment template if it's astateful we do have a stateful settemplate the same goes also for the sizeuh for the size having one default let'ssay the default replica one if it'ssmall a replica is one medium two and uhlarge tree the point here is that you itenables us to have an abstraction levelbetween the size and the actual uhimplementation which is for example thenumber of replicas how many replica doyou wantand this can can be different percluster like for example for t-shirts uhmedium means something in Europe and ifyou go to China medium means somethingelse if you go to US medium meanssomething else but all are medium so youhave this abstraction of calling thisworkload or this service as a mediumsize but uh the platform engineers arerespaonsible for defining what mediummeans in my clusterso now let's see uh when we are doing itthis way how big will be the the mainfile the main file will be as simple asthis like including uh include tradetype and we define trade.ype type in ourhelper and in the next two lineuh we set the we get the uh value fromthe resource and in this case it's astateless and we also need to knowwhat's the default value if it's not setit's also a stateless and then we aregoing to iterate over the uh conditionsto see for this uh value what's thetemplate for stateless we know from thevalues that the template isdeployment and we are going to continueso this is the most important part isthat so now we know that it's a templateis uh deployment now we need to includethat uh and means that we need toinclude trait template deployment whichwe do have it here as an uhdeployment which is just like a helmtemplate we all know that how it's likeso we need to define that and then thisway we managed to uh connectHelm to um to our templates with withvalues in theresourcesso now we do have configurable modulesconfigurable controllers which are whichcan be configured through values andtemplates and all can be done uh notinside the code of a controller notinside let's say you don't need to writea go code or uh you don't need to writejava it's all helm and helm is somehow awell-known so like um devops people knowhow to write uh write this helmtemplates platform engineers knows soit's a common language for configuringyour uh operators or let's saycontrollers so uh is it worth the effortso to have all these abstraction layersthisconnectionsum so if if you're not still uhconvinced that it it worksum and you may say for example that okaymy operator is not that complex I don'tneed this level of abstraction then ifit's not a complex thing I would suggestconsidering helm in total becauseoperators are mostly for complex uhstructures complex uh applicationlogics and if you are still notconvinced I'm going to remind you ofthis libraryunstructured so if you already if youknow it you know how of a big of aheadache is working with a unstructuredAPI and especially for external CRDswhen you have external CRD and evenespecially if it's not uh it doesn'thave the API in your language forexample you're writing in Go and the APIor the other controller which theydefine theum um the CRDs are in Java so you don'thave a way to directly import that andyou have to use onstruct and that's aheadache believe me and you probablysome of you may already know that so Iwould say definitely of course it worththe the effort so let's recap so for thecontroller uh the task was to reconcileuh cluster which is the um actual statewith the desired state of yourKubernetes resource and we managed todelegate this uh translation ofKubernetes custom resource into built-inresources so we are now getting rid ofthis now let's see how this uhcontrollers work with this uhmicrocontroller design pattern so for wewill have a number of u microcontrollersto take care of a specific parts of uhspecs but for that first we need acoordinator which controller calls thatcoordinator and the coordinator knowsfrom the spec how to divide uh thesethings to send it to differentmicrocontrollers and what do we wantfrom the controller here from acontroller we want on values because wehave our templates in our helm so if wehave the values from the controllerswhich they need to understand the logicand then say us okay this is the thevalue that you should set that then wealready have our our manifest creationuh process so we need to get uh thevalues from them and so you can havealso take a we can take a look at how westructure the codebase so it will have auh my controller.go go which is the mainor big uh control loop we do have a cocoordinator which knows based on this uhbig spec how to divide it and to send itto microcontrollers and we will have anumber ofmicrocontrollers so we are stilladhering to the best practice ofKubernetes to put uh all of thesethreads things in one container for theperformance sake but still we do havethis flexibility to add more uhmicrocontrollers to add more uh logic ormodify that for example if you want tochange your HTTP controller ormicrocontroller you know where to findit how to edit it and how it will workso to putting everything uh together Iguess we established that that the uhcustom resource will be linked to thehelm templates by the traits and we alsodo have a controller which reads fromthis uh CRD or custom resource let's sayis it your desired state and using helmand the connection with the uh uhresources to the helm it will create themanifest Now it's time to get an actualstate from thecluster and then comparing these twoapplying the difference to your clusterso the benefit of this is that uh itwill also give us the um separation ofresponsibilities the developers areinteracting with custom resources theydefine uh the resources through uhcharacteristics of course they need tostill fill the spec and the DevOps or uhSR security people know how to uh configthis um this translation using Helm fordifferent types ofclusters and the job for the platformengineer will be to develop thismicrocontrollers and also creating thisuh maintaining the main uh controllerloop coordinationsthis work is part of our uh biggerproject this is the core of one of ourproject which called ACDA forevolutionary changes in data analysis sowe are trying to help data scientists sothat they can serve their models uheasily and our system is built on top ofthis uh operators so what we're missinghere is the actual scientistso the actual scientists in our uhplatform can use a a user interface tocreate this uh beautiful pipelines orthey can use our uh software SDK so thatit can automatically generate uh customresources using uh CI/CDpipelines so to uh to let you know whatkind of operators we do have we firstneed to see what's the applicationwhat's our application so as I said it'sa um AI serving platform that you cancreate yourpipelines for example we do have a adata source it's a module which sendsdata to uh an LLM then we do have avalidator taking the stream of data andthen perhaps validate or reject and giveit some score and with this techniquethat we do call is let's say umgeneralization on top of the thing so wedo we see all of this as link we canalso have loop backs so we can also sendthe data to to a planner that based onthe evaluation can say okay what shouldbe the next prompt how we can tune theprompt so that it works better for thenext uh data based on these evaluationsso the whole thing we call it a pipelineand we need a pipeline controller forthat the the boxes are modulecontrollers modules and we do have amodule controller for that and itdoesn't matter if how do you want todeploy that because we have a versatileuh controller that can pick any kind ofmodules and translate it in any form sothat itruns and for the links we do have a linkcontroller also this the same designpattern and on top of that you can alsoconfigure the modules for example tochange the uh threshold here and see thedifferent results on thefly so at the end you don't need so manycontrollers you need configurablecontrollers you need uh controllers youcan config the way you want fordifferent types of resources fordifferentenvironments so this is how we built ouruh operators they are easy to developconfigurable maintainable and we do havea clear responsibility for each uhperson in this organization so therewill be nodispute so I hope by now you also loveoperatorsmy name is Mustafa Hadadian i'm CEO andfounder of Kardell kaidell stands forcontinuous AI delivery and I'm alsofinishing my PhD at University ofCoroning i used to be a data engineerdata scientist leading this uh team ofdata engineers team of data scientiststogether uh leading the projects andbeen also a platform engineer so I I sawthe whole spectrum and try to buildsomething that all of the stakeholdershas a saying in this platform so theyeach can know how they can contribute tothis uhapplication so if you like uh you canscan the QR code and let's stay uhconnected thank you[Applause]2025-04-15 21:58:30.832094cnew resource which islooking to land in a Kubernetes clusterUm that's the sorts of uh things thatpeople are currently running policy onOper is your judge in this scenario andtakes that JSON message runs some Regopolicy and returns to you a JSON valueas well So it's JSON in and JSON outAt the same time um Oper is is loadingin policy and data in order to make sureit's ready to make those decisions Sohow that looks is uh while Opra'swaiting for requests it it makes thathas the most recent version of thepolicy bundle that you're you'veconfigured it to use uh and it's at thesame time also sending information aboutthe decisions which it's made uh to adecision log store if you've configuredit to do so uh this is useful forauditing use cases but it's also helpfulfor uh debugging uh and monitoringtoo So why does policy as code work uhthere are lots of people using policy ascode today Uh we see them as maintainersof the project They're using Oprah to doit Uh why does it work for themuh policy as code allows you to decouplepolicy logic from your applications andfrom your the services and platformsthat you're building and to standardizeit And this allows developers to focuson building uh business value in theirapplications without needing toimplement the same policy controls thatperhaps others in the organization havealreadybuilt Similar to how you work on othercode a policy as code can be versioncontrolled and collaborated on usingstandard tools like um GitHub and so onwhere you can use pull requests to umcheck that changes to policy are as youwould expect and have gone through thecorrectchannels It's also possible to sharecommon security policies that yourorganization may have between teams uhsimilarly to how you may share a sharedlibrary or something in otherapplications Policy is code works inmuch the same way Um so as as a finalpoint and something that we'll touch onmore as we go through um code is issomething that you can staticallyanalyze and you can use uh softwaredevelopment tools to help you work oncode and uh that's something we've madesome good improvements to for for Regoin recent years So um that's animportant benefit aswell I want to highlight some of usecases or highlight uh some others whoare using OpE and talking about itpublicly Uh there are some people whoraised their hands at the start of thesession like if you're interested inspeaking about OPER or would like topromote OPER the use of OPE within yourorganization please do get in touch Umwe didn't prompt these people to comeand say great things about OPE Uh thisis a a quote from a presentation at thecolloccated ISTTO day last year Uh Operaprovides us a generic way to applypolicy consistently across all ourservices and systems Um that's exactlywhat we're trying to do with Oprah andthis this is a great talk and use casefor our project if you're unsure abouthow to use it and how it fits inSimilarly yesterday uh alsocoincidentally from Bloomberg um anotheranother great session when the recordingis out I'd encourage you to watch it uhtalking about how Oper allows them todecouple access control logic from therest of their services This is for adata access and sort of data servicethat they're using Oprah forSo just to give a quick overview of somecommunity highlights uh like uhsomething that happened last week Iwanted to share We had a a a questionasked the the asker had started the Ohthat's gone out of line That's a shameAnyway they've fallen in love with OperI wanted to highlight this sort of funone of the fun things about being amaintainer is you get to see these uhsee these queries come in and um yeah pepeople do love Oprah and um it's it'sdefinitely a highlight of being amaintainer on a project is to see thingslike this But in the last since lastwe've also passed 10,000 GitHub stars Sowe've got uh more than 10,000 internetpoints now So you can congratulate theproject on that Um but maybe slightlymore tangible is that we've um made somesignificant improvements to the um theRego debug adapter protocol support Soin in the Regal Llinter language serveruh our colledague Johan has uh has addedsupport for the debug adapter Uh this isan important usability feature It'sexactly what I was talking about howwith code you can use programming toolsto to work on it and then uh improve itmore effectivelySo I'm going to give a a short updateabout OPE gatekeeper as well Um we havehad some slides and Anders and I workmore on the core Opera project and ruleengine Uh there's a sibling project inthe open policy agent or called OperaGatekeeper I'm sure some of you here areusing it or have heard of it at least Uhand it allows you to um configure policycontrols for Kubernetes admission usingcustom resources Uh so yeah they they'veshared some updates and have alsopublished some new releases since thelastKubeCon Uh one of the important thingsthat you can now do in Opera GatekeeperUm and it's also happened since the lastKubeCon is that we've released Oper Uhthis um consolidates and standardizes anew version of Rego going forward Uhthis new version is now available inOpera Opera Gatekeeper as well uh youjust need to make sure to set yoursource version field On top of thatthey've made some improvements to thepub sub interface as well as uh someimprovements to the gator CLI This is aCLI tool uh Gatekeeper users can use touh perform checks and validations ontheir policies prior to rolling them outSo if you're uh an open Gatekeeper userI'd encourage you to update and checkthose updates outUh they also have some improvements forexporting violations to disk uh and aregoing to be uh graduating theirvalidating admission policy integrationto beta aswell So I made reference to Opera 10That's something um obviously a majormilestone for the project we released atthe very end of last year Um if you'vemissed this there are a few things youmight want to check out to get you up tospeed Most importantly we have someupgrade documentation on the OPERwebsite you can check out And uh there'sa release blog as well where sort ofsets the scene for this uh the sort ofmeaning of this update more generally Uhand also you can read through the recentreleases We're on 160 now as wellUh so before I hand over to Anders Iwant to just give a quick overview ofsome upcoming uh roadmap suggestions anduh give you a sort of overview of thekinds of things we're thinking about atthe moment And um yeah as I go throughthese I want to just remind you thatthese are kind of open for discussionI've tried to provide the links we'llprovide the issue numbers if you do wantto go and look them up Um so yeah bearthat in mindThe first thing we're considering iswhat we're calling streaming tests Uhyou're familiar when you're perhaps withother tools when you run unit tests Uhyou find as you you run the test suitethe results are streamed back to you Uhas as uh as the tests are run and asthey pass or fail uh this is an exampleas a short comparison Here's thecomparison between go test and oper testtoday As you saw the go test uh wouldprint out the the test results as theycame in whereas Opel waits for the fulltest suite to to complete before showingyou the results That's something that wewould like to improve Um this also makesit easier for us to integrate uh withlanguage tooling and provide kind oflive feedback uh for those running testswithin theireditors In addition to that uh we'realso um this was this is one that we'reparticularly keen to get people'sfeedback on uh oper famously doesn'thave like an or operator Um and uh we'realso considering an alternative operatorwhich I suppose is like a kind of umperhaps more familiar to maybe thoseusing languages with things like turnaryoperators to allow you to kind ofattempt something and if it's undefineduse a default instead This is notsomething that that we've uh really beenable to make progress with before but webelieve that both these two features aresomething that would potentially helpthose new to Rego Uh the or logical orein particular is something we find a lotof users expect to find in Rego eventhough it doesn't exist Um and thealternative operator again is um allowsyou to express things that at the momenetare quite verboseSo yeah um in the meantime you can't dothat you should check out this blog postwritten by Anders Um it's one of ourmost popular blog posts on the Styrablog So uh yeah do do have a look atthat in the meantime There's plenty ofways to do it already but at the sametime we're keen to getfeedback Um I might let Anders talkabout this one It's it's one that'sclose to his heart and um he's got someideas about it Sure Uh thanks Uh yeah soanother uh feature pretty high up on theroad map is this a string interpolationand it's is uh exists in many otherlanguages Uh one reason why we have whywe feel this would help a lot is uh thatthe sprint f builtin that many use todayto to include variables or values in intheir in their outputs like when whenyou have a rule and it returns a warningor a deny message or something and youwant to include a variable One problemis that we see a lot and something thatpeople that trips up a lot of people isthat uh they have these uh this rulebody with all these conditions and theydid everything right But uh so thepolicy evaluates up until the last linewhen they do a sprint f in order tobuild that uh deny message or somethingbecause one of the references they'retrying to insert in the string is notdefined So you might say like uh warninguh the user must be of must be 21 yearsold or whatever and so you say likeuser.h but what if you have a user andthere is no age defineduh then the the whole rule is going tofail or it's not going to evaluate Uhbut that that could have been likewasn't really important information Itwas just something you wanted uh in astring So if that would be the caseyou'd rather just print like unknown orquestion mark or or whatnot Uh so it'sit is both a nice new feature It will itwill make for some um better lookingpolicies but it is also an importantfeature uh in in in thataspect Yeah I think I think this is whatI covered right but yeah basically anexample of of that would be uh where youin this case you reference like the thegroups or the tester groupuh and it's not there and and thingsfail and what that wasn't that wasn'treally uh something you you'd expect Soit's and it's frustrating for us when wesupport someone and you see like theydid everything right and they're so it'sit's frustrating on both endsSo yeah just with the road map we putout a video on the Opera YouTube channelrecently uh where we uh where I talkedto Johan who's another Opera maintainerabout um a number of other items on theroad map Uh if you want to get up tospeed that's a great place to getstarted and all of the issues and thingsare linked in there as well Uh so yeahif you want to look into this in moredetail do have a lookGreat All right So uh I'll close thisoff with talking a little bit about uhOPA performance This is a uh something Ihave been working on uh a lot for forthe past 3 months or so Uhyeah Uh okay So so why have I beenworking on performance there might be uhother more pressing issues in OPA and uhyou'd be right It's not the mostpressing issue Opa is generally veryperformant and for normal kind ofworkloads and policies there's there'suh rarely a reason to toto worry aboutit But what we did uh a few years ago isthat we built a llinter for Rego This isRegal which eventually uh which we'lltalk more about Uh but we decided likesince we we we we are re we are Regoguys or we are OPA guys like of coursewe should build a llinter in Regotoo And eventually uh we saw like okay allinter that's going to help a lot ofpeople because it can automate and helpyou uh get support without having to asksomeone you can just write something andyou can do your best and liner will helpyou and tell you what uh you did wrongand and suggest a better way to do it Soso that uh that culminated uh into uhalso making Regal a language server Alanguage server is basically uh it'spart of uh the language server protocolspec uh where you have a a server thateditors like VS Code or uh Vim or orwhatnot can talk to to get informationuh such as linting issues uh but alsothings like autocomplete uh evaluationdebugging and sof on Uh and the goal ofall this and why why we saw performancegoing to is going to matter is becauseof course we want to provide the bestpossible editing experience for anyoneworking with Drago We know it's it'salready a lot to learn a new languageIt's and it and it is a differentlanguage A lot of people find it hard touh get started or and so on So uh whatwe at least want to ensure is like youhave the best possible tools uh as youas you work on that and as you're tryingtolearn Uh so that's that's regal and ifyou haven't tried that I encourage youto dothat Uh but back to performance why whywhy did why was that important for allinter and a language server so uh wehave around already 2 years in we havearound 100 lintrules Uh so all in all we have about15,000 lines of rego So that is alreadythere We're kind of way past what anormal gatekeeper policy would look likeor I don't know terraform validation orwhatever So it's a there's a lot of regoand adding to that the uh this rego isevaluated for any file you send to thellinter right because that's just howit's got to work So if you have 200files which is about what regal itselfis you're going to have to multiply thatwith all these lines that have to beevaluated each time So that that will beabout 3 million lines of Rego and that'salso uh even more uh off the charts of aof a normal uh Rego evaluation and atthat scale of course performance isgoing tomatter Uh but still already withoutoptimizations we found like on a on amodern MacBook Pro uh doing this takesabout like two two three seconds whichisn't too badUh but since since re regal works inparallel and if you run the same uhlimping in like GitHub actions orsomething where you don't have all thesecores to parallelize it's going to be alot slower and it might even be likeminutes because it's we only have uh asingle CPU or maybe two and it's a CPUbound workloadAdditionally the language server isgoing to lint all the time Possibly evenevery time you make a key press in youreditor it's going to send client's goingto send like lint lint this again lintthis again lint this again And if thattakes two seconds you know you'reprobably not going to wait like twoseconds between each key press So thatthat was another aspect where we neededthis to be veryfast Uh so we don't do a whole lot ofwork in Regal ourselves It's basicallyjust evaluating OPA So if we wanted afaster linting we had to have a fasterOPA So that's what wedid Uh and the way we did this wasbasically try and make make senseidentify where is where is it slow orwhy is it slow for 3 million uh lines ofreggo at each key pressuh so identify where uh the hot pathsare and of course it's a go applicationopa so that means reducing memoryallocations is is likely uh the mosteffective way of making something fasteruh another kind of area we focused is toimprove the uh the runtime of opas'sbuilt-in functions there's a lot of themwe mentioned like the sprint f1 I thinkthere's 180 built-in functions orsomething like There's a lot of built-infunctions and of course uh when you doan optimization like this we'll firststart to look into those who are uhcalled the mostoften but there's also otheroptimizations like data structure uhuh yeah what kind of data types are weusing what kind of data structures likebasic stuffreally and a few optimization we did aremostly going to be beneficial for regalBut uh a majority of them are going tobenefit are beneficial for any policyevaluations It's it's it's justOPA So uh from OPA version 70uh which isDecember that that is so essentially uhfrom OPA 1.0 and onwards is where allthese performance uh improvements havelanded YeahSo yeah there we don't think we havetime to go into all of theseimprovements but uh as you can see justfrom the from these uh PR summaries alot of it is about uh avoidingallocations on the hot pathUh and just to provide one example of ofI think we did this previously if youdid and this this code here is from thefrom the count function and you you callcount a lot in Rego because you want toknow how many numbers or how many itemsare in this array or this set So that'sthat's one example and what we did herewe're as we before would say like everytime we we did a count we we asked thecomputer or the runtime for some memorybecause we need a number that we cansend back to the user or so uh if wecalled count 1 million times we wouldask the runtime 1 million times for andthis could have been for the same numberand well whereas now we do interning ofof numbers So if you ask one milliontime we will still and it's always thesame number You'll always get the samenumber back uh or rather you'll alwaysjust allocate uh for for one number inin theruntime And the result of all these workis basically this So um and this is onmy my MacBook Pro 10,000 lights of Regouh from OPA version 70 We were up atalmost two seconds and this is much muchmuch much longer on on in a CI CDpipeline like GitHub actions And from OPfrom OPA version one is when we start tosee like real improvements Theimprovements drop off some but so fromOPA version one to zero we're down toalmost 1 second So uh it's a723% faster evaluation across the boardAnd this of course benefits not just usregal but anyone using OPA contestgatekeeper or orwhatnot And uh yeah the red bars arememory which is not not as much but uhsince we avoid memory allocations itdoes have a a good impact there tooOkay great all the more reason toupgrade If anybody happens to do theirown benchmarking please do share themwith us on the Slack or via some otherchannel We'd love to see if the resultsmatch your own Um so yeah I think we'vegot about five minutes left for somequestions Yeah if you want to takequestions Yeah that's great Thanks[Applause]No questionsThere's some hands uh over hereUh so sorry if this has been part of thepresentation before but I was curiouslike what other uh alternatives youconsidered when uh developing RIO andwhat the thought process was uh whyusing this declarative language and notthe other andsorry what what other approaches wasconsideredUh yeah so the obvious other otheralternative approach would have been tobuild the llinter in in go opa built ingo uh it would perform faster becauselike it we have more things to tweak butuh why we why we didn't is is prettymuch uh because then we wouldn't havehad this we we have learned a lot aboutRego ourselves and like performancecharacteristicsUh but just the fact that uh Regal isalso written in Rego has meant that likewe when we lint Regal itself Sobasically we built a a llinter in Regoto lint the Rego in the llinter wasbuilt in Rego So that there's a lot ofRego and that is that has uh helped usit's kind of dog fooding oruh both to improve uh OPA and Regal anduh the ecosystem at large So if even ifI talked about performance here but wehave fixed a lot of bugs in OPA in thelast uh couple of years that we havethat have emerged in Regal and becausewe have used Rego so much and and ofcourse maybe uh maybe not always forthings that it was intended to but uh soit's been a a learning journey and Ithink it paid off for both us and for umfor all these projectsYeah like I guess also the where thellinter is today where almost everythingis or the rules and the language servercompletions are in um written in Regolike that and when we started theproject we we did uh have some that werewritten in Go rules that were written inGo Uh so we did kind of imagine that wewould need to have we would try and havesome in Rego and some in Go We'veactually been able to get rid of the Goones quite recently So um yeah we'rekind of on on that path now but it wassomething that we did leave open when wewere first working onit Any otherquestions okay all good Uh please do umscan the QR code and leave feedback onthe session uh we're we're able topresent as uh as as a graduated projecteach CubeCon and uh yeah we do read thefeedback that you write and we'd alwaystry and uh make sure that it's whatpeople who are attending the couponevents are interested to hear Uh so yeahand if you're interested to go back toany of the slides the the other link ison the other side Thanks very much2025-04-15 21:58:31.540174 yjy��Sw#��]AZbi46yTlSVoum hello so we are at the Wittmaintainer talk um today we're going todeep dive into one of the most importantuh one of the very important module ofuh wittus initially I'm going to u startwith a very quick intro um to what isand u my name is Rohit Mak i'm amaintainer i work at planet scale i'vebeen maintaining witness for about 5years and I'm Schlomi with planet scaleuh also maintaining for five years anduh contributed in the my scale communityi authored ghost orchestrh��zv#��+AXtA-NKoJDaIwow it's great So many turned up here uhdespite the last being this being thelast session So thank you for thatUh yeah we're uh we're here to talkabout OPA today Uh are you all familiarwith that or show of hands how manyusers how many users of OPA do we havehereabout half Cool Yep We'll do both anintro and a deep dive So hopefullythere's something for everyone Yeah OkaySo do you want to do quick intros yeahHello everybody I'm Charlie I work onthe developer relations team at Styrawith Anders Um we're both openmaintainers and uh yeah we're happy topresent to you today Thanks for comingoutYeah I think you you already I'm Andersand I work in the same team So um yeahI'd like to start these presentationsjust to get people thinking about whatis policy I think we've it's a word thatpeople we're meant to do this to startit off just get people thinking aboutwhat is policy Everybody's got adifferent idea about what policy meansto them or the first thing that comesinto their mind But I think for thepurposes of this presentation and forthinking about open policy agent it'simportant to think about policy as beingmany different things um anywhere inyour different applications you'reworking on or platforms you're workingwith where you need to implementsomething that looks like a rule whetherit's authorizing users grantingdifferent tenants in a Kubernetesplatform access to do particular thingsthere uh defining custom rules and CIjobs uh implementing business policy andapplications all of these things arewithin scope for what we're talkingabouttoday so that's what policy is policyy'srules rules uh and I made reference tovarious technical policies there butoften policy is kind of legal policy orpolicy that might be written in youremployee handbook Um today we're talkingabout policy as code That's what openpolicy agent is really all about Um andthis is just an example showing you knowpolicy in natural language versus apolicy as code As you can see it's uhsomething that you could uh you couldwrite down and express as code That'swhat we're talking about So how does howdoes that's his title That's a shame Umhow does policy as code work so how doesOPE work with policy as code so as asort of simple model is that you providesome information about a decision thatyou're interested to make or have OPERmake for you and you've preloaded OpEwith some uh some policy configurationthat you want it to evaluate uh and youget a decision out That's what policy iscode looks like from uh from a caller ofoperHow that looks in in a sort of morearchitectural uh sense is you have mighthave a request arriving from a user toone of your services Uh that service uhmight be a service you've built It mightbe a service which is um one of ourexisting integrations with OpE uh makesa call into OPER for a policy decisionThat's called a query And this can beany JSON value rep representing uh anoperation which has maybe been taken bya user or a biator and otheropen sourcetools yeah so originally uh Deepti Sigerwho was a tech lead was supposed touh do whatever I'm talking about now butshe had some personal issues you canland the last minuteumso what is Vitus right so that's aquestion that many people have beencoming and asking and we find it alittle surprising because uh if you useslack every message of yours is actuallybeing stored in a vit cluster if you useuh if you use u git create PRs issuesthey all going intovit play u activation games every bulletis going into vituh ventedshopify many of these uh hubspot rightso cash app if you're uh from the USthat's entirely in witness uh we runlike Slack runs uh over three millionqueries a secondum Vitest was one of the reasons theysurvived COVID very well when suddenlyeverybody is workingremote umso so it's proven at level but all ofyou are probably indirectly using it butyou don't know about it so that's why Iwanted to make a point here of talkingabout this so it originated in uh Googlein YouTube uh and it was serving all themetadata for YouTube for a long time umuntil Google finally when they bought itmoved it over to Spanner and then theyopen sourcedit so it's got Apache 2 license um therea lot of contributors from all overum one of the key decisions that wasmade because they were already runningMySQL servers that to make continue touse MySQL as a storage layer so unlikemany uh systems who invent uh databasenot that they're bad but you here youalready get something that you knowespecially if you have used MySQL and wedon't have to worry about that part ofit what we do is we take multiple ofMySQL clusters uh give you a distributedoption uh make itdistributed and uh so it's massivelyscalable i mean you can just see if allthese um users are using it[Music]um high availability is also reallyimportant um so there are three thingsthat make it highly available yeah whatI forgot to mention earlier was it wasuh Google when they bought it theyported it over to Bog Google Borg whichis a cluster management system whichactually is the inspiration forKubernetes uh and Docker so this is beencloud native from like very earlydays so you get Kubernetes scaling umthe architecture is there's a lot ofmicroservices you can scale themseparately depending on your uhrequirementmysql of course already has replicationso you can use that and Vitus addssomething really important which is veryspecial uh to Vitus is shardinghorizontal sharding there's also you cando vertical sharding but that's easyhorizontal sharding where a table sitsacross multiple um MySQL servers rightparts of it one row isn't only one shardand Wis provides configurable shardingright so many databases talk about autosharding it it that works for someworkloads no problem but for many of ourcustomers for many workloads as youscale you need depending on your work uhload patterns and the kind of queriesyou run and the kind of apps you builddevelopers you have you need uh toconfigure how you shard your data andVitis does that very well you can changeyour sharding at any time there is zerotime downtime cut over built in throughvarious complex mechanisms you don'treally have time to go into that but uhwe are going to be at booth 1B in thenorth um for some time maybe today anduhtomorrow soum meet us up there um there's also alot of things operability that is therefor backups uh maintenance if you wantto upgrade your versions etcum yeah so a lot of people use w move towitness for scalabilityperformance but there are also users whojust use it for the module is going totalk about which is doing schema changesat scale so this a huge problem I'll lethim uh talk more about it so over tothankyou all right So we want to talk abouthow Vest manages schema changes at scaleto do that we need to discuss a littlebit about the VitS architecture so I'mgoing to do a bottoms up real quick walkthrough about the architecture soconsider a normal MySQL replicationtopology in this picture you have oneprimary three replicas the first thingwe're going to do is to add a tabletwhich is kind of a sidecar orj a demonattached to each of the MySQL nodes sothis tablet has control over the MySQLserver it can bring it down bring it uprestore from backup but most importantlyfor our discussion is that it controlsall the traffic that is going into theMySQL server the MySQL otherwise doesnot allow incoming connections so theapp never talks directly to the MySQLserver in production you will havemultiple clusters maybe it's a differentdatabase for whatever uh reason you havemultiple clusters and what we're goingto put in front of them is Vitigatewhich is a query engine and loanbalancer and query routter and uh uhfirewall like all combined into one andvitigate is a component that masqueradesas a monolith my server so it can speakto my protocol your app connects to thattalks to it and thinks that this isactually the database where behind thescene vitigate really talks to the allthe all of the underlying uhclusters so in reality we have manyvitigates both for high availability andfor uh throughput of queries just tohave enough capacity to serveeveryone rohit mentioned that Vita Vitzis a sharding framework and this issomething we're going to discuss inlength today so you you might have somedatabases that are unsharded but somedatabases just too large and you shardthem so there will be multipleshards and for different databases andsome databases are unsharded and thequestion is okay so how does the howdoes my query go from here to there howdoes Vitigate know if I'm going to queryuse commerce select star from ordersusing customer ID equals 4 how does itknow that it needs to go to thisspecific database on this specific shardand the answer is that we have asharding scheme right a shardingconfiguration it's stored on a non-datapath uh topology server as we call itit's basicallyanc footprint in memory just to definethe routing rules for the entire clusterand once vitigate starts it loads thedata into memory and from there likeit's not in the data path of yourqueries okay sofarokay so schema changes is a big problemin the myl world if you're notfamiliar if you havea table that is really large you have agazillion rows like a billion ormore altering a table like adding acolumn or modifying an index is aprocess that could potentially takehours or even days during that time thetable will be locked locked such thatyou cannot read and you cannot write toit so it's kind of an outage situationnow my scale does support some onlineoperations and some instant DDL which isreally fantastic but it's limited it's asubset of all the operations that can bedone on the table and this is the reasonwhy you will find online schema changetools in the MySQL landscape such asghost py online schema change spirit andothers vest also implements an onlineschema change mechanism and all of thesetools kind of work in the same way theidea is this you have this huge massivetable there's millions and millions andmillions of rows in that table now youwant to modify it but we know that thisis a blocking operation so what thesetools do they create a new table we callit the shadow table in the likeness ofyour original table but it's empty nowwe alter the shadow table which is acheap operation because the table isempty so it's no time then we startpopulating the shadow table from theoriginal table we both copy the millionsand millions and millions and rows aswell as apply all the incoming trafficall the chain log that is ongoing whilewe copy those millions of rows peopleare still using the table insertingdeleting updating we capture all thatand apply that as well on the shadowtable until the two tables are in syncor in in almost sync and this isbasically where most tools differ buteventually they all do kind of the samething they make sure that the tables arein sync they do a short stop the worldthey lock the tables right now it's asituation where you cannot read or writefrom the table and then they flip themthey they just exchange one with anotherso the app was just a second ago writingto your original table but all of asudden it writes to the new uh schemaright the new table andk after a whileyou can throw the old table to thegarbage okay so far so this is a bigproblem just for a single largetable just for an unsharded databasejust for a normal my skill setup now wewant to discuss how this problem becomeseven more complex when you want tomanage multi-shededuh uhdatabases so as a quick reminder to thearchitecture one of the most importantattributes of this architecture is thatthe different clusters that you see themthere's three in this picture they knownothing about each otherthe two shardsabove they don't even know that they areshards like the my skills server thathas tables and rows it doesn't know thatit's a it's part of a greater game it'snot aware that there's anothercomplementing server that has the restof the rows as far as MySQL is concernedthat's that's the data it's up to VES tojuggle and operate all thisdata all right so let's beginuh with a few aspects ofum what it means to apply changes to asharded setup we begin with item potencyso let's say I I have this table i wantto add anindex i set a DDL strategy to say VSwhich kind of means do it the Vest waydon't do it the MySQL way do it theonline schema changeway and I issue an alter table ad keypreviously we saw that vitigate uh whenI do an update table or insert or deletewe saw that vitigate would send a queryto the appropriate shard only but now wereally wanted to send to the all of therelevant shards so if I have 64 shardsif my table is so big that I split itinto 64 we need to inform all 64 shardsthat we want to update this table toalter this tablequite the task what I get in response inreality is a job ID but that table isnot yet modified vitigate merelyannounced to the tablets that here is arequest each tablet on the primary shardon the primary of each shard uh tooknote said "Oh okay i received arequest." They each in their own goodtime and independently of oneanother spin up this online schemachange mechanism that we illustratedlateron so for me as a user I just issued onequery but there's now 64 differentshards who are operating it so the firstthing I can do or I want to do is to getvisibility like okay what's going onlike it's the problem times 64 so I canissue a show v test migrations commandthat command again gets sent by vitigateto all shards and they each return onerow what's the status of the migrationon me this is I am shard0408 and there yeah the migration iscompleted as of12:21 each shard will report a statushopefully they will all complete i asthe user am happy and we can continue uhwith ourlife but what happens if two of theshards were unavailable for some reasonbecause uh because computers right lifeis tough and I've sent a request toalter a table 62 shards received therequest began the work maybe evencompleted it but two of the shards nevergot the request in the first place nowif if I only have one table one clusteror one server no problem i would run thecommand again but I I can't run thecommand again now because 62 shotsalready have that index it will eitherbe a syntax error to or an error to readthat index or it will add a second indexit's crazy right so this is where identitem potency comes in i could there'sit's nuance there's several ways to dothat but essentially if I if I couldtake this UU ID that I got earlier thisjobID and resupplied it then any shard thatreceives this command says "Oh wait waitwait that's a job ID I've alreadyprocessed i know this job ID i don'tneed to start it again this is rightthis is a no op i'm giving up i'm saying"Okay I'm done i'm happy you don't needto ask me anymore." But the two shardsthat previously never got the therequest they will see that as new andwe'll start processingit there's another techniqueit's designed to solve something elsebut it's very useful uh in this scenarioas well and it's called declarativemigrations so Vit supports the idea of adeclarative migration where you neverspecify you never say alter table allyou say is create table and drop tablenow if a shard receives this createtable statement and the table does notexist on that shard it's created itcreates ilt noproblem if the table exists and looksexactly like this okay it's a no upeverything is good but if the tableexists and has a different schema maybethe index is different the column isdifferent then that shard each shardindependently computes what would ittake to get from the existing schemainto the new schema so it will devisethe alter table statement which you cansee at the bottom here each ad willdevise the correct statement that willget you to the new state and it's kindof the K Kubernetes way right it's a youdon't deploy a change you deploy a stateso even if you get into this weird limbosituation which you shouldn't but evenif you are and you you're not sure anddifferent shards are different schemasokay that's still that's still okay youcan deploy a declarative migration inand each shot will independently makesure that it goes into the desired stateall right our next problem isconsistency if a migration takes hourstocomplete different servers havedifferent workloads serve differenttraffic run on different hardware havedifferent noisy neighbors they willcomplete the migration at differenttimes it's certainly possible thatdifferent MySQL servers will completethis migration even hours apartand it's okay but it's undesired it'sundesired because it's confusing becauseat any point in time different shadeswill present different schemas it'sconfusing to some scripts which are notyou know the most sophisticated ones itcan be confusing to an engineer adeveloper that goes into the databasethey see oh there's the schema there'sthe the new column I can start using itgreat but no because on another shotthat column still does not existso it's unhealthyideally we want all the schema to be thesame all the time across all the shotsand there's a way to do that notatomically that's not possible at leastwith my skill it's impossible to changethe schema atomically at all the shardsat the exact same time but we can getpretty close and theidea is to postpone completion so I setthat as a detail strategy and if we getback to the the way online only schemachange works I copied the millions andmillions of rows i've tracked the changelog the incoming traffic i brought thetables up tosync nothing really compels me to cutover at that particular time it's okayif I stall a little bit more if Icontinue to track the ongoing changesand continue to keep the tables in synclive make sense all right so it'spossible that one completesfirst or one of the shards uh backfillsthe table first but it will not cut overautomatically it will continue to applythechanges and I will make it so that itonly completes when all the rest arecomplete so the way this works is I willissue a show with test migrations one ofthe columns that they return is ready tocomplete which will be either zero likeoh man I I have many more rows to copyI'm I'm not there yet or one meaningyeah I'm I'm generally up to date I'mI'm okay I'm can cut over whenever youtell me to so again if I have these uh16 shards in this example I will issuethis query if all shards respond withready to complete equals one then I canissue an alter test migrationcomplete this statement again getspushed to all the shards at the sametime concurrently and they will each dotheir best to cut over as soon aspossible and in effect in production inourexperience thismeans up to a few seconds between thefirst child that that actually completesto the last one which is not bad at alluh in terms of you know relationaldatabases and the behavior ofapplications um some shots can beuh victims to an abusive query thatmaybe an ETL or something heavyweightthat is inserting tons of rows orselecting tons of rows is and is kind ofpreventing them from cutting over thatcan happen which can create a gap a timegap for that child um there is anotherway to uh ask Vest to be more uh brutalabout it so it will forcibly terminateany queries or transactions that areholding locks on the migrated table justahead of the cut over so in a way pavingthe way for a successful cutover okay so faruh we mind blow this when we explode inanother dimensionm first we exploded thedimension of shards but now we we alsosay we can apply the same logic formultiple concurrent migrations acrossall shards so VS allows you to runmultiple migrationsconcurrently and some of the stuff willbe truly concurrent some will have somesequentialization or serializationbetween them for performancereasons but essentially the same logicapplies you will run all these changestogether this isuh it's like a user is doing a pullrequest and they're doing multiplechanges in multiple files but thosechanges make sense together the same waysome changes to tables make sensetogether right so you deploy all thesechanges this query will go in parallelto all shards each shard will say "Oh Ireceived request four five six seveneight uhmigrations we'll apply them all at thesame time i will provide a postponemigration and then I will issue a show vtest migrations and wait for all shotsto say ready to complete for allmigrations and then I will issue acomplete all command boom all shots willstart f previously cut over cut over oneafter another again our experience showsI mean don't overdo this don't do like30 massive tables at the same time on128 shots but if you don't overdo it youcan still expect a difference of secondsmaybe 10 maybe 20 seconds from the firsttable completed on one shot to the lasttable completed on a differentshot okay last I want to talk aboutresiliency so question I have twosystems one in one system I have fiveservers in another system I have 50servers which of these two systems ismore likely to see a serverfailure five or 5050 correct a lot of people would getconfused the more servers we have themore likely it is that any single serverwill exhibit a failure of course it's upto a good infrastructure to make afailure in a 50 service setup less of animpact as compared to a failure in afive server setupright now let's look at schemamigrations if you use uh an existingonline schema change tool and let's saythe migration takes fivedays but after two days the MySQLprimarycrashes you have just lost two days ofwork you will fail over do whatever ittakes to fix the MySQL fine but you willhave to restart a migration and redo allof these two days uh worth ofwork this is with one server whathappens when you have 64 shards 64different servers running themigration 63 of them are happy but aftertwo days one of themfails now technically it's only that onethat needs to be restarted right butremember I really like to cut over allof my shots togetherso what happens is that 63 shards willhave to stall for two extra days allwaiting for that one to catch up and themore shards I have the more probable itis that I will be in thissituation so Vest doesn't fall for thattrap vitz uses V replication like one ofthe most important key components ofVSwhich is an auditor of data transfer soto speak so when it copies rows eitherthrough either like uh the millions andmillions of rows that it copies from theoriginal table or when it applies thechangelog using the same transaction where itapplies the change it also audits themetadata of the change like the range ofqueries the range of rows that it justcopied or the position in the MySQLbinary log it's a journaline systemwhich means the migration is stateful ifa primary dies vess promotes a newprimary that primary looks looks at itsown database and says "Oh I recognizethere's an interrupted migration this isthe state of the migration this is theprecise point where it was terminated."it is able to adopt the migration andpick up from the very same point ofinterruption which means you haven'tlost two days you've lost less than oneminute the time it takes from the momentof failure to the failover to the pointwhere it adopts a migration to the pointwhere it resumesoperation and so to summarize the schemamigration story inVS frankly I think it's better than thenon nonshedscenario vitz automates so much on yourbehalf you can run as many changes asyou want you don't need to run them oneby one you can run them together youdon't need to worry about failuresbecause they self-reoveryou don'nt need to worry about schemachanges being different because you cancontrol the cut over you can decide tocut over not during the weekend but bythe time you're back at the office onMonday after you've had your coffee soyou're a lot more relaxed about how themigration is going to behave in thismulti-sharded uh environment and uh weand our the users the Vitas users arelike so able to migrate tables that areterabytes ofdata and which serves uh serve millionsof queries per second part of them readsof course but many many of them arerights without having to worry about itvest manages all these takes the trafficinto consideration takes replication lagthe load on the service intoconfiguration and just smoothers theentireoperation all rightum on the topic of road mapap so vit isan open source project we maintain it wehave thoughts for thefuture there's always things to do anduh to the large part of it this isaffected by the community or just youknow as a user you need a particularthing you find that schema changerequiressomething we just had great ideas fromsome users dur during this conference umwe always have a road map ahead ahead ofus we do uh uh two releases per year anduh you are very welcome to join us inthe VitSlack to ask questions to makesuggestions open issues onGitHub read the code read the uh uh thedocuments read our blogposts i'm happy to answer questions if Ihave thetime right uh we will be here tomorrowthe Vitess booth will be open in theproject pavilion between the hours of12:30 to 2okay so again the hours 12:30 to 12:30to 2 p.m uh the projectpavilion questions sir yes thank youvery much for the presentation i thinkyou did a fantastic job explaining allthe complexity of running schema changesat the scale thank you uh I have twoquestions if I may the first one isabout safety of the safety checks sothere are certain situations where doingan online schema change may lead to dataloss a common example is having a uniqueindex deleting one of the rows that waspart of the unique index and now onceyou are inserting rows into the shadowtable some of the rows are droppedbecause those rows are not unique anyanymore another example is changing acolumn that is part of a unique indexand so on so is there any plans to addthese safety validations into bet testthemsel itself or is it theresponsibility of the whoever is usingbet to run the safety checks this is anexcellent question thank you so much umso this is beingaddressed there's multi-layers answermultilayers to the answer so Vitas worksin the Unix philos Unix philosophy rightwe give you the building blocks youwrite the commands you run the commandlines the the the flags and you'reresponsible to manage this and if you dothen there are answers for you now ifyou remove a column from a unique indexthen indeed as you say on the targettable there is less uniqueness right youcan insert something that you couldn'tinsert originally but how would it getthere it would only get there if youinsert the row originally on theoriginal table right the problem isdifferent after you cut over thensuddenly on the new table you're able toinsert two rows where previously therecould only beone is thatcorrect let's put it a different way ifyou have a table with no unique indexand you add a new unique index and theconstraint fails because on the originaltable you have two conflicting rows theentire migration willfail however I want to add to that vestoffers a mechanism that is calledreverts revert says once we cut over atthe time of cut over if you remember wedo a stop the world we record theposition where we cutover after cut over if you're unhappywith the new table you can issue arevert migration and what we do is youwe use the same VR replication mechanismto propagate the changes back from thenew table to the originaltable so that you can cut over back tothe original table but still maintainall the new rows you inserted or updatedor deleted right so you can go back tosafetythe issue of unique is is that is thepoint where maybe you're now you knowyou messed up your data you'renot it's not applicable anymore becauseyou messed it up maybe you incrementedokay you incremented from a tiny int toa big int and now you insert a gazillionright a billion you cannot move back tothe original table because that numbercannot be applied to a tiny int columnso there is indeeda measure ofuh you need you need to own that changethere is a measure of that uh part ofVest is uh there's a library calledschema and uh there's actually a blogpost on both the vitest blog as well asthe planet scale blog that describe howyou can use it to find out about riskswhen comparing two schemas so it can beit's an advisor that tells you oh justfor you know you can lose data duringthis migration because I don't know youdropped a column or reduced the scope ofa column or you can lose data becauseyou reduce uniqueness or introduceduniqueness so this is another tool thatyou can use thank you very much mysecond question really fast is um so youshow show an example of particular MySQLsyntax that could be applied the altertable and all that is there anyparticular MySQL native syntax thatcannot be applied to be test like forkeys or something like that rightforeign keys is is is the example um theproblem is with how my skill managesforeign keysuh in particular my skill does not knowhow to manage foreign keys the onlyelement in my scale that knows aboutforeign keys is InnoDB and the way thisworks with InnoDB is that it isimpossible for you to replace a tablewithout touching the foreign keys likeand keeping them in place so we do havea public fork of MySQL where we fix thatbut you will need to use that particularversion of MySQL and then there's a d-unsafe uh allow foreign keys flag whichallows you to do that without thisforeign keys are not possibleyes hello thank you for the talk i havea small question i think I missed itwhen you explained how is the migrationcompleted so the swap the synchronizedswap is it a command run by engineers ortooling so it's run by the tooling youdon't need to worry about it i mean youneed to worry about it because it's abit of the risky or dangerous lastmoment uh change the command isbasically merely a rename table A to Band B to A except it involves table C inbetween because you need to re renametable A to C B to A C to B right that'sthe command and the challenge for thetool is to make sure that one beforerunning this command the tables areactually in complete sync or else thisis wrong rightand two that it is able to acquire andcomplete the cut over uh uh to acquireand complete uh the locking of thetables so there are safety mechanismthat say uh InnoDB lock weight timeoutso that if the tables cannot be lockedfor the rename the entire the entireprocess of theswap fails and you roll back and v testtries again in a few minutes and againin a few minutes but it maintainsintegrity and consistency of the dataand so it's always vest that triggers itit's never an external command i Ithoughtyou mentioned or didn't say I understoodthat like engineers chooses the momentwhere it's completed you you can choosethe moment but Vest does it for you soyou choose the moment you what you cansay is postpone completion right dashpostpone start the migration but don'tever cut over automatically and then atsome point you will show vit migrationyou will see that vit is generallyspeaking ready to complete you candecide to go on a vacation during thattime and say no keep running please I'mbusy I'm skiing right now but onceyou're back from your uh ski vacationyou sit at your desk and you say "Wellyou know what i'm ready cut over nowplease." And then we test will say "Okayyeah I've been keeping keeping up nowI'm ready to cut over i'm doing this."Sure there's more questions but no timeso I'm happy to take questions outsideand give the room to the next speakerthank you so much[Applause]2025-04-15 21:58:32.181453puh gettingoption choose the container runtime youwant uh the networking interface calicoselium uh any of them that's been aroundand so on so that's uh very compensibleand choose the option or the requirementyou have and yeah there will be a passuh can afford it so once to use acontainerd with sky code that's that'sfeasible and withthe quickly how do we uh keep it stablemainly we have yeah a ci infrastructurethat we deploy about 10 to 15,000cluster that we try that we test test onuh every month so for every pr we willdeploy about 20 to 50 cluster so thatmeans we create yeah uh vm for let's saywant to by ubuntu with calico willdeploy three vm uh of them and deploycube spray on it test the cluster testthe networking and and and kill the killthe vm so at times sometimes we havelike yeah we can spin up depending ofthe of the workload two to 300 vm atonce just to test uh pr from thecommunities of course that's that'sexpensive so how did we uh manage thecost of itand yeah two two main way one is towrite better code uh more efficient uhuh code so can choose things or anotherway also is to um focus on theinfrastructure so today will be moretalking about the infrastructurepart and what's kind of best to lowerprice is to start to shop around and ifyou look at the at the market actuallythere's a wide range of of prices itcould come from few thousand to few tensof thousand so the range could be it'sat least 10x from the cheapest one tothe most expensive one and that's evenmore true with when you add gpu uh ormissing purchase and so on so that'syeah that's that seems very simple justgo with the cheapest one so what'swhat's the what's the problem here um iwould saythat's the main thing is to avoid uhvendor locking that's that's the mainmain main main main goal and that's notso so simple so the main main partthat's where cube spray can help it's ifyou build everything on top ofkubernetes and using cube spray tomanage your kubernetes cluster thenthat's become your uh common denominatorfor every everyevery cluster you you manage so thatwould beone one way to manage everythingeverywhere and now if you look at yourstack so you will have your um yeah aiai stack u running on communities ithink there is other many why you shoulddo that for for scalability and so wenot cover uh that you run this stack ontop ofkubernetes that is managed via cubestrayas mentioned and that's allow you towrite to run basically everywhere sowith cube spray it's also integratedwith gpu management the operator so itwill be deployed uh automatically andthat's a way to manage um yeah transpapparentlyall all hard other hardware provider andbasically focusing on that denim willlet you like the hardware become nowjust some machines a bunch of of ipsthat you can decide u where what to usewhere and yeah that's in short having sofar from experience having the freedomand control of where you run so it's notnecessarily about focusing on the costnecessarily of like a provider buthaving removing the vendor lock in uhsituation that's that has a biggestimpact um i've talked quite a lot alsoduring the conference some are usingeven like very large scale all the cloudprovider and what they realize like theycan aggressively um also uh negotiate onpricing like yeah sure uh they do withamazon google cloud azure and and nowthey have like a big discount on all ofthem because that's that's removing thelock in you can have and they have thefreedom of controlling uh that's that'senabling a lot of of saving aroundthat's also enable you to yeah asmentioned um to mix and match what'swhat you need so you can be on the corecore workload here have uh on prem gpurunning and mix match with cube sprayit's like one one interface to uh manageall yourcluster yeah so as mentioned discus uhtoday uh we we had prepared a demo but ithink was best to actually uh get a userof cube spray in the community toexplain like why they choose cube sprayuh for that specific use case uh prettybig on on using ai uh infrastructure andthey're using q spray so um yeah thanksluke for joining spontaneously today uhthanks[Applause]hello my name is luke simmons i workwith a healthcare company out of swedencalled vestia transun uh we have about50,000 people uh that are employed andwe needed a solution to be able to um tobring our own hardware we had to be ableto buy dgx or something like that or buymaybe smaller gpus and be able to attachthem to different worker nodes and weneeded a solution to provide for thisand we also needed a solution that wecould be able to spin up some of thelatest uh versions of kubernetes and ithink probably some people know it's notalways very easy in enterprise companiesto always be able to bump yourkubernetes versions to the latest thingto be able to use the latest stuff thatyou need for ai deployments and stufflike that so we ended up uh spinning upuh choosing cube spray it's hardwareagnostic it can use anything that youwant and we started spinning up to applyour dgxs and a100s uh directly so whatdo we use it for uh we use it to um weuse the nvidia gpu operator uh just likehe antoine described right there andthen we add on different frameworks liketensorflow pytorch r that you canautomatically be able to allocate thingson the fly then you can also be able touse uh we use a running eyeuler on topof that to be able to automaticallypreempt workloads and stuff like thatalso to give priority to inference andmodel serving so we can automatically beable to uh dynamically allocate uhinteractive workloadsuh sessions for data scientists to beable to start developing things and theycan be able to train things and we canbe able to serve things with k nativeand things like that right out of thebox this works fantastic for us we haveabout 20 different data scientists allcoming in to be able to work off thisplatform we can use cube spray we canspin up test instances we can be able totest these things off on cube spray wecan tear them down we can bring them upon our dgx infrastructures and stufflike that uh all because of cube sprayjust right out of the box so it's uhjust been a really really fantasticexperience to be able to work with it sothanks alot thank you rickyeah so as mentioned many yeah cubespray it's manage all the life cycle ofyour cluster we try to keep um thereleases so always always support uhthree version of communities for everyrelease and so we backport to uh theprevious release it's uh satisfydistribution and yeah we have a testevery release so master branch is testedbut when when we flag a tag that's atest and that's where we we safe to togo to upgrade your cluster using thistag um i like a lot of question why 132is not like a out there um that'sbecause of yeah we focus on thestability focus on make sure uheverything works when you apply theupgrade so takes a bit of time butusually we're not too many uh atleast trying to be on the on theschedule of of the communities which isalso something that not all provider uhsometimes like even like a year long andwe at most two to three months yeah sothat's uh cube spray it's a fullycommunity driven uh for yeah mentionedfor the last 10 years even before cncfwas created it's thousand of contributorand it's a lot of of yeah so thank youto the committee it's thousand ofcontributions that's all we are at thekiosk um tomorrow morning again uh 22adon't hesitate to stop by and uh thetrack channel are very useful for if youneedyeah that's about it thanks[Applause]everyone anyquestion i don't knowwherespeaker hi hi thanks for the great worki'm wondering if the fact that you nevermentioned the word anible uh during thispresentation is a marketing[Music]uh so the the question is about eniblewhat so yeah qr be behind the scene isusing enible and do all the automationusing enible so to um yeah doing and andcubm so uh enible would be theorchestration toolbut that's moreum implementation detail in a way forfor us so if tomorrow we find somethingthat's not will be uh moving on sothat's not like a strong focus to towork on on has been there it's beenworking uh correctly and that's alsoenable uh all that that possibilitiesuhright thank you everyone[Applause]2025-04-15 21:58:32.844075 7�7��Gy#��EACn8xvysLWVghi everyone and welcome to the Falcomaintener track i am Leonardo and I amhere together with other maintainers totalk about Falco so first of all what'sFalco you can think about Falco like thesecurity camera for your cloud falco isthe first ever cloud native runtimesecurity tool donated to the CNCF andsince the last year is a graduatedproject it basically detect securitytraits and deliver alerts to yourteam and how Falco works falco works bymonitoring kernel or cloud events forkernel events it taps into the kernelusing or or a BPF probe or a kernelmodel then send these events to the rulethat match those event against a set ofpredefined rules those rules defineanomal behaviors and are needed todetect the the traits or the securitypolicy violation that can happen once arules detect a malicious behavior itimmediately emit an alert but there ismore there's more because Falco has astrong plug-in system that allow you touh grab data from other data source fromthe cloud we have for example uh aplug-in for the kubernetes of the logbut we also have other plug-in for otherdata s tools like octto or githan andalso we have many option uh to deliverthe output of falco uh thanks to anecosystem tool called the falco sidekickfalco is capable to deliver those alertsto more than 50 destination you can sendalerts to Slack to assignments to Kafkato an database andwhatever this is a very briefintroduction of what is Falco and nowlet's see the today agenda today we willtalk about the la the latest fusionfalco we will do an an overview on theecosystem with a focus on the Kubernetesor the log plug-in and wer�Wx#��eASqKqB-q_m8Ehi everyone thank you for joining thesession i'm antoan lron situ at connieuh gmbbh in germany uh initial uhquarter of cupspray and i should havebeen with moham today but unfortunatelyhave to go back uh home yesterday umhe's a senior engineer at new york i seeand a cube spray maintainerso first quickly if you don't knowwhat's cub spray it's an orchestratorfor kubernetes so to install uh and andupgrade and all the life cycle ofmanaging kubernetes cluster it's beenthere for 10 years since uh the firstversion of it and i've seen like a largeadoption so far it's focusing onproduction uh environment so notnecessarily like for development thereis better tool for that and focus onmaintenance and stability so having safeuh safe upgrade and so on one thing thatis uh very important is the uhflexibility so we try really to meet uhwhere you are in term ofconfiguration and that means it can runbasically everywhere so on the cloud onbare metal um private data center thereis integration with uh most of of cloudprovider it's uh works onevery uh operating system uh nearly andand then you can yeah keep os also show yousome feature that we are currentlydeveloping and we are going to releasevery soon those feature are thecontainer the new container supportimplemented as a plug-in defaultoperator and uh a new feature of theevent generator so Luca it's youthank you i'm here talking to you aboutwhat's new with Falco so last time wetalked we were in Salt Lake City forthose of you who attended the uh Falcocontainer track and we had some excitingnews so what's going on right now soright now we are seeing when we are hereand we are at CubeCon and we ask peoplewhat they want out of Falco everyoneactually wants uh more things tailoredto their use cases and so in the lastfew years if you have noticed we havemade it so that you can make Falco yourown you can add integrations you can addplugins you can add everything and ourmain duty as maintainers right now ismaking sure that the platform upon ueverything else is based is stable andis performing well we you've got imagesthat you can deploy easily and so forthso we have focused on a lot of theseover the last few months for instance wehave been uh evolving evolving ourstrategy regarding the container imagesif you have been a longtime user ofFalco you might remember something andyou might remember the good old bigimage that was the only image that youcould have more than 1 GB big uh whichcontained a lot of compilers and a lotof things that you didn't really need inall cases because they were needed tobuild the kernel module because that wasthe only way to have falco or the legacyebpf probe which also required to becompiled today uh thanks to the newversion of the ebpf probe which usesquery we can actually use a minimalimage as the default image for falco andum back in 2023 we started experimentingwith woli the minimal undistro that uhthat has uh some advantages and today weare shipping that as default plus a lotof other images that you can uh that youcan choose to use depending on your usecase so what's the advantage and why diddid we uh take our time to uh streamlinethe container images uh first of allsmaller image size so the image that youare running is smaller and also many ofyou are required to run softwarecomposition analysis such as trivialgripe in your environment and this isgoing to alert you a lot about thepotential CVES and vulnerabilities andyou need you would need to fix that tomaintain compliance these images such asWolfie have been designed to reduce thenumber of vulnerabilities detected bythese tools and so uh this means that ifwe were using the older Debian you willfind that you had and you installed theFalco in November now it's complainingabout 135 vulnerabilities that aremostly false positives but if you wereusing the DRS or minimal image that nowis the default you will have one so uh Ihope to to take we hope that we'll takesome work back from the the shoulders ofDevOps engineer and security engineersthat maintain falco also we haveimproved performance a lot by doinglow-level optimization with uh forexample we changed the default compilerwe are compile uh compiling with clangand zig and we changed even theallocator at a lower level so now wehave up to 10% improvement in throughputfor events which doesn't sound like alot but some of you are running millionsof events per second in your cluster sothat does really add to to a noticeableimprovement also uh as usual we aremaintaining we are continuing to addfields and operators that allow you towrite rules that are more expressive domore things more information about thecontainers however uh when we're talkingabout containers that will be moreexciting things that we'll be talkingabout in a minute uh we have a LANoperator like many languages we didn'thave one before and one day we noticedand we added and some othermodifications such as this one uh also Iwill start by introducing some ecosystemproject although later we'll talk a bitmore in details about them uh but wehave a new entry in this many of youhave asked uh so how can I use Falco forsupply chain security can I use this inmy CI/CD pipelines and we have peoplethatt actually built a project that doesthis which is called Falco actions andit's an integration between Falco andGitHub actions so you can start Falco inyour GitHub action job and then have itrecord everything that happens and thenanalyze it for potentially suspiciousevents after the after the the job iscompleted it's very easy to to run andto try and it even has an analyzer stepthat can use uh captures capture filesfalco supports capture files or even AIto try and figure out if something is iswrong on top of the Falco rules we allknow and love and we have also a new anew uh addition to the features of FalcoTalon people were very excited about theresponse engine of Falco uh so you canactually it's a response engine that'sKubernetes native you can tell whensomething happens please do this actionand right now you can involve uh invokeGCP actions and use GCP cloud storagebecause before they were more centeredon AWS and you can perform live capturesuh Falco supports captures as Imentioned before so you can essentiallydownload what's happening at the at thatspecific time and analyze it offline uhso this is a very exciting project ifyou are if you use it and you're a goexpert please check it out and join usin maintaining it uh so now we talk tous about a bit more about ecosystemprojects and something very cool she'sbeen working onthank you sonow wewill come to talk about what theecosystem um havedone falco alreadyprovide plugins to u to read audit logsyou can do it from GKE and from N4AS and now you can do the same thingswith twonew pluginsfor AKS and andVcloudMKS we will talk about these twonew pluginsso we have somepre prerequisite to follow for the AKAKS clusters first you have to createand Azure event hub then you have toenable event in your AKSclusters in order to configureyour clusters toship slugs and you have to create or toreuse and as yourblob courageso quicklyumAKS clusters ship their credit log toum have an hub and it store them to ablog storage and then theFelco plugin just have to read from theoven interface and use the blob storageasa checksponsor in order to install and use ityou'll just have to execute the falcoctctlartifact install command and you have tofill thefalco file with this type of contentdon't forget to to to replace the eventhub and the blob storagefield if you are in autoed the sourcecode anduh andthe and the blog post are available youyou have the Q two QRcode and if you want to know um a littlemore abouttheVcloudMK plugin so so first you have to createa LDP logdata platform then you have to createa data stream intoit you have to connect theMKS cluster to this data stream and youhave to retrieve the websocketURL in order to connect toit soconcretely theAudit log ofumMKS clusters are stored in LDP datastream and theFalco plugin connect to the um LDPthrough the webcircuit to install and useIt's the same things you haveto execute falco ctl command and to fillthe felco area file don't forget to topaste the the websocket ldp URL in theopen p field and you can also insert ituh through Elm so you you have to uh youhave to fill thevalues can file all these file are uhare in the read file and in somepost so if you areinterested you can find two blog postand this source code um with a plug-inand if you need umanother manage kubernetes audit logoperationdon't hesitate to contribute and tocreateanother plugin if I manage to to to doit for the uh tocode plugin everyone cantoo thank you and nowit's and now uh andnow Jason will talk aboutwhat's cookingallright thanks i'm Jason i'm going to talkabout what's happening in the updatesfor our container metadata support umI'm going to do a step back and give abit of a refresher what we are trying totalk about here so one of thefundamental things that Falco does is asyou know collect telemetry andinformation from the kernel but not onlylike it also fetches data from theoutside world for all those things thateven the kernel is not fully aware ofone of these for example is thecontainer information connecting to thecontainer runtimes but also speakingwith the API serveru to get podinformation namespace all sort of thingsand then when you know a falco alertshappens like in the red snippet you seeover there you're going to have anenrichment section saying hey this eventhappened from that process but also inthat container for that image from thatregister and so on and so forth uh thishas been one of the first features thatwe ever developed for file has beenthere since a while and uh it's you knowjust one of the things that people can'tgive up on so um it's part of the corecode of the projectso what is it that is not working for usuh first off I mean the the many pointsis maintainability in the sense of thecode is pretty old so it is working buthas been sitting there for a while andit's becoming a cost for the communityto you know follow on and keep updatingthat second point is Falco forperformance reasons is historicallywritten in C++ meaning that also thispiece of code is written in C++ and uhyou know over time things are changingwe notice but we didn't have native SDKsor clients to connect to the the dockercontainer runtime or the cry interfaceso we had to maintain it and develop itinhouse which is a cost again and uhalso because we needed to make this asperformance as possible it's not supertestable and that has been a strugglefor us uh in terms of performance thiscan't be synchronous so imagine whathappens falco is receiving events fromthe kernel he notices that there's a newexecution from a certain process uh itwill start looking up for the containerinformation from the sockets that hefounds in the system now we can't waitfor that information to arrive it wouldbe too late right we don't want to stallthe event processing pipeline and so welet the first few events go and wait forthe result later on that also means thatwe need to implement and we did uh someyou know sort of locking mechanism withmutxes which takes a considerable sliceof the falco usage overall and impactspotential maximum throughput that thetool can sustain um this also impactsrel uh reliability in the sense thatgiven the model that I described thefirst few events may come not enrichedwith information and that is essentiallya big issue I mean can be an issue foreveryone but uh also especially forshort-lived containers in which casesmaybe for those few milliseconds that wewait we may lose the informationentirelyso okay it was skipped sorry how is theplug-in system helping us in this um thenature of the plug-in system is multi-language meaning that we can develop onein go and that's what we did nicely thego ecosystem provides a good set of SDKsand clients that we can use to connectto you know all the container runtimesimplementation that we support um but wecan also do it multil- language in theplug-in meaning that we can have a slicewritten in C++ for the performancecritical paths and the go the go side ofthe plug plugin you know for theconnection integrations part um the newplug-in which is about to be released uhvery soon uh connects to the containercreated events which happens before thecontainer started events and so nowFalco knows about the container evenbefore seeing the first system callbeing executed about that process um andthe same goes for when the containerstops so that was funny to us since wealways had like late problems now weknow that the container dies even beforewe stopped uh processing all the syscalls about it so we had to do somecoordination uh and the nice thing isthat we were fully capable of doingfeature priority so this was a very goodexample of modularization of the projectthanks to the plug-in systeminginitiative that started a couple yearsago and I need to say a big thank you toFederico in the specific which is one ofour our core maintainers which developedthis all in the open with a you know anopen design and blueprint um and bringit tocompletion the outcome of this is abetter time to data so now it'sbasically instant the moment you see thefirst uh you know clone or exact by doneby run C inside the container that'salready enriched with informationwhereas in the past it could take upuntil av few milliseconds and 100 wemeasure worst case scenario um it's mucheasier for us to maintain and test andyou know uh we can collect many morecontributors in that way as well giventhat the code the go code is a morefriendly um we were able to develop newfalco fields for metad containerinformation which can be used fordeveloping falco rules and this is alsoquite more linear in terms of CPU so nowlet's say Falco will become even moreperformant in processing uh eventsthere's a couple breaking changes um theplug-in system is not supported by thestatic builds with masle uh and so thecontainer meta will be lost there aswell uh plus for people using the newfalco exported metrics um there's goingto be a few um I mean a couple numbersthat are going to change the prefixbecause now it aders to the same prefixof the plug-in system as well and thisis due to be released in 0.41 which youknow is supposed to happen in a couplemonths so stay tuned and u the promisewe make as maintainers to keep bundlingthis integration inside the falcopackages and images uh so there will beno need for falcoctl or any extradownload uh at installation time uh atleast since falco 1.0 so essentiallythere will be no breaking change in interms of UX for whoever is installingfalco compared to today you willprobably just get the benefits plusthose tiny breaking changesand that's about it i'm going to passthe microphone now to Aldo um to learnabout the FAL operator we'redeveloping thankyou okay uh the FAL operator aKubernetes native Falconmanagement well why the FAL operatorwell why not no just kidding we reallyneed it so when running FAL with uh theHelm charts we run quickly inlimitations and some of theselimitations are that we have limitedflexibility and the reason is that Helmdeploys fal as a single instance typemeaning that if you need a demon set orafter that you need a deployment youwill need um another release a helmrelease and that will be let's say aburden for many of us and u at the sametime we have problems with maintainingsettings across clusters when we aredeploying Falco in multiple clustersmeaning that updates are um manuallyupdates to values.yamel YAML files andthat in time could cause drifties in theconfiguration and also the the artifactmanagement when deploying Falco withHelm we have um artifacts are deployedas a as a single instance with Falco andif you need to update those artifactsyou will need to redeploy Falco itselfand uh this is let's say the reason whywe need a new uh let's say Kubernetesnative management for Falco and itsartifacts here we go how it works wellwe have um there is a fal operator whichis the the main controller which watchesthe Falcoy instances which is a customresource definition and on that we havethe let's say only the uh configurationsregarding Kubernetes meaning P templatespec deployment type that could be adeployment or demon set and it will nottake care uh for the uh Falcoconfigurations itself When deployingFalco it will inject an artifactoperator which is running as a sidecarand uh the artifact operator in factwill take care to handle the artifactmanagement the artifact managements anduh how it does it we have different kindof custom resources that are the thatmodels the different artifacts in Falcolike the rules files configs or theplugins and it fetches them and makesthem available to Falco in um in uh uhvolume mount that is shared between thetwocontainersum also the the the Falco operator willallow multi- instance management as Isaid before with Helm you will needdifferent Helm releases but here youwill just need to uh deploy multipleFalco instances with differentconfigurations and uh the operator willtake care to uh handle them and uh itwill notoriize conflicts between them soit is let's say a must for people thathave different uh deployment of faloswith different configurations or maybesome for the node level security likewith cys codes or uh other with pluginsso this will makes things a loteasier and um here let's see theflexible artifacts management well Ithink this is huge because it will allowpeople to manage the artifacts uh uhdifferently from how we are doing rightnow so meaning that artifacts managementtheir life cycle is separated from theone of the falco deployment itself so itwill allow anyone to create let's sayrules without needing to be for examplean admin of the cluster and uh at thesame time you can see that uhthe the artifact operator supportsdifferent sources for the artifacts OCIregistry for example for centralized andversions OCI artifacts the Kubernetesconfig map for example for uh clustermanaged um configurations or row YAML itwill specify it will allow to specifyinline configurations into the artifactswhat what does it mean for example let'ssay all of these sources called can canbe uh specified in the same customresource meaning if you have the rulesfiles you can get the uh default ruleset that the u Falco security uh offersand on top of that if you need to changelet's say a macro a condition oranything in the rules then you can justum set it in the in the in the inline uhin the inline uh configuration and itwill override the the default rule setum and um at the same time the artifactssupports priorities what does it mean uhif there are two types of um artifactsthen you can assign priorities and thehighest priority will overwrite the onewith the lowest priority again this willallow people to override maybe rulesthat they don't have uh they don'tmanage or they are just getting them asis from third parties um also um the thepriority is also inside the uh artifactsfor example if you have the thethe rules that you are getting thedefault rules and you want to to tooverride something the sub prioritieswill say okay I need the config map thathas the highest priority than the OCIartifact or from the inlineconfigurations and yep and that's itfrom my side and now I will leave thestage to um Leo thank youand now the very latest news uh aboutthe event generator what's the eventgenerator div generator is a tool that Icreated a couple of years ago togenerate real kernel activity uh I didthis why why is this just because wewanted to simulate real attack realactivity in the kernel generating realevents that the Falco can catch uh it isneeded because it it's very useful totest if rule works but also to test theFalco engine itself indeed indeed theevent generator as a mechanism toconnect to Falco and check if the theevent generator generated by the eventgenerator is ced by by by Falco butwhat's the news the news is that untilnow all the action that the eventgenerator could create were hardcodedmostly in go into the tool now we aregoing to introduce a a declarative wayto generate test basically is a fullflagged test suite when you can defineof course the name the rule the rulethat you are targeting but you cancreate also a context for for for theevent that you are going to generate youcan create a process tree a processlineage with with all the args andcapabilities and names for each processyou can also create a resource like aclient server that will be a bitdifficult to implement in YAML and youalso can have step each step willtrigger a sys call and this is allactivity that will be generated for realinside the kernel there are a lot ofmore feature that you can discover ifyou take a look to the event generatorrepository one of these is for examplethat you can bind uh options from uvalues from the resource to the to thearguments that will be passed to the tothe sys calls we have also uh a matrixto uh to generate multiple events withdifferent input values and of courseit's not shown in this in this exampleyou can also define expected outputvalues that the event generator willcheck once the test run this will bereleased verysoon okay and now we are uh going to theend of this presentation but uh beforesaying thank you I want to invite you tojoin our community if you are not yetthere you can find us on the Falcochannel on the Kubernetes Slack and wealso have a community calls every fewweeks i recommend you to join becauseyou will find a lot of nice person likethose thank you everyone[Applause]2025-04-15 21:58:33.364786xore we want support for statefulapplication failover to improve ApacheFlink workload resiliency onmulticluster Kubernetes um we actuallyhad a talk about this use case umyesterday um and I'm going to sound alittle bit biased but I think it waspretty amazing so if anybody missed it Iwould strongly urge that you watch therecording um there's onemore um we want to optimize GPUutilization for AI model training acrossheterogeneousclusters this is also had use case andwe also have a talk here at CubeCon umfor this use case unfortunately it's aconcurrent talk happening right now butI al I would also strongly urge everyoneto check out therecording okay so now um time for thebig question um given that we have allthose use cases um for comma um how willall those teams usekada at a very high level we basicallyhave two choices um choice one doing ityourself where each team will build andmaintain their own um commodasmulticluster infrastructure which is nota trivial task or maybe some of you knowwhere I'm going with this already uusing a pf path where each team willoffload management of commadamulticluster infrastructure to somemagical managed service offering and onmy team we build this managed serviceoffering so I'll take you through how wedid that starting with the most basicsetup and um gradually building up towhat we refer to as um stretchcommada so to start we build what wecall the hostcluster host or management cluster andthroughout this talk I'll use the termshost and management cluster interinterchangeably um then we installed thecomma operator from the upstream commataproject in that host cluster and usedthat operator to provision a base commacontrol plane however at this pointthere's still some more work to be doneas the promise of manage comma is foreverything including ingress traffic tomanage comma control planes as well asfully automated and dynamic membercluster registration to justautomatically work and to make thathappen we built our own operator themanage comma operator which integrateswith our internal systems to set upingress traffic to manage comma controlplanes and we'll also install thecluster registration subsystem thatintegrates with our internal systems forfully automated and dynamic membercluster um registration um in thatsubsystem we have the cluster providercomponent that integrates with ourinternal cluster inventory API todynamically sync cluster state formanage comma control planes um we alsohave cluster credentials provider thatintegrates with our internal tokenexchange service um to provide comma ofthe credentials that it needs to pushwork to register member clusters umComma also has an add-on calledestimator um that when enabled will helpto inform the comma scheduler on howworkload should be federated across theset of registered member clusters forcomma instance um when enabled theremust exist one scheduleuler estimatorinstance for each registered membercluster um and our um scheduleulerestimator provider will handleprovisioning of those estimatorinstances as well as all of the plumbingrequired to make that um feature work soat this point we have successfullyprovisioned a comma control plane for atenant um so now let's go through someof the scenarios to see what type offailures we can withstand with thistopology um the first one is failure ofa member cluster um for this tenant wehave a control plane that's joined totwo member clusters so if we have u onemember cluster failure then we can stillmaintain continuity of service asworkload running on that fail clustercan be migrated by comma over to theother member clusterhowever if we lose our managementcluster um given that this is the onlymanagement cluster we have then all ofmanage comma will be out um this is abig problem as our tenants will rely onmanage comma to power mission criticalworkload so we have to solve this issueum to do that we get a bit more fancyand introduce what's called what werefer to as the stretch topology um inthis topology we now have multiple hostclust host of management clusters thatspan multiple data centers um then wetheyn take the commada instance andstretch it across that cohort ofmanagement clustersum integrate with our DNS-based um loadbalancer manage service to provide oneunified ingress endpoint to each manageinstance and also integrate um and forthat to work um we ensure that the sameCA certificate is used to provision theinstance on both host clusters um thenwe integrate with our manage SCDoffering to provision one multiclusterSCD instance for the manage commainstance um and ensure that the instanceon both host clusters connect to thesame stdcluster and now in this topology if welose a member cluster we can stillmaintain continuity of service as theother member cluster will still be upand runningand if we lose a host cluster we canstill maintain continuity of service asthe other host cluster and the datacenter will still be up and runningum if we lose an entire data center wewill lose some of the members of the SCDcluster as well as theum some of the members of the SCDcluster as well as the host clusterrunning in that data center but canstill maintain continuity of service aswe will still have quum for the SCDcluster and we'll still have um the hostcluster and the other data center stillup andrunning um however if we lose both hostclusters then manage chromatada'sout and this one should beself-explanatory but I'll vocalize itanyway if we lose both um data centersthen manage comma is definitelydefinitely out um however those two lastscenarios should be very very unlikelyso if we ever get to this point then Ithink we will have much bigger problemsto worry about anywayum so now let's look under the hood tosee how um the stretch topology works umwe start by installing what we refer toas our central source of truth it's aminimal control plane with cube APIserver um controller manager with asubset of the controllers enabled aswell as certain manager components andit's installed um stacked on our cohortof hostclusters we then again integrate withour DNSB load balancer managed serviceto provide one unified ingress to thatcentral source of truth and alsointegrate with our manage SCD offeringto have that source of truth backed byone multi- datacenterd cluster um then we install theoperators um we have the source of truthoperator the manage chromatada operatorand the syncoperator um at this point when ourcustom resource is applied to the sourceof truth API server the source of truthoperator listening to event sources fromthat API server um will do a few thingsit will connect to our managed SCDoffering to allocate an SCD cluster forthe comma instance it will alsoprovision the um API server CAScertificate that we need to make unifiedingress work for the um stretch instanceas well as other things like umprovisioning of a client certificate umfor the commander control planeum then um the sync operator which alsolistens to event sources from the sourceof truth API server will sync all ofthose resources from the source of truthAPI server to the host cluster APIserver um at that point the managecommand operator will have everything itneeds to provision that instance on thathost cluster and we'll do um just thatand the same process happens on bothhostclusters okay so the promise of amanaged service like manage comma is tohandle all of the complexities ofmulticluster infrastructure so that froma tenants point of view use of comma istrivial and a good way to look at howthat's achieved is by looking at theprocess for tenant onboarding it's avery it's a very trivial two-stepprocess first we will create a tenencyfor that tenant that's basically acustom resource um and once the tenencyis provisioned we will then privilegethe tenant um well give the tenantaccess to create commander controlplanes in that tenency and that's alsojust another custom resource that'salready pretty trivial but we made iteven simpler than that um so this tworesource that you're looking at is oneof the resources that's part of ourcluster inventory API and it's aresource that our tenants for managechromata already own so to simplify theonboarding process even further giventhat zthe plan is for each KMADA controlplan to be joined to member clustersbelonging to a tier um we build a firstclass integration that empowers ourtenants to declarativelyum define the desired state of a commodcontrol plane that should beautomatically joined to member clustersbelonging to that tier um directly aspart of the tier resource and and that'sexactly what you see on the right umvery trivialprocess okay um for the last segmentbefore handing it back to Hankai um aspart of manage kada we've also workedvery closely with the comma community onnumerous open source contributions thathave had direct impact um for thedevelopment of manage comma so I'll gothrough some of them so the first one umas part of provisioning a control planethe comma operator needs to installinstall CRDs and by default it willdownload those manifest from somerelease artifact on GitHub unfortunatelythis approach cannot work within anairgapped environment like OS atBloomberg um so to remediate that issuewe made a contribution back to thecommander project um to allow supportfor using a custom HTTP source fordownloading those um CRD manifests umwhich um will work within our air gapenvironment this is another one um sothis one added support to the commaoperator for configuring the priorityclass of manage comma control planecomponents to ensure reliability andstability of manage comma controlplanes this another one this is the oneI talked about earlier um so in one ofthe diagrams I showed earlier in thestretch topology we have the commainstance stretch across two or it can betwo or more actually um host clustersbut accessible via one unified ingressendpoint um for that to work we have tomake sure that the same CS certificateis used to provision the instance onboth management clusters and thiscontribution added support forconfiguring the comma operator to usecustom CScertificates okay I got another one soone of our requirements from our tenantsfor manage comma is for configuringencryption at rest for confidential datathere's quite some optionality for howyou can um configure that for the cubeAPI server um but one highly recommendedapproach is to integrate with a keymanagement serviceum with when with with if um that optionis chosen then there must exist uh KMSplug-in that runs alongside the cube APIum server um to bridge that integrationfor with the key management service asthe cube API server will have to connectto that plug-in via gpc um forencryption decryption requestUm and this contribution added supportto the commod operator for API serversidecore containers to enable thatintegration for us at Bloomberg for ourinternal key managementservice um it's another one um so bydefault the comma operator as part ofprovisioning a control plane will set uppublic key infrastructureum with the leaf certificates in thatinfrastructure having a default validityperiod of one year um so to align withour organiz organizational securitypolicies and certificate managementpolicies we made a contribution back tothe commander project to make thatconfigurable so that we can configurethe validity period of those liftcertificates as weplease okay one more I think we're closeum so the way manage comma work is wemake use of the comma operator but alsohave our manage command operator whichinterates with our internal system andthe way it works at a very high level isthat our uh operator will offload workto the manage comma our operator themanage comma operator will offload workto the comma operator for provisioning abase control plane um and once that basecontrol plane is ready we have to doother things like setting up of ingresstraffic um and this contribution addedsupport to the comma operator forreliably discovering the API serverservice for configuring that ingresstraffic okay one more um um for managecommada we integrate with our managedSCD service to allocate one u multi datacentered cluster for each manage commacontrol pin um and this contributionadded super to the comma operator forusing external sedd umclusters okay one more i think this isthe last oneum we have a set of requirements forattendance for manage comma um and thatincludes the ability to integrate withour internal um off-end web hook servicefor authentication as well as forconfiguring encryption at rest um withexisting support in the comma operatorfor specifying extra args we made acontribution to also add support forextra volumes and volume mounts to andum to unlock those use casesuh my favorite part of doing this islive demos but unfortunately we don'thave enough time so gonna pass back toHungai okay thank you Joe joe has uhmade a lot of contribution to theproductum now Joe is a maintainer of thecommander operator uhright yeah thank you um I I want to talkuh about the Kamada community uh Kamadawas open source at 2021 and become asafosuh product at the same year and um movedthe levels to incubation at 2023um Kamada joined the efforts of morethan 700 contributors and uh we have auh we we we build a fast growingcommunity and uh for now we have a uh wehave 36uh 36 public adopters they run commadain in production and some of them areare use commada to uh for managing uhquite large scale of ourclustersuh souh this year uh we areuh still focus on AI training jobinference jobs uh for now I I thinkKamada can can support uh the trainingjob quite well but we still have somecorner case we need to handle and forcommand operator I believe we going tohave a more enhancement need to do rightuhso we we we also have a twouh team for performance enhancement mentuh uh they will focus on uh improvingthe performance the scability andanother team is for the commadadashboard uh a lot of user need uh a UIfor manager commadauh so uh that'sall thank you we still have some timeright okay thank[Applause]So I think we can we have time we stillhave some time so if you have any anyquestions okay take silence as none okaybut I and Jill will still uh here for awhile so if you uh going to ask hi sureyeah sorry I joined kind of halfway soamazing presentation i'm sure I'll catchthe first part how does kamada comparedto Q for example um what do you say islike the different feature sets uh I'msorry how many commada what no worriesso how does Kamada compare to Q as amulticluster orchestrator okay umfirstly I think a Q uh is a project thatproject uh provides the the Q managementin singlecluster and uh Kamala was uh built uhwere designed to handle the problem withmulticluster at in the first place uh soI think that's the big the big thedifference betweenthe Hello yeah so you're saying thatit's a bit of an afterthought maybe forQ that uh it was initially a singlecluster tool and I see they have like aleader Q with multicluster support butyou're saying actually Kamada out of thebox is trying to support multiclustermanagement and that's kind of thedifferentiator is that what you'resayingum sorry no worries no worries am Ispeaking too fastcool so what I mean is are you trying tosay that Q uh only provides multiclustersupport as a secondary feature whereasKomado like that's the main kind ofselling point for Kada you're askingmulticluster kill right yes yes uh yeahuh we have uh uh thinking about it uhfor a while um uh first we are trying todo is to integrate queue maybe integratevolcano as well uh because they both uhprovide the queue in single cluster uhso we are trying to in integrate them uhfor commander user to uh manage their uhjobs on multiclass area um so uh we'llsee how it work yeah I think we are weare working on it but for now umactually Kamada uh community and volcanocommunity have already workeduh together and provide a solution formodule named the volcano global productnice you can have a try thank you yeahI'll do that and also about uh cubeflowis there any sort of integration withcubeflow pipelinesuh so sorry all goodokay so does Kamada have an anyintegration with cubeflow pipelinescubeflow yes uh I think uh cubeflow forkamada is just a uh a set of CRD kamadahas a full support for any kind of CD soI think uh I I don't see any gap to runKF follow on Kamada nice that's perfectthank you so muchso thank you all thank you for time2025-04-15 21:58:33.788792 ��ez#��ArbVV8WIJYwwuh hello everyone welcome to thissessionum my name is Hongghai from Huawei andI'm uh one of the maintainer of ourcommadaproject and I'm J i work at Bloomberg atour cloud comput services platformteam this is our agenda for today firstum Hankai will do an overview of KMADAand some of the features that itprovides and cover some of the key usecases um then I'll do an overview ofsome of the use cases that we have atBloomberg how we build a manage serviceto support those use cases um and thenI'll switch it back to Hankai who willtalk about community growth andecosystem integration as well as currentactivity and road map for the Comaprojectokay uh I will give a a briefintroduction of kamada for those uhdon't have the context uh about whatkamada it it is and what it so uh whatproblem it solves uh commander is a aproject that designed uh to manageapplication across a mult cluster and uhuh most of our user are using commadasomething like a CD system they deployworkloads to Kamada and Kamada uh helpto schedule the workload to uh one ormore[Music]clusters anduh after uh the resource uh has beenpropagated to uh multiclusteruh the administrator might need to uhhave a global uh view of the resourceand so uh we support that the resultscan be uh reported to uh a third partythird party database uh like Alexasearch or open search and so they canbuild a global view of of the resourcesso uh another uh fancy uh feature is thefailover uh the failover include thecluster failover and application failover the for the class failover if onecluster uh crash or or shut down uhcommander can uh migrate the applicationuh from that cluster to anotherand for uhsorry yeah I just said the applicationlevel uh the election level that commadacan watches uh your application runningin single cluster and if the job uhcrashor crash command can uh migrate theapplication to anothercluster yeah I just told the uh clusterserver okay um as as you see in previousslide uh comma has a uh a lot ofcomponents as much as the communities isvery it's not not that easy to maintainso many components so uh Joe uh will uhshow their practice uh to joinKamadaokay okay so I'll cover some of the usecases that we have for multiclusterfederation and Bloomberg we have manyuse cases um then I'll do an overview ofhow we build a managed service tosupport those use cases um followed bysome of the contributions that we madeback to the comma project um that havehad direct impact in the development ofthat manual service so I'll start withthis one which is a more general one umit goes something like this we want toreliably propagate resources such asconfig maps secrets cluster workflowtemplates and workflow templates to aset ofclusters this is another one this is avery hot one to accelerate AI inferencewe want to um federate mo cacheresources on GPU nodes across a set ofclusters to reduce uh mo warm-uptime i have one more or there's a fewmw}e we havewhy why do we need to have this wholetalk about secure software distributionum you just you know you just take yoursoftware you're the software produceryou take this software foo and you sendit to the consumer what could possiblygo wrong um I'm about to tell you so thefirst thing that could go wrong is thatan attacker could just replace yoursoftware with with malware right likethey could just replace it with anythingthey want and um then your consumerwould be very sad i should have put a afrowny face there because they'regetting the wrong the wrong softwareanother thing that can go wrong is thatthey just get the wrong version of thesoftware and this is a slightly moresubtle attack that I think a lot ofother protections um you know have havemore trouble with right is that you oncehad this version of the software food2.2 and this was the correct version itwas great you signed it um and then youmake you know maybe there's a new newfeatures that are added maybe you'veeven patched some securityvulnerabilities and now you're on FU 2.3but you want to make sure that theconsumer is actually getting the correctversion and not this old version thatyou no longer want to be distributingso what about signatures i hinted atthis a second ago but what let's justsign the software right why can't wejust sign the software and then makesure that the user gets the correctversion so the first problem here iswith this outofdate metadata right soyou have um fu 2 2.2 was valid right yousigned it with your proper key and itwas it was correct um and then when youproduced food 2.3 um you also signed itbut the consumer doesn't know any betterright if an attacker is able to givethem food 2.2 you with that validsignature they'll still think that it'svalid um the other thing that can gowrong um depending on how you're doingsignature um verification is that theattacker can also you know use publickey cryptography right they can they cantake some malicious package and sign itwith some arbitrary key and that is asigned package right so there's likethis extra step where it not only has tobe signed we have to make sure it'ssigned by the correct person um and thecorrect person today right similar tothat previousproblem so yeah you now have a keydistribution problem where you need tomake sure that the software producer notonly do you have this package securelydistributed to the consumer you alsohave the key that needs you need to useto verify the package you know prettypretty soon you're using keys to signkeys and where do where do we go fromhereokay yeah um what we talk about hereit's quite complex so for that I createa a call service uh that is you can haveyour uh personalized uh bin voice so andwe use that to uh create examples realexamples uh about uh using uh the updateframework here soum what is this service um it's a reallyuh normal service with one API uh thatuh the user can request the the custo uhboth invoice and uh it is store u someum uh data in a storage and they can getit back and as this service is quitecritical um we need to secure that rightso uh let's see how um by the way it'sso critical that of course we usekubernetsYeah so now we have our our criticalservice that we're we're securing withwith with distributing securely with TUum so that now let's get to to what doesall have to do with this updateframework thing that's in the title ofthe talk um so what what is TU uh TU isa framework for secure softwaredistribution and updating um that'sdesigned to protect the freshness and umconsistency and integrity of softwarebeing distributed um it does this umthrough a number of means and I'll gointo more details about how it works ina sec but um it does it uses thisprinciple of compromise resilience sothat even if something in the systemgoes wrong um the system remains secureright it takes like a cascading seriesof failures to actually um deliver thewrong software to people um yeah this isthis includes repositories keysdeveloper accounts anything like that uhonly one of these being compromised isinsufficient to actually do an attackthis a~llows us to both reduce the impactof a compromise right no one compromiseis enough and also allow for securerecovery when those compromises dooccur um so I'm going to walk throughkind of how this works in TU so first wehave uh content integrity so how do wedo that so you have this package this isthe one we're we're going to distributein this example today um as I mentionedearlier the first kind of thought folkshave is let's sign the package right sowe have a signature on the packagethere's some downsides to just signingit in the package like this um primarilyit's that um if the once you once yousign a package right if you put thesignature inside the package thatactually changes the hash of the packageso if you have multiple signatures oranything like that um it gets reallyawkward to kind of unpack those so whatwe're going to do instead is do what wecall a detach signature which isbasically you have the file and you haveanother file another file what we call ametadata file um in this case it's inJSON in this example which includes thesigned hash of the file that we'redistributing and of course you have tosign that too right so so now now wehave a signed um secure hash of the filebut how do we know what what key issupposed to be used to sign this file sonow let's put that key in anothermetadata file and sign that you mightnotice this is going to become a problempretty soon right let's just do thisjust just one more time um to what we'regoing to call the root.json JSONum this this is the our root of trustand because it's our root of trust we'regoing to do what we call multi-signature trust so this this key at thetop the the unlabeled key is actuallymultiple keys right so you can havemultiple keys these can be your mostsecured keys probably hardware tokens umowned by different people maybe even ondifferent continents right you can doall kinds of crazy stuff to secure thisbecause it's as you're going to see thisfile actually changes way less oftenthan the other ones so these keys can beused less often that's kind of thethat's kind of why we did these threelayers here is so that the keys that areused more often can be more easilyrevoked right by more securerules okay so now we're signing the filebut we still haven't dealt with thisproblem of old versions right so whathappens if we now release fu 2.0 rightwe want to make make sure that folks areusing this fu 2.0 instead of foo 1.0 sowe add a sign we add the hash of fu 2.0to this fu.json JSON resign it with thekeyA um but but right if if you had the oldFU.json it might still look valid to tosome of the internet right if you'rejust able to replay that old metadata soto prevent this we put version numberson these metadata files fu is on V2because we just changed it and you putthese version numbers in another filethat we're going to call snapshot andyou sign that and you also put that keyum in this route this really secured umrelocation right so this is where you'refinding your initialkeys um and that and as so as theversions update the snapshot is going toupdate pretty frequently right everytime you have a new version of somethingthe snapshot will update be resigned butno changes have to be made to root rightso you can still keep those root keysoffline and less used um and able torecover if this snap mo more usedsnapshot key um if something was wrongwith it uh you can of course stillreplay a snapshot file right so we had apreviously signed file that said foo wason v1 how do we make sure you don't usethat let's add a time stamp same ideaput it in metadata sign it um and thatincludes the current snapshot along witha time stamp and then depending on yourconfiguration you can set a window oftime right within which this time stampneeds to be valid it's basically aheartbeat I think is another way thatthese can be referred to and the idea isthat um because this heartbeat is verysmall right it's just a hash of thesnapshot in a current time stamp it'svery efficient to just resign it umredistribute it and so you can make surethat everyone's getting the correctversions of the package with pretty lowoverhead and that gives what we're goingto call freshness um to therepository soum now that we kind of have hopefully abrief understanding of how tough workshow these different pieces fit togetherlet's go back to the problem that we'resolving so we have a package in thiscase we're going to talk about engine Xgive it a real name um and then you wantto make sure certain things happenbefore it goes into your productionenvironment um so you want to make surethat this this package that you'reingestingum as part of your software supply chainis a good package before you then sendit off to your production users youmight want to make sure that has an SBOMyou might want to you know look at thisSOMO for CV or maybe the package itselffor any CVES maybe do some security scanmaybe check the signatures from otherother systems that you don't controlthat maybe aren't using something asfancy as as this but you can still makesure that they're valid and that you'vedone all those verificationsteps so once you've done all thoserequirements you can kind of internallyset up a tough repository that youcontrol and manage so after theserequirements are checked you can ingestthe image to your tough repositorybefore it goes to the productionenvironment yeah so you check therequirements add the image and then umonly then if if it um verifies from thetough repository does it end up beinginstalled in a production environmentthis means that you don't have to waitfor any upstream open source umrepositories to do this all thisverification you can do it yourself andyou're also not trusting every packageon every open source repository rightwhich maybe um not all of them are asgood as others so this kind of lets youum ensure you know what you'redownloading into your productionenvironment so to summarize what doesthis give us this gives us verificationthat checks have been run right becausesomebody with a signature is basicallyattesting you could say that those umver those checks have been run it givesyou control over ingested images you youlike you're not relying on some upstreamrepositories checks you control whichimages are securely ingested not justbecause they're signed but because yousigned them and this also allows you forsecure recovery after CVS are discoveredas we discussed previously um you can uhif if you update tough metadata theusers will know right away that it wasupdated so if you remove an image fromyour tough repository your internaltough repository you can know thatwithin some period of time no no one noone in your production environment willbe using them and this also gives you asingle root of trust for your um like aa single root of trust with with likestrong protections for your um imagesthat you're using right so you have thisone root of trust that root roll andtough and then um that kind of gives youall the different packages that you'reusing so what does this look like inpractice so this is kind of a a lot alot of theory so far right you haveum what what you know what is the toughproject in the CNCF so the main thing itis is a specification that kind ofexplains what all those rules are howthey work how to make sure you use themsecurely um it also includes a bunch ofdifferent implementations in variousdifferent programming languages forfolks who want to get started on thisand actually use it in practice umlanguages include Python Go Rust PHP ithink there's a Java one too i didn'tput that logo up here um as well as abunch of different deployments in allkinds of critical services including themost important critical service yeahincluding call service so we use stuffof course we're trying to do the rightway so how uh call serves to use stuffbecause we saw the explanation beautifulexplanation from uh Marina but tough istough right um so in that case here weare using one of uh theseimplementations that are is are stuffrepostor sets for tough um you candeploy it use um uh containers helmcharts are available for you todemonstration or production deploydeploymentum and we'll explain how we organize allthis stuff metadata that w�e uh mentionedbefore so what you're going to see herethat uh we have uh those uh targets themetadata actually the delegations umthey we have first set here is for theprovenence in the way that we can storeuh the sbombs attestations uh and alsoour uh artifacts that we use to buildour service and uh also um the contentthat we generate um for to the user wealso are we have one metadata that wecan store it uh then the user can uh uhdownload the their uh bin message in asecure way so and of course we want tosecure our third part as we mentionedbefore engineix in the example we use uhengineix here that's our most criticaluh part here because if someone getaccess to that for example they could uhjust uh uh give uh our users uh badmessage and we don't want itso we uh in that uh I will show here howwe do the we secure the uh the inputtest stations and bondBasically everything uh is destroyed inthe GitHub releases for now and uh wecan use uh our stuff client as well butwe can use uh one of the implementationsto download this artifact so it meansthat uh no one could compromise uh ouruh test stations here so let's see howcould be one attack imagine that someoneum get access uh you know anvulnerability or a bad administration inour site and someone get access to ouruh uh GitHub space and he can uh haverights to edit it so imagine that attackgain this access to the GitHub helicesand he's able to go and replace our teststation for the build for example by amalicious uh one so how we can uh secureour uh users so using a tough here todownload this attestation it will tellthat we don't trust in that attestationbut if you just use like a wget todownload it because it's available youcan get the malicious uhattestation well yeah with tough withouttough so um but we also want to secureour container image like the critical uhum um dependence that we have for our uhdeployment let's show this because thisis more uh interesting I would say so wesee that we have the version uh1.25.5 of engineix and uh we want topedate it to the next uh version butimagine that we need to go through someprocess in our organization uh if thelicense are okay CVS are okay everythingthere if you someone try to deploy ithe's not allowed to deploy it he's notallowed to pull this image so to do thisactually we need to add this artifact toour uh tough metadata so we did it butstill if someone just add it to ourmetadata the someone cannot deploy itbecause it require some uh trustedsignature so okay we did ourverifications license CV is everythingnow we sign it we say okay let's add uhto the tough metadata and I use my sixdoor to sign for example and now if weafter the signature if we try to uhdeploy this authorized the version inour environment uh we we can do it asyou can see right now but let's say thatthe 125.5 is malicious and don't wantthat we do some regression here for somereason someone wants to be a new uh PRchanging that we don't want that so uhlet's delete uh that version but uh alsAlso you cannot just attack by deletingit from the metadata it requires alsoagain a signature so we need someone uhto sign it i think noise is Marinathat's signing yeah so now uh we areallowed to go and uh not allowed to notstrange here allowed to not download ituh basically now uh our deploymentdoesn't trust anymore in that uh versionso uh to get uh involved to to theproject uh yeah we have the toughdocumentationuh the rough uh project uh tough is morethe specificationuh but in that uh um uh repository youcan also find uh implementations frompython tuf go and others um the R stuffis a project uh under uh open SSF and uhit allows you to not deal with all theimplementation of tough repositorythat's quite uh complex uh you can reachout uh to us to the tough in CNCF Slackand also in the uh open SSF in the Rstuff and uh you can find the differentimplementations and and more hereyeahso yeah I think I think Car aboutcovered it so yeah thank you all thankyou all so much for coming um and let usI think we have a little bit of time forquestions[Applause]so looks like there's a there's afloat�ing mic here if anyone um hasquestions soyeah many thanks for the introduction uhone question so currently most peopledefinitely us um deploying stuff inKubernetes using some githops approachusing flux or a CD something like thatso not the tough client so is there anydiscussion ongoing how to bring bothworlds togetherthank you yeah sorry uh can you repeatthe about the client you mentioned yeahso I mean this this verification with awith a tough client was reallyinteresting of course but currentlypeople use as I said Argo CD or Flux todeploy dynamically to Kubernetes and theverification is maybe done so six doorsignatures with Kubern or some othertooling so um all these tools as far asI know don't support tough now and thequestion is how this cool metadatavalidation could also be done with withmaybe potentially other tooling yeah itcould do be implemented in your forexample in your pipelines as well um orof course uh if because you can usethose uh tough libraries to build yourown uh let's say uh client uh we havebeen working in a like generic clientthat you could just configure and useand uh of course like before rightbefore this uh uh presentation here Iwas trying to change the code from uhthe docker client to just implement uhfor example uh go tough there that youcould use a parameter to verify acertain tough repository before signingbut u we the idea of the tough clientit's more for uh depends a lot of it'sreally difficult to create a generic anda trusted uh client that's why you needto use the libraries mostly to buildyourYeah I think I'd add that the you knowthe stuff that's internal to the companycan happen before it goes to your yourfinal deployment pipeline right so oneone easy way to do it before thatsupport is fully integrated is just todo you know verification one step beforeyou send it out to Argo and stuff whichisn't quite as satisfying but um can bedone faster um but yeah we're definitelyactively working with projects like thatto help um make this all more built-inso many thingshi thank you for the talk uh from adeveloper experience or kind of way sousually developers have maybe theirtheir their uh uh pipeline somehowchecked in as code maybe azure pipelineor go actions stuff like that uh GitHubactions sorry um how you makesure like is this the part where youwill usually do the signing or is therelike a separate part where let's say thea clumsy developer doesn't likemisconfigure it and then ship onlypartially signed or like doesn't includethe right stuff for the signing um doyou have like a separate step for thator how do you usually do that so thequestion is how like how to make surethat you don't like forget to sign itbefore you try and send it out to peopleyeah and that the the the the importantstuff is signedyeah i think well I'll let Carson toobut I think one idea is basically justto you you do the verification at atmultiple points right so like once theydo it you like you try to do the toughverification and of course if it failsthat means you can kind of send it backto the developers and you can put thisall in the GitHub actions right and sayyou know try again please please likeyou know run these commands or whateverbefore you before you send it out umyeah so like one one thing where I waswondering how how it could play togetheris I mean the sbomb stuff is easy to Isay easy to generate but when it comesmaybe to something like static codeanalysis and you want to make sure maybethat there's a proper report for the thethe static code and there let's say sonacube or something like that yeah isthere something like obviously everyproject has its own configuration thereand like any recommendations how toconfigure that or how to handle that iwould actually say that probably the thething I would look at here is um toughsister project which car can talk abouttoo into because into I think has a morelike generic format for attestationsright it's not just signatures on thefiles themselves it can put you canactually put some kind of rich datainside of the the attestation right umwhich could say like for example thiswas the actual outcome of the scan thisis all the stuff that happened these arethe inputs the outputs um and then havethat be signed you can also likedistribute that alongside the toughthere's ways to connect theBut I think that like signature formatmight be like a better fit for for thatthank you yeah and within totalattestations you can also createpolicies to verify that and for examplecombining with tough if uh some policydoesn't pass because there's uh notenough coverage for example you could uhnot allowing uh this to be signed and uhyou could for example uh also sign thetough metadata using um let's saymachine keys like KMS keys so one ofyour first signatures could be come fromyour pipeline that said oh everythingwas good so you sign that and if you ifyou if you sign that in a specific uhdelegation let's say staging delegationyou could still use this to deploy evenyou very finess check it uh for yourtest for your next tests level and whenyou say now I want to hel you could haveit in another delegation and say now Iwant my hel manager my tech uh my leadsign that with their six store personalkeys then we can releaseThanks a lot for all that clarificationaround the projectum I have a question around yourboundaries uh in the framework and whatyou propose does it something whatoverlaps with SLSA or it's somethingcombining withSLSA where is the limit it's notcompletely clear for me yeah so um theSLSA framework it's actually um it'srelated to the into framework i thinkthey use a similar attestation format umthat provides I think it really focuseson I think the build provenencesituation um in in you know in in thisin this process right so it focuses onsaying you know these are the actorsthat were supposed to do the builds thisis what happened but at the end of thatkind of salsa framework you have um wellI guess in the middle between when youcreate it and when you verify it youhave some you know you wrote this set ofattestations that you created this alsoattestations you have the image that youneed to distribute and you can actuallythen securely distribute those thingswith tough and make sure that the salsaum attestations themselves are also upto date right so it wasn't just thatlike you know you know this version ofthe software was built correctly it hasthe salt access stations it is thisversion of the software when you do thata second time you want to make sure thatpeople are getting the new one right andso you can use tough for that thatum that that like distribution of the umsupply chain metadata step okay youprovide the packaging finally ofeverything yeah and there's a couple ofexisting integrations of um mostly inToto and TU but um in Toto and Salsaagain they kind of do a similarattestation thing so you could like veryeasily do the same thingthere um I was wondering about the umthe repository service is thisrequirement or is it also replaceablewith for example a generic uh OCIregistry to keep the metadataum I I I will explain a little bit withR stuff so um R stuff or TU in generaldoesn't uh care where the artifacts arestorage you can use OCIS um you can useuh S3 buckets and everything else uh thetough metadata usually doesn't gotogether but it can even be together uhwith um your artifacts and for exampleAR stuff right now supports uh PVCvolumes uh that you can share later anduh also S3 buckets so you can have a andwe could for example easily if requiredin the future have OCI support for thetough metadata in our stuff as well welland uh maybe if I may ask anotherquestion um to continue on on whatsomebody else asked um is there a planto maybe have an admission controllerfor uh tough to yeah to run it inKubernetes and then deny admission whenI would love to see it if someone wantjoin me uh I I'd love to do thisyeah it's not that that hard but I I'mnot the expert in Kubernetesso all right I think I think we're justout of time so thank you all for comingand um we'll be around here and we havea kiosk also in the project pavilion ifyou want to come ask some more questionsthank you all[Applause]2025-04-15 21:58:34.306820 ��^|#��sAVCmp--NcxeEokay let's get started Um thanks forwaiting everybody Uh the keynotes gotgoing a little bit late Um so I justgiven a couple minutes for people to ummake their way over My name is Joe BettsI work on SIG API machinery as atechnical lead I've been doing that fora couple years Um before I worked on APImachinery I also worked on uh as amaintainer of CD for a couple years Soum have a background in kind of bothsystemsSo today's um agenda is kind of an introlevel um session for API machinery So ifyou don't know what API machinery is umwe're going to spend some time goingover that Then we're going to go overthe updates um in the Kubernetes 1.33release that we've been working on Umand then we'll kind of finish outtalking about some future plans and howto get involved in theSIG So let's look at what API machineryis It's a kind of a broad crosscuttingSIG Um so it's a lot of differentresponsibilities Um our coreresponsibility is the rest mechanics ofthe Kubernetes API Um so that includeseverything that is involved in defininga REST API the versioning um theserialization protocols resources subresources all ofthat Um and at its core like one of themain things we do is we provide supportfor building resource definitions Um sothat's how you build types like thebuilt-in types pod node those types ofthings But also how you build customresources so CRDs and things like thatUm sometimes you'll hear the term KRMwhich means Kubernetes of resource modelSo we kind of have an opinionatedapproach to what a resource is Um and alot of that is supported by APImachinery Um the various aspects of thatinclude everything from defaulting toversioning to conversionum the semantics of apply and patchmechanics subresources um all kinds ofthings likethat Um we have um a significant amountof investment into control planeextensibility So that includes you knowcustom resources is one of the mostprominent and obvious examples Um butyou can also add custom resourcesthrough aggregated API servers So wesupport that Um we support extensibilityof the admission control What that meansis anytime a request um is coming in aright request is coming in to thecontr��E{#��AAlIYXVIPsk_Uhello everyone uh good morning uh thankyou uh to join us um my name is Kaido umI'm a open source soft engineer i'm ummaintainer of uh tough the updateframework also uh in Toto um and I'mhere today with uh Marinayeah hi everyone i'm Marina i'm aresearch scientist at Adira and also umone of the co-chairs of CNCF's tagsecurity and most important for today amaintainer of the tough project so we'lltalk to you today about about tough andhow to toughen up your software supplychain as the titlesays so first I'm going to talk for aminute about this software supply chainmetadata that a lot of folks are haveand I guess the problem that a lot offolks have so you have an image that youwant to distribute to to users you knowyou probably have already have apipeline for doing this you have a wholeprocess and then you have all this otherstuff right that that you're starting togenerate and um look at as part of yoursoftware supply chain um I think sbombsprobably the most famous of these isalso attestations maybe vex to goalongside those those sbombs and policykind of all this other information aboutthe image and its dependencies and itsvulnerabilities um that you want to alsodistribute alongside the image so youhave this whole collection of thingsthat you want to somehow get um to thesoftware consumer consumer while makingsure that none of it is tampered with umso they actually get you know not onlythe image they expect but all the othermetadata and stuff that they expectso we're going to kind of talk aboutthis problem very generally and then goback to this problem of um softwaresupply chain metadata distribution sofirst we'll just look at how you knowthis this this looks a lot like you knowthe original distribution problem how doyou get that image to users securely umbut you can really distribute anythingsecurely in thiswayum so what can go wrong why ar|�ol plane you can intercept that umthrough admission control You can eitherreject it or modify it Um and so we havetwo mechanisms mechanisms for that Oneis called web hooks So admission webhooks you can validate or mutate umresources um during admission Um we alsohave a new form of admission control umthat is based on inline logic You don'thave to write your own binary as a webhook Um so we call those admissionpolicies and they use a small embeddedprogramming language called the commonexpression language or cell to do thatWe also support cell more generallythroughout the API So that is amechanism that we support that allowsyou to have small inline chunks of logicum throughout the API It's used in umlike DRRA for um expressing complex umsemantics if you want to allocate aresourceUm we're responsible for the languageclients of um of the API server Um so inparticular we spend a lot of time on theGo clients Um but we support a varietyof different clients Um we have bothtyped and dynamic forms of those Um soyou can um the typed form we provide onethat provides all of the built-in APIsUm you can generate your own typedclients or you can use the dynamicclient with CRDs And we also providediscovery for clients So clients candiscover what's available in theAPI Um we also provide the controllerinfrastructure Um so the coupecontroller manager kind of as aframework is our responsibility Um andmore generally controller manager is ourresponsibility So all the infrastructureto build those um is things that we workon Um so that includes the informerinfrastructure the watch mechanism allof that to have efficient controllersrunning in your control planesUm beyond that we um take responsibilityfor the reliability scale andperformance of most of the control planeUm SIG um scheduling does takeresponsibility for the schedulerspecifically Um but when it comes to thecontroller managers the API server um alot of that falls on us Um we also sharethat with SIGCD which is a awesome newSIG that we haveUm to kind of give a sense of what thebounds of our responsibilities are Ifigured it'd be helpful to talk aboutwhat API machinery is not So we are notresponsible for API review Um there is adedicated group of really talentedpeople that do that Um so if you areworking on a feature you do need APIreview We would be certainly happy togive you advice but we are not officialreviewers Um and there's a dedicatedgroup that you need to go to forthat We are also not responsible for allAPIs Um APIs are owned by the respectiveSIGs Um so while we do own some APIs wedon't actually own that many APIs um asa SIG Um we own a couple around um umyou know our extensibility mechanisms Sothe CRD resource itself is ourresponsibility for example Um we arealso not responsible for controllersalthough we own the controllerinfrastructure um and the frameworkingfor controllers Um the controllersthemselves are also owned by theirrespective SIGs Um we are notresponsible for coupube cuddle that'sowned by SIG CLI and I've mentioned CD acouple times already but we are notdirectly responsible for CD that's ownedby SIGCD Um and there's been a lot ofgood work um on redefining the interfacebetween machinery and segate cityrecently Um and that's I think that'sbenefited us both and has resulted insome opportunities to improveperformance which isgreat All right So that's kind of whatSIG API machinery is in broad terms Umhopefully this next section will makethat a lot more concrete because whatwe're going to do is we're going to gotalk about individual caps that we'vebeen working on recently Um so this isthe list of cups I'm going to kind ofbriefly go over Um it's there's I thinknine cups here They're broken up into umthings that are progressing or thingsthat have been actively worked on in thelast cycle So there are other cups thatare open in our SIG but these are theones that have had activity in the lastcycle So these are ones I'm going tospend a couple minutes oneach Um so what I'm going to do is I'mjust going to go through these in orderstarting with the alphas Um so ordernamespace del�etion This is a reallygreat ke um for anybody that's hadtrouble with um namespace life cycle Umit's just an alpha So um we want you totry this out if you're interested in itBefore this ke was in place If youdeleted a namespace there was no orderof deletion of the resources in thatnamespace So if you did happen to havesome finalizers or um some ownerreferences those would be respectedduring deletion But other than that itwas just completelyunordered Um so the first ordering we'regoing to add is that we are going todelete the pods in the name space andwait for those to stop before deletingthe rest of the resources This kind ofeliminates that class of weird edgecases that you can have when theresources in the namespace impact theway the pods run And if you're startingto delete them before the pods go awaythe pods can kind of change at the lastsecond and do weird things Um so thisthis makes for a lot more predictableperformance of yourworkload Um the next cap is snapshotableAPI server cache Uh you don't have toread this whole thing I'm just going tosummarize it really briefly For a longtime the API server has cachedinformation um that you can access fromCD So not all requests to the API servergo directly to CD A lot of them areserved by a cache that cache um has beenused heavily for watches for a long timebut only more recently have we startedto serve more and more list requestsfrom it And this kept finishes the storyarc of making it so that all listrequests are now served from that cacheby it's restructured the cache and madeit possible to do that Um this is reallyimportant in terms of having predictableperformance because without this cap youbasically have two code baths with verydifferent performance profiles and it'skind of hard to know without being anexpert of API machinery which of thosetwo code paths you were going to executewhen you made a list request Um so whatthis kept does is it consolidates thatall to a single efficient code path Sono matter what kind of list requestsyou're making you're going to getpredictable performance You don't reallyhave to worry about itOkay next up is emulation version Thisis a large change and this ke isactually filed under SIG architecturebut because there are a number of peoplein SIG API machinery working on it Ifigured I'd call it out The way this keworks is that we add a- emulationversion flag to binaries like thecoupube API server and you can provide aversion number a Kubernetes versionnumber as an argument Um so for exampleif you are running Kubernetes 132 binaryand you set this to 131 that binary isgoing to pretend to be a previousversion of that component Um so it'sgoing to basically you can think of itas like it's automatically turning offAPIs and switching feature gates tomatch the version that you askedfor So why would you want that um wellone major use case for this is if youwant to have more reliable upgrades youcan break your upgrade into two stepsFirst you upgrade your binary versionbut you fix your emulation version sothat it's not changed Um and so from auser perspective the APIs and thefeatures haven't changed at all butyou're getting to test out a new binarySo as a cluster administrator upgradingyour cluster you're you're testing outthat new binary making sure it works andif it doesn't you can roll it backknowing that you haven't accumulated anyusage of new APIs and features becauseyou haven't turned them on yet Um sothat's kind of the core idea And you'regoing to see a couple other caps um thatwe've been working on related to safeupgradesUm this is another one of those I'm notgoing to spend a lot of time on it butthe idea of mixed version proxy is thatwhen you're upgrading especially whenyou're upgrading a high availabilitycluster and you have multiple APIservers when you're upgrading they'reall not going to be the same version atthe same time You might have some thatare 133 and some that are 132 becauseyou're rolling your upgradesout During that state a user that isbeing load balanced into these APIservers might be able to perceive thedifference of those those �um of whatevertheir request reaches whatever APIserver their request reaches becausethey're different right some some areadding new APIs some are deleting oldAPIs that have been removed Um so whatthis kept does is it hides thatdifference If your request reaches anAPI server that cannot serve the APIyou're asking for it proxies it over toa pure that can if oneexists Um next up is the seabboardserializer So we've had CRDs inKubernetes for a long time CRDs areserved and stored as JSON today That'snot particularly efficient Um so Seaboreis aum binary protocol that is basicallyJSON equivalent It is a self-describingprotocol but it doesn't have theserialization and deserialization costof JSON Um and so we're going to beusing this in storage and when clientsask for it over the API to get a farbetter um kind of performance profile ofserialization deserialization It'spretty cheap to use Um so you don't getquite the storage compaction you do ofProtobuff which is what we use heavilyfor native types Um but you getsomething that is quite a bit closer Umso this will be a big um improvementperformance forCRDs Um next um we're into our betas Sothe first beta is declarative validationThis is not something you're going tosee as a user directly yet Um but it's abig change to the way that we'redeveloping Kubernetes Um the idea is foras long as the project has existed wehave we have defined our APIs in ghosttrucks and then written a bunch ofhandwritten validation in another filenext to it um that controls what'sallowed to be written to thosefields And so if you actually want toknow for sure how a type in Kubernetesis validated the only real way to knowis to go find those Go files and huntthrough and look for the validation codeUm so what we're going to do is we'regoing to replace all that handwrittenvalidation code with markers on the Gotags um that declaratively specify whatthe validation is Um this makes it a loteasier as a user to know for sure whatthe validationexpectations are Um and then it alsoallows us to build tooling based on thisSo today if you go to our open APIschemas which define our types you'renot going to find a lot of validationinformation Um but once this is in placewe can then publish this information outthrough Open API and you'll get a farmore enriched um validation summary fromthat that endpoint Um we're going to begenerating the internal validation codefrom these annotations So they are goingto be the source of truth they're goingto be if you see one of these you knowthat that's actually the validationrule Um next up the next beta feature iscoordinated leader election This is akind of complicated um topic um but thebasic idea is when you're running an HAcluster um you're going to be runningmultiple control plane nodes and each ofthose control plane nodes has acontroller manager the coupe controllermanager running But the way that we wantto run the coupe controller manager iswe only want one of them active at atime We want mutual exclusion Um and sothe way that we've achieved that todayis that if you have three of thesecontroller managers running they racefor a lease and the one that wins thatlease becomes the leader and the othertwo become passive backups or passivereplicas um ready to take over at anytime So we're going to change thisslightly because this works really wellduring steady state when thosecontroller managers are exactly the sameIt doesn't really matter which one winsthe lease but during an upgrade you canperceive the difference depending on ifthe old version or the new version getsthe lease And during an upgrade it'sreally easy to have a situation whereyou start with the old version go to thenew version jump back to the old versionagain Um and that can result in kind ofsurprising behavior depending on what'sgoing on And so what we're going to dois we're going to switch to an aslightly different approach of leaderelection where all of the controllermanagers announce their candidacy forthe leadership and then there is acentralized leader election coordinatorthat looks at the candidates and picksthe bes�t one The default strategy fornow is going to be pick the oldestversion because that keeps you fromviolating skew between components bothfor upgrades and for rollbacks Um but weare entertaining other strategies andthis can be used more broadly Um you canuse this to actually ask the currentleader to give up leadership which isnot something that you can do today Umand you could have different strategiesum for differentsystems Um next up for betas isstreaming encoded for list responsesThis is probably the most significantperformance enhancement um in thisrelease and in fact in the last couplereleases So what this does is itswitches the way that we serve listresponses back to clients fromaccumulating all the information inmemory um when serializing and thensending it after we accumulated it tosending a to to serializing each listitem one at a time streaming it back tothe client and not maintaining any morememory than that in the API server Soyou can imagine if you are requesting alist response for hundreds of thousandsof resources the memory profile of theAPI server is dramatically better inthis It can concurrently handle manymany large list requests without runningout of memory or running into problemsSo this is a huge stability improvementfor the API server I'm really lookingforward toit Um the last cap is CRD validationratcheting This is also kind of a subtlefeature So today when we validateresources uh either if it's a crate oran update we just validate every singlefield in the request It seems like kindof an obvious thing to do but forupdates if a field hasn't changed youdon't actually need to revalidate it Youalready validated it when it was createdAnd so ratcheting is the idea of notdoing that revalid revalidation And theonly reason that this actually becomes aperceivable difference is sometimes thatwe change the validation on fields Um sothis happened in 133 Um there was a CVEaround IPs and citers and we introducedtightening validation for both of thosetypes And what that means is thatpreviously valid stored field data hasbecomeinvalid And so now we have two choicesWe can either reject all updatesum now that that tightening has happenedif the old data was bad or we couldchoose to only reject updates thatactually touch thosefields Um and we've chosen the latterthat we believe that it's it's a muchbetter transition experience if you'retightening validation to only fail ifyou're actually touching the fieldsyou're in um that that have thevalidation changed for Um so we addedthis to CRDs a while ago and we'refinally bringing that to G So if youhave a CRD and for some reason you needto change the validation all yourcontrollers that don't touch that fieldshould continue to work It's only yourclients and controllers that actuallyinteract with that field that mightstart breaking Um which reduces the riskof the change somewhat It's still abreaking change It's still prettydangerous but sometimes you have to doit All right So that covers I think mostof the caps Um let's talk a little bitabout future plans for SIG API machineryUm so there's two areas I wanted topoint out The first is you probablynoticed that we had quite a few cupscoming in around making upgrades saferUh this is a priority for our SIG Um asKubernetes matures we want to make surethat we make upgrades the bestexperience we can for users Um and it'sgoing to take us a while right like weneed to earn trust with clusteradministrators Um but by adding some ofthe the caps that we've added we'reproviding the tool set to make thoseupgrades safer Um and that combines wellwith other things that are happening atlarge in the project So for example umother people outside of SIG APImachinery like Jordan Liot have beenworking hard to um improve our backportpolicy so we don't break patch versionsum by backporting things that are riskySo we've really done a lot of work tomake that safer Um we've also stoppedturning on beta APIs by default becauseeventually those all go away and that'sa breaking change So now if you're notactively turning on beta APIs you're notgoing to be broke�n by that Um so that'skind of the trend that I hope I hopepeople can see here Um the next kind ofsignificant trend is um in towardsdeclarative APIs Um so there's kind oftwo aspects to this Um there'sdeclarative validation like I mentionedis focusing on making built-in typesmore declarative improving theirdefaulting improving their ratchetingthings like that Um we're also focusinga lot on bringingCRDs more in line with how native typeswork Um now CRDs are sometimes ahead ofnative types CRDs have had declarativevalidation for a long time Um and that'sworked out pretty well Um but they'realso behind in some areas So in just inthe last release we finally added fieldselectors to CRDs Um and that's a reallyuseful feature You can now filter CRDsby some field in them Um you can usethat in you know um along with some ofthe new um authorization features to douseful things like constraining exactlywhat what resources in a group ofresources a particular user can seeum and so there's more caps coming alonglike that additional printer columnswill with cell is going to allow CRDauthors to give a very specific view oftheir CRD um when you ask for like atable format back in cube cuddleum which has been missing for a longtime Um we're working to add morevalidation formats to CRDs So you can dothings like say I want the C to makesure the CRD's name um or this fieldhere is a Kubernetes resource name It'sa valid Kubernetes resource name Todaythe only way you could actually do thatwould be to go find the rag X and plugit in Um so we're going to give you um anamed format for things like that and abunch of other things um where there'salso gaps in um the way that you cancompose types So if you're trying toembed a type in another type or you'retrying to reference another type there'snot a lot of API machinery support forthat Um so we're looking into ways to dothatbetter All right Um that covers most ofwhat I had to present The last thing Iwas going to say is we are activelylooking for contributors in the SIG Umif you want to reach out to us um here'sour Slack Um we re we meet twice a weekSorry I gave uh US times here Um buthopefully these are somewhat uh Europeanfriendly times Um and then I've listedout um our chairs and leads So with thatI'm going to stop and take any questionsyouhave Thank you[Applause]Um we'll use thisone M um I was looking do can you sharesome more details about the new formatsthat you're going to provide for uh theAPI like label selectors is an areawhere uh I don't want to use a a webhook to validate it So it would be niceto have it included in the API specsomehowUm I will add to these slides and uploadit Um we have we have an issue that istracking um requested formats andimprovements Um so I think labelselectors was already on therefortunately Um but yes there there'sthere's a variety of other onesincluding like I think I think wesupport like an IPv4 and IPv6 format butwe don't support a general IP formatthat could be either Um so there's abunch of like really core types that aremissing but also things like selectorsand you know conditions and status andthings like thatAllright Anyone elseuh uh I'm seeing that um there are someissues uh that uh for instanceserverside apply defines some uh listtypes and and um merge keys uh while incustomize you have the patch type andpatch key I think it's called it's verysimilar and kind of represent the samething I I went to another talk aroundwith six CLI uh I waswondering who who could put that forwardI work as a platform engineer So we do alot of CRDs and we provide CRDs to usersand they also use tools like customizenot the CLI tool but in other contextlike flux or argouh and I would really love those thingsto be coordinated so we don't have toprovide extra tools to to to supportcustomize betterYeah I I would love to learn more aboutthe specific problems you run into So ifyou have an issue please send it out tothe SIG but um yes there are two patchmechanisms that have almost identicalrequirements in the in the metadata thatyou need to put in the types Um I myintuition is that could probably be madebetter So we should look intoit All right any other questionshow's the journey oh that's loud How'sthe journey from um original um uhstreaming support to uh maybe somethingmore modern that's quick and HTTP3 fullystandards compliant how's that goingum so is this this a question about theJSON streaming support or uh the HTTPlayerum yes So the the streaming supportwe've done in this release I I don'tknow if this answer your question butI'll try Um is is mostly focused onmaking sure that the API server doesn'taccumulate a huge response in memorybefore it starts serving it Um so it'smostly it's mostly focused on thataspect of itSo I'll clarify I was talking more aboutlike if I'm going to do a streamingrequest to fetch um I'm going to execinto a pod and that's a streaming thingbut the the AP API server gets involvedSo that sort of evolution from earlyKubernetes to where we want to be Um Ithink I think once you once you have thestreaming responses we have on the listtoday plus the watch mechanism once onceyou've got that initial list done andyou're starting the watch you end up ina pretty efficient mode where you'rejust kind of constantly gettingupdates Um if is there a performanceproblem or is there kind of a standardsqu is it more of a Yeah Is it more of aperformance question or more of astandards question i was I was askingabout this this the journey towards likeweb standards So web you know forexample websockets over HTTP3 as a assomething that any client that thatunderstands how to do websockets couldcould use for an exec I know that's anode API but um as an example Oh yeah Umwe did recently and I didn't list it Umwe did recently do work to move away umso we have moved over to um a standard astandard websockets um and we're usingthe I think the gorilla library for thatSo there is a cap for that Um I believeall the code merged this release Um sogo check that out and and see if thathandlesit All right one more and I think yeahthere's one back thereUm I had a task recently to introduce anew API field that needs to be sync withalready built in the in the API So whatis the approach for uh read requeststo default this new fields because ofthe defaulting and validation they areconsidered only for write requestsSo are you asking to add a new field tothe standard APIs not the standard APIbut to uh API server extension in ourcase Um yeah so um you can adddeclarative defaulting today Um we justmerged declarative validationum in 131 and we've only added a couplevalidators so far So it's a little tooearly for general purpose use inaggregated API servers Um but it's justanother generator It's validation genright next to defaulter gen and defaultgen Um so as we build that out and addmore support um that'll very quicklybecome available So over the next coupleof releases keep a very close eye onthat and if there's specific validatorsyou need please reach out Yeah it wasmore or less like uh we already had afield Now we introduced like for theservice IP and IPS that was introduced acouple of years ago and when youintroduced this new field you have tokeep it in sync with the alreadyexisting field for compatibility reasonsI'm sorry I'm not sure I totallyfollowed thatum you know in the service API rightyeah spec IP was a singular IP and whenthe IPv6 support was added then thisturned into array not a single valueright and both this field had to be keepin sync So when you send out a create orupdate request to the API server it canuh sync them with the defaulting But ifyou just try to read uh a service thenbasically there is I think no defaultingand uh you as a client you might notreceive uh the new API resource withboth fields setYeah So is is your question um are weworking on that or yeah if there is someapproach to um we should talk to theexperts on that I haven't touched thatfield in a long time Um yeah so bringbring up an issue or or drop somethingin Slack Um there's a couple expertsthat we could talk to about that one inparticular Yeah Thank you All rightwe're at time Thank you so much everyone2025-04-15 21:58:34.840159�We've got clouds we've gotmultiple clouds they're you know theyhave their own quirks We've got on-premUm so we want to work around you knowrecognizing the environmentalconstraints and how you know differenttopologies are different clusterdeployment topologies are are you knowcan do different things and um focus onAPIs that work everywhere Uh we want toavoid solving optional problems Um soyou know focus on like the core problemand that's where you know we startedwith multicluster services and have beenexpanding out just focusing on you knowsimple use cases that solve concrete uhproblems that everybody has um andmaking sure that you know we areconsistent and compose well withexisting APIs so that you know we'reproducing a series of building blocksthat you can use in whatever way uh yousee fit uh a core building block and andthis is conceptual um is uh the clusterset and what this is is a uh a realpattern of use that we see in the fieldIt's a set of uh clusters governed by asingle authority Could be anorganization could be a company uh couldbe a platform team uh with a high degreeof trust within that set So these areclusters where you expect to deploysimilar applications or the sameapplications um across the the set anduh they all revolve around uh the thingthat brings them together is thisconcept of namespace sameness and thatis that you know a namespace uh with thesame name in any cluster in this clusterset is used for the same purpose thatmeans permissions uh the samepermissions the same kind of workloadsum and a name space doesn't have toexist in every cluster so that you knowyou can you can segment your fleet Butwhen a namespace exists it means thesame thing So you know we discouragethings like calling every dev namespacedev and then you end up with a bunch ofuh of conflicts So better better namesthat are identifiable in this you knowmulticluster uh case Moving towards theworld of cattle not petsSo on our way to move to let's see so tocattle insta pets there are a fewprojects that we are working on and thefirst one is about is about about APIright what it is about um as Jeremyalluded to first is um in the when inthe beginning um Kubernetes actually isuh the whole boundary is its own clusterso it is the universe And uh like do youknow what's the name our our ownuniverse we don't know It's the universeSo you don't need to have a name Butthen somehow we figure that it'smultiverse Now you need a name So aboutAPI is there to give a name for thatuniverse So that's where it come fromand I think um u we had multiple votesum to decide what exactly that idea iscalled Um so we get that settled Uh sofinally we our universe has a name andum and we go way beyond that right justthe name itself is is good better thannothing but we still need to know a lotmore about this universe So we are inthe process of adding more properties uhinto this API to have more universal uhcommunitydriven or consensus propertiesthere so we can actually describe thatuniverse better other than just a nameSo with that um it isum oh actually I kind of just go throughthat is uh it's what give it moreconcurrent um give it more well-knownproperties and uh with that um one ofthe reasons that we need to describe acertain cluster is uh orchestrationright as again Jeremy alluded to thatnowadays many times you want to knowwhere you put your AI jobs whatever jobsyou have all applications there becausethere are certain scarcity of resourcesor price and you want to go find theleast uh expensive places So how do youdescribe that um that's where this uhproperty comes into help right if youhave certain properties can you canassociate with that cluster then you canmake those decisions other other thanotherwise you only have names you have awhole bunch of kettles they all havenames but you really don't know who iswho or what who are good at what so umyeah historically we have uh made a fewattempts at least from the sik site andone of the probably more famous one isthe Kubby Fed Um but uh um we would wewould say that probably it's not themost success story in the six histo�rybut uh that demand still exist clearlyand uh again from this kubic count Idon't know how many I lost count ofnumber of multicluster talks my handsdefinitely is not in enough to countit's like every day you see 10 of themor something like that so it's clearlythe demand is there the the use case isthere the community is looking for somecommon solutions They don't have toreinvent the wheel Um so uh from a sixstandard point of view we the firstthing that we take a stab at that iscalled a cluster profile API That's kindof the youngest uh API among these uhexisting APIs So that proposal comesfrom a lot of um open source uh projectthat is has uh real case real uh worldusage and we come together and withmultiple discussions we get to a um kindof u beginning of that cluster it'sstill a journey that we haven't uhfinished yet So one two things thatwe're really uh trying to get to that isum so what is the cluster profile rightthe cluster profile is you have aboutAPI that describes itself but when youget to multiverse you have this kind ofcattle ranger you actually has to figureout which cattle works best for which uhwhich uh metal whatever you call itright so at that point you want to havea representation of each cluster socluster profile is for that purposeIt's a read only cluster readonly APImeans it's reflective of the certainproperties which is which is about couldbe about about API properties and thenyou can uh do scheduling ororchestration there So there are twopieces that is we are really working onnow One is the uh credential part Sobecause just have a object representinga cluster doesn't give you automaticaccess to that cluster you still don'tactually have a way to do that So uh andobviously stored the secret um someplaceisn't uh isn't probably will not flywith any of your security complianceofficers Uh so we need to find some waythat is uh uh better than that So we areworking on that The second one is umagain the properties Uh we haveproposals here We would love some moreinputs from the community Give usfeedbacks It's a work in progress Uhlet's see what's next Yeah So this iskind of the cluster profile propertiesuh diagram I don't know if you can seeclearly Um basically you can see the uhthe the yellow ones are the realclusters and the the blue ones are thecluster profile that is justrepresenting those uh clusters and thenyou have a cluster manager that'sbasically sitting up top there and sayokay I know which one every cluster whatthey what they are doing and uh and thenthe the best part is with that unifiedAPI you can actually have popularmulticluster abled um third party likeAgo CD Flux Q MultiQ to actuallyintegrate with any of those um any ofthose cluster managers that speak thislanguage It's kind of like now you don'thave to talk to them individually Youhave a one common language that you cantalk to So that's kind of the beauty ofthat We are still working towards thatgoal especially we about that uh secretswhen we have this credential part sortedout this will work very well And lastlyum those the cluster profile API itselfis doesn't live in the vacuum it as Ikind of alluded to before that you havethe about API that we get the propertiesfrom and you also have this work APIanother API we sit in the uh s thatabout how do you actually assign jobs toeach cluster right you you know this jobyou understand it now you wanted thisthis cluster to actually do some workfor you so that's where the the workcomes into being and even with thatafter you have the work there you stillwant them to talk to each other and herethat Laura is goingUm all right I'm going to talk about acouple more projects that um occur inSIG multicluster Uh one of them which weh is one of the older ones at this pointSo if you have been to a SIGMC talk orfollowed in the past you've probablyheard about this is the multiclusterservices API or the MCS API Um this is astandard for how a service object justlike you're used to in a single clusterenvironment can be um basically flaggedto be exposed across a cluster set sothat it could be consumed by umworkloads t�hat are in a totallydifferent cluster from itUm the there's some links here about uhthe multicluster services API Um sincewe last gave a maintainer update in uhNorth America um Psyllium has come a lotfurther on its implementation of the MCSAPI and there's also a ton of other umimplementations of this API that you canuh take a look at Um there are someupdates ongoing um regarding as we asmore implementations use this API thenwe get of course more feedback about howto improve that or how we can representthe standard better Um so uh there's av1 alpha 2 uh still in progress and umanother really important aspect of thisum the multicluster services API isaboutuh uh service discovery across clustersSo you can also think of that as like aeast west direction Um but the gatewayAPI as many of you may also know um isabout that north south traffic from aclient outside of the cluster to um getuh to endpoints that are inside thecluster and there's an integration pointthere if people from outside the clusterwould like to contact any of theendpoints that might be representativeof whatever that uh service is across aset of many clusters So that connectionpoint between uh the MCS API and SIGMCand the gateway API in SIG network issomething that we work very closely withthem on Um and there's another uh linkthere to a GE gateway enhancementproposal that uh talks a little bit morein detail about how those leverage eachotherUm yeah and I'll this is kind of like alittle cute version of the MCS plusgateway Um one other project that issuper super new that just recently umwas proposed to the SIG for um adoptionas a project under the umbrella of SIGmulticluster is multicluster runtime Umthis is an extension of the existingcontroller runtime framework formulticluster use cases and it's intendedto be a drop-in um solution um and uh issomething that we also see uh very closeconnection with uh the work that Ryanwas talking about um regarding thecluster profile and the idea of exposingthese standards of how that list orinventory of like what all your clustersin the multiverse that you live in umthat uh is a really good integrationpoint with what the folks who wereworking on multicluster runtime um werealready working on Um we also like thissince as we mentioned in the approach uhslide uh as SIG multicluster we'rereally trying to um integrate well withthe existing Kubernetes APIs as you'reused to them in a single clusterenvironment I didn't talk about this toomuch in detail but a a important part ofthe multicluster services API was tomake it feel like you were just using aservice but like it just happens to bein multiple clusters Um and themulticluster runtime project is alsotrying to take this existing frameworkcontroller runtime that everybody isused to and make that uh something thatcan be uh used across um a multiclusterum cluster set So I'm going to hand itover to our co-chairs again to talkabout what's nextThank you Laura So coming up next in noparticular order and this isn'texhaustive either Um so cluster profileAPI Um we need to work on integrationsof that with the actual ecosystem Um asJeremy said at the start we focus onproviding actual APIs that have meaningacross a variety of implementations butwe're also interested in keeping trackof uh the implementations themselves Forexample Argo CDMultiQ We're also very very interestedin coming up with canonical patterns Ifyou look at the people up here we allwork for infrastructure providers in oneway or another we're not actuallybuilding solutions with these uhmulticluster APIs So we might be gettingthings wronguh and what so that's why we're reallykeen on getting um feedback and I'llcome back to that from usersuh in theSIG and we we with the goal ultimatelyof coming up with patterns not just theAPIs themselves but how you can use themto build applications that work acrossmultiple clusters things to watch outfor things that work well um and then soI'll skip the next one because thatJeremy wants to talk about that but alsojust figuring out what else we needbecause as Jeremy said to ummulticluster ha�s been growing in atleast visibility uh events such as thisone and we've also switched it seems tome from uh having multicluster meaningjust provisioning multiple clusters andbeing able to manage them in acomfortable way to actually buildingapplications that are designed to usemultiple clusters and so connecting themhaving information about all of them Umand there are probably pieces that aremissing or pieces that every singleimplementation does in similar ways andso should perhaps becomelibraries Uh but yeah we know thatthere's a bunch of stuff to to add AndJeremy I'll let you talk about leaderelection Yeah this one this one's kindof been a a a pet vision of mine for forquite some time but I think especiallywith you know the recent adoption of themulticluster runtime project and as westart exploring more in that in thatvein the need for um you knowmulticluster coordination is is moreimportant than ever right now that weactually potentially have a path torunning clusters or running controllersacross multiple clusters um and inmultiple clusters we need a way for themto coordinate So I think you know thisis one of those really interesting areasthat's uh that's more and more importantover time and also a great uh you knowuh starter project for anyone who wantsto come in and start kind of ideaideiating with us on on what that couldlook likeRight And there are a few other thingstoo that spring to mind that aren't onthe slide But in general uh anythingthat you can do inside a cluster wherewe need to think about how it could bedone across multiple clusters that'salso uh interesting to us And so forexample KE4444 if I'm I think uh is now availableAnd so that uh allows you to specify umwhen you're talking to services whetheryou want local ones or remote onesThat's something that SIG network didbut that's probably that we should bethinking about extending in SIGmulticluster Uh there's also the uhquestion that of network policy thatcomes up a lot that we'll have to tacklesomeday And with all this work to bedone uhwe really need um more not necessarilypeople to work on it but more input Umso we're interested in finding out whatyou're all doingwith multicluster in general whetherit's with multicluster APIs that havebeen developed in the SIG or not If youfind that what we're doing is missingthe mark that's very interesting to usum and things that you're buildingthings thatum you're that you would like others tobuild for you Um and so we're alwaysinterested in seeing your designs seeinghow how you work with all this stuff Sodemos uh are always welcome um whatproblems are you trying tosolve and what unique needs do you haveum that we haven't taken intoconsideration because you might thinkthat they're unique to you but and itturns out that very often they're notAnd so if you're interested in helpinguh we could use help with the testsheets So if youum well our APIs have got test suites togo with them because they are APIs thatare designed to be implemented bymultiple implementations Uh we providesomething like conformance suites thatallow implementations to say okay wemeet the requirements Uh our website islacking in detail and could use somework Uh also recently so we mentionedthis multicluster runtime is now hostedin the SIG and it's an experiment So uhwe're looking for people to try that outand see if itworks And so how can you do all thiswell come to the SIG Uh we have awebsite that you can find there So thethe slides aren't up on the on SCED yetbut we'll upload them You'll be able tofind them there There's a Slack channelwhere you can come and chat with us Uhthere's a mailing list as well that youcan see here And if you sign up for themailing list you'll automatically get acalendar invitation for our meetingswhich up until now were bi-weekly butwe're changing them to be weeklystarting next Tuesday Uh you can see thetimesthere Um and that's it for thepresentation We've got a few minutes forquestions There's a mic in the middlethereHi So it's not as much a question but Ithink yesterday at uh Argoon there wasalso I think it was for �MCO themulticluster operator Um there wasalready someintegrations demonstrated Um I thinkwhat they intend to do is uh make use ofthe about API not use the work API andthen still do their own credentials as asort of midway while everyone is workingon it Um is there something you guysfeel is uh is the the the better way todo it because on one hand you end upwith people with kind of in betweenconfigurations on the other hand it doesmove everyone forward I mean what isyour perspective on thatyeah I can start Um yeah so I think uhwell I guess quarantine could kind ofappear too now Yeah but um so MCO isintended to uh integrate with the SIGMCstandards and I know there's like kindof where it is now and then there's likewhere it's like going forward too Sowe're in a point in time right now umand uh integration with the about APIand the cluster profile API are thepriority for MCO specifically Um I thinkuh SIGMC or maybe even to back up moreKubernetes has always been reallyinteresting because sometimes it worksas an organization that ships softwareand sometimes it works as anorganization that ships standards and umSIG multicluster because we recognizethat there's like lots of differentenvironments that people are goingacross cloud or in cloud or onrem orlike the reasons and the variety of howthey're deploying things multicluster isstill like you know there's some levelof like standardization that we justlike can't or like there's a highceiling there's a high skill ceiling youknow so um so it's always been importantto us to like find that space betweenwhen are we shipping like the solutionand when are we shipping the API thatlike people can coales around so um Ithink for um MCO uh integrating morewith uh the standards as we've set themis the plan and then um We expect anddesire there to be many differentimplementations of these APIs That'sdefinitely the goal And then I think onelast bit is that where in that pro whenin that process that um the differentimplementations are having a difficultylike staying on standard or whetherbecause that's they have a legitimatelike edge case or reason to do so orbecause it's too hard for them to meetthe standard That's a place where it isI do think it is the jurisdiction ofSIGMC to provide the libraries figureout where that you know if that line oflike when are we shipping software andwhen are we shipping like kind of likenudge that to like wherever it is thatmakes that easier In fact clusterprofile is a great example of thatbecause there's a long time where SIGMCwas kind of avoiding an API API aboutthat because there wasn't really a clearconsensus on what that would mean and wedid get more people in the room and morepeople at the table to talk about thatand then now we feel more confident thatthat's something that people canreasonably adopt and that we have thebandwidth to help them adopt if it'smissing somethingThat just sparked more questions butokay there's no quue so go for it Um souh one is uh in ISTTO you have serviceimports and service exports I think thatis exactly the same API as we're goingto see in uh the multicluster uh ormaybe not exactly but it's very similarto when you want to choose to export orimport a service from another cluster Umis that something that is uh scheduledto be part of multicluster soon or isthat something that we're pushing out orjust so it's essentially the samequestion for Argo but then for STTObecause they're doing their own thing inone way in the other hand they are alsodoing the gateway API so that is moretowards a common standard and I supposethey're going to do the same thing withservice imports and exports perhapssimilar to what selium needs to whatlink needs they are all common needsYeah So the service import export APIthat actually is the MCS API that is themulticluster API and IST is animplementation and and MCS is an exampleof something where we defined a standardand no implementation and there'sseveral implementations Um you know uhwe on this stage and a bunch of folks inthis room have built their own andcollaborated on various implementationsUm and yeah that is STTO is one Sothat's that's a great example of I thinkthe standard actually working So you canyou can do it on this platform you cando it on that platform you can use STOyou know and it's the common pattern Allright that makes a lot of sense becauseit was working so well that it was justhappily using it and ignore thedocumentation So yeahgood job STO teamAnd if you're interested in seeing theAPI actually in use there was a verynice presentation yesterday by Ryan andAugust You can watch the video uh whenit's available if you didn't see ityesterday about uh signal multiclusterAPIs in use in OCM and cube fleet andalso cluster ADM rightno Was it not a talkbut yeah we haven't we haven't watchedall the mastercluster talks that havetaken place at this uh CubeCon Yeah wehave those GKE has one There's there's Imean there's there's really quite a fewnow It's it's great to see theproliferationUm you were mentioning uh the linkbetween MCS and gateway and how gatewayis normally north south and MCS is eastwest Um but with with gamma uh gateway'sapproach to that is to use service meshfor for east west and so I'm wonderinghow that interacts with MCS which isdoing it obviously without a requirementfor a service mesh and yeah how doesthat workso kind of that actually kind of buildson the previous question I think it likethe MCS API is really just the API Sowhether you're using a service mesh ornot using a gateway you can target a aservice that is in a single clusterThat's kind of the traditional gatewayapproach Or you can target amulticluster service And whether or notyou're using a mesh um you're basicallysaying I don't care where the endpointsare right i it can be across clusters itcan be in that cluster it could move toa different cluster Um but it's a way ofkind of creating that abstraction So aservice consumer uh doesn't really needto worry about uh where the producerdecides to to put the workload Right Sowhether there's a mesh or not is just animplementation detail Exactly Okay OkayOkay Cool Okay Thanks Uh thank you foryour presentation Uh so I I've seen thatuh uh you actually have some standardsfor the uh east west traffic Uh so howthe services actually talk to each otherin different clusters But are there anystandardization on uh like uh uh northsouth traffic as well So when uh clientswants to reach to like differentclusters uh there are projects like caseGB I think uh sens projects but uh arethere any like global uh service loadbalancer type of uh standardization thatyou are thinking of as a likereplacement for ingress in multicloudstuff So that that's precisely whatgateway is is trying to accomplish So ifyou so gateway would be how you thegateway API there's you know many waysto use the gateway API and and many youknow it's it's effectively um like thenext generation ingress controls rightso you can build your global loadbalancer out of the gateway API andtarget that multicluster service andkind of build that um that abstractionso that you don't need to you care oryou can control how you know clientsconnect to your service Okay But butthere still like are they going to likeoperate on the proxy level per clusteror is it something that integrates withthe upstream DNS servers uh upstreamintegration So yeah the the multiclusterserver like the the barebones well itactually depends on the implementationbut for example you know having workedon GKE I can talk about that like it's abarebones like you know cube proxy orwhatever data plane based uh solutionbut when you know that's where you canplug in a service mesh and get morecapabilities Um using gateway you kindof move things up a layer So the idea isthat the API describes what you want toachieve which is make my serviceavailable Um but the implementation canvary depending on how you're consumingit and all the configuration and whetheror not you're using gateway and howyou're using gateway and so you knowit's a lot of control and flexibilitybut in a common pattern so you can swappieces in and out Okay Thank youUh thank you everyone[Applause]2025-04-15 21:58:35.630211 ��P~#��WAZManfhV6DZUhello Hello everyone Uh thank you fortaking the time especially like towardsthe end of the day Um I do I do enjoy uhI do enjoy a maintainer track session umfor so many of you Uh uh actually let'sget a let's get some audienceparticipation How many of you contributeto Kubernetestoday Active users ofKubernetes Do you have any maintainersof Kubernetes in the house maintainersof otherprojects Cool All right Cool So we got anice little mix Um so today we're gonnawe're gonna��T}#��_A-SFVDr3wQ_whello Uh welcome everyone to the SIGmulticluster intro and deep dive I'mJeremy Olstead Thompson uh one of theco-chairs of SIGMC and I work on GK atGoogle And I'm Stephen Kit the otherco-chair and I work on multicluster atRed HatAnd I'm Ryan Zan from Microsoft workingon multiclusterAnd uh my name is Laura Loren I work atGoogle on GKEAllright So um what are we going to talkabout we're going to kind of give a highlevel uh overview of what the SIG'sabout and how we approach the problemsUm we're going to go into some of thecool things that we're working on Andthen uh and and some new things thathave recently developed and we're we'revery excited And then um we'll uh get tohow you can contribute And the mostimportant part here is hearing from youSo what are weabout clusters right um multicluster iseverywhere uh these days and I thinkwe've shown this before We've talkedabout this before and I feel like youknow this has been increasingly true atevery CubeCon for years now and thisyear more than ever you know we'reseeing lots of you know lots of theproblems we've talked about in the pastand um that I'm sure a lot of you arefamiliar with um fault tolerance anddata locality and policy and thesethings you know drive you to need morethan one cluster for your workloads Umbut more than ever especially with youknow the AI boom we're seeing right nowand this increasing utilization of thisuh scarce capacity we're seeing the needto do um things like capacity chasingand you know running workloads that arejust kind of finding the hardwarewherever they can and multicluster isyou know the best way to do that inKubernetes Um now some of the problemshere are that uh Kubernetes was builtwith the cluster with the cluster asbasically the boundary of the universeright uh all of the concepts inKubernetes don't really have anyself-identificationuh built in right and so we've beenslowly kind of whittling away at thatand um making some progress at thingslike uh multicluster services and andthe about API We'll talk about thesesoon where we've kind of introduced waysto um you know break down some of thosebarriers but uh there's a long way to goUm although I feel like less than therehas been in the past and hopefully umsomeday very soon we'll say we did itbut you know I think I think it'll besometime um and we really need yourinput So there's been you know everybodyhere please come if you have questionsor or use cases we'll talk at the endhow you can participate but we'd love tohear your stories Um and thank you tothose who have recently come and kind oftalked about how they do things So whatis our approach um it's a very diverseecosystem �� kind of go over what thesteering committee does what we've beenworking on what we're responsible forand kind of how we interact with the theother governance groups within theproject Um but first things first who weare Mache do you want to Yeah So my nameis Mache and together with Stephen andfive other friends I remember seeingPaco Paco's right over there So he's ourthird uh steering member He's the one onthe far right at the bottom We have fourmore that are not currently present Isaw some passing by in the in thehallway Uh we are the Kubernetes uhsteering member you have all theinformation the pictures where we workwhat we do on a daily basis I'm notgoing to repeat that You can always readthat information as we are talking abouta little bit more uh interesting thingsSo so actually what do you work on inthe project outside of doing steeringOh yeah So uh it just so happens that ifyou currently look at each and everysingle one of usually spans at least twoor two or even three areas It just sohappens that the majority of us aretechnically involved in the actualeither release process or writingcertain parts In my case that's uh 6 CLISo everything cube cuddle If you touchedcube cuddle I touched it as well If Ibroke it I'm sorry I'll fix it Ping meon Slack Ping me on GitHub ProbablySlack is faster Uh if I brokecontrollers I'm sorry Same thing withregards ping supplyUm and and batch that's that's a newthing that I've been involved uh for thepast few years give or take I rememberwhen we um set up the the working groupWhat about you Stephen Yeah So um so alot of formers um I think uh the uh so Iused to be one of the co-chairs of SIGAzure uh as well as the um uh nowdefunct uh they're well they're bothdefunct uh now defunct uh SIGPM soproduct uh program and projectmanagement of the project Um so ifyou've heard of the uh little thingcalled KPS uh Kubernetes enhancementproposals that is one of um the thingsthat you can blame me for Um if uh so Ialso spend time with SIG release I'mcurrently one of the co-chairs for SIGrelease We have uh we have a fewco-chairs and technical leads but um Ithink I'm the most tenured at this pointUh so yeah I've been doing SIG releasefor a while Uh my team is responsiblefor releasing Kubernetes every cycle UhI'm one of the founders of the releaseengineering sub project and have builtup a team of release managers to about Ithink we're about 15 now kind of acrossthe world Um so that is my day-to-daywhen I'm not wearing the steering hat Umand then all of these lovely folks UhSasha is one of my co-chairs with uh SIGrelease Uh Ben is responsible Um he'she's over testing and Kate Infra UmPatrick is uh also testing Antonio istesting um and networking Yes Um so yeahwe get um we get around the project Uh Ithink that it's not just uh so whatwe're here to talk about today actuallyhas nothing to do with the technicalaspects of the project Um we are umwe're going to talk a little bit aboutsteering and and how we do what we do Umbut to give you some context aboutKubernetes there are community stats asof March 2025 We've got close to 100,000contributors right and and over,200 orgmembers to Kubernetes um huge project Ithink um you know common misconceptionthat Kubernetes is just KubernetesKubernetes we we uh we have multiple uhuh GitHub orgs that we maintain SoKubernetes Kubernetes SIGs Kubernetesclient uh wow what are all the ones nowas well Um and there's one more I'mprobably missing but uh a couple moreYeah Yeah Yeah It's it's quite a few butover you know close to 400 repos and 3032 uh governance groups within thecommunity So that spans across uh SIGsspecial interest groups right Uh workinggroups uh WGs uh committees um and uh nonot not so much uh user groups anymoreright Yeah we've we're kind of turningdown user groups and and pushing that upto the CNCF level but there's quite abunch of uh there's there's quite abunch of moving pieces in Kubernetes umfrom the um people governance side Umcool So what does the steering committeedo We are a governing body of theKubernetes project and we providedecision-making and ov�ersight pertainingto the Kubernetes bylaws uh sub projectsand financial planning Um the what thatthat means on a day-to-day um is is kindof interesting I think sometimes we willhave quiet months and years andsometimes we'll have really active onesuh a lot of the work that we do uh so sofor reference uh the way we so I thinkyou know uh SIGs or special specialinterest groups are our main kind ofunit of of of work right so the theseare um these are groups that arechartered to focus on a specific topicwithin the project right um so SIGrelease for example is responsible formaking sure that we release Kubernetesuh at a reasonable frequency and of highquality and what goes into that and huhand on time and on time and on time Soon time is umuh subject to interpretation sometimesuh depending on what is happening youknow vulnerabilities um you know I thinkdoing bug fixes for features that arerecently introduced um can get kind ofinteresting but that is what my group isresponsible for and do you want to talkabout uh 6 CLI um maybe necessarily 6CLI 6 CLI or SIG apps and probably SIGrelease beingthe overarching body throughout therelease process especially that they ownour body souls fingers and minds for asI was uh explained that uh recentlythree times a year for about 14 weeks uheach time uh all of the other sigs uhabide to whatever SIG release says uh weneed to make sure that the features thatwe're working on within our uh smallergroups for 6 CLI that's literallyeverything that um touches cubectl forsig apps that's everything that touchescontrollers um the batch working groupalso touches controllers uh and a coupleuh sig uh sig sponsor projects thateverything goes smoothly we have thenecessary enhancements in place arereviewed agreed read on which basicallymeans there's one of the meetings thatyou usually will pre uh you will presentyour topic you will try to convince ifthere are questions or answer eventualdoubts if everyone are uh in agreementwe are uh moving forward and also if forwhatever reason whether there is aslippage with regards to you didn't makeit because life happened or or anythingwe can decide like oh okay we won't beable to make it and we'll just push itto the next release or because the re uhthe review process for um for aparticularfeature took a little bit longer thanoriginally anticipated while we were uheither implementing or reviewing aparticular feature Additional edge casesthat hasn't been um discussed ordiscovered before are coming into uhinto the day of light So a lot of thosesix are pretty busy during each of theuh each of the release to make sure thatthe train is slowly chuggingforward So special interest groups ownuh own sub projects Sub projects owncode for the community Um so the theonly unit of uh governance that ownscode are sub projects and they aregoverned by sigs Um there are workinggroups as well Working groups are meantto be timebounded efforts within thecommunity that are basically agreed uponby two or more sigs Right So the timebound may be in the order of monthsmaybe years Um but we want to make surethat that time bound also has some exitcriteria right We are going to stop thisworking group when we have figured outhow I don't know LTS works right So LTSis a is an example where we um where wecame out of the LTS conversation with aone-year support cycle for Kubernetesthe first time around and and then uhpeople were like we're not done with LTSyet So a few years um I think it's a fewyears at this point right back um we werestarted um working group LTS andthat's uh some of I think some of themajor uh providers looking to providelong-term support and figure out whatwhat buttons and knobs need to be turnedwithin the the Kubernetes community tomake that happen Um committees uh so Imentioned all the governance groupbecause because committees are a littlespecial um committees are uh elected byuh some subset of the project Right Soin the uh in the case of steering uh thecommittee members are uh yes wait rightso we have new steering members right umuh some of some of them are actuallyreturning stee�ring members like uh likeBen um but Sasha uh and and Antonio joinuh the steering committee very recentlyUm we so the steering committee membersare elected by the community right so ifyou are Kubernetes contributor uh youare eligible for the the election rightum you're eligible to vote in theelection you're eligible to presentyourself as a potential uh candidate forthe election um for the so we have threecommittees it's steering uh the code ofconduct committee and the securityresponse committee the security responsecommittee is responsible for uhvulnerability management they areeffectively the the open sourcevulnerability management team for theproject A very um uh reasonably sizedgroup and also um incredibly uhincredibly learned on on security topicsUm the but each of these groups and thencode So the security response committeeum we don't really run elections asoften for them but they are uh usuallynew security response uh committee uhmembers are proposed by the existingsecurity response committee And thenfinally the the code of conductcommittee And the code of conductcommittee hasum yes the code of conduct committeeBoom Uh so we've got Divia Danielle SeanHey Sean Um and um are are the uh newlyuh joining uh members of the the code ofconduct committee and we thank Hillaryand Xander for uh for their work um overthe over the last over the last fewyears Um so I mentioned the committeesuh because the committees are a littlespecial in that we have uh we are one ofwe are the only governance group withinthe project that are uh chartered tohave private conversations Right Sogoing back to uh what the steeringcommittee does sometimes we do a lot ofthings behind closed doors and you maynot you may never see some of the thingsthat we do Um but some of the thingsthat we do in public uh you know on theon the funding slide um on the the slidethat mentions financial planning uh wehad a we had a repo called Kubernetesfunding right that was owned by steeringa little while back um it's quite awhile back at this point um and we wouldget these uh we'd get these fundingrequests hey I' I'd love a few I'd lovea few dollars to be able to to uh standup testing for my for my project rightum I just want some infrastructure tokind of vet this this idea idea before Icontinue along Um and and over time wewere like hey this is this is somethingthat needs to be managed by a team rightSo again uh the steering committee isnot uh meant to be a technical bodyWhile all of the people on steering arepretty adept technically um the thesteering committee is not is is not atechnical body And what we focus ondoing is being uh a delegator right umif there is uh if there is a problem tosolve we we do act as the the voice ofthe community right So if we're ifthere's a press release if there's umyou know if there's some statement thatneeds to be made on on on behalf of thecommunity we are the ones doing it Butfrom the perspective of um we we alsowant to make sure that uh we have theright people doing the right thingsright So when we saw lot lots of fundingrequests were coming in forinfrastructure right We said hey wemight need a new governance group forthis alto together right and that is thethat is effectively the the the originstory for um the sig kates infra rightso that sig is responsible for managingkubernetes uh community infrastructurein a way that all community members cancontribute to right um so uh all ofthese governance groups are also umbound by charters that they um that theystand up and the steering committee isresponsible for approving and reviewingand approving all of thosecharters So we've gone through thesteering new steering members Uh we havea wonderful uh former steering uhcommittee member and our uhrepresentative on the CNCF governingboard Kristoff Plleer um at Red Hat Umwe have talked about our code of conductcommittee members And I think what wouldbe cool next is to maybe go into what'sbeen happening in the community uh in uhfrom the context of the annual reportsYeah So as Stephen was talking we have alot of the things that we're handlingbehi�nd the doors But one of the thingsthat we actually doing every single yearin the open in the community uminvolving all of the SIG leads or theSIG leads or chairscan promote a person to do help themprepare is basically annual reports Theannual reports at this point in time arestating currently around two importantquestions what is important that yoursick your working group did in the pastyear we're talking about 2024 at thispoint in time that you would like tohighlight so th those are the majorhighlights that I that I uh went throughall the 32 groups that we have all thereports and those are the importantthings multiple uh six and war groupshave shifted their chairs their techleads uh if it wasn't previouslyexplainedThe difference between a chair and techlead is at some point in time we onlyhad one role for every SIG and a personwas both responsible for theorganizational side which basicallymeans for example presenting at CubeConsuh making sure that the uh the weekly orbi-weekly meetings are on time thatthere's agenda eventually if there's noagenda cancelling making sure that thecalendar updates are up to date makingsure that the annual reports are filledin on time and eventually respond uhresponding and helping guide uh the thework around the SIG Uh we've noticedthat over time certain SIGsuh have not one rotated people but thenmeant that those people aside from thoseorganizational things they were alsoresponsible for technical side of aparticular uh special interest group orworking group So we decided to dividethe roles and sik and that's entirely upto any special uh interest groupdecision whether they want to haveseparate people that are responsible fororganizational side of things andwhether they want to uh have uh specialtechnical lead role which basicallythose are the people that will bereviewing your PRs those will be thepeople that will be reviewing yourenhancements for for any particularsubgroup those are the people that thatwill help and make sure that thetechnical side of the group is um issound and is and has been discussed uhmultiple times Usually other importantum things that have happened in the sixwere also multiple six in cooperationwith sikcontribly separate group that isresponsible for contributor experienceThey help us promote uh all the workthat is happening within our specialinterest groups or grow groups Withtheir help we are able to put togetherinformation about what's going on We'republishing blog posts We are able to usethem toum also promote the events that arehappening within the community on socialon various social medias we have apretty significant um user base throughvarious social medias and it's fullynicely automated So a big shout out toour uh contrib for all the work thatthey are doing Uh not only that theyalso put together a workshops which helpbring new contributors into theKubernetes project because that's thebiggest ask that we've been hearing formultiple years that contributing toKubernetes at this point in time isextremely difficult for newcomers So bigshout out to the sik contrib uh signothey introduced a new role which iscalled cap wrangler which basicallymeans the cap Steven was talking this isa process that is helping us to definewhat is the new feature what is theshape of the new feature how we want tolook how the implementation will looklike what kind of use cases it it willcover and the cap wrangler is basicallyto help uh people to go through the capprocess from within the uh the SIG noteSo I I think I I want to stop andhighlight Cap Wrangler for a secondbecause I think one of the we weretalking about SIG release a littleearlier and this is kind of where all ofthe you know where the snowball kind oflike rolls downhill and we try topackage all the things out to get outout out of the door on time Um but italso represents so so one it's anincredibly important role um and it andit kind of pulls it kind of de-risks therelease team when they we have theenhancements team the enhancements teamhas to start ahead of the rest of therelease team sub sub team roles uh toensure that we� kind of have this bundleready to go and people understand whatwe're releasing for the next cycle So Ilove this from a SIG release perspectivebut I also want to point out that um thedocs the SIG docs has something calledthe docs wrangler right And the docswrangler um so so this is kind of arepresentation of like we found a rolein uh in a different SIG We're learningfrom each other right We found a role ina different SIG that makes sense to umthe makes sense to to to to copy rightEffectively And I think um you know kindkind of from the you know proud SIGrelease dad uh perspective um we have alot of uh we have a lot of groups withinthe the the project that have adoptedlike handbooks right so I I wrote someof the first handbooks for the releaseteam way back in I I don't want to talkabout how longago but we've seen lots of lots ofgovernance groups across the projectkind of like introduce handbooks as welllike hey here's how you do the job likeyou know and going to Mach's point aboutlike encouraging contributors to to toget started and showing them showingthem the map is a big way to do it So Ilove to see kind of thecross-pollination of ideas across thesigs and see them come out in the annualreport Okay so those th this was thefirst question What do you want tohighlight Now jumping quickly forwardthe other important question that we'reasking every sik and work group is whatarea of the project you're strugglingwith That means whether you havesufficient number of reviewers andapprovers whether that's a sub projectbecause every special interest group canhave their own sub projects This iswhere the Kubernetes 6 repositories comeinto play This is the place where everysik for example if you've been listeningto keynotes on Wednesday it wasmentioned that headlamp uh was beingadopted as a sik UI sub project So itwill be under kubernetes6/adlamp and that's one of the subprojects of the sig UI every singleother um SIG will have their own subprojects and those are one of thepotential areas that they might needhelp because for example in case of asix CLI we have a KUI project and sadlywe lost the main contributor we cannotgetuh we haven't heard from from them in awhile we're trying to reconnect but ifwe're not able to to connect and get thethe maintainer back that means we willhave to archive the repository becausewe don't have anyone who is interestedin And in a similar fashion all thespecial interest groups and all wargroups and that's unfortunately aseparate slide because there's like thismany help that we actually need Uh wewe're only a handful of people that areuh keeping the project running I shouldhave put the XKCD Maybe we're not assmall as the the people from the XKCDOn the other hand how big the the userbase is of aKubernetes itself that's probably stillnot a uh an overstatement So SIG Docswill always happily uh accept help towork with dogs and blogs specspecifically Recently they announcedthat there they are having issues withGerman and Vietnamese translations Wehave several of the uh dogs in variouslanguages available and it just sohappens that both the German and theVietnamese translation lost theircurrent uh maintainers and we are on thelookout for uh for new maintainers thatwill be willing to step up and help withthose um those pieces of ourdocumentationSo I think this this also highlights umyou know it's important to talk aboutwhat we're we're doing but also where weneed help because um and I think youknow in the Kuey case and and few of thethese projects I think people don'trealize as you were mentioning thatthere is um there are these littlemicrocosms within Kubernetes right wherewhere there is there is help neededright like you know if we we're to talkabout again sig release right we wouldlove more release managers right to tocontinue following the sun but we shouldyou for anyone who is interested inKubernetes and interested in a projectthat looks familiar to you on this slidelike take a moment take a picture likeum we are looking for that um but but onon the on the on the on the other handright it's also a natural and healthyprocess to talk about what it looks liketo spin down a project right subprojects should not necessarily lastforever and when they stop making sensethey should be archived and and thereshould be a healthy discussion about howand when when to do that I think we'vegot a pretty good process But again onthis need help slide if you're seeingsomething that you're interested in andis important to your day-to-day take amoment um and and think aboutcontributing tothis Like I mentioned the war groupsthey have their own set of problems andand set of challenges that they are alsoworking with Um and even as Stephen wasmentioning the work groups are only timebound and we are moving slowly forwardFor example the structured logging theirgoal is to make sure that the structuredlogging is present in Kubernetes As soonas they achieve that goal this is theirexit statement We will be winding thatdown uh the batch that one just want tomake sure that we have the necessaryprimitives to to to run any kind ofbatch workloads I know it's prettyopen-ended but there are very specificexit criteria uh and even though theyare sometimes um similar to oroverlapping with what device managementfor example or serving work group whichare the new work groups that we'vementioned in one of the previous slidesthat were uh spun uprecently they they have their own uhvery specific exit criteria and any helpwith regards to what they are working onwhich are the projects whether that's aa part of the core Kubernetes itieswhether that's one of the sub projectsthat they are handling that's the workand if you're interested in definitelyshowing up for one of the meetings orjust pinging people on Slack askingquestions um I can tell you one thing uhthe Kubernetes community is probably oneof the most welcominguh communities that I've ever been partofum there's no stupid questionsUh and I know having a major uh impostersyndrome I I've I've never been afraidto ask questions even to senior membersof the community or asking them for helpfor reviews I'm always telling everyoneif you have any questions if you need areview if you need an advice a littlebit of handholding just ping me on SlackPing me GitHub I'm getting too manynotification on GitHub Slack is alwaysfaster Um I not I will not always beable to get to your PR your your ask butI don't mind being pinged again andagain Um because just life happens andsometimes I I don't remember my name asthe and the week just flew by and andwell there's weekend I'll recover andthen there's Monday and I I'm I'm all uhre-energized and I able to uh to helpresolve uh your problem So in the spiritof asking questions I realize we arerunning short on time but if anyonewould like to um first uh there's whereto find us We've got a QR code going togive you all the information about howto um to meet us But in general uhgithub.com/cubernetes/ community willgive you access to all of the governancegroups and and and who who people areand and how to find them Um so do wehave uh maybe a a question or two beforewe go Anyone want to step upAlso I'll mention that aside fromtoday's meeting today's um every firstWednesday and that wasyesterday we are online for you to askquestions Uh you can always reach out tous on the public either mailing list orour uh Slack channel And like I saidevery first Wednesday you can bring anykind of questions that you might havewill be covering various topics becauseit just so happens that a lot of us isheavily and technically involved in cubeUh frequently we will divert the thediscussions to various topics which areat hand at any given point in time Uhthe community's questions are alwaysresolved first because we want to makesure that the problems you're bringingto the table are being handled uh withthe highest priorityLast call for questions for thissession Okay quick andpainless Thank you everyone for takingthe time today I hope you enjoy the restof your CubeCon2025-04-15 21:58:36.158095�eviewingthe legacy console reliance engine X andsidecarblowup now I know this diagram is a lotto digest and you don't need to fullyunderstand it we just wanted tohighlight the complexity of our previousconsensus protocol-based systems wherewe heavily relied on console clients andservers for key value store dynamicreconfiguration and even in some casesservicediscovery running the console server andclient had introduced another criticalcomponent that must remain highlyavailable for our ingress systems tofunction this made our systems a lotmore fragile if console ever experienceddowntime or networking issues this ledus to look for a solution that couldsolve this problem without all the extraservices and sidecars that we needed tomaintain engine X became a painointbecause it was introduced nearly 10years ago to handle requirements haroxycouldn't fulfill at the time this isalso handled by an external team whichcreates an even larger functional gapbetween the proxies it operates as anadditional proxy hop for nearly all ofour traffic which introduces unnecessarylatency complexity and maintenanceoverhead enginex uses a set of Louisisscripts that handle JWT-basedauthentication rate limiting sessionmanagement and course configuration as aresult we're managing multiple ratelimiting and authentication mechanismsacross different layers which againincreases the operational overhead andintroduces a lot more potential pointsof failure we needed a proxy that wouldbe able to consolidate haroxy andengineext into a singleservice running haroxy in ourenvironment required a complex ecosystemof sidecars and supporting components inorder to meet our needs we had two initcontainers that we needed for a healthyharoxy startup we had our haroxycontainer for handling all of ourtraffic we had two sidecars for handlingdynamic configuration generation we hadtwo sidecars for log management two sidecars for additional monitoring andlastly the haroxy controller as you canimagine this increases our risk ofcascading failures requires carefulcoordination across all of ourcomponents it consumes a lot moreresources than we'd like and in generalit's just not a good time to debuglastly Hgroxy does not natively supportopen telemetry which is what allservices at Docker are using for ourtracing this left a massive hole in ourobservability and was one of the manyselling points to thismigration overall this left us with alot to think about for what we needed tofulfill uh for our nextproxy so we wrote down all of ourrequirements and we started evaluatingproxies against them we've added ourtechnical requirements and criteriaevaluation for your benefit later if youwant to take a look uh but we're notgoing to be covering everything herewhen evaluating our proxies we heavilyfavored features that were supportednatively rather than requiring anycustom extensibility to avoid thatscript and sidecar blowoutwe narrowed our proxy selection to envoyand traffic with traffic offeringsimpler runtime updates and golink basedextensibility but falling behind envoyand high load performance and advancednativefeatures envoy's dynamic XDSreconfiguration robust community andpotential for go- based Wom extensionsmade it the most efficient and strategicchoice for us it has a steeper learningcurve but our ability to contributeupstream upstream thanks to uh Katarinaas an envoy maintainer gave usconfidence to move forward ultimatelyboth proxies were viable but Envoyaligned best with our scale performanceefficiency and long-term strategy makingit the foundation for our ingressmigration so now we needed a way tocontrol and manage our new envoyimplementation so we debated threecustom uh custom methods a fully customcontrol plane using envoys XDS APIs thestandard ingress object and the gatewayAPI we wanted to be able to provide ourinternal developers an easy way tomanage their own routing and since ourapplication teams are already deeplyinvolved in Kubernetes having aKubernetes native solution was a majorselling point so although the fullycustom XDS approach offered full controlover onvoice conf�iguration it requiredsignificant engineering effort to buildmaintain and to be easily consumed byour developers this operational overheadwas too significant for our team tomanage so we leaned towards an alreadybuiltsolution off the bat Ingress was ruledout quite early because it lacked thecomplexity and customization that weneeded but that's when gateway API cameinto focus at this time Envoy Gatewayhad just graduated to generalavailability and it looked prettypromising it provided advanced routingcapabilities without forcing us to buildeverything from scratch plus because itwas Kubernetes native it fits seamlesslyinto the existing workflows and enabledself-service routing for teams weultimately chose the gateway API for itsricher routing model operationalsimplicity and alignment with Kubernetesconventions this allowed us to focus ondelivering value rather than reinventinginfrastructurenext Katarina is going to review whatour new ingress architecture looks likeso on this slide you can see the highleoverview of how our new ingress lookslike uh the request flow starts with theclient or docker engineer connecting tothe dual stock Amazon elastic loadbalancer which is L7 load balancer alsoknown as LB uh LB serve as entry pointto Docker network uh we split entrypoint by traffic type it can e either beexternal or internal traffic externalclients can connect with HTTPS or JRPCwhile Docker engineers connect with uhVPN plus HTTPS or JPC uh the role of Albis to perform the TLS termination dohostbased routing and to load balancetraffic across the target proxy groupbased on the health and capacitystatuses uh next after traffic has beendecrypted at LB host and path throughhave been evaluated request is sent overH1 or H2 connection to the selected envyproxy instance proxies which are alsocalled data plane are grouped by progateway each gateway serving specifictraffic type like for example externalor internal stable or canary and theyaggregate specific business logic for agiven traffic type uh for example at avery high level let's take the externalgateway which offers rate limitinggeolocation decoration andauthentication for the external requestsenvy proxy run request to theircorresponding filter chain applying thebusiness logic on the way and if therequest doesn't terminate at the proxyinstance for example due to the directresponse it is 10 load balance and proxyto an instance of an upstream or backend uh service user either H1 or JRPCuh Docker makes heavy use of ratelimiting for the incoming requesttherefore most of the traffic whicharrives to the proxy will be routed overJRPC to a global rate limit servicefirst prior to being sent upstream uh weutilize an implementation um uh uh opensource implementation of rate limiterservice called envoy proxy rate limiterwhich is a go service and a part ofenvoy ecosystem under the hood ratelimit component consist of the ratelimit service and the managed radiusinstances uh rate limit service isdynamically configured with XDS by thecontrol plane also called envoy gatewayspeaking of the control plane which isalso known as anway gateway you can seeit here on the left hand side of thediagram it is the foundational componentof the docker ingress ecosystem whichimplements kubernetes gateway APIprovision and manages infrastructurecomponents uh like data plane rate limitservice etc it is also responsible fordynamic configuration of the data planelayer and red limit service via deltaxds uh last but not least it consumesvalidates and transforms a usersubmitted kubernetes object like forexample http route back and trafficpolicy into the native envoyconfigurationuh while working with envoy gateway APIthe Ingress team made a decision tointroduce an internal abstraction layeron top of that API docker engineerswould rely on that custom internal APIto configure routing for the service ina decentralized manner uh internal APIacts at the wrapper chart for an gatewayAPI chart decision to introduce such aabstraction layer was due to multiplefactors first of all we offer a bunch ofsensible defaults for the backendse�rvice owners that covers most of theuse cases for Docker ecosystem and atthe same time we're making sure that ourincrease system is being used in asensible way for example no one canconfigure request amount for one hourwhich would make our proxy bufferrequests for the service for 1 hour inthe worst case exhausting proxy memoryuh we abstract the complexity of gatewayAPI configuration surface for theengineers so they can focus ondelivering features for the company uhwhen users want to configure routingthrough the our system we need to add uhthey just need to add our internalgateway routing API chart as adependency here you can see the exampleof such dependency on the right righthand side uh and then uh after addingsuch dependency it will download bunchof handy templates like for example HTTProuter backend traffic policy uh andthey can use it for their configurationuh here on the left hand side you cansee the example template for the backend traffic policy where we prepopulatebunch of default settings for the healthcheckin and timeouts house checkconfiguration in envoy is rich andcomplex so we abstract bunch of configoptions from our users on the right handside there is an example of HTTP routeconfiguration where we hide the detailsof attaching such route to the correctset of the gateways user just need tospecify if it is uh if they expose uhtheir traffic on external or internalingress and uh here on the slide you cansee uh the user how does the user flowlooks like when a user wants to expose aservice via new ingress it starts withdocker engineer defining HTTP routesusing our custom gateway routing APIchart uh the conflict links in theirservice repository user submits theconflict to Kubernetes cluster viadeployment pipeline within the clusterthere is an OPA agent running for thosewho don't know OPA stands for openpolicy agent uh the OPA agent validatesthe routing config against its admissionrules uh one of such examples is itchecks that domain and path is notalready in use if the configurationpasses the OPA validation it's canadmitted and deployed to the Kubernetescluster after that it gets picked up bythe envoy gateway which monitors u theobjects like for example HTTP routes andgets validated against gateway APIschema uh if validation is successfulthe new routing config is transformedinto raw XDS representation and sentover to the proxy layer uh proxy in turnwill also validate the supplied XDSconfiguration and consume it ifvalidation ispassing so to sum it up we have managedto s bunch of pain points with the newingress system we have reduced thetechnology fragmentation by mergingmultiple previously used proxytechnologies into a single envoy proxystack we have improved user experienceby providing Kubernetes native unifiedrouting API that users can interact within the centralized manner the overallsystem reliability and availability hasbeen greatly improved with LB switchcanary support and splitting control andenter plane layers uh we improvedrequest latency by eliminating multipleproxy hops in the request path uh weimproved system throughput four timesand reduce number of used cores by 50%which can uh even improved even more byfine-tuning uh we offer our customersmore detailed observability and richertracing we switch to technology ST thatsupports richer feature set and iseasier to extend in open for our futureneeds instead of relying on theenterprise features and now Ryan willtalk about how the u our migrationjourney in more detailto make our systems more dynamic for ourmigration the first thing we had to dowas swap out our layer 4 network loadbalancer or NLB with an application loadbalancer or ALB operating on layer 7 inorder to handle the migration from NLBto ALB we used Route 53 DNS weightshifting to move a small slice oftraffic from the NLB to the ALB andgradually increase the percentage as wemonitored the metrics one of thechallenges we saw during this migrationwas the ALB's different way of appendingX forwarded four headers to be specificit adds an additional internal IPaddress of the load balancer itself ontothe� into the XFFF chain this requiredsome modification on our applications toaccount forthis weighted route 53 routing was anamazing resource during our migrationbut still has the limitations of DNSspecifically DNS caching at multiplelayers can delay changes and createunpredictable traffic distribution whichmeans real-time failover is limitedbecause DNS updates take time topropagate this makes rapid reroutingdifficult and since DNS operates only atthe domain level we couldn't rely onthis mechanism for service specificpath-based routing but once we had ourALB serving our traffic we gained a lotmorecontrol in our previous architecture ouroriginal setup traffic comes from theclient and goes through an NLB operatingon layer 4 it routes straight to haroxythen the backend service this design wassimple and efficient but offered limitedflexibility for amigration with the AOB now serving allof our traffic we had powerful routingrules that could be applied to domainnames and paths which allowed us tomigrate individual services one at atime and unlike DNS weight shifting ALBallowed instantaneous and accuratetraffic shifting mechanisms this washuge if something went wrong in ourtests we could quickly roll back trafficto use haroxy again we were now able todo things like route 10% of service Atraffic to Envoy and leaving the other90% on Haroxy and then we would continueto do this with every single serviceuntil all of them were migratedthis allowed for a slow controlled rollout of Envoy with instantaneousrollbacks if required the only caveatbeing that ALB limits the number ofrules it can have to a hard limit of 200but this was okay forDocker so once we had all of ourservices transitioned over to Envoy wecould simplify our environment we leftharoxy in the background in case of anyemergencies but our standard routes nowgo through envoy and our ALB listenerdefaulted toenvoy the ALB still sits in front and wecan use it to send a small traffic toCanary envoy service if we're doing anew version or feature rollout andbecause it's an ALB we can specifyexactly how much traffic we want goingto that canary so it doesn't have todepend on a ratio of pod counts or otherimprecise methods this kind of finegrain control lets us push changes moreconfidently and keep an easy roll backpath open if any issues comeup so all of our application loadbalancer or ALB components are builtmanaging built and managed using the AWSload balancer controller this allows usto manage these components on the flywithout needing to use slow yet powerfulbuilding tools like Terraform to theleft is where we manage the trafficweight shifting to where the ALB willsend its trafficeach load balancer forwarding rule hasspecified weights designated for eachbackend which gives us the flexibilityto control our traffic at a highlevel and to the right is how we set thedomains paths and references for theunique forwarding rules that are builtin theannotations this allows us to migratesingle services one at a timeso to clarify what this configurationlooks like in action here's an exampleof how we would consume that Helm chartthe first service registry is a fullymigrated service in this example we'resending 90% of registry traffic to Envoywith 10% of our canary traffic 10% ofour traffic to Canary Envoy whereas forthe second example with accounts thisrepresents an ongoing migration servicewhere 90% of our traffic is still beingsent to our legacy HA proxy stack and10% to our new envoy stack as you cansee this level of flexibility and rollback controls control allows for smoothmigrationprocess anyway that's all for me sothank you for listening and I hope youenjoy your CubeCon uh next Katarina isgoing to be reflecting on our migrationand technology adoptionsuh building a more than reliable andperformance system for our customers wasstill an was and still an excitingjourney for us rich feature set offeredby Anvoy and Envoy Gateway projectcovered most of our needed use cases weare pleased to collaborate with friendlyand supportive envoy community whohelped us to navigate the tech landscapeto tr�oubleshoot issues to deliverfeatures and to review our PRs abilityto do canary deployments and fastrollbacks with LB gave us confidenceduring the migration and help us todetect problems early however there werealso certain challenges that slowed usdown there was definitely a learningcurve for adopting new complextechnologies getting familiar withKubernetes gateway API and anwayecosystem uh the uh envoy gatewaydocumentation sometimes uh lagged behindthe actual implementation so we had touse uh the source code to understandwhat's going on and at the time of themigration the multiple features were notsupported by gateway API so we had tofall back to the raw XDS anwayAPI uh as previously mentioned somefeatures at the moment of the migrationwere not yet implemented in an gatewayAPI like for example GI global ratelimiting and IP tagging uh that didn'tmean that we couldn't configure thosefeatures uh in Anva gateway API there istwo alternatives how you can uhconfigure features that are not yetsupported in the API uh one option isusers can extend the functionality byextending the uh control plane andadding JPC hooks to modify the generatedXDS configuration and anotheralternative is to use envoy uh patchpolicy mechanism which is unstable uhAPI that allows users to modify thegenerated envoy XDS configuration uhthat envoy gateway generates beforesending it to envoy proxy we selectedthe second option uh because we didn'twant to develop and maintain yet anothernew service uh plus envoy patch policyacted as an extension of Kubernetesnative API which was allows us forsmooth integration to our custompolicies into Kubernetes so on the righthand side here you can see uh theconfiguration for the entire filterchain how it looks like in the XDS patchpolicy format it's quite verbose andrequires underlying knowledge on how toconfigure each individual filter inenvoy format um and you may ask yourselfso why do you need to configure entirefilter chain via XDS patching policy ifsome features are already available in gAPI can't you do like a partial enablesome gateway API and some via patchpolicy so good question turn out thereis no definite merge strategy supportfor features defined in gateway API andXDS patching API uh when partiallyenabling features in Gway API we gotunpredictable merging results and it'salso not possible to specify in order inwhich such features will be merged forexample if you wanted rate limiterfilter to go first in the filter chainmaybe Aron knows it's possible now butat the moment it wasn't uh so here is anexample for you where we did oneexperiment and we tried to enable theconnection limit feature um this is akind of overload protection feature inenvoy where uh you can limit the numberof downstream connections which wouldtransform into anoway filter so we did atest and we patched a client trafficpolicy and enabled this uh connectionlimit feature you can see 1,00 uh valuehere so we uh patched the client trafficpolicy we submitted it to Kubernetescluster it got picked up so theexpectation was this feature gets mergedwith the previous filter chain that yousaw on the previous slide uh but when wechecked the resultant filter chain onenvoys um all the previous filters werenuked the ones that were the businesslogic and we all only had thisconnection limit and like terminalrouter filter so uh it didn't work forus but it worked for rate limitersurprisinglyum another challenge that we facedduring the production rollout was thelack of native canary support in angateway API both for data and controlplane uh we did experiment with Argo CDrollout but it didn't work well with angateway API argo rollouts is uh wasn't agood fit for us due to the uh currentarchitecture of the system so uh thebeauty of Argo is that uh one only needsto duplicate the service resource forCanary and underneath the rolloutcontroller would have applied the sameconfiguration to Canary first and thento stable uh in our case configurationand envoy service are dynamicallymanaged by envoy gateway it means weneed to duplicate bunch of configobjects for canary like for examplegateway gateway class envoy proxy andangor roll all rollouts will not be ableto automatically promote change tostable a human will need to updatestable config so it gets picked up byenvoy gateway uh and propagated to thestable service viaxds um so uh we ended up having a customcanary setup for data plane and nocanary at all for control plane we alsoexpressed our needs and uh created thefeature request upstream and we'rereally looking forward to collaborate onthat uh and on the right hand side youcan see that uh we basically had toduplicate all the resources with thecustom solution and we ended up havingfour gateways that we need to manage andeach uh gateway maps to a uh to a singleArgo CD application then the rolloutprocess is quite manual but at least wehave Canary nowso another challenge that we faced uhthat there is a no easy solution or atleast no documentation example on how toconfigure role based access to writeadmin endpoints in the envoy adminconsole for those who are familiar withthe admin console uh nowadays anyone whohas access to the cluster can invokewrite endpoints uh and for example drainall the traffic on ingress um pleasedon't try to do that uh it is also acommon use case that envoy operatorsneed to perform a fleetwide adminoperations like for example drainingtraffic on entire gateway during theincident or if we are wondering likeduring incident I want to check somespecific stats uh proxy wide and youdon't necessarily expose u export thestats to graphana because envy has a lotof stats so unfortunately there is nosupport for secure remote access toenvoy admin console so let's say youwanted to check uh envoy server statusduring the incident and you have like 10instances running so it would have beeneasy and amazing just to cor each envoyinstance by port IP and hit that serverstatus endpoint uh but as remote accessis uh not enabled or it's ratherdisabled one has to do a port forward inthe loop to access that info on eachenvoy proxy instance as all we all knowthat's a much slower approach since oneneeds to wait until port forward processgets open in the background then it willspill bunch of garbage output into theconsole and then it needs to be closeduntil you can proceed to the next uhiteration uh in the loop so uh quiteslow troubleshooting during theincident uh while working on themigration our mission as well was tocontribute back to the community we diddiscover multiple bugs on the way likefor example issuing draining behavior inenvoy gateway when it didn'tconsistently send connection close or goaway as part of the draining sequence wecontributed multiple features to envoygateway project like for example globalglobal rate limiting feature orconfigurable panic threshold we alsocontributed to envoy project there is awork in progress PR for dynamicfilebased IPAC supports in the IP tagand filter apart from that we haveopened bunch of feature requests some ofthem you have already seen like uhcanary support or being able to tweakglobal uh overload settings or GPsupport uh we are really looking forwardto collaborate on those featurerequests and uh what's next for us uh weare excited to start working on zoneaware routing we know that uh thefeature has been maybe already merged toAnva Gateway project uh and we want tocut the cross zone egress uh costs uh weare very interested to have canarycontrol plane support to be moreconfident in the rollout of controlplane upgrades uh the next big projectfor us uh is the deprecation ofengineext component that Ryan mentionedbefore and to move all that businesslogic into envoy and last but not leastwe are hyped and are planning to jointhe envoy gateway steering committee toincrease the involvement with theproject and to be able to impact theroadmap during our journey we got a lot ofhelp from the community we would like tosay thanks to Arco who is Anway gatewaymaintainer for inspiration support andhard work and to entire Anvway gatewaycommunity and below you can find usefullinks for the technologies mentioned inthistalk thank you[Applause]2025-04-15 21:58:36.745374 ��N#��SAEa5OuNjpi9Mhi everyone my name is Katina Onesik andI currently work as a SRA docker andalso a maintainer of envoy proxy projectuh very excited to attend CubeCon thisyear here in London and share ourmigration journey with you i'm heretoday virtually with Ryanhello everyone i'm Ryan and I'm aninfrastructure engineer at Docker andI'm also really excited to be here withyou at CubeCon virtually i wish I couldhave made it in person but I'm tuning infor my honeymoon today we're going toshare how Docker redesigned its ingresssystem and made the leap from our legacystandalone harroxy and engineext stackto a modern envoy gatewaysolution we're going to dive right inhere we have a simplified overview ofour legacy ingress stack the typicalDocker Hub request follows this patternthe client's request will go through theexternal AWS network load balancer whichthen arrives at our haroxy externalharoxy haroxy will determine here ifthere are any abuse rate limits that arebeing hit if not it's going to send therequest to engineX once it sets EngineX it's going tovalidate JWT tokens with Louisis scriptsit's going to check Reddus for sessionrevocation and it's going to forwardauthenticated requests to the internalnetwork load balancer and internalharoxy from here it's going to loadbalance the request to the expected backendnow this system has served us well overthe years uh but I'm going to becovering three core components that makethe legacy stack unique and create majorpain points at Docker i'll be r��erseven layer because we have a two layerarchitecture we decided to bring thesame familiar concept you have on thegateway to the mesh so essentially theSame envoy proxy that used to controlyour ingress traffic and also youregress traffic for traffic going outsideof your cluster will bring it to themesh and we call it waypoint proxy sothe layer 7 proxy of waypoint you canimplement it at your tenant scope thatyou feel comfortable whether it's pernamespace or whether it's per service orwhether it's multiple name space ormaybe for your entirecluster so it can do a lot of advancedlayer 7 functions such as trafficshifting traffic resiliency uh reachauthorization policy or uh layer 7observabilities uh this is actually oneof the blog we just published recentlyand about TCP throughput by John how inthe community you can see uh ambient hasthe highest throughput when we use Iperform Iperf to run the performancetest and what's really interesting aboutthis diagram which I often point out ifyou saw my post on LinkedIn is why onlyum Istio and linkodhas the diagram on the same node theother projects didn't have the diagramon the same node which is represented inred the reason is when your source anddestination traffic is on the same nodeonly link and is regardless whether incycle or ambient mode can secure thattraffic using mutual TRS usingtransaction uh encryption intransit all right now I'm going to passinto uh Raymond to talk about thepresence of with ambient how theadoption goes uh within Forbes thank youthank you Lynokay i'm Raymond i'm a senior architectat Forbes you may have heard of us we'rea thought leader in business journalismbruno Mars had a song about being on ourcover uh sowhy you get a sharedingress we don't want to have a loadbalancer for every single applicationthat just adds you know more money morecost um two we have automatedcertificates managed by search managersto can hook into those secrets whowants to manage uh rotating your C every30 days i don't uh one of the mainreasons we we adopted STO was Canarywe wanted weighted traffic to route tonew deployments so that QI could testnew features and you know if there'sanything that's breaking our detectioncan you know hopefully detect it earlyand then we could roll back very nicefeature uh out of the box you get MTLSyou know encryption talking to serviceswho doesn't want that observability youget Kali pre Prometheus metrics thathelps you know troubleshooting anyissues that you might encounterand uh at one point we had multiclusterservice mesh uh we were active active uheast and west regions in North Americauh we did roll that back at one point tosave moneyobviously uh but I'll get into that alittle bit morelater uh so a little background on whatwe're using at Forbes uh we've beenrunning STEO for since I've been theresince 5 years ago right before thepandemic started uh we're running on GKEuh we started at 1.4 when I joined andwe have seven clusters using STO threeprod three nonprod one test 500 plusservices not not toobad uh so why ambient um one Kubernetesgateway API it's the next gen of youringress your load balancing it's thefuture it's going to be the new standardfor everything uh two as Lynn mentionedno more sidecar uh one of the issues youmight encounter with psychicars is ifyou have a mutating web hook injectionuh if you have a block on that if youhave multiple uh web hooks sometimes youmight encounter your pod's not evenstarting up because it's being blockedthat's not that's not funtoo um you know if you have somethinglike a pipeline like Argo workflows forinstance uh I know the older executiveshad issues uh running with a sidecar soyou know you don't have this overheadanymore and then three the one that Ilike the most is no more restarting yourapplications after a a upgrade you knowwho wants to wait for all your your podsto rotate and scale out scale in that'snot funso how did we do this migration atForbes first you know we had to switchto Kubernetes gatewayAPI uh two then you have to upgrade andinstall toambient three we set up ourwaypoints four update your �namespacelabels five restart your apps sixvalidate clean up and optionallykill so that's it you're done no no moreno more operationaloverhead uh let's get into some moredetails all right so the first thing wehad to do was switch over to usingKubernetes gateway API so uh on thebottom you see a little nice customizethere you could uh just yink that veryvery simple um and then two you have toupgrade your STO so I think we were oneof the earliest adopters so we we wentfrom like 120 when it was still likebeta and then now 124 GA um three youknow we had to separate out uh where westored our STO uh objects uh we we hadeverything in STO system which you knowit works it works but uh you know wewanted to be a little bit cleaner alittle more best practicesum so that we created new name spacesfor our gateways and consequently wealso had to copy all our certificatesout of ISTO system into the new gatewayname space and then five create all youryour gateways uh on the right hand sideyou'll see an example of part of one ofour gateways uh you know a littlecensored but first you see a listenerfor uh redirects and then two you seelike a wild card foo and then a foo andyou have some references to somesecrets uh so once you move all your SCobjects out of the um you know SDSsystem then you want to also move it tothe application name space so we createdHTTP routes in all the application namespaces uh then we updated our DNS toflip from the uh old STO uh gateways tothe newer Kubernetes gateway API uhgateways that's that's a mouthful that'svery confusing it's a tongue twister iknowuh then you know we clean up we removeall our virtual services and you knowclean up our old gateways our oldcertificates uh so you know who doesn'tlike YAML this is a HTTP route on theleft uh this is pretty common so ifanybody has you know something thatthey're routing to on port 80 and youwant to go to HTTPS automatically thisis how that would look like uh I thinkit's very common and we had that on allof our gateways um as part of thismigration we also took the opportunityto migrate to using uh external DNS tomanage all of our DNS for the gatewaysbefore what we were doing is we had astep in our pipeline that would use umlike a G-Cloud command to update all theDNS entries so this is a lot uh lot moreautomateduh on the left you have a uh HTTP routefor the gateway using uh FQDN and on theright if you're into using uh servicemesh you got to have your service meshuh gamma http route u very similar justdifferences in the in the parentref so at this point you know you're ongateway API uh you want to start usingambience you have to install ambient umbefore we were using cuddle we actuallyhad auh anible playbook which would uh docuddle and and all this other jazz forus uh there's a nice blog by John Howardhe he kind of details all the ways touse SEO installs um long story short useHelm uhuh so there there is a nice little uhambient wrapper chart that that is intoum so we're leveraging that and then wewe have our values here on the bottom umthey are kind of GKE platform specificsince we're running on GKE uh so youmight see like a little differencesthere but if you're running on GKE thisis what you got to doand then also um the STS CNI we uhdidn't install it into the default cubesystem we put it into system so you doneed to add an additional uh resourcecord which you see on the right handside so uh if you want to use ambient uhyou're going to want to use you knowlayer 7 uh rules uh so you you get thatthrough what's called a waypoint and uhinitially I had it all namespace scopedso every every namespace had a waypointum and this is you know even before weturned on ambient you know just kind ofsetting everything up and then I kind ofdecided why have a waypoint for everyevery namespace why don't we just do itkind of like how we do our othergateways let's make a common gateway acommonwaypoint so this is how you do thatright here so we stuck it in where elseisingress okay so this is where the magichappens uh in order to move to ambientyou need to add labels uh it's actuallyvery si�mple um so we're usingYTT so uh we we have some templating inplace and we have some logic over herethat we added so that we can um kind ofdo it in a phased approach so you'll seethe conditional logic on the left handside with YT but at the end uh it'llactually look like this with three uhlabels on the right hand side it's aseasy as that guys um that's all youreally need to douh there's another uh neat label uh thatI that I mentioned down over here and Iwas on holiday um so I I actually caughtup on some mail and I actually saw thatmy team had actually implemented thiswhile I'm out so I think it all workedout i haven't seen any alerts or anyfires so uh this is to use uh thewaypoint with youringress okay so then after you do thatyour name your name spaces are allupdated all you have to do restart yourapps one lasttime your your apps already had thesitecar so you just need to remove thesidecar and switch over to usingambience and then last you have a couplecommands over here to uh do yourvalidation make sure everything isworking it's all outlined in the uhISTIA docs i think uh Lynn this morningyou you mentioned now you don't evenhave to do all this apparently there's anew CLI that you mentioned I think to todo the whole migration for you oh yeahthere there there is a migration toolthat's available for free that's uh youstill need to do migration but it'sgoing to give you guided step and alsouh check out your cluster to see ifthere's any incompatibility when you'removing from psychar to ambientso I learned about that this morning uhthat would have been very usefulactually i I'm doing it the hardway okay and then optionally uh Kaliwhich is a nice uh monitoring tool youcan see all the traffic um so we useArgo CD so this that's just a subchartto point to that KIoperator um so there were some gotchasuh that I encountered when I was doingthis whole migration uh as I mentionedwe are on GKE uh so later down the linewe did want to start using some of theGKE uh gateway features um one thing tokeep in mind is uh we are using Argo CDand you'll encounter a uh sync loop ifyou're using Argo CD I'm sure you'veyou've encountered this before and theproblem is GKE and your um gateway APICRDs that you had installed manually uhthey're going to come into conflict sothat's just something to be cognizant ofif you do turn on that feature uh so weare using you know the latest uh gatewayAPI CRDs uh I think Google is a littlebit behind so you just kind of becognizant of that uh we're also usingthe experimental features um at onepoint gamma which is service mesh uhambient that was uh beta beta now it's Gum but TLS routes we're also using thatthat is still um think alpha or betait's not GAuh gateway API so I did show you beforesubm about uh a gateway you're limitedto 64 listeners um ask me how I foundthat out uh so just something to becognizant of you might need to add moreuh listeners if you need to to umsupport more endpointsuh external DNS uh it's possible thatyou could wipe out your entire DNS ifyou turn on registry no up so that wasnot a fun day butuh we got over it and then also tomanage uh some some other zones you doneed to create a couple um external DNSzones with the prefix umadded so um here are some issues that Ihad in uh which are pretty much resolvednow that we're were GI so just somethings to be maybe cognizant ofuh you know prior if you're you're doingyour upgrade path um I had mentionedthat we used multicluster service meshum so I know as of now that you knowthat's not supported with ambient butagain this morning I found out thatthat's coming to to be supported so justmake sure that uh your STO is going tobe uh matching whatever features youwant to have uh in your ambient uh twowhen you're moving your STO objectsaround you might need to you know do thefully qualified name kind of like belowuh just something to be cognizant of umin one of the the older versions of uhAmbience we did encounter some uh stuckpods uh that's been resolved but uh youhad to restart the uh CNI for that andthen um we do have a vault injectionprocess so I did me�ntion that there'syou know multiple web hooks that we wereusing um we we did have issuesconnecting to our vault at one point umand then that got resolved when we uhkind of recreated our whole Kubernetescluster started using data plane v2which is the fork of selium so uheverything has been resolved we're allin production now so we're all veryhappyuh so in conclusion use the lateststandard gateway API ditch your sidecars with ambient you're going to havesmooth operations you can uh reduce youroverhead you're not going to have somany resources so you can focus onfine-tuning your applications your nodesand my favorite you never have torestart your apps againuh I I certainly want to thank Raymondfor his persistence you know being anearly adopter uh dealing with some ofthese issues that he outlined as we uhyou know work through those in thecommunity over the the releases and youknow getting to G and even some of thethe bug fixes were post obviously likeno system is ever perfect um so yeah yougot to experience a kind of day in thelife of a real user of ambient orseveral months in Raymond's case um soI'm going to talk a little bit aboutwhat's in the future or some of thethings that are you know happening inISTO today um you know maybe if we havetime for Q&A at the end we can we cantake some questions about that orobviously if you want to ask Raymondabout his experience we'll do that um soyou guys have probably heard about thisAI thingum obviously lots of different ways thatpeople are using uh but in particulardeploying AI in production inenterprises today[Music]um and people are deploying inference atscale in enterprises right they're usingoff-the-shelf things uh llamas etc allall the different flavors of AI and thenthere are okay how am I going to servethat to applications in my organizationor to external clients right so you needa serving stackuh one thing you know I would say aboutAIS in general is they are the moststateful sticky process I have everexperienced Right like you have a chatand that chat represents an inordinateamount of state stored in one GPU onsome machine in your cluster and if youwant to send a request for that chat itbetter go to thatplace uh so part of the gateway APIeffort and I think Clayton talked aboutthis on stage this morning in thekeynote there's this gateway APIinference extension oh it's tomorrow allright I'm not going to steal all histhunder then uh the API spec is alreadypublished he's going to talk about someannouncements related to it so there's aspecification in the Kubernetes gatewayAPI that describes how you could build arouting layer for inference runninginside a data center uh and uh Googlehas contributed an implementation ofthat into ISTO so it's right now sittingin a branch uh we have some issues towork through with it because that codewent in last week um but ISTTO willsupport that specification out of thebox so for those of you who care aboutrunning inference in production and needto serve it to a variety of things uhthis solution is for you i won't go intotoo much detail about how it works butessentially there's a a a process thatyou get to write and there are examplesof this process that the the gatewayswill call out to and say "Okay I havethis conversation this session whichmachine should I route this to?" Rightso you know in nerdy networking parlancea look aside load balancer uh and we usethe xrock extension in envoy toimplement that uh so that's how that isimplemented and many users today usethings like xrock or x.z to extend thecapabilities of the system so it's avery natural extension forus so you know this is a pretty big usecase right there there's a lot of peoplewho care a lot about this in enterpriseand we need to make sure that they arecovered uh and we're also going to dosome work to make XR proc itself moregenerally useful we'll probably add somegeneral purpose API abstractions overthat uh we did that for X to off manymany releases in ISTO ago we have theenvoy filter terror vault of an API uhdon't use that we'll give you a betterAPI uh and so this will become kind of�the standard set of extensions that wehave inSTO you guys probably know what this isum so the team at Microsoft justrecently contributed to a branch animplementation of ambient mesh thatworks on ISTTO uh sorry on Windows uhum you know obviously Windows is alittle different than Linux the waytraffic capture works is a littledifferent there's a lot in commonthere's this kind of network layerabstraction that you can use to docapture and so they've shown demos andyou should go track down the Microsoftfolks keith and Mitch are probablyaround somewhere maybe they're in theroom if they are they can wave theirhands if not you can find you canprobably find them at the STO boothactually in the in the show floor um ofSTO ambient mesh working natively onWindows right capturing stuff out ofthat so like obviously like there's along list of stuff people use inproduction today and our we consider ourjob to be you know Kubernetes first butdefinitely not Kubernetes only rightlike works on VMs it works on Windows itworks withum let's see cube for instance uh wehave made work in quite a wide varietyof platforms and you can go see some ofthe discussions about that in the slackchannels um but this one you know uhMicrosoft who's going to drive and makethat that goes to production in theproject so for those of you who runWindows in production even on VMs thisshould be quiteuseful soum obviously the way points are a kindof new thing and whenever we introduce akind of new architecture like ambientpeople want more control over it theywant defaults they want to control howbig the waypoints are uh they want tocontrol autoscaling of the waypoints youwant to score sharing like Raymond hestarted out with the our baselinerecommendation of one per namesp spacehe's like no I I want to use one perclusteruh you want to control placement of themwhere do they run both affinity andanti-affffinity do I want them closer tothe destinations do I want them on aparticular type of hardware what is thedozonal topology of thewaypoint you want rollout control likeso when waypoints are being updated youknow how and at whatthey want to bundle extensions with themlet's say you use WOM or Lua and youwant the waypoint to load that and runthat as a capability well maybe you wantto bundle it into theimage and you want recommendations forit right and we have a bunch of thingsin Cuttle analyze for it but we've beentalking to a lot of people over thecourse of this week hearing a lot ofreally good feedback so you're going tostart to see that come back into theproject uh and there's a couple of PRsthat are already in flight and so someof this will go into the 126 release andyou know just give us your feedbackabout what you want to see the waypointsdo but it because waypoints offer a lotof flexibility because we put them inthe network we can give you a lot ofcontrol over where and how and what theydo and that's going to be a pretty bigdeal yeah and there's this list is notthe list of everything I've heard thisweek umegress so you know people are put a lotof time and effort into ingress in STEOobviously put a lot of time into eastwest traffic with waypoints we've alsoput quite a bit of time and thought andenergy into egress uh and radicallyimproved the UX of how you configureegress to a service in thenetwork the the waypoint proxy and theuse of service entry uh John againpublished a nice blog about this um Iexpect to see quite a lot of evolutionbecause we've made egress so much easiertoconfigure we will see more users usingEnvoy as an egress proxy as opposed tokind of just going directly out to thepublic internet because people wantcontrol over that traffic right whetheryou know you're sending traffic to athird party sassified LLM and you wantto control how much money you'respending on that or you just want toknow who's allowed or control who'sallowed to call GitHub or S3 or someservice that's external to your networkso the same types of controls that youwant on ingress you're going to wantthose onegress and then just kind of in generalright the waypoint model offers a lotof scope for innovation right becausewe've broken the system apart and putthese things in the network we can do alot of things we can swap them out wecan have different implementations ofthem um the K gateway project which wasannounced and donated this morning inthe keynote stage is an example of analternate waypoint implementation so itcan be you can use the S2 waypoint oryou can use the K gateway waypoint it'sjust one example of that pattern workingout in theecosystem and so I actually expect tosee a fair amount of independentinnovation in this space other projectsoffering that capability as anenhancement to a standard STTO user andbecause the integration point is justthe gateway API you say gateway pointbut I want this kind of one right thesystem is naturally composable right andthat composability right that weachieved by layering the system willspeed up the rate of innovation in theecosystem and I think that's just areally powerful thing that's going toplay out over the coming years so that'sthe the next next and Ithink that is us yeah I think we are atthe end of our hour i don't know if wehave time to do one more thing i wouldlove to um capture the audienceum analyze the mood if I can all rightlet's do this um so I have my demoapplication running in is still ambientmesh and I have a local large languagemodel so if you can show me uh some ofthe gesture you have did I block it withmy hands all right let's seeokay I need to enable my phonecontinuity on mycamera what is this all right let merefresh service mesh on the iPhonenobody told me we're launching that Lynni know this is crazy okay so maybe notlou um do we have time for a fewquestions like two questions onequestion okay one question let me bringthis up certificuh well we have time for one questionbut I think they asked somebody to comeup to the mic so while Lynn is debuggingif somebody wants Yeah come to the micand show me your energy if you have anyenergyallright all right there you go that was aquickdebugging all right this is the demorunning ambient analyze the moodpositive andinteractive excellent and this is theobservability provided by ISTO by theway and you can see the traffic goingthrough is ingress gateway coming to thedemoapplication and because we don't haveenough metrics so I'm going to show youa couple of hours ago so the trafficactually goes through the waypoint andthen goes to the large language model uhthat's external to my Kubernetes clusteri had a lot of failure because I wasgiving a talk earlier i had to debug inwhat's wrong with my demo but this isthe power of layer 7 observabilityprovided by ambient without us needingto run any of the sidecast all right goahead ask your questions i would like toknow more about like uh is your gatewayabout the future of it cuz right now wehave K gateway right which is kind oflike looks like a replacement forgateway so like there are many gatewayprojects in the ecosystem yeah what whatwhat's the recommendationuh so as if you are happy with STOgateway and it meets your needs usegateway it is not going away thegreenfield projects then uhthatI look at your requirements and see ifthey are met by one of thoseprojects um like like most a lot ofpeople need fairly simple things andwill almost certainly cover those if youneed some more advanced things then Kgateway or another gatewayimplementation is probably a quitereasonable choice obviously K gateway istested and is validated to work with STOif you're using S2 as a mesh you know Kgateway is a pretty good choice rightum you know obviously the cloud vendorsthemselves provide gatewayimplementations right so there's someplatform decision to be made right i Idon't like speaking as like antomaintainer like I'm not going to tellyou that you shouldn't use rightyeah well I'll be using as a east westgood then uh then you know make yourgateway choice after that yeah if you'rehappy with Easter I would say stick withit all right thank you so much if youlove the talk I would love you all tofill out the survey thank you so muchfor having us thank you everyone2025-04-15 21:58:37.387778 ]]��#��;ApoBOYc_EkpAhello good afternoon everybody we arehere talking about the ISTO projectagain and we're going to take youthrough the history of Isto and also thepresent of ISTO and I thought who wouldbe the best person to talk about thepresent and one of our users in thecommunity so let me quickly introducemyself lensa I'm the head of open sourceat solo i'm also a CNCF ambassador andone of the CNCF TOC member now I'm goingto pass over to Raymond who is actuallygiving his first ever conference talkhello thank youuh I'm Raven i'm a senior architect atForbes andthen you get it over with as quickly aspossible right uh I'm Louis Ryan i'm theCTO at Solo i work with Lynn uh I'veworked with Lynn for a long time i'veobviously been involved in the STOproject since the very beginning anddelighted to be here talking to you guysagain for the eighth year in a row solet's get this ball rolling awesome sopast history me and present present andfuture of Isto so we invited the founderof ISO to talk about that all right uhlet's get started so when we startedIsto eight years ago correct me if Isaid anything wrong Louie uh we set thegoal we want to project to betransparent to your application workloadand I honestly looking back I think wefailed that goal for a very very longtime because when we first started withISTOum we we ask you to run the sidecaralong with your application containerwithin that same application pod rightso whenever you enroll yourself yourapplication into the mesh you have torestart whenever there's onboy CVE thatcomes with the proxy see you have torestart your application so honestlylooking back I really think we failedthat goal eight years and for the pastuh few years as wellum so as we evolve on the ISTO projectuh I think Louie uh the Google team uhalong with the rest of the communityplays a big role in drive is to CNCF soone of the biggest thing with ISTO wasit was holding by Google and Google ownsthe copyright of the ISTO project whichprevents some of our adopters adoptingISTO some of our contributorscontributing to ISTL so we were veryvery excited that is joined CNCF uh in2022 and what's also exciting in thesame year is we started challengingourself how can we be more transparentto the application workload this is whenwe introduced STO in ambient mode justquickly introduction of ambient by theway it reaches GA in Salt Lake City lastyear a quick introduction of ambient isuh wepurposefully have a two-layerarchitecture of ambient of the layerfour layer it's implemented by zerotrust tunnel it's a purposefully builtum proxy that handles layer 4 traffic itit does uh mutual TLS simpleauthorization policy enforcement foryour application and it serves all theapplication pods on that node um so thismeans regardless whether your pods isrunning within the same node or if yoursource or target is running on thedifferent nodes uh it can serve the samemutual TLS for you regardless if it'sthe same or different nodes on the lay��orts like for example cube scheduleulerit determines when to start stopped orpreempt the workload it supports all ornothing also called as gang schedulingsemantics so when deciding when to startand stop it assumes that the whole thingshould start not only a fraction of itit is very critical for machine learningerh stuff because these things requireall parts and all pieces of computationsto be present in order to makeprogress and Q decides where in thecluster should it be running should itbe running on that specific machinesmaybe with this accelerators maybe onspot maybe on demons maybe on yourreservation maybe in a specific rack ormaybe in specific nodes all of thesethings could be decided byQ q is also a resource qua manager so itallows you to provide quotas to multipletenants and uh define quotas on variouslevels on various types of hardware oralso other type of resources that youmay want to utilize in your clusterit has built-in integration for variousAPIs for classical things from trainingand batch word like kubernetes jobs jobset cubeflow portfolio cubray as well asforinference/serving deployment statefulset leader worker set and other thingsthat could be used universally like aerplane pods and pod groupsit is u hardware vendor and cloudprovider neutral it means that it workswith GPUs TPUs or whatever customhardware you may have on any cloud thatyou may be running and also onprem so what is going in Q q has beenaround for quite a while it be probablylike three years or so and uh duringthat time we have had about 50 releasesthe uh most recent one happened a weekago so this is a very very activeproject with new good stuff constantlycoming in during this presentation Iwill bring two things to your attentionthat's came recently these are topologyaware scheduling P sharing andhierarchical quarters and resourcecontrols let's start with topology awarescheduling so imagine that you have adata center and in this data center youhave couple computers connected to oneswitch and couple computers connected tothe other switch and there is aconnection between uh these switchesif you uh put your uh workloads kind ofrandomly you may get quite decentutilization here on the picture you seethat all computers are pretty wellutilized each of them is running fourpods uh from different applicationutilization is100% and well this seems ideal is itwell not really so here on this picturewe have uh three workloads uh pink greenand blue one as you can see pink islocated more on the right but it alsohas some pots on the left side the sameapplies to green workload and the blueworkload if this workloads are doingjust computationuh the situation is okay but if potsfrom each of the workloads require totalk to each other then this linkbetween the switches becomes abottleneck uh purple pots from the rightwant to talk to purple pots from theleft green from the right to the leftand so on and so on it becomes congestedand none of this workload is performingwell because all of them uh are fightingfor this red cable in the middle thecables on fire the workloads are notperforming uh GPUs that are superexpensive are effectively wastedeveryone is unhappy and that's obviouslya badsituation uh with Q you can arrangethese things a little bit better forexample put all of the pink pots on theuh well right hand sideuh and blue and green on the other sideand then they are doing only internalcommunication which doesn't put too muchload on this middleCable in in the picture and be linkingthe switches and that brings a lot ofpeace of mind to both systemadministrators cluster owners and MLresearchers because their uh workloadsare progressing well and GPUs are wellutilized the other thing that we have inQ is fair sharing so imagine that wehave a large data center and three teamsof researchers the blue one the red oneand the yellow one uh based on theirsize their priorities the importance ofthe project uh some director divided thecluster into fragments and assigned aquart or capacity to each of the teamsthe blue team got three racks the redteamgot seven racks and the y�ellow one uhthe remaining five racks of GPUintensive GPU heavycomputers uh all of them are using it uhuh to the large extent but sometimesholiday occurs sometimes offsitesometimes people have summits andsometimes not all the capacity isutilized to its fullness so what happensthen obviously we don't want to wastethe resources that were previouslyassigned to the red team while they areplaying volleyball at Hawaiiuh we don't also want these resources tobe given to the to one of the teamsbased on their time zone because theyarrived to the work earlier or they havesome other uh way of getting thisresources between uh the yellow teamideally we would like to have amechanism to split uh the unusedresources in a kind of fair way so thatneither blue team nor yellow teamcomplains and the capacity of thecluster is well utilized and money wellspent and of course when the red team isback from their offsite holidayswhatever uh they are fully entitled toreclaim theircapacity and use what's theirs okay soin Q we have two flavors ofpreeemptions one is preeemption based soin imagine that if blue team is usingtoo much of the sher resources than theyellow team and yellow team wants to runa smaller workloads and there's no spaceleft then blue team uh uh should havesome of their workloadspreempted let's look what does it looklike on example so here we have someshared capacity let's assume that itconsists of 100 GPUs and blue team hasthree workloads 50 30 and 15 GPUs eachand yellow team wants to run arelatively small workloads whichconsists of which requires 20 GPUsobviously there is no space but uh blueteam is kind of abusing its uh fairshare it's using more than it shouldso if we use this for of preeemptionsthis form of per sharing we preempt oneof the workloads from the blue team inorder to make spacefor workload from the yellow team nowthe division and first sharing of theand sharing of the anus capacity is alittle bit more fairwhat happens if yellow team wantsto submit yet another workload this timethat requires 35 GPUs well blue team isstill using more GPUs than the yellowbut if we preempted some workloads onthe blue of the blue team then thesituation would reverse and yellow teamwould get much more uh than the blue oneso we don't do it we keep this otherworkloadswaiting so that was one form of firstsharing that we have the other form offirst sharing that is uh coming reallyreally soon to Q it's called coded andwill come with uh the next release isadmission based uh first sharingpreviously we were preemptinguh workloads of the other team if theywere using more of the shared capacitythan they should this time we do itdifferently we wait so if blue team usesignificantly more resources than thered team workloads from the uh red oryellow team should be given preferencefor the upcoming admission so here likethe workload iscoming and it's waiting and it's waitinguntil something from the blue team iscompleted and then it's submitted ifsomething more from the blue team is uhsubmitted it waits because the blue teamuse more of theircapacity and thenuh we admit something from yellow teambecause yellow team use more now noticethat I have a mistake on the slidesthere obviously should be red yellowteam in the in the text sorry for thatokay one last thing that I would like touh discuss is hierarchicalresources so with this uh quotathing[Music]organizations want to kind of dividequota in the way they are structurallyorganized so uh director gets some quotathen this qu is distributed among itsmanagers and then managers give thisquot to individual uh teams uh this issomething that you can configure for Qcan have hierarchical quarter that uhfollows your orc chart and also all ofthe first sharing that I previouslyuh mentioned also follow the orc chartand uh company politics manager probablywants to distribute unused capacity uhfrom one team among their other teamsthe same applies directors and so onuh they want to distribute an capacityamong their people and then if the needsof their people are satisfied the anuscapacity goes to the rest of thecompany ok�ay so these were three thingsuh that we have launched or we arelaunching right now what is coming uhsoon uh basically we are doing much moretopology aware scheduling in topologyour scheduling we want to include morepieces of Kubernetes uh scheduleulercode in Q so that the admissions andpositioning of the uh and placement ofthe pots is more precise we would liketo give more controls uh for scoringtopologies and influencing the decisionsuh of Q in terms of which rack in whichworkloads are placedtogether and handle recovery anderrors betteruh we want to expand admission basedfirst sharing to all hierarchy levelsand uh do some significant improvementsto multiq multiq was uhlaunched more than half a year ago isQ's attempt to handle multicluster jobdistribution it is getting someadoptions and we received a lot offeedback from the customers and usersand uh based on this feedback we want toimprove controls uh over work over whichworkload goes to what cluster andsupport bods and nice things forresearchers like getting logs executingthings in the containers and soon okay uh the other project that uhbatchworking group is working is QCTLwhich is basically a tool to performday-to-day operations on Q like creatingcues draining cues stopping and startingcues listing workloads along with theirposition in the queue posing workloadsand all of the stuff that you mayimagine that you may encounter in yourdaily work as a batch clusteradministrator it comes uh as a cubectlplugin it's managed by crew it's alreadylaunched been around for a while it'srelatively stable so if you're using Qyou probably should take alook okay and then we go to APIs job jobuh has been in Q in Kubernetes for veryvery long time probably from 1.0 or 1.1release and it's really really matureKubernetes API most features are alreadyin however some of the features arestillcoming in particular we are graduatingcouple things like manage by field wentto beta in Kubernetes 132 pot failurepolicy went to G in 31 success policywill go to GA in 33 and back of limitper index uh will go to GA also in 133jobs setuh you may think about job set as job APon steroid it's an API for managinggroups of job as a unit it offersunified API for deploying HPC anduh AI ML workloads uh it comes withextended policies around startingstoppinguh fa discovering failures discoveringsuccess restarts and all of the thingsthat you may want to configure for yourjobs uh it allows you uh to run uhmultiple jobs connected to each other asa single unit and gives you like a niceservice to govern them alluh the API is uh also relatively uhstable and mature and is getting adoptedby both by both users and frameworks uhfor example cubeflow trainer v2 is builtaround job set API and metaflow is usingjob set foruh their distributed training so againuh users are uh individual users areusing it companies are using itframeworks are uh happy with it soprobably if you are not using job setyou may want to take a look atit uh last thing that I would like totalk about today is K job and uh K jobuh problem statement is around this sohow to start uh lots ofone one time jobs that have complexstorage setup so it consumes inputproduce output has some logs binaries uhlocated in various places maybe on NFSmaybe on S3 maybe on GCS all thingsaround so they have like really reallycomplex thing configuration aroundvolumes uh and these jobs differ uh verylittle from each other they maybe have alittle bit different arguments for forthe scripts maybe the different imagemaybe slightly different input p filesbut overall these uh jobs look the samethey basically like coming from uhresearchers and they are the dailywork and these researchers would like tostart them via command line tool theyfor example come from slur mode and theyare really really used to playing withcommon lines instead of editingyamland all of this uh jobs are not startedby single individual butare constantly used by entire team so Kjob is our way to address it it's a setof reusable job templates templatesstored in API server with a command lineinterfaces that transform thes�etemplates into full yamos and send backto API server forexecution okay because they are storedon API server they can be reused by teammembers they have some fields in thetemplate already populated and some ofthem some of the fields areintentionally left blank uh for theresearchers to fill uh each of thistemplate has informationuh which parameters need to be providedso that it is uh complete and can beexecutedand this templates covered job job setcubeflow jobs and uh standalone pots sothe float with this uh tool looks likethis the admin creates the templatesuh setups all of the storage volumes uhmounting points and so on so that uhwhen researchers come they just need toselect the template from the listuh provide additional flag like thescript name or maybe parameters or modelname or whatever and then uh thistemplate is compiled into fully bakedcomplete job and submitted forexecution i mentioned slurm a moment agoso one of the reasons why we did this isto makeuh users coming from slurm a little biteasier so in K job there is a specialmode withuh which tries to provide slm likeexperience it is script ccentric it uhprovides users with familiar commandline options that mimic one to one thosethat you can pass to uh sr run maybe ithas same uh environment variables builtin slm scripts when executed on slurmcluster get uh quite long list of uhmetadata information like the index uhof uh the task uh and other things asenvironment variables so we mimic themand we offer a similar approach tosearch storage binaries input andoutputs and this thing is aimed onmaking migrations between slurm andkubernetes a little bit easierokay so as you might have noticed we aretrying to provide a kind of batch uhecosystem with Kubernetes as uh as afoundation then we have Q which managejob execution quota control uhscheduling and all of the niceuh things related to execution and ontop of that we have a set of APIs and uhcommand line tools uh command line toolfor playing with Q and APIs for runningjobs and on top of that K jobs to makerunning these APIs a little biteasieruh this is just a suggestion you arefree to extend it use whatevercomponents of this proposed ecosystem asyou like fit and if you have strongopinions uh what is missing what shouldbe done what should be done completelydifferently where we are wrong pleaseplease come uh to batch working grouplet us know what you think what's youropinion what are your needs and we wouldbe very very happy to listen to yourfeedback okay uh I guess I have coupleminutes for questions the mic is here inthe middle of the room please come[Applause]okay thanks for the presentation uhquestion for the fair scheduling is itthat the quotota is just the currentquotota of or the current distributionof jobs not historic CPU hours that areconsumed so if like someone like theblue team runs jobs for a whole day andthen the red team comes in and wants torun a job but at the current state theproportion of of distribution is is isokay then like taking vcpu hours intoaccount and not just the vcpu used atthe current state uh so you're talkingabout first sharing yes so we have twoflavors of freshing one withpreeemptions which doesn't look at thehistorical dataand the other one which submission basedsharing and it stores like a historicalusage basically likedecaying aggregated usage over someperiod of time so we accumulate how muchof the shared capacity a team used andbased on the shared of the shared usagewith some waiting applied if you providethe weights we sort the workloads andadmit them according to this workloadsso team that use the shirt capacity theleast with the weight combined uh gotprecedence over teams that used uh morein the past of course if the uh only oneteam has workloads we don't stop themfrom sending them but as soon as someother team comes in and they haven'tbeen using the search capacity for along they get like the first uh spot onthe on the line but you don't preempt onbased on the history in that mode wedon't preempt it if you wantpreeemptions then uh we have a mode inwhich preemptions are applied okay coolthankshello I have some question uh on theecosystem diagram uh there is a field ofother CRDs uh I wonder how does forexample ago workflow uh fit in uh intoother CRD so aro workload as far as Iremember is integrated with Q or yeahmiho have we mergedOkay so uh we have a way of uh usingArgo workloads with with Q there i thinkthere is some documentation on site itwas merged like uh really reallyrecently right so argo workflow is ontop of the yes it's probably yellow oneother CRDs that can use Q to cue itsindividual steps mhm i see uh thank youuh and my second question is uh for theHPC type of workflow uh for example wehave simulation work that we want to douh each work does not take very long perse like maybe two to three minutesgenerate a geometry uh and they havethey usually use uh like CAD uh CAEthese like a windowsbased software umand I wonder uh for a use case whereuser would submit like hundreds ofthousands of uh very short running thesesimulationwork how how can one leverage thisecosystem to do so Q can handle quite abig traffic uh uh we run tests withworkloads coming at a really really bigspeed and the limitation factor is not Qitself but uh etc and whether yourcluster and API server can handle theamount of updates happening to pods jobsand so on uh what do you mean by theupdates okay so when you start a joblet's assume that it has uh 10 pots okayeach of the pot needs to be created oncethe job is started and once you create apot it's an API server right then youneed to schedule uh this each of these10 pots on some nodes another set ofrights when uh uh these pots land onnodes again they need their status tothey need their they need to have theirstatus updatedagain another flow of rights to APIserver and so on and so on so thelimiting factor for uh large flow of jobis the performance of your control planeso whether it can handle this amount ofchurn in the cluster the amount of potsbeing created updated scheduled and soonand not not necessarily like like you soif you think that you're you have goodenough control plane then you probablywouldn't be a bottleneck therei see uhin in that case uh you when when we firesome job the pods needs to be startingup um but usually the image for uhsimulation type of software is quite biguh and then the starting time would bequite a lot so I wonder if using thisecosystem is it possible that we have apod that's always there it waiting forthe job requestBut then it also sounds like it's by youcan you can have it it's basically meansthat you you're doing it like a batchserving so you have a service that getsrequest and you respond to the to therequestsoif on the slides I describe that wesupport serving API so you can deploythis thing that uh does this uh relativeshort processing as a deployment orleader worker set or whatever fits yourarchitecture the best and then it sitsthere and process requests that arecoming or you may uh back it up bymessage queue put things on the messagequeue and start and stop pots uh as thethings arrive uh uh on this messagequeue however for very shortcomputations lasting a minute or soprobably it's better to set up a longlift service instead of starting andstopping pot constantlyright so this setup is more suitable forvery long running jump and a batch longrunning i would say differently on thisdiagram you don't say deployment but youcan imagine that we add yet another uhyellow rectangle there put deploymentlabel on top of it and then this diagramwill suit better to what you aredescribing so really really shortcomputations happening in great quantityright the deployment would be on top ofthe queue sorry the deployment uh wherewould the deployment it depends how youuh position this service is it like aservice for uh some computation I don'tknow you do weather forecasting you dothis computation for two hours and thenyou go away or you need it uh to berunning 24 by7uh Okay I'm running out of time sorrysorry we can talk about this use case inthe lobby oh yes if Okay thank you thankyoudo we have time for one more question[Applause]2025-04-15 21:58:37.844455 22��1�#��AaWxuaEFSarUhello everyone welcome to bot workinggroup updates my name is Martin Velgusi'm one of the organizers of thisworking group uh so you all gatheredhere so you probably have some intuitionwhat working groupis allabout mylaptop almost died okay so batch workinggroup is a forum to discuss enhancementsto better support batch workloads inKubernetes and by batch workload I meaneverything that is related to uh highperformance computing AI machinelearning data analytics continuousintegrations and the goal of this groupis to reduce fragmentation in Kubernetesecosystem people in patchworking roomcome from different SIS we come from sixscheduling six apps six nodes sixautoscaling and we definitely canoperate with the six but we are notlimited to the six uh we also have somepeople who are uh just coming to thisgroup from outside the scope of the ggroup is to uh talk and discuss and doadditions to batch related APIs thinkabout job queuing and schedulingprimitives uh do tools that maximizecluster utilization and make the bestuse of the hardware to improve theworkload performance and we also work onsupporting specialized hardware forbatch AI ML and all that sort of stuffuh how to take part uh probably the mostimportant uh part of batchworking groupis our slack channel we have uhbi-weekly meetings uh they are both eurin European time friendly slot and inwest coast Pacific time uh slot thedetails you can find in the link belowalong with calendar invites zooms linksand all of this stuff needed for you tojoin thesemeetings what is uh bus working groupdoing probably our biggest endeavor andbiggest project is Q you may have heardof Q but for those who have not so farlet me give you a little bit ofintroduction of what what is it so uh Qacts a little bit like a bouncer in aclub he decides who who should wait whocan get in so that the uh club is nottoo crowded and by club I meanKubernetes cluster it acts a little bitlike a receptionist desks who admitpeople with pro reservation quickly andlet other weights it also acts like alittle bit of a waiter in the restaurantwhich gives you to gives you the tableand takes you to the desiredseating[Music]and when we're talking about people weare actually talking about variouscomputational units that come to thecluster and uh this can be in form of aray job it can be a plain pots it can beclassic pure vanilla kubernetes jobs orsomething from cubeflowportfolio in Q and in the remaining ofthis presentation we will refer to allof these things asworkloads okay so that was a little bitof introduction uh let's get a littlebit more formal so what is Q q is AI HPCbatch workload level scheduler and byworkload level scheduler I mean that itoperates onworkloads CS and hold not on individualp��rd party cloud or local machines uhthe beauty of the Qflow is composableplatform so you can use individualcomponents as a standalone applicationor as a complete end to- end platform sothis session will be focusing on GNI andLLM ops and our community in the quiterecent years work really hard toactually enable Qflow components forgeni and I think we can say that similarto AI and ML right now we all live inthis new world and all these componentsactually you know you can easily do geniand lmops with them and we will sharethroughout the session how actually it'spossible so by the flow you can thinkabout like a bridge between ML and cloudecosystem so we provide a simpleinterfaces is for data science MLengineers to leverage those advancedtools so we will structure our sessionin a way that we're going to show youthe life cycle so this is a geni lifecycle that uh our community maintainsand as you can see Qflow has severalcomponents which actually address everystage of geni life cycle starting fromspark operator for data processing thenotebooks for model development andtrainer for fine-tuning and distributedtraining kip for model optimizationarchitecture search for large largescale inference and the Qflow pipelineswhich is stitch all these piecestogether also we have a model registrywhich allows us to store metadata andartifacts and have a sort pluggablesystem in it so as you can see thisextremely powerful because this is anextensible and a portable ecosystembecause organizations can takecomponents they need and integrated withtheir uh internalplatforms so let me first talk aboutnotebooks and what kind of exciting datawe had there uh so notebooks team workedsuper hard to enable actually notebooksto the toe this is a new thing calledcubeflow workspaces uh this is like thesnapshot of the new UI that's going tocome up very soon so the idea is toprovide you simple interfaces how youcan create um interactive ID for datascientists using R studio VS codeJupyter lab in a way that this is veryplable for data scientists and at thesame time for platform engineers so ifyou want to learn about it please jointheir community calls there a lot ofexciting updates coming in so lookingforward to the release uh moving forwardto the spark operator so the goal ofspark operator is to enable sparkapplications on top of kubernetes wehave a lot of uh organizations who usingthis in production uh that recently thisproject could donated from Google toQflow ecosystem so right now it's partof our broader scope and in quiterecently they released a new version ofspark operator which is 2.1.0 uh theyintegrated new portlates there tosupport spark 3.x also they've done asignificant amount of work for unicornwith gang scheduling uh they refactoringthe entire codebase with control runtimealso like very exciting things aboutinteractive sessions um so folks in thespark operator community work reallyhard to actually enabling uh Jupyternotebooks uh works well with spark onkubernetes so this is like very excitingthings coming in uh so if you want tojoin them please uh join the communitycalls it's like calls every I believeFriday every second Friday so communityis great and let us know if you're usingspark operator we're looking forward tomore organizations being involved uh oneof the very exciting things about Sparkis actually uh benchmarks that uh AWSfriends from AWS uh have done here sothey actually uh try to run qflow sparkoperator on 60,000 spark applicationsacross 36,000 pots which was great likethis blog post has a lot of greatinformation of how actually spark can beused at large scale so it's amazingplease take a look um so J next uh sonext I'm going to speak about Qflow KTproject uh so as I mentioned thisproject for hyperparameter optimizationand architecture search so I have fewexciting updates for GNI this year sothis is one of the examples how weactually simplify hyperparameteroptimization for LM fine-tuning whichactually allows you seamlessly create uhexperiments with KATIP using just onesimple API uh so this is actually one ofuh good thing where this is th�e wascontributed by GSOC students with ourcommunity so this is fully um uh fullyin upstream so if you want to check thisin the blog post I'm strongly suggestyou to check how we can make it easy forfolks to do parameter optimization forLLMs also one of the exciting updatesabout rack so actually katip is kind ofthe project which allows you to tooptimize almost anything and one of uhour contributors um show how you can usethis with rateral generation sobasically they creating the entire rackpipeline and um uh plugin is within kipexperiment so they can see how thiscould be optimized within um withintheir life cycle so also please feelfree to check the blog post it's reallyexciting things uh then I'm going topass it to Yuki so he will speak moreabout the Qflow trainer project we havea lot of things there thank you Andreokay uh in my section uh I willintroduce about QR trainer uh asdescribed by Andre so in today MLworkforce cycle uh uh we we typically uhperform the training model training andfinetuning um after the modeldevelopment um trainer is responsibleresponsible for the um fine tuning andmodel traininguh Q for trainer is uh actually uhdesigned for not only for large rangagemodel so gener generic fine turing andmachine learning training model umacross various frameworks uh somethinglike uh pytor jacks uh tensorfl orothers uh Q for trainer uh today uh wehave Q for trainer B2 and Q for trainerB2 is um ro oriented resource model uhwhich means trainer has two typesresourcesuh which is train job and trainingruntime uh training runtime ispredefined by DevOps engineers anduh MLengineers define the train job and theyspecifyuh arbitrary training runtime in theirtrain job this model uh allows us todecouple responsibility betweeninfrastructure side and training codeside by this resource model uh oh sorryuh okay uh by this resource model uhdata scientist can focus on their uhtraining code and related parameterslike number of nodes or number ofprocesses inside of single nodes or someelse and uh this table isum current uh machine learning framesupportingtables as you know uh we are planning touh support various type of ML frameworkuh for britzu the same as B run traineruh currently uh we support PyTorch deepspeedMLX and the TensorFlow and hugging faceis under development eventually uh wesupports Jax ple and exibboost in this slide uh let me quicklyiterate uh our cube for trainer internalmechanism uh in the training B run uh wehave ML framework dedicated CL somethinglike Python job or TF job however in thetrainer B2 uh we consolidate all kind ofjob CLD into training runtime and trainjobum so instead of uh dedicated job CLDjob CLDs uh we construct cube fortrainer pipeline framework only forinternally this allows to reducemaintaining CLS and rapidly increaseadditional ML frameworks in cube fortrainer additionally this allowsplatform developers to support arbitraryML framework by themselvesinternally next Andre is introducingabout uh user experience thank you Yukiuh so Qflow trainer is a next generationof training operator for those who havebeen using there so it's has a lot ofgreat features for geni so one of theexamples this was a talk yesterday aboutdistributed arocache for distributed MLtraining so what it is is very excitingproject so what we basically have donewithin the community we connected thosedata libraries data fusion Apache arrowand iceberg and we bring them tokubernetes so the thing is like this isexciting work we're going to giveanother demo today in the Qflow booth at1 p.m about it but the goal is for us toactually create a distributed cache onKubernetes which allows us to distributestreaming for the PyTorch so basicallywe kind of converting the arrow formatfrom uh from iceberg all the way to thePyTorch tensors which basically allowsto do zero copy translation we see a lotof improvements we're constantly doingbenchmarks we're going to propose it tothe Qul community relatively soon soplease join us in the booth if you wantto see the demo it's like one of themost exciting work going on right nowgoing next so Jennai so as� I mentionedbefore in the panel and many of usactually know right now we try to seehow we can simplify cubeflow for MLusers specifically data scientists andML engineers so this is like one of thereally exciting work coming in so we tryto provide unifi SDK which can allowdata scientists to quickly process theirdata train them and optimize andeverything in a single Python interfaceso they don't even need to know anythingabout Kubernetes so what we learn fromour users they want to still leverageKubernetes to scale but they don't wantto learn Kubernetes so Kubernetes kindof abstracted the way they don't evenneed to understand coupube cuttle ordocker or yaml kubernetes they justfocus on pytorch which they run it andthen scale it exciting work coming innew working group ML experience we arehappy to announce it it will be in thecubeflow community if you want to joinplease check this um proposal in thecommunity one of the also excitingthings about Qflow SDK we collaboratewith llama community because I think asas you can see during the session Qflowkind of provide the end toend ecosystemfor uh genera at scale and we right nowtry to stitch llama pieces and Qflowtogether so folks who actually usingllama in production they can leveragethe Qflow tools so quickly build agentsdo post training do evaluations usingthe existing tools so all of us notgoing to reinvent the wheel um all ofalso exciting things about this torchtune thing so with SDK initiative wecollaborate with torch tune community tocreate this uh fine-tuning experienceexperience when basically the designerscan simply use one train API withpre-built trainers which we kind ofpre-create and all the thing they needto do just need to as you can mentionedtrainer has a concept of runtime so wecreate a runtime where they can justpass it in their Python script and thisruntime has entire utilization for uhdata sets modelization and it hasdistributed training so extremelypowerful we're leveraging Kubernetes fororchestration leveraging torch tune foruh PyTorch fine-tuning so we don't needto again uh redo the same things allover and over over again and all of thisand Jon will give a demo about end toengine experience so all this allows usto construct the entire pipeline usingthe pure Python interfaces so we haveprocess data with spark we have trainingwith uh Qflow trainer we have a servingwith Korf and all this constructing endto end GNI pipeline which again uh Jonwill mention about in this session andit's super exciting and it's greatexperience for data scientists so theycan quickly do iteration at scale uhwith leveraging all those advanced toolscommunity working with uh so I will passit to you so you can speak at modelregistrythank youAnjie so um machine learning life cycleand AI life cycles are very complicatedso you can train the model tune yourmodel experiment with notebooks andeverything right how do we trackeverything right so the Kufra modelregistry was a project donated from RedHat to help users manage and versionmodels and their metadata throughout themachine learning lifecycle it fills a gap between modelexperimentation and productionactivities and provides an centralinterface for all stakeholders in themachine machine learning life cycle tocollaborate on machine learning modelsso some updates from the model registryprojects so we introduced the modelregistry UI uh so that's pretty neat andit's uh much easier to use now besidesuh the backend APIs uh and CIS right sowith UI you can easily just click aroundand upload your models and uh do allsorts of things so it you can registermodels easily and yeah we also haveupdates to integrate with custom storageinitializers as well so that it can workwell with Kserve and other projects sothere are other minor features and bugfixes along theway and once you have the model righthow do you serve it so case of comesinto play in this machine learning lifecycle uh to serve your models so what isKserve kserve is a highly scalablestandard and cloud agnostic modelserving an inference platform onKubernetes note that it's is not onlyfor predic�tive AI but also forgenerative AI aswell some updates from the quesoperspective uh so we added theintegration with envoy AI gateway thatprovides support for multiple RMproviders and tokenbased read limitingand routing with traffic shaping uhfallback and load balancing and so on sowe've also improved the autoscalingcapabilities for large model use casesso that you can use um uh you can dolike autoscaling based on custom metricsuh this is through the ka integrationwith queso and we've also added themodel caching capability uh through PVand PVC so that you don't have todownload the model uh um uh again fordur especially if the model is large uhand it takes time for uh to spin up themodel duringautoscaling and we've also improved theRM serving runtimes so we've added themultiode inference support through VRMserverUh so we'll continue working with theVRM community to improve uh the uh theVRM serving runtime so we and we alsoadded support for uh gateway APIintegration so uh to especiallyspecifically for the raw deployment modeso that you can uh use it to uh use bothETL and envoy gateway as your uh gatewayimplementationnext I'll pass it to Jonnoyeah you would have heard about all thecomponents here so Qflow pipelines isthe last one which is stitching allthese components together so if you wantto orchestrate the entire workflow in aDAG or a graph uh pipelines can stitchtogether i'll show in the comingdemo so I'll just go with the demo ithink this will make things clear um sowe'll take a a minimal uh example hereuh in the interest of time so we'll takea hugging face gemma 3 1 billion modelwhich is a very lightweight a smallmodel which got released uh a week agoum and the idea is to create afine-tuned model for reasoning so youwould have heard about reasoning theidea is to in understand how your theinference output is obtained right usinggRPO um using the unsloth library um sothat uh the idea is to use the Qflowtrainer for doing the finetuning andonce we have the finetune model use casefor the final inference and the wholething is stitched uh within the Qflowpipelines so here is the uh default uhQflow dashboard so we have uh defaultuser which is the Qflow user um youcould and this is based on the profilesso you can switch between profiles ifyou have um uh permissions to do that sothis is a multi-tenent system where youlog in as a user and you can being anadmin you can always switch to otherprofiles so on the left whatever you areseeing are keyflow components um and uhthe notebooks a cat pipelines you wouldhave heard about all ofthat and in u if you go to notebooks uyou could create jupyter notebooks or uhr studio whatever runtime that you needand specify resources for example rightand add storage and do do yourexperimentation in the very firstFor Katib you could add hyperparametertuner optimization um with variousalgorithms with your target and theobjectives that you actually want tohave for your pipelines you can createan experiment and associate runs so runis a single workflow execution you canimagine this to be like multiple runswithin your organization running graphsmultiple times based on your requirementso let's go to the LLM ops part using Qfor pipelines i have a notebook here umwhich which is alreadycreated going into the notebook whichwhere I have a few fine-tuning uhJupyternotebook installed KFP which is the KflpipelinesSDK used uh hugging face token fordownloading um the gated model fromhugging face which is a jamaat 3 modelso the idea is to wrap your trainingcode if you have in in the currentexperiment we are using a custom trainerso it is up to you if you if you want tohave a default experiment you could usea torch tune which we talked about itsometime later um so use torch tune ifyou want your library to take itentirely from uh start the start to endor you could use if you have a customcode wrap it within your uh DSLcomponent of KFP pipeline and it willtake care of that so here I'm using anslot library here um where I specify themodel name and if you know GRPO the ideais to create reward functions so um thebetter� the better the output you rewardmore it will get the output will getaligned to the direction that we areseeing if your output is bad you givenegative rewards which is saying thatlike you have to tune more right that'sthe GRPO so idea is to give more rewardswhen you are formatting and the answersare right and give penalty when they arewrong right and that's a very simple oneum we'll share that so that like if youwant to see what happens in the backend so um all the reward functions aredefined and we have the gRPO trainerfrom unsloth um where we put all thehyperparameter tuning uh tuning optionsso if you want kip to tune this that'salso possible it's not shown in thisdemo but all the hyperparameters can betuned usingkip and um and this is the main partright uh I'm using the train function ofthe trainer so I've defined the thetrain function used the cubeflow trainerto start training in a distributedmanner so here if you see that this is acore part right I'm saying I'm going totrain this function on one nodewith four CPUs 64 gig memory and one GPUright so just change number of nodes ornumber of GPUs that you have in yourcluster and it will automatically scaleup and I'm waiting for uh the job to getcompleted and once done um I delete thejob the second pipeline task here is theinference once I have a fine-tuned job Iwill create a deployment from uh thefinetuned job uh fine-tuned model thatis created from the previous task usingcase right and the whole pipeline um isorchestrated using the Qflow pipelineswhere I create the DAG here i'm sayingthat the training has to happen afterthe storage is provisioned and theserving has to happen after uh thefinetuning is completeand I'm triggering the fine-tuninghere so once the fine-tuning pipeline iscreated uh you could go to the pipelinesection the UI to see the progress soyou would see a new run here which istriggered here and see the DAG which gotcreated right i created storage i have afine-tuning step which is happeningright now and once that is completed itgoes to the serving step and there is aseparateuh path which I have created for theoriginal model to just to uh do acomparison serving of the originalmodel so you could see logs and all theother the details within the UI itselfwithout looking anything from theKubernetes side of things so the gpofinetuning happens right now so if yousee the pod you would see that there's atraining pod which is created because weare using a single pod right now but youcan scale as per your requirements anduh you could see that the train jobwhich is a CD4 running the trainingright and it is in created uhstate and there's a base model whichwhich got created and that's the reasonI'm seeing uh inference service from thebase so once it iscomplete you could see that the trainingpart gets into the completed state andthe bottommost part was the part forserving so the finetuned model was jobwas created completed and then it wentto the servingstate so you could see that in the UIitself all the steps are complete andI'm ready for inferencingand you you could also see the samething uh all the case of related uhendpoint details from the case of tabthere to see uh what endpoints are andhow to send requests tothat just to highlight here so we usedGRPO trainer so you could also see thatthe rewards are getting better so whichactually means the finetuning ishappening so we did that for limitedsteps in the interest of time but yourrewards are getting better so if you doit for longer time you would get muchaligned results based on yourrequirementsso you could get the uh infrasidemetrics as well and we are ready for thefinal inference so I I'm asking aquestion what is the area of trianglefor length 344 right and I'm sendingrequest to the hosted endpoint by kerand you could see that the model istrying to reason right like from thesite length that we have it is trying tofigure out what's the solution possiblesolution that I have so the in a in anormal sense you would see the finalanswer the area of triangle is 3 uh55x4 and between the start working outsessions you would see how this isobtained that's the whole demoall right so I have two minutes tosummarize everything but doing my bestbut uh well thank you so much forstaying with us on this uh we have greatnews great news so we have releasedQflow 1 uh here at QC con a couple ofdays ago so kudos to the team andeveryone who contribute thank you somuch uh but yeah we have many greatthings uh so we talk about the UIalready we are also bringing efficiencyand optimization and user experiencewith training operator distributed tracksupport also uh we talked already aboutKaty so we'll skip it but also for uhQflow pipelines we are doing manyimprovements to allow you from userexperience but also efficiency tocontrol like resource uh limitsmanagement in the pipeline but also forum controlling the limits in into theloop parallelism so you can control thewhole uh pipeline uh the whole loopexecution so the other thing we have isalso a spark operator is now part ofQflow and we are working hard onsecurity so we have done CVS reductionsbut also we are looking into allcomponents and looking into improving uhsecurity best practices uh this is theroad map so just really quick again weare looking into uh many things from uhgen AI inference support from KER wetalk about trainer user experiencepipeline security upgrades and excontinue in increasing the experienceand also a storage integration for modelregistry but what is exciting is reallyabout what this community is about andit's about we are working towardsgraduation right now so it's a reallyexciting time if in our community andeverything around AI so what we aredoing is also all the working groups arecoming together to build new work newworking groups that we can work acrossthe different components by bringinglike data uh focuses on data and andalso use better user experience acrossall those components so the great thingabout the all of these is that we are acommunity that evolves that brings newidea and you can be part of it so if youhave ideas about any proposal so feelfree to submit it on any of these linksthe other thing that we are working onis in our outreach committee so we doneed help on this so please help us joinus uh we are working also on Helmchartso the idea is how we can make our uhdeployment process for Qflow more easyfor you by bringing the homes into thepicture for Qflow the manifest projectsand all the components but also fees forexample who's been an add-on the Fcommunity uh proposed to donate the uhas a feature store to the Qflowecosystem so the last things I want toinvite you to join our community this QRcode you can join uh this is our Slackyeah we also have meetings our meetingsare amazing so each working group uhmeets and talks and discuss discussthings uh so you can be part of that weare really looking forward to you knowdecide the future of of QFlow and AI solet's do it together join our meetingsuh this QR code will give you moreinformation uh the other thing is theQflow community call is the best way andthe best path uh sorry the best waywhere you can start joining ourcommunity get involved uh we have manysocial channels as well where you canfollow us and also we had many adoptersi don't know if you saw the keynoteyesterday was amazing so please join usand we also have a survey so we want tohear from you about your experience withQflow maybe you are a new adopter maybeyou are evaluating we want to hear fromyou because we will help it will help usto really start shaping shaping thefuture looking in how what are thethings that we can improve and we wantto do it with you you are our communityso we really want to hear from you andwith that this is my last slide uh justa quick question how many of you aredata scientists here todayokay how many of you are AIengineers how many of you are platformengineers okay great thank you so muchplease connect with us thank youthank you feel free to ask any questionsafter the session looking forward tospeak with all of you and please join uslet's not reinvent the wheel let's buildthings together so thank you everyone2025-04-15 21:58:38.476240 vv��i�#�� AgGP9QdlNr9Yhello everyone super happy to be here inthe CubeCon stage uh welcome i'm superexcited to be here in London in my homecity i hope all of you enjoying the warmweather in April in London so my name isAndre i work at Apple um I've been thiscommittee for the last 8 years right nowI'm the cubeflow steering committee sotoday we're going to share you theupdates from the ecosystem at what'sactually next for cloud native ML andfirst of all I think we'll spend um fewminutes to introduce our panelists soyou can startthank you uh I'm Yuki a softwareengineer working for cyber agent andmaintaining cubernetes cube cube for Qbatch related CNC cell ecosystem toolshi everyone uh thank you for being heremy name is Yan i'm a senior principalsoftware engineer at Red Hat AI and I'mone of the steering committee members inCoupeflow and also a project lead forAGO and uh also co-chair in theKubernetes working group servinghi everyone I'm a technical director atNewanx AI um in open source i uh I'mpart of the cubeflow steering committeeand lead couple of AI initiatives umtraining and autoML and also in MLcommonstorage hello and welcome i'm ValentinaRodriguez i'm a principal architect atRed Hat and also I contribute to QFlowon diverse projects i'm also part of therelease team and also work with theplatform working groups and a case and Iam a KCD organizer in New York so thankyouthanks you please welcome these amazingspeakers to thepanel all right so let's talk aboutCubeflow so first of all let me ask youhow many of you actually know aboutCubeflow all right how many of youactually successfully run this inproduction all right I see a few handshere so just to remind everyone uh Qflowis an ecosystem of open source projectsuh so the goal of our community is tomake it simple uh to make AI and MLcompares simple portable and scalableyou can run these components in anykubernetes cluster whether it's on-premthi��witchand starting to make it So we'reactually going togo to home for and then finally we'regoing to try releasing We're planning onre-releasing Helm 4 as part of QCON inAtlanta We're gonna go for gold inAtlanta and go from one gold medal cityto another So kind of a nice littleanalogy there So we've done a lotactually thus far What have we donematt Glicker Thank you So some of thestuff that we've done um let's diveright in This is the work since you knowhe talked about the kickoff uh back inNovember and we're here So this iswhat's happened so far Uh the firstthing we've done is we made breakingchanges This is big Helm version 3 Wedidn't break anything Um and this flaghere is a beautiful example of us notbreaking things When Helm 3 first cameout if you did Helm repo list and therewas an error updating a repository theexit code was zero Oops It should havegiven an error Should have reportedsomething right but we didn't want tobreak anything because so many peopleuse it in critical places that we addeda flag for that If you wanted it toreturn an error code instead of zerooops we added a flag for that becausestability and not breaking people wasjust so vitally important to us And sonow we're breaking things And it'sthings like this that we're breakingright we pulled this flag out It returnsa non-zero exit code just like it shouldhave But there's lots of other thingsthat we are breaking along the way totry to improve your experience as a Helmuser using the client the CLI or as anSDK user who's got to incorporate itinto your tooling because we've learnedso much in the last 5years Uh the command part is now apackage Um this is a subtle littleinteresting one So if you go look attools like if I pick up K3S for exampleanother CNCF project you'll see that K3Shas cubectl commands built right into itright i can run K3S coupubectl and acoupubectl command It's a way ofpackaging things up Well Helm was notarchitected internally in a way thatsomebody could do this with Helm's codeI actually ran into this with a projectthat I wrote a few years ago We wantedto make that change and we weren't theonly ones because somebody else wantedto embed Helm commands in theirapplication too And so they came alongand now the Helm commands are a subpackage So another application anotherGo application can import those and usethose commands directly the same waythat something like K3S doescoupubeCTL Oh this is a fun one So talkabout not breaking compatibility hereright we removed support for oldKubernetes APIs Let me ask this Isanybody still out there runningKubernetes115 no No Helm 3 still supports its APIsthat have been deprecated Beta API toadmit it Yeah Nobody's going to admitthis We have support for beta APIs thatgo back that far that Kubernetes hasn'tserved in years because we didn't wantto break anything or anybody Nowhopefully nobody is running a versionthat old because of security but wedidn't break things And so we've removedsupport for those older APIs So we'regoing to support modern Kubernetes andnot have to carry that legacy aroundwith usanymore Let's see Ora that's mine Thisis yours Oh how many how many know whatis how many do store their charts as OCIartifacts or is the library underneaththat and we were the last project thatused the V1 of Oruras and they basicallycreated this branch and kept the branchalive because of Helm and because of ourAPI compatibility we couldn't changethat and we've had so many users thathave had some challenges around OCIregistries OCI artifacts and we're likewe can't do too much just because of theway that Aura's V1 was created andmaintained they've moved on to V2 beforewe even went GA with with an out ofexperimental phase so we're using thisas an opportunity to finally finally goahead and switch over to Aurora's v2 hasa lot more improvements and we canactually go ahead and track all thefeatures that Ouras is in um includingand integrating into their own projectnow into Helm Yes And and this all comesdown to not just compatibility for CLIusers but SDK users If we had upgradedORS v1 the way things came out it wo�uldhave actually could have broken theexperience for those SDK users Thethings like the Flux project and othersout there who take thatWhat isthis we removed unused functions Soyou'd be surprised Code morphing overtime We had public APIs in there Lotsand lots of public APIs that it turnsout other projects were still using Evenwhen Helm decided to stop using it andwe we couldn't remove those for a longtime We had to maintain them made surethey work We jumped through some hoopsto keep those things alive We startedripping them out Um there are newer waysto do those same things but in a minorversion of something semanticallyversioned you don't break that And youyou see a lot of this out of us right wedon't want to break things It's becausewe know there's so many important thingsbuilt on it We really want to takeversioning responsibly We don't want tobreak people along the way unless wesignal that which is what a majorversion is for So we've started tochange those things Now these updateswill be documented So when you finallyneed to move from version three toversion 4 if you're an SDK user therewill be a migration guide There will besomething to say hey you were doing thisnow you can do this instead But it isone of those things we are breakingthose APIs now for our health andhopefully yours aswell All right so let's talk about whatwe're actively working because that'swhat we did already So this stuff'salready done This is baked It's built inNow this is the stuff that's in flightright now that maintainers like us arecurrently working on and uh we'reactually open to help on thistoo So the first thing is logging Allright So uh Go logging um if you wentback 5 years ago it was a little bit ofa mess Uh you didn't have things likethe slog package in Go Many of thelogging libraries that we use today wetake for granted And so Helm had areally funny way of doing logging Infact if you look at some of the oldlogging examples here you're going tosee we used either the log package or wealso passed a logger around internallywhich was a funny little wrapper thatwe' created It wasn't elegant It wasn'tnice And if you're going to use it as anSDK or other things there's far betterways to do it So things didn't cleanlyintegrate and they didn't cleanly workwell if you want to pull stuff in And asworking for a company where we pull inthe Helm SDK it was kind of a problem tomake sure the output went to the sameplace as everything else which helpswithdebugging So we looked at a lot ofdifferent ways of doing logging rightbecause it's still a little bitcontroversial on what you should do Welooked at different loggers Do we wantto do things the Kubernetes way rightkubernetes use K log There's also log Rwhich I'm probably pronouncing log RThere may be a better way of saying itbut um you know that's the API for itand this is based on what came out ofGoogle and you see in lots of theoperators but we went out and we didresearch and we said all right how arepeople using the Helm SDK we can goaround on GitHub and see how it'simported and where it's used What kindof logging do these applications all useand we found it was all over the placeIt didn't just follow what Kubernetesdoes Uh a lot of times they'll use thelog package We found plenty using umLogrus right uh we found some using Zapand others And we said "All right whatlogging interface what way of doing itwill work for all of them?" And we hadto ask that question and do theresearch So we decided on the new S Logpackage This is compatible with uh Kloand what Kubernetes does They have acompatibility layer in so things thatuse that will work There's shims forthings like Logrus and Zap and some ofthese others And this is a good way todo structured logging and that kind ofthing So Helm is in the process rightnow of migrating all of its logging overto this in a way that SDK users can takeadvantage of and CLI users will get aricher experience in theiroutput So here's a really simple newlogging experience What you're going tosee is we're actually along the wayswitching to structured logging So a lotof our log�ging had messages and andinformation intermixed We're nowactually trying to do structured loggingSo when an SDK user pipes this to outputuh they can query on this They can saywhat are the errors what are the othermessages it gives people using it in therich logging tools uh the ability toquery and do more relevantthings Uh what we have not worked outthis is one So while that's an SD whilethat provides a nice API which loggerare we ultimately going to use in theend do we want to use something thatgives us pretty output in the client dowe just want to go with the base systemdo we want something that's high-endperformance um this is an open questionAnd so actually if anybody here has anopinion on it please come find usoffline uh you know outside of theconference let us know what you thinkBut this is one of those questions we'restill trying to work out what's going tobe the default one that the Helm CLIuses because this again is one of thosethings that's inflight So let's talk about charts v3 Allright So over the last few years we'vegotten a lot of questions Can we do thisdifferently with charts can we do thatdifferently with charts uh and a lot ofthe ideas are great beautiful ideas waysto extend charts but if you do it itbreaks things for people in theirexperience And as you see we're big onnot breakingthings So we introduced the idea ofcoming up with a charts API version V3And if you haven't noticed the chartstoday are API version v1 or v2 And wehad API version v1 a long time ago Helmstill supports those um mostly there's afew features nobody ever used thataren't supported but we support that andwe said if we want to do some of thiswithout breaking all those charts outthere that you create and consume weneed to increment the API version sothat is going into an experimental phasenow let me explain this here um one ofthe things we the big reason for this iswe didn't want to break backwardscompatibility right we definitely don'twant to break you or any of your chartsuh when we went from v1 to v2 we brokebroke a few things We were lucky thatalong the way we patched them and fixedthem but some of the changes were goingto be bigger And I had people say withHelm V4 if you break my charts I willfind you and kill you I I think thenumber of users were a little smallerback then Anyways the user base it was alot smaller back then but I've hadpeople who know where I live say I willhunt you down and kill you And they'renormally people who say uh come to theconference I'll buy you a beer And so itwas a very different toneSo here's a little bit about chartshistory right charts v1 came out withhelm v1 and then we went to charts v2and charts v2 was part of helm v3 and wehad charts v1 for a long time There wasa small period where it was kind ofunstructured but almost nobody used helmback then You remember this is 2016That's that's a long time ago at thispoint and we've had charts v2 for a veryvery very long time So here's what we'retalking about with this And this iswhere things get a little bit differentfrom the Helm version 4 timeframe It's an experiment which means wecan break out of the Helm version 4 dateat the end of the year because we lookedat all the changes people wanted to haveand we can't fit them all in So we wantto play with it for a while let youexperiment with charts as we iterate andget feedback and work on it So we'regoing to continue to support Charts V2through Helm V4 and maybe beyond Wedon't know We don't know when that thatlife cycle will end But charts v3 isgoing to be an experiment It'll be anexperiment once we get it in here duringthe development phase and it'll be anexperiment after the release of Helmversion 4 So we can iterate on it We canlearn We can change You can give usfeedback And then when it's ready we'llmake it G remove the experiment flag andthose charts can be live in the wild forpeople to use for those new featuresThis meant that the constraint of tryingto get a Helm version 4 out at the endof the year in a reasonable time uhdoesn't constrain being creative hereAnd so this is one of those things tha�twe're working on Now what might be inCharts V3 right um pluggable renderingengines We've got some people in theroom who wanted to change out therendering engine and get rid of goplThis is one of the things we're talkingabout Can you make it pluggable and andembed that in the chart another form ofdependency handling Right Right nowyou've got charts in the chartsubdirectory and you can do helmdependency up There may be other ways wewant to do it and there are peoplewho've been experimenting with ideas onthis and so we're looking at that adifferent way of handling CRDs right uhthe way we handle CRDs now has somerough edges and it was there to playvery very safe but now CRDs have takenon a life of their own since this cameout They've they've expanded far more inuse and in use case So we're looking fordifferent ways to handleCRDs Uh custom resource ordering righthelm Kubernetes you should be able tojust throw everything up and everythingreconciles beautifully In reality thatdoesn't always work And Helm in order tosolve for that has an internal resourceordering from lessons learned Sometimesyou want to deal with your CRDs withinthat resource ordering or maybe you wantto change that resource ordering becauseyour own application wants to dosomething So this is a feature we'retalking about or maybe there's justsomething else that we're not eventhinking of or that we've got a listthat goes beyond this But these are allkinds of things we can look to play withwith a new version of charts we can't dowithout breaking people in the existingonesAll right And status checking So howmany of you have gone ahead and had somechallenges waiting for certain actionstofinish yeah way we way helm has done itin version three not the best So likeanything any improvement we have to helmwe have a helm impl uh improvementproposal a hip basically and we havespecial ones for helm 4 So very muchlike any improvement we have one forlet's get a better handling for statusesSo right now this is the current logicIt's very opinionated It's kind ofgeneric It's very custom to Helm whichworked for a while but then like Mattsaid with many things we've seen thusfar there's some rough rough edges Sowhat we did is we went to the communityand saw what's what's out there and wehave plenty of contacts and manydifferent tools leverage Helm and one ofthem is Flux So we were able to workwith members of that of that team tobring in K status K status is a moreintelligent way of handling weights andchecking on the current status ofresources This got merged literally lastweek and we're maybe this week It was acouple days ago It's not merged yet It'sapproved but not merged Ah close We'reclose Open source at its finest Anywaysvery close to being merged and this isgoing to provide a much betterexperience for you to be able tounderstand what the current status ofall the resources we have currentlybeing deployed as part of a chartrelease and we have a lot of thingswe're consideringMatt So uh oh actually the first one'syours Oh it's all mine now Okay So goingback to the to the topic of OCIregistries how many of you have had somechallenges in the transition fromtypical Helm repositories to OCIregistries forreleases there's not a good way to youto reference other other um repositoriesProxy and mirroring is not the best Sowe looked in we evaluated severaldifferent options out there We couldhave gone and kind of rolled our own butwe looked into what does the experiencelook like for anyone creating an OCIartifact well it's going to be a similarexperience to creating a container imageSo we looked into what provides asimilar experience of contra creating acontainer image and managing containerimages So if you're familiar with theecosystem out there there's a there's aconcept called registries.com and thatallows you to customize how aliases workmirrors as well as other configurationsand credential helpers too And you cando all this using standardizedapproaches that are already out there inthe ecosystem So it's not just Helmdoing this We're going ahead andfollowing and partnering with wha�t othercommunities are doing and you know I doa lot of work with a lot of differentcontainer engines and they were veryinterested in hearing about that Helmwas considering this and they want toget our feedback what's working wellwhat's not working well and if there wehit an issue edge case we're going tobring that back to them and be able tohelp you know really make the wholeecosystem muchbetter plugins ah yes so one of thethings that we're looking at isimproving how plugins work Helm hascertain plugins you can do things like adownloader plugin or an extension at thecommand line where information aboutwhat's going on is then passed to yourplugin We are looking at other ways thatwe can do plugins to make Helm moreextensible because we get lots ofquestions about hey could we do thisfeature or that feature or this otherfeature and a lot of times they're nichecases or you know it's not something wethat we need to maintain for everybodyor maybe it brings in a pattern that isit's a lot to maintain because there'sso many of you with so many differentideas and we wanted the ability to giveyou the the extensibility to plug intodifferent places along the process So weare looking at how do we do this wehaven't worked out the details yet orthe time frames for which parts becausewe've got everything from small ideas wecan meet for Helm 4 to grand ideas weneed to experiment with which alreadyhas us talking Few years ofexperimenting with some of these ideasmay lead us into some changes to a Helmversion 5 that can do even moreextensibility But this is one of thoseareas that we're working on right nowSecurity is a big one and I do a lotwith security and how many of you signyour Helmcharts Why don't you sign your Helmcharts because it's hard PGP GPG is noteasy and it's cumbersome So there areother tools out there that have evolvedover the last few years One of them iscosign and sig store So looking atproviding new opportunities to sign yourcharts outside of GPG and PGP So this isone of the areas where we're looking atextending this through plugins Yes Andand actually there is a plugin to docosign signing today but thefunctionality isn't native And that'sone of those things that's interestingThat was mine I I care so much aboutthis ecosystem We want to at least getsome type of bridge for the Helm 3 worldWe're going to make it a little morenative in Helm4 All right So we've talked about abunch of things what we've done whatwe're currently working on what we'relooking at doing But I know especiallyfrom talking to to many people this weekthere's a lot of things that we have uphere that other people that you knowwe're missing things and we won't beable to get everything done that you allwould like to see We're not going to geteverything done we would like to getdone in a Helm version 4 for changesbecause we're limiting some of that timeBut there are things out there and theremay be something you you really want tosee maybe even something you want tocontribute yourself We do want to hearabout it So let's talk because whilethis is what we have right now and whatwe're looking at and we have a schedulewe're still open to other things andwe're open to other people contributingideas to fit within that So if you havesomething I mean Matt yesterday at thebooth we had a million-dollar idea comein and we're at the booth all uh todayprobably done for the day but tomorrowwe're there from I think 3 to 4:30 orit's later in the day I know it's 12 to2 That's what it is It's in theafternoon early afternoon Come by Wewant to talk to you We want to hearabout your use cases If you have amillion dollar idea that you think to bethe next big thing we would love to hearabout it cuz we haven't we can thinkabout a lot of things but you guys workwith Helm on a daily basis in yourorganization in your personal timeYou've hit rough spots you've had ideaswe want to hear aboutit So how can you get involved asidefrom obviously coming down to the boothand talking to us here in the sessionfirst of all we're on Slack We're onKubernetes Slack So we have two primarychannels th�e Helm users which is whereall the Helm users end up you know youknow mingling and talking about how theyuse Helm but there's an entire channeljust for development called Helm Dev Ifyou're interested in contributing to theproject go there Number two if you wantto just get your hands dirty go aheadand help us out in the Helm organizationWe have several different projectsoutside of just Helmhelm We have pluginsthat we integrate We have CI toolsanything that you want to do and youthink is going to be helpful for theproject we want we want to get thatinformation from us But in addition tothat you don't have to be a coder If youwant to help with docs docs would be thegreatest way that we would love you Weget we would go ahead and give youStarbucks or whatever you'd like cuzfinding time for the for the maintainersto do docs is painful because we need toget features out there but it docs helpthe user experience And then finally ifyou want to collaborate with thecommunity please come to our uh weeklymeetings on Thursdays at 12:30 Easternfor us to be able to talk andcollaborate with thecommunity That's it A lot coming up inHelm 4 We're excited to be able to bringthat and work with you to make it happenSo thank you very much[Applause]If you have questions there is amicrophone over here Could you pleasecome over to the microphone and and thereason for that is uh as people do maketheir way over to it Is they like tohave this recorded because this goesonline and the questions being out thereuh that works really well And so ifpeople could line up over here we wouldappreciate it So you're up All rightThanks Can you hear me yep Yes Yes wecan Yeah you from Finland and been usingHelm quite a lot for the few years nowSo two questions or two feature requestsBe possible to add YAML scheas tolibrary charts That would be superbY schema to library charts Yes YAMLscheas to Okay Scheas to library chartsYes And the second one to make it easyto produce uh reproducible Helm chartsto when you package them so that thetime stamps are all the same and soforth Yes And so the reproducible Helmcharts uh the reason we didn't do it wasa backwards breaking change So it is oneof those things we can look at now Andschemas for library charts Yes Thank youHi Uh I want to know if post renderersare working on Helm hooks in Helm 4Not yetIs it planned would you like to comehelp us with that is it planned uh is itplanned to make it i want to say thereis somebody who's been bringing it upIt's somewhere in there I don't know thestate of it but it is one of thosethings There was either a poll requestor an issue for it It is being discussedI don't know the state of it though OkayThank you Yes Yes So once we've got Helm4 how quickly is Helm 3 going to beretired ah so the current plan I want tosay issix months Eight months I want to say itis two minor versions which minorversion comes out every four months andit's uh roughly a month delayed from theKubernetes minor so we can update eachtime we wanted to do I think two minorso I think it was six or eight months isthe way it ends up working out is thethe timetable so it won't be delayedright away we'll still be supporting itfor bugs and security and upgrading theKubernetes libraries for support um fora little while to give people anopportunity to switch over thanks ThankyouHi there Got two questions So questionnumber one is if you can elaborate alittle bit on CRDs like uh you knowwhat's what are the actual problems withCRDs and Helm and you know you know whythey are a problem and what can you dobetter in Helm 4 to make it less of aproblem Uh so we've got a whole uh helmimprovement proposal that'sinformational on this one and I want tosay it prints out to be pages long andit deals with things like with a helmchart you can have two people install ahelm chart in different namespaces whodon't know about each other right and soif it installs CRDs there's anopportunity for them to collide at aglobal resource and one installingsomething could break the other likesomebody did a roll back to an olderversion and somebody depended o�nsomething in the newer version of the COops they broke somebody's applicationIt could be the same version of the CRDor a totally different CRD version Andso there's opportunities to break thingsSo we've always been very conservativewith CRDs because we don't want to breakyour cluster And and that's one of theproblems We actually have a document umin the Helm improvement proposals thatoutlines all of those problems So if youwant to learn what the problems are tohelp us come up with solutions it's wellwritten out Thank you So the otherquestion was about uh you know there arenow a lot of tools coming up like thisprogrammatic um you know templatingtools like CDK where you actually writeyour object as a you know proper objectslike in in the program language paradigmlike you know dictionaries or you knowwhatever you want to call them so thatyou can have a proper typed API andstuff like that Do you see like Elm uhsupporting that kind of paradigm anytimesoon or is it just like out of scope orYeah Yeah So when we talked about thepluggable rendering systems this iswhere that would fall into because youcould have something like that and thenhave a system know what to do to thenturn it into the valid Kubernetesobjects that get sent over That'sexactly where that falls in And and it'sthe types of things like that that do itIn fact Ing's up here who's working he'sworking with us and he's talked aboutthese kinds of things to bring in YAMLscript to do some stuff He's Mr YAMLYeah Mr YAML So it's one of those thingsYeah So it's one of those things we aredefinitely thinking about it and thatgives into the pluggable renderingsystem All right Thank you Now uh hi Ihave two questions Uh the first one youjust mentioned that uh Helm Helm 4 willsupport cosign but what but what aboutthe integrated sign subcomand in theHelm command right and uh it's awell-known issue that uh Helm has helmis using a uh uh a legacy go cryptolibrary and it cannot support EDDDSA uhyes so in Helm 4 I want to fix that evenfor PGP Yeah me too We'll do that I Ialready have a plan for fixing that inPGP with um Helm 4 I would like to getrid of that old crypto library andreplace it We got a plan for it but itwould have broken APIs to do it So it isa Helm 4 thing It's on my mind I Ishould have put it up there but it is onmy mind Yep Okay Thanks Uh and thesecond question is that uh uh in thelegacy chart museum version of thecharts repository uh we have index.mylto uh list all the charts in therepository What about uh the OCI versionof it the challenge is more about rightnow there's no real support inside OCISo it's not as it's really hard just tobe able to list registry tags I mean youcan list tags but in terms of listingrelated images that's challenging It'sbeen one we've been thinking about butwe got to work with the community toactually provide a solution So is it onthe road map or when when when will wesupport it well that's going to besomething that we need to work with theOCI community Yeah this is so we in thepast have worked with the OCI communityon this when we worked with them onartifacts and uh we didn't come up witha good solution then we need to go backand work with them again because theyhad ideas for it but the collaborationstopped So we don't have something likethat There's extensions to registriesbut it's not standardized So that'swhere the challenge is going to be Itmight work on some registries but notall registries Yeah So that's what wespent time on this problem We haven'tcome to a good solution yet which is whywe don't have something for it ThanksYep You're welcomeSo Aad 27 thank you so much for thisgreat session H I want to thank you fora small nice feature in uh Helm 315 thehide secret flag in Helm template causedso many headaches you know rotatingsecrets in the CI where outputting thethe template basically I alsocontributed documentation because I gotmy hands dirty seeing that this issupported but the docs doesn't mentionit I wanted to ask a question about apending upgrade and pending installwhat's the the thing with the breakingchanges where uh if a a release isalready in pending AB install you needto uh visit it like do a roll back amanually roll back Why not just uhredeploy like it was before it was kindof a breaking change in 35 somethinglike that Do we have an issue on that isthere an issue erh not an issue just thefact that it was kind of a breakingchange where for example I give you adeveloper was deployed cancelling thepipeline in the middle the release is onpending up pending install and thenredeploy won't work I guess the questionis was it a known feature that waschanged and that's the new functionalityand it's causing challenges now itsounds like a bugI don't know what happened there I don'tremember at this point um please file anissue and and raise it to our attentionThis sounds vaguely familiar but I don'tremember what's going on there Uh I Iwish I could answer it on the spot but II don't issue Thank you so muchHopefully in a helpful Yeah Yeah Thankyou Thank you Hey thanks Very nice talkWe are working in with air gapenvironment and I was wondering ifthere's a plan to be able to overridethe chart registry So that's where thehelm the registry is comp that wementioned is meant to solve thatchallenge We have an issue And is AndrewBlocker on to talk that's him That's meHiYes Find him afterwards Plug in pleaseYeah Hey so I was wondering what's theuh intended scope of the plug-in systemso for example would you be able towrite custom filter plugins like GoTemplate filters for example we're stillworking that out Um it'll probably beminimal when Helm version 4 comes out uhlooking for pattern that we can do thatin minor releases we can expand on itBut that's some of the stuff we're stillworking out We want to do better pluginsWe've even been talking about webassembly for those plugins so you canwrite plugins in different languages andexecute them on different architecturesand OS to make that easier Uh we've gotideas there So we haven't worked it outyetHi Uh I wanted to ask if there's um anyplan to support uh the templating ofdependency ch dependent charts Nodependency charts like if I have adependency on some chart how I can so Ican uh template the values of it and youcan't multiple times So are you talkingabout templating the values or like Ihave like an example I the O of twoproxy from Vietnami the chart and I wantto um insert the ingress the host nameand they have to use it in multipleplaces and now the user would have to orright now the user you can so you canuse the TPL function and get and do thatuse TPL function you can go ahead andbasically it renders at runtime Yeahwith the TPL functions the problem thatevery chart has to implement that valueand absolutely yes So every chart thatuse it was have to implement that TPLfunction That's what you can do today Wehave not talked about doing anythingaround templating the values that getpassed to subcharts That's aninteresting idea Please raise the issueand we could look at the positivesnegatives feasibility of it I actuallyhadn't thought of that Okay Thank youThank you Thank youHi I'm um I'm excited about therendering system and I was asking um ifyou have any idea how long you want tohave the chart version 3 in experimentalmode before it becomes stable Uh wereally don't uh if we've learnedanything the timets that we think okaymay not happen They're ambitious Yeah wehave ambitious timets Uh so I I I reallydon't want to share but we'll do it whenit's ready when we've got a good enoughfeature set and we know we won't meeteverybody's needs Um and uh and peoplehave tested it enough we know thesystem's robust We are we'repurposefully making it ambitious to keepourselves you know accountable You knowit's been a long time If we didn't dothis it would continue to prolong itselfand we'd never get anything done And andone of the nice things about doing it asan experiment is then we know Helm 4isn't going to go on for an unknownamount of time while we wait on aparticular feature OkayAll right Thank you all If you have anyother questions we're here Yep We'rehere Grab us outside Thank you very much2025-04-15 21:58:39.237501 ��P�#��WA_xoDbpm-Qksyeah helloeveryone yeah today's topic is LCD3.6.0 and LCD operator 0.1 you know uhwe will walk you through what's new andsome key improvements you know RCD 2.6is really a big milestone because it hasalmost four years since3.5.0 so but so far we only released RC3release candidate 3 due to an upgradeissue we will dive into later but anywayis a major step forward we are going toshare the detailtoday yeah this is our today's speakerhi everyone I'm Orush Shaha from uhVMwar��V�#��cArdTPbm9f_fci get this question every single time wedid this Is Helmpopular So let's go ahead and getstarted This is Helm for you My name isAndy Blanc I'm a distinguished architectat Red Hat and a Helm maintainer Hi I'mMatt Fina I'm a distinguished engineerover at SUSA and Helm maintainer Andtoday we're going to talk about Helm 4And by judging by just the audience Ithink this is somewhat popular sessionespecially the end of the day on thesecond to last day So we're going totalk about Helm Helm 4 and how we gothere and where we're going So first ofall Helm has been here for a while Youknow we look at the keynotes today andyesterday We talked about Kubernetes andand and the CNCF 10 years old WellHelm's been around for quite some timebefore it which is just scary when youthink about how long Helm's been here SoHelm's been here for a while now Therehave been a lot of frameworks that havecome and gone been trying to replicateHelm You know I can think of five fouror five in the last couple years sayingthey're better than Helm We're going totake it on But Helm's really always beenthere as your trusted tool for how youmanageKubernetes However even though we'vebeen around for a while it doesn'treally mean that we've always beenperfect There's a lot of issues a lot ofpull requests So even with all of thatgoing on we know that there has to bechange Think about it It's been fiveyears over 5 years since Helm version 3came out I you know we had this exactsame talk back in Salt Lake City I thinkit was 5 years to the day when weactually had that talk So it's beenabout five and maybe five and a halfyears It's time for an update A lot haschanged in 5 years Who would havethought that last time Helm came out apandemic hadn't happened yet and here weare Since then new projects have creatednew frameworks new patterns new versionsof Go for goodness sakes And because ofHelm and how in a lot of the primitivesthat our project livesbehind we couldn't make any changesMatt's going to talk about some of thosein a minute So we really use thisopportunity to talk about and use thisas a way to say okay let's go ahead andmove on So what is the Helm for timelineso Helm for we started development backjust around Kubernetes um CubeCon backin Salt Lake City in November Well we'rehere now in you know in March in CubeConEUWe have a switch coming up in Augustwhere we're going to start switching thefeature work to stability work So wehave you know we've been working on homefor home for for a little bit We'regetting very close to making that s��e by Broadcom a software engineerwho is working on contributing to cityas well as making downstream releases ofcity and Kuberneteshi uh my name is Cyprian i'm aKubernetes maintainer Kaops uh noproblem detector cloud provider and manyother tools uh I'm happy to do a lot ofopen source in my spare timehi um hi uh I'm Ivan uh I'm VP of uhengineering at Inar Technologies uh Icontribute to CD um I'm mostly in chargeof CI/CD um tooling and stuff like thathello uh I'm Pami i'm the citymaintainer and also SD code tag leadsuan didn't come this time suan iscoming from Google he is revieweryeah yeah before we dive in we have someexciting updates for the community firstwe have a new maintainer on board Fuwayfrom Microsoft yeah welcomeFu yeah we also have a new SIG LCD chairIvan[Applause]yeah yeah we also formed a new uhrelease team the the team is led by Ivanand James ivan is also hereyesyeah yeah this is the agenda first wewill go through the new feature for3.6.0 yeah apart from the new feature wewill highlight uh an upgrade issue it'sthe exact reason why we didn't release326.0 before couponevents and after that we will dive intothe performance improvement in 306.0zero and also the major enhancement forthe performance testing tool and also IBwe are walk through the releaseprocess and we and separate we areintroduced new working group LCDoperator and also the detail for the LCDoperator 0.1 and AR will talk about thefuture work and road map for LCDoperator finally if time permit we willhave Q&Asession Yeah the first of the feature ismigration to W3 store you knowpreviously we are using the W2 store w2storeis is a Lexi uh data format with thesuffix SNAP you can see the example ithad already deprecated since release 3.4but user can still enable it using theflag enable way too up to uh 3.5 it isstill the source of truth for me data in3.5 w3 store is the bot database file isthe source of truth for the membershipdata in 3.6 but the flag enable 2 hadalready been removed in 3.6 so there'sno way to enable way to store in3.6 the efforts of migration is stillongoing currently we are still bootstrapLD on W2 store and replay the war recordon the latest W2 snapshot in 37 we aregoing to bootstrap LD on W3 store andreplay the world record on the startingfrom the consistentindex yeah the second feature is D gridL3.6.0 Zero is the first version tofully support downgrades at a high levelthe downgrade process involved two stagethe first stage is to migrate the dataschema to the target version for exampleif you downgrade from 3.6 to 3.5 thenthe target downgrade version is 3.5 thesecond stage is rolling down grid eachmember just place the binary or imagefrom user perspective there are threesteps the fourth step is validated thedowngrades currently we only support onemanual at a time for example you cannotdowngrade from 3.6 to 3.4 it's notallowed you can only downgrade from 3.6to 3.5 you can use the LCD cer downgradevalidate command or you can use call theclient SDKAPI once the validation is successfulyou can go to the second the step two toenable the downgradeyou can run the LCD control down gridenable or call the clientSDK you know we need to you know we oncethe downgrade is enabled LCD willautomatically migrate the schema so weneed to wait for the storage versionbeing changed to the target version inthis example the down target version is3.5 and the story version is still isalso 3.5 so it's save to do the nextstep the step three is just to place thebinary or image one by one for all themember yeah this is the fishialgates so I didn't come this time justplay video hello everyone i'm excited tointroduce the new feature gates in SD3.6release before the 3.6 six uh any newfeature is added to SD through a featuregate experimental uhflag and then when it's ready tograduate we have to remove theexperimental prefix in the flag sothat's a breakingchange and also it's pretty hard totrack what stage the feature is in so itis notuncommon for some features to getabandoned uh over the years so due tothese reasons we are switching to usethe kubernetes style feature�gatesuh so for a feature gate it can gothrough alpha beta and GA uh stages andthe users can enable or disable afeature through the feature gates uhflag and in 3.6 Six we have migrated allthe existing experimental feature flagsinto the new featuregates with this uh new feature gates inplace uh sedd is also adopting thekubernetes cap process for any newfeaturedevelopment so with these changesuh we can get the benefits of no more offlag breaking changes for graduating afeature from the experimentalstage and generally the feature gate isless uh error and there will be lessrisk of new feature breaking SD sohopefully uh developers can be feel morecomfortable to introduce new featuresintoSD in additionuh the feature stage will be moretractable and then there will be clearcriteria for future graduation so withthese changes we hope that uh they wecan bring more faster uh developmentcycle into SD for future releaseshello everyone uh I'm excitedyeahthat's so inity 3.6 we're alsointroducing two more house checkendpoints that's live Z and ready Zuh so the live Z endpoint reflectswhether the process is still alive orwhether it needs a restart the ready Zendpoint reflects whether the process isready to servetraffic before this change in 3.4 and3.5 there's only a single healthendpoint so it doesn't reflect thedifference between whether you want torestart or you need to stop uh servingthe trafficand here I've listed an example of howto call the endpoint and what it returnsum with a verbose uh parameter you cansee uh what checks are being performedand whether the health uh the server ishealthy or notso now the SED house check is fullycompliant with Kubernetes uh APIexpectations and hopefully this canbring some nice change to your healthprobeso inokay the next feature is we discoveryyou know we can bootstrap LCD clusterwith the help of the discovery serviceyou know when we bootstrap a new LCcluster each member know nothing aboutit peer it when each member start up itjust regist itself with the discoveryservice and then watch the discoveryservice until all the peer have finishedthe registration and finally thediscovery service return the full peerlist so is the discovery service isdiscovery protocol is only used in thevery first cluster bootstrap the W3discovery is using the LCD client SDK 3to talk to the discovery service thelack discovery is based on the LCDclient V2 is already deprecated thepublic discovery servicediscovery.io is not maintained anymoreyeah I would like highlight an upgradeissue it exactly reason why we delayedthe release of3.6.0 the symptom is when upgrade from3.5 to 3.6 may fail the reason is toomany lender in the cluster you know whenwe upgrade from 3.6 uh 3.5 to 3.6 somemember may reward to lender the rootcause is there's a bug in the 35 whenwe're promoting a lender the promotingis only possessed in the W2 store notthe W3 store as we mentioned earlier W2store is the social truth in 3.5 but W3store is the social truth in 3.6 so isthe reason why we only see this issuewhen upgrade from 3.5 to 36 the issuewas introduced in 3.5.1 so all the patchbetween 3.5.1 and 3.5.19 areaffected we fixed the issue in 35.20 sothe most important thing is user mustupgrade to 3.5.20 or high patch beforeupgrading to 3.6 yeah we have a blog tosummarize the details please check itout yeah Iuh All right so I'm going to dig alittle bit deeper into how we measureperformance in net cityd uh and also I'mgoing to compare a little bit uh theresults of the performance of 3.5 uh 21or 20 uh versus 3.6 release candidate 3uh we initially we had all of thistooling and uh the plotting was greed inPython we decided that because our maincode base is created in Go we decided torewrite it and now all of these chartsare generated uh in Go uh this is a readwrite heat map uh for a different readwrite ratios uh so we have differentread write ratios and we are measuringthe different valuesizes versus different number ofconnections so basically this is thecomparison of 36 and 35 uh if you seeblue here it will mean that 35 is hasbetter performance but as we can se�eit's uh so the performance isconsistently better in 3.6 um there'snot going to be a lot of time to discussthese charts so I if if you want todiscuss a little bit more about theperformance I invite you to join the uhto come to our booth tomorrow uh thisthe previous charts where the readperformance this is the rightperformance uh and again it's verysimilar story uh 3.6 is is has betterperformance uh there's just uh the firstuh ratio that it's a little bit uh 35performs a little bit better uh so we'realso exploring new ways on how torepresent this data uh because sometimesuh the heat map is not veryrepresentative so we are experimentingwith these power line charts and it'sbasically the same story uh blue is uh3.6 red is uh 3.5 um and it's basicallyjust the the same data just visualizinga different representation and I thinkit's a little bit clear uh also it'sclear uh to see how 3.6 is behaving uheven better in the same chart we are nowhaving read and write um and so this isalso like this is more uh read writeratios um we're also trying to see howwe can measure the resources that HCD isusing so we added some measurement tothe RAM and to the CPU and uh so here wecan see again same colors blue is HCD3.6 six and we can see that there isless memory consumption uh you see thatthere's like a lot of uh like uh linesthat like uh vertical lines and it'sbecause we are running uh severaliterations of the same tests but theresult is very constant uh consistent uhthe RAM usage is better there is aslightly almostuh it's not there's not a lot of CPUimprovement uh it's very similar it'smarginal if there's some CPU improvementuh and so some something that you canalso see in these charts is that uh it'sand it's consistent with the read writeuh heat maps is that 3.6 ends before uhthe process from 3.5 so it 3.6 has waybetter performance um so why is 3.6 uhwhy does it have a way betterperformance there's a default in 3.5 thesnapshot count it's in 3.5 it defaultsto 100,000 um and in 3.6 we just changeit to 10,000 and that has a significantimprovement in RAM and also Marik inpull request18825 he did uh reduce the number of rapsnapshots in memory making the compacthistory more frequent uh we're stillworking in several ideas to improvememory usage andresourcesum and now I'm going to jump topics intothe release process very quickly uh sowe have we initially we had a verymanual way of doing the releases uh Ithink the first the first 3.5 releaseduh took like more than a day because uhthe new the new maintainers were stilltrying to understand how the releasesworked uh so basically James and I tookover the process and what we are tryingto make it uh more to bring moreautomation in James words the release isbecoming more and more boring and forautomation that's great um so we're alsowhat we're also want to do is improvethe security because we know that a lotof so if we have a a CVE uh because of alibrary that we're using I know that youare all affected so we are trying to seehow we can uh add some out like havesome periodic jobs to scan for thosevulnerabilities so we can do a releasefaster and uh so you don't have issueswith uh image scans like 3y and uh yeahthat'sme okayu I will talk about the operator now andthe CD operator work group uh we startedthat about a year ago uh with the goalof providing an operator that couldmanage at CD uh for the community and tostandardize on something that would beadoptedby much more than a single company thereare lots of CD operators in the wildsome better some worse some moremaintained than othersso um we had a scope uh we started tocollect requirements use cases for thatuh prioritize uh the work uh in terms ofwhat would be needed to have such anoperator and reviewed existing uhimplementations from other companies wehad numerous uh meetings about thatuh we decided to start from scratch andaccept uh contributions from uhinterested partiesuh so that we build something that ismodern and compliant with uh Kubernetesrequirements so right now we have a newrepository bootstrapped uh everything uhthat uh a project needs up�datesuh andum we also started uh working on a Pum and uh a road mapthis uh operator was uh meant to be onlyfor running inside Kubernetes obviouslybecause it's an operator so it's not forbare metal clusters so if you want tostart a cluster from scratch this is notyouum and we have a bi-weekly meeting incase anyone wants to join the effort orprovide feedbackcool uh for this CubeCon we have amilestone that's CD operator0.1.0 that's the first uhrelease um it is a first step so don'tget too excited you cannot use it inproduction yet um we're just uhbootstrapping the repo automated updatesbasic testsuh release scripts and we're able tocreate a new cluster customize it withwhatever options you want uh and uh usevarious CSI drivers for storage um we'regrateful to all our contributors and uhwith special mentions to Ivanum Abdul Raman uh Ara anduh G Duh say good job to them uh they reallyhelped us get us to this pointanduh Ara please uh present the future ofuh the operatoruh thank you Sepron so let's look at howthe future work looks like for uh thecity operator so in version 0.2 we'll beenabling TLS communication throughcertificate management and uh startingwith the providers like search managerand auto auto providers so what autoprovider basically means that if youdon't uh by default provide any uhcertificate provider it willautomatically uh provide a certificatefor you so the TLS configuration lookssomething like this uh where there willbe a provider say we are saying managerand it will have a provider configfollowed uh which will have therespective uh provider configuration forthis instance is the search managerconfig and that will have its ownnecessary uh fields like common nameissuer kind which is relevant to that uhprovider and we will be adding moreproviders going ahead uh next is theupgrades uh upgrade is still supportedin 0.1 but in 0.2 we will be adding moretests and E2 uh workflows to fortify itand that includes patch upgrades as wellas minor upgradesuh for 0.3 and beyond we will beimplementing uh disaster recovery foreach of the cluster members backup forperiodic and u on demand uh backup ofthe clusters as well as uh creation andnew uh uh restoration of new clusters uhfrom availablebackups and that brings us to the end ofour uh session uh thank you and we areopen to questions[Applause]hi um yeah thanks for the updates uh wasvery insightful uh looking for this 36release actually um so I'm I'm kind ofhavea well related question like not wedidn't mention it like but I'm I'm I'mthinking I'm considering about likehaving you know like multiple ETCDclusters and syncing them all togetherfor some use case and I know that uhthere is a tool to sync the clusterscalled like mirror maker I guess rightit's like a synchronous way of likesyncing one cluster to another right uhbut IWonder if this stuff kind of supports abirectional thing and if it is how canit kind of makes the conflict resolutionin that case or it's like just not aproblem that ETCD ever will solve umjust trying again trying to understandum how to say um the possibilities rightof etc when it comes toseveral clusters that each has its ownquorumsBasicallydirection in bothdirections still don't get questionsorry could you could you could yourepeat sorrysorry what could you repeat sorry didn'tget your question sorry yeah yeah um soit's it's like um so hypothetical sinceI I I don't know i just like startedlooking into it um that I know that umif you have multiple different clusterCTCD clusters each with its own quorumlike each has let's say three nodes raftright um and you need to sync the datafrom one cluster to another right sothere is a utility that you providecalled the mirror maker oh right um butand I know that it's kind of asynchronously replicates from one toanother uh but um I'm I'm kind of likewonder maybe you can also like from yourside explain like the use case for itand and does it support maybe this likebirectional things and if it does howdoes it work with this um kind ofconflicts problem because each clustercan be consistent right but when itcomes to trying to make consistentclusters across geo distributional stuffit's kind of like hard problem rightokay got Um yeah yeah yeah you know thethe mirror command I don't think thecommunity has verified the in productionenvironment it is just a command foruser to use I don't think is verified byin production if you want to migratefrom one cluster to another cluster Ithink the backup and restore is thesimple is most simple is most robust wayto do that if you want to do migratedata online I think The probably thebest way I can think of is for exampleif you want to migrate from one data uhdata center to another data center youcould you can add a member in anotherdata center join to the existingcluster and then remove one member fromthe old data center and then add a newthe second one in the new data centerjust rolling migrate member one by onethis is probablythe you can do this online you want tooffline just back up restoreyeah okay yeah I see yeah thanks a lotthank you uh thank you i just firstthing I wanted to say was thank you forforming the working group and pickingthis project up and putting your timeinto it i think it was silentlyterrifying for a lot of people that CDwas kind of getting into a state that itwas in so really really appreciate itthank you um kind of curious to hear ifyou have any thoughts that you don'thave to well I'm not going to hold youto anything that you say but kind oflike beyond the initial road map likewhat are you thinking like if you thinklike a couple of years out are there anythings that you kind of want to do thatit doesn't do or things that it could beused for that it's not used for today orlike spaces that you would like to seeit grow intoagain yeah yeah please yeah sorry ohit's okay it's just an an open-endedquestion about kind of longer term plansor ideas that you have even if youhaven't fully fleshed them out or likeareas where you would like to see morework go into where maybe people can helphelp you or or stuff like thatoh yeah uh for the for the future workyou knowuh you know just mentioned you knowcurrently we are still bootstrap LCDfrom V2 store and replay the worldrecord based on the latest V2 snapshotsyou know the V2 store is alreadydeprecated in F7 we are going tobootstrap LD from W3 store and replaythe world record starting from theconsistent index this is a major changeso we need probably need carbonmaintainer to collaborate for this thisis the main change for the F7 server thesecond change is to second thingpriority i think we need to spend moreeffort onperformance in the in the previous wereceived several issue of complaintabout memory usage on LCD probably weare spend more effort to improve memoryusage in the 37 yeah these are two thingI can think of yeahhey guys uh thanks for the insightfulsession uh my question is related to HCDuh database size where like uh with thecurrent limits of 8GB but we do have arequirement we're hitting thosethresholds because of hundreds ofcontrollers which we're runninguh due to performance issues you guyslimit that size to 8GB right so in theadvanced editions like versions is thereany consideration to increase that sizesorry so can can we repeat again sorrythe HCD uh database size from 8GB hardlimit to increase up to nextsize sorryoh yeahuh yeah currently the the default is 2GB the default quarter is 2 GB and theuser can but is configurable us canchange the quarter can do eight or 16 GBwhatever but the problem is when uh whena new member join the cluster we need tosend a snapshot to the new member sowhen you send uh send a huge snapshotthis is one thing but probably thenetwork network bandwidth is not bigproblem because they have highthroughput in the current the secondthing is you know we add the memory theDB file into memory directly so thebigger the DB size the more memoryusage and we also have some internalcache which means when you have moredata then you have more catch so thememory is is big pressure for the LCDthat's the reason the LCD is designedfor metadata is not for large scaledataYeah yeah yeah thank you2025-04-15 21:58:39.911599�nd efficientconfiguration sharing and allocation ofaccelerators and other specializeddevices So it it formed around GPUs butit's certainly not limited to that Westarted the working group uh afterCubeCon Europe almost almost a year agobecause at that time interest reallypicked up and we figured that the thesmaller team that had been working on itdefinitely needed to reach out to lotsof different SIGs across Kubernetes Thework that we've been doing affectsoverall Kubernetes architectureautoscaling networking node uhinteraction on the on with devices andand to a large extent also schedulingdecisions So these six then decided tosponsor the working group We set upregular meetings and yeah that's whatwe've been doing ever since We're tryingto figure out how Kubernet is goingforward can get out of or well get gobeyond its original roots as as aservice for mic as a management systemfor microservices towards what we all dotoday like running AI inference perhapsAI training and also figure out how tohandle all the other hardware thattraditionally has been perhaps not sowell supported inKubernetes So as I said this is mostlyabout dynamic resource allocation at themoment If I had to explain it uh tosomeone who hasn't seen anything aboutit yet it's probably best to mention thefour key parts The first is it's a newAPI U the part that is relevant for thedrivers is that they are publishing amore complete description of thehardware that is available on a node ina so-called resource slice It's a fixedlist of devices each device with certainattributes And this could be the vendora product ID amount of GPU RAM that isavailable if you have this uh devicemapped into your container or how manycompute nodes it has And then with thatinformation users can define theirrequirements in a much more flexibleformat With another new API uh type theresourceclaim in a resource claim a standaloneobject you can specify which attributesyour desired device must have Ittypically references or it mustreference a device class So you kind ofknow that you're dealing with a certainvendor at this pointAnd you can use attributes defined forby this vendor to really get exactlywhat you need You can also have bit moreflexibility You can say I need at leasta certain amount of memory but I'm okayif I get more Um we have cellexpressions in that resource claim toexpressthat So a lot of flexibility and alsobecause it's a separate object we havecontrol sharing You set up your workloadexactly the way you want it Differentports can share the same resource claimand they all get the same hardwareinstance at runtime and you can alsoselect inside the pot which containersshare the samehardware To make that possible weupdated then the scheduleuler Itbasically looks at these requests andmatches against the available hardwareand sends it all off to cublet One ofthe key changes that we did to get towhere we are is that we moved all thelogic into thescheduleuler and that's a key change tothe original design uh and that thatbasically unblocked us from going tobeta as you will see soon It's calledstructured parameters because it's astructured format of a devicedescription that is expressive enough todo logical conclusions and simulationson top of it And then at the end of thepipe once we have scheduled a port thecublet is working with a driver thatneeds to be run on needs to run on thenode uh like a like a like a deviceplugin except that it's now a differentAPI uh that then sets up the hardwaremakes it available incontainers and this is worth repeatingwe had the same message up in uh northcube North America we reached betain 132 It was breaking news uh sixmonths ago or few half a year ago It'sstill important We are now a bit closerto G but this is still the the messagethat we are sending out to the audienceDRA is here to stay We are movingforward and we are extendingit We have not certainly not uh restedmuch since uh the time that we moved tobeta did a lot of things and we arebranching out So a lot of the newfeatures that have been worked on arenow being worked on by othercontributors o�ther people who gotinvolved as part of forming this workinggroup Um for133 we uh moved one feature forward tobeta together with with core featurethat is the uh driver own resource claimstatus that is something that isrelevant for network devices becausethey need to publish IP addressinformation Then an important featurethat we missed earlier is partitionabledevices So I mentioned that the resourceslice lists a fixed set of devicesum but they are all independent Withpartitionable devices we can haveoverlapping hardware in a sense that wedefine some partitions but if you pickone partition something else mightbecome unusable because it needs thesame underlying resources in in the oneGPU that you have and that ispartitionable devicesSo that that got added as alpha devicetaints and tolerations It's a managementfeature If uh you are familiar with nodetains that's exactly the same conceptexcept that you don't need to taint theentire node Uh you can just say thisthis particular device is unhealthy orI'm taking it down for maintenance Andthe effect then is that it's not gettingused for new scheduled parts and partsthat are running running can beevicted Then I mentioned resource claimsThe the next list the thing isprioritized alternatives in devicerequests That is a way to express yourintentthat from say a list of requests any ofthem is okay for your part You could sayI I want something from vendor A If Ican't get that I'm fine with a devicefrom vendor B or I'm taking N or this orM of that And then the scheduleulerbasically will look at what is availableand try to satisfy one of theserequests So it's a it's a bit moreflexibility basically that we'reoffering It's all entirely implementedintheuler Um finally admin access that wasa thing that we had earlier It's a modewhere some privileged user can say Iwant this device or this set of devicesall devices on a node in my containerand I want them even if they arecurrently in use by a normal user Soyeah they are in production use but youwant to monitor something and an admincan say okay give me all of them and Ido something that is not intrusivethat's a privileged operation becauseyou could also break the the workload Sowe added something in 133 that is astandardized label that needs to be setin a namespace otherwise you can'tcreate resource claims using thatprivileged mode Previously it had to beadded to a cluster with validationadmission policy Now it's built intoKubernetes and secure out of the box Itneeds to be enabledAnd last but not least we worked on thecore cub core DRA to move it closer toGA Uh we do we are doing another v1 beta2 to simplify the API Some structuralchanges that make it a bit simplereasier to promote to GA Um but V1 beta 1is also still available So both bothcontinue to coexist Both areinteroperableThen um small change perhaps butrelevant if it gets up picked up indownstream Kubernetes distributionsThere are now predefined arbug rules andsomething that goes into the reliabilityaspect of DRA If you have a demon setthat run that deploys your driver youcan now enable rolling update where twocontainers of your driver run inparallel That's a first for for Cupletplugins It wasn't supported before youhad to take down your container bring upthe next one and then it takes over Butthere's always a certain amount ofdowntime in between With this seamlessupgrade mode the cublet is intelligentenough to see to register two instancesfor the same driver and then talk to anyof them And it it's it's seamless as theas the nameimplies So here's a here's a very longlist As I said we are now moving thecore thing forward but we are branchingout We are adding features on top of DRAto enable additional use cases and thoseare all under a separate feature gateThat means we don't need to block thecore DRA from going to G We are justdoing these other things on on the onthe side in a sense Um so I mentionedalready GA4 core DRA we hope to do thatin 134um together with claim status becausethat does really doesn't need much morework and then the other features that wejust added as alpha in 133 �hopefully canpromote can get to to beta in the nextcycle We need to look at the betacriteria but there's there's I know noreason why we shouldn't So all all ofthose are smaller features that we canmove forward with just one release uh uhdifference between alpha beta and and GAOne one thing important to note there isthat these new features even thoughthey're in alpha and they'll move tobeta once DRRA itself goes GA these betafeatures will be turned on by defaultwhich hasn't been true for for uh DRA ingeneral because there's a um there's anAPI group that's been associated with itand those aren't allowed to be on bydefault when you're beta but once DRAfeature itself goes GA all these subfeatures uh can be turned on by defaultonce they're in beta you have to waitfor it to be in G to useit So let me call out a few things Noteverything Uh some some of these thingsdidn't quite make it Uh thepartitionable devices kept for examplehad an aspect that was called mixinsthat reuses uh a set of attributes Wehad to take that out to simplify theimplementation It will now become aseparate cap That's relevant if you'retrying to publish lots of devices in aresource At some point the object mightbecome too large So we have limits onhow many attributes you can have perdevice and mixins helps a little bit tomake it more compact and and moreefficient more efficient So that thatgot taken out You might have also seensomething that was called admincontrolled device attributes Um that wasmy other half of the work that I wasdoing for 133 and we took it out becauseit wasn't that important and the API wasa bit controversial It was the basis fordevice tains Um but we then separatedthe two and this is now pending in asense Um it would have allowed admins toadd attributes to a device withoutchanging the DR driver but it's on holdWe basically need your feedback if youfind this useful to figure out how toprioritizeUm partitionable devices is one way tosplit up hardware but there's alsopeople who want to really have just onedevice that can hand out chunks ofsomething Um it's mostly networkhardware guaranteed bandwidth forexample that can be assigned to certainports that then get a fraction of aguaranteed bandwidth That is theconsumable capacity cap and that is alsothat has been in discussion for the lastcycle It looks good to move that into133 14 uh 134 as a as an alpha and andthere are others So this is really whatwe are discussing in the working groupYou're welcome to check out all theselinks when you get the slides from theschedule and find more information aboutthese individual thingsThen we are also organizing with fellowtravelers work that is not directly tiedto a cap or directly tied to KubernetesThere are some outofree efforts thatneed to happen to make the array usefulUm mostly in other scheduulers perhapsum autoscalers carpenter it's not alllisted here but there are some thingsthat we are just really discussingUm also higher level app controllers uhwe currently generate a resource claimfor one port at a time uh with a from aresource claim template But then if youhave a set of pots that might have toshare a single resource claim because itsets up something for the entire set ofports then we need control or support inin these job controllers So we'rediscussingthat And here's also a list of driversuh that currently supportDRA Um at some point we might have tothink where we want to list thosebecause I I already know that there aresome that are not on our list yet Uh sothe list keeps growing If you arestarting with DRA the example driverthat we don't even have here on thelist is a good starting point So wemaintain one example driveruh that is more or less realistic Youcan basically just copy and paste theentire code fork it and then implementyour vendor logic in it It's also a gooddemonstration tool So one of the changesthat I haven't even mentioned we nowhave readily available container imagesfor it We're going to update theKubernetes uh documentation so that youhave a JAMAL file that you can justdeploy against the kind cluster and youbring up the� DRA example driver forexample and can play with it as a userThat's one of thethings and another thing um yeah gettinginvolved I mentioned it we have activediscussions in the working group onSlack uh you're welcome to join us worthcalling out is that we now have also asession on Wednesdays that is more uhAsia friendly it's a or early morningEuropean time we didn't have thatinitially but people were just showingup at for at at 1:00 a.m air time andthat's just not acceptable I don't wantanyone to do that So we are now runningtwo sessions alternatingly in indifferent time zones or timeslots and yeah we haven't done thatbefore We should have a shout out to ournew contributors I've mentioned thatthese some of these features wereimplemented by other people Let's givethem a clap of hands perhaps whetherthey are in the room Morton is here UmRita did some work on admin mode takingover basically from me with the Rbackpermissions Shingo he he he made verygood proposals for the device chains andI'm happy to get those suggestions Itwasn't even a code contribution in thatsense Uh it really was just an idea andI picked that up and we both togetherbasically wrote for CAP or defined someof it John Hoon he's not hereunfortunately He did lot lot of partspart of the imputation of device ts He'snow our maintain Well what happens ifyou are show up and do good work we giveyou more work So he's now also ourmaintainer for the DR example driverthat I mentioned And previously whatwhich we hadn't in which we should havehad in the last update meeting in inNorth America Lionel uh and Antonio theystepped up and covered the network sideof of DRA So thanks to everyone And withthat up to you OhOh yeah we had CC she well we had we hadto shuffle around work a bit I sorry I Iforgot I we had other people jump in andand really help us out and it just s mymind You know this is again somethingfor future reference Lot of talks atthis cube con some some some talks frompeople that I just mentioned theyalready happened it's too late for youto watch them live but you perhaps youwant to go back and and watch therecordings There's one more session tolater today that you can still catch inperson Um that is um from CERNDiana and that was really the last slidethat I had Now up to you Yeah So I'mjust going to give a quick update on umhow DRRA is being used uh by Nvidia Umif you've been to these sessions beforeyou've been following along what we'vebeen doing um with the model that we hadin Kubernetes 130 I wrote this uhdocument called NVIDIA GPU use casesthat that outlined 12 use cases for howwe wanted to use DRA um and what weneeded to do to improve DRA in order tocover these Uh and with what we added in130 we were covering half of those umsix out of 12 of the use cases Um by 131we were supporting nine of those 12 Uhand I'm happy to say that by 133 whichis coming out in a few weeks we'resupporting all of the use cases that Ihad outlined in this document except forone which we'll probably never coverbecause it's very application specificuh and is is is really hard to ever uhachieve And so um we've made a lot ofprogress over the last few releases andI'm I'm really happy with uh where we'reat now with all of this Um one new usecase though that kind of emerged fromall of this um was the ability tosupport um kind of cross- node resourcesuh in the case of NVIDIA GPUs uh thismanifests as what we called a multi-nodein VLink And so if those of you that arefamiliar with the GB200 NVL72 systems uhthis is how they're organized There's um18 compute trays that exist in in a rackuh with nine NV switches sitting in themiddle And in order to support these usecases uh in our NVIDIA GPU driver we'veadded this new um uh API server objectcalled a compute domain which behind thescenes if you instantiate one of thesewill um create a bunch of the uh youknow abstractions that are that areneeded by DRA in order to allocate andmake use of these multi-node and Vlinksin a multi-node environment And so um uhif you see the link that I have here atthe bottom uh it's not properdocumentation yet We're� still working ongetting uh a lot of the details put outfor people to to learn about but um thisis a good starting point if you haveaccess to one of these systems and youwant to try this out This outlines thetest procedure for what you need to doto install the GPU operator what youneed to install our DRRA driver and thenhow you can run some example workloadsthat get launched on multiple nodes andmake use of these uh high-end VBbandwidth uhconnections Um I'll be giving a a demoof this at the Nvidia booth at 4:00 Sothis talk will end in 10 minutes andthen 30 minutes later I'll be at thebooth In fact I'll probably walk over tothe booth directly after this if peoplewant to just follow me and get a demo ofthis Um the demo I'm going to show is ona a mini GB200 cluster So as I mentionedthere's 18 compute nodes that exist inthese clusters For for my for my demoI'm only going to be making use of fourof them Um but on those four nodes eachof those has four GPUs each which addsup to 16 GPUs total And they have fullinv connectivity between all of thoseGPUs And the the workload that I'll runwill be four worker MPI jobs thatmeasure the throughput of reading andwriting memory uh between every GPU inthat entire mesh And the output that youget from running this demo is what yousee at the bottom here Um where you cansee that you know all GPUs across allthese nodes are communicating atapproximately the same the same speedwhich is unlike what you were able to doup until now on a single node you couldhave these high bandwidth connectionsbut if you ever went across nodes youhad to go over Infiniband or Ethernetand it would be much slower than uh whatyou get going over inVlink Um and with that uh we'll take anyuh questions This is the QR code to sendany feedback on the session And um ohthe NVIDIA booth is in the S side of theuh expo hall And if you walk in and youturn right it's all the way at the endnear the um uh where you can get yourpick your t-shirts up So if anyone wantsto follow me after there I'll be givingthis demo as I mentioned So yep[Applause]thanks Thank you for the updates Myquestion will be from field perspectiveMost of my customers hasn't receivedforget about black well they haven'treceived the even the hoppers so most ofthem are on P100 V 100 and lucky onesare still on A100 on Nvidia side onIntel side same problem with the SROunique cards so for this project are weonly looking forward driver support withthe new hardware or backwardcompatibility with the certain hardwarewhich where we are struggling especiallywith the Nvidia CUDA versions Thank youSureFor in the NVIDIA use case we're we wewill support all GPUs that we supporteven with the standard device plugin Sowe're not being opinionated about onlysupporting new hardware with thisObviously for something like the computedomains and the multiode and VLinks youhave to have those physical connectionsbetween the GPUs but no you can use DRAto allocate T4s L4s A100's K80s whateveryou have available It's it's it'sindiscriminate because at the under thehood we're just leveraging uh Nvidia'sstandard um NVML library so theirmanagement library to enumerate whatcapabilities exist in the GPUs and thenadvertise them in that resource sliceobject that Patrick talked about And soonce that's advertised the mechanics ofhow those actually get allocated andrequested from the user is standard likeyou would do for any type of device It'snot specific to to Nvidia at that pointAnd then I mean you can sp speak to theIntel side Yeah we we also have our DRAdrivers for existing hardware and weplan to do more in that space certainlygoing forward Uh it's currently at thealpha level So we we but we we certainlyplan to promote that together with DRand Kubernetes in future products to toproduct what what we recommend peopleshould be runningUh hello uh I have a quick questionabout u now we are going to use drstable what about the device plugin inthe future with the kubernetes that'sactually a good feature that we have inin the pipeline so the device pluginsdoesn't go away that was one of ourpromise to the kubernetes community andthe downstream ecosystem that we are notnot removing device plug-insupport we are offering something newwhich hopefully you will find compellingand more more attractiveBut we understand that the extendedresource API that people are currentlyusing is important It is sufficient forsome use cases And we have one idea onepending cap where we want to map theextended resource request in a containerto something that is provided by a DRAdriver So for an admin that means theycan promote or they can graduallyconvert the deployment in their clusterfrom device plugins to a DRA driver andapplications don't need to be rewrittenThey will continue to work as they didbefore So that is something that iscoming in 133 to simplifyrollouts Yeah I think the important bitthere is that with that in place youonly have to deploy the DRA driver butit'll be able to service requests ofboth types instead of having to figureout how to get the old device plugin andthe new DRA driver to work together Okaygot it Thank you Thank youHi Um I was just wondering if uh yourworking group was uh was talking aboutsupport for software basedfractionalization of GPUs Is that goingto be supported at some point um it issupported to some degree already Um andwhat we we so there's this umabstraction at least for Nvidia GPUscalled MPS I don't know if you'refamiliar with that Yeah Um it's a it'ssomething that you can layer on top ofthe allocation for GPUs Um and it runsas a separate service and without goinginto the details of it it lets youbasically say for any given uh workloadthat you start you would point them allat the same underlying physical GPU butyou can decide in a cooperative way howmuch memory each of those clientsactually has access to And it'scompletely fungeible like you can pickup to the megabyte I think how muchmemory you want for each of them acrossthe thing And so we could talk about thedetails of it later And we do have somedocumentation on this and there there'sexamples in the NVIDIA uh DRA driverrepo for for how to do this CoolHello Uh so I was just wondering ifthere is anything that changes in termsof monitoring the devices in yourcluster So before we were using NvidiaDCGM now anything changes with a new DRAdriver or it will work as it was workingbeforeUh DCGM hasn't been updated to be awareof how these get allocated with DRA Oneone thing we did do was so so DCGM and alot of other monitoring tools rely onthis API in Kubernetes called the podresources API and for the device pluginthat was extended to expose the devicesthat are allocated by the device pluginand then these monitoring tools can canquery that to figure out what deviceswere allocated by any given pod and thenemit metrics about them Um we alsoupdated the pod resources API to exposethe DRRA devices that get allocated butDCGM specifically that exporter hasn'tbeen updated to consume these Butthere's there's nothing stopping it frombeing done The the thethe initial work has been done to updatethe API inKubernetes We also haven't updated cubecuddle much yet at all So you get ofcourse the normal support for listingresource slices but cube cuddle describeresource slice that certainly issomething that we could extend to makeit more useful We haven't reallydiscussed much around it One one thingsthat come comes came up just now is thatdevice taints if they are set by anadmin are in a separate object So thereis a resource a device tainted rule thatbasically gets applied on the fly by ascheduleuler to matching devices But ifyou do a cube cutle describeresourceless you don't see it So ifsomeone wants to do some interesting funwork that's not too critical too hardeasy to get started with We could makecube cuddle plugin a cube cuddlesubcomand for device tains We couldenhance cube cuddle describe to be moreinformative And all of these things areactually fairly easy things for for newnew contributors because they are notmission critical in the sense that youbreak the cluster of everyone usingKubernetes uh if you make a changeThank youOkay thanks everyone2025-04-15 21:58:40.563781 �x�#��'Ax8wEo6ZDT1glet's getstarted heyLondon super excited to be here so uh alittle bit of myself i'm No and I weartwo hearts my open source heart i'm amaternal gateway project and also Icontribute to way and also have anotherheart this one my company okay i'll justwear thisone so although this might not be theflash topic like genericAIM is a bored no security stuff whereor old but I think we can all agree thatsecurity gateway traffic is super superimportant for any of your application soit's definitely something worth talkingabouttoday we are going to dive into someadvanced security policies for gateway iwill also give you a quick demo on howyou can use it for OID authenticationandauthorization okay let's jumpin first I saw some friends from ourcommunity so I'd like to do a quick pollhow many of you have used or played withGway on your laptop or maybetesting can you show me your hands ohthank you a bunch ofthem are there anyone already put itinto inproductionanyone okay Isay onetwo one twothree oh awesome awesome love to saythat so if you are a lot of veryfamiliar with onway gateway is just acontrol plan to make it much much easierfor you to run only proxy as an APIgateway that's it that's it we know howhard it can be if you like uh uh runwayproxy by manually confƁ�O�#��UAZ_15EyXOnhUso welcome to this update session of theworking group device management inKubernetes also known as what the heckis going on with DRAnowadays Um my name is Patrick Oie I'mone of the co-chairs of this workinggroup And the reason why I am in thatrole is because I started kind of DRA inKubernetes along with a few other folkslong before it became a workinggroup Um I'm I'm Kevin Clues I was alsoone of the people that started workingon it way early on Um and I just want toreiterate for those that um are herethat this is this really is an update onwhat's going on in the working group Ifyou're here to learn about the detailsof DRRA and how it works under the hoodand things like that come talk to usafterwards because we're not going togive a primer on that in this in thistalk So and and the third person who isnot on the stage today because we arethe official presenters is John BellameHe's also here We can discuss anythingwhat comes up afterwards in the hallwaytrack So the three of us we areorganizing the working group It startedout as an effort to have a more formaluh way of of driving dynamic resourceallocation forward and uh that was themain focus but it's not the only thingSo overall our mission statement is toenable simple a��igure them it'slike a ton of thousands of y file it's adisasterbut instead of manually configuring inway our gateway just use the kubbergateway API uh to help you set up andconfigure the proxyautomatically whether it's a standaloneservice or inside of kubernetes sowithout all the hassle so basically is asimple way to man to manage and handlethe gateway traffic foryou and Today our focus is securitypolicy security so what's exactly isthat or more broadly what do we mean bya policy right so we mentioned that onlygetway uses corated gateway API which isa standard and only gateway is one ofthe implementation so that'sit and one of the coolest thing I loveabout the G API it's uh is the the ideaof polish attachment basically you candefine your own traffic policy and uhattach them to the core gateway APIresources likegateway HP route to enhance itscapability to add your own feature butwithout necessarily modifying those coreresources directly though that's a verypowerful very flexible way to extend howhow the G API standard works and that'sexactly how security policy worksSo you just defy your security policyit's maybe like 10 times of file 10times of jam file that's it and uh whatGway will take care all therisk and what secret pass can dookay uh all the aspect of your gatewaytraffic securityfirst course you can define which origincan access your back end API so if youhave ever developed a front endapplication okay you made IP API callfrom your JavaScript bang you hit withsome random broader restriction ormessages up right but with secret passyou can just define which origin canaccess your back end you don't have thiswhole world about that thenauthentication you can define who canaccess your system we support ATP basicO simple way uh to but sometimes stillused and JWT token uh moresecure more scalable way and API Kauthentication uh very good for machineto machine access control and OIDC uh itcan be used with any standard OIDCidentityprovider after user gets in you candefine um what they can do can controlwhat they can do in your system uh wesupport fangauthorizationum secret policy allows you to identifyuser based on their client IP JW CLbasic O user len ATP header and methodand it can allow or delay access to anyspecific AP route based on rules and thelast but not least if you have your ownoperation system okay you have some veryspecific requirement we can satisfy youso you can just hook hook it up with ourgateway uh by using actualO um so let's take a look uh on howsecurity policyworks uh it can be applied at two levelsfourth level gateway level which meansyou can define one security policy andapplies it to the gateway level and allthe routes under that gateway will getprotected so you don't worry aboutit you can also define another securitypolicy for example you may have a haveendpoint which is uh very sensitive canonly be accessed by administrator so youcan define another security policy andapplyapplied to that specific route to overto override the globalone so this two level structure is veryf flexible andpowerful let's break down with anexample here on left a very simplegateway we define gateway ng whichhandles incoming ATP traffic at the port80 in the middle we have a ITB routewhich under this gateway to to directtraffic to the four service at port8080 and on the right most importantpart we have a securitypolicy so we define our IDauthentication so it's very simple it'sjust like uh eight line of Jam file hereso we specify the provider client idsecret which you can find theinformation in your in your provider'sconfiguration page and then we we add auhauthoration okay only the request fromthis client IP uh address can access ourback end so with this simple setup youcan the only only the authentic userfrom this particular sublet can accessour system all the other request will bedenied here are some more examplesum for example course you can define theallow origin allow header allow methodand get post header and basicallyauthentication you can uh authenticateuser using using the user and passwordstored in a sec security �and you canalso uh extract API key from an HPheader and you and use the uh API isstored in a secret to uh verify againstagainstitokay next we uh I will give a demo onhow you can use OIDC authentication withuh Amazon uh but before that let'squickly walk through this uh demo so youyou you can understand how it worksfirst administrator um create a securitypolicy uh using using Amazon Grado asthe identity provider and once it'screated all way gateway just picks upand translates it into onwayconfiguration and apply it to the uh totheproxy and once the user request come inif the request don't have ID tokenneeded so because authentication isrequired so only proxy will redirect theuser will block the request and radiouser to the to the Amazon kiko al uhlogin page and user need to input theirusername password or whatever credentialrequired by this login uhprocess then after successfully loggingin the user get sent back to the webproxy with an authoration code and uhthe web proxy will use that authoringcode to exchange um ID token from the bycommunicating with the ko token endpoints so that's how or worksright and this I token can pro can provethe user's identity so user getauthenticated and the request will beprocessed to the back end so that's howit works we can say with this with thissetup only uh the all the login ishandled by theAmazon and the radio the ID tokenexchange and the access control arehandled by the web proxy so all theseare transparent to ourapplication okay let's uh let's go intodemo i hope everything works wellbecause we load the conference Wi-Filast not so good so let's jump in okay Ido have like a uh kind of cluster on myde machine preloaded because I load thisWi-Fi lot so good and uh let's see herewe can see this Lspaceway gateway system that's wherethe gateway is installedokay and this isthe actual web proxy which will be willhandle your incoming traffic you don'thave to smear up by yourself uh if youjust create a gateway resource and oneor get control plan will like set upthis proxy for youautomatically and let's see hereso if we go to the default length spacewe can see I already installed the backend application that's a demoapplication it's just elo k here rightnow because that's a demo I will I willshow later okay and then let's get okayHow we define a gateway uh which islistening on the port 443 uh handlesHTTP HTTPStraffic and we also have uh HPlo yeah this one my appso we can say uh this loss parent is EGgetway so it's under the each gatewayand uh we just have a very simple ruleto rout traffic uh which match thematches the my app prefix to theuh to this back endservice and the most important partsecuritypolicy thisoneokay so if you look athere so this is target which means thesecret policy will be applied to thisparticular HProute and this is a K part uh oilauthenticationconfiguration one two three four fivesix 7 8 9 10 11 okay 11 time now yamfile that's it souh basically you have a client ID whichyou can get from the Amazon calculatoruh configuring page and you have asecret we store it in a correlatedsecret object andum actually quickly is not less if youjust do authentication but later I willalso show how you to use it forauthorization so I specify the quick lenso I can later to get some uh crucialcrucial information about the user fromthat cookie so I set up a quick lenghere and local path provider issue soall this you can get from the conf uhAmazon Ko configuration page okay ohthat's it let's tryit ohhopefully my god oh I'm relievedso we can say actually we heading to thethe gateway uh port but getway redirectus to or to toto me to the Amazon credo login pagebecause I'm I don't have like add tokenin my request then I input my credentialhereoh ohokay everything goodwill so that's a power of 11 file whatis what is authenticationokay now we get authent authenticatedand uh what happens next so we know weget an ID token returned by the identityprovider for example this is a uh J GWTtoken returned by the Amazon CDO so wecan say here every kire is actuallycalled claim which holds someinformation about the authentic user forexample sub is the unique ID of thisuser and email is workflow email so youcan use this information to enforceauthorization right at the gateway levelfor example we have a secure pass asecret policy on the right we can sayhow we use this information in oursecret policy so we just say okay if thelane if the sub claim is this unique IDand the email is this email then weallow access otherwise we'll be deniedlet's tryit okay i'm going to quickly apply thisokay or I change it okay we can see hereso let's scrolldownokay so the original OD say part isstill there but we have add some newsection here likeJT what this mean it means that okaywe we get it ID token from thisparticular cookie and we decrypt thiscookie and get all the claims from thecookie then we use it for authorizationthe this part so if the sub is this IDand email is this email we can accessokay let's tryOh lot happened what happened becausebecause it's my email so it's it'sexpected but let's changeit let's change it to let's change it toanother email maybeso let's change thisemail remove thisokay let's try it[Music]again okayso because the email doesn't match my myID token so I'm I I'm delighted sothat'sauthorizationso but not only JWT claim you can alsouse GW scope and you can also combineother conditions like uh probably countIP address yeah you can also use thatokay let's continuethis so just now we we we saw how uh ODauthentication work with Amazon leaderwhich is a managed service but sometimessometimes you might want more controlover your identity provider so youchoose to deploy a open source solutionmaybeklog so it's within your your ownorganization you have more control andsometimes you may choose to useself-signed search for it becausebecause you you use it for testing orbecause you don't just don't need apublic sale and in that case you can usea back test policy to specify the uh theCS for the gateway to collect toestablish a test collection with withcake logoh okay so uh on the right side we caninside secret policy we use uhanother only gateway resource calledback end which we use it for uhcommunicating for sending traffic tooutside of the kubernetesso anduh we also create a backend resourcehere called a back end catalog and in inthat backend resource we define the hostname and port of this cake log and inthe back tier policy we specifythe the CSR from a config map so that'show it works so with this setup the Ogateway will use they provide pro theyprovideuh CSR to establish TS collection with Kclock but actually this here in my setupthe back end is not necessary because Ideploy Klo within the same cluster of myown gatewayif you deploy the kickload outside yourcluster maybe in your virtual machine oranother another cluster you can't reachit with uh throughlike DNS so you might need this so justforexample and try itokay so I already deployed Klog uhwithin my cluster and uh let's see thesecuritypolicy so I have an article policy hereso very similar tothe previous one the only difference isthat okay the provider is a backendresource instead of a URL to like Amazonkiko and uh you can also define someother traffic policy you can like defineretries for the collection with kick logsomething like thatyeah okay that's iti'll tryit oh I also need to show the roadbecause this route this target of thissecurity policy is this HProut so it says is loso okay so any request sent to exempteraccount match thisprefix will beuh be controlled by the secure policylet's tryitokay yeah getit okayyeah luckily itsucceeded so that's pretty much mypresentation today and here are some keytakeaway so what a security policy cando here yeah and you can I do have thethe demo the script the yam file uh inthisuh GitHub repo so definitely downloadand try it out it takes like fiveminutes if you want to try it out andalso our friend from Docker they alreadyput our getway into production so if youif this is something you are interestedin so definitely check it out across thehallway on that room it will be nextpresentation thank you[Applause]2025-04-15 21:58:41.153465�ove to beta in 133 yep stand okay uhnext feature is fine grain cublet APIauthorization uh again this isgraduating in beta le by vineiac um andum there's not a lot of big changes forthe beta release um and this featurebasically enables admins to grantspecific granular permissions forendpoints um like config z and pausewithout um resorting to granting a morebroader proxy per permissions which isthe case today um this again aderes tothe principle of lease privilege um andallowing components such as logging andmonitoring um agents to be able toaccess only the necessary cublet APIendpoints that it should have access toallright uh does this work yeah uh so youmay know that UIDs in and in Kubernetesclusters maybe don't seem as importantbut UIDs are actually quite or maybeimportant in in some external systemsright and uh so recently starting withstructured uh authentication uh youwould start seeing UIDs appearing insome of the authenticators that startedwith the OIDC and in 133 uh two other uhauthenticators were enhanced with UIDsuh so we had the uh UIDs uh being beennow honored in the uh client certificateauthenticator uh where you would consumethe UID on on the API server from uhfrom the client certificate from aspecific relative uh the end in in thesubject of the certificate right uh youcan see like the oid of the RDN here uhthe reason I'm mentioning it like thefirst part is like all the way until twoit's uh do ID that is reserved by Annafor CNCF and you can see that that it isfolded by two and you may be wonderinghm that's weird I wonder what's whatwhat happened with one we will get tothat later u so that's one thing uh oneof the authenticators uh the otherauthenticator which actually graduatedfrom uh the UID graduated from alpha tobeta was the request headerauthenticator uh so now in beta bydefault the cube API server will alwaysuh include the UID in the extreme remoteUID header uh however if you if you wantto consume these headers you still needto go through some configuration thisdoesn't happen by default but we'll belooking into that in the futureversions all right so these were thegraduated features uh let's look atwhat's upcoming and uh we will starthere with the feature that I personallyam the most excited about not onlybecause I got to work on it but you knowthis this is a feature that actuallywill eventually when it reaches G itshould uh improve the security stance ofKubernetes by default so this is reallygreat and to introduce it to you so uhI'm not sure how many of you know thatwhen a bot uh tries to uh download aprivate image from a private repositoryright you specify your uh image pullsecret in in in your pod manifest andyou you know the the the container imageeventually gets downloaded on the nodebut from this point when the imageactually makes it onto the node uh anyother pot that is capable of running orcan can which can be scheduled on thenode can use that pot even without anyother you know additional credentialsthat that the original bot had to had touse and so up until now what you have todo is you would have to turn on specialadmission that would just turn yourimage pool policy into always and withthis feature however uh we were lookingwe started looking into this and uh sofor every pull that the cub does weactually record uh the the informationabout the pool like about thecredentials used and and suchand whenever any other pot tries to uhtries to access the image which is onthe node which was pulled by uh by thecube we check the uh credentials thatthe bot is presenting the the cublet youknow when when it's trying to access theimage and when we see that uh thesecredentials were not previously used tosuccessfully pull the image we just uhwe just push uh the pot to uh to go orthe pool to go uh through the registerauthentication again um and yeah so sothis is this is really great uh so itnot only as you can see it not not onlylike uh uh removes this like weirdbehavior of the image pool policies uhif not present and never uh but alsolike eventually it should remove uh theuh container registry from the cri�ticalpath when uh you know so so when whenpreviously you would have to use thealways put policy and you would alwayshave to query it so I think it's it'sreally exciting i hope that you will beexcited about this just as I am and yeahuh it only says my name here in theslide but this was really a group effortthat these were there were many hours ofmany people invested into it uh not notonly sig like me and Jordan jordan didreally great job on reviewing and designlike helping design all all this butalso the sign group we spent many hoursdesigning this i spent many many hoursimplementing this but yeah uh reallyexcitinganother feature called service accountimage pull credential which is verysimilar uh in the same space where todayadmins are limited to using long livedhard to rotate image pole secrets rightthat are stored directly in thekubernetes API or secrets that aremanaged um at the cublet level uh via acublet credential provider this meansthat any pod running in on the node canaccess those images um and a pod shouldinstead be able to use its own identityto pull the image that it should haveaccess to so this particular feature wasworked on by a niche um and is a part ofa bigger effort to reduce reliance onlong live secrets uh in Kubernetes um itallows the cublet to request shortlivedtokens tailored to specific audiencethis uses a projected service accounttoken for cublet image credentialproviders enabling dynamic configurationof service account names and audiencefor tokenrequests uh another feature uh is calledthe DRA admin access um so for those ofyou who are familiar uh DRA stands fordynamic resource allocation this issomething SIG O worked with SIG node onum basically the feature introducesbunch of APIs that facilitates dynamicrequesting and sharing of resources likeGPUs or other specialized hardwarebetween pods and containers to enhanceresource allocationnow in 133 as an alpha feature now onlyusers that are authorized to createresource claims and resource claimtemplates with elevated privilege canbasically roll out workloads that tomanage devices that are allocated andused by other usersand this is only possible for namespacesthat have this specific resource cateioadmin access true label again to ensurenon-admins cannot misuse the field so asyou can see that's an example of aresource claim template using the adminaccessfeature uh and I should say uh at thebeginning of the talk that we we havethree demos so if time permits we'llshow all three um the next feature iscalled external signing for serviceaccount tokens um this is somethingJordan Samuel Harshel worked on um uhand there's bunch of uh changes in 133so today when we think about tokensigning it is done in process with localprivate key files this feature adds theability to allow the Kubernetes APIservers to make gRPC calls to for tokensigning with external systems like KMSum making it easier for for you to dokey rotation key verification and auditum and reducing the need for theKubernetes API server to manage signingkey directly and offload thatresponsibility to a external searchsystemall right p oh yeah good uh so Pcertificates that's uh another big thinguh that here is working on it's it'sanother uh certificate feature uh thisone lets you uh easily mint certificatesclient certificates for your serviceaccount identitiesum this is that there is a currentlylike a huge PR which we are allreviewing uh done by Tahir and itintroduces new API there are parts ofnode authorization that that there thereare all all these things and yeaheventually hopefully we we will get to apoint where uh where mintingcertificates for for your serviceaccounts will be really easy and Yeahyeah and identity for service account isrepresented by the structure that youcan see here and maybe if you rememberthe original slides and and the oid youwill see where that num number one forCNCF group went so it was it was decidedinhere okay uh so there there are someother features i was specifically toldthat uh you know Siga hasn't done theplanning for 134 yet so you know the thethe numbers may change� but these arethese are things which are currently inflight uh namely the harden of cubat 7ser validation is a feature that willadd additional check for the cube APIserver when it's trying to connect tothe cublet u you know beside the normalhost name validation uh the cube serverwill also check that the host nameactually matches what what the cube APIserver would expect because it knowswhich node is connecting to so that'sthat's just one add additional checkthere u there's also the extension ofthe node restriction to service accountsso when you have workloads which may youknow like like Damon says which are justlabeling nodes for so so you canschedule some specific workloads on themuh you don't actually need those serviceaccounts to do more than just likeaccess the specific node object and sothis is this is exactly what uh thisparticular feature is aboutokay and um the next one is PSArestrictions for probes host fields umfor those of you who thought we weredone with PSA not quite um there'salways new vulnerabilities and new waysto protect your clusters um so with thisnew PSA control that is still gettingworked in progress um and it's hopefullygoing to alpha in 134 um Sora is workingon a design for this right now um andthis control basically restricts settingthe dohost file in probe handler andlife cycle handler configurations thisis to mitigate potential server siderequest forgery vulnerabilities thatcould result in imp from improperconfigurations uh of probes or lifecyclehandlers so look forthat all right demo time yayall right so the first demo is um the DRammon access feature we'll probably haveto click it on the other screen let'ssee click oh no click okay so maybebefore Oops did it stop okay so maybebefore the demo starts um so think of auh so this example basically talks aboutlet's say you have a cluster with GPUsand you want to make sure um your normalusers can allocate workloads thatrequest certain amount of GPUs but forammon tasks you don't you want to runthose ammon tasks but you don't want toactually um prevent normal users fromallocating those resources so that'swhat this feature is about now withoutthis feature what it means is when anadmin tries to get access to the GPUs orcertain hardware it would basicallyprevent them from using that be or oncethe resource is allocated that thenormal users can't use those resourcesi can just click it nowoh think it went to the next demo uh didit okay i'll click it here if I getthere all right where's my cursor hereall righttechnologies can you see thisclearly okay so basically here we haveum a kind cluster with the uh alphafeature enabled with bunch of DRAfeature flags um and as you can see herewe have bunch of work worker uh nodeswith GPUs allocated um and here we havebunch of um regular let's say AIworkloads that request bunch of umresource claims and using uh resourceclaim templates that request uh say nnumber of GPUs and once this is deployedum then all those GPUs are now allocatedor assigned so any future um requestthat uses the same uh resource claimwould not be able to get any rightbecause they're already used or in useso next we haveanother work oh wait what am I lookingatuh okay so as you can see all u all theGPUs across all the um worker nodes arenow allocated and assignedreserved and as you can see these arethe um uh resource slices that shows theGPUs are currently being used uh next wehavesee this is another admin uh pod that anadmin is trying to allocate resource andas you can see it's getting an errorthat hey it looks like you can'tallocate this because all the GPUs aregone like I'm sure all of you havegotten um so in order for us to run thistype of um uh task let's say you'retrying to get some metrics from the GPUsor you're trying to get some healthmetrics right um so what what we wouldlike to do is actually update this uhpod yaml to use another resource claimthat is using the admin access featurebut before it can do that um first weneed to make surewe're can't seethis it's my computerit's actually I think it stopped did itno no it's good okayokay um so� now as you can see it'strying to use the resource claim but itnotices that it can't schedule theresource claim because it's not in anamespace that's dedicated to run admintasks so next what we want to do um isbasically um create a namespace and andlabel it with the um Kubernetes um nameuh label to make sure that thisnamespace can run ammon tasks um andnext we want to uh update the um podyaml to use the new resource claim andthe new namespaceso here as you can see the new pod gotum alloc or got scheduled and it's ableto runsuccessfully um and next we see thatum that it is running successfully soyeah next[Applause]demo all right now I need to find mycursor yeah my computer is a littlecursed but hopefully you'll you'll beable to see something so in this demoyou will see a bot failed to launch uhthat's basically it on the left side youcan see uh you know u the old behavioruh this is this is by the way the demofor secret put images uh on the rightside you can see uh a cluster with uhthe feature uh enabled the feature thatprevents you know the the unverifiedum u image container image use so firstuh we you know we will be using uh animage which uh is from in a privateorganization uh so we will create a botjust to see uh this fail um so you cansee thatuh you can see that I made a typo therebut uh you can see that that the imageactually failed to pull uh because itdidn't have any authorization because wedidn't supply any credentials for it tobe pulled right and so that is all goodum so we we just didn't delete that portbecause we don't need it for now u sonext we will create a pool secret thatwe will later use we will use it realsoon in fact so you can see here is is apod that actually specifies that thatsecret that we just uh created for forthepool and it will use that to reach outto the registry and just represent thecredential to it so we are creating podand as you can see uh this was reallythe thing that made the previous botfail to start because the image wouldn'tpull all rightum so uh next uh what what you will seeis we will just create the old poolwhich was uh I call it broken um and nowwe will uh see what what actuallyhappened or didn't happen eventuallyhopefully the video didn't stop there itis so now that we can actually see Yepyou can actually see that with the newuh setup the the image failed bullwhereas with the old setup you know thethe the image the the the pot stillstarted even though it didn't have thecredentials to actually use uh use thatcontainer image and here on the rightside you can you can just see the proofthat the image actually failed to pullbecause it was oh yeah because it wasjust forced to to repool the the imageregistry and here here for completenessyou just see uh that even with the pullu pull policy never uh your image willstill not be used instead uh the potwith in the new configuration will justfail to start so yeah on the left sideold cluster right side new cluster yaymulti-tenency multi-tenencyyeah here here I'm just uh showing youknow what what happens uh when you saysay I live in in a different umdifferent name space here I made a typoso so the pot isn't actually there but Iwill create it inside of the newnamespace again so first you knowshowing the broken part you can see uhthat again old old custer running newcuster fail to fail to start the imagebut that's that's not the important parthere uh next you will see uh me creatinguh the image with uh well the secret andthen the uh the image actually uactually using that secret uh to pull uhto pull the image again although thistime it will actually not pull it but itwill use uh the image from the node soyou can see both PS are running asexpected right we we have we have thesecret and then uh we'll be listing theevents uh and yeah this is a littlemessy but eventually I find the rightevent there which says that let'sseedata which says that the image wasactually used from inside of node andyeah oh you you can see that that theimage improved a little bit right Yeah Ithink that's Yeah that's that's probablyfine for this demo yeah the anonymousfailedexactly yeah yeah yeahyeah and we have one more demo here isthis uh No this is the this is the theotherone but we have one more demo here let'ssee the time yeah I think we should begood right so this demo just uh showshow you would use the cluster justbundlesum if it starts it starts yeah uh hereyou can see uh we we are configuring theuh feature gates on the API server thecontroller manager and cubat in controlmanager this is basically just so thatthe trust bundle for the cube API serveris published and it's not very importantfor this demo but I'm using this fortesting and in cublat this dis thisallows the cluster trust bundleprojection into the pots and on the APIserver obviously we we need the APIs andthe arpegg for all that so again wecreate a name space for testing um wecreate asecret service with with ser certificateand key for the service that we will berunning right we can you can see that uhthe issuer is standable at test CA anduh this is uh this is the DNS sign forit u and this is the CA that's signed uhthe ser certificate so yeah you knowit's self signed obviously way um so inthe next step I should be creating uhthe cluster trust bundle for this CAright uh let's see it happen yeah uh sothe cluster test bundle looks like thisyou can see it has a certain name whichhas a very specific format to clustertest bundles this is a serer name so thethe name of the cluster test bundleobject is actually derived from thesigner name and uh yeah you you've gotlabels which are also important for uhfor how you mount uh or how you requestthe mount inside of the pot so we createthis uh clust bundle object and thenyeah then then we are basically creatingjust a service this is an echo httpserver which basically like send sendsback information about the request thatyou made to it and you can see that itis serving with a with a SER file andkey file that we just created uh here Imade a typo um but here is yeah herehere's just a service for for the pot sothat you know it it matches the uh thehosting match and u yeah here is aclient port which will just you know uhcurl the uh the endpoint uh of uh of theHTTP service we just created it will usethe uh it will use thecaert from inside of u of the clustbundle volume which is specified downhere you can see the way how the clustbundle volume is is actually specifiedhere's a sign name and here is a labelselector which which actually uh letsyou pick you know specific CA you canalso like pick pick a few this couldalso like pick a few of them like youknow previous and the new one so thatyou can rotate a little bit more easieruh and here you can see uh that theclient already uh finished and I willalso show the logs of it and yeah uhbasically here I'm showing uh that thisis exact this is yeah this this is uhthe you can see theservic and you can see that the CA filewhich you specified in part was beingused for for this successful pull orrequestyep and uh so That's it for the demos uhanduh this slide will will take like thisQR code will take you to the descriptionof of the sego group what we do how wemeet uh and when u and basically all theinformation about about the group andwith all that I think now is a good timefor questions we've got five minutes[Applause]there's one question back i seeoh sorryum related to the pod image pulling yousaid that one pod can pull the image andthen if another one uh wants to pull itthen it needs to reau authenticate tothe registry what happens if for areasonum the pod gets evicted deleted whateverwhat will the new pod do uh do you meanlike the pod that previouslysuccessfully pulled it it doesn't matterthese uh the the records are actuallypersisted on uh on the node right andeventually they may be uh they may belike cleaned up but uh yeah those areusually persisted actually right so ifthe second pod has the same secret ifthe second yeah if the second pod hasthe same secret uh it's it's fine thethe the request will not go uh to theimage registry right because yeahOkay any other questionsgo try all the new features oh yeahplease[Applause]2025-04-15 21:58:41.789417 ��`�#��wArVz-vIFGT4khi everyone thank you for being in thesession uh today we will talk about someof the updates from SIG O uh both for aand authorization um and we will touchon upon a lot of features that arecoming in uh v133 that's coming out in acouple of weeks and then we will talkabout some new features uh that areplanned for future releases my name isRita uh I'm an engineer at Microsoft i'malso a SEO co-chair i am i'm also anengineer in Microsoft and well I do allthings well all things contributions insego that's true um and throughout theslide uh we will also be mentioningcontributors who are contributing to allthe different features um because uhnone of this would be possible withoutthecontributors so let's start with thegraduated featuresuh bound um service account tokenimprovements this is a a kept anenhancement that has been uh done by MoJames and Jordan for the past fewreleases and it's going to GA in 133 umso what is this feature right so in thepast when attempting to verify a serviceaccount token associated with a pod itwas very it wasn't possible to verifythe pod associated to a specific node umyou would have to do a lot to get therelevant pod um get the pod object fromthe private claim in the drop token andcross referencing that to find thespecific node name um this featureactually allows us to uh have a robustchain of identifi identity verificationall the way from the requester to um theprojected token and and we want to getthe no object reference in the requestedpod where it's embedded in the sign jotthis feature um incorporates additionalclaims like the node name uh and JTI inthe token itself so that we can do thistraceability and prevent any replayattacksuh cluster trust bundles so uh thisfeature uh makes it really easy for youto install and maintain an additionaltrust in your cluster right sopreviously if you wanted to add atrusted signer into your cluster youwould probably have to juggle a lot ofconfig maps around and with the rotationthose those were some some nightmaresout there and so we've got these clustertrust bundles which basically with whichbasically you define uh your serer yourtrusted serer as an API object and thenin your ports you you just mount uh thetrust uh the trust TR bundle of thattrusted object of trusted signer intoyour pots yi projected volume u so uhyeah you know one of the showcases herewas uh a ser for the QPS server that Iwas actually adding in 132 uh so you youcan see in the slide basically how dohow how you would work with these uhcustom trust bundle objects right youyou can see that a lot of it revolvesaround the signers like the signers nameand you you you basically as as aconsumer of of trusts inside of acluster you you you can list all thesigners you can decide who you want totrust and then you just simply mount itinside of your uh of your pot and I justwant to add to here started the originaldesign and stand made it possible tom�� of the challenges thatdevelopers are facing today Right asdevelopers whether that's at theapplication level or the systems leveluh where you're building platformservices uh typically we pick a generalyou know purpose language to use uh forbuilding our software or building ourapplications uh go is a very popularlanguage in the uh platform universe butJava is equally popular Rust is you knowalso gaining traction uh and others uhyou typically Think about how do youtransfer data because in observabilitygiven that you're generating so muchdata you know across applicationcomponents or infrastructure uh thinkingabout data transfer is also anotherimportant aspect Uh typically gRPC isvery commonly used or HTTPuh and of course most of you must haveheard of protobuff It's kind of anintegral part of many of the componentsthat uh open source observabilityframeworks use today uh and then you'rethinking about how do I store this dataright uh picking a database with aspecific query language and this iswhere you know the uh details creep inbecause you now you have multiple typesof storage and you have multiple typesof query languages to actually go andlook at that data right whether you'rewriting or you're reading from it andthen last but not least you know how doyou deploy the code so there's a fairbit of complexity where you want to haveeach part of this life cycle you knowbeingobservable then you care about okay sonow that I I have my service in placedoes it actually work and of course atthat point you add telemetry or metricsright most of the time um typically logsare easy to do you know if you'rewriting serverside applicationsespecially and you will go and you knowat at the least go and turn on you knowwhatever the equivalent of print f is inyour in the language you're using uh inorder to at least get basic you knowobservability If you have too many logsyou know because you now just you knowhave logs everywhere then you startstart to think hm how do I you know kindof correlate that with metrics and getinto that if metrics are too high leveland don't give you the cardality that isrequired then at that point you're alsosaying okay can I use traces you knowand and get more understanding of the ofthe uh what the application is doing andlast but not least you know again if yousay hey my application is very slow Ireally want to understand you know whatthe po what the perform how we canoptimize performance for theapplication then can I get profilinggoing right so now you can turn on allthese knobs for different types of dataand you are now you know kind of usingum different types of data to understandyour application as well as your overallbehavior of your systemSo enter tele open telemetry Again opentelemetry as you know has been a verypopular uh you know uh collection u andingestion agent uh as well as hasdifferent SDKs for instrumentation andopen telemetry of course can consume allthese different and ingest all thesedifferent types of uh data uh which thenyou can use uh forstorage So I've got my data now I haveyou know I'm collecting it what do I dowith it right i have baked in charts Uhthe alerts are nice but how do Iunderstand the data um I want to be ableto analyzeit So you know again hey I put togethera UI you know with one of the UIs I useuh to build my own charts and tablesbecause the general ones aren't goodenough And then I need to you know docustom metrics because I really want tounderstand more about my application Andyou know I have tool XYZ which is agreat UI you know for slicing and dicingbut how does it work the u and and youknow maybe another tool that you may beusing it doesn't have a great UI Solet's try querying the data because theUI is not enough for us to understandmore So now what do I do right i havelots of domain specific languages thatis DSLs as you commonly know uh forquerying data and you know if you wantto query metrics you have promql youhave graphite you have flux if you wantto query logs you've got splunk QLelasticuh logql lots of different bsls againand then you have traces like trace QLyou know new �relic QL piped pplum UQL lots of different languages Ifyou go and look up the landscape ofobservability you'd be amazed at thenumber of DSLs that have come about youknow from each uh uh storage of thesetelemetry types So um so far there isn'ta query language which is very specificto profiles or to real user monitoringdata from data which comes from clientsBut you know you never know when youknow we'll see another set of syntaxcoming in for going and quering thatthat data too So how do how do Iactually you know unify all thesedifferent types of data to understandthem because as an end user I may haveyou know 20 types of uh data that I'mcollecting stored in 20 different places20 different types of storage and henceI have to now you know kind of force myuh you know entire org to go and learn20 different types ofDSLs So the context you know is spreadaround all these different types of dataand uh what is interesting is that someof the context now is not only uh youknow stored in observability specificdata stores but also in BI systems likeclick house or druid or some of theothers right and operational data storeslike spker or crossplane or you knowother frameworks that you're using sothere's data all over the place and justyou know to set the context Next yeartraditionally we had developers and ssurres and today we have we're movingtowards you know a universe where uhmore more teams are consuming thisobservability data So you havedevelopers you know who have been nowkind of are doing devops also becausethey also you know support theirservices So they're on call and theyhave to also understand theirapplications end to end uh and theirautomation and deployments also have toprovide them that data Then you haveSRRES who uh triage and you know workwith the devs You have platformengineers who are also scaling based onunderstanding the usage of their systemsYou have uh security engineers You haveAI engineers You have uh managers alsoasking for data to say hey you knowhow's your system doing uh were thereyou know what kind of uh customers wereaffected you know when there's an outageSo you have executives and marketingcoming in into that mix and you alsohave support engineers who are actuallyhelping customers you know across theedge of your applications and customersthemselves who want to understand youknow what are the details about whatthese failures or errors or outages wereSo with that said again I just wanted toset the context and give it over toChris uh as I you know kind of just wantto say that when you have such a complexenvironment you have to strike a balanceright like am I going to provide deepunderstanding of our systems or be turabout what we are providing uh deepanalytics versus quick lookups right seethe considerations because you have somuch data and interoperabilityuh uh or optimization for very specifictypes of queries right so these areconsiderations from a design perspectiveand Chris will dive into that uh in moredetail Christhankyou Let's see Oh let me go back one Uhhere we go Yeah So some solutions tothese problems here right um we want tomake it as easy for users to work withthis at Netflix observabilityengineering that I'm part of is underproductivity under the platform work Soour remitt is to make it as easy to usetools and assist developers and otherfolks at the company as possible So thefirst thing that everybody asks nowadayscan we just throw AI at it you know letthe ML the big language models take careof it for us Can we just say you knowshow me my uh the uh the root cause ofthis incident this alert that triggeredyou know yesterday well or can I uhwrite a query uh or ask a question andsay "Show me the CPU utilization for myfleet of Kubernetes hosts over here."You know AI is getting a lot better atthat We're not there yet Of course umone risk that we have is that during anincident if you ask an LLM to show youum an answer or whatever you're lookingfor if it's close but not actuallycorrect it's going to cause moreproblems for you than if you actuallyunderstand the quering behaviorunderstand your tools an�d craft a queryyourself Um we also have the problemthat with various DSLs right now if youtry to create one LLM that's going toquery Elastic Search and Prometheus andall of this these data sources you'regoing to have more hallucinations Kindof going back to like what VJ wastalking about in the keynote there whereif you target an LM for one specificthing it's going to be a lot better thanif you're trying to you know grock thewhole world of telemetryUm another problem is that vectorzingdata there's been some research in thisright now for observability and thenrunning models on top of that The ROIisn't there yet It's too expensive Weproduce way too much telemetry right nowto make that ROI feasible In the futuremight work but we're not there yet Andthen Agentech AI also uh works better ifyou have a common model common querystandard and common semantics across allof this data than if you have to dealwith all kinds of data sources that aredoing things just a little bitdifferently right um and we of coursestill need a way to validate the queriesand the data that we're getting backfrom LLM right now So we can't totallyturn it over to them yet Keepexperimenting That's great We want to dothat Another option would be you knowjust switch to a specific telemetry typelike white events right um that's pushobservability 2.0 and it's great butagain um it's really expensive at scaleat Netflix scale or Google scale Amazonmeta etc For targeted use cases it'sgreat um like the scuba model at Metawhere a user may want to pick out and uhtrack a certain event with very widecontext and add it In the past we've hadHTTPD access logs That's a wide eventessentially Didn't have tons of contextbut it had a lot You know it had metricsin there It had user ids it had uh URIendpoints etc It was great There's alsothe problem though that there is somecontext that we can't capture in theseevents We need to join it and enrich itat the end somehow We do our best butit's not there Think about capturing CPUum utilization That's not going to bedoable per honor request basis veryeasily You want to sample that still asa metric and join it after the factright um and then also we still need away to even if we have wide eventstraverse graphs like distributed tracesfigure out how to dothat Uh wrong way let's go here So goingback to the CNCF workg groupoup some ofthe things that we've looked at here asfar as coming up with a solution Umwe've interviewed 11 different DSLdesigners u from data dog to um uh thePrometheus folks everybody We'vecataloged a lot of like languagefeatures We've cataloged some telemetrymodels particularly hotel and someothers there We've also cataloged a lotof observable use uh cases and writtenthat all up It's all open sourceavailable in the tag GitHub repo and wehave links here We'll have them afterthe talk to um that you can look atinternally at uh Google prayer and histeam um did a bunch of interviews wherethey looked at the user experience umacross observability where they hadthree to four different languages in uselike monarch's language you have um SQLand use you have other things promql isused there internally too they alsolooked at the different experiences forGoogle engineers from the folks who justjoined to people who have been there alot longer um and they found out thatyeah the people coming in um when youhave a bespoke query language it adds onaverage about you know 2 weeks ofengineering time just to learn that overtime Um I'd add to that on top that ifyou have somebody who interacts withobservability irregularly when they comeback they have to refresh and learn allthat knowledge You don't just learn itonce necessarily and if you're not usingall the time you're going to lose someof that knowledge right they also lookedat the uh um the query performance So ifyou have three or four different queryengines you know are you spending a lotof time trying to optimize those enginesfor that specific use case but then youkeep optimizing as new uses come uh usecases come up or is it better toactually focus on one or two engines uhtry to get al�l of those consolidatedthose DSLs and optimize that engine forthe wider range of use cases Um and justanother note that they mentioned at uhGoogle and majority of observabilityusers are developers It's not sresanymore It's developers And that leadsinto quantifying things at Netflix aswell So I took a look at our um helptopics and our Slack help channel andthe UI interactions that we have andthen looked at a number of u kind ofcategorized all the users We have a lotmore developers at Netflix and SRE It'sprobably the same thing at mostcompanies and indeed as we've talked tofolks that uh pans out and the users thesupport topics the support burden for usobservability engineers comes fromdevelopers asking questions about how doI use this language for our case it's anRPN based uh stack language called Atlasstack language that's totally differentfrom everything else out there it'sreally powerful but you have to learn itum you have to take time to learn and wespend a lot of time saying this is howyou use it and that's a pain in the neckreally So that's why we're looking forstandardization You also have platformlibrary owners who are trying to buildexperiences with observability for theirusers and they're the ones who come tous and ask for some help too but they'reusing the UI tools surprisingly a lotSRRES are actually not using it as muchonce they learn the tools They're notgoing to bug observability aside fromasking for features right or performancequestions things like that And thensurprisingly or not surprisingly datascientists um are using observability alot more at 10% and 22% in our tools Uma lot of that is big data analysis but alot of it is real time analysis too Umand then managers are of course usingobservability External customers arealso using it In Netflix we have toshare our observability data with ISPsThey want to know what's going on insideour tools That's happening more and moreacross the industry as we've talked topeople like Salesforce Adobe They haveuse cases where uh customers of theirproducts want uh analytics products andwhatnot want to know if say a customerjourney failed because internally somenode in some Kubernetes cluster died andthat's the root cause for that youknow again an analyzing telemetry welooked at it um everything's kind ofconnected through attributes we do a lotof coercion of uh logs and traces andprofiles into metrics so when people saywell these domains are totally differentthat's not true anymore really this iswhat we're doing we're converting theseinto metrics we're analyzing them alltogether um we also analyzed a lot um ortalked with and interviewed the DSLdesigners and found out that yeah thedata store isn't that tightly coupled tothe syntax to the query language A lotof them felt like yeah we could extendthis um into other telemetry types noproblem And a lot of them all had thesame things similar predicates similarfunctions similar aggregates um very fewof them a couple actually supportedgraph predicates like uh tempos traceqlwhich is pretty cool and then going backto that and looking again at thecommonalities between data we found thathey it's relational right and I think alot of people raise your hand if you'veheard of relational databases before youI think yeah quite a few probably rightso no surprise here when you're lookingat correlation which we're trying to domore and more of with telemetry nowadaysEh it's relational You have the sametags and keys and values on other thingsThink of them as columns Yeah whateveryou want to do they're commonattributes So the question then ariseswhen you're looking at all theselanguages Is there anything thatsupports uh conditional and relationalpredicates anything that supports graphpredicates anything that supportsjoiningdata you know grouping aggregations timewindowing Anybody here want to hazard aguess yeahSorry it's SQL So let me turn it over toPereira here and he'll talk about whythis makessense Thank youUm turn off YouTube right cool Um soyeah so the the approach for uh we getto the point to say SQL oops there'ssome echo here SQL is t�he language umproviding that for us rightand um going back to the research thatwe did in Google we after that all uh uhresearch that uh we took some months todo that uh we decided to converge in SQLextending SQL and and and extend uhaligning on SQL and extending SQL andmore specifically adding uh more bettertelemetry support with pipe syntax andtime series extensionum and as well trying to focus how tobring this this this square language umto support more observability datasources uh across all the needs uhacross the company Um and the reasonthat we like the the value that we gotfrom this is really reducing the cost ofon boarding and training in the companyAs we mentioned the the the number ofusers of service data is much more broadthan before or or traditionally that wehave like only as is like everyone inthe company in practice Um and at thesame time we want to have a properfederated querying across all those datadata types right it could be like uhgenerics logging trace productiondatabases other source incident data etcetc right and we can do that all withSQL and we would like to unlock uhcapability for uh observable data lakeand obser back to why not just throw AIin the problem right like in a data lakethat have proper federated andunderstand semantics across differentdata sets That's the best way to use agentic AI is that right because there'salready kind of a a relation between thedata which kind of makes easier for AIto do what they do good right whichreally kind of like present thatconnection go across the things thatpresent the connection of the data Umthe other benefit that we saw in thecompany as well is that improveengineering efficiency because we don'tneed to kind of do multiple things Wecan focus in in in in one SQL engine wecan focus improving one group ofoperators and so on soon Um so tying back to what uh withinCNCF the goal here is is really kind ofanalytics word is returning to SQL rightduct DB is is a a good example on thatwe have other inspired SQL uh we haveSQL mesh as well for processing data wehave other lap stores coming on here andall of them like are focused in SQL welearn a lot from NoSQL um there's a lotof custom code to run queries in NoSQLuh Cassandra HB or map reduce um everyNoSQL start you figuring out how to doSQL the the the OG of NoSQL fromGoogle's big table right and and uhcloud big table not supports Google SQLwhich is the SQL language that's run ontop so so the no SQL have SQL um and umin general as OBI operational data areare not as as as different than beforeit was very different ways and now we'veas our AI engineers and so on we havemuch more common usages acrossoperational data and and businessintelligence Um when we do as all likekind of analysis for for uh changedeployment or when we do uh feature ukind of experimentation right featureenablement let's say open features baseum we need to more know much moreoperational data and do analytics acrossthe data So those are common patternsthat happen between let's say operationsand andBI Um and at the same time NCQL hasevolved right in 2011 we have kind oflike temporal support Uh 2016 we broughtJSON um not more XML um and 2023 uh webrought as well some kind of graph graphentities So this is kind of likeevolving is slowly but it's evolvingright Um so the recommendation from CNCFis using a subset of SQL semantics as astandard Um at the same time we we wantto avoid DML right like modificationinsert and create but we want to focusin the in the select part forobservability needs Uh the goal is todefine the models by type supporting ofcourse the open telemetry models focuson on relational execution enginesignoring a little bit for the focus Nowthe working group is ignoring the syntaxUm focusing federated queries uh with aquery plan for intimaterepresentation Define this kind of likestandard functions or syntactic sugar uhto support existing systems and reduceverbosity right SQL can be I I'll show alittle bit can be a bit verbose andcreated this gateways to permit uh greenfield usage and brownfield usage Rightso people that are already using thequery language now they can easilymigrate with those gateways Um we havethe QR code for the specificationlater So what what's coming up now wehave uh pipe syntax is something that uhuh we we launched in Google Uh it itbased on Unix philosophy modern requirelanguage We have that as well We havelike Splunk uh custol and others Um it'sreally just to kind of facilitate theirexploration It makes in a more directway Um Google released this pipe syntaxthat's then SQL without replacing youcan use easily both of them and this isalready kind of available in in inGoogle SQL with BigQuery F1 and Spannerand as data bricks as added to uh Sparksupport for pipes and this week waslaunched as for firebolt Firebolt hadthe pipe support in their syntax and forOSS we have Zeta SQL which is the GoogleSQL uh open source version um it'scalled Zeta SQL um so one interestingexample here we can see duck DB uh duckDB Prometheus uh a rate uh using SQL itit can be very verbose this is two pageby the wayum this is like the type of rate thatwe're doing in Google with Google SQL umand using pipes right was like very um Itell what a table what kind of a timestamp like the of course there's nooverlapping windows uh the time stampthe delta time stamp and I then Icalculated from the delta the rate andgroup by for whatever I want and andorder by time stamp which is kind of usehow I want to go here um when you seethis type of kind of difference betweennon-pipid and pipe it could be this likeone line one line difference or not muchdifferent um But that's not the realityright like in in standard SQL it can bethe syntax syntax order and the semanticorder can be very different as you cansee the order of the rules but when youchange to pipe syntax is much moredirect way and it's usually the way thatwe think so it facilitate for you towrite that and if you're doingoperations you want to make a verystraightforward way they're not like ohhow I go there and go back and and andso forth right so so the for the pipespecific usage in the last year inGoogle we have uh the grow is reallyreally usage this is uh this is like theusage grow in the in the in the companyuh this for internal query engine thatwe have there F1 um and and sticky andand spreading very fast and we're usingit more right and and this again this isgen query engine we have bigquery as allthat launched this uh January people areon vacationu so just as a summary here for thestatus and x4 pipe syntax you can trybigquery you can try the brick spark Ilink the docs fireboat I link the docsasum the firebolt launch was this weekthere's the paper for uh the pipe syntaxlaunch on on Google we have thatimplementing zeta with the oss versionof that um and we want community supportI want to hint here um yeah has apresentation later today uh please uh gothere he'll talk about pipe syntax sojust to tie this back the collaborationwe're having collaboration here betweenGoogle and CNCF the conclusion is thatfrom different perspectives reach thesame same outcome um that SQL is kind ofthe best language to kind of converge toum but at the same time there's some uhwork that need to be done uh pipes isone example better operators bettersyntactic to reduce verboseum and the next step that we do is tojoin this effort on sharing researchacross across this team from Googleperspective we have uh the oss versionin the second half o sorry sorry secondfive I said wrong here for aligningwindow histogram and some syntacticsugar and we work together for alignmentuh unifying syntax post the the existingwork for uh the workinggroup so um tying back here I put likethe three uh QR codes here first theCNCF draft semantic specification uhthen we have Zeta SQL if you want tocheck which is this kind of open sourceversion of Google SQL and as all thepaper about pipe syntax as all I want tocall out go to the talk for from uh yachlatertoday Um and that's it from us Uhquestions yeah we aretime Yeah we're happy to take both thethree of us can can take questions justoutside the door Yeah perfect Thanks alot[Applause]2025-04-15 21:58:42.298395 ��2�#��A1iWD14xvBQAso good morning everybody Uh very niceto see you all and uh hope you'reenjoying the sunny weather in London aswe are all you know sitting inside inthe rooms but this should be aninteresting session today Um again youknow we are um from the observability uhtag and we've been um we'll kind of gothrough you know our work and uh some acase study from Google uh on uhdesigning an observability language Sowith that said let's getstarted Um I'm Alolita Sharma from AppleI lead observability engineering and I'malso uh co-chair on the uh tag forobservability Um I have Chris Larson whois also an observability engineer uh andmy fellow um colleague in the work groupthat we have in the tag for um querystandardization specification Um and wealso have Pereira from Google uh who isum a uh observability steward at Googleand has been working with uh internalteams at Google to be able to uh look atyou know how observability is being usedby different user groups uh developersand others across the company and we'llbe talking a bit more about you knowsome of the research that they have donewithin their org So with that said umlet's get started Uh so the focus ofthis session today is really to um notyou know really be prescriptive about aquery language Right that's not theobjective of this discussion It reallyis that hey you know how do we propose arecommended recommended way of commonlyyou know using a common query method tobe able to query all kinds of data rightbecause again as you know inobservability data we have metrics wehave traces we have logs we haveprofiles events you know client eventsum as well as errors failures all kindsof exceptions that are uh considered tomeasurable you know observable andreally tell us a lot about the system uhthat we are trying to manage So withthat said you know I mean again as anenduser or any kind of user how do youactually um look at that data end to endthe other the so the other part that Iwanted to call out here as focus of thesession is that again we want to presenton the research that you know theobservability tag has been doing as partof the query language specificationworkg group uh on building anspecification for open observabilityquery semantic definitions right commondefinitions that are used across theindustry uh and Chris will be talking abit about that and Then of course Googlewho you know as a uh diving in into theopen- source uh SQL extensions that uhwith with uh time series and pipe syntaxum uh details to support observabilityqueries uh and and also share theirfindings with the communitySo just to be clear again making surethat you know people don't get confusedand us especially uh it's uh metrics andtelemetry will be used interchangeablyespecially by Pereira because you knowuh uh in the larger observability opensource observability world we typicallylook at metrics and call it out asmetrics but u uh often you know acrosslarge organizations It's also referredto as telemetry Um so these are timeseries based measurements Uh we usethose interchangeably in these slides Sodon't get confused Uh logs which we arereferring to is unstructured orpartially uh semistructured data Uhtraces um again distributed traces thatare consisting of spans uh acrossapplications often uh profiles which isreally giving you profiling data forsystems um information about CPU or GPUusage from the infrastructure memoryallocation memory usage uh etc And thenwide events if um some of you have heardabout wide events um or may actually usethem Uh and these are structured logswhich have additional context built ininto the log recorditself So um let's talk a bit about whatare some��n the cloud native worldis something of very complex or sometimevery complicated very challenging tounderstand just check the CNCF landscapeyou know you can see the number ofproducts the diversity of products someproduct have some gap some overlap interm of feature and it's also verychallenging to find the right productfor the right use case if you are not afamiliar user in thiscase but now If we step back you knowlet me talk about security governancebecause that is a foundation uh thefoundation of a cloud security team andyou know if you want to be able to scalein your organization you have no choiceto have a right framework in place andlot of organizations specifically theregulate organization follow this uhfollow this framework in the first stepwe will have what we call a securityreview what is a security review inreality it's a technical document wherewe will highlight all the securitycapability of a specific cloud servicesfor example as a storage account or AWSS3 we will also identify you know whichsecurity control we need to create to besure the service is deployed from anhard way and also we will also identifywhich potential threat we can have if weuse these services with misconfigurationin addition of that the next step is todo trade modeling it's very interestingand very important to do thread modelingin the early phase you know in this casea threat model for people which are notfamiliar with that is like a mapping ofthe different component of yourarchitecture and the goal of that is toidentify if we have any uh missingsecurity control any trait which are notcovered you know and that will help youto be very proactive and give an earlyfeedback to your developer versus towait to be uh at the production gatenext step is cloud control validationbecause as you know regulatedorganization we have no choice to followregulated standard and what we want toadopt is to be sure we have all the bestpractice in place and for that we areconducting cloud control assessment lotof organization are leveraging the cloudsecurity aliens matrix uh as a baselineuh for their cloud control uhvalidation next step is pentest it'svery important to have pentestspecifically if your application isexposed to uh to internet because youwant to have a third party opinion aboutthat and you know all these four stepyou know that will feed what we call ascorecard a scorecard it's an IT risk uhdocument which will define the securityposture of your uh application uh in thecloud and you know this scorecard shouldbe in correlation with the risk appetizeof the organization just to be sureeveryone is comfortable with that andthe scorecard is is is often presentedduring a cloud governance board whereyou have the chief architect where youhave the chief security uh securityofficer without to have a framework likethis one is very very challenging for anorganization specifically a largeorganization with regulation toscale and to help us to do that the CNCFcreated this forcy security model fromcode container cluster to cloud that canhelp you to map your different securitycontrols the different tooling you needto have uh inplace just I want to give you someexample about some potential Kubernetessecurity challenge we are very luckybecause our friend from the OASP createdthis OASP top 10 as you can see we havethen more 10 biggest sec securitychallenge one big security challenge isidentity identity in Kubernetes issomething very critical very complex lotof organization have a lack of identitymonitoring and you know if you want toaddress that um if you want to addressthat lot of tooling exists uhtoday and you know to give you an ideaabout the different security thread youcan have against a Kubernetes securitydeployment microsoft created this uhthreat matrix that will give you an ideaabout which attack an attacker can uhcan use against your uh Kubernetes uhinfrastructure as you have seen nowcloud security is very complex you knowspecifically in the cloud native worldbut for that we have an answer and theanswer is platform engineering andMatthew will uh present you t�hat yeahbecause Max stop it's too complicated wegot the feedback that we are blockinginnovation for our developer and we arelike limiting how we could deliver valuefor the end user and at the end of theday our company need to make money rightso how we generate that and thehappiness of the developer are not goodwe did a survey and they are likecomplaining that security is just ontheir own responsibility so are youhelping them max or with that iunderstand it's important so what we aredriving now it's this kind of initiativearound platform engineering and platformengineering again is not new don't makeme wrong but what I really love aboutplatform engineering is it's uh takingexperience learning from the experiencefrom a CIS admin world a devops worldand getting together best practices uhmore focused on what we call building aninternal developer platform how we couldimprove developer experience Right um sothat's what we want to discuss todayshifting down to the platform instead ofshifting left to the developerand we want to start with this need thedeveloper are asking us to make sure wesimplify their life they want to focusedon their code and not doinginfrastructure not doing security notdoing Kubernetes if possible on theother end we want the platform team wewant to make sure we havestandardized recipes we have moreautomation how we could put thattogether and actually I'm not a brandnew uh team uh reinventing the wheelreplacing those guys they are verytechnical so what we are doing with Maxas a security expert we are combiningforces we want to put together all thisautomation the tools and we want to addmore abstraction um again there is toolsand framework the CNCF platform workinggroup put together some white papers umand uh building an internal developerplatform it's all about building blocksand existing teams existing tools youhave right now the goal is picturingthese building blocks making sure youunderstand the interaction between toolsbetween people the processes and now youmay want to optimize how we could savetime how we could remove frictionbetween them because again as adeveloper I don't want to know what isat the bottom of this slide and platformengineer will need to orchestrate maybeor facilitate the integration ofsecurity engineer cloud engineer networkengineerUm what is very important for um thisplatform engineering initiative is alsoto see your IDP as a product you havecustomer and maybe you have multiplecustomers um I love I love this um uhthis quote from a link post caught myattention and it was security expert andthey said yeah we need to evolve we aresecurity engineer but we need to havethis mindset of a product decision andnot just technical engineering detail uhjob right and that's very importantright try to resolve painpoint focus onyour customer and all your personas notjust the developer many other uhstakeholders You need to have a backlogyou need to have a product manager aproduct owner and maybe multiple peopleon that and you need to focus not juston tools we will discuss a lot about thetools today but please take this slideuh very seriously uh because we are notfocusing on this but that's very veryveryimportant let's now share some tips wewant to start with on a developerperspective what could be improved wehave heard we have listened somefeedback and we want to focus on thisinterface interfaces maybe between themand our platform our internal developerplatform so as tip number one we have uhseen a huge benefit of using this kindof project i don't know if you'refamiliar with all of them but the goalis to abstract as much as possible whilehaving our developer in their inner loopright locally being uh very being sorryvery productive so if you know buildpacmaybe you want to abstract writingdocker files that's a good exampleanother one I don't know if you'refamiliar dapper they won't interact withradius directly or with aninfrastructure in cloud uh directly theywill contact dapper client so they willhave an abstraction uh on what they areuh talking to in term of infrastructureum micro mocking API co�uld be a goodbenefits to uh avoid integration testtoo eight in the pipeline and maybehaving some test uh shifting left thistest locally and score in order toabstract the deployment for the uhplatform uh that's kind of some tool youmay want to look at for your developerto help them being productive staying intheir inner loop without having all thisintegration uh pipeline or test um laterin the pipelinetip number two how you could provideself-service how they could get thecentralized maybe technicaldocumentation do they have multipleplaces to get a documentation toaccomplish something could youcentralize that how they could onboardon a new project on on your new platformum how you could provide also a a viewof the different system maybe they needto connect they need to switch contexthow you could reduce this cognitive loadon a centralized place and if you'refamiliar maybe with backstage that'ssomething you could look at there isother portal uh framework and tool outthere but that's a very very popularone as tip number three very popular umrecently if I could say u how you couldinteract with the um with the platformand the tools because we got thefeedback max mature that's good youabstract ract as much as you want foryour developer but I'm receiving a lotof slack message personally I'mreceiving a lot of teams service nowticket jira ticket how you could empowerthem and shift left on how I could querymaybe some systems so you will have andyou will design what could u could theyget right to enhance the education can Ihave the status of that the log of thiswhy this and why not that so with thatyou could reduce the ticket ops andincrease the visibility for them and notshifting down the cognitive load onsecurity cloud engineer Kubernetesengineer aswell now we want to um to highlight sometips more for the platform engineeringand uh securityperspective okay perfect thank youMatthew but now if you remember my myslide with a 4C you know from code tocloud you know uh we will go more deepuh about that you know what is a verygood and best practice to do is to doinfrastructure as code code scanning andalso to to to do source code uh scanninguh because you are all aware about thenumber of CV uh we have across all thedependency all the library uh you needto do that in the uh in the early stagefor that you can leverage tool likecheck off kicks for example next step iscontainer is a container image scan lotof people are doing container image kindthat is right you know but in additionof that what is very important is to dois to sign your container to be sure youhave the right container only thecontainer you allow running in yourkubernetes cluster I know that could besometimes challenging um for largeorganization ation but that is a verybest practice you should to put uh inplace in addition of that and myfavorite part is compliance control oryou will prevent misconfiguration or youwill prevent configuration drift in yourKubernetes cluster for that you canleverage solution like OPA policy kiverofor example and in addition of thatruntime security monitoring as you knowit's good to have your workload but it'sbetter you know if you can monitor thebehavior of what happen inside yourdifferent uh your different workload inyour Kubernetes cluster you know maybeyou have one pod we try to talk withmalicious botnet and you want to benotified aboutthat or maybe you have another pod whotry to do like lateral movement and youknow one pod maybe can uh can cancompromise you know the node and fromthe node move do anal movement to uh tothe cloud provider and compromise allyour entire uh cloud infrastructure andthat is very important to uh to takethat uh in consideration and monitorthat and be able to react to that iwanted to give you an example about uhdroles droles is also a very goodsecurity practice to adopt not lot oforganization are doing that but I willinvite you to do that because withdroles what you want to have you want tohave very small container image becauseif you have a very small containeraryimage that mean your attack surface isalso very smal�l you have less CV tomaintain less vulnerability and by theway you are more secure to that and youcan use solution like chain guard forexamplenow we will talk about preventmisconfiguration what I have alreadyhighlighted before uh to you but beforethat I want to take the time just tohighlight we have two type of securitycontrol in this case we have cloudvendor security control you know likeAzure policy AWS config uh GCP uhsecurity organization uh policy that isa security control you apply at thecloud provider level but that is notenough in addition of that you need tohave Kubernetes security control toprevent like to have container runningas root to have a container withextensive uh privilege for example andfor that you can leverage uh PSS opagatekeeper keo gs config lot of solutionexists in in the market please find totake the right solution for your uh foryour organization and be sure you havethis security control in place to havean ardent cub kubernetes clusterNow you know about security control asmaybe you already know we have threemain type of security control we havedetective security control that sometimeis the first step you know when youdeploy a kubernetes cluster you deploythe uh the security control as detectivethat mean that will only inform youabout what happened the next step ispreventive security control in this caseis we will block what happened from myexperience from day one I will recommendyou to deploy your cluster withpreventive security control because it'svery challenging to switch fromdetective to preventive and after thatwe have auto remediation securitycontrol you know when you have amisconfiguration to have an automatictask and automated script to fix themisconfiguration but that is not enoughwhat we wanted to do is to give morepower to the developer more flexibilitybecause this security control aredeployed at the cluster level at the andyou know the developer is notified onlywhen he try to push a workload to acluster and at the cluster gate and thatis not enough and for that it's veryimportant to give early feedback and forthat we can have directly securitycontrol in the IDE of the developer butit's still not enough I will explain youwhy because yeah a developer will willreceive a message hey your deploymentwill block because you have a containeras root but now you need to educate yourdeveloper education is a key point ofthe success and you know as securityteam is sometimes challenging because wecan scale you know uh to spend hours andhours to educate developer and for thatas match mentioned before you know weneed to have a self-service model and EIagent that could be a solution to giveuh more context to the developer andmore best practice to follow and sampleof code for example to resolve themisconfiguration i wanted to just showyou an example of a pot security uhstandard uh policy uh that is somethingyou can uh leverage for very uh simpleuse case and if you start withKubernetes but for lot of largeorganization that is not enough and youknow the reason of that is uh becauseyou want to have more advanced complexuse case sometime you want to manageexception for some specific uh use caseand for that I will recommend you tochoose a solution like kerno or opagatekeeper for exampleWonderful Max we have seen a lot oftools um that's beautiful uh and what wedecided all together is as platformengineer how wecould make sure we could shift left thedetection of the error but not askingthe developer to maybe run the commanduh and uh leverage the tools so hereit's a very busy image but that's toshow you where maybe all these toolcould fit and we decided to havetemplates we are using GitLab GitHubaction genkins multiple tools and wewill have recipes template reusable forour developer in order to um um havingmore test and failing fast as soon aspossible in the pipeline not waiting todeploy uh in the CD part but maybe inthe CI part continuous integration sothat where we have uh for example uhrunning a container in my CI tool is itrunning if it's not failing andproviding the rapid uh feedback to thedeveloper uh can I scan yourdependencies can I scan your um yourcontainer can I also check someconfiguration and I will fail on the CIpart I'm not even deploying somethinghere and then moving to the stack andthe workflow and here we would like tohave more control as well in the clusterin the continuous deployment pipeline aswell the idea of that is as platformengineer we have a responsibility toabstract all this kind of complexity andfailing fast and providing the feedbackearliertip number seven um what we have seenis where we failed in the past wastaking Kubernetes as our platformactually Kubernetes is not a platform issomething to build a platform upon rightand that's where um as a referenceDaniel Brian did this talk um and herein the industry you could look at thisuh at these different tools and thenotion of orchestrator how I couldorchestrate on top of Kubernetes ismaybe orchestrating some cloud resourcesKubernetes resources right and how Icould abstract and having a layer on topof Kubernetes so I highly encourage youto uh consider um looking at themtip number eight is we got a lot of umconcern about how my workload whenrunning somewhere will connect to mycloud SQL database in my cloud providerright here it's an example with AKSAzure but it's working with differentcloud provider there is also some CMCFproject helping for that the idea is togokeyless how your workload will connectthey don't need a long life token um uhto connect from the workload to thecloud provider maybe you want uh toleverage such mechanism with OIDCproviding a short life token and havingthis uh rolebased access control uhusing the list privilege uh concept aswell but that's a very uh convenient wayto um to not block and having issue withrotating key etcetc and we have seen eight tips todaythere is more and maybe that's a lot thekey secret is uh all about startingsmall you need to study what are thepain point what are the part you couldoptimize uh what you could achieve andhave this kind of backlog right souh we have seen how we could having ourum developer more productive uh havingthis uh velocity in mind and what wewant as platform engineer is havingstability and making sure we are stayingcompliant but we don't want to blockinnovation for the developer and what weare developing is this product mindsetvery important with that uh we couldhave investment priority making sure uhwe this seamless integration of thedifferent key techstakeholder some takeaways the securitycould be seen as a culture right and notjust tools or people blocking uh uhinnovation and creativity and combinedwith this platform as a product mindsetthat's what what we have seen uh verysuccessful uh so embracing the securityfirst approach is uh very important howwe could empower people empower thisplatform engineer uh understandingbetter the life of the security peopleand also making sure we could optimizestandardize and abstract from thedeveloper yeah maybe if I can share afeedback about that one point which canbe very uh successful if if you embeddedyour security team directly in theengineering teams you know in theplatform engineering teams they need towork together from day one and not to inthis case you know that will change themindset of the security people of theplatform people of the offs people youknow and work together work to have thesame the same goal and achieve thattogether if you increase thecommunication uh that will be uh betterforeveryone trust me coolso we want to call out all the resourceswe have been collecting and uh we haveused uh for the slide uh CNCF taxsecurity app delivery as well uh andsome other blogs articles so please lookat them the uh PDF uh format of thispresentation is already on the scheduleand you could also please evaluate thissession share your feedback we want toimprove uh this presentation and thereis a slide as well that you coulddownload we hope you enjoy uh thecontent of this talk please come by umshare your question concern ownexperience as well uh and we wish you agood rest of your cubecon thank you verymuch thank you2025-04-15 21:58:42.773346 � ��^� #��sA9lPp-6nJ8bIcheck check all right there we go holycrap hey guys wow nicecrowd okay we've got a lot to talk aboutso I'm just going to go ahead and jumpin i know a few of you are still comingin and uh there are a few empty seatshere and there so uh you know just uhlet people let people in as you can uhso good morning everyone very nice tosee you all my name is Camille Fornieri'm here with my co-author Ian Nolandand we're here to talk about our book onplatform engineering and specificallygoing through some of the some of thehighlights on starting and scaling aplatform team uh just to give you asense of the agenda for today uh I'mgoing to kick us off and just talk aboutsome basics how should you think aboutwhen to start a platform team how shouldyou think about the types of people youwant to have on that team and then I'mgoing to really speedrun through a fewthings about scaling these teams and howthat goes wrong and how the sort ofexecution process for platformengineering can go wrong then I'm goingto kick it over to Ian who is going togive you a case study on platform and uhyou know his his uh experience sort oftaking a team that he managed at datadog and actually turning it up andscaling it successfullyso uh CNCF is 9 years old let's be cleari don't know what this 10y year thing isabout kubernetes might be 10 years oldbut CNCF is nine years old and I haveemail evidence to prove it um and in thenine years we've had this incredibleboom in the ecosystem which is awesomeright that's great there's a reason thatwe've had this incredible ecosystem boomit's so much easier to you know startnew companies these days because you canget whatever kinds of sort ofinfrastructure support you need whateverkinds of products you need you caneasily provision stuff in the cloud܁�� #��;AEs3DBj2UgIEwelcome welcome day two of CubeCon rightbefore the launch thank you very muchfor your time being hereum we are very very excited to be hereum I'm Matthew Benois i'm Clative umambassador and I'm also customer successengineer at humanit i'm very excited tobe uh with Max today hi everyone uh myname is Maxim Kwell i'm a cloud securityarchitect and very glad to be here uhwith you todaybefore to start this presentation uhjust a quick disclaimer any vieweropinion expressed in this presentationare only my opinion and not necessarilyrepresent the opinion of my employer itsmanagement or its employeeso Max you you remember uh when Istarted uh my journey I was uh appdeveloper and I started my journey uh askind of platform engineer a few yearsago and when I run docker run the firsttime it was like what amazing like I wasvery uh mind-b blown uh then Idiscovered cubectl run wow okay I coulddo more stuff in my Kubernetes clusterright and then I discovered Elm and Idid Elm upgrade and I was able to deployeasily uh something more complex andthen I tried with Terapform apply and asa developer I was like empowered i waslike um very amazed but actually it'sway more than just running this commandright could you elaborate a little bitmore exactly Matthew you know uh as youknow security i�� andjust getting started is so so mucheasier than it was even 10 years agowhich is awesome um and so you knowyou're you're a new startup you're a newgreen field application you want to dosomething you have your application youthrow some YAML and other thingstogether and you throw it in the cloudand you're great um and in the processof course you create this glue code thisintegration code one-off automationconfiguration administrative tools thisyou know this stuff that kind of bindsthe application to all of the underlyingbuilding blocks you've chosen to run itfine natural no problem um this is allwell and good except that you knowthere's a reason we call it glue coderightglue you know is easy to stick thingstogether but when you want to likechange the things that are stucktogether with the glue it actually getskind of hard and when you're a smallcompany and you're a small team thisisn't so much of a problem but as youstart to scale and you have moreapplications and you have more glue codeand you have more choices that you'vemade in different ways across theunderlying infrastructure and you havemaybe some pieces that you're you weresupporting in sort of a communal wayyour CI/CD maybe or your databases orwhatever and you grow and you grow andyour company grows and your teams growand all of a sudden nobody wants tosupport those communal products and allof this glue code is starting to slowyou down right you need to upgradeKubernetes you need to upgrade EKS andall of a sudden every application teamis sputtering and scrambling you need todo security patches right and so this isthe point at which it makes sense tostart a platform engineering team itdoes not make sense to start a platformengineering team when you are a smallcompany that is not really useful rightit's it's overhead platform engineeringprovides a lot of benefits we agree butultimately it's a little bit of overheadand so the time to start a platformengineering team is not when you've juststarted your company but when you'vescaled to the point where you'reactually having problems with your gluehaving problems with your shared sharedinfrastructure that isn't really beingstrategically managed when you haveproblems 50 engineers 250 engineerssomewhere in that neighborhood is reallywhen we think it makes sense to startthis now um it's important to say thatthis doesn't mean you won't have thingslike wikis with shared documentation onthem but when you're a smaller companybut when we're talking about platformengineering team we're talking aboutbuilding platforms we're talking aboutbuilding stuff and we sort of like thisdefinition when we talk about platformswhich is the sort of foundation ofself-service APIs tools servicesknowledge and support uh that is youknow curated it's it's created itcurated into sort of a product offeringthat application teams can use todeliver features more quickly right veryvaluable um so we have these kind ofthree main goals that we're reallytrying to achieve with the platformsthat we're building our first goal is tomanage complexity ultimately all ofthose different little underlyingbuilding blocks we're talking about thatis complexity and the more of them youhave and the more different choices youmade the harder it is to reason aboutwhat's going on for anyone butespecially for application teams rightpoint number two is we want to actuallycreate leverage we want to deliver morewith fewer people but we also want tohave the right concentration ofexpertise um that can do things thatwould be hard for any one applicationteam with a few you know thisapplication team has like one or twoDevOps engineers and this team has youknow a couple of different people andit's relatively hard for teams like thatto have the concentrated expertise tobuild really good solutions in certainspaces so we want to be able toconcentrate that expertise so that notonly can we do big things with fewerpeople but also that we can do them atall and then finally of course we wantto improve productivity we want to makethe application teams more productiveright um and so these goal�s inform thetypes of engineers that we are trying tohire and for the types of people we'retrying to have in our platform teamstarting with managing complexity sothis is probably one of the morecontroversial points that Ian and I arevery firm on which is that platformengineering is software engineering itis a strong base of software engineeringit is not just sort of the next phase ofDevOps or whatever but it is in facttaking software engineering to buildabstractions that manage complexity itis important that you are willing tosort of engineer and apply thosesoftware engineering principles if youreally want to be able to create theright abstractions and manage thatunderlying complexity this can beanything from self-service interfacesbuilding out multi-tenency you knowactually modifying the underlying opensource that you might be using uh addinguh enhanced APIs um guardrails securityquality guardrails building out you knowsort of combined products that yourplatform that your that your applicationteams need that can do sort ofspecialized things so that they canfocus uh if you are not willing to buildout software yourself in this team it'svery hard to actually manage thecomplexity you can vend infrastructureand great your application teams nowhave access to that vendedinfrastructure more easily but theproblem is that these underlyingcomponents each have fairly broadinterfaces that they expose and thecombined exposure of all those broadinterfaces is the complexity we'retrying to manage here right so softwareengineering is a pretty core part andyou're going to want to make sure youhave software engineers on your platformteam okay but software engineering isnot everything uh and we also want tocreate leverage okay so another problemthat a lot of platform engineering teamsseem to have is that they think that itis like this very thin layer so we'rebuilding software maybe but we're notreally operating anything ew operationswho wants to do that a lot of softwareengineers don't want to do it umunfortunately it's hard to build a greatplatform if you're not willing tooperate things for many reasons one ofwhich is that it's really hard to getand maintain budget and funding if youare not seen as building really critbuilding critical systems and supportingthem right now in this cloud erasupporting and operating is not likeyou're racking and stacking hardware anddoing all of that stuff but there arestill operational components and wethink it's pretty important that as muchas possible platform teams ownoperationally the underlying componentsthat they're building on top of right umthis does mean that you are going toneed people who can build reallytrustworthy solid foundations ofoperational excellence right you needpeople with strong system skills yourDevOps SR infrastructure engineerssystems engineers who are willing to goin and really figure out when you haveweird issues what's going on when youneed to be really resilient reallyscalable really foundational to abusiness how do you do that operationalational ownership and operationalexcellence so you do need that sort ofsystem skill set in your platformteam last but not least if we want toimprove productivity we cannot go theroute of the old school centralinfrastructure team that builds what itthinks is right for you and has a verycost center mindset to what it choosesto do right uh old school infrastructureteams of which I have been on a few umthey do provide a lot of value tobusinesses particularly scared scaledbusinesses right um they are often verygood at operating things but ultimatelybecause they are seen as a cost centerand because cost is what they end upoptimizing for uh they often end upproviding you know providing offeringsthat developers just don't like and thatare not really in the interest ofimproving the productivity ity of thosedevelopers but sort of maintaining thecosts of the company um we don't want tobe in that world anymore right that weactually think that developerproductivity is really important andthis means that we need to take aproduct focused approach we need t�oactually not just think about what wethink is the right thing to do what wethink is easiest to build and operateand cheapest but also what do the actualapplication teams that are going to useour platform what in the world do theyactually need and what do they want nowwe are going to have to curate itbecause you cannot offer every singlething on that CNCF diagram to all theapplication teams and have any kind ofsustainable footprint there have to bechoices made but those choices need tobe made in the interest of the softwareengineers that are going to be using ourplatforms the application teams and theyneed to be made with their input as avery primary decisionpoint okay so platform teams they blendsoftware engineering system skills and aproduct focus and I think it's importantto note that um you may not be able tohire product managers in your early daysor ever on your platform team partlybecause it's really hard to findplatform product managers as someonewho's hired a few of themum that is okay right so you product theproduct exercise may need to come fromyour managers or your senior IC's andyou need to have that sort of customerempathy skill set throughout your teamum you don't necessarily need to hireproduct managers and in fact we don'trecommend again that your early team uhstart out with a bunch of productmanagers on it because that has its ownproblems as I'm sure many of you know uhokay so you've got a team it's probablynot this perfect team that we've justdescribed it might be you know a handfulof software engineers or it might be ahandful of people in SR and you know acouple of software engineers whateverbut you've got whatever team you've gotand you're going to try to evolve it inthis direction um and the early thingsyou're going to do are going to be a lotmore about proving value and impact tothe company than building out yourbeautiful dream of a platform right sowe very strongly recommend that in yourearly days you make sure that you aredelivering wins on a relatively quickbasis and you are earning the trust ofthe organization that you know what todo right I think big mistake thatplatform teams make is going off formonths at a time to build things thatthen they come back and say here is mybeautiful offering uh that is not a veryproduct focused way to deliver and it'salso a really bad idea when you are asmall newteam So speaking of scaling very quicklyuh I wanted to touch on our productplatform plat uh execution life cycleit's pretty much like your your sort ofstandard product life cycle in in manyways um but there are a few differencesthat I think are worth calling out hereso first things first you're all here atCubeCon great you are learning aboutsort of the growth of the ecosystem andnew offerings and you're going to bringthat back to your companies as ideas ofthings you should adopt um you shouldalso be doing market research withinyour market which is the engineers atyour company that are going to use yourplatform to figure out what theyactually want what they have appetite toadopt right because people will say theywant things that they have no time to dothe migration onto and actually use buteven more than that when you want tofigure out what to build a lot of thetime what you're going to do is look atthe application teams and see what theyhave already built that could be usefulmore broadly at the company that manytimes platform engineering teams are notdoing zero to one innovation a lot oftimes the work and the challenge is notthat zero to one innovation it's takingsomeone's zero to one in innovation thathas been proven to be necessary and youknow in demand for your company andturning it into something that lots ofother teams can use which is a bigchallenging exerciseum another mistake that platform teamscome sometimes make in that sort ofdesign build exercise is that they sayoh well we've got a good idea of what todo there's open source that does this orthere's something in the company thatyou know one team has made work for themthat we think we could we could use thisidea and then they say "But we're �goingto completely rebuild it we have to do acomplete rewrite of this so it willscale." Bad idea right um again the thedesire to go away for months or evenyears and build and then bring out theperfect thing is a common failing ofplatform teams and it is a bad idea youneed to think about how you're going toplan set milestones communicate progressget early adopters get incrementaladoption where possible you need to doyour product work of actually having anidea of what success looks like morethan adoption adoption is not the onlysuccess measure you are presumablytrying to do something with this whatare you measuring how are you measuringthe impact of productivity or the impactof scaling or the impact of whateverbeyond just did people adopt it or notbecause these are internal products youcan force people to adopt them it's nota great measure once you've got proofthat you think this is a good idea andyou think that people actually want toadopt it uh you want to make it easy forpeople to adopt it build those on-rampsfor your platform finally great you'vebuilt something people are using it it'swonderful hopefully what you built whileit will require constant amounts ofiteration to an extent you want toideally build something that doesn'trequire change every single time anapplication wants team wants to do likeanything slightly new that is often anindication that you have too tightlycoupled the platform itself to theapplications right you you do want someamount of abstraction and some amount ofability for the application teams to dotheir own work here um but you stillhave lots of stuff that you need to workon and need to worry about as theplatform has been adopted and sort ofstabilized right you've got to provideuser support this is something peopledon't think about that much but likeit's not just on call it's I don't knowhow to use this how do I get theplatform to do this how do I do thatmaybe AI will solve this problem for usbut right now we all have to actuallythink about how do we handle all thoseuser questions that are distracting ourdevelopers and support and operationsfolks you know day in and day out uh howare we scaling how are we thinking aboutSLOs's how are we thinking aboutstability finally one of the big thingsthat I think platforms have a realopportunity to make a difference for ourcompanies and our application teams ismigration pain right now every timestuff changes application teams have todo work to deal with that migration eksupgradesand Python 2 to3 good lord and you knowthis that and the other right there's somuch pain exposed to application teamsand again our old school classic centralinfrastructure teams really pushed a lotof that pain out to the applicationteams yes they did their own work therewas plenty of work for them to do butthen it turned into these horrible hugeproject management exercises that werejust constantly hit can we abstract thatpain and handle it ourselves in theplatform as much as possible withoutpassing it through to the applicationteams i think this is a greatopportunity for platformswoo all right so just to wrap my sectionup and hand it over to Ian um there arefour questions that I think are prettyimportant to be asking yourself asyou're building out and scaling yourplatform team the first question is areyou writing software you may not bewriting it everywhere all the time butif you're not willing to really take astep back and say I'm not just doingautomation we're not just vendinginfrastructure we're actually thinkingabout what software might look like tomake this problem better uh then youmight be missing an opportunity right onthe other hand if you're only writingsoftware and you're not operating anycritical systems you are puttingyourself at risk because you are notreally delivering something that is seenas essential to your company almostcertainly i I am sorry to to to say itthat way but it's just true it's veryeasy to get you know your budgetdestroyed uh when you're seen as a teamthat's like kind of an optional nice tohave right people who operate criticalsoftware are not an �optional nice tohave are you talking to your customers ithink we've beaten that one to death umso finally are you communicating withyour stakeholders stakeholders are notnecessarily the same people as yourcustomers your stakeholders are theirmanagers your stakeholders are the headof product your stakeholders are anybodywho has a big say in the budget processwho can decide what your headcount lookslike who can decide how much we areinvesting in this exercise you need tobe communicating the value of your workto those people in ways that they canappreciate and understand rightthat means you know knowing what theycare about uh making sure you arepushing that value out there regularlycommunicating regularly um and you knowdoing a little bit of sales and so withthat I'm going to hand it over to Ianthank you Camille uh so my name's IanHolland i was a VP at Data Dog betweenuh 2019 and 2023 uh and today I'm goingto talk about one of the teams thatreported to me reported up to me formost of that time and it's interestingin terms of a platform team in that theyhad to restart uh four times before theyactually found a mission that worked forboth their customers and themselves tobe sustainable uh so this is actually avery very common team you see in theindustry which was a data data platformteam a data reliability team they hadCFKA Cassandra Elastic Search Postgressuh it was a sixperson team data Dog atthis point was like growing rapidly sothere was about 300 uh product engineerswho were using these platforms indifferent ways uh a key thing about datadog is we'd already decided to go on allthree major cloud vendors so so wecouldn't just like use you know Google'sbest offering Amazon's best offering wedid sort of need to have some amount ofinternal uh internal development here uhand the key thing is this this team whenI started was was already in operationalhell they were they were seeing 50incidents a week uh you know sort of itwas clear that burnout was was on thehorizon and so so so what was happeninghere and and so I've sort of set up abunch of slides like this is in theteam's mind their mission was to own thedata platform automation on Kuberneteson all major clouds you know this wasback when Kubernetes didn't supportstateful so well all three clouds aredifferent so this is a pretty goodmission in theory to like bring a teamtogether and so the plan was like yes wewill operate this was a team that washappy to own operational encore for whatthey did but they they thought they weregoing to spend all their time buildingthis automation uh you know the realityas we've sort of seen with the 50incidents a week is that they were inoperational hell uh this basicallythings were on fire all the time and soso if if you look at what happened hereit's sort of a very common story I thinkwe see across the industry is is theythey thought that they'd be spending alltheir time uh writing automation um butbut really what what was happening isall their time was going into reactiveoperations that was happening under likeevery new product uh teams load and theproblem with these open source systemsis they're so broad in their interfacethat if you just offer them up toproduct teams to use in any way thatthey want eventually they're going tofind some type of hotspot some type ofhot shard and then they want experts tohelp debug it and so so the team'sproblem was uh they were very very goodat reactive ops they didn't want to begood at reactive ops but they were uhbut then they were sort of blamed bytheir customers like why aren't you infront of this like what what you knowwhy what's so hard about operating thesefour platforms um the weirdest thing wasthat they were constantly getting thisblame but then not they're also gettingthis blame for well but also why aren'tyou taking on all these other opensource that we use you know we want touse Reddus we want to use Foundation DBand and so the team was sort of in thisplace where they had unhappy customersthat that were working you know way toohard to to try to keep them happy andand it was just a burnout that th�ey felttotally unappreciated and so so so itwas clear that there needed to be areset and so the first reset was sort ofI think taking a leaf out of maybe somestuff that Netflix had published whichis let's just get out of the game likeif someone wants to use RDS let them useRDS like if they want my SQL RDS why dowe care like if they if they own theoperations and the cloud owns theautomation uh we don't need to be inthis game and so uh again in theory thisthis sounds like a really really goodidea so the plan was like you can seethe happy product engineers uh using thethe open source uh no so using thevendor open source uh the reality was uhsort of the opposite that like againthings were on fire and that the persongetting pushed in was one of a platformteam to sort of deal with the fire andso so what what had actually gone wronghere was the team believed they couldsort of draw back and sort of become aconsulting team around data architectureum but the biggest thing is if you'velike tried to use Postgress across allthree major clouds you'll see it's it'snot actually not the same that there'sthese slight differences that come backto bite you particularly under load andso the problem is that the team stillexisted so when when uh the productteams were having these issues uh youknow on one cloud with with one systemthe team was still called in to sort ofyou know come in and you know solve thecritical issue uh cuz you know why elsedo they exist otherwise right uh screenhas disappeared u and so so they wouldyou know they had this idea that theywere getting out of ops but actuallythey got further into reactiveoperations uh this had two problemsfirst of all they now lacked a lot morethey now lack context on what theproduct team was done and they had neverbeen consulted on the use case and sothey actually knew less the incidentswere worse uh the further thing was likein terms of any follow-up like now theysort of lost influence that because theyweren't operating or because they weretrying to get up the game of operatingthings what what did their advice matterif if you thought you knew better on aproduct team you could just go and do itand and so so again we sort of saw thissolution uh wasn't working at all uh soso the next reset was uh SLAs's andSLOs's this was like you know taking youknow the S sur blue book believing likeif we just document uh everything thatwe can around uh uh you know what weneed to what a product team needs to doto hand off their data platform onto usif we just document all the details uheveryone will be happy and so on theleft you see that the happycollaborating product and uh uh uh uhplatform engineers uh on on the rightthis is what actually happened is isthat there was a lot of disagreementsand the the person in the middle is thesad executive who had to like adjudicatethese results and uh the what wasactually happening is there's this sortof idea that if we just set expectationsand customers were saying like oh setyour expectations and uh you know thiswill make you know then we will knowwhat we have to do the reality was theexpectations they had to pull back thesebroad offerings to such an extent uhthat they basically required the productteams to rearchitect their systems rightas the uh these systems were inproduction under heavy load and you knowthey were getting yelled at by theirmanagement for not shipping featuresfast enough so so SLAs's and SLOs's didnothing to actually improve the coreproblem that someone needed to do thiswork uh to do it and so so what happenedin reality is yeah each handoverrequired an executive inter interactionthat that that was sort of me um productteams uh you know they didn't want aprocess at the end of the day they justwanted this problem solved they wantedsomething you know abstracted and solvedfor them and and this was particularlybad because the platform team became tobe seen as bureaucratic that theypreferred they preferred rules andprocedures over actually doing workwhich wasn't the team's background atall uh but this is what the the whatSLAs's and over focus on SLAs an�d SLOs'smade themappear okay so so reset number three waswas build it and they will come and soso at this point the um the the companyhad hired a bunch of X-fang engineersand if you know anything about Xfangengineers at a smaller company like thesolution to your problems is exactly thesystem they had at their last companyand so uh so you could probably guessthe company for in particular for thisone uh they were they were the solutionwas basically said look we need to buildthis in-house we need a reducedinterface you know SQL is toocomplicated we need to just do key valuestore uh and in particular they said ohbut you know we know the businesssomeday will need global data stores soif we bundle these two things togethersurely the business will be happy likewe'll be seen as precient we're sort ofahead of ahead of a curve uh the realityon this one was the developers who justreally wanted to use SQL and so thissort of this mismatch between what theywere building and what the developersactually wanted and so you know this isall like you know the plan was you knowunify with a simple API we we we know wecan scale because because it's so simplethe problem was like you know it'sactually four years later now data Dogstill does not do multi-reion becausecustomers turn out not to really wantthat it has complex uh things around youknow costs and outages and so so theywere way too far ahead of the curve inthe meantime what the business actuallywanted was the product developers to beable to move fast they just wanted SQLto be able to do this and so uh whatsort of happened after they built thefirst version of this which did getusage it actually got usage within theplatform team uh in particularly forconfiguration but it got no usageoutside and so then they came back to meand sort of said hey we need to buildour own in-house uh multi-reion SQL youknow they basically want to buildspanner or cockroach and that's ofcourse is a massive undertaking and andbasically led once more for you knowjust the this reputation of that teamthey're they're out of touch with thebusiness even as again they were stilloperating crucial businessinfrastructure but but this this builderthan they will come approach theirbiggest investments really got peopleoff off off board okay so fi finally weget to the the magical product thinkingand like you know this this always youknow Camille's had explained it a littlebit already i always find like there'sthis tendency in platform engineering tolike oh you you rub product on it andand it all gets better the the way thatI so in this case I'm going to talkabout how I think of product and I thinkthere's there's three main stakeholdersthat there's this the health of theplatform team is what they're doingsustainable so everyone isn't going toquit in 12 months that's number onenumber two is of course the the internaldevelopers are you doing what they needso that they they are more effective intheir jobs are you getting leverage thethird thing though is is very veryimportant particularly when it comes tostakeholders which is the business areyou actually delivering things in thenext you know you know businesses haveshort short attention spans right areyou delivering things in the next 12months that the business actually valuesif if you do those three things togetherthat then I think you're on on a goodpath to doing platform engineering and Ithink I think product thinking is whatbrings them all together in a strategicway so the plan here of course waseveryone gets around the tableeveryone's going to be happy uh before Ishow you the the outcome I I want to goa little bit into the details um so ifyou actually look at what it looked likeit wasn't sort of rocket science it wasbasically looking at that config casethat global config case and limiting tothat so it's purely for configurationthen then for actually SQL which theteam wanted what what they did is theywere interviewing customers is theyfound that there was these constant painpoints between people who wanted tocouple together elastic search radus anduh and postgress you know essentially asa caching searchable database like youknow probably problem that all of yousort of see and they sort of realizedthat they could raise the abstractionhere and if they raised the abstractionteams would actually take somerestrictions on how they use elasticsearch and how they use Postgress ifthey were actually getting value out ofthe combined offering um the key otherthings they sort of did in terms ofthese product thinking is they did notlike build it and then throw it over thefence the early offerings were very muchpartnerships with key businessinitiatives um the other thing they didis they avoided this future platformlanguage so the last one when they theydid it it's like this is what everyone'sgoing to have to move to someday whenthey built these two systems they dobelieve that they are going to replaceeverything someday but they sort of sawthat the fear that the prior type oftalking had done where everyone likefeared a migration or feared wasted workand so they totally avoided the futureplatform language they're like "Heywe're just building useful things." Andyou know may maybe someday you want touse them too okay so so so looking atthe outcome of this one it's uh so theplan was of course happy developersaround the table and then uh you knowvery very memew worthy Studio Ghibli uhyou know developers are happy around thetable and you see in the backgroundusage is going up so so the outcome ofthis one what was actually finallysuccess uh and so if if you look at theactual reality here you know it tooktwice as long you know it's buildingthis platform was not easy uhparticularly getting getting it to scaleunder real load uh but but the socialproof the the fact that they wereworking well with development teams uhwith product teams on things that weredelivering business value just justmeant that it was interpreted waydifferently you know this delayed sortof execution like it wasn't seen as aproblem it was just seen you knowengineering is hard things are going totake a while um the big thing is it'ssort of coming back to the stakeholdersthis is what the this sort of seeing itall work together is in my mind a largepart of what made stakeholders happy togive extra headcounts to the team forthe first time it wasn't seen as aplatform team just like buildingplatforms for platform sake it was aplatform team serving the business andso really for the first time I had mypeers like happy happy to give headcounto over to the platformteam so so sort of coming back toCamil's four questions I just uh put puta little bit of slant on all of them soare you operating critical systems uhthat that's absolutely important but aswe you know probably everyone in thisroom just be very wary of just vendingopen source where you're on call for anyparticular problem it might have becauseyou haven't you haven't at allabstracted away the ways that you wantit to be used it's very very hard tosurvive the operational burden of thatuh are you running software very veryimportant but really beware of thatbuilder than they will come notdelivering value for 12 months justgoing in the back room we know what theywant and they'll love it when they seeit i don't think I've ever seen thatwork in my career uh are you talking tocustomers that's really important uh butbut one thing with customers that thatsort of anyone who's worked in productknows listen to their requirements donot at all listen to their solutionslike it is your job to look across allof the different customers with alltheir conflicting uh requirements andfind a common theme and and that waspart of a problem of his team is theythey were often just thinking bylistening to customers going to makethem happy and did not uh finally areyou communicating with your stakeholdersand uh you know if they're not if you'renot communicating with them they're notgoing to know when you're succeeding andthey're not going to give you whatusually in platform what we need all thetime which is more headcount to actuallymake things work and so with that I amdone and uh thank youthank you2025-04-15 21:58:43.335774�ut often whatdevelopers end up doing is figuring outwhat the domain is wherever they'retrying to deploy and sticking it in aconfiguration variable for theirapplicationthat way the application can be deployedto a staging environment or a productionenvironment with different domain namesand the application stillworks realistically this is often howapplications deal with backing servicesas well or connected resources imentioned earlier databases backend apisso the most common practice is for theapplication developer to put somethingin configuration into the environmentrepresenting the location of the servicethey're trying to dock to that way theyhave a different database in stagingversus production they put in adifferent uri the application stillworks so essentially the value of havingthis place to store configurationinformation gives you portability youcan run the same application deploy itlocally deploy it in staging deploy itin production and all you have to do ismodify a few configsbut when it comes to workload identityand the reason that we need it isbecause there's a problem when we starttalking aboutsecrets so this is a very common patternwe see this across a lot of apps atheroku where people need authenticatedaccess to a thing and so they take theirsecret data and they store it in aconfigurationvariable and there's a few problems withthisso first of all this kind of a modeltends to favor longived secrets becauseconfiguration tends to live for thelifetime of the application so if yourapplication is running for weeks betweendeployments that secret will stay thesame for the entire length of theapplication secondarily it's hard tokeep in sync if your service providerwho's providing your database gives youa new database password and you have togo update your application or thatdatabase class password gets leaked andyou need to tell the service provider toupdate the the password you then have toorchestrate deploying the new credentialin one place and the other place at thesame time the other issue is that inthis form it's easier to accidentallyexpose the secrets and part partiallythis is because the environment tends toget logged in a lot of placesum the fact that it's in multiple placesis a problem the fact that it's oftenlogged and so what we see are these sortof cascading failures where uh thesecret is is easily exposed and thenbecause it's easily exposed it has to berotated and when it has to be rotatedyou have to worry about these otherproblems of keeping it in sync and beingvulnerable for a long time as thatrotation goes through i've heard storiesof companies with hundreds or thousandsof microservices that have a largecredential leak where they have torotate all of their applications at onceor over a period of multiple weeks andduring that intervening time it's not avery fun time for the security team atthatcompany so what i'm going to talk abouttoday is how we could build somethingbetter so what if we could haveshort-lived connection scoped workloadidentity credentials managed by theplatform that are standardeverywhere let's go through those piecesin a little bit more detail so when isay short-lived that means 30 minutes 60minutes for the credential to rotate itminimizes the risk of exposureconnection scoped means that it's notone credential for an application foreverything that it talks to it's adifferent credential for the applicationfor each thing that it talks to and themain value of this is you don't want tohave what's called the confused deputyproblem where service a is talking toservice b with their credential serviceb takes that credential and goes talksto service c and pretends it's service aso that's that's a risk of exposure aswell we want to be platform managedbecause that's the value ofstandardizing on something like this itmeans the application doesn't have toworry about the complexity the developerdoesn't have to figure out how to doworkload credentials separatelyeverywhere they want to deploy theirapplication and standardize means thatit's portable across platforms andacrossenvironments so let's� talk about how wemight get there so the first step islet's standardize the way that we storecredentials in the environmentso that basically very simply means wehave a standard way of representing athing that we're talking to and thecredentials associated with that thingcould have a type like user passwordsecret etc associated withit the next step is to make the platformresponsible for synchronizing thesecrets between the front end and thebackendservices so we have a secret we changeit in one place with the platform it'sexposed to the the front end and theback end it's shared we haven't gottento shortlived yet so that'snext the next thing we can do is replacethose usages of secrets that areusername and passwords or secret sharingwith a different type of credentialthat's shortlived now we don't have toreinvent the wheel here i'm notinventing a new type of secret that'skind of like building your own ssl wedon't nobody wants to do that you don'tbuild your own crypto we all know thisas developers right so we can use someof the existing standards for example inthe in the cloud native community wehave spiffy there's two standardsalready there there's the um workload iduh represented with the jwt that's oidcbased and there's a certificate basedone x509 we can just use one of thoseso in the example of oidc here theplatform provides the token to theapplication in a known location that'sthat it can share in the credential umin the environment and the applicationis responsible for loading thosecredentials and using them to makerequests and then the platform alsoprovides the black the back end with theinformation on how to validate thosecredentials so that it knows what isconnecting to itbut there's a couple of problems herethe first one is that if you've evertried validating oidc while it's astandard is somewhat complicated it'snot exactly the most straightforwardthing you have to remember to check theexpiration date you have to download thejw uh json web key set you have to cacheit because it gets inefficient if you'redownloading it every time you have todeal with network outages you have tomake sure your clocks are in syncthere's a lot to do here the secondproblem is if we start thinking aboutx509 as the other option here is that xthe the most common way to validateidentity between applications using x509is to have to use mutual tls and inorder to configure mutual tls yourbackend service needs to be the oneterminating tls it can't actually usemutual tls unless it's terminating thetls connection and as we mentionedearlier in most platforms it's not thebackend application that does that it'sthe platform so it really needs to be inthe platform that that tls terminationand identity validationhappens so what does that mean thealternative here is to proxy theincoming rest request between the frontend and the back end so when youconfigure identity for your front endand backend application you're providingcredentials to the front end and you'resticking a proxy in front of the backend maybe it's in your already existingproxy that you're using to terminate tlsand that proxy is responsible forvalidating the incoming credentials andproviding an identifier to the backendapplication so it knows where therequest came from and it can do this ina standard way like providing aheaderso summarize on the sending side theapplication has to read the uri of theservice from the environment it looksfor a matching credential it reads thetoken and it sends the token in anauthorization header there is one morebit that it has to worry about which isrefreshing the token so i mentioned thistoken is short-lived so there are a fewdifferent ways to accomplish refreshingthe probably most compatible and obviousway to do it is to watch the token fileusing i notify and whenever it's updatedyou get notified and you can reload itbut there are other ways that would workalmost as well so for example you couldreload the token every time you're aboutto make a request if the requests areinfrequent that's not such a bad optionyou can also parse the oidc c token ifyou want� to do a little bit more complexintegration and check for expo expirycoming up soon and refresh it um andthen the fallback option is you can tryand make the request get a 403 and thenredo the request with the new token thenext time on the receiving side it'smuch simpler the only thing thereceiving application has to do is checkfor client id in the request header andit can trust itso going through these steps you mightthink well what about step five like wehave a proxy on the incoming sidewouldn't it be much simpler to have aproxy on the outgoing side as well andi'm going to warn you that they'redragons here uh this is complicated andit's not completely out of the questionthat something like this can work butthere's a lot that you have to thinkabout if you do an outgoing proxy firstof all you're man-in-theming everyrequest so if you're making a if theuser application is making a request toa known um endpoint that has its ownsslert you're going to have to serve upan alternative ssl sort that doesn'tmatch the actual one because you don'town the sslert from example for examplegoogle.com um additionally often there'sretry loops that happen in the proxywhich can hide reconnections and retriesfrom the incoming from the front-endapplication and that means that suddenlythere's extra latency that's beeninjected that it doesn't know about andthat can cause failures so the good newsis if we standardize on this approach ofhaving a uri and then having anassociated credentials if people manageto build proxies if a platform doesbuild a proxy and they're comfortablewith it and it works they just don'tprovide a credential in the environmentand the client application works thesame way it doesn't need to worry aboutum the connection it just knows thatthis is authenticated by the platformokay the next big challenge is there's ahuge chicken and egg problem and thishappens whenever we want to updatestandards and move to something new inthis case application developers wantsomething better platform developerswant something better but theapplication developers can't move untilthe platform side moves and the platformside can't move until the applicationdeveloper side moves and so the approachthat that we've come up with for this isto create a cli which can act as aplatform if the platform doesn't supportthis kind of feature already um i liketo think of it as a polyfill soborrowing something from web standardsand from browser standards often whenthey're introducing new features in cssor javascript they will provide animplementation that's not native thatmight be a little bit slower toimplement the feature so everyone canstart using it until the browsers areupdated to support the feature nativelyso the idea of this is the same and theadvantage of this is one of the biggestchallenges of this kind of a system isyou don't have an ident identity locallyso when you're in development mode andyou're trying to connect to a backendservice you need an identity becausewhen you're in the production or astaging you need an identity and so thissolves that problem by allowing you tohave a faux platform running on yourlocalmachine and i'm going to give you a demoof how it wouldwork and i'll say that you're in luckbecause usually all of my demos would bejust a bunch of curl commands in aterminal but thanks to vibe coding iactually have a front end and backendapplication that you get to see that uhlook a little bit nicer so let me startby justrunning the front end and backendapplicationand there's the frontend there's the back end okay sobasically i mentioned about the frontend and the back end they're not runningin the under the factory cli right nowthey don't have the polyfill so at themoment if i make a request and i don'tpass in the client idum what you'll see is the back end isgiving me unauthorized because itdoesn't see xclient id in the headerwhereas if i make a request with clientid then it says ah yes i know who thatfront end application is well that's notexactly what we want here because i justany client could just pass inxclient id in the header and t�hen itwould be authenticated so what weactually want to do is wrong way um runthisapplication with the factor cliinstead let's give it a little refreshnow if i run it without client id istill see unauthorized and if i run itwith client id i see unauthorized so nowthere's a proxy in place between thefront end and the back end applicationthe backend application is isidentifying um the credentials beingsent by the front end application whichin this case are none and then uhpassing that uh client id header if itexists to the back end backend doesn'tsee the client id header anymore and soit says nope you're not authorized andthen on the other side let's do the samething let's just run this front-endapplication in the factorcli and reloadit and now when i request now you'llnotice it says that i have a tokenloaded so if i make the request withoutthe token it's unauthorized but if ipass the token in it's authorized okayso the the really cool thing about thisis that the the difference betweenrunning your application with andwithout the polyfill is essentially acouple of extra words you enter in theon the command line now let's brieflyjust look at what were the code what wasthe code that i had to do to make thatwork so on the sending side here it iswe load the backendcredentials we check the file that it'spointing at we load the file we start awatch to see if the files changed andthen we're just taking the taking thatdata from the from the load and we'repassing that in every request to thebackend backend even simpler right all we need to dois check for xclientid and return unauthorized if it's notthere so we've got something that'sprettysimple and that can be used but let's goa little fartherso briefly the features of this factorcli and by the way this is all prototypefor discussion um one of the things thatwe're trying to do is start thisconversation by introducing an identityfactor into the 12-factor app manifestoand trying to get the so if you disagreewith some of the details i don't reallylike xclient id as a header type comehelp collaborate with us and figure outwhat it should be right we we don'twe're not attached to the specificdetails of the implementation i'm tryingto get across the idea of us all as acommunity getting together and finding away to do this so that every applicationdeveloper every framework can implementit the same way and we don't have tohave this same problem that everyone issolving at every companyindependently so uh workload identitycan manage local oidc it can use ozeroit can use kubernetes service serviceaccounts um incoming valid validationvia a proxy it'll automatically loadyour end file and detect if it's changedto give you an easy way to to change theconfig of your application and itactually has an another interestingfeature which is in local developmentwhen you need to do uh incoming requestsor you need to provide the j json webkey set um you need to be exposed on theinternet so it can start up enrock foryou and expose your local application onthe internet so that you can do thosepointto-point connections remotelyokay one more demo i was going to dothis live uh but i was informed thatdepending on the internet is tough soi'm just going to talk over it um so thenext step is let's take the backend appand deploy it on a on an actual platformi'm going to deploy on heroku because iwork there um and in this case herokudoesn't support this yet but i can justadd a build pack which i've createdwhich will wrap your application upautomatically in the factor cli and runit so we're using the factor cliremotely as a polyfill as well so we'restarting up the application as it'sbuilding all i'm doing on the on theleft side is now that i know the uri ofthe backend application i'm setting anenvironment variable um over there onthe on the frontend application and thenthe build's going to go through you cansee it's building from my source codeit's just a go a pretty simple ghost uhapplication pulling in all thedependencies building it up wrapping itwith factor and then i'm running thefront-end application using uh� thefactor cli on the front end as well andyou'll see there that enro has is uhbeen is exposing theapplication so that uh the receivingside can validate the tokens that i'mthat i'm passingin and in a moment we'll be going andseeing that application in action uh invery similar to what we saw before sothere's the front end here's the backend again make the requestand if we request without a token we'regetting unauthorized and we'rerequesting with a token oh wait we'realso getting unauthorized what happenedthere okay well the issue is that iconfigured the front-end applicationwith the back end but i didn't configurethe back end with the front end and so icopied the token values out of my nicelittle front-end application and i set aconfiguration option on on on thebackend application telling it uh whatthe front-end client creds that it needsto validateare and then what we'll see when we gobackis that we have we refresh the back endnow i'll have you notice that when irefresh the back end here um we lose theold requests i'm going to talk aboutthat in a second so make the requestagain without a token now with a with atoken we see that it's been validated itall works so the same thing i was usinglocally i just use on heroku in 3minutes very simple now about thisbackend piece not working where it'sabout the reloading of the display onthe back end wouldn't it be great if wecould store the previous requestssomewhere hey maybe in like an objectstorage system that's a remoteapplication and so the the extra littlebit we're adding here is the backendservice also has a set of credentialsand it can use those credentials to talkto a remote web service like aws whatyou saw me doing in there in the amconsole is configuring aws to match arole to um the token that's beingprovided automatically by the factor clii have to set a few environmentvariables like the ro arn here and theni can just use the aws sdk with the webidentity token file to store things ins3 and so now i can make new[Music]requests and on the back end these aregoing to be stored in s3 i can go overand restart theapplication and we'llsee that now when i reload theapplication because it has stored therequests in s3 it'll reload them from s3on the backend again as a developer the code that ihad to do to to write to make thathappen was very minimal because i had atoken that was provided to me by theplatform so you can see some value herei hope um but there are a few otherchallenges we have to worry about firstof all um connecting to cloud servicesso every one of the major cloud servicesor most of the major cloud servicesprovide a way to use an open identityconnect token to connect to them uh sos3 um google objects um aws blob storeor sorry uh azure blob store supportthis but setup is hard you saw i had togo in the im console and do some extrasteps it's not exactly perfectlystraightforward and sdk support is mixedmany times you have to do some verycomplicated steps with the sdk toactually get it to use the tokenproperly so that could be a lot nicer ofan experience another problem isnon-http services so it's possible toconfigure postgress for example tovalidate an oidc token but it requiresinstalling a pam plugin and then using apam oidc library it's it's pretty jankyum so when we're trying to pull in we'dlove to be able to use the same methodto connect to everything so if we'retrying to pull in these non-httpprotocols we might need something elsewe either need to support x509 we needto get these services to update andnatively support validating ydccredentials or maybe we use a customproxy in those case thatcase another thing that i haven'tcovered at all is policy and that wasintentional the way that i'm thinkingabout workload identity is separatingauthentication from authorization andyou can still use the existing systemsin the that are in place like openpolicy agent once you're at theapplication level you can validate thethe client id and then send it off tosomething else or use an open policyagent library to validate uh and decidewhat to do inside the application onceyou know who is talking toyou little breadcrumb for the future soone of the challenges here is thateverything i've talked about assumesthat you kind of have one platformthat's managing both things um otherwiseyou still have the synchronizationproblem between platforms so there'sactually a prototype specification andexample called cloudpipe also in the12-factor repository that's trying toget towards the next step on this whichis how do you get the platforms to agreeto make the setup easier so wouldn't itbe great if i didn't have to go in theim console and click through a bunch ofthings manually and i just had an apithat i could say this thing needs totalk to that thing and it does all thesetup for meautomatically so quickrecap and after the recap we might havea couple of minutes for questions so ifyou have anything get ready okay so i'mproposing shortlived connection scopedcredentials for secure workload identitywith a standardized interface tosimplify development across platformsthat identity should be managed by theplatform which removes the complexity ofmanaging it from the apps developersfocus on code and not credentialmanagementa couple nights ago i was having aconversation and i had i had an insightthat some of the most costly problems incomputer science actually aren't thereally hard problems they're theproblems that are just a little bit hardand the reason is because if the problemis a little bit hard nobody really goesand figures out how to solve it so whenyou take the aggregate cost of smallproblems across an entire industry it'ssuper high you have someone in everysingle company you know trying to figureout how to do identity they solve it itsort of works it's hard they set uptheir credentials they kind of have itworking and then the next person has tosolve it the next application has tosolve it and so yes you can figure thisstuff out you can there are solutionsthat exist for these things but it's noteasy and therefore it's very costly so ithink if there's one main takeawaythat's what i'd like to convey secondlycome help we we want touh we want to collaborate with everyoneon making this better let's let's getthis standardized and let's geteverything supporting it so we don'thave to worry about this problem anymoreso we have i think maybe three minutesfor questions if anybody wants to askone feel free i put a qr code for our12actor discord if you want to come talkwith us collaborate work on thingstogether we would lovethat anybody have anything got onethere's a mic there you might have tounless you can yelluh thanks very much for the talk umalways impressive to see hioku alsocoming up um in a way and coming back umthe workload identity piece of work isthat with the new platform is that witha is that applicable with the newplatform you're launching uh it not yetso the our goal is to get the communitysupport around this idea and then we'llthen we'll introduce it to the platformthat's why we built the polyfill but wedon't want to we don't want to buildsomething that nobody else thinks is theright way to do it and doesn't adopt andthen have to go rebuild it again laterso we're trying to get community supportfirst before we implement it internallyand you're not using any part of thatwithin heroku or maybe between herio andsalesforce or we so we have we have aninternal implementation of some of thisthat we use to talk to some of our awsservices on the back end and it'sactually been great for us because it'sgot us a whole bunch of aws credentialsout of many of our systems which is muchsafer uh but we don't have an anexternal offering like this yet becausewe want to get the standardization donefirst and that's all spiffy based uhit's our internal one just uses raw odcnot spiffy oidc but yeah could be thankyou you're welcome okay i think i'mabout out oftime yes uh so thank you um i want toinvite everyone to come to our happyhour this evening if you haven'tregistered there's a qr code it's goingto be a lot of fun we can talk moreabout 12 factor and workload identitythere thank you so much for the time2025-04-15 21:58:43.929345 ��f�#��AdNb1m84Bp4chello and welcomeeveryone i'm here to talk to you todayabout workload identity for humans a12-factorapproach first let me give you a littleintro my name is vish abrams i'm thechief architect atheroku and i'mexcited to talk about workload identityso i think the best place to start iswhat the heck is workload identity and iput down the simplest definition that icould come up with which is that it's aunique identity assigned to a workloadfor secure resource access so in thecase of an app developer that means yourapplication is your workload and thesecure resources you want to connect toare backendapis databases caches other applicationsand services that your application needstoaccess what about for humansthis is where 12-factor comes in howmany of you are familiar with the12factor manifesto all right it's beenaround for 12 13 years essentially umyou may have heard that actually we'rerecently open sourced it actually at thelast cubecon 6 months ago and we areworking with the community to update itand bring in some new ideasso realistically the thing that made12factor so powerful and has made itlast so long is that at its root whatit's doing is defining a contract or aninterface between an applicationdeveloper and a platform developer or anapplication and a platform and what thatmeans is that a developer who builds a12-factor app can build an app that'sportable acrossplatform and a platform can run any12actor app so i like to think of 12factor as the place where thoseinterface definitionslive here's anexample today most applicationdevelopers never have to worry about tlstermination for their app all they do isthey listen on a port and the platformmagically directs traffic to theirapplication that allows them to buildlocally using localhost on whatever portthey want and then when they push theirapplication into the cloud or onto aninternalplatform it just runs and they don'thave to worry about it there are otherexamples that are not quite asstandardized as this idea of port andtls termination here's one domain namesso many times the application isagnostic to where it's running but thereare certain cases where it actuallyneeds to know what domain it's beingaccessed on a really great example isooth callbacks if you're doing an oothintegration the callback you have topass to the remote ooth server includesyour domain name so it can redirect backto you at the end unfortunately domainis not consistently provided acrossplatforms some platforms may have a wayto provide domain like via anenvironment variable b��ng into all kinds of stuff youdon't want itto so what should have happened wellthat's our topic today and how can youprotect yourself from situations likethis um the main thing to do is going tobe to secure your network so what doesKubernetes offer today there's networkpolicy v1 um that's been a stable APIfor a long time there's new adminnetwork policy and baseline adminnetwork policy um and there's alsoarbback for you know securing networkand access to your API and controllingin yourcluster so um we'll go through networkpolicy first um so this allows appowners to control ingress and egressfrom their appsum it boils down to IPs but it uses likewhen it's implemented it boils down toIPs but you can put direct citers inthere or you can put um it's reallycommon to just use namespace or labelselectors um the key thing to note hereis it's a namespace scope policy um it'sinstalled by default with Kubernetes theCNI needs to implement it for it to workbut it's been a stable API for overseven years um but the key thing here isit's namespace scoped right so it'sdesigned for developers namespace adminsapp owners to control their traffic likemaybe between a front-end pod and abackend pod or the backend pod and thedatabase within that you knowapplication and thosemicroservices um so network policieshave somegaps uh there's an implicit deny sopolicies only specify what's allowed umand that's been a bit of a source ofconfusion um there's no support forcluster admins so these are namespacescope policies right um and there's noway for admins to kind of globallysecure thecluster um policies this is a big onepolicies are at the IP level not atidentity so it's kind of implicitlyusing IPs as like identity for thesecurity of what the pod can and can'ttalk to um and it's using labels oftento derive those so now labels are asecurity boundary and um that that canbe a little funky and they're notnecessarily a real identity uh Leor willtalk a lot more about that later umthere's scalability issues with largeclusters like imagine you have 20,000pods and you have all kinds of differentsubsets of labels and groupings andthose pods are churning and you'reconstantly updating the IP rules on likeall of your nodesum eventual consistency right um so likewhen a new pod comes up everything needsto get synced um and there's limitedsupport for north south traffic orexternal to the cluster um there's workbeing done in this area but right now umyou can kind of specify a citer but it'snot uh really wellsupported so to try to address thesegaps there's admin network policy um sothe main thing to notice here is thatit's a cluster scope policy right um sothese are non-overridable and they allowan admin to control the rules globallyfor the cluster um so in this exampleyou can see that um it uses similarsemantics right uh the label selectorsand in this example it's allowing yourmonitoring namespace to talk to allnamespacesum because you need to monitor thingsrightum this also adds support for deny andsupport for priority so you can go at ahigher priority and say monitoring cantalk to everything except for namespaceslabeled security restricted at a higherpriority we're going to denythat um and finally you can also pass soyou can allow deny or pass and in thisexample even higher priority if thenamespace is labeled security internalwe're going to pass it and what thatmeans is it'll be up to the namespaceadmin to use network policies to umcontrol their traffic um this is anoutofree install installation so you gotto actually install a CRD and it's a v1alpha API um there's a lot of work goingon to get it to beta um if you were justin the SIG network talk they were theywere discussing what they want to do toget it to beta um and it requires anetwork plug-in implementationum baseline admin network policy this isanother one this is for that defaultdeny it allows you to it's a globalresource um named default um but itallows you to define what happens if youdon't have name like a network policythat applies to your traffic um they'reactually looking at combining these asthey go� to beta um but it exists now soum so that helps with the implicit denyum and it helps with supporting clusteradmins but we haven't really handledidentity scalability um north southtraffic and we've added a new gap uhthere's increased complexity because nowyou have layers of admin network policynetwork policyum and you have all these differentresources so trying to figure out what'sgoing on and what's applying to yourtraffic can be complicated um there's alot of tools that you can use that SIGnetwork's been working on that can kindof help you map this and test this andsee what's going on um but it does addcomplexity nice and yeah and just youknow before even moving to the gap shoutout to some of the folks here you knowlike as we seen seeing our updatesthere's tools like policy assist foranyone that use admin or policies toolslike policy assistant that you know cando variety of things that can help withthat but it's still more complex so howdo we go about addressing these gapsright so first I want to quote JohnHoward John is a is maintainer and he'sbeen doing a lot of work in this areaand John compares this problem to um youknow security officer securitycheckpoint that check for employees andJohn is saying that the strengths of thecheckpoint vary substantially based onhow we identify employees right whetherit's a company t-shirt that means it'sour employee whether it's a fingerprintscan whether it's showing a badge all ofthese options would substantially um uhwould would yield a substantiallydifferent results right and the samething applies to network policies and Ido encourage you reading this blogbecause it talks about a lot ofdifferent useful things but this is thefirst thing we want right we want wewant to ensure the identity um issomething we rely on now another thingis immutability of this identity rightif the identity is the identity is fixedfor the pod's lifetime uh there's norisk for label changes because I'm sureI'm not sure like how how many of thepeople here know but labels can changein runtime nodes um have accessprivileged like especially cubelet to tochanging labels and there has been someCVS and attacks around that um but alsoif the pot's changing on on runtime uhit means that you know if we if we if wetie the policy to to the pod labels umthen how can we be sure that uh thepolicy is not changing on runtime now uhwith immutable identity there is a redjust attacking surface right um if um ifa pod is compromised right um andum so if a pod is compromised and andthe IP address um is reassigned to adifferent pod right umso sorry I forgot about thatso there has been cases where pod cameup right and then it got an IP addressand Then because we're using labelselectors uh the IP address is what weare uh denying or allowing the trafficright so but when a new pod comes up umif we're caching this network policiesit means that right now a new pod itshould not have these right it can havedifferent labels but it has its IPs uhcan basically get access that itshouldn't have so that's that's thewhole story but forgot to mention thishere um additionally immutable uhimmutability ensures that a policiesremains predictable for the life ofanother pod we talked about itidentities identities like serviceaccount as an example we're going to seein a minute um remains fixed nobody canchange the service account of a pod onruntime so it requires a restart andthen uh gives us the policies to bepredictable now can we tie theKubernetes primitives obviously we justenumerated some things we want out ofthe policy um and yes service account ismeant to be um an identity we'reattaching arbuck we're attachingpolicies to it right um it clearlysatisfies the request we were saying uhit's also hard to manipulate rightbecause uh service accounts um are alsotied to arbuck we're giving arbuk forexample you know a pod usually cannotchange its own service account so evenif a pod is compromised um and if itneeds to escalate privileges meaningchanging a service account to a more uhprivileged service account it willrequire a restart and by the time itre�starts it maybe the pot is notcompromised anymore now before we go andtalk about more implementation of thatright eager mentioned some of theproblems being scalability um it's justto illustrate that um so when we talkabout label selectors right uh whichagain is not an identity um we theselabel selectors produce subset of podsnow when we talk about ingress andingress rules um so it define theseaccess controls between these subsetsnow when things gets complicated is whenwe querying for these subs sets it'spretty expensive for the data plane andcontrol B to do because we may haveoverlapping sub subsets we need toenforce an order of evaluation of thepolicies and quering for all the subsetsalthough you know Kubernetes API givesus a lot of the good machinery to do itit's very very expensive and it's veryhard to implement as wellum now the questions we we're askingourselves can we instead transmit theentity as part of the networking packetright right now it's not there we haveIPs um but if we are able to stream theentity as part of the networking packetthen we can save the mapping betweenlabels and IPs and some of the eventualconsistency nature of it and being moreand being more strict and secure aboutthe network can we enforce on somethingelse other thanIPs now there are many many specificsolution solutions not necessarily allof them satisfies what we said butpartially some of them are um but youcan see I'm highlighting specificsolutions i'm going to go over two ofthem right now some of like you knowleading projects it doesn't mean thatthis is the complete list there are manymany other projects that implements theAPIs but I want to start with seliumnetwork policy i guess a lot of peoplehere know selium you know very widelyused CNI and city network policiesenriches network policies like corenetwork policies uh and providing L3policies on the IP level and it providesyou know various you know as we seeuseful obstructions I just listed two ofthem but uh which you can you know youhave a list of predefined entities thatyou can use or you can do from securitygroups with AWS um but it provides youknow again value substraction but basedon IPsum at L4 it provides similar support forport protocol um and it also provideextended support for L7 all in the sameCRD by the way which obviously we knowthat the L7 functionality is enforcedwith the node local envoy proxy um theCNI level cannot enforce the L7capabilities there now Selium uh alsosupports egress and eress um uh policiesuh there is support for DNI policies aswe see with you know as well adminpolicies and similarly uh there is theclusterwide that is um basically youknow giving this solution for thecluster admin because we said networkpolicies is for the applicationdevelopernow similarly we have is authorizationpolicies which some of you heard of it'sa different name right this one isnetwork policy this one is authorizationpolicy but it is mainly to provide thesame functionality right networksegmentation now supports theenforcement not in the L in the L3 notin the IP layer support the enforcementuh on the L4 TLS and the L7 layers andwhat it means um you know for all of youthat are keeping up to date with theadvancements right where I now have twomodes We have cycar we have ambientwe're not going to go into this uh butum the enforcement basically happens indifferent parts um in these modes but umwithin um the pot identity is onetoonewith the service account um and it'sdistributed as a certificate from thecontrol pin to the proxies and withinthe certificate there is the sandfieldsubject alternative name where we couldencode um the identity being the serviceaccount and the identity is usuallyusing the spiffy format which is um anincreasingly common standard foractually you know describing uh thisidentity now we can see a little exampleon the right i'm I'm trying to keepexamples throughout as I go but notgoing to spend a lot of time on that butuh for example in this example rightwe're allowing payments back end umwe're lying traffic to the payments backend from the payments front identi�ty asanexample now East S2 supports as wellexternal authorization uh which issomething I wanted to call out um so youcould delegate you know um authorizationdecisions to a different uh applicationa different web hook you know we knowthings like OPA or any other callouts wecan do uh there is support for denypolicies and different deny controls umand there is support for clusterwidepolicies with some semantics you knowfor those using with putting the policyin the S2 name space uh but there's nono support for egress so again this isjust a rundown of what's supported ornot um but you know as you see like thistwo leading projects right which a lotof people probably use uh provide thesame goal right same purpose networksegmentation how can we secure trafficwithin our network but there's nostandardization there it's it's it'swidely different um and why there's nostandardization because standardizationis hardnow what is you know why why is it hardor like where are the areas which weidentify needs more standardizationright uh what's the standard identity isit pod labels right is it serviceaccounts is it spiffids what is a standard format right isit jots is it um is it x5 ornon-certificates is it svids svids forthose who are not familiar is um a newidentifiable format that the spiffyproject definedUm how do we uh distribute and issue theidentities there is a cap in Kubernetesalso known as the pot certificates capthat suggest that cublet uh is going tobe the one issuing um uh identities andum basically distribute them to all thepods uh is it sufficient as a newstandard is all the ecosystem is goingto be developed and use this identitiesum we don't know what's a common rule oftrust and what's the interoperabilitysemantics between L4 L3 that we talkedas well and L7 enforcement how do wecompose all of them uh this is somethingwe're kind of living day-to-day tryingto understand and you know deal with butit's it's increasingly important to dosome sanitization in these areas becauseif there's no standardization thendifferent implementations implementeddifferently which can yield to somesecurity breaches or um someNow what you can do now I'm not going toyou know put this slide and tell you heyyou know just employ zero trust defenseand death obviously there's like youknow common uh patterns that youprobably want to use but in thepractical part of our presentations ifyou are using labels as a policyentities make sure you're usingadmission control on those labels rightbecause you don't want anyone to be ableto change the label because we oftenthink like you know changing a label isnot something uh that's you know notthat's giving you know a lot ofpermissions but uh you probably want touse things like vernal things likegatekeeper things like that going tobasically provide admission control forchanging the labels and for the labelsyou're using uh to tie policies uh youwant something stronger to validate umthat no one can bypass your policies nowif you're using identities uh you knowit sounds magical no but make sure theidentity is transmitted so in the Eastercase that the identity is transmittedwith the encryption over MTLS meaning ifyou disable MTLS if you're disablingencryption in your cluster you cannotenforce authorization policy um so justyou know it's obviously just a a highlevel overview but make sure when you goabout securing your networks youactually uh understand uh what is theimplications of doing something that arerelated in this field now obviously whatshould have happened and just tocomplete the story obviously right uhthere should be zero trust and defensivethis right MTLS should have blocked uhpotential impersonation of the debug podbecause if we have MTLS right so anyonepresents every service presents its owncertificate and it's much hard toimpersonate to a different serviceauthorization policies should haveblocked the access because the debug podshould not have been deployed with aidentity or any like you know dependswhat you're tying on but like or pleactually have access to anything elseright we just wanted to get a req�uestdump the headers and inspect them um andsimilarly obviously the service accountshould not have any um API accessbecause it didn't need to so thesesecurity postures um should beespecially if we're a platformengineering like we're we're the peoplethat kind of take care of the platformwe should make it easy for developers uhto make you know to make this decisionoperate in a very security conscious waynow lastly standardization is importantright we said it's hard but it'sextremely important i just listed two ofthe things here one of the potcertificates kept that I talked about uhI think it's cap 43 uh7 uh uh which isprogressing and you know as I said uhstandard trying to standardize theissuance of the certificates the otherthing is a GitHub discussion happeningon the SIG uh policy SIG network policyAPI group about gathering use cases forusing identities and not IPs from the UXperspective actually not from thesecurity because from the security umit's different but from the UX would itbe easier for users to reason abouttheir workloads with identities and notIPs because usually the developer youknow they just spin up a pod a workloadthey don't know what IP is going to beassigned um and these two slack groupsare the best place to actually go andtalk about it and ask questions or youknow bring up your idea and there'spretty responsive um um uh folks thereuh that can help you you know um drivingthis areayeah with that um uh I think we have uhplenty of room for questions umu more precisely five minutes and yeahthank you[Applause]do you know where the mic is forquestions yeah the mic should be the micshould be behindright oh there's a mic just here in themiddleoh we got someone there's someone wherehello thanks a lot for that presentationum I have a question about authorizationbetween two services uh I have a usecase where the authorization is changingregularly uh between two services howwould you manage that well can youelaborate more on the use case like soyeah so I have a key lock um and I havetwo microservices and the uh like theauthorization between two these twoservices is managed by key lock butsince it's administrated by other peoplesometimes the uh the authorization ischanged like the scopes O2 scopes arechanged regularlyand what's the ideal scenario you wouldlike to get to like do you does itchange because it's supposed to changedo you want to enforce something elselike um no I just want to um I just wantto know a little bit about how the thepolicies are updated is it at everyrequest made between each of theservices or is it something elseum well definitely I mean stay here andwe're going to follow up on that but Ithink in theory right you want to youwant to segment your network you want tounderstand hey I have this thismicroservices and theoretically thesemicroservices should talk to thatmicroservices you know obviously backand front end being the obvious case butyou know everyone has their own uh andonce you segment your networks you wantthe policies to be predictable if youneed the policy to be changeddynamically maybe you want to usesomething like an external authorizationmechanism as we see with a custom with acustom authorization where you say hey Idon't want my policies to be pretty finebut you know uh delegate my uh policyenforcement decision to something likeOPA to something like you know otherproviders what is it yeah sorry I wassaying like Google or something likethat yeah like any any thing thatimplements this protocol there's a bunchof things out there and then you coulduse this you know this mechanism toactually you know uh provide the uauthorization decision and whichobviously can be changed dynamicallywhether it's integrated with your IMprovider or whatever yeah okay thank youvery much no problemhi um I was wondering if you had any uhtips or what you used for observabilityinto troubleshooting uh networkingissues in regards to network policiesversus security groups versus WFTS or uhanything like that yeah definitely imean yeah definitely um so if you go andlook at network policy level often timesyou would have a feature I mean maybe indifferent implementation it's calleddifferent but um a thing called networkpolicy logging so it logs authorizationdecisions and there's a bunch of usefulobservability tools kind of go on aroundthat and present it to you in whateverum observability dashboard you're usingthere was in the again in the previousum SIG network discussion someone askedabout uh whether we want to standardizethe format of this logging for examplebecause right now it's not standardizedum so this is being discussed um there'sactually issues about that uh but I cantell from a vendor perspective forexample within Google we provide thisnetwork policy logging uh which you canuse um um there is uh as well sometooling like I think the policyassistant as I mentioned before that canlets you actually simulate what wouldhappen if uh a network um that whatwould happen if a policy would bedeployed so would would your trafficbeing denied or not um and I think simsame thing if you kind of go above rightwhen you go like to L4 like L4 TLS or L7um again like useful logging fromwhether it's the envoy it's something umthat you can definitely uh use umyeah I hope that answer your questionthank you very muchyeah hey was uh super interesting i'minterested in or intrigued by the sortof direction of going towardscertificates next 509 for client orth umI'm just I guess I'm interested inwhat's driving that whether it'sconvenience or some other technicalreason and I guess the reason I'm askingis because I'm sure you're aware X509ASM1 these are pretty old things and thePKI comes with a bunch of operabilityissues around you know revocation andlike issuance and lifetimes and all thatsort of thing i guess my point is weknow we can do better than X509 when itcomes to authenticating it's why MTLSisn't everywhere right it's only in verysmall places usually um so what'sdriving that move towards Cublet issuingcertificates rather than those otherstandards you talked about like Jots i'mnot a fan of Jot either but there'sother ways of doing this that aresimpler and have less of a history ofvulnerabilities and pausing issues yeahthat's definitely a good question and bythe way good feedback you know feel freeto go to these issues and put yourfeedback there but like for example oneof the things you mentioned with Jotsright jots for example operating only onL7 what happens when your traffic is notL7 how do you how do you going toenforce that right so MTLS andcertificates being one of the ways bythe way anything in the CAP like in theKubernetes enhancement proposal arelinked um is not being opinionated abouthow do you transmit the identity it'sjust okay how do I distribute how do Iissue an identity but MTLS being one ofthe ways that for example is you happento use to or any mesh to transmit anidentity and then when you have MTLSover the network it's easier for you toyou know to decode the certificate andyou use it for enforcement um yeahthat's uh and that that's part of whatwe talked about about standardizationbeing hard there's a lot of uh a lot ofdifferent ways of doing this stuff umbut the the key point is that we canprobably do better than labels and wecan do some form of identityyeah and definitely I mean you know Imean go to like these barcodes like youknow put your feedback there you know ifwe want to standardize on you know bigwe obviously want to standardize onsomething that is most commonly usedright like you know um I'm not likethere are some things that not commonlyused and going to be harder for peoplewith the Kubernetes ecosystem to adoptbut you know obviously all the thingsyou said we can do better than that whywouldn't we do better than that mtls isnot necessarily everywhere um um so ifwe can standardize on this identitymechanism then maybe we can standardizehow we can transmit them thetransmission part is not is up for animplementation right now sotheoretically you can extend it withyour implementation to say hey myimplementation transmit identity in thisway and that's how it is used forauthorization thank you2025-04-15 21:58:44.453348 ��R�#��[AQ15XbASxHM0hey everyone uh thanks for attending ourtalk um I'm Leo Lieberman i'm engineerlead at Google and I'm Igor Velichkovichi'm a engineer at a stealth startupfocused on AI infrastructure yeah andtoday we're going to talk aboutencryption identities everything inbetween kind of some you know some tipson building secure Kubernetes networksand some of the hopes we're hoping toget out of this presentation and out ofstandardization efforts in this aspectbut I'm gonna hand it off to Eager tokind of walk us through and you know getus into the domain yeah so let's startwith astory um so suppose you have thisenvironment you have some ingress cominginto your cluster and a proxy pod um andyou have a rate limit config prettycommon setupright um but now you have some issuewith your rate limiter it's not ratelimiting what it should and it'soverrate limiting other things so youneed to go and troubleshootthat so you add a debug path and a debugpod because you remember that the ratelimiter is based on headers so you addthe debug pod to intercept some trafficandum try to see what's going on that debugpod is meant to be temporarily lived andyou do figure out what's wrong and fixyour rate limit um but because it wassupposed to be temporarily lived um ithas broad unrestricted network accessright you didn't mo bother to secure itum so you can see in the graph thatdebug pod can access everything in yourcluster so that ingress was exposed onthe public internet and you forgot toclean it up so now suppose someone getsinto that debug pod and they startaccessing your user information paymentinformation transactionsum and the attackers move all over yournetwork because you don't have anysegmentation um after this you havereputation and compliance issues thatfollow um that was a madeup story butthis has happened many times in the pastso this is a good example um this led tochip and EMV cards in the US um theyweren't really very popular before thatso had like a cultural impact um butattackers got into Target and they wereable to steal 70 million people's PII 40million credit card numbers and they gotin through a weak link in a contractorthrough an HVAC contractor that wasusing free anti malware software thatwas not robust enough um the weak linkthing is going to be a pattern thatyou'll see in the following exampleswhere someone is able to get in througha weak link in your security and becauseof a lack of network segmentationthey're able to get to other moreprotectedresources so another example a lot ofpeople probably remember is Equifax 148million people's personal informationwas stolen including social securitynumbers and financial records um thereis a vulner vulnerability in ApacheStruts that um is used for Java webapplications and again once attackersgot in through that oneweek link due tono network segmentationum they were able to move to more highlyprotectedassets um this is an interesting one soStarwood Hotels and Resorts Marriottthere were several breaches um that werelong-term breaches but essentiallyhackers got in and then they movedacross the network installing keyloggers Trojans all kinds of stuff umgetting 339 million personal datarecords um so what's interesting aboutthis one though is that both Star isthat Marriott was purchasing Starwood umand Marriott knew Starwood disclosedthat they had a breach and Marriott knewabout this so they tried to do their duediligence and they did a 10-monthsecurity assessment well they missed anongoing breach and due to poor networksegmentation once they purchased it andintegrated it into their network thatbreach spread to Marriott so again it'sit's either a weak link in your softwareor even someone that you purchasesomething that you purchase andintegrate and then that ends up you knowspreading across your network andgetti��is exposed as DNS SRV recordsusing core DNS which is queried byservices through normal DNSinfrastructure in our case cloud DNS anda few custom resolvers since we have aregional failure domain our servicediscovery system uh has no interreionaldependencies and operatesindependently excuse me all right sowe're hitting some limitations with DNSwe're running into like DNS responsesize limits for our largest deploymentswhere DNS a DNS response would um exceedthe maximum response size of 65kilobytes plus DNS adds uncertainty tothe system um the time it takes for aservice to unregister becomes difficultto reason about when there's DNS cachingat multiple layers goingon one way communication is quitelimiting and combined with theinflexibility of DNS it makes it verydifficult for us to build any advancedrouting features that we're starting torequire but on the other hand ourDNSbased system has some benefits it'srobust and battle tested it's beenaround forever it's simple it saysrelatively simple in parenthesis herebut it's well supported by a lot ofclients and it took Spotify this far sowe're not kicking it out there willalways be some legacy and third partyservices that need basic uh DNS servicediscovery and uh we're also using DNS asa fallback which Erica will talk moreaboutlater right so to pick a successor onhow to solve our service discoveryproblem so we or what kind of shoes toget I think there was some shoe metaphorin the beginning we took a structuredthree-step approach uh to assess ouroptions and we started with a set ofrequirement criteria where we assessedour capabilities of with thecapabilities of market leading servicemesh solutions and also like included inthat like the idea of expanding ourexisting infrastructure then weestimated the effort it would take tobridge any gaps in solutions that didn'tmeet the requirements um and first upthen was to find the set of criteriathat we thought was most important andwe were considering mainly stuff aroundreliability scalability andextensibility so for example it neededto support proxyless RPC it we had to beable to configure sonware load balancinguh and uh it needed to fit thehetrogenous service architecture we haveat Spotifyso after that we picked which candidatesto evaluate and we looked at trafficdirector which is a Google managecontrol plane and we have extensiveexperience with the with GCB's trafficdirector from like experimenting with itat Spotify for for some time uh but thisexperimentation have surfaced severalaspects of the product that wereincompatible with our service networkand our needs but since we already knewthat uh it wasn't a good fit we alsoevaluated a variant where we wouldbridge this product gap using a customproxy layer in between TD and and andour our network we also looked at whichis the most popular control plane I hearu we evaluated it both in running it ina self-hosted mode uh or as a managedproduct like like Solo or Anthos andfinally like I said we looked atextending the system we already had inplace and yeah and once we had pickedour candidates we did an assessmentwhere we evaluated each candidateagainst the criteria we had laid out andthere's a lot in this slide uh I don'texpect you to read it all i see you weresquinting there in the back but uh to tosummarize them for traffic director wehad a we had some we already had someexperience here we kind of knew that itdidn't quite meet our current needs andadditionally the fact that it's onlyavailable in a managed mode uh and it'sa lockdown product we had limited likelimited um options to extend itsfunctionality with features that that weneeded at Spotify for ETO our mainconcerns were around scalability u andthat certain critical features that thatwe we needed weren't available for theproxyc mode ofrunning um and then finally likeextending our own system yeah I mean weour hunch when when going into this workwas that this was going to be the mostattractive option and it tends to be forengineers and for all we've all beenthere uh and we would be build like wewould be building on a foundation thatwas pro�ven to meet our scalabilityrequirements and had been tailored to toour service network from the start sounsurprisingly when we weighed ouroptions we found that our extending oursystem was the best option yes but eventhough we suspected that we would end upwith this conclusion it was still worthgoing through this evaluation process uhbecause it helped us understand whichproblems that we thought were mostimportant for us to solve and it wasvery appreciated by by management likeKalir sitting over there um because theyare naturally suspicious of engineeringteams coming with proposal to spend alot of time building infrastructure inhouse and I also wanted to point outthat we did this evalation a bit morethan a year ago and I know that therehave been many relevant improvements inall the products that we evaluated sincethen so it might be like that this tablewould look different if we if we didthis evaluation today but anyway withthat in place uh we were excited to shipsome improvements and first fundamentaldeliverable we had was to introduce anew protocol to manage the data planeand as mentioned a few slides back DNSwas too limiting and one of themust-haves in our EV evaluation was theability to use XDS to configure ourproxil gpc fleet so XDS is a universaldata plane API at it's spun off from theenvoy proxy project and it's supportednatively in gRPC and envoy meaning thatour if our control plane supports thisprotocol we can easily integrate envoysand and other gpc services into into thethe mesh that we're building and we haveprior experience using XDS for uh ourenvoy based perimeter that we've beenoperating for five or six years um ithas a good feature set that meets ourneeds and uh using XDS as a foundationwould let us then you know build out theprioritized features that we had withoutmuch hassle and now just briefly go overthree of the features that we have builtor are in the process of rolling outfirst up is um traffic splitting so thewe needed the ability to divert trafficbased on rules there's a pilot use caseat Spotify where a team needed to divertsynthetic traffic that was used I thinkfor some ML training or something butthey need to separate that from realproduction traffic at multiple points inthe call chain and solving that in theapplication layer would have meant youknow code duplication and shading andthen changing these rules and and andworking with this would have meant youknow changes that would span severalsystems and and services which wouldhave been very cumbersome to work withso instead we solved this in the networklayer by implementing a imperative APIthat our the pilot team was then able touse to add uh custom uh matching rulesand and GPC routes which were thenautomatically pushed out to the relevantclients usingXDS uh next up is uh my favoritezoneware routing so uh in each GCPregion that Spotify is running we havelike we usually run in three or fouravailability zones due to our sizemostly like but our service network isreally not you know zone aware it's aartifact from the on-prem days um but soour failure domain is still regionalwhich means that we're not really takingany advantage of being deployed inmultivail availability zones like if onezone were to have troubles we would needto drain the whole region anyway andsince sending traffic across the zonalboundary costs money we want it to berouted as much as possible within thesame zone and to do this safely thecontrol plane then needs to account foruneven spread of server or clientcouplings between the zones um and theserver locality part of the servicelocality is already part of theKubernetes endpoint and when we use XDSin the client we that means that we havea persistent connection to each clientwhich then lets us also know what arethe clients that are calling eachservice and how they are distributedacross zones and with this informationthen we're able to build a sonware loadbalancer that and we used the envoyproxies locality aware load balance loadbalancing algorithm as a base and thenwe factored in data from cluster clusterload reports which is ano�ther xs featureto account for differences in throughputbetweenclients and this load balancing policyis uh now replacing a less effectivemitigation we have for to keep trafficwithin the same zone and can talk moreabout this after the talk it's myfavorite topic anyway thirdly uh uh thelast or the third feature that we builtum or rather got for free was thisdependency graph like since the newarchitecture has a connection to everyclient and we have every server in ourin our uh in our service data setalready we can easily dump a data setwith a global view of what's connectedto what and with a later addition ofload report data we're going to also beable to show how traffic flows betweenservices in real time or or dumped intoa data set yes and now I'm heading overto Erica to talk a bit about how webuilt thisyep all right so um the first thing weasked ourselves was where in the currentuh design of Nameless XDS would fit uhthis diagram here shows how endpointsregister themselves with nameless soworkloads that run on VMs uh sendperiodic harbit um to a component callednameless registry uh and thecorresponding service discovery data isthen stored in a spanner data databaseuh for workloads that run on Kubernetesthere's a component called shameless asEric uh mentioned before that on startupuh will connect to all of the clustersin that region uh and creates watchesfor endpoint slice resources uh fromthere it stores any service discoverydata in memory and over time it updatesit according to the change events thatit's notifiedabout uh DNS lookups come in uh from uhuh core DNS uh and this in turn forwardsto another component called namelessdiscovery so a service discovery data umlives in three different places we havea spanner database uh we have shamelessuh for kubernetes workloads and we alsohave cloud data store which holdsconfiguration for redirecting traffic toother regions for um uh services sonameless discovery basically looks inall of these places to put together thecorrect answer to um um for every DNSlookup that uh it's forwarded by coreDNS so again given this setup wherewould XDSfit so we considered a few options uheach of them optimizing for somethingdifferent uh and in the end we settledfor option number three which combinesadding XDS to nameless with a largerinitiative to revamp uh its tech stackand its design and we'll see exactly howin the uh next slides an objective ofours was to move fast throughdevelopment cycles without disruptingthe um current production traffic so weuh deployed the revamped nameless as aseparate system uh we also built a newdeployment pipeline that would allow usum to have deployments that arebasically safe quick and handsoffcompletely uh and this is powered byArgo CD and Argo rollout um and only atthis point we then went ahead and builtXDS capabilities into Nameless once allof this development work was complete uhwe gradually migrated traffic productiontraffic from um let's say the old systemover to the new one uh and note that atthis point all of the clients were stillusing DNS as a protocol for servicediscovery once the redesign namelessproved to be stable we then startedonboarding clients onto XDS but let'snow look at the new designall right so first off we changedruntime from VMs to Kubernetes so now wehave a nameless pod instead of anameless VM uh workloads that run in uhKubernetes register themselves just likebefore nothing has changed there uhworkloads that run on VMs still sendperiodic harbits to nameless registrybut now this calls out to shameless andshameless via the Kubernetes API serverwrites a bunch of CRDs into CD to storethe relevant service discovery datashameless also creates watches for theCRDs uh storing the service discoverydata in memory and updating it um as itreceives change events over time asimilar flow is also in place for theconfiguration that allows service ownersto redirect traffic to another regionfor theirservice cord DNS of course is stillthere but now it forwards to shamelesswhich can answer correctly by justsimply looking into uh uh its in-memorystore of service� discovery data so wehave a simplified design uh because thedata is no longer scattered acrossdifferent places nameless discovery iscompletely gone from the picture and soare Spanner and Cloud Data Store and anice effect of all of all this redesignis that latency of DNS queries has gonedown significantly because of courseinmemory lookups are cheap and now withthis new design the answer to thequestion where does XDS fit became a bitmore obvious uh shameless has thecomplete picture uh of the servicediscovery data at all times and alreadyanswers DNS uh lookups so it seems likethe most logical place where to add ouruh XDSinterface all right before we move on uhlet's just go quickly through arefresher on XDS so XDS stands for Xdiscovery service where the X uhindicates an arbitrary resource it's aset of APIs that is used by a data planeto understand how it should configureitself and the configuration comes froma control plane in the form of resourcesof different types one possible setup isto have individual APIs implemented onthe control plane side each providingspecific configuration resources theclient then fetches them over separateuh gRPC streams an alternative setup isthat the control plane insteadimplements something called ads whichstands for aggregated discovery serviceand the client then fetchesconfiguration of all of the types overone gpc stream but back to our story nowso initially when we began we thoughtthat we could just simply rely on uh aninvoice open source library called Javacontrol play so straight away we startedbuilding a pock uh they used thislibrary but soon came to a stop uh inthe case of a proxyless gRPC servicemesh each node can have multiple gRPCchannels to the control plane uhspecifically one per service that thatnode wants to uh talk to uh and each ofthese channels maps in turn to a seriesof conf configuration resources that thenode subscribes to and when one of theseresources changes the control plane hasto be able to push updates to all of thenodes in the mesh that have subscribedto it so to do this correctly uh thecontrol plane needs to remember everyindividual gRPC channel from every nodeand the configuration resources relevantto each of those channels the opensource library simply didn't give usthat level of control so again what todo uh we considered a couple of optionsforking the open source library maybeextend it to somehow fit our use case orsecond option just implement XDSourselves from scratch and it didn'ttake us very long to decipher option twoand for us it's been a good decisionafter all for a couple of reasons uhfirst of all we could optimize for themost important thing so our focus is tobuild a proxyless gRPC service mesh sowe implemented XDS specifically withproxyless gRPC clientin mind uh effectively that meantimplementing um ads state-of-the-artvariant of the XDS protocol which didn'ttake us too long to do uh a second ubenefit of all of this is that we havefull control over this implementation soin the future if the need arises we canextend it to support other types of XDSclients um we could also instrument itto uh fit our u observability requirerequirements exactly and finally wecould start simple and just hold off onany premature optimizations um and ifbottleneck bottlenecks emerged when wescaled this out to um many many servicesuh then we would address them at thatpoint of course implementing XS fromscratch uh also came with somechallenges no formal protocolspecification exists even though thereare some extensive envoy docs on how theprotocol works so um many times for thefiner details or edge cases we refer toother implementations for example the E2codebase and especially at the beginningwe assume that we would end up with aton of like super complicated code toreason about and to maintain but itturns out that it's relatively littleamount of code and it's also relativelynot obscure to to reasonthrough um so uh with our XDSimplementation in place we then turneduh to on boarding services onto XDS andwe started by laying out somenon-negotiable requirements for the rollout so �first of all no big bang uh oryolo roll out for this because it's toorisky uh for such a big change we wantto be in control of which and how manyservices switch to XDS throughconfiguration and in particular we wantto control groups of services that areeligible for XTS based on their metadatauh but also we want to control whatpercentage of any given group willactually switch to using XTS so to giveyou an idea an example configurationwould be 10% of services that run inregion Europe blah blah whatever andhave reliability tier four uh meaningthey are less critical servicesanother requirement of ours was that nointervention from service owners wasrequired also we aim for zero disruptionor in any case minimal disruption andlast but not least we want a nice bigpanic button that takes us back to aknown good statequickly so let's see how we actuallyrolled out XDS so all of the gRPCservices at Spotify use an in-housemanaged Java framework which is calledApollo apollo provides gRPC specificmodules to do um things like start up agRPC server or make outgoing gRPC callsto other services for lower level thingsthe framework rely on a library anin-house Java library you can think ofthis as a wrapper around the open sourcegRPC Java containing some specific uhSpotify logic so to roll at XCS wechanged both of these pieces first offwe created a custom managed channel inthe gRPC client module module of Apollouh this behind the scenes maintains twoopen gRPC channels one for DNS and onefor XDS it also runs a task in thebackground periodically that reads thevalue of a global flag indicatingwhether XDS is currently enabled or notuh we use a DNS TXT record for this anddepending on the value of this flag theneither the gRPC channel for DNS is usedor the one for XDS and this that I justdescribed described is effectively whatpowers our fallback procedure to toDNS also when a gRPC service boots up uhsomething that we call XDS eligibilitychecker determines if that specificservice should use this new managedchannel or just the standard regular oneit does so by comparing the servicesmetadata against the currentconfiguration of the XDS rollout if theservice is deemed eligible to use XDSthen we also need to configure the XDSclient that runs inside gRPC so that itknows which control plane to talk to soit's configured to talk to shameless inour case and to send over metadata toidentify itself to the server and forthis we wrote some XDS bootstrappinglogic uh that puts together together theright configuration at runtime and dumpsit into a system property from there theconfiguration is read by the XDS clientso that it knows um who to talkto all right so once all the pieces thatI just talked about were in place westarted on boarding services onto XTS weinitially just targeted a group of lesscritical services and went slowespecially at the beginning byconfiguring a small percentage of thegroup in just one region as the nextstep we went for 100% of that group inthe same region and then in threedifferent increments we on boarded thesame group in all of the other regionsto give you an idea this initial phasetook us uh approximately a month and ahalf and saw a few hundred distinctservices using XDS instead of DNS forservice discovery we then targeted a newgroup of services uh this time a bitmore critical uh and we moved throughthis group faster on boarding largerpercentages globally and from therewe're just rinsing and repeat basicallyuntil we run out of groups of servicesto on board but let's see how it's goingso definitely rolling out XDS graduallyand in a controlled way has been theright choice uh going slow at first andtargeting only uh less critical servicesallowed for bugs and different types ofissues to emerge without causing outagesfor Spotify users also seeing hundredsof services using XDS and working justfine gave us confidence to push throughand then on board more critical thingssomething else that proved very usefulis our beloved panic button because withit comes peace of mind for the engineersthat uh are doing all of this work uh atany time we can have all of the serviceswhich back to FD XDS in a matter ofthree to four minutes and that alsomeans we can troubleshoot issues atleisure free from the pressure thatcomes inevitably during an outage or anincident it's also a very simpleprocedure easy to operate and to reasonabout and that's greatly appreciatedespecially when you're on call for thisstuff through the night all we need todo if we're you know page the 2 a.m isreconfigure the value of a DNS recordand just then watch some graphs andfinally it's also kind of neat that tofall back to DNS we're usingDNS but of course there were also somechallenges so let's talk through some ofthose so lots of moving parts createcomplexity of course and complexity is apain most of the time uh admittedly someof this pain is self-inflicted becauseour decision to revamp nameless meantthat for a while we had to maintain twoum separate systems and to operate themfor servicediscovery but also complexity comes frombig initiatives that are led by otherinfra team as teams at Spotify which canmake things tricky to navigate at timesuh sometimes there had also beendifficult to pinpoint um issues uh whenservice services started using XDS sotypically a problem would pop up in oneor two services while hundreds of otherswere working just fine so from there itwould be a bit of a game of spot thedifference to understand how theaffected services were in some waydifferent from the rest and also in somecases it was difficult to reproduceissues outside of the impacted serviceswhich slowed us down a bit uh obviouslywe had our fair share of bugs in thereuh and some of the assumptions we madeduring development turned out to beincorrect uh one issue in particularthat I want to mention is thatgenerating the XDS bootstrappingconfiguration at runtime has provenflaky uh and a couple of services runinto some XDS initialization problems soas a stop gap solution we make sure thatwe run that logic as early as possibleuh in our um managed Java frameworkApollo but in the long term we want toswitch um to have this logic happenahead of the Java service um running uhfor example via an an init container andfinally rolling out XDS has sometimesbeen about educating service owners theycame to us convinced that XDS had brokentheir services their service when inreality it surfaced pre-existing issuesuh with some services like not havingzero downtime deploymentsso what is next for us well we have abunch of ideas uh about features that wecould add next to our service mesh and Ilisted them here but most likely beforejumping too deep into any of thesefeatures uh we'll do some thinkingaround how we can extend our servicemesh to cover more of the Spotifyservice network so our focus so far hasbeen on gRPC Java services specificallybut definitely moving forward we want tosupport other protocols and otherlanguages as well all right so now it'sback to Eric for some concludingthoughts thank youright so to conclude did XTS deliver theprotocol itself is fairlystraightforward that like Ericamentioned the implementation part onlytook us a few weeks uh and then afterthat we were able to move really fast onon new feature development uh on our newXDS powered stack but because runningXDS and proxy SGRPC is quite uncommonstill there's very little publiclyavailable information and guidance whichdid slow us down a bit uh and would werecommend what we did to others thenyeah the it depends is a classic answerright it's a flexible and pro powerfulpro protocol XDS but it might take a bitof engineering effort to integrate intoexisting existing systems but it's beena really good fit for us and we hopethat this talk will uh inspire some ofyou to try it out and we're veryinterested in talking more and sharingmore learnings about proxyless gRPC orXDS with the peer companies so if youwork for such a company or you just wantto talk about this stuff then pleasePlease reach out to us now after thetalk or or over an email we're aroundthe conference here until tomorrow uhotherwise you're can always send us mailhere and yeah thank you2025-04-15 21:58:45.037178 ��}� #��1A2_ECK6v_yXchelloeveryone so this talk is about howSpotify outgrew its service discoveryshoes and how we went about finding anew pair of shoes and instead of 1million DNS shoes we now have 1 millionXDS shoes this will all make sense uhgive a minute yes so we'll cover whatproblems we were facing and why we endedup designing around XDS um design andimplementationchoices and how we transparently rolledthis out without any adverse effectsthere's an asterisk here after that butI'm sure we'll get to that as well i'mEric i work as a staff engineer inSpotify's core infrastructure um productarea and with me I have Erica who's asenior engineer and subject matterexpert in service our service networkingteam so first off a little technicalcontext about Spotify stack so it's amicros service architecture that runs onGKE we were originally on prem and movedto GCP in I think 2017uh the first running VM with the customorchestration and first first we'rerunning on VMs with a customorchestration uh layer and then sorryyeahum then we move to GKE and andKubernetes so and yeah the majority ofservices are written in Java and usingan in-house Java framework but there'salso a smaller set of services writtenin other languages like Python and andNodeJS and yeah and yeah we're reallybig you can see some numbers on theslide here that should give you somekind of sense of scale uh most of theservice to service traffic is gRPC Ithink 75% give or take and then we useproxyless gpc which means then that gcclients connect directly to servers asopposed to traditional service measureswhere uh that might make use offorwarding proxies and sidecars25% of uh the traffic that remains afterGC is a proprietary protocol which we'remigrating off and while most of ourinternet ingress is in HTTP there's verylittle HTTP that goes between servicesum what else we're deployed in five GCPregions and we have a regional regionalfailure domain uh where each region canprovide most of the end user servicesindependently but it's possible for usto make for services to make crossregion calls for example when a servicehas a regional outage or if the serviceisn't deployed in all regions whichhappens for our smallerservices so here's a basic sketch ofSpotify's service discovery system it'scalled Nameless it was created in 2013to replace a bunch of bash scripts thatwere managing DNS zone files uh and whyit's called nameless we don't rememberthe same system has been in use sincethe on-prem days and has been updatedover the years as Spotify's uhinfrastructure has evolved it hasseveral internal components supportingheartbeat driven service registrationwhich is used by the workloads that wehad that are still on VMs uh it controlsregional redirects on a uh per serviceand per protocol level and then itsubscribes to endpoint slice changes inour Kubernetes clusters to provideservice discovery to services running inKubernetes the the component that doesthe this Kubernetes business is is alater addition to nameless and we callitshameless i thought you might enjoyknowing that so uh service discoverydata then ��ild yourinfrastructure and your platform and youask your team like hey we need a servicemesh Some people would say hm why do weneed it for sometimes it's hard toanswer this question So I'm here todayto tell you about our journey on why wedecided to use a servicemesh So as a platform team we offermultiple things to our end users When Italk about our end users I'm talkingabout our product engineering teams Soour road map is like workflows So westarted to define the workflows for ourengineers when they want to create aservice and then they start like codethe service and they get like CI/CDpipelines out of the box and we have ashared Kubernetes cluster that we runthings on Then we route all of thattraffic through our platform and all ofthat is encapsulated with observabilityout of the box So that what we offer ona high level context But what aboutrouting how we do routing so we dorouting like this in some way in somesense So by default when you build aplatform you wanted your platform to beresilient You want it to be highavailable So we think about it from likea multi-reion perspective How can we dothat so we have a multiple regions DNSload balancers gateway services But whathappened when service A want to talk toservice D like they are in the samecluster how they do that like how he doretries what if service D is notavailable what if service C is deployedin a multiple clusters but the localcluster dies and then like we have toswitch traffic to somewhere else whathappen when we want to do a canarydeployment how all of this would happenyou still can do it but you have tobuild it in your application and youbuild it in your application by likehaving HD client do retries timeouts allof that kind of stuff So it added anoverload to your application teams tothink about all of these steps when theydevelop theirapplications So we need a mesh andhere's why So first we need to routetraffic We need to route traffic betweenservices in a fine grained way We needto be able to do canary deployments Weneed to do traffic shifting If clusteris failing if a if a service is failinghow we do that we also need to decouplethe application teams from being able toroute traffic through their clients andthrough custom logic that they have Oncewe have it transparent in a differentlayer it's decoupling all of theapplication logic from the routing logicBut also other things that we look at weneed to have it secure by default SoMTLS we need to offer this out of thebox We don't need them to worry aboutcertificates and spiffy and all of theIDs that they need to have to keep theirapplication running All of that isdelivered to them and at the end we needto have observability layer across theservice mesh and not just across servicemesh across everything So why we askthem to like hey expose all of yourmetrics like do this with your clientsSo we need to offer all of that layer 7fine grain matrix to the service teamsSo we'll walk you through the journey Sowhen we start doing like multi-tenantcluster as a New York Times we looked atwhat we need we need networking we needlike a very efficient powerful matrixthat we can look at and like how to doisolation So we start with selium seliumseems to be the right fit from thatperspective then we have a multiclusterarchitecture so we have some challengeshow to do that across multiple VBCsacross multiple tenants So we started toflatten the network use selium clustermesh to make everything flat and therethat's when we looked at we need a meshwe need to be able to shift traffic fromone place to another we thought aboutlike using services in selium andcluster mesh and we I give a talk withPeta uh one of the engineers that I workwith on the service mesh most of thetimes and we have found out that like weneed something that is really likecapable of doing service mesh across theboard so that's why We used STO So STOout of the box is a very service meshrich in all of the features that it canoffer It can offer a lot on what you cando But here's the thing nothing comesfor free It comes with a cost You haveto run it You run a contr�ol plane dataplane there are proxies there aredifferent things It works and deliversthe value but there's an overhead thatwe can see And from there ambient cameon and it's a game changer because likenow we are shifting how we use proxy andsidec cars in our application tosomething like less proxies becausebelieve it or not you don't needeverything that you offer out of the boxbut you need to have this decoupled in away and from here on give it to Lynn soshe can walk us through the ambientarchitecture All right I was so excitedto see so many people raise up hands onusing service mesh So you're allprobably familiar with the sitearchitecture When we started almosteight years ago we had a vision that wewanted a service mesh to be transparentto the application workload And we findout throughout the years of developingservice mesh with ISTO we couldn'tachieve that goal with SCAS because themoment you drag the side car with yourapplication you always have to restartyour application The moment there is aenvoy CVE you always have to restartyour application pod So this is why weinnovated in the community with is stillambient So let me quickly walk youthrough on the architecture How many ofyou know uh the concept of node agent inkuberneti i assume it's everybody uhright it's a agent serving all the podson the node Um so with your ambient weintroduce the zero trust tunnel who isthe node agent You can think about it asa node agent running on your Kubernetesuh node that serves all the paths inambient on that node Uh what does it doit provides um identity for the pods inambient through spiffy It upgrades theconnection through mutual TS So we havetransition uh encryption in transit Wehave we can do fips uh 14203 it canenforce a simple authorization policy atthe layer 4 So that's what's providedfor uh Z tunnel Uh the other interestingfactor of ambient architecture uh youall familiar how many of you are yourunning gateways in Kubernetes orfamiliar with Kubernetes gateway API Sogateway controls the traffic coming intoyour cubernetic cluster or the trafficfrom your cubernetic cluster goingoutside of your cluster to externalservice like open AI Right so we broughtthe concept of the gateway into the meshthrough our layer 7 proxy which we callit the layer 7 processing layer calledwaypoint And the reason we brought uhthe waypoint proxy as a gateway intoambient architecture is we want to bringthe same gateway experience you have onthe ingress on the egress into theservice mesh So you have the consistentexperience through Kubernetic gatewayAPI to control traffic for multipledirection whether it's east west oringress or egress So the layer 7 proxyuh waypoint does all the service meshlayer 7 capability that Aman just talkedabout uh traffic management trafficresilience uh layer 7 observabilitylayer 7 security policies uhauthorization policy enforcement sothat's all provided and we reuse envoyproxy to implement uh the waypointproxy so the key innovation of yourambient architecture when we inventedambient is this multi-layer architecturethe layer 4 which is the zero trusttunnel that provides all the functionrelated to uh uh traffic management atTCP security and observability at layer4 that you can share for all theapplications in ambient on that singlenode and the layer 7 proxy throughwaypoint and you got to define what isthe scope for your waypoint pointTypically we see user running a waypointper namespace You can also run awaypoint per service or you can run awaypoint per multiple uh name spacedepends on what tenant scope you feelcomfortable And what's really cool aboutthe architecture is the Z tunnel and thewaypoint proxy are all running outsideof your application pod So we canachieve the goal of service match wesaid eight years ago to be trulytransparent to your applicationworkload Now why two layer architecturewhy are we not using envoy proxy for thelayer 4 because envoy inherently wasn'tdesigned for multi-tenency There's noiselabel factor you have to concern Uhthere is an outage you have to concernYou don't want to be one busy neighboron that� node and bring down your entireproxy that's serving everybody else onthe node who may come from a differenttenant different application and thefounder of enmatt client definitelyagree with us on that sentiment Uh wedid announce is still ambient uh going gin solic city uh in November last yearSo now I'm going to pass to Ahmed totalk a lot more about ambient Thank youLynnSo imagine this going from one side toanother But like if you are using likesmall clusters you might not see thisout of the box quickly But like if weare running hundreds of nodes thousandsof buds like this is what you need Thisis what you look for like all of theseproxies on that side will go awaybecause like you don't necessarily needevery single proxy for every single budlike if I run like thousands of budsthat mean like the same number of budsI'm running the same number of side carsthat I'm running but like what if Idon't need all of them what if I need toprocess my traffic differently that'swhen you move to the ambient side and weonly have the L4 proxy which is the Ztunnel that Lynn talked about that canmove a lot of that traffic can manageMTLS identity like layer 4 policies andif I need to have a layer 7 I can add awaypoint and that's what I can dodistribute traffic through so it give uslike multiple option about like the waythat it's designed we can have likeambient and waypoints running pernamespace or running per application soalso decoupling the scaling from themesh to the application so now myapplications can scale independentlyfrom my routing traffic layer on howthat day scale And that gives us like alike adifferent architecture for how it worksAnd I still remember the days I wastalking to the team about like heythere's this thing called ambient likewhat do you think about and then westart testing it I was like that's greatlike we can reduce a lot of footprint ofour CPU and memory and all of the thingsthat like comes with a service mesh intolike a simpler layer So let's talk abouthow we tested it So we start testing itto see like what's really happening overthere First there's no sidecars So ifyou are into like annotations like startlike hold your application until theproxy starts and like you get some 503and like you've been playing with likehow this happens That doesn't happenanymore Web hook injection doesn'thappen anymore application can on boardfaster into this way because like youeither have an L4 tunnel which is bydefault running on the node or you canhave a waypoint that will route traffictoo So simplified everything from ourperspective goes to like CPU by defaultyou are losing a lot of the proxy CPUand memory So like that is a waste andnow we like achieved that by likeremoving that proxy But like also if youlook at the performance of how the Ztunnel versus the proxy you're going tosee like more improvement on the Ztunnel Z tunnel is written in rust Soit's more performant on the CPU layer Itcan do more things that like what invoycan do So it's better like 25% uhreduction in latency the numbers whenyou look at the numbers from like a verylow like uh I had that number on thescreen and it was like hey we reduced itfrom like 498 4.98 milliseconds to 3.7millisecond doesn't seem like a lot butit's 25% on a large scale that canimprove like all of the latency andimagine like when we talk a service meshand a microser architecture like everymillisecond matters because like therequest through your cluster might gothrough multiple services across Soambient is working for us as we areexpected on a multicluster level withthousands of buds hundreds of nodesmulticluster let's talk about it So whatwe learned through the process of thatis that we can just label the name spaceSo the way that we like looked at itwe're like okay we'll install like theSTO data plane with sidec cars and we'llinstall also the ambient data plane andwe'll start migrate like a namespace bynamespace a service by service and seehow it works That works very wellAmbient D tunnels will be able to pickthe traffic as it's needed and we don'tsee any interruptions Then apps doesn'tne�ed to change anything they don't needchange policies if they don't do like alot of things that works in the ambientas it's expected also there's a lot ofdebugging tooling that we have used fromhttl waypoint and the tunnel config tomake sure how things are working but Iwouldn't be honest if I didn't tell youabout the things that didn't work wellso when we started looking at thisearlier last year we found out that somepolicies in oz might behave differentlyon the ambient player and these are thethings that like when you are a veryearly adopter for a technology thingslike start like get resolved and likeworking with Lynn and John Howard andlike all of the community and STOmaintainers they would able like to do apriority for all of that main featuresthat we need to do to adopt somethinglike this Some of the obserability workalso needs some tweaks from filteringbecause like architecture has changedlike instead of having a proxy for everybot now you have something else Now wehave way points you have Z tunnel youhave like shared architecture across thenode So some of these stuff might bedifferent Expectation has changed Alsolike earlier in the process when westarted to load test some of the Ztunnel proxies we found out that theyhave a CBU spikes when we board manyservices like one at a time or like alot at a time So that also was likesomething that we have to look at butworking with the SD team that was ableto get resolved over the next releaseBut here's the point We run amulticluster So what happened is we weretoo excited to go through and then likewe went through it works well for asingle cluster but unfortunately itdoesn't work for a multicluster So we'restill in that space We are excited butwe are not there yet until I talk to Lenmore about it and she gonna announcelike some stuff about ambient modeYeah So we're very very first of shoutbeing one of our early testing adopterfor help us identify this issue and wewere able to resolve all the issues hehighlighted for him So I'm very happy onthat Now working with uh the communityon the multicluster uh the community isworking really really hard on drivingmulticluster coming to is upstream Sowe're going to have similar multiclustersupport as the scika multicluster Uh theinteresting thing with ambientmulticluster coming is we the trafficwill go through the zero trust tunnel Uhso you don't have to drag the side carand depends whether your destination hasthe waypoint The traffic will go throughthe east west gateway hop to thewaypoint for additional authorizationpolicy or layer 7 policy enforcementbefore it goes to the target destinationAll right with that um who wants to seea live demo of ambient yes All rightlet's see it We have uh 10 more minutesSo I'm going to be actually didn't showAmmon this demo So I told him before oursession if this time I can show a twominutes demo of ambient So I do havethis is my cubernetic cluster and I haveuh my client application my demoapplication I'm also learning generativeAI myself So I have the rag applicationI have with version one and version twoIn my default namespace I have thewaypoint proxy deployed I also have awaypoint proxy deployed in the egress umnamespace that controls all the trafficgoing to the large language model whichruns outside of the Kubernetes clusteralso on my machine It's a llama righthere Um I also have ambient installed Ihave is ingress gateway installed and Ialso have a kayali permissis installed Ihave uh zero tunnel installed Z tunnelinstalled of course So uh this is theapplication looks like So the firstthing I want to show you about theapplication Wow I guess let's see if Ican get it working Uh we're going toanalyze the mood of the audience Um sowhat I'm going to do is actually connectthe application to my phone By the waythe application is all running inambient uh right now So um I'm going toconnect to my phone and we're going toput you guys on stage Let's see if wecan do thatAment All right So we're going toanalyze your mood Thank you for thelights I appreciate that And we're goingto take a picture Um Whoa No Okay �Sothat's when like live demos goes wrongThat's when live demo goesOkay How fun Let me see if we can do achatOkay Okay Let me see if this works Whatis fun things to do inLondon okay I may need to kick off somepart Uh let me see Uh unfortunately uhthis uh demo application hasn't gonethrough too well Uh certainly my machinehas been getting up and down So let mesee if I can bring it up Uh if not maybewe can't do a live demotoday All right let's see what it saysConnection reset Okay I'm having alittle bit connection issue So uh whatI'm going to do is what I'm going to dois see if I can just delete this partand bring up them up live Uh becauseum all right see if they can be runningSo I'm going to delete um a couple ofpods Going to delete this podstoo Um let's see if not or we'll getback to questions Actually does anyonehave question while I bring up uh theenvironment upum if uh there's a microphone in themiddle if you guys have any questions uhwe can take questions tooAnd if that demo didn't work we stilllike promise I'm going to come inanother CubeCon with like an actual demofor our production setup like hearingLynn talking about multiclusterarchitecture that will be promising tosee how this works in production andlike how we like test this with likeactual read traffic for the New YorkTimesAnyquestions looks like we got one Mike'sphone is over there if you want to HelloCan you hear uh yeah Thank you very verynice presentation first of all Um myquestion was um you mentioned that youstarted with the Celium CNI and Celiumum cluster mesh Yeah Um so was there anyparticular reason why you didn't go withselium service mesh and do you thinkthat might have been solved by nowpotentially as well that's actually agreat question So the question like westarted with psyllium why we didn't gowith the selium service mesh actuallythat was like my earlier intent to gowith the selium service mesh but thething that like we found out that likeselium for the service meshspecifically didn't try to like createits own CRD as have I seen like itstarted to depend on like the servicelike the actual service object inKubernetes and it doesn't give you a lotof flexibility and like how to do likedifferent things because like they wereworking around annotations So you canlike define here's my global servicehere's my local service like the optionsthat you can get in that service It itdidn't work until to the extent that wewant to It works out of the box We canget like sure like a multiclusterdifferences between two regions We canlike shift traffic quickly but we canget retries We can get canary deploymentLike there's a lot of things that likewe have to work through And selium atthis time as well wasn't integratedreally well with the gateway API andthat's where STO came into the mix andlike will offer some of its ingress Sothey are complimenting to each otherlike they are complementing each otherin a way that like we still use seliumand we're not planning to move awaybecause they provide us with all of theHubble flow logs all of the stuff on avery like layer three layer four wherelike we can see like who's talking towho but handle like most of the otherlike layer 4 processing identity all ofthe spiffy maybe we merge them one daybut we're still using CNI with the S2CIcombined together and then like we has awaypoint and uh ambient on top of itThank you very muchHi Um my question is more on theservices which are outside of theservice mesh which are actually notparticipating in the service mesh Sowith sidecar you can actually havetraffic policies applied on them as wellBut now it looks like with waypointproxies all those policies will beapplied on ingress rather than on egressSo is it still possible to like havethose policies applied on those externalservices which are not part of theservice mesh yes it's still possiblelike that wasn't the use case that wetried but like with the new architecturethere's also you can deploy it as anegress way that you can control yourpolicies through the egress So like youcould have a waypoint Lynn I didn't trythis personally but you can have awayoint for your egress So the questionwas more about like what if the trafficis going outside of the cluster can wehandle this with all of the policiesthat we apply That's what I was hopingto show you in the demo too So basicallyyou can deploy the waypoint proxy tocontrol traffic into your namespace Youcan also deploy a egress waypoint tocontrol traffic going from within yourcluster to external uh services like uhAmazon um storage service or open AI Sothat's all possible and uh you can usethe egress uh waypoint as the policyenforcement You can do retry resilienceall that capability of layer 7 proxy isavailable in the egress waypoint That'sawesome Thank youWelcome Any more questions while we getthe live demo towork give it a minute Yeah Any otherquestions uh we have three minutes leftYou see an actual debugging for livedemos That's why like usually peoplelike go for like let's record the demoand not risk it But like Lynn here is sobrave to call this last minute So let'ssee if we can get it towork All right I'm bouncing a bunch ofmy pods So hopefully um the what'shappening I think if I get it work I'llexplain to you all what'shappening OkayLooks like we have one more questionsYeah nice talk Thank you My question isabout the 50% CPUutilization So did you see the samenumber in production i mean what couldyou share the details of the testing yesSo the the details of the testing areactually public on the benchmark itselflike you can look at this But like whatwe have seen is the Z tunnel CPU usageis 50% less than the S2 proxy usageitself So if you compare that to this itwill like you see the 50% Also keep inmind like you you also like control nowall of the CPUs that like you've beenwasting on the proxy and[Applause]all right let's bring you all onstage I'm so sorry but I will explain toyou what happens uh after I got thisanalyze All rightSorry No that's good And just likedebugging on the stage Love it All rightThe audience are engaged Awesome Sometaking notesActive listening andengagement Awesome I love you allThank youall Thank you manSo uh just a quick note um I do you wantto finish answer the question first nolike I answer the question like we weretalking about the CPU usage Okay perfectSo just a quick note on what's going onwith my environment because I'm runningeverything in the my laptop theconference Wi-Fi is horrible So whathappens is it still does certificaterotation every 24 hours for all the podsin ambient So um I have my demoapplication I have my uh waypoint I havemy uh egress waypoint and my namespacewaypoint So because I'm running on mylaptop I'm not always connected rightnot running So when the certificaterotation happens my environment was downNow when I bring up I I lost the windowof certificate rotation So some of my umpaths doesn't have the rightcertificates So when when I show youguys the live demo and it tries to gothrough the demo application through thewaypoints if one of them doesn't havethe right certificates it would hit itwould fail with the 503 or unable toconnect So that's exactly what happensin my environment After I delete all thepod make sure they have the latestsearch from Isto Then we will able toget the environment running And you cansee this is what I just showed by theway the traffic come to the ingress andcome to my demo application and it's allsecure through mutual TLS And uh sothat's the observability portion uh wejust talked about So if you zoom in tothe last 10 minutes you're going to seeit's connect through uh egress waypointto the external large language modelwhich I think one of you was asking thatquestion You see a lot of red becausethat was me struggle to bring the demoenvironment up running and we see theHTTP request metric that Aman talk aboutright layer 7 metrics all available Somy application is all available withoutus needing to do anything There's nopsych in the environment All powered bythe Z tunnel and the way points Allright I think um we're over time We'reright on time We're one minute overThank you all so much I appreciate it2025-04-15 21:58:45.706268 DR��#��cADc6S4vU9GiMum we'll be talking about um imagesnapshoters in the context of particlephysics where you'll see later that thisactually doesn't only apply to particlephysics but also beyond so I'm gladyou're here um and before we get intothe details um let's briefly introduceourselves i'll start with myself so I'ma physicist um so I'm largely actuallydoing particle physics research at PaulSher Institute PSI and um yeah I'mworking on one of the experiments thatis at and then again at the largecollider specifically the CMS experimentand here's Valentine yes so I amValentine i'm a physicist turnedsoftware developer and I developed aCERNVM file system that actuallyimplements a containerd snapshot andwe'll hear more about that later butfirst uh Clemens will introduce to youactually the problem uh that we face atCERN and why actually uh containerdsnapshot solve an important problemgreat so yeah you heard you have twophysicists right in front of you sowe'll talk a bit of physics but not toomuch um so if you think about particlephysics in Switzerland um well I guessthe first thing that comes to mind a lotof water um green pasture and so on umso actually two largeum labs in Switzerland one is thePorsche Institute or in short PSI that'sSwitzerland's national lab for naturalsciences and engineering and thenthere's CERN you know widely knownyou've had several talks actually fromCERN here at this conference theEuropean laboratory for particle physicsand they both have in common that umthey're they have particle acceleratorsso we're basically accelerating sciencewith ��#��_AvrG5tBDsdd0hello thanks for coming Uh we're goingto talk today about debugging envoytunnels So whether you're using Envoydirectly or you're using it through someservice mesh we're going to show youwhat things could go wrong and how todebug them and fix them Hopefullysomething that you will find useful foryour day-to-day workOkay I'm Alexandra Stoka I'm a sitereliability engineer atAdobe I want to say that this is myfirst CubeCon and first time as aspeaker at a conference ever So��#��mA9U3WMez9q74good morning CubeCon Can you hear us allright We are going to be here talk aboutjourney at New York Time is cycllessservice mesh ambient mesh disappearinginto theinfrastructure Uh my name is Selen Icome from Raleigh North Carolina asuburb of Raleigh Kerry Uh I work atSolo and a fun fact about my employer weare the 10th largest contributor to allof the CNCF project combined Now I'mgoing to pass to my co-speaker Thank youLyn Hello everyone My name is Ahmed BorsI'm a principal engineer at the New YorkTimes I work in infrastructure I doKubernetes I do STO And this is aspecific talk is special It's my secondtalk that I have my daughter with meShe's sitting over here Let's give her around of applauseSo let's start talk about the New YorkTimes Our mission is simple We seek thetruth and help people understand theworld And we aim to do that by buildingthe essential subscription for everyEnglish speaking person so they canunderstand the world One question I geta lot is like what we do like we arevery known for our news and journalismbut we also do other things like gameslike I'm sure you heard about Wordlecrosswords and other games that weprovide We also have the cooking apps ifyou are inspired to do some recipesWould love to talk to you about what'syour favorite recipes And then we havethe wire cutter if you want a productrecommendation Athletic for like all ofour sports fans and audio as well if youwant to listen to our stories anddifferent articles as well So let'sstart by aquestion Do we really need a servicemesh who's in here use a service meshcan you raise your hand everyone hereuses service mesh That's great But likesometimes you come to a question whenlike you start to bu� I'm veryhappy for that I spend my days managingcloud infrastructure automatingeverything inside and making sure mostlyKubernetes behaves I like to say thatfor me is like coaching a team ofunpredictable gymnasts because I used tobe a gymnast myself So turns outbalancing on a beam was great trainingfor balancing performance cost andscalability in the cloud The onlydifference when things go wrong in techAt least I don't land on myface And I'm Carlos I'm a principalscientist at Adobe and I've been workingwith open source for a long time Youprobably heard about the JenkeisKubernetes plug-in that is been now over10 years and I I started that and we'reboth working at Adobe Experience ManagerCloud ServiceOkay before we dive in I want to tellyou about what Adobe Experience Manageractually is Short is AM It is a CMS acontent management system that helpsenterprises build manage and delivercontent at scale You create your contenthit publish and boom it's out there forthe world tosee Technically speaking AM is adistributed Java OSJ application Is itis it a it is a bunch of uh open sourcecomponents from the Apache softwarefoundation and one of the one of thecoolest thing uh things about AM is itshuge market of extension developers Thatmeans people write code and then we runit on AM for themcustomers used to uh run AM on premiseor in other cloud setups but a while agowe launch Adobe Experience Manager as acloud service Uh that means we run wehandle everything for our customersdeployment scaling upgrades and all ofthat is running on Azure Right now wehave over 60 clusters worldwide andsince this is a content managementsystem people wanted to run uh theircontent as close as possible to theirusers So we deploy them in almost anyregion we can get our hands on Whatabout our scale that we're operateoperating at think of an AM environmentas a fully stacked kitchen You haveeverything you need to serve and preparecontentUm customers can create multipleenvironments whenever they want and wehave a huge bunch of those over 31uh,000 environments more than 200,000Kubernetes deployment objects and over20,000 namespaces across ourclusters But why did we need Envoy ourcustomers wanted something very specificand one of those things were uh wasdedicated agress IPs That's that meanswhen uh their traffic leaves ourclusters they wanted a unique unsharedIP addressuh uh because it was more secure forthem They also wanted privateconnections uh to their data centers orother cloud services like V-Net peeringprivate links express route directconnect and VPNs So we we askedourselves how can we do that mosteffectively and that when that's whenenvoy camein Let's see who here usesenvoy Okay lots ofyou We also uh it is an open-source edgeand service proxy designed for cloudnative applicationsuh it focuses on service to servicecommunication handling uh retriescircuit breakings automatically It canhandle load balancing Um it manages uhcommunication using TLS certificates andit provides a deepobservability So the beauty of a is thatit simplifies complex network problemproblemsUm and yeah this is what we're workingon leveraging envoy to meet ourcustomers needs for private networkinguh dedicated egress IPs and portforwards and trust me the journey hasbeen as fun as debugging a productionissue on a Friday at 5:00 p.mRight So let's switch things up Yep It'stime for a quiz with live demos wherewe'll explore some scenarios of what cango wrong in MTLS communication YeahLet's put this up there Do you want toscan it too oh okay I don't have myphone Okay So we're going to show youseveral scenarios as much as the timeallows and we're going to go and you'regoing to help us figure out what's wrongand how to debug it Right So going togive you some time there I'm going tostart the first scenario Okay that'sgoingonAndokay can you see it okay on theback Okay No nobody complaining So we uhwe have oh actually I forgot to show theuh ahyes that that uh we're going to do thedebugging and I like this quote that islike being a detective in a crime movieYou are also the murderer You're alsothe one looking for uh for you are theone holding the knife So what we havehere uh onthe on the demos or the scenarios we aregoing to go through uh we have twoenvoys talking to each other overMTLS and we have a client connecting toa local envoy and that envoy talks tothe up let's call it downstream upstreamenvoy over MTLS using a certificateauthority generating certificates forbothsides and the upstream envoy talks to aSo from the client which we're going tobe using curl to the service all thispath uh between envoy to envoy is securewithMTLS So all the scenarios are going tobe the same and we're going to gothrough some failure failure uhscenarios and yeah we're going to callit downstream upstream and it's simpleand the source code is already in uh isgoing to be it's in GitHub and I'll showthe link later So uh let'ssee the firstscenario You got everybody got the okaythe cahootgood The first scenario uh I'm going toshow the lo get the logs from curl envoydownstream and envoyupstream And why is thisnotokay this should beworking Okay this is running Uh curl isgetting 503 errors So curl is trying toreach the service through the envoytunnel and it's getting 503sNow where why is this not showing me thelogs okay I don'tknow So the first let me try to figurethis out and then the first let's startwith the firstquestion Come onYes What do you think we should checknext to figure out why these 503 errorsarehappening should we increase should welook at the increase the logging on theupstream increase the logging in thedownstream check the metrics check thedownstream certificates check theupstream certificates or move to a cabinin the middle of nowhere with nointernet I would do that which whichlooked like this place that has nointernet But anyway so all the questionsare going to be more or less the sameand you tell me what to do Okay Increaselogging on the upstream side Uh okay Whyis this uh let's dothis So one way to increase the loginuh we're going to dockercomposeexec Uh you said upstream right what'sup upstreamIf you have the admin console open uhsorry the admin uh port uh open on theapp on the invoice you can do arequest and increase the logging So I goand say everything uh increase the loginto debug Now you get a lot of noise andthink that this is just making onerequest per second when you are lookingfor issues this is not very helpfulbecause maybe you are getting hundredsthousands of requests and there's a lotof noise So one thing that we can dohere although if you uh have good viewyou probably saw something going throughthere Let's put back the level toinfo and you can do the bycomponent and an interesting componentwhere you can increase the logging isthe connection So I'm I'm I'm I have putback the log level to info andincreasing the log level to debug onlyin the connectioncomponent Andnow what I see is a nicer easy to findOpenSSL internal Let let me stop this abit Uh certificate expire TLS errorN So if you want to if if you have thelog level info by default you're nevergoing to see this Uh envoy does nottreat this as an error It's justsomething that you would see if you aredoing debugging So now we know somecertificate isexpire Uh let mesee Let's what we should do nextIt's up toyou So same questions as as before Umand the extra one is you use IPO AC totunnel the traffic If anybody does doesnot know what is IP o over AC you lookit up tell melater And most people are saying checkupstream certificates Okay So to checkthe string certificatesuh let's see it's dockercompose exec and it wasuh what was this it was a cut I thinkhere Uh the upstream upstream upstreamuh thecut sh uhuh envoy upstream sh and I go and seethe certificate There's a lot of texthere and I see that certificate is uhnot valid after March 19 So this casethe certificate is expired on theupstream Other ways of doing thisuh in a easier way you can put uh justput dates and you get just the dates ifyou if you know what you're looking forit's a lot easier to get theinformation So first scenario uh welldone The uh the certificate is expiredWe found theproblem Let's go to the secondscenarioOkay Uhwaiter let's see if they are not runningThey're still running or notdown just in case And let's start itagain Okay And now let's see if thistime the logs workandlogs Okay for some reason the logs don'tworkhere I'm going to blame it on theinternetAnyway anyway same case as before Bydefault we only see that curl the clientis getting Firefox 3 errors So whatshould we do nexti put it in 10 seconds Maybe you have tobeagile So same things as before Checkincrease login check metrics checkcertificates or look for HTTP 418responses and you are saying increaselogin on upstream sign 41A HTTPEverybody knows what that is If not lookit upand increase login on upstream side Solet's goagain docker and this time we are justgoing to increase the login compose execuh no docker composeexec upstream connectiondebug and now uh we get a certificateverify failed in thelogs that doesn't as much does it so theonly error upstream is certificateverifyfateSo what should we do nexti wouldrecom space programming language If youhaven't heard of it look it upThat's okay Now downstreamcertificates Oops WaitGo the downstream certificates So thiswas uh dockercompos and thedownstream very long certificate thingand the certificate is stillvalid So what could be wronghere i don't even remember what was thisscenarioUhokay And we can see the dates and we cansee more information Uh wait asecond So we can see the subject we cansee the DNS we can see thingshere So what should we donext so the certificate is not expiredOkay Oh then now we have a tie increaselogging on downstream side and checkupstring certificatesI'm going to go with checkup stringcertificatesAnd the certificates up upstream arealso not expiredWe should look for who signed them MaybeI don't know It's too bad that I don'tremember What was the problemwe have it in the notes YeahAnd if we look at the whole certificateuh yeah we don'tsee nothingobvious Somebody else was sayingincrease the logging on the other sideWhat was the other side so this wasupstream Now we have to increase thelogin thedownstream because it was a tie uh[Music]curl But it wasdownstream Yes Downstreamupstream anduh and now we can see a bit moreinformation So the problem is unknownCA that means the certificate is signedby an authority that one of the envoysdoes notrecognize Actually in this case it wastheannunci there's one way here to see itjust withthe this istheupstream And we see thedownstream anduh is the the CN Yeah you can see heresomewhere that the one of the the CAS uhor sorry one of the CN is different fromthe other one So the certificates are isit that the certificates are signed bydifferent two differentCA was that ituh yes yesokay so we can stop thisone and go to anotherscenario one that We will remember maybetoolate Okay Uh okay This isstarting This is a a slightly differentoneAnd ignore the graphana errors just thecurl ones So curl cannot resolveum cannot connect basically orresolve and it's not even able toconnect if we do adockerdocker well actually I'm going to askyou what we should donext squirrel is not able to connect toenvoy not getting anytraffic We still have 8 minutesGood Okay You are saying there's a lotof there's a mixture People are like"Yeah okay Something or another Checkmetrics." WowOkay I don't think you're we're going tofind matching metrics but we can do anwe have a dashboard here Is it loaded[Music]and because it's shared across thescenarios let's go into the last fiveminutes And there's no connectionshere in the there's no connectionsBasically there's nothing showingup There's a request here on theupstream No Yeah with this small amountof time is not great butuh you because we have to ignore the theprevioustimes But uh basically it looks likethere's no traffic no connections goingon So what should we donext you didn't think you were cominghere to work rightyou're going to say I'm not coming backto this guy's talks any ever again everUh increase logging on the downstreamside So docker composeexec compose Well actually it's going tobe faster if I find it here and you saidthedownstream downstreamdownreamhere And oop something happened Nothingor nothing happened If I go and look atdocker compose statusuh oh it's not the status uh composps and uh a yeah the downstream is uhnot running it is here exited 3 minutesago thedownstream So we can go and look at thelogs Docker compose logs envoydownstreamand an error here envoy refuses to startat all because the private keyuh it's uh it's the key valuesmismatch which means uh thekey in the what was the key in thecertificate is is the key andcertificate does do not match How do youknow that besides thiserror is uh with a bit of a complicated[Music]uh open SSL command I have it here isthisuh because the container is not runninganymore I can find here that the modulusof one of thecertificate does not match the modulusof the key This uh you may think this isa bitcomplicated or a bit hard to break itthis way but when you are renovatingcertificates if the key is copied beforethe certificate or you have some sort ofrace condition you end with this andwill refuse to start These are all thescenarios that we hit I mean we we areprobably bad at doing this but but theseare the scenarios that we hitAnd I think we go to the last one Uh wewe go stop the we don't do thefourth This one is a bit morecomplicated We have just four minutesleftandDocker composeapp Uhwait So let's just stop the other onejust incase Come on docker composeapp and we're getting uh are we in theright direction yeah we're getting amixture of 53 errors and 200errors So I there's not much time leftso I'm not going to ask you You can restfor now And uh if I go tothe to the dashboardsuh this should start getting trafficsoon And when we have a mixture ofsuccessful requests with fail requestswhat you realize is okay thecertificates might be right because ifnot I would get errors all the time Andthere's some things that you can lookinto themetrics One of them is theoverflowand and the well we see there's a numberof connections here We are pounding theenvoy with uh connections We are doing alot of connections and one of the envoysupstream or downstream is rate limitingus and causing this those connections tonot uh go through and the only way youhave or the best way you have to know itis going into the uh overflow or uh orrate limiting depending on how youconfigure envoy So if you're using thedefaults it takes a number ofconnections But if for whatever reasonyour client is not closing theconnections properly you may end you mayend up with a number of connections thatexceeds uh the thresholds of envoy andenvoy is going to start uh closing theconnections on the upstream for instanceand the client is not going to see thatthe connection is not established It'sjust going to see that you're getting503s and that rips a lot of peoplesaying why I'm getting a 503 error Thatmeans that the upstream envoy is notable to cope with the number of ofconnectionsThat happened also That also happenedYesSoum the resources you can see uh all thisis the official documentation of Envoyuh TLS at boxes TLS good examples abouthow to use TLS with envoy uh the doubleproxies this configuration where youhave envoy upstream and downstream theytalking to each other and the last oneis this demo where you can see thedifferent scenarios and use dockercompose to start them upand this is the QR for the feedback ifyou have good feedback you can put itthere if you have bad feedback you cangive it to us personally Okay Anduh hopefully this will help you ifyou're using envoy again directly orthere's a lot of services out there thatuse envoy service mesh things like thatIf you are uh if you see problemshopefully you got some ideas of howcould go wrong and how to checkit Thank you Thank you2025-04-15 21:58:46.230602accelerators and what acceleratorsdo is in the name they accelerateparticles and that means um to someextent And they um either shoot thoseaccelerated particles then on targets orthey make them collide so um shoot themum on on each other right and these umexperiments or these accelerators um inprinciple all look very similar so forinstance there's one that's been thereat PSI and since the 70s and it's beencontinuously upgraded the so-called ringcyclotron and then there's one thatyou've probably heard of um you knowmore often already that's the certainlarge hydronone collider in shortLHC okay so when these collisions orinteractions happen there needs to besome device that actually detects theseparticles um and one of them is the CMSexperiment um and yeah that's the onethat I'm working on it's also a prettyuh huge uh experiment so there's a humanfor scale down there um uh so this thisexperiment is 15 mters in height andthen 25 mters in in length and uh weactually get a lot of collisions fromthe large hydronon collider up to 40million events uh per second and this ishappening 24/7 almost uh all year longand then you can imagine that this issome kind of camera that has like 134megapixels um but it's taking these uhphotos um you know with a yeah every 25nconds so this is a bit better than whatyou probably have in your pocket as yoursmartphone and now what is reallyimportant is that these data are uniqueso um this experiment um together withthe atlas experiment they have been usedto um discover the hig boson and uh wecan actually also only measure and disor you know further investigate the higboson at at these experiments so this isreally um you know unique and valuableuh data now if we want to keepcontinuing um um analyzing these data uwe actually run into the issue that umyeah to some extent the experiments arepretty expensive and also theaccelerator parts um so we have verylongunning experiments and I think bynow um at the LHC and also the SIMXexperiments or the other experiments atthe LHC um there by now one of the momost longrunning uh experiments thereare in in in the world and you canactually see here on thistimeline that um you know initialcollisions or operation started around2009 right so we're now like 16 yearslater already and then if you look um atthe end of this arrow so this is notentirely to scale here but the currentplan is to continue taking data andcolliding particles until 2041 so that'sanother 16 years to go okay and now assoftware engineer you think about okaynow what is actually my Linuxdistribution or release that I can useand and you can see that um well weinitially started actually with theRedhead Enterprise Linux 5 and derivatesof of that right so um binary compatibleones typically uh so then at some pointpretty soonish we switched to um versionsix then uh we have seven running stillin quite a few places but now themajority of systems have moved to Redditenterprise Linux 9 and um well somebodyyou know who can predict the future cantell me which Linux release we're goingto be running in 2040Um but you can tell that um I mean it'sactually really um important um to umyeah containerize all your workloads andthat's like the only way in which we cankeep uh doing these these things becausenobody wants to um have like SLC likescientific Linux 5 or enterprise Linux 5machine uh somewhere connected to theinternet right so that's particularlywhat admins don't like I don't knowquite why but um you know that peopledon't do this anymoreOkay so now on to Valentine right so Ihope this doesn't come as a shock to youbut when you run in container on averageyou use about 15% of the actual datathat you download from the registry so85% of what you download is effectivelyunused of course you can minimize veryaggressively containers for your usecase and then you're more efficient butthis not uh really um possible in alluse cases in particular for for particlephysicists we have a huge softwareframeworks that do lots of things andyou might want to really package a wholeexperiment software into one containersin order that it's a that it's a usefulstandard containerum so in order to execute thecontainerized workloads efficiently isis really important then to to also bemore efficient in actually the data thatwe use and one of the most uh promisingavenues here is the use of uh uh of lazypooling basically the snapshoters thatimplement lazy pooling the basic ideahere is that a container is composed ofa layers it's basically a set of tarballs and a manifest list and what whatlazy pooling allows you to do is toexpose all of the layers uh to the tocontainerd without having to downloadall of them typically this is uh this isthen uses some kind of internal indexthat's for example how starg one veryinfluential lazy pulling snapshot worksbut in the end that means that it canstart a container almost immediately butthen be slower during execution asthat's actually then where you on demanduh download the data that isaccessed all right uh and then uhClemens will talk you a bit more aboutthe different actual implementations oflazy pooling snapshot that that existtoday hi thanks Valentine yeah so thereare different options around on on themarket and actually I talked to some ofthe containerd maintainers early on andthey have their session actually inparallel to ours uh and I mean this listis not complete but this is one a listof those that are basically um aroundand they will actually be talking aboutone that they will add with the nextcontainer uh minor release um so theyall work as plugins to um containerds ialready mentioned it and uh weinvestigated the three that are listedhere in red so the starg snapshot theSochi snapshot and then also the CERNVMFS uh snapshot and we'll I'll go umthrough all these in in some more detailin the following then as a comparison uhwe always benchmarked everything againstthe traditional um overlay FS uhsnapshot so that doesn't do any lazypulling i just want to mention thatthere's also the Nidus snapshot um whichuh yeah is also a very interesting uhproject but we haven't used it for ourbenchmarks in principle you'll see thatyou know I'm already giving it away nowthat the performance is very similar aslong as you're just using a lazy pullinguh snapshot and now some people mightask so what if I run on like publiccloud offerings um so it's a bitdifficult to get like insight into thisbut we found for instance that on umGoogle kubernetes status engine there'salso an image streaming uh service andthen on Microsoft Azure for instancethere's they also have uh something thathelps you with the image distributionandum there's also nothing that wouldprevent you from installing either ofthese snapshotters into uh those publiccloud offerings if you wanted to dothat okay now also the good thing isthere's actually no risk in uh usinglazy snapshotters because all of themare implemented in a way that they willactually fall back to what I call herenow legacy polling so I'll just go backto downloading the entire imageunpacking it and then uh executing it incase um the image isn't available in away that you can uh um you know performlazy pulling okay so now this lazypulling mechanism or the way thatenables lazy pulling is implemented indifferent ways and that's basically howyou can differentiate between thedifferent uh snapshotters so the firstone that I'll discuss here is theso-called starg snapshot and the idea isbasically in the name is to use imagesin a searchable targ format so they'reactually then required that the imagesthat you push into your registry um aredifferent so there's there they actuallyuse an extension to the um to the umyeah gzip format um so that that um youyou can search these um images um in thestandard OCI formats um the youbasically have to take the entire layerblob and extract it to get a single fileentry now by basically packagingeverything into smaller um tar balls umyou can can extract specific filesindividually and yeah that then enablesthis uh lazy pulling also here on thetop right you'll always see there's a QRcode and also the title of the slide ifyou want to download them is actuallythen links to the repositories um or thedocumentation okay um so now one thingthat can actually happen if you're doinglazy pulling if you're like you knowpicking all the bits and pieces fromyour image or from your from yourindividual layers you can actually seethat things somehow slow down valentinealready mentioned this so there's oneway for instance that in the star thatsnapshot you can improve this by forinstance um um prior using a mechanismthat's prioritizing the files okay andI've been talking too long uh on thisslide um so that um u sorry let me justgo back um that will give you the umoption um to then have you know a bunchof uh files um prioritized for downloadand you don't see these delays later onokay now the big caveat that we've seenalso when trying to um you know convinceother people to actually um use this isthat you actually have to rebuild theimages in these um in this extended uhstarg um format right right so youbasically have to recreate or convertall your images and that's um somethingthat makes adoption a bit difficult ormakes it somewhat umslow so the soi snapshot or seekable OCIuh snapshot um is actually then a forkof the star snapshot um which uh somehowaddresses this problem by adding aseparate index artifact to the uh to theimage that is then hosted in parallel umin in the in the in the registry so thathas the big advantage that you do notrequire a build-time conversion step umand you can actually add this um indexartifact after the fact right so youbasically just um have a job that or youknow command that analyzes the image andthen um adds the information on on whereyou actually um have the files that youneed to extract from each layer as a asa some kind of um yeah index artifact ormanifest in parallel then in theregistry okay and then for optimizationyou can for instance then also configurea minimum layer size for index if alayer is particularly small it doesn'tnecessarily make sense to also indexthat but you might just download theentire thing and and un unzip it um andthe default here is something like 10megabytes and you know something umreasonable so now they actually madethem the um deliberate choice of notprioritizing not having this fileprioritization um which again you knowcomes with an advantage um because youcan then also share um layers with otherimages right so in principle you can umif you already have like you know alwaysthe same base image and only somesomething light on top you can sharethis because you're not moving aroundthings in your um images and you're justum always using the same base okay nowthere's also a small caveat to thiswhich is that not all registries supportadditional artifacts so we're actuallyusing quite a lot of GitLab and theGitLab registry does not support umthese additional artifacts but forinstance Harum does that without anyissue so you just have to be aware ofthis and this is also then listed in thedocumentation here okay and last but notleast um before I hand them over to umValentine to give you more details umthere's the um CVMFS snapshot and theidea is to use so-called unpacked imageson CVMFS and I just want to mention herethat this is somewhat similar to theNaida snapshot uh and you there thenrely on some additional contentdistribution uh service that then bydefinition here is not OCI compatiblebut um yeah I'll hand over to the CVMFSexpert So more details on that so nowI'd like to give you a bit morebackground on what is CVMFS and how webuilt our own container D snapshot soCVMFS is a global readonly file systemfor software distribution and at itscore it's really an ondemand streamingservice for scientific software thinklike um Spotify but you can uh downloadbasically uh all types of scientificsoftware so what happens is that youhave on your computer this globaldirectory/CVMFS and whenever you accessor open files on this directory uhactually what happens behind the scenesis that they get downloaded and cachedfrom a remote server that offers thembasically your someone publishes thecatalog of available software and thenit's it's on demand available to you sothis is implemented as a file system inuser space so that it's easily usable onall worker nodes in distributedcomputing environment like we have atCERN and it's really heavily optimizedfor storing and distributing software sowhat we can do is we have an objectstore that backs backs the CVMFS thatallows to dduplicate files and evenchunks like at the at the subfile levelwe have many levels of caching do we usebasically plain HTT transport and thenwe have verification of data integrityand we can compressdata uh to give you a few numbers soCVMFS is quite a quite a mature projectby now it's more than 15 years old uh wehave around 4 billion files in thisCVMFS tree these come from a variety ofscientific experiments uh sort of doesthere does the big LHC experiments butthen there's also uh experiments fromfromastronomy LSST uh then the UK experimentfrom from the European Space Agencyuh and uh and the long tale of ofsmaller experiments that make use ofCVMFS as well to deploy uh theirsoftware to distributed computingresources all around the world so hereyou find find as well a a picture fromthe the worldwide LHC computing gridthat heavily uses CBMFS to distributesoftware and actually the biggestrepository that we have is unpack.ch chwhich stores container images unpackeduh and I'm going to talk a bit aboutthis later just wanted to highlightfirst that uh apart from the manyacademic users that we have uh we alsohave a very nice collaboration with jumptrading a financial trading companywhich has a very interesting use casefor CVMFS and they use it as a filesystem view on their market data archiveso while it's optimized for softwareit's also possible uh to use as a filesystem view on on external data uh anduh we also want to thank them for forsupporting the open source and and ourdevelopment of ofCBMFS and then there is as well easy uheasy is a project of European CIS adminsthat tries to build a common softwarestack of HPC applications and those areas well then deployed on CVMFS and thenthe idea would be that that any uh HPCin Europe that is participatinguh has can can access the same softwarestack of uh of HPCpackages so how do we use containerswith CMFS so we provide basicallytooling to take the whole containerimage and unpack it in a CVFS registrythis allows us to to use the the backingobject store dduplicate any files thatthat that can beduplicated and then basically uh accessthe the whole container uh directly in aregistry so it's like this this ofcourse requires a bit ofinfrastructure for the servers and theworkflows to unpack images but once youhave them there is in the end like avery efficient way uh to access imagesand I'd like to to uh um show this witha with a short recordeddemo the nice thing about thesesnapshotters is that they're verytransparent so basically that the theuser doesn't really uh see what'shappening behind the scenes but here wewant to actually get a few moretechnical details and for this uh I'vewritten like a small uh application thatshows uh a few a few additional piecesof information uh so down here you haveum a widget that that shows the thenetwork and the the bites downloaded andon the right here there's two widgetsthat basically uh through the Linux Inotify interface basically just show theactivity in two folders one is the thebasic overlay FS snapshot folder fromcontainerd and then there is the cvmfscache and now let's uh see what happenswhen weactually then uh run CMFS so first I'mgoing to to pull uh a standard image uhfrom the the normal container IDregistry and we see basically that we'redownloadinguh around 400 megabytes and this thenall gets gets uh gets put into the thesnapshot directory uh so you you're nodoubt familiar with all of this and youcan see that as you finish the the thebasically the the pulling of the imageyou you have all of the the files thenbasically available directly now uhwhat's what's going to happen next isthat I'm going to remove the image sothat it's no longer on on the on the onthe local host uh and then use the CVMFSsnapshot and you see that the behaviorhere will be very very differen tuh so we use basically again uh pull uhand then have just have to provide thethe snapshot argument so we we do theuse the CVFS snapshotuh and then basically pull uh the thesame image and here basically as as wasmentioned before this is basicallyinstantaneous so it's only needs to ffetch the manifest and then verify thatall of the layers are available on CBMFSthen actually once you access uhanything on uh once you run this imagethen of course is we can't quite domagic so at one point we actually doneed to do to do the download uh and uhand and start things so if I actuallythen do something inside this containerhere sort of as importing somethingthat's that's non-trivial theneverything uh gets gets actuallydownloaded and accessed of coursethere's a there's a certain latency whenyou do it interactive use but this isusually way more efficient you can alsosee that that actually what what you getin the files are really uh files that'sthe chunks of the files that areaccessed so that's that's kind of likelets you lets you be very efficient hereall right uh so far from for the demoand now Clemens will actually uh give alike this was this was a very intuitivepresentation of this but we alsoactually did a bit more rigorousbenchmarking which Clemens is now goingto present okay thanks um yeah so we'reactually going to look at the same uhimage so now we just arbitrarily chose aPython 3.9 uh image from from like aDocker Hub and uh here on the left handside you can actually now now seecomparing the different snapshoters uhthe data downloaded in megabytes okayand you can see that um the overlay FSyou know downloads the entire compresseduh image uh whereas all the snapshottersthey yeah I mean they they downloadsignificantly less so something like umyou know 10% or even less of the dataand you can see that there are somedifferences here that has to do with theway that the snapshoters work um youknow and also how they're configured butI mean you can see that that yeahthere's a significant uh reduction so ifyou have to pay for um downloading uhthings from a registry um you know thisis something that you might consider uhto use um also then if you look at theright hand side you see the time inseconds again comparing the differentsnapshotters and uh or you can see thatbefore um something uh happens so all wedid is basically we have a workload thatum you was like something like nerdctlrun um and then something like pythonprint hello okay so really doingsomething very um trivial uh but you cansee that of course the overlay umsomehow limited by by the network uhthroughput right because it needs todownload everything so here took ussomething like 15 uh seconds uh to runand then in light green you see can seethe um time until we actually ready toexecute um our um workload okay um andyou can see that again the the three theother so the lazy pulling snapshottersthey are very um comparable i mean theytake a bit more time to actually thenrun the workload uh but um the but youknow you almost instantaneously in inthe in the terminal um or or the rippleum and and um you know can get things uhstarted okay so now we did this forsomething slightly more um complicatedum so there's a software that is used umwidely in the CERN community that iscalled root has nothing to do withprivilege access it's something that weuse really for data analysis um andthat's here using a specific versionbased on Ubuntu Linux distribution umand um now here comparing again firstthe uh data throughput and now we'redoing this for different complexitiesright so the first thing is just likegetting the uh bash uh print u in placeso that's shown in in red then startingPython so that's you know doing a bitmore than just getting the our shellready um and then we do some uhimporting of um you know here rootbecause we download a root image sothere's a python the python bindings tothis root software available and you cansee that at that point um you actuallystart downloading significant uh amountsof data um and then we're doingsomething like filling a histogram in arandom uh way using this Python rootimplementation and again you can seeyeah I'm somehow repeating myself youcan see comparable performance acrossthe lazy pulling snapshoters and uh youcan actually see that this is a somewhatalready optimized image because now weactually using more than 20% of theimage already right um and in othercases you might actually not have let'ssay optimized image images in there andalso then maybe for the data scientistsyou can actually see that importstatements can be quite expensive umbecause all of a sudden you startdownloading like 20% of the overall umimageAnd last but not least um now looking attime again um you can see okay there'ssome fluctuations here in our um overalluh timing um but but you can see that uhof course then this downloading um noadds time to our overall um execution soI mean the the overall execution is a isa matter of um maybe one or two uhseconds it's not it's not really takinga lot of time you can see that here onthe on the left hand side for theoverlay FFS right so the execution issomething like two or three seconds andso the delays then in an execution forthe lazy uh pulling snapshot is reallybecause you execute a command and thenthe things are downloaded from theregistry or from your contentdistribution uh system so you lose uhsome um time uh because of that rightbut I mean you've started much sooner soif you have now very very long runningum workloads uh it probably doesn'tmatter anymore for short runningworkloads you you still probably have anadvantage using um lazy pulling okay umso that already um brings me to the umsummary so I mean first thing to note isit's really great that containerd hasthis uh plug-in system because itbasically allows anybody to write theseum uh snapshotters now here we lookedspecifically at the lazing pullingsnapshotters and umthey basically have two major advantagesyou can almost instantaneously orinstantly start um your workloads andalso you significantly down reduce thethe amount of data downloaded and youdon't have to tell your users you reallyyou know try to optimize how you buildyour images uh because it doesn't reallymatter um anymore at that point okay nowwe've shown you a couple of differentoptions um which one you use um dependsa bit on your personal uh use case andalso the cluster setup but I think thebottom line our message here is um youknow don't be lazy use a lazy pullingsnapshot because we haven't found anydisadvantages to using them andbasically only advantages and if youhave any feedback for us uh there's thelink that leads you to sket.com um withthe um um yeah feedback button so umyeah thank you and we're also happy totake any questions[Applause]yes so there's the microphone if youhave any questions there's one yeah soyou benchmarked it against popularbinaries that are used at CERN yeah exeexecutables uh however have you tried uhhaving an image with a lot of smallfiles or few large files and thenbenchmarking it against against thosetwo use casesyeah so um we have actually tried thatso if you have a lot of small files youcan actually see that this like gets abit problematic so things uh slow downright so then I mean there are likethese ways of optimizing things forinstance um you know if you're using thesocially snapshot just don't botherabout like figuring out where thesesmall files are but just download anentire layer of those small files yeahbut we've seen that in in this case yeahthat's yeah it's actually very goodpoint that you mentioned it that that uhum this this is yeah like a not the lazypulling snapshoters don't look that uhgood anymore okay but I think we we alsotraded very artificially small files umand then like 100 thousand of them andthen we could see that things reallyslow down but yeah yeah so so yeah it'sa very good question thank you and incase of big files can your solution alsodownload chunks of big files dependingon you know runyeah for the CVMFS snapshot that thatinternally chunks files in aconfigurable size and uh and that thatdefinitely does help with all kinds ofnetwork problemsthank you2025-04-15 21:58:46.766898h GPUs One ofthem is time slicing So with timeslicing only one process is actuallyrunning on the GPU at any point intime So because there is only oneprocess running on the GPU at any pointin time you have a contact switchingbecause you need to you need to do acontact switching in a round robinfashion between the processes And thiswill always bring an overhead that youneed to be aware with Another point thatis very important with time slicing isthat the memory is divided between allthe processes that are using the GPU Soyou can have one process actually starveprocesses on your GPU because theprocess just greedy and they don't careabout making sure a polite nicecollocated workloadfor at the same time with all thedrawbacks time slicing is very easy toenable and it works on a very widevariety of GPUs So what you do is youjust go to your GPU operator there is aconfiguration for a device plugin andyou're gna link here a config map and inthat config map you will say that in mycase I called it slice four you couldcall it something else but the guest ofit is what you should say I want thisconfiguration to mean that I want toachieve GPU sharing in this case withfour replicas with timeslicing by default this configurationdoesn't do anything you need to actuallygo find the GPUGabelthat label the node gna see that yournode will start advertising that it hasa shared GPU instead of a full allocatedGPUNow another option to do GPU sharing isMPS or multiprocess service This is verydifferent from time slicing because inthis case all the processes are runningon the GPU at the same time You also getan extra process which is like the MPSserver makes sure you are consuming onlythe amount of memory and compute thatyou are allowed to consume So this alsoproblem where you can starve hours andalso you don't have a overhead from thecontact switching anymore So this isquite good and enabling it is almostsimilar to time slicing Again you have adevice plugin is referencing a configmap and in that config map you can sayin this case I called it MPS4 I said Iwant my GPU to be shared with MPS intofour replicas again you have to go youhave to label your node accordingly andif it works you just gna see what yournode is advertising that it has sharedGPU to beallocated and lastly we have MIG ormultiinstance GPU so in this case we canhave up toisolated partitions from one GPU andpartitions are isolated in terms ofmemory compute cash so they cannotinfluence each other anymore and this issomething we care a lot about so youcould think of them running in their ownseparate rooms and they don't have to beaware of eachother Enabling it is available only onthe MIG capable GPUs For example A1 H100You couldn't enable it on a T4 forexample So this is something to keep inmindAnd another thing that is very importantwith is that you could think of yourfull GPU as eight memory units suteunits And when you choose a M layout youshould make sure that the total sum interms of partitions is the same what youstarted with and you're not losing anyresources For example if I would choosethe partitions that you see in green onthe screen So for memory for computememory to compute If you su them up I'mgna end up with seven memory sevencompute but I started with eight memorySo I'm losing one memory unit and thisis alot In this example I'm using anotherpartition layout So in this case I'masking for two instances of 2G 10 GB andone instance of 3G 20 GB And if you themup I have 7G 40 GB which is actually howmuch my full GPU and A1 with 40 GBmemory This is how my full GPU had SoI'm not losing any resources This iswhat you should aim for Now configuringMIG is very similar to sizing MPS butit's still a bit different In this caseyoul have your MIG manager in theconfiguration the GP operator that isreferencing a config map And in thatconfig map you can have an extend listwith all the possible allowed MIGpartitions in your cluster So in thiscase we chose one and then again youhave to choose a node that has MIGcapable GPUs You have to go label yournode wait for a little bit and then youlsee your node started advertising yourMIGinstances Okour performs ofsharing We start with sharing one GPUinto four we share another one intothree and we make it available to ourusers We have small workloads we havemedium workloads we have biggerworkloads So now we can schedule themaccordingly We get a big workload thatcannot fit anywhere right now becausethe GPU shared but at least the smallerone can while the bigger one will haveto waitNow the small ones are leaving GPU Iwant to schedule my workload but I canbecause I have to go and relabel thenode so that againavailable this happens I can finallyschedule the work has been waiting to bescheduled but now I have multiplesmaller ones and we need to manually goand do it again and it takestime finally it happen so now I canschedule my smaller workload aswell Now there is a clear paint pointhere which is manually reconfiguring ournodes Another clear point is that I'mdoing this per node So this is fine ifyou have one GPU per one node but thisis not usually the case For example weat turn can have two GPUs 4 GPUs eightGPUs per one node and we need a way tobe able to manage all of themindividually So if I one time slicingone time slicing on one GPU not on eightof them and if I want to enable MIG Imight want to enable MIG on all of thembut definitely not the same layout onall of them So this is aproblem and this is where DR comes intoplay and this is why we're doing whatwe're doing So DRA is an API forrequesting and sharing resources betweenpods and containers inside the pod Thisis a generalization for persistentvolumes API for genetic resources and inour case resources areGPUs This is still in beta in 142 Thisis in active development and there aremany things many interesting things Somake sure to follow this It's it'sreallycool By the way this is disabled bydefault but there are many tutorialsthat explain how to enable the array howto use the array So I'm sure you couldhave it in your cluster aswell Now by default by itself justhaving DRA in your cluster doesn't doanything You need a driver that actuallyunderstands this and publishes varietythings so that you can successfully useyour resources So in my case because weare using Nvidia GPUs I'm using theNvidia DR driver and I linked here theGitHub repository for this driver whichis the Nvidia Kay driver GPUSo once you have a DR driver running onyour cluster it's gna publish a resourceslid where you have a lot of informationabout what is the hardware available onyour cluster what is the driver versionwhat is the GPU version and many morethings Now you could use thisinformation to create device classesbased on the attributes that areavailable in the resourceslides Once you have this in your systemthe only thing left is what your usersneed to know how to create resourceclaims and resource claim templates inorder to actually allocate the GPUs fortheir ownworkloads More or less it looks likethis So your driver will publish aresource slides and in this case I havea H100 So it says architecture hopperbrand and video and then a bunch ofinformation is very usefulWhen as a cluster administrator I couldcreate a device class where I'mreferencing some of the attributes fromthe resource to decide which GPUs I wantto allocate under a certain name Andthen my users they will just create aresource claim template In this casethey want a time sliced GPU and you cansee that we're just saying that we wantsharing with a time slicing strategy Andthen in the pod wereferencing the template namecontainers containersinside one GPU with timeslicing This is much more complicatedthan it was before Or is it It'sdefinitely more verbal So why do we evenwant to dothis And the short answer is becausethis is much more powerful BeforeDrance decide how we want to advertisethe GPUs in our cluster So I had tothink maybe we have GPUs with timeslicing Maybe we have GPUs with M with acertain layout and then you need to doit on the fly as well And that you haveto be aware of what users are waiting toget a GPU scheduled and maybe change itSo it's quite complicated Now with thea rray we don't have to do it anymore Ijust have my dedicated full GPUsavailable on the cluster and then thearray makes sure that they are allocatedin the right way when there is a userrequestingthem This is how it looks like So I'mgna create here a pod uses a GPU Nothingelse It's one container using theGPU You can see that immediately I getthe processrunning Perfect Now I'm gna delete itand I'm gna create a pod that has twocontainers that are sharing GPU withtime slicing And if you remember beforeyou had to label the node here I'm justcreating a pod with two containers butone time slicing and it just works Itgot scheduled and it works and my GPUshared Now I can again delete this podand create another one that will havetwo containers but now two containersbut are sharing the GPU with MPS insteadAnd again I do not do anything in termsof labeling the node looking at theconfiguration anything I'm just creatingmy workload and it get scheduled and itruns As you can see here it's actuallyfree containers plus I get my extravidiacode MPS server which I said when I wasexplaining MPS You get a control processhere Again I can deleteit And I'm gonna create now a pod thathas I think eight containers that aretrying to share the GPU with timeslicing and it just works I don't haveto do anything else There is noreconfiguration in place which isamazing which is what youwant Now this is how system now lookslike I don't have to repartitionpreparation anything just gets donebecause there are userswaiting I'm trying to scheduleeverything I can Perfect Many things fitNow the big workload can fit into thededicated GPUPerfect Now I get my smaller workloadsbut we are sharing the GPU gone So I candynamically just repartition GPU andmake it available for my new usersAwesome Two things to keep in mind fromour research is that currently GPUs canbe shared only inside one space This isimportant for us because we have a modelwhere we have one user per namespace Soas a result with MPS and time slicingyou could argue that it makes sense onlyto share GPUs with workloads that youare aware of and maybe are inside alsoyour team because workloads can affecteach other But with M I definitely wouldwant to be able to have MIG partitionGPU and partitions just behave smallerGPUs in different name spaces So this issomething to be followed up And secondpoint is that dynamic M is not yetavailable That's why in the demo as wellwe just saw MPS and time slicing Butthis is something that comes withcubernetis 143 So just make sure to beup to date with a space because thiswill be working soonAlr so now let's benchmark when you setup Just a couple of words I'm using 142I have an A1 with 40 GB GPU and PCI Andthe GPU access is done with the arraywith the driver that I linked before onone of the slides and this is the exactversion that I'musing Now let's see what our use casesare So we have two use cases One of themis a simulation We're gna call it LQFTSo in this case atomic nucleare h together by the strong force whichcan be described by the f theory ofquantum chromodynamics and because thisis quantum mechanics we need to take itaccount the probabilistic nature of itso you need to be able to take intoaccount all the possible ways whatinteractions can happen and this becomesinterctable really fast so in order toresolve this problem we're gnadiscretize our dimensions into just afour dimensional grade So three forspace one for time This is what you'regna see as an input for this use caseFour dimensions Now this this can be runon GPUs and if we run this on GPUsleverage Nvidia KUDA platform and thisis why it's called CUDA This is alibrary forperforming for solving linear systems inorder to simulate VCD interactionsleveraging and Nvidia Coda platformOur second use case is MLPF and in thiscase particles when they collide insideLHC interact with each other a lot andnew particles get created and usually inhigh energy physics or in you wouldrefer to this as an event So MLPF ismachine learning particle flowreconstruction This is a project whichimplements full event reconstructionusing state of art machine learningmodels In my case for the benchmarking Iuse the version 162 and you could seehere the people made this projectpossible And I just benchmarked this ona full GPU By itself numbers don't saymuch So let's put them a bit intoperspectiveSo now I run two processes on the sameGPU and the GPU shared with time slicingand one thing that you can immediatelyis that LQFT is doing great mind thatrunning with someone else with timeslicing while MLPF is doing not so greatanymore So what you would expect becauseas I said you have only one processusing the GPU at any point in time youwould expect your total time will belike sequential plus an overhead becauseof a contact switching and we can seethat this is the case for LQFToverheadfor contacting is great with MLP this isabsolutely not the case we cannot ignorethe contact switching overhead in thiscase it's really slow basically doublethe time for running the same thing evenin parallel which gives some problems interms of isolation so you wouldn't wantto do that and when I move from twoprocesses to four it becomesLQ oft still works fine with smallerinputs mlpf out of questions you wouldnot useing for it it performs reallybadly but one thing that is reallyimportant here is that I did not expectfor LQFT the third input size to give anout of memory So when I was choosing theinput size I calculated this should fitinto GPU to run four processes at thesame time No surprised to see it was outof memory And one thing that I learnedabout it is that when you use timeslicing you will have some additionalmemory allocated because you need tokeep the state between the contextswitching So as a result even though oneprocess is using less than 10 GB so Ishould be able to schedule fourprocesses on my GPU In reality I wasn'table to So this is something to keep inmind if using timeslicing Now we're gna try not timeslicing but MPS inad And again we arestarting with two processes and QFT andMLPF doing great What this means itmeans when we running one process on theGPU we still have idol resources butcould be used in order to run multipleprocesses in parallel and this is whatwe're doing We are reusing vitalresources to be able to run twice theamount of processes and we can see thatthe numbers are great Running inparallel with MPS prove to be muchfaster when running them sequentiallyBut when I increase the amount ofprocesses again from two to f we can seethat this is not anymore the case SoLQFT did so great with time slicing isnot doing so great with MPS anymoreandvaro around MLPF who did so bad withtime slicing can and should be used withMPS which is very interesting So one keypoint here is that sometimes when youincrease the number of processes aretrying to share the GPU at the same timethere is so much overhead because of allthe handling of a concurrency but it canhave very big impact on yourperformance and a conclusion here isthat the performance of time slicing andMPS it depends heavily on your workloadprofile you cannot ignore what you arerunning and just expect it toadministrator because can aware of allusers workloads and as a result we needto educate our users what it means torun with mps what it means to run withtime slicing so that they are aware andthey know what to use and when touse Second thing is that MPS works greatwhen you when your GPU is heavilyunderutilized and you have enoughresources you could reuse in order torun in parallel which is not true fortime slicing with time slicing actuallyforcing things to be run sequentiallyand as a result it makes more sense touse time slicing when you have processesbut use the GPU maybe not to 100% but tolet's say80% and lastly time slicing can severelyunderperform when the process state isso complex it makes the contex switchinga verytime very time operationG run LQ oft on 1G 5GB 2G 10GB and thebiggest partition that we have which is2G 4G 20 GB so one thing we can see isthat the smallest partition is kind ofsmall to round this use case We can seewhat we have out of memories for twoinputs out of four what we haveBut the second thing I find veryinteresting is that the performancegrows three times when I'm moving from2G 10GB to 4G 20 GB So by doublingamount of resources I actually get ascaling in performance of three times Soas a result I would prefer to give tothis use case a bigger slice of MIG sothat I make sure that the resources areused efficientlyWhen I compare this with MLPF I can seethat again the small slide is too smallthis already kind of new but theinteresting part is that here when I'mdoubling resources I just get a littlebit better results but not by too muchso the result in this case I wouldactually stick with a smaller partitionbecause I don't see the immediatereasoning to give biggest partitionbecause just the gain in terms ofperformance is not thatcon small partition can be to small foruse case request reasonably sizepartition so don't just request the bigone because you can and don't be a badneighborhood to your users on yourcluster just because you want to be sotry to be very reasonable when you aretiming your benchmarking and try tothink what is a partition that makesmost sense for your use case And lastlybecause M isolates the work clouds andthey cannot influence each other anymoreIn this case we actually don't need toknow a lot of information about the usecases themselves about the workloadswhich was the case with time slicing andMPS In this case we are just runninglike a smaller workloads smaller GPUs Sowe can be bit ignorant about what kindof workload we actuallyrunning And final conclusions actuallyusing GPUs with DRAAPI a little bit more but very flexibleI think worth it and second approach isstill inactive development so try it outgive feedback I think this is the momentwhen you should give feedback becauseit's easier to change things when theystill under development when trying tochange them once everyone alreadydepends onthem and with this thank you very[Plojimai]Silencneedfor micrHi thank you so much I have twoquestions if you don't mind A first ofall have you tried to use VGPUWe tried in the past but because of alicensing I think we kind of droppedidea for now Ok I see And the secondquestionwhen kickof reprofiling with another MHGprofile it will do the same procedurerestarting GPU driver restarting cubletand you have some kind of two minutesgapI do not think so I would have to testthis is actually available as far as Ishouldn't be the case It should bedynamically without any down Ok thankyou so muchWelcome I also have two questions Yousaid that MIG can be used only in thesame name space but that only applies ifyou use DRI Dra If you use normal MIGsthen you can just do whatever you wantwithsas for time slicingvidia a confusingline in their documentation which saysthat if a workload is dropped then theGPU drops everything when you had theout of memory issues did it drop all ofthe workloads or just the ones whichwere out of memory m I saw bothbehaviors to be honest I think in thepast I've seen behaviors where oneprocess just tries to allocate a lot ofmemory and only other processes that aretrying afterwards to allocate the memoryare killed but also seen behavior whereall of them are trying at the same timeto get memory and then there is somecontention in place and all of themcannot run anymorethank youwelcomstarif that with partitionabledevices coming in13 deviceThank youPerfect I have a question abouting thework when they don't fit So haveyou somehow solved this part when I'msorry don't think I got the questionqueing whating the workloads when theydon't fitthe workloads when they don't fit itdepends what exactly you mean so youcould just cu them in terms ofcubernetis using some of the projectsand then you just don't schedule them onthe sameGPU so if if the GPUs are too smallingprioritization ok so i think depends soyou could have priority defined somewhelse so not in terms of dra or gps butalso i attended the talk from themaintainers of of working group for andthey mention that they will have thisprior priorThank you Thankyou Thank you everyone I think this isit Thank you for attending2025-04-15 21:59:21.643192 VV�� #��QA_pgOuaYwvBQCan westart Perfect Thank you Hello everyoneTo be honest I'm very shocked with theHolly so full of people Thank you verymuch for being here Today we're gna talkabout GPU sharing at gna talk aboutcutting the cake without losing aslice So to begin with what is SER So CNis a very very special place It's theEuropean organization for nuclearresearch and it has the biggest and mostpowerful particle accelerator varias Theparticle accelerator is LHC or largehydron collider and here particle beamsare accelerated to close to a speed offlight and they are made to collide atfour points along this ring This iswhere we have our detectors CMS LHCBAtlas andAlice Now one could ask why sir needsGPUs at all And here I put a screenshotfrom the amount of papers that arereferencing GPUs along the years And youcould see that there is a huge hype forGPUs and there are so many papers nowuse GPUs in their research and it's verycomplicated to cope with ademand And here are some examples whereour experiments are using GPUs ininteresting ways There are many more Soif you want to find more about thistopic this is a good place to startall right so what often happens is ateam run simulations and we are runningthe simulations on cpus at some point westart investigating what if we would runon gps instead you have to make yourcase you have to do your research youhave to explain why you would need gpsat all but eventually if you're luckyand hopefully you are you will get adedicated GPU for your team to run yoursimulationwork as a result actually have a lot ofid time you might use the GPU during daybut during the night stays unused andsometimes even if the GPU is actually inuse you could have suboptimal code ormaybe you have just a spiky workload soall in all you still get a lot of idletime in your GPUSo to solve this we decided tocentralize our resources So instead ofgiving a GPU per team we are trying tocentralize everything into a common pooland then if you need a GPU you go to thecommon pool you get it for a little bitof time you run your workload be itsimulations inference C runners trainingor anything else You run and then onceyou're done you release the GPU backinto a common pool so could use it aswell And this is more or less what itshould look like We have a GPUs in oursystemworkloads are scheduled on the GPU theyrun for as long as they need to and thenonce they are done they are leaving theGPU so the GPU can be allocated tosomeone else now this works great whenyou have less users than GPUs but we allknow this is not the case So at somepoint the demand is really big and wesee that a lot of users are waiting forhours or even days for a GPU and ofcourse user get mad and you get angrymessages saying been waiting for a longtime please do something and you have togo and see if you can manually releasesomething so of course this is far fromideal and to solve this we are lookinginto this idle time in the GPUs what Imention so we are trying to reuse the idtime so that we can increase the GPUoffering in our system so that we cansatisfy the needs of moreusers and to do this we are looking intohaving efficient GP G sharing system Nowthere are multiple options to achieveGPU sharing on kynetics wit hat you can uh hook your code into thatso one of them is the schedule extenderswhich is basically web hooks you canwrite this in any language and it's justjson over http obviously you'll have afairly high latency because it goes overthe network and there's only just fourplaces you can extend that this isbasically legacy stuffum the other one is theuling frameworkplugins that is like the big boy stuffwhere your code ends up being part ofthe scheduleuler itself um and you haveto deploy a custom binary uh of thescheduleuler plus your code that issuper efficient because you'recompletely uh entirely native go codebut that's not very flexible in terms ofdeployments and also there's more than adozen extension points basically all theones you saw in the previous slide soyou can really go crazy in terms ofimplementing your own custom logic andthen finally we showed the wam pluginsuh yeah so we're going to say wam a lottoday um was means web assembly um sowho's familiar with it in the audienceokay cool that that's quite a few peoplejust just for the other ones verybriefly uh web assembly it's portableacross architectures you know arm inteletc it is sandbox during execution andyou can target it from many programminglanguages uh it's a bit like the oldjava write once run anywhere uh exceptthat there's way more languages that youcan use and it is sandboxed from thestart so wasn't plugins let's diveslightly deeper you've got yourkubernetes scheduleuler and this timeyou deploy the super cool cubescheduleuler wasom extension which is amouthful uh as a framework plug-in sothat is the second kind that we talkedabout before uh and this one it it'simplementing all the possible extensionpoints and it will forward the requiredones to your own plug-in written in webassembly it's running via wazero whichis the zero dependency no see gograss-fed organic wom runtime um that'sa favorite in the go universe um so thecode that it's calling onto the uh isonto the guest side so this is withinthe web assembly vm and that code isgoing to look like this so this is avery very simple plug-in which is goingto filter the nodes on which a pot canrun that that's really important pleaseall you pay attention to this becausewe'll come back to this example over andover again uh it's receiving threeparameters just forget about the firstone not relevant this time uh but youcan see the second one is a pod which islike your entire pod definition so itcan be pretty big and the one after isnode info which you can think of as justyour node information plus a few otherthings uh it's making a decision basedon that and it's returning a status thestatus has a code and a reason uh thecode is going to uh let the scheduleulerknow what your decision was um and uhreason is just a string that is usefulfor operators that's what's going to endup in your lovely kubernetes events ohand one more thing you can only write aplug-in code ingo and not even in big go but just intiny go which is a bit of a sad tromboneright uh so diane what do you think weshould do so i mean i really like theidea oftheuler plug-in framework and it seemslike a perfect candidate for webassembly to allow us to write ourplugins in uh any po popular programminglanguage i know golang gras javawhatever supports it so what's the issuehere uh why only go so do we need towrite this what do we need to do so okayso let's start by looking at the currentguest sdk and seeing what it looks likeuh in the wasam shell extension so weunderstand what what we should do tohave those other guest sdks actuallywe're going to take even one more stepback and we're going to look at theplace where thescheduleuler is handing over to the wasmextension host via the scheduleulingframework so this is the filter thatwe've seen before as you can see it'staking a pod and a node info and thenit's doing all kinds of ceremony therethat's happening uh we're pushing thepod and the node name to a stack andactually what you don't even see here isthat there's a pref filter before thepod is going to be serialized intoprotobuff and put into a buffer inmemory that can be called later um andimportantly it ends up here calling intoanother filter function that is stillpart of the host and that one does a bitmore ceremony and ends up doing thisthing there which is basically let'stell w to jump into web assembly landsso jump into web assembly land we landinto the guest sdk so this is code thatthe user is not writing this code if youlook at the function signature it'staking nothing and it's returning aninteger and the reason why is becausebasically in web assembly you onlyreceive pointers and integers and youreturn just one thing um the pointersand integers i'm talking about theyactually put on the stack so you'll needto retrieve them from the stack this isactually happening a bit further you seethis important node name this is callinginto another piece of code of the guestsdk and there you see this mem.getstring this is actually doing all kindsof memory shenanigans so that it cancall into that skate scheduleulercurrent node name which is an exportedfunction uh sorry an imported functionso that's a call back into the host umand that one is going to be given apointer and a limit and it's askingplease mr host can you put a string withthe current node name into this bufferwhich is in my memory so that i can thenreturn that and you can see that there'squite a few other places where it'scalling back into the host and doing allthe memory shenanigans i'm not going toshow you each of them you can see thatstatus to code that's the one that'sgoing to be responsible for putting thatreason string um into a into a bufferthat can then be read by thehost eventually we reach this filterthing and this isfinally the code that the user can writecool souh this is fine right let's just write arest sdk and a python sdk do all theglue and all of that and uh yeah no noone does not simply write multiple guestsdks that would be just superunmaintainable we would need to keepthem in sync all of that like surelythey end there must be a better way soyeah actually there is so for the pasttwo cubecons uh the component model wasvery hyped we attended a lot of talksand we really were impressed with it soit sound really interesting we heardsome really promising things like thecanonical api complex types so we had anexciting idea uh let's bring thecomponent model to the cubeuler wasmextension what does that really mean sowith the component model we get astandardized uh language neutral way todefine clear interfaces and structureslike the ones you can see on the slidesuh we can describe our structures thecomplex kubernetes structures like uhfor a pod for a node uh we can alsodefine uh and export our plug-in methodshere we opt to just for a filter forsimplicitybut this allows us so what jonathan wasshowing and all of the complicatedfunction calls whatnot and then doingthis in uh any other language this willultimately allow us to autogenerate allof that stuffso uh first thing which i was reallyexcited about was when differentlanguages need to communicate so andespecially when sharing complexstructures one major challenge arises sohow do we safely pass data acrosslanguage boundaries without losinginformation without risking errors oradding complexity uh that's preciselywhere the canonical api steps in uh it'sa standardized low-level binaryinterface designed explicitly for thecomponent model it specifies exactly howcomplex data types should be representedhow they should be ma serialized and howreference should be managed so here wecan see uh on the host side this is inrust and rust has an amazing macrocalled bind genen where you can point itto a wid and it automatic automaticallyadds uh structures and functions intoscope like you see here so what we candefine in a wid here we can reference itwe can instantiate a pod instantiate anode we can call the filter uh we canpass the complex structure so that's alot nicer than what we had before andalso what we can do for the guest sdk orfor the plug-in developers we canautomatically generate bindings uhgenerate automatically all of the glueand then a developer would just focus onwriting the actual logicso why generated bindings are importantso if you ever tried manually creatinglanguage bindings you i guess you knowthe pain it's tedious it's errorproneand uh most importantly it's notmaintainable because if we did write ago sdk uh rust sdk python sdk who willmaintain that it will just battle withincompatible versions uh some will bebetter maintained some not the componentmodel uh makes this problem practicallydisappear so thanks to the standardizedinterfaces the canonical api we cangenerate all of that so what jonathanshowed you pretty much we can justgenerateit so pretty much yeah like uhlike the meme shows you get a bindingyou get everybody gets a binding thatthat is just like that that just soundsawesome so let's go refactor the projectand just like use the component model uhyes that was the plan but well the rereality hit us hard so we were hyped wethought component model will uh save usfrom all of the troubles we just gorefactor and everything is perfectbut first thing which we started wasokay we need to define our width and uhwe were like okay what now do i need torewrite uh in the width do you need torewrite the kubernetes structures sohere's an actual uh pod spec that's asmall one even do i need to rewrite thisin the width do you need to do the samefor the node so we're like nobody woulddo this so that's not maintainable ilet's see is there any wid generatorsomething like can we generate wit fromproto sadly uh we cannot uh we're likeokay is there something for open api wefound some project but we didn't haveany luck with it so open api to bit wasa no-go we were hoping like let's ghosttracks but sadly ghost tracks to bit wasalso a no-go and uh yeah that was ourfirst road bump uh well we need to reuseexternal types i mean for anyintegration we cannot just uh write byhand uh manually into it so nonethelesswe're like we want to try to make itwork what's the next thing so uh now weneeded to run our filter component in ago host why a go host well thekubernetes plug-in framework is writtenin go theuler wasm extension we've beenworking on also in go uh it uses zero asa runtime so picking was zero was ourlogicalchoice soon we realized well we cannotrun components in zero we searched foruh the github issues there was a oneabout plans to support the component thegeneral sentiment was uh currently theydon't they aren't planning until someprerequisites are met we're like okaywhat are other options wasn't time go wetried it you can see uh wasn't time gostill isn't ready it's missing a coupleof things they're being worked on butthat left us in a bit of a pickle so wedon't have a goruntime we could hack it we could userust uh and theuler extender to make itwork but that's not really what wewanted to do so uh jonathan on your partwas it a smooth sale or nah so the firstthing i did is try to look atalternatives and actually there is areally good alternative there's the topcontender of the alternative that's xismwhich has a an interesting logo uh xmsupports many hosts and guest languagesone of the host languages in is go usingone zero so happy days that's good anduh in terms of guest sdks as they calltheir or as they call them the pdks forplug-in development kits uh you can seeit supports rust and javascript andpython and c and go and many moreactually so so how do you use it so youdo a little bit of of ceremony again tojust load your was a module and then youend up using this plug-in call functionuh just give it your function name umand a single input in this case it'sbytes and it's giving you back an exitcode a single output which is againbytes and a go error so let's look atthe guest side in this case i chose rustbut as i said like it works with manyprogramming languages uh this is usingthe pdk it's got that nice plug-in fnmacro and so you end up just havinggetting inputs bytes in bytes out prettycool uh we can probably do somethingbetter so you can use it with strings aswell in the case of the go host you needto convert but on the guest side it'sagain it's pretty neat see functionsignature just getting string as aninput string as an output again happydays um where it gets even moreinteresting is that you can also usejson to pass things around so let's justsay that we define a json schema likethis one so we take the existingkubernetes json schema and then we addor filter inputs the reason why there'sa filter input object here is becauseyou can only have one input uh that'sone of the requirements of xism and uhyou see my filter input has a pod and anode info that's again the same thingfor the filter that we've seen on the goside it looks like this so you've got afilter input at the top the strruct youcould have that autogenerated there'smany projects like quick type that cantake a json schema and just generate itfor you you can even generate a jsonschema from gostruct so all that is fineum you just initialize it you you havethis uh json marshalling and marshallingof the input and the output that arearound the function call that's a bitunwieldy uh that's a bit of ceremony butit's fine that's the host path likethat's not what the users need to looklike and we only need to write that onceon the guest side it's actually superneat because xism provides the plumbingso that we can receive a nice nativecomplex type and return another one whati didn't say right before so when ishowed the bytes example um is that it'sdoing all the memory shenanigans that wesaw before like it's not magic it hasall that glue code it's just that theywrote it for all the guest sdks forus okay so uh what can we do if we careabout performance well we could useproto so this is a proto where i'mreferencing the other uh uh officialkubernetes proto packages and i'mdefining my filter input again which isa pod and a node info blah blah blah sowhat does it look like on the host sideit's very similar to what we've justseen with json because on the go hostside you need to do all of those thingsyourself i'm using vt proto which ismuch faster than regular proto marshalland unmarshall round and then on theguest side well that's a bit of a letdown uh we don't have all the fantasticuh stuff being done for us yet so i'mjust taking bites if you see that filterfunction at the top i'm doing a bit ofceremony so that i end up uhunmarshalling and marshalling andcalling the real filter implementationwhich again looks pretty neat so that'snot too terrible but it's a compromisebetween you want you want performanceyou want ease of generation of guestsdks um briefly some runner-ups in termsof alternatives we tried there's xtpbinden which is from the same peoplethat made xism it's built on top ofxtism it's uh it's quite interestingbecause it defines a schema that looks alot like open api so the cool thing isthat you see again in component schemasi've got my filter input once again sothis is just this is yaml but basicallyit's again a json schema definition andwhat you see above is exports so thereyou can you can actually define thesignature of your functions and thenit's going to take this and it's goingto generate even more of the glue codefor you so it's going to alreadygenerate the function signatures and allof that that sounds pretty cool i'vetried to just convert the kubernetesopen api schema to this and then i ranthe xtp command line thing which isproprietary and calls into some opensource bits and it just didn't work itcrashed so yeah that's sad uh itprobably can be fixed uh another quickmention uh this go plugin thing that'snot the hashikob go plugin it's anotherone i cannot pronounce this uh it'sdoing all the stuff with vd proto andall of the memory pointer passing andall of that it has a go sdkum for the hostand only a guest sdk in go so we'reagain back at squareone but there's a new hope there's a newproject that was released in januarycalledgravity and gravity is very very earlyat the moment but it's essentially atranspiler so it is trying to build ontop of wazi to implement the componentmodel at the moment it is still supersuper early these are the only typesthat they support that basically doesnot allow us to have all the complextypes that we wanted to have also westill have the lack of protogenerationbut it's really going in the rightdirection and i've basically tried acouple of benchmarks i'm going to go i'mgoing to go very briefly about this uhso there's a lot of stuff start from theleft this is a go host using exism withthe json method and then running variousuh guest sdks uh you see they're allabout in the same ballpark then skip thesecond one for now go to the third onethis is when using vt proto so obviouslythis is faster than with json and justas points of comparison i've tried touse gravity with a gohost and with usingjson so i'm just using their stringtypes so there isn't the advantage ofhaving the the entire um richness of thecomponent model but if you look at thegreen bar over there on the f the firstgreen bar and that fourth green bar youcan see that it's just for sending astring even just for that it's already alittle bit faster uh i'm not sure whybut it's going again in the rightdirection and then we've alsoimplemented uh a host in rust that isusing wit with just a super small subsetso it's not exactly an applesto applescomparison but if you look there at thegray bar this is a tiny go guest that isrunning or plugin or i should have saidthat but it's basically doing a littlebit of reg x filtering it's a filterplug-in that is reading an annotationfrom the pod def from the from the podand if it finds a reg x it will compileit it will apply to the node name andbased on that it will filter yes or nouh so a tiny bit of logic the tiny gothe tiny go one uh the gray one that isrunning with the component model that isrunning with wasm time and you cancompare it to the second gray bar whichwas the same piece of code but usingjson in was time and you can see thatthis gets gets us a lot faster and thenfinally just for comparison because somepeople are like why are you doing all ofthis stuff like why are you using womjust just call into a shared library ordo something like that uh it's not freeuh you will end up with the exact sameproblems of having pointers and all ofthose things so you will end up alsoeither having to share your memory whichis super dangerous like we're writingstuff in go we want safe memory we don'twant to just hand it over to some clibrary and then just crash umand i forgot my train of thoughtuh the point is this is jsonserialization in a go host calling viacgo a library a shared library writtenin go as well and you can see theperformance there is in the sameballpark as what we would get with witu andnow what what are we what are we askingour audience so um as we heard in aprevious talk um issue with p1 is uheverybody does its own thing umcomponent model is an amazing ideacomponent model standardizes a lotcomponent model can make things so muchbetter but it's still in its earlystages so we want to do a call out toall contributors so we uh the toolinglacks a couple of things so the widgenerators prototum that's uh something uh for go uhlet's decide what we want to do with goruntimes uh will was zero support itwill wasn't time go support it so uh ithink this should be a joint effort sowhoever is interested we urge everybodyto get uh involved to contribute todiscuss to join working groupsand also we'd like to do a shout out toall of the maintainers to the wasruntime technical advisory group toeverybody who's involved in this projectthe community uh the authors of the thecomponent model so you all did anamazing job and uh it leaves us and onemore thing which is that the componentmodel really is the way we are quiteconvinced about it and if we just likeall pull together and we make it happenin go this use case that we've shown isjust one use case amongst many other usecases in the whole cncf landscape we'vegot we've got go everywhere what if wecould just extend it and not have to dothat glue over and over and over againin each project we just do that once andthen we will all benefit fromit do you have any questionsso i believe if there's if anyone has aquestion that they should walk to themic that's there in the middle2025-04-15 21:59:22.604720 y �y��C�#��=ATz8IcMSY7jwseems like we should get started Sothank you everyone for coming Uh todaywe would like to talk about writingdynamic multicluster controllers withcontroller runtime My name is MarvinBeckers I am a team lead at KuberneticYeah I'm Stefan working on API serverthings for a long time Many of you willhave seen me before working at Nvidia onAPI server topics So um yeah All rightUh before we get started I would alsolike to briefly mention that my part ofthis work has been made possible by theIPSIS project by the European Union Youmight have heard that coming up duringthis CubeCon but yeah really nice to beable to work on cool stuff here Okaylet's get started So first of all let'sset the stage Why are we here and thisroom is pretty full So I think a lot ofyou are facing the same problem askingthe same questions So why would you wantto reconcile across multiple Kubernetesclusters we live in a multicluster worldUm I think most people that are runningKubernetes are running more than oneKubernetes cluster maybe also more thantwo maybe also more than three Um andthe tooling around that is quitesophisticated these days You have toolslike cluster API You have also othersolutions like vclustluster who give youlightweight kubernetes API servers andyou know countless others So I thinkit's fair to say multicluster or likemany clusters is the standard butrec��#��;AhjbZOBghxYUso hello everybody uhi so i hope everybody's uh interestedfor yet another web assembly componentmodel talkah okayyeah that's theenthusiasm i'm a bit surprised nice soumtoday what we wanted to do is uh we wantto show you our journey where we wantedto apply the component model to thekubernetes scheduler so my name is danpchev i'm an open source engineer withthe g research open source team sotogether with my colleague we'll try towalk you through our journey uh throughthe road bumps we encountered sojonathan if you may yeah so hi everyonei'm jonathan i also work for g researchin the open source team and i work onall kinds of things uh that are opensource in pretty much any programminglanguage and always with a focus onperformance andmaintainability and so first i want todo a little previously in episode one ofdiane and yonathan jonathan go tocubecon so back then we discussed themagic behind the kubernetes scheduleuleruh and the various ways that you canextend it if if you are interested inthis and you missed it the last timewhen we were in salt lake city i wouldhighly recommend you go ahead and watchour talk uh but not right now obviouslyjust half an hour uh because we're notgoing to talk about it in more detailsnow i'm just going to try you to to tryto give you enough context to appreciatethe next 25 minutes hopefully um so lasttime we looked at the wholeuling cycleand all the places where you can hookyour own code into it and there's as youcan see on that picture there's quite afew and then we explore the three waystonciling across that fleet ofclusters that you have it's not trivialand it's not something that you know isstandardized and controller runtimeitself is fairly single cluster in youknow how how to use it and the toolingaround itSo but the fact remains that you need toreconcile across many clusters if youhave many clusters and a lot of toolshave written solutions around this Sofor example cluster API has the clustercache package which allows to reconcileagainst multiple clusters Um we at theKCP project we have a fork of controllerruntime that added multiclustercapabilities to it Um there's also someolder projects like a multiclustercontroller by Admirality and honestlythere are probably countless others onout there Uh probably all themulticluster management solutions havetheir ownimplementation But that's the thing noneof these solutions are widespread Theyare not generic and they are not nativeto control controller runtimeeitherSo let's set some more context Let'stake a quick look at scaling models Sohow can you scale your controllers soone of the ways this is the classicalone is per Kubernetes cluster you have acontroller pot running within thatcluster It has a manager it has a cacheit has a workq it has a reconciler andyou run each of like this construct forevery Kubernetes cluster that you haveAnother scaling model would be to have amanagement cluster and run onecontroller pot um per external clusterthat you want to reconcile So you haveone management cluster that's basicallykind of the you know workhorse and thereis a process manager maybe anotheroperator or controller um that launchescontroller ports against theseKubernetes clusters that you want toreconcile umdynamically and then there's this modelas well um also you could have a hostcluster management cluster and thecontroller pot here would start multiplemanagers so controller runtime managersto recon reconcile against externalKubernetes clusters that you would wantto reconcile against So these are acouple of the scaling models that we'veseen already Um but we think thatthere's a way to make it more genericand to build like a tighter integrationwith controller runtimeYeah So we mentioned the the word herealready on the slides So there's a newproject we introduced a couple of weeksago and it's called multicluster runtimeand there's a reason it sounds likecontroller runtime Um it's a friendlyextension of controller runtime So it'sit's not forking anything Um it's justusing the primitives of controllerruntime under the hood And we will seein a couple of slides um how this lookslike Um the project is hosted in theKubernetes 6 uh on GitHub So um you willfind it here Commun 6 multiclusterruntime and it's sponsored and owned bysik multicluster So it's an official sikproject and um what we mainly add tocontroller runtime is adding a providerum concept So there's a concept um aprovider can discover or should discoverum the clusters which the controllerswill talk to and we will see a couple ofthose providers in a second and also howyou build them And if you look on therepository there is a kind of disclaimerin the beginning It's anexperiment and we are all here or thetalk exists because we want feedback andpeople trying that out whether this is amodel which is actually useful Um thetwo of us we we have background in KCPand in KCP we needed something like thatand we were for quite some time uhworked on on uh such a project and butwe realize that there are many many moreuse cases and the provider concept isbasically opening thatup We talk about two topologies whichare probably common The most easy one isa uniform reconciler It's a reconilerwhich actually doesn't know aboutclusters It's not interested incrosscluster operation It just does itsthing for events from a cluster talk Imean reads from a cluster it writes tothe same cluster That's why it's uniformSuper simple Um if you run an existingreconciler but against n differentclusters That's probably the pattern youwill followBut of course you can build clusters orimagine clust uh not clusters you canimagine reconcilers which know about thecluster concept and maybe they have umyeah dedicated special um clusters likea host cluster for example and then somechild clusters a sinker is a typicalexample if you sync objects from theleft to the right um that's anon-uniform controller um you can alsoimagine you have a search manager andthe search manager has CAS the issuer inone cluster like in hostum central location and it createscertificates for many many clusters butit never exposes the credentials or thesecrets um to create um certificates Sothere are many examples for that and wewant to support both and the providerconcept is basically attached to themulticluster um runtime core So this isa multiluster runtime you will see inaction in a second and there's aprovider concept and you can have multimultiple different implementations ofthe provider So imagine cluster runtimeYou could have a cluster runtimecontroller provider which knows how tofind discover clusters in controller uhin in cluster um in cluster API sorry incluster API You could also find umimagine a provider for kind or providerfor cloud providers Um that's just aninterface you have to implement and wesee in a second how this lookslike All right so this is a picture thatMarvin already showed Um that's not whatwe want to do We want a tighterintegration and the model we follow atthe moment is thisone So you have a manager and we call ita multicluster manager It's basicallylike the the one from controller runtimeand we see the interface um very soon Umhere's a provider The provider engagesclusters when it finds out um there's anew one and it disengages them when acluster goes away And here's yourreconciler code Here's a work And youhave sources like um yeah basicallythose which get events from the cachesput them into a work a joint work andthen the recon does does its thing Onthis picture the work looks superinnocent but um it plays a crucial roleand um lots of thought must go into thathow this work must look like You canimagine there might be one cluster hereone cache which has thousands of eventsand the others are pretty silent but thequeue is full because the one clusterexhausts all the capacity of thereconciling process So you have to thinkaboutthat you can imagine even cluster thecluster concept doesn't mean it's a realphysical cluster You could for exampleum build a provider which knows aboutnamespaces and it uses a differentclient depending on the namespace you'retalking about So if there are someisolation reasons or you want auditlocks or something like that which umshows a tenant something in thisdirection this could be also provider Sothe cluster concept is very wide verygeneric It doesn't have to be a realcluster One special case is um the KCPcase where in the background youbasically only have one cache oneinformer and theprovider simulates whatever the sourcesneed here So we have workspaces in KCP Aworkspace looks like a real Kubernetescluster to the outside but they actuallyall running in one KCP instance And ifyou look on the on the um yeah theinformer the watch interface herethere's just one connection to KCP Soit's super efficient to have many manymany workspaces thousands if you wantbut you have only one cache even thatcould be modeled in this providerconcept All right so we have seen thetopology um the objects the componentsand now let's become more practicalright see how providers work howcontrollers work All right So let's takea look NoI keep talking Okay Um All right So uhhow to build a multicluster controllerOkay Yeah I guess we'll do this way OkayUh so how to build a multiclustercontroller So let's take a look atcontroller runtime first So if you'veever written a controller this isprobably somewhat familiar to you fromyour main.go file Um you set up amanager That manager is coming from apackage of controller runtime and wellyou eventually register your controllerswith that your reconcilers with that andthen you start your manager So far sogood So with multicluster runtime uh wehave a couple of additions but they arenot too big So the first thing that youwant to look at is the provider So weinstantiate a provider and in this casewe're using the kind provider So that isone of the examples that we've builtalready And then what you do is youcreate a multicluster manager here Sothis is what Stefan already mentionedearlier Um and it's fairly similar to anormal um you know a control runtimemanager with the difference that you'repassing a provider in So this is where Iplug in the kind provider and I alsostart the provider at a later part ofthe code And you know this is basicallyit This is your controller now being amulticluster runtime controllerAnd even if you don't know what providerto use yet in your controller you couldadopt this already because if you setthe provider to nil um well then you getthe same behavior the same singlecluster behavior that you're getting outof controller runtime as well So if youdon't know yet what provider to use gowith nil You could already use this Nochange really to what you've been doingwith controller runtimealready Let's take a look at the reconilSo the reconil in controller runtime youprobably have seen this already maybeyour ch your code looks slightlydifferent but the gist is there is thebuilder package from controller runtimeUm it uses the reconcile package mostnotably the request type from that andyou have a reconcile function and thatreconcile function can access thecluster object embedded in the managerto access the client and the cache forthe underlying cluster for theunderlying like physical KubernetesclusterIn multicluster runtime uh we have ourown builder package Um this builderpackage um is you know multiclusterwhere it adds on top of what whatcontroller runtime is adding and we haveour own reconcile package and thatreconcile package has also its ownrequest type which has the controllerruntime request type embedded plus theinformation about the cluster where thisis coming from So it enrichesinformation that you would also get withcontroller runtime with multiclusterruntime So the main difference in yourreconciler function is that uhpreviously we had the static cluster umcoming from the manager Um but here weneed to dynamically fetch the clusterobject because now we don't know whichcluster to use until therequ the request comes into ourreconciling pipeline Basically um weneed to dynamically fetch the clusterobject from the manager and then usethat in the in the like following codeBut that's basically it Like all of therest of the code can look exactly thesame This is really important to stressfor us Multicluster runtime is not afork of controller runtime It's atestament to controller runtime'sarchitecture and most notably the gogeneric support that was added I thinklast year something I don't know exactlyUm so we so multicluster runtime usesthe generic types from controllerruntime with the concrete multiclusteraware types that we are adding in ourpackages So not a fork it's an extensionIt's kind of an add-on that you can useon top of controller runtime or like inconjunction with controller runtimeYeah the too long didn't listen isselect a provider um kind provider KCPprovider cluster API provider Uh switchto the multicluster runtime builder andreconcile packages and dynamically fetchyour cluster object which includes theclient from the manager and then ideallyyou're already done You have amulticluster controller TadaYeah that's I think your sister workingIt works I think it works Oh good Allright Sorry So let's dive a bit into umthe interfaces like the Golanginterfaces And we talked about themanager and we handwaved it basicallylike the normal manager plus some stuffAnd here it's a bit more concrete Um Ididn't put everything here on the slidebut you can imagine just take allmethods of of the controller runtimemanager manager and remove everythingthat the cluster the embedded cluster umadds to it Like there's a get client incontroller runtime This will not be herebecause um the embedding doesn't existBut we re um the get cluster to getaccess to the cluster um by cluster nameAnd if cluster name is um the emptystring it's a host cluster where thecontroller is running the multi umcluster aware controller And um you cando for compatibility uh reasons you canalso ask for a manager which is a normalcontroller runtime manager but nowscoped to one cluster And the specialcase um without any error here when thecluster name is empty again empty stringthen you can get even that manager inone call and super simple um you canalso ask for the provider which might benil um if you provide nil on theconstructor but you might want to usethat um provider if you want to createindexes you can imagine there are manycaches many informers and if you want anindex there might be clusters beingcreated in the future right a newcluster comes online and it's engagedand then all your index definitions mustbe um applied to that new clusterobjects new cache That's why there is aconnection to the provider and theprovider has a job to manage indexes aswell Um the provider interface that'swhat you have to implement when you wantto write your own um provider So clusterAPI or the kind provider that Marvin hasshown super simple um has a getter againthe same getter we have just seen for umgetting a cluster by cluster name and umthis is of course called by reconcilersindirectly We see in a second how thisworks but it might be called often So umdon't recreate your cache every timeright don't recreate the watches to yourclusters all the time but use some somekind of state some some map of um ofcaches which are running um and thenreturns the right one for the clustername And as I said already indexmanagement So remember the indexes whichmust be added whenever there's a newcluster coming online read the indexesthat the user or the developer of thereconciler wanted That's all supersimple and you can build um providers in50 lines or something So not verycomplex So this picture we have seenalready and if we now um add the themethods names here It's not surprisinghere again there's a the manager I saidthere's a get cluster getter That's theone that Marvin had in the in thereconiler The reconiler asked for theget cluster This call is forwarded hereto the provider to the get functionsProvider returns the cluster Inverselywhen a new cluster goes online and it'sdiscovered then it's engaged on themanager So the manager knows about itand it will manage the sources So thenyou can imagine another cache comes umhere into that that list So it's createdand all the sources which are um yeahthe watches basically against um thecaches they are managed by the managerhere So they're recreated or created forthe new cluster and they then produceevents which go into the work QE and thereconciler just um processes events fromthe new cluster And inversely when acluster goes away then the provider hasto disengage Disengage is at the momentit's a context cancel Maybe we move itinto a disengage function again forreasons Um but um basically that's itThat's the whole um yeah the wholetechnique behind the scenes what umprovider and managers do in themulticluster context Um to sum up aprovider responsibility is to watchchanges in the cluster fleet This ishighly specific to which um whichcluster manager which fleet manager youuse and if there's new cluster constructthe cache object and the client and putit into the clusters controller runtimecluster um object Remember that in somemaps so keep track of them of thoseclusters engage them on the manager sothe manager knows about them and when itdisengages so when the cluster goes awayum cancel the context that's all whatyou have to do bottlenecks um I Ipromise to talk about the the work hereum you have to think about what is thebottleneck of your reconciler now youmight have many many clusters and um youhave to think how to um Yeah to organizeum yeah maybe fair queuing So if there'sone cache which produces tons of eventsanother one hardly any event maybe youwant something like individual cues hereand then some fair queueing mechanismwhich picks events from every of thoseum in a in a fair manner so that not oneof them can exhaust um the capacity andmaybe priority cues will play a role umin the same way as in controller runtimeSo this this thing will depend on whereyour bottlenecks are So you can imaginemaybe the QPS of the client as abottleneck could be or maybe um you runa heavy um scheduleuler here which issuper CPU heavy and um the CPU of thepot is the bottleneck or maybe theworkers um are the bottleneck So umdepending on that you have to thinkabout what a what a good um queuingstrate strategyis and um I bet here in the room manywill have experiences with that and alsorelated topics like charting ofcontrollers So we are very open forideas and previous experience to todesign that At the moment um the spherequeuing only exists as an idea and avery crude implementation Of coursethere's more work necessary So if youhave background in that um very welcometo hear about that Yeah And now let's goto the demo I would sayAll right let's see if I can do thatthis way But let's have a quick look atum you know how this works Like let melet me show you a small littlecontroller that I wrote It's anextension of the code that we've seenbefore And let me show[Music]you let me show you this running againstKCP So we start with KCP first So thiscontroller that I've written it it justlooks for config maps um and when itfinds config maps in the KCP instancethen it will log a line You knowsimplest reconciler you could write tobe honest because it doesn't do anythingSo something that you can see from thelogs here already is that we have thecluster name Um and I'm not sure if Iwell should be fine Um the cluster namesare in here in the lock line So each ofthese reconciles that you see here theycome from a different cluster So from adifferent source than what you've seenin the diagrams before and all of themare actually reconciling this config mapwith the same name um which is the cuberoot CAert Uh but it's differentinstances of that object because theycome from different clusters So um howdifficult would it be to switch thiscontroller now to use the kind providerwhich we've talked about before Soum let me quickly show yousomething namely metyping Thank you verymuch Soall right got it Uh okay So this isbasically all the changes that I need todo uh when I want to switch providers Soum I import a different provider I setit up instead of the previous KCPprovider that I've been using and I'mdone So if Inowcheck all goodum if I check this out now and it's ofcourse the wrong checkoutcommit So let's see should be thisone Perfect So now I'm using the kindprovider the like the changes thatyou'veseen and I do a go runagain and now my provider will pick upconfig maps coming from a kind clusterthat is running on my local machine andI could also start another kind clusterIt will dynamically be uh put into hereand then it will reconile against twoclusters but this is how easy it was forme to switch from one provider to theother And let's have a brief look at howcomplex that kind provider is becausethe answer is not thatmuch So this is basically the the gistof this provider Um it is rating in aloop It is listing the available kindclusters it's doing some uh some uhfiltering on that and then it goesthrough the process of creating a cubeconfig of eventually setting up a cacheand a client and um basicallyremembering that So that is one of thethings that Stefan mentioned earlier andthat is really it So this whole filewith you know all the type definitionsin the in the beginning it's 200 linesThat's it That's the provider I'm doneSo with that inmind I think we're a bit slow on timeYeah And we have a couple of providersas a prototype already in the repositorybut nothing fancy It's really PC's So wehave a cluster API here Um the kindprovider we just saw some experimentalnamespace provider Um and yeah basicallytwo trivial ones one which has nocluster at all one is just one clustersuper uninteresting Um our goal is notto have all the providers here obviouslySo a monor repo is an antiattern Um soif you build something like a reallyserious cluster API provider um probablyoutside is a better place Uh we have uhthe KCP provider for example here Soit's in KCPDE dev multicluster runtimeand yeah it implements what Marvin hasjust shown um uses virtual workspace APIendpoint for exports Um and yeah that'sthe KCP one Hopefully we get more So ifyou have ideas um I'm looking at thecloud provider attendees here Um wouldbe interesting to to run somethingagainst all clusters in your project orsomething like that Um what is next sowe have many ideas Um the experexperimental message here shouldeventually go away I guessYeah exactly So as we said it'sI forgot Um so yeah so far this is anexperiment Um but of course we want tostabilize it and you know make it usablefor everyone and basically hope that ityou know gives you some value So we haveyou know so many ideas we don't have thetime to talk about them Maybe out ofprocess providers maybe you know asChefan said better providers a clusterinventory provider for the multiclusterAPI and you know go on go on therethere's so many things to do Um nextthing that we want to do is gather someexperience with existing code bases Wehave some patch sets with like tryingthings out but of course we would loveto work with projects to see if they cangain value from adopting multiclusterruntime and also see if we're missingthe mark because this is you knowsomething we want to build for thecommunity and you all know the XKCDcomic Uh we hope to not like createanother standard but something that ismeaningful So we need to hear from youif this is really what you need Um or ifthere's something you know it's toosimple it's too simple it's too complexit misses the mark So please please tellus And with that thank you for listeningThese are the QR codes for therepository for the Zig multiclusterselect channel and um for rating thistalk whether you liked it or not We'dlove to hear your feedback Thank youvery much Thank youSo if we have we have a few minutes forquestions if there areany and there's a microphone in themiddle if anyone wants to ask oneI have a questionUm in probably your second this is not amulticluster situation If you rememberyou have a central manager managingother clusters Finally what is differentwith this multicluster manager ormulticluster controller because if Iunderstand correctly it's still livingin one cluster and it is able toreconcile all the clusters Yeah But it'sstill master or central So it's likeyour management cluster you wereillustrating but different I don't knowSo you can run multiple managers Thispattern works is totally valid and foruniform controllers it's yeah you can dothat and people do it Um what you gethere is tighter integration like the verso you can better control what your potis actually doing Um if you havemultiple managers they're completelyindependent right um they do their thingand they use resources um as they wantUm you have no control over that Okay Sothe network architecture is the same Youhave one central management cluster withthe multicluster controller Yeah Butjust the fact it's cubernet is nativeand integrated as a controller followingall the patterns the cues all thethings it's the main difference if Iunderstand not sure I understandthe fact themulti-cluster controlleris well integrated using the cues as yousaidand so maybe thinkwell I mean so the it's basically youknow what what what part of thecontroller are you scaling right so umin the other examples we had like onemanager one like Q uh sorry one one onesource one one event handler all of thatand you just you know you scale at adifferent level you have one manager andyou have one reconcile and you have onework Q but you're scaling in the middlebasically because you can you know startmultiple sources so I think that's justthe difference in the scaling mechanismOkayUm thank you guys for contributing thisThis is really awesome and I lookforward for the integration with clusterprofile provider on this sigmulticluster Um the question for youwould be around graduation from theexperiment to GA What do you see uh asthe most important thing you want totackle and maybe a time frame for that ithink the main input we need is thefeedback that this works like this modelis a correct one So we have someexperiments taking existing bigger codebases and trying to implement that butmore of this kind would be helpful So wehave seen the other models um which alsoexist and they are used does this bringvalue um that's the question basicallyif we agree if we agree on the topologyand it makes sense then I think this isthe bar Do we want to put a number on itor I don't know Oh no the the number ofthis implementation is like 3 weeks fourweeks since we started basically orsince we made the project uh public Socould be a matter of weeks matter of thefeedback we get hereSo thank youA couple of questions Really quick oneFirst uh I didn't see the slidesattached in the app I don't know if thatwas a mistake with the app or if they'venot been uploaded but if not do you planto add the slides we'll add them YepGreat Thanks Uh the the main questionthen um is there support for umheterogeneous providers so if I've gotuh multiple clusters some on AWS some onGCP for example Um yeah I I I think youcan just um build a union provider andattach multiple of them I mean theinterface you have seen right there's agetter and there's the index function sothis should be possible Yeah I guessYeah Cool Thank youUm hi I'm pretty excited about thisproject as someone who manages arelatively large uncommonly large numberof clusters in production and I thinkthis could really help us and like helpus also control costs So I'm prettyexcited about it Uh the main question Ihave for you guys is how to um like whathow how this is going to work with uhCRDs that might not be the same versionum in the various different clustersthat are being managed Yeah technicallyum in with the exception of the KCPprovider this is different but um theproviders we have shown here there's onecube config per cluster So there's aninformer um behind the scene one cacheper cluster Mhm And um this will use aversion of the specific cluster So thereis no difference in your reconciler Youmight have to distinguish right you willuse go types and maybe there's an old ora new cluster So you could imagine thatyou built some meta information intoyour provider You can query and then youknow which version is installed orsomething in this direction Okay So sowithin the reconcile loop it would uhtry to differentiate between the theolder version of the CRD and andpossibly a newer version and okay whenyou can you can ask the manager for theprovider you do some interface uh typecheck whether it has this metainformation interface and then you askwhich C something this direction okaywill work cool thank youthank you so in the code walk throughthat Marvin was doing in a kind pro kindprovider right uh I saw you wereinitializing indexes and caches yourselfuh like is that going to beresponsibility of the provider asopposed to just providing the provideryeah because it differs by by um usecase I think there's one class ofproviders which basically provide cubeconfigs and then we could basicallybuild a library which unifies that ifyou want to if that's the model you youwant to implement yeah I mean the relike from a layering perspective itsounds like the cluster provider wouldprobably just provide you the cubeconfig right and most everything elseprobably is common across all providersfor most of them Yeah Yeah So we needsomething like a unified frame I meanframework is a big word here I think uma package which does that and the GPCtopic like the remote or out of processprovider maybe it just provides cubeconfigs as simple as that and everythingelse is done process and what's theposition of the controller runtimemaintainers on merging the two um Ididn't mention that in this in thisexperimental message there there was alink to the design um in controllerruntime and Um it's a friendly extensionUm so Stefan is here He proposedactually to try that So um maybeeventually it might move back when weagree on the shape of the thing This isone possibility if we need somethingmore in controller runtime Um I hopethey are open um to add it Um I agree2025-04-15 21:59:23.428083nsion run in aseparate component called extensionruntime each extension connect back to ahost use the define entry points likethe entry entry one and entrytwo for example the extension can alsoread or modify the variable in the hostapplication well they can also like callthe host functions or in the hostapplication like the for here as we cansee from this real world fe uh cases andthe extension use cases there are threecore requirements for the extensionruntime frameworkintercon with seamless intercon safetyandefficiency first what do I mean byinterconnect uh interconnect is how muchpower we give extension to interact withthe host application extension need todo things meaningfully they need to readdata modify space or callit existing functions inside applicationdifferent extension need different levelof interconnect for example a securityextension need to re request details andblocks suspicious requests whileobservability extensions just need toread request details as we've discussedsafety is how much we limit anextension's ability to harm the mainapplication if there's a bug in yourextension this back shouldn't crash yourwhole web server so compromise yourentire application without safetyboundaries a single small mistakes inyour application could take downproductionsystem efficiency is easy to understandit's about performance how much overheadextension framework add to yourapplication uh different use cases havedifferent level of performancerequirements like the distributed systemand the kernel uh they have verydifferentlatency the key challenge is that theinterconis and safy fundamentally at othe more interconis you allow the lessinherently safe it becomes to keepthings safe you need to rest theinterconis which limits it tensionusefulness balancing this tension whilemaintaining efficiencies will make ittension framework so challenge todesign here is a table of limitation ofsittingframeworks uh like first uh a lot ofextension implement just nativeexecution approaches like LD preload ordynamic modules they offers excellentperformance and simple integration butthey provide no isolation a bug in yourdynamic modules crash your entireapplication there's no safety boundariesand no no fangground control over whattension can assess uh software forisolation based technologies and alsohardware based isolation uh like webassembly and ru provide better isolationthrough runtime checks but they pro theyintroduce performance overhead from thischecks and binary crossing they alsooften rely on the host application toimplement the security binary correctlyfor example in what a lot of checks youneed to implement is manually corritingis this is buggyand rely on manualefforts subprocess atolation or RPCbased approaches like a lot of clativeapplications and like uh model contextprotos mcp in application and some otherresearch projects here offer strongisolation but expansion in separateprocess they suffer from contest visualor has this is sometimes negligible butsometimes critical make them to slowertheperformance criticalapplications some like per extensioncontrol while others require significantchallenge to the host application uhhere we also list like a verifier basedapproaches which is EVPF based use spacetracing like your pro is uh currentimplementation of your problem likefground control over extensioncapabilitiesuh is tightly coupled with kernelsecurity model and each extension corerequires a costly kernel contest whichis make them inefficiency for highfrequency hookswe can see that current softwareframeworks have not handled thistrade-off very well i don't allow toomuchintercon modules or they provide strongsafety through heavy isolation likesandbox crabing environments or subrosubprocess isolation methods but listcanbevalentions often become slow andlimiting in what they can do so what wefound is that the key to manage thispension uh thisinterconvirtual safety trade-offs is theinterface we choose forextensions it's theinterface if your extension frameworkinterface can carefully define exactlywhat resource and functions and likecapabilities and extension can use youcan precisely manage the tension ideallyyou can you just give the extension justenough in the canon to do its jobs butabsolutely no more this uh this soundssimple but current systems struggle toachieve thisuh let's pick two popular approaches forexpansions web assembly and ebpf and seehow they they approach this problem webassembly is a binary instruction formatused in many cloud native system asextension runtime it provide SFIsthrough runtime tracks and while EBTF ismainly embedded in Linux kernel and alsousing Windows kernel compared with webassembly EBPF has a history of focusingon performance first which lead todesign using a very toric tracksafety of the extension at low time ininstead of runtimechecks uh let's look at how these twoapproaches in achieve the interfaceproblem the web assembly componentsmodel repls web assembly's approach tosoftware challenge uh is a specspecification and a set of tools thatdefines how web assembly models can becompromised together and as componentsand how they interact with their host ofenvironments the component models designwith several key goals define portableinterface that work across languageensuring capability safety through uhsplit interface and supportingvirtualization in diverse environmentsuh from browsers to cloud systems uh ascalled a component model introducecomponents as compressible units builtfrom web assemblymodels they interact through welldefinfined inter interfacewriting w that's web assembly interfacetypes as showing the fig figuresThe component model also implements acapability based security through theresource handlers uh they that'sunforgettable reference that grandassets to specific resource this handlercan be passed between componentsallowing frame control over whichextension can assess which result forexample in this direct interfacedefinitions in uh in the code block wesee how components interact through wellthrough the request and responseresource they are handlers that can onlybe assessed through a definitivefunctions a component must in a handlerinterface to process requests in firstcapability based assesscontrol uh this capability basedapproach religion with the principle oflist privilege we discussed earlieruh they have a rich cap system forinterface but they are mainly based onfunctionslate pro uh the drawbacks is it alsorequires runtime checks at interfaceboundaries and sometimes data copingwhen crosses boundaries that createsoverhead especially for extensions thatfrequently interace hostuh inEBPF uh they are mainly through low timeverification so the runtime overhead isminimminimized uh this verifier tracks allpossible execution path to ensure memorysafety and prevent like infinite loopsand a lot of issues in theextensions the EPF interface in kernelwas original based on like uh helperfunctions and program types and attachtypes as showing the in the table herewe can see there's a lot of programtypesand attached types corresponding tosomething similar tocapability but this are hardcore in thekerneland it's hard to extend them as EBTFevolved to avoid a growing number ofthese helper functions and entry pointsthe kernel community introduced moreinterface mechanism like the structureops and k funs uh which we will looknext this provide more flexible ways forextensions to interact with the kernelwhere maintain safety they use spia typeformat for this test similar to the witformat uh for example the structure opis similar to the isin web assemblyallow user to register new EBPI programtypes called by and this EBPI programcan be called by a kernel uh thisstarter ops can be registered throughkernel modulesand extend uh by and then you can writea ebpf code to use this structure offsuh k fun is similar to uh it's likesimilar to import allow register newkernel functions called by eBI programsin uh for example you can call this PCPfunctions in in your eB program and thisis no need to modify the verifier theyare using BTF for verification and typesthere's also some annotations and flagsyou can use like this this whereide ntify the resource acquired andresource release like where this andsome high level facts like where thefunction is sleepable and destructiveAs you can see the kernel module willregisters the uh K funs with some flagsfor this hello effects so they can youum be embedding the verifierto summarize the EBPF interface approachwe can see our trade-offs the EVPprovides strong verify based securitybut it can catch bugs before roll it canprovide better performance because itdon't require runtime checks by tallycouple with colonel EPF and the expressuh and also the expressiveness is limitbecause it's only using JC it's hard todefine fine grainsafety so what we are proposing is tryto combine the EVPF strength and webassembly strength is in inspired by EPFverify and web assembly component modelscapability based security our codeinsights treat all extension hostinteraction as is splitted capabilitieslike web assembly so we can followprinciple of this privilege forextensions and use verifier allow us todesign for both safety and performancein use space extensions uh but also thelimitation of EPF style is becausebecause of the express so it's a littlehard to write EVPFcode uh so here is a EPS specificationexamplethis is uh generated by the steadyallances code through all theeval verifier to verify this spaceextensionuh interfacespecifications so we can we have threetypes of capabilities like statecapabilities function capabilities andextensionentries which is similar to web assemblybut you can by besides this simplemechanism you can also in useconstraints you can encode like binaryrelationship between arguments andreturn values like high level semanticfacts and bring operation over theconstraints then we can use uh the EVPverifier to checkthem before the expansionslot so enterour use spaceevftime uh we build a model on BPF time tomake an extension framework uh is uhit's a BPI time is a use space eBPFruntime compatible with kernel itsupports a B a bunch set of uh EBPFIfeatures and like your USD s points ADPso the goal ofthis implementation is to leverage thesitting eBPF ecosystem so uh we can wecan try to overcome at least some of theifexpressiveness problem include a set ofebpf map types and some helper supportsand also u supports it can run togetherwith kernel ebpf and you can switch theVM inside it so BPI time is a runtimeinstead of simple VMit's not like just UBPF and RBPF becauseit it has a bunch of other things uh soyou can see there's a set of supportivefeatures at least part of them here uhlink include a bunch of share memory maptypes your space kernel shared maps andsome program types you can attach inyour space include some tracing typesADP and even some GPU features that'srecently added bycommunity you can also define othersteadychoints you can support a lot of helperfunctions and you support both kerneland new spaceverifier this is the EBPF design BPFtime descent diagram white componentsare fromEVPF orange components are new we add inBPFtime blue arrows shows the executionflow while compiling and loading BPFapplication uh it is similar to how theAVPF was originally performedso uh we use the BPF time loader to loadand operate theEB with the EVPF control planapplication white arrow with black linesindicate components that interact withEVPF maps here is a basic example toshow how you can run build and run EVPIleave VPI based EVPI program starts withBPItime command lineinstructions you can use upro tomonitor use space macro functions in lilike just bpi time load to load thecontrol plan application epi program andstart to trace theprogram here's the evaluation usecasesexflication can run with minimal fixwith some additionalcustomizedcases so the first use case is like forobservability why we use use spacetracing because it's faster and moreflexible it's much it's much faster thanyou space your props and it has muchfaster use memory assets there's also notracing overhead on untrace processwe can run a set of tools like BCN BPFtrace and also complete observabilityagents with Krom and your proptogether here is some microbenchmark wecan see there about 10 times ofimprovements in your prop and also someimprovement in hash maps and your spacememory riskyou can run the BPF trace andBCC in BPF time just as you run thekernel you can see theretest contour and SSL Smith tour uhyour space EVPF can significantly reducetheoverhead than the kernel EVPFuh it can be also be used for useEV network likeDP like combine EVPF with DBK and ADP soyou can use exiting EVPF ecosystem andtools but achieve better performancewe also build our EVPF VM with RVMT is uh RVM BPF is a steron EVP VM andcomparing tools it's a easy to useprojects there's a bunch of more usecases like engine plugins and fusecaches and radius durability tuning somealso include GPU tracing and errorinjection code patch please check ourcoming papers for more detail you willbe releasing next months Ithink so there's still a long way to gobecause BPF time is not production readyyet we still need to do a lot of improvestability and bug fix and make more easytouse here's the takeaway here's thetakeaways bpi time is a new approachesto use space extensions we try tocombine the strength from both worldsebps verification andperformance and also web assembly likeinterface flexibility it can enable highperformance extensions and fun safetycontrols and it's also compatible withexisting EVPFecosystems okay thank you[Applause]uh hi sorry I was a little ethis uh any questionsor Oh no maybemaybe Okaywhat are the largest problemsuh the main challenge is that how we cankeep compatible and leverage the kernelinfrastructuresuh like how how how I can because libpfand like kernel kernel ebpfruntime they they interact through theebpf cisco but they are tightly coupledso we need to have auh Cisco compatible layer to make surethey are compatibleand this also cause challenge and a lotof engineering efforts in the long runlong longrun uh sorry I didn't see the the themicrophone before another question wouldbe what kind of performance impacts doyou see using the using web assembly inthis environment with respect tostraight ebpfsorry what can can you repeat again yeswhat performance impact do you measureusing web assembly in thisway in contrast to a native ebpfuh we implemented the same use caseslike some uh engines plugins and try tomeasure their performance differenceuh is that what you're asking okayuh the exactly use cases and you can seeit on our website and thepaper we will really see lateruhbecause I didn't heard the previousquestion maybe you already partiallyentered it but how muchis coupledthe was module to a specific kernelversion I mean do we must run this wasmodule what is expected to be moreflexible to a specific kernel version sohaving all you align on the same versionor is it pretty flexible based on theminor version patch version i don't knowuh I'm not quite get itthe the was pluginis intended to work withEVPF but eBPF depends on the kernel yesso let's say you run on Linux kernel 5.2does your module is expected also towork on kernel 5.6 how much is coveredis it something where we must align allthe o on the same kernel or we have someflexibilityuh the first we only use kernel for uhin the EM model we we only use kernelfor verification so you can just putkernelin a VM and run it doesn't need to beyour actual kernel actual kernel so ourattach version is likekernel6.10 and form but uh if you if you justuse your space verifier in BPF time youcan uh ignore your kernel version anddon't need to care about that okay sominimum requirement is a kernel 6.10 asyou said to get thefunctionalities but outside of thatthere is no requirementuh sorry maybe maybe youyou can you speak a little loudly yeahyeah I was saying so you say there isrequirement of a minimum qual versionlet's say 6.10 10 something like that toget the functionality but havethat it's flexible we are not attachedto one specific versionuh the BPF time doesn't need likeminimal kernel version to get afunctionality you can it can work withuh like low very low kernel versionseven without EPF support is that whatyou're asking okay Yeah uh2025-04-15 21:59:24.343763 ��o�#��AW5C0O7vk78ouh good afternoon everyone my name isJun and I'm maintaining a bunch of EVPFrelated opensource projectsin organization called Yomia BPF and I'malso a PhD student in Yoshi SantaClaus today I'm going to talk aboutsomething that has been around thesoftware industry for a really long timeasindustless software extensions andintroduce our use eBPF runtime calledBPFtime specifically I want to talk aboutwhy we extensions what make themchallenging to handle correctly and howour currentapproach to manage extension might notbe good enough there will introduce anew approach to manage extensions calledthe extension interface model and ouruseless EPF runtime BPI time and how weimplement this principles uhthis so we have been maintaining the BPItime project for over two years uh theresearch paper of it is 10 applicationsafely and efficiently has recently beengot accepted into OSBI thisyear so uh there's a link you can Googlesearch it or just visit it[Music]okayso why we need to build a new EB useEBPF runtime why we build this use EBPFruntime as yet another expansionframework first software extensions arehas a very long history they includedweb servers database editors and like VScode or and also some cloud nativeapplications like Kubernetes extensionsand web assembly models in kernel theyalso have EBI programs and kernelmodules but there's a question a lot ofpeople may ask why don't we justintegrate everything into the main codebases why do we need to use extensionsthe short answer is uh we need flexflexibility and isolation we wantflexibility and customization because itmakes our software adaptable useradministrator want to tweak things tomake meet their specific requirementswithout waiting for core developers toimplement changes but flexibilitywithout isolation is risky extensions bydefinition they can be third party or atleast externally developed code you maytrust your core engineering team buttrusting external code is a differentstory even if it's not maliciousexternal code can have bugs causingcrashes performance degression orsecurityvulnerabilities so you need to protectyour core applications that's why mainyou need to is have extensions but uh inreal world extensions can bebuggy uh for example a few years back apopular video stream believe suffers aserious performance production outagebecause one of the engine's attentiongot stuck in an infinity loopuh Apache HTTP server has similar issueswhere buff overflow bugs in ru basedmodule can crash and security host likeradius and a lot ofapplications they also have things uhthat really has or big companies andcost a lot ofmonies we recently like did did a studyto CV reports from some open sourceprojects we searched all the CVs in thissoftware and found there uh like morethan 1,000 CVs related to extensions in17,000 total CVs from this project uhwhat we found was that extension relatedvulnerabilities made up significantportion about 7% of allCVS of them could lead to system crashesor data leakage so isolation and safetybecome absolutely critical we don't wantour bug in one extension to crash ourentire system we don't want our poorlywriting plugin cause our service to slowdown we didn't want external code toexposing our internal data to attackersso as as the left left side shows thisfigure shows how a regular applicationcan be extended using a separateextensionruntime think of the host application asthe original app which has its own spacelike variable and codes like functionsinstead of directly modify the code theusual add new behaviors through itspensions this exte#llers at scaleand that's why I decided to put someeffort into this topic in my master'sthesis which was called horizontallyscalable Kubernetes controllers and thisis what I'm going to talk to you abouttoday let's review some controllerbasics so that we are all on the samepage kubernetes controllers are whatfacilitate the declarative statemanagement in Kubernetes so that you canjust apply some YAML and the controllertakes care for you to scale thedeployment to the desired amount ofreplicas right so for this controllersperform these typicalsteps first of all they watch the APIobjects at the API server for changesthen when receiving change events likewatch events they cache these objects inmemory for fastretrieval if there are any relevantchanges they will encue the object forlaterreconciliation when they reconciled theobjects the first thing is they read theobject from the cache from memoryideally to not put load on the APIserver and if necessary they perform thechanges like creating new pots and so onand typically the last step is to reportthe observed status so that could berecording a Kubernetes event that yousee in cubectl describe but it couldalso be updating the status section ofyourobject okay let's illustrate this i'vebrought an example operator which I'mgoing to use throughout this talk so letme first introduce it to you and then wecan see how we can make it scaleokay so my demo operator is called theweb hosting operator the idea is prettysimple we want to run a web hostingplatform for our customers so hostingengine X as a service basically uh andwe want to do this declaratively withKubernetes because there is simply nobetter way to host engine X than onKubernetes right okay so let's firstapply some examplemanifests and uh we can take a look atthe themes now themes um basicallydeclare how our websites configured bythe customers would look like there isan exciting theme it uses some color anda specific phone familybased on this our projects can get andcreate websites in a project name spaceand I have already created a website forCubeCon and it uses the exciting themeobviously okay and we see the website isready and uh what the operator did inthe background was to create some moreobjects like an deployment an ingress inthe service for exposing our website tothe userso now that this is running uh on mykind cluster I should be able Wow thisis small uh I should be able to visitthe website I just created in theproject funamespace and it says welcome toCubeCon okay let's jump back to theslidesso now now that we reviewed howKubernetes controllers work and how youcan use them to implement operatorswhat's the problem withit kubernetes controllers must alwaysprevent conflictingreconciliations so it cannot reconcile asingle object in multiple instancesfor this typically Kubernetescontrollers perform a process calledleader election to select a singleactive instance and only this specificcontroller instance will ever makechanges to theobjects this causes controllers to notbe scalable in a horizontal manner soyou can't just add new controllerinstances to increase the performance ofyour systemthis imposes limits on largecale usecases like the ones I'm doing atstacket but so far there is no standardsolution for it so let's fix this butfirst of all I will illustrate again theleader election process so we all knowwhat we are talkingabout okay in the name space where myweb hosting operator runs it alsocreates a lease resource this istypically used for leader election andyou will see the basic property of alease is the active holder right now itsays uh this pod this host name whateveris the activeleader so what happens if we havemultiple replicas let's take a look atthe locks of our operator we see that itreconciles the websites and in parallellet's put a watch on the um ports andthelease and if we now add a secondinstance of our operator we just scaleit up we see that it is gets ready butit does not do anythingif we now create another website we cansee that the existing previouslyexisting instance is responsible and isreconciling$ the website and making itready so you could now call this anactive passive HA setup so it's notscaling out and adding any performancebut it's just warming up anothercontroller instance for a quickfailover we can also demonstrate thefailover by just deleting one of theinstances let's pick the current leaderit's the foursomething okay now we see that anotherinstance got created but moreimportantly the lease object wasreleased by the active controller and itwas taken over by the other one whichwas already pre pre-warmed and nowanother instance reconciles our webhostingobjects okay while we jump back to theslides let me prepare the next demookay so now that we know what theproblems is and why Kubernetescontrollers can't be scaled horizontallylet's take a look at this design thatI've come up with in my master'sthesis the core principles of thisdesign is to apply sharding mechanismsthat we all know and love fromdistributed databases to the world ofKubernetes controllerswe use some dynamic membership andfailure detection that is pretty similarto a big table for example based on thiswe can determine which um controllerinstances or shards as I call them areavailable forscheduling if a shard goes down we canperform automatic failover of the APIobjects and if another chart joins thering we can rebalance thedistribution all of this is achieved viasome label based mechanism so it'snothing fancy uh that you don't knowit's just pure Kubernetes APImachinery still we prevent concurrentreconciliations in multiple instances soone object is always just reconciled bya single controller instance but itmight not be the same as another APIobject the implementation of this designthat I've came up with is reusable andyou can start using it in yourKubernetes controllerstoday so let's take a look at thearchitecture it might look a bitintimidating but don't worry I'm goingto walk you through itat the core of it we have not only onecontroller instance but let's say threeof the samecontroller each of the controllerinstances or shards creates its ownlease resource for declaring membershipto the ring of controllersand now that we discover which shardsare available we can assign a portion ofall the objects that we have in oursystem to a specific controllerinstance for this there is a newcomponent in the cluster which I callthe sharder the sharder discovers whichobject which shards are available andconstructs a consistent hash ring thatyou might know from Cassandra forexamplebased on this we can assign the objectsput a label on it um during admissionfor this we just create a mutating webconfiguration so that the API serverwill always ask the sharder which shardit should assign an objectto to put this all together we need tocreate a new resource it's calledcontroller ring the controller ringconfigures the sharder and declares whatobjects belong to the controller andshould be distributed across multiplecontrollerinstances and once we've done this thesharder takes care to inject the shardlabel into every single object thatbelongs to this controllerand the last part of the design is thateach individual controller puts a labelselector in the watch cache so that itonly requests and sees the relevantobjects that are assigned to thisspecificinstance okay enough of the abstracttalking let's get to the demo and hopeitworks so back to our web hostingoperator we can now see that in mycluster there's the sharder componentrunning so great we now configured acontroller ring for our web hostingoperator and we can see that the sharderalready recognized three availableinstances great how did it recognizethese this is based on the lease objectin this case we have three instancesrunning as three different pods and eachof them created its own lease object andwe can see that the label uh being namedcontroller ring is referencing ourcontroller ring web hosting operator andthe other thing we can already see isthat the sharder put another state labelon it and it recognized all of theseshard leases as ready so it considersthem forscheduling okay let's look at the umcontroller% ringresource as we have seen before the webhosting operator is mainly responsiblefor the website object this is the onewhere we uh specify the theme and foreach website it creates more objectslike a deployment a config map serviceand so on you could say these arecontrolled resources because all of themhave a owner reference back to theowningwebsiteokay based on this the sharder createsthe mutating web hook that I talkedabout as we can see this mutating webhook configuration points to the sharderrunning in the sharding system and it isspecific to this controllerring and we also have an object selectorin it which has the shard label um andit checks for non-existence of thismeaning that when you create a newobject the API server will check for youif it is already assigned to a shard andonly if it is not it will reach out tothe sharter to ask for anassignment and this process is done forall the website objects but also all thecontrolled objects like deployments andso on that we've specified in thecontrollerring okay let's see this in action andcreatesome let's recreate the CubeCon websiteand get the YAML uh representation ofthe object we just createdas we can see our mutating web hookinjected the shard label and it has avalue pointing to an individualcontroller instance that should be a uhresponsible for this particularobject we should see that the website isready no that's the wrong name spacewait so the website is also ready andnot only the website but also the ownedobjects have gotten a label and all ofthem are assigned to the same controllerinstance otherwise the controllerwouldn't know if the website is ready ornot based on this we should now be ableto see our website again let's refreshit now we have a new server because werecreated the websiteright cool let's create some morewebsites just a random set of 50websites and now we should see that thewebsites are distributed across these umshards right we have three differentshards and the website objects aredistributed roughly equallyso let's check what happens if we removean existing controller instance from ourcontroller ring for this we put a watchon the pots andleases and we will also watch thewebsiteobjectsokay now let's scale the web hostingoperator deployment down totwo and what we can see is that theshard lease belonging to this particularinstance was released there is no holderidentity anymore and the sharderrecognized this shard as unavailable soit will not assign any objects to thisshard anymoreall the objects that were assigned tothis chart previously now just get anupdate to the chart label pointing toanother availableinstance great there is another case Iwant to show which is adding a newinstance to the ring this is a bit moreinvolved because we need to make surethat even when moving an object betweencontroller instances there is only asingle controller feeling responsiblefor it otherwise there would again becon conflicting reconciliations which wewant to prevent right okay so thereforethere is a handover mechanism in placethat asks for a confirmation of theactive umcontroller to acknowledge the operationthat the object is moved to anotherinstance and it needs to acknowledgethis because otherwise we wouldn't knowif it has seen the reassignment andstopsreconciling okay for this there isanother label it's called trainsomething something and we we should seethat when we add another instanceobjects first get the train label andthen the active instance acknowledgesthis operation by removing train andchart label and then the sharter canjust put a new shard label on it andassign the object to a newinstance coolso now that we understand how shardingfor Kubernetes controllers works youmight ask yourself how can I use it howcan I implement this in my owncontrollers and I will tell you it ispretty simple you can just reuse a fewcomponents that I'vepublished the first step is to installthe Sharter components this brings thesharder deployment and the controllerring CRD into thecluster once you've done this you canconfigure the controller ring matchingyour Kubernetes &controller in our caseit was the web hosting operator it isresponsible for websites and a fewcontrolledresources this is fairly easy so farnow we need to change our controller inimplementation only a little bit thefirst thing is to create the short leasethat we've talked about and what we cannotice is that it has the label pointingto the controller ring that weconfigured and the second thing wenotice is that it has an individual nameand holder identity compared to thetraditional leader election leaseif you use controller runtime to buildyour controller this is very easy youjust import my library configure thename of your controller ring and thenwhen creating the manager that runs yourcontrollers you simply pass on the shardlease implementation to the leaderelection mechanism and with this yourcontroller runtime manager will nolonger create a single lease for leaderelection but it will create anindividual shardly specific to yourinstance if you don't use controllerruntime or use another programminglanguage you can easily implement thesesteps by yourselves it's veryeasy the fourth step we need to take isfiltering our watch cache we saw thatthe objects always get this shard labelwith a value of the shard identity andwe only want to watch these objects thatare assigned to our specificinstance again if you use controllerruntime this is very easy to do you justset up the label selector you can alsoget the um the name from the librarypass the shard name and then you givethis one to the manager in the cacheoptions and this will make controllerruntime only watch the objects that youcareabout the last step is again a bit moreinvolved and it is about handling thetrain operation that we've seen whenmoving objects from a active instance toa newinstance when the train operation istriggered we can see this particularlabel on the objects and once we seethis in our controller we need to stopreconciliating objects and remove boththe train and the shard labelso you might think this is quitecomplicated but again if you usecontroller runtime you can just use mylibrary to wrap first of all thepredicate to react to events where thetrain label is added and you can wrapthe reconciler to remove the labels sowith this you don't actually need tochange anything in your business logicof the controller you just need to usethe library and wrap both predicate andcontrollers to summarize this you caneasily implement sharding for yourcontrollers in 50 lines or less less Iguess if you leave out thecommentsokay so you might ask yourself have youtested this is it really scaling theKubernetes controllers horizontally oris there any overhead so this is alsowhat I've done in my master's thesis andfor this I conducted some load testexperiments before we jump into how theload test experiments work we shoulddefine what a load on a controlleractually means load on a controller hastwo dimensions the first one is how manyobjects are in the system how manyobject it needs to watch in my loadtests I increased this over the time of15 minutes to about 9,000objects the second dimension of load ona controller is how often objects changeorchurn that could be object creationsupdates anddeletions in my cases I increase theload up to 300 changes per secondthe first thing I did was to observe howmany resources my controller actuallyconsumes as you can see in thiscomparison the overhead of a shardedcontroller is very small in comparisonto a singleton controller where you onlyhave uh one one active instanceyou can see the bene the overhead is inthe resource consumption of the charterthat is added to your system but it'sconstant it's not growing with the loadof thecontroller if we now want to judgewhether our system is scalable or not weneed to measure performance and againperformance measurements for controllersneed to be defined in my case I put uptwo SLOs's so the first one would bethat objects are worked on very fast sothe Q latency should be less than 1second ideally this correlates toresponsiveness of your controller so tosay we get this metric from controllerruntime we can just evaluate the 99thpercentile fromit the other performance indicator ishow long it takes for the object to getready so I would call thisreconciliation latency and ideally itshould be very low let's pick a numberof 5 seconds as the threshold for ourcontroller now I took performancemeasurements for a controller set upwith one instance up to five instancesand as you can see the capacity of thesystem meaning the point where itdoesn't break our SLO is increasing withevery single addit instance it's almostincreasing linearity so you could callthis perfect horizontalscalability so let me wrap thisup today we saw how to make Kubernetescontrollers horizontally scalable andthe capacity of the controller systemincreases with every single addedinstance this design and implementationis reusable you can just apply it toyour controller or any new controllerthat you writei would say this project is ready forusage and I would highly recommend totry it out and reach out with feedbackso we can build a community around it weshould gather some more experience inreal world usage i wouldn't recommendrunning inproduction so with this I will open upfor questions if you like the talk feelfree to check out the project where youwill also find links to the master'sthesis if you are interested in the fulldetails leave a star if you like thetalk and now up to your questions[Applause]i think you can walk up to the mic ifyou wanthi thanks for your session it has beensuper interesting i have a questionabout you are scaling your controllersdoes the the sharder uh supportautoscaling the controllers itself imean if I can monitor the queue lengthor I can monitor the CPU of the of myoperator will be the sharder fast enoughto re rebalance the shard if Iautomatically scale out and scale scalein the the operatorlet me repeat the question for the sakeof the recording uh the question wasabout um whether I can use horizontalautoscaling for my controller uh forexample based on the performance metricsthat we identified so I I definitelywant to try this out i haven't done sobefore maybe I can present it at thenext CubeCon then but in theory itshould work and it's uh it's a perfectuse case for this projectcool hey um great presentationum I wanted to understand ifthe result of sharder that gets put intothe label uh could be simplified alittle bit by saying that okay I willnot update uh the label on each of theobjects but the sharder and thecontroller predicate could both come upwith the uh same hash resolution basedon uh an implicitly written algorithmand on the fly u thethe correct sharding instance controllerwill receive um the object and work onit was was there any designconsideration made on this particularapproach what would be the pros and consuh of it okay got the question let merepeat it for the recording um so thequestion is why we need to update theobjects basically at the labelexplicitly instead of computing the theconsistent hash ring on the fly inpredicates forexample the answer is that if you umhave an outofbend um yeah coordinationmechanism you can't really be sure thatan existing instance stops working on anobject when you want to move it toanother instance for rebalancing let'ssay um I I tried to make it work withoutthis um but I couldn't figure out it umat least also this is what otherdistributed databases are doing uh and Itried to to apply proven mechanisms soto say uh this this is what I came upwith but maybe there are smarter ideas Iwould say um yeah so just for our usecase um it it might not the traininglogic uh might be a little uh o overkilllike we don't care uh if you train ornot uh but we want to uh distribute whenthe uh capacity is available so thatcould be a simpler use case and maybehelp a little bit with making the designsimple so thank you great presentationthank you i I'm afraid we are out oftime for more questions but I would bemore than happy to chat about them feelfree to come up to the stage and also wecan continue the chat you can grab me atthe STEIT or the gardener booth andhappy CubeCon everyone2025-04-15 21:59:25.108980 ��U�#��aA40OmDwTgl1Ahi So um welcome to this presentationThank you for coming Um I know it's uh alittle bit a topic a little bitdifferent from everything that wenormally have at CubeCon but yeah Iwanted to see how it could go Um so thistalk is about uh hardware in the looptesting with Kubernetes UmSo first uh yeah I I want to to clarifywhat I mean with hardware in the loop Sohardware in the loop is the technique orsometimes art ofum testinguh with hardware in your life cycle umto make sure that your your software isbeing test uh with the final hardware orat least a subset of that hardware inyour lab Um and the idea is that yousimulate part of the environmentor to to make sure that you canreproduceum something that worksSoin companies uh where I've been whenthere where there is no hardware in theloop testing uh this is how itlooks Um so no maybe you have differentvariants of your devices and when youare going to make a release of yourframework uh you need to test it on allof them and make sure that everything isfine do lots of manual testing and thatdoesn't scale well Um so and and withouta good testing strategy we all know whathappensUm you you know all the first ones Uh sostuff that used to work doesn't workanymore It's difficult to figure out uhwhat's um what's really happening youcannot meet your your deadlines But inthe case of of hardware and the physicalworld sometimes it's even worseSomething called explode or crash orwell yeahum so in the software industry this is Imean we have this very well figured outwe havea a good landscape of options uh that wecan use to make sure that um everythingis under control at every point of thedevelopment processBut I mean what what happens withhardware Uh it hardware is not as easyto manage as VMs containers Umnormally you don't have out of bandmanagement management Sometimes you havebut it's it's very rare And you havetons of different interfaces uh likecanvases video inputs video outputs uhhuman interfaces Um there are many manymany options I would like to ask uh tothe audience if umuh if in your company do you build anyanything with embedded devices Um okaynice And do you already have any kind ofuh hardware in the loop testingOkay And are you happy with what youhave OkayOkaynice So uh so let me tell you about uhour our project and what we have beenworking on The the name of the projectis is Yams Arthur anduh we started it h in emergingtechnologies at at Red Hat and now wehave a small community of contributorsUh we have an automotive OEM uh theydon't want to be named yet Uh we we havethe Red Hat automotive team and um thethe original team from EmergingTechnologies Um we first you see a lotof automotive in there We don't wantthis project to be only automotive Uhthis is something that yeah we we arerea(�� #��QAOTzd9eTtLRAthis is uh beyond the limits scalingKubernetes controllershorizontally my name is Tim Ebot i workat STEIT and I'm going to walk youthrough some crazy Kubernetes controllerstuff in the next 30minutes but before we start with anintroduction let's get a quick raise ofhands who has ever used a Kubernetesoperator in their classesbefore oh wow that's that's many let'ssay 70% of the audience okay let's getanother raise of hands who has everimplemented an own custom controller orKubernetes operator wow I didn't expectthat almost 60% of the audience that'sgreat um so let me tell you why thistopic is so interesting to mei work at the stick kubernetes engineteam and we are running thousands ofkubernetes clusters for our customersthis is based on open-source projectgardener without it that wouldn't bepossible so shout out to the gardenerfolks in the audience and outthere gardner uses controllers formanaging the Kubernetes clusters meaningthe control planes and the worker nodesso you could say my day job is aboutrunning Kubernetes contro")lly trying to be careful aboutUmsothe the goal of this project is toenableum open-source hardware in the looptesting from um the developer desk atthe start to later on being able tobuild a a lab where you have yourdevices in racks and you can connect allthe interfaces that you need and usedrivers to inter to get to thoseinterfaces but you also canum I mean use that as a developer butalso connect it to CI/CD which is themost important part of that talk So ifyou see that arrow it goes down to thetypical diagram and the idea is that thethe framework and the service is reallyopen So you can you you can connect itto anything uh Tecton GitLab uh runnersGitHub runners Jenkins and anything It'sreally as long as you put thecredentialsuh in in the secrets it's been able toaccess the the service and run yourtests But okay I I'm a developer andthis sounds too complex So you can startwith something like this you have thethe jump starter uh framework and clientUm it's it's written in Python I willexplain why later Um well this part iswritten in Python but the server sidepart is written in Go because it worksmuch better with the Go with the uhKubernetes ecosystem So if you're adeveloper you can write yourconfiguration to define the interfacesto your uh device and then use itlocally the same way that you will useit uh in a remotelab So okay and done with that I knowit's working and now we are growing Weare bringing this into our lab So youcan move into something like this So thesame configuration that you were usingas a developer um you can now uh put itinto you what we call sidekickuh servers sometime uh it's it's just uha Linux device that has to be side byside with your hardware and that has thephysical interfaces to that hardware andand the that same configuration thatthat you had as a local developer youjust put it on the exporter run this asa as aservice and add an endpoint and a tokento connect it to the controller andeverything else is is the sameSo as a from the from the developerpoint of view when you areum connecting remotely this is how itwould look if you are going to use thethe command line options which is justan option is some some of the drivershave the ability to export command lineactions So the first line is requestinga shell for a selector of hardware andif that hardware is available you'regoing to be connected to to thathardware and you will have access to tothose commands with the J J commandwhich is completely dynamic So dependingon the drivers that you have you willsee some commands or other commands Uhbut you can also use it from Python Sowe allthe all the drivers provide a clientside that uh allow you to to performactivities on the hardware like in thiscase we are using the client under thecell connection to flash an image intothe device power it on and then uh wekeep capturing a a video snapshot It's avery yeah smallexample OkaySo yeah can I can I connect this to meto my CI/CD system So it's it's the samething Uh the CI/CD system can we havemore actions and and I I will show anexample later Uh but you can do the samething You can write your tests in Pythonuh and take your device through the lifecycle that you want and and provide theresults of of your tests Uh eventuallywe also want to add support for well youcould do it now but it's not verypredictable to testing with severaldevices at the same time Maybe sometimesyour test is going to involve severaldevices that are working together Soeventually you could do that Uh and thisis if if you are writing tests we alsouh provide some um some helpers for forpi test so far But yeah we are open toany other frameworks Um and this is anexample of of testing So if it abstractsthe part of getting a client and andand you can just use it So in thisexample we are just uh the first test isgoing to boot a Linux device with aserial port make sure that it can loginThe next test is going to take a videoframe and compare it to a snapshot thatwe have on the repository with some uhdegree of of freedom You can you canconfigure that and then it's going tointeract with the console* to make surethat you have a specific devices Forexample we use this uh for testing thatthe Nvidia device drivers work and thatwe the GPU is working that we can run uhinference and and so on It's one of ourinternal use cases and and this is howan exporter configuration looks like Uhthis is like the first part which isum definingum the the endpoint token and the nameand name space where this exporter isavailable on the cluster and then wehave this export part This export partis going to it's is defining like thedifferent interfaces that you have to tothe hardware In this case we have astorage device that we can control forthe device We have a power interfacethat is connected to an SNMP uh PDU Umwe have a serial connection and a videoinput Um but we have uhlots lots of uh drivers and and we aregrowing that uh and we are happy toaccept contributions and and the fraframework is super open you can writeyou don't even need to make your driverspublic Uh so the we have an like anexample repository that you can cloneand make your own driver Every driverhas auh exporter like close to the hardwareside and close uh to the client side Umwe even have we are starting to havedrivers to do some level of virtualhardware also because people wasrequesting that Um but yeah we have alot of future ideas and open to otherideas and I don't know if you can seethat on the screen well more or lessthis is how a a driver looks like uhevery driver is composed of a serverside that is what you configure on onthe exporter service and a client side Ithink I don't have time to go throughthrough the details but it's it's veryuh it's very easy to to write a adriver and okay let's get into the demoside of things I hope everything workswell Let me first explain what what Ihave deployed in in Kubernetes uhtogether with jump starter So in thisdemo I wanted to put together like a acomplete development environment on thecloud So we are using Eclipse and thedev spaces operator Uh we are usingTecton for the pipelines Uh Dex is incharge of theauthentication and we use GitLab uh inthe cloud for our repositories And thenwe have twoum two exporters Those exporters areactually running on Madrid on uh uh myhome office Uh I will show it later Andthere uh you have the the deploymentconfiguration for what you will see Soyou can reproduce this There are someinstructions I have tested it to somedegree If you try it and something failsplease ping me I will try to help Uh sothis is how how it looks Please don'tmind the don't look too much at the dustbecause we were doing renovations justbefore that picture and yeah I tried toget rid of it but yeah it is what it isSo I just try to make like a very simpleexample in this case So we have twoexporters One of them uh has umum has a Raspberry Pi Pico a de debugpro and a camera that is pointing to thedevice The other one uhhas one of them is is a debug de isworking as a debug probe to uh thedevice and then a logic analyzer but Ididn't have time to finish the driverfor the logic analyzer But the idea isthat you could use a logic analyzerconnect it and and take traces and checkthe traces if you wantSo okay I'm going to open uh this devspace here I I I already startedit because it normally takes like aminute or so to start the first time Andif we go into this desk spaceum you can see it comes from this uhgitlabrepositoryUm we have this Okay So this is uhEclipse and and we have our developmentenvironment in here Uh the example is iswritten in Rust It's just from theuh Raspberry Pi Foundation examples withwith Rust and uh we can uh we can buildituh with with cargo we canumyeah I can showyou is is the font size okay or so likeYeah make it just a little bit biggerjust in caseSo in in this uh workspace I uhconfigured um a couple of uh sorry Ithink I moved my mic Uh a a couple ofclient credentials Those are on thecluster If we lookuhhere those are configured as secrets andevery time you you start the workspaceumuh Eclipse is going to m those thosesecrets for you So I I have this is likethe cube configs in in Kubernetes thatis going to give m+e is going to give meaccess to to the hardware are availableon my lab So if I do GMP getexporters I will getuh the available exporters in in this uhjump starter controller and I can um Ican request some of that hardware Forexample if I do GMP shell and I will usethewebcam webcam uh through selector Sothis is going to ask the controller fora list for a for some time It it willexpire in 30 minutes but you can requestmore or less and we have like role basedaccess control that you can define anddefine like the limits of that Um sowhen I connect I I'm now connected tothis uh exporter I can use the J commandand I have like the drivers that Iconfigured Ican show you the configuration of ofthat um that exporter that looks likethis You will see it's it's what you seeon the J command So you have the youhave the J uh you have the pro uh driverfor the using the pro uh res uh toolwhich is a flasher debugger tooluh and and I provide the the serialnumber of of the debug probe Um I have aserial port uh pointed by ID So I'malways sure that it's going to pick theright one And then I have a a videooutput that is pointing to the to thewebcam um device So what I what I see uhhere on the other side it's just that uhthe video interface doesn't have a CLIyet Eventually yeah we want to edit butyeah we didn't have the time So now inhere I I can do things like um forexampleuh sleep I will give it three secondsVR resetAnd then in parallel G proof uh serialsorry serial start console So this isgoing to start the console very quicklyand then in the background it will waitthree seconds and perform do reset thedevice So okay this is thedevicerunning Um and I could I can interactwith the deviceremotelyUm it's it's it's a very sillyexample[Music]and you can you can use the othercommands Every command has subcomands Soyou in in this case the probe has theload arrays info You can read frommemory if you want but if I do a JProinfo I can see um what's connected tothe to the debug pro in this case But umI mean the most interesting part is isthat you canum you can interact and and build yourtests in in Python So I will start witha small example It's it's still not atest but it's a a video This one is avideo stream like the one that I had onthe slides Uh that is going to bewritingum a JPEG file It's like a Porsche manuh video transmission Uh thanks to VSCode and uh Eclipse it's it's working Soum and I can stop it anytime For exampleif I discon if if my session my listends or I disconnect I lose access tothis connection and any driver any anystream that it's it's been connected Butwe also haveuh some tests that we wrote for thisexample So I I like to use uh the Pythonframework with hardwarebecause it executes the the tests in inthe in a written order And that'shelpful because hardware tends to bestateful and it's quicker to to iterateSo you don't need to bring everythingfrom the start to end every for everysingle test It's not so deterministicbut it's it's it's faster So thisexample is um first testing the proconnection So okay I'm just checking Ihave an ARM device connected to my proThe next test is going to download mybinary file thatum uh Rust compiled and then the nexttest is going to interact with theserial console reset the device and lookfor this hello cubecon output The nexttest is going to yeah check that theregular work right is working Thenchecking that the DMA work you are rightis working and then interacting with thedevice a little bit And then we havesome tests for the leads So making surethat the leads are blinking and then thenext test is checking that the leads areblinking in opposite directionsSo if I run pi test here inside my myconnection uh it's it's going through uhall those tests checking that everythingworks We can also open the the videohere and see what the test is seeing orhow how it's looking for the leads inthis case UhOkay so the all the tests passedUmbut yeah now let's exit the console Thelist is released and it this is thishardware is again available for forusers in the cluster So let me show youthe cluster side of thisSo this is in this demo This is what wedeployed on the Kubernetes cluster Ifyou're going to use jump starter youonly need uh this deployment nothingelse so far Uh if you want to I mean Iwanted to make it nice and how it couldintegrate with the other projects likethe Eclipse Tecton pipelines and so onSo that's everything else but you onlyneed this one It's very it's very thinand uh you getum a few CRDs on the cluster Uh forexample you have thesporters So you can see the exportersare registered on the cluster and youcan label them You canum uh you can create new ones uh and getcredentials for them and it's it's veryeasy You can alsosee the the jump starter clientsuh on the on the system and you can alsoseeuh the the leases that can have beengoing on on the cluster So who used uhwhat and and whenum we we will need to improve that inthe in the future like do clean up orand start moving it maybe to a databasebecause that's a lot to put on CRD thisand on the long term I think that thatwill not work but yeah we we will wewill be working on that The idea of thisis that eventually you can alsoum uh figure out the usage of theplatform who is using what how muchmaybe you need more hardware of one typeor less hardware of of one type or eveneventually assign cost to to the usageof the hardware Soum I want to show you the in this casethe connection to Tecton Um so I Idefined uh a pipeline run here that isusing uh several uh tasks So we have atask for getting a list from the starterreleasing it uh running commands and Ihave a g clone because I had to to tweakit So and yeah I will not get into thedetails of of this but I can show youhow it works So let me do it fetchall So I'm goingto breakum make a breaking change So we have thethoseum leads that are are alternatinguh in intensity So I'm going to put themnonalternating and that should trigger afailure in the test because it's is oneof the things that we are checking So ifI do getcommit and I push[Applause]it I can create a mergerequestYes AndYeah we see uh it this is blocked untilthe tests will pass and uh that triggersthe pipeline in in Tecton You could Imean also use uh the the Gildab umrunners as well but I just wanted to toshow uh how it integrates with Tectontoo So if we go to tecton we will seethat yeah it's fetching the repositorythen it will uh build the binary requestjump starter for a list of the specifichardware that we need Um here in thedetails yeah I think we can see it laterit has the the selector for the hardwarethat we want and then you can after thatthat list you can perform severalactions and then the cleanup is arelease of that list So the hardware isback on the cluster and another job cancan use it or another developer can useitand yeah let me I will leave thatrunning Uh I'm just running out of outof time So let meum you go to the takeaways So[Music]um yeah with with jump starter you youyou can build a hardware uh you can useit as a hardware in the loop uhframework Uh you um you can create a awhole software factory in Kubernetes ifyou wantAnd it's jumpst is an open sourceproject The drivers can be uh on theproject contributed to the projectitself or you can have your own driversanywhere that the the architecture is isreally simple and also wanted to youhave this on on the slide So we run thisparallel community to jumpstarter aboutuh hardware uh hardware in the loop Soif you are interested in this topic uhplease uh join our meetings We we talkabout this all the time And finally yeahum some some links to our repository ourmatrix uh channeland yep that's it Thank you very muchforcomingUh I don't know if we have any time forquestions I think without will beavailable Uhwhat SorryWhatOh yeah let me open Uh it it should havefailed because we we we broke it So uhlet'ssee Thank you That was a good questionSo yeah it it failed Uh and it shouldhave failed Yeah in the H sorry the fontis too tiny Yeah but it it was failingon the test oppositeuh let's cycles So it it just detectedthat they are notalternating And if we go back to yeah toGitLab Okay it failed2025-04-15 21:59:25.880623- super new startupcalled Carabiner Systems but moreimportantly for this talk I am uh one ofthe uh tech leads on Kubernetes releasewith the release engineering team and Iam also one of the uh technical leads inthe open vec on the open source securityfoundation which is a sister of theCDNCF uh so you can think of me beinghere on this stage wearing those twohats uh for theday um so here's a quick rundown of whatwe'll well I'll try to touch on todaybecause it's a lot to pack into half anhour uh so I'll do like a super quickintro on what VEX is um we're uh alsogoing to show how we're thinking uh onvexing the Kubernetes project and thechallenges of doing that for such alarge uh project and organization andfinally show you a couple of new toolsthat we're working on uh to to make thisa reality and it hopefully if theinternet works because it it was failingvery bad a couple half an hour ago uhI'll we'll get to the demo so first up uvex um so what is vex uh if you had todefine vex um you can think of vex as asystem of documents that getchronologically ordered uh thatcommunicate the impact that avulnerability has on a piece of softwareuh in other words if there's a CVE doesit affect me or not and that's a channelof communication uh between you and yourusers and downstream tools and otherprocesses that um enable you to uh makeassessments on the on the exploitationcapability like the exploitationpossibility of that uh vulnerability umso to u go a little bit more into detaillet's assume that you have a containerimage uh in in VEX slang this is calledthe product so when you have a containerimage you will usually point uh yoursecurity scanners at the image and overtime you start seeing that CVS uh startshowing up this is normal and completelyexpected uh and most of the time uh isbecause the your container image hasdependencies and new vulnerabilities getreported uh on on those projects everytime um so most of the time thevulnerabilities will come from yourcomponents which in container images youhave roughly two kinds one is theoperating system dependencies in theimage and the other kind is the languagedependencies that youuh use for your software projectscanners will look find thosedependencies compare them againstsecurity databases and give you a reporton whether those apply those are foundin your in your components now um withVEX uh what happens is that you haveyour piece of software and when you haveyour released uh artifacts you supply umyou supply the scanners you also issue aa VEX document that pairs together u abunch of information which we'll uhtouch in a little bit and then when youscan the VEX document gets fed to thescanners and other and other uhprocesses that understand it and thenthey will still find thosevulnerabilities in your in yourartifacts in your container image butthey can understand that they do notaffect you and then may choose tosuppress them from the output um sothat's kind of the the really core uhway of uh how VEX works a vex documenthas a number of statements and eachstatement pairs together your softwareproduct which is in this case thecontainer image the component where thevulnerability was found andvulnerability itself and that getspaired with a vex status uh label thereare four labels one is underinvestigation which is a way ofinforming and acknowledging to your endusers that you know that there's avulnerability present but you're workingto understand the impact it has there'saffected once you assess that thevulnerability affects you not affecteduh which is kind of the most widely useduh vex status and fixed which is onceyou understand how the data flows it's akind of a niche uh niche flag it'sdifficult to use correctly but it wasthere just added for completeness of ofthe flags there are four um vex specsvex four flavors we call them um becauseVEX was defined by a community group uhhosted under CISA the cyberinfrastructure security agency in the USthey facilitated a group that communitygroup that works on on VEX um acrossboth here in Europe and andcollaboration between people here inEurope and the US. and also a lot ofpeople from Japan um and then so there'sa big flavor in CESAF the um the commonsecurity advisory framework um it makessense to have it there because once youuh publish a software advisory you canalso issue vex in the same formatthere's one flavor of vex in both of thesbomb formats one in cyclundx one inspdx and of course the one that we caremost about uh open vex which is theflavor of vex that we are um developingin the open sf uh openvex uh contrary tothe other ones is intended to be vex andvex only it's supposed to be superlightweight and embeddible in otherformats such as assessations which we'regoing to see in a little bit um but youmay be thinking well why should I careabout all of this vex stuff well it'simportant and we think it's about tobecome a little bit more importantbecause of the CRA that's coming up inEurope um so if you look carefully atthe text of the CRA it has some mentionsabout vulnerability management like thislike you cannot put a software producton the market if it has vulnerabilitiesuh that can be exploited and sometimesthose vulnerabilities um for example ifupgrading a dependency is like a reallycostly solution and the vulnerabilitydoesn't affect you you can choose to fixit and you will technically I'm not alawyer but we think that this willsatisfy this requirement or also theremay be other cases where you issue yourown patches and scanners are issuing umfalse positives and then you may want tovex them as well or maybe sometimes asis the case of Kubernetesthe uh security advisory doesn't haveenough resolution so some artifactsappear as vulnerable when they are notand soon um here are other uh instances ofvulnerability management that can vexcan uh address so for example you needto provide disclose information aboutfixed vulnerabilities vex has a channelfor that the impact of thevulnerabilities vex is the way it's thereason that VEX was designed and it hasfields so that you can publish thatinformation and as well uh aboutremediations that you may use to um touh um remediate those vulnerabilities umnow let's talk talk a little bit aboutthe challenges of vexing Kubernetes soKubernetes so vexing is not easy becauseit requires either information from atool that you can trust or informationfrom a person that you can trust and inthe case of Kubernetes it's such a largeproject with so many sub projects andcode areas that it's hard touh understand how to how to vex properlyso there are issues around who can issuethose vexes so who can you trust who hasthe specific um knowledge in depth ofthe codebase uh so that uh they canissue the vex uh the vex statementsthere's also an issue around um alsoauthorization so if you're someone who'sauthorized to issue those vexes how canwe let you do that in a in a securefashion um then there's the aspect ofpublishing where do I put thesedocuments then there's the question ofdata sources uh we'll see in a littlebit how we're structuring the feed uhbecause there are many ways that thereare many sources where VX uh data can beoriginated such as our own CV feed butalso the maintainer assessments uhoutput from uh tools and and other othersources may be possible as well uh andthen finally the the whole publishing sohow do I publish where do I put them howdo I manage the life cycle and all ofthat um obviously this is a multi-IGnetwork effort so not only we in in SIGrelease are handling this problem wehave ongoing conversations with SIGsecurity so we're teaming up with themand with the SRC to um to ensure that wecan uh together manage the project soafter some discussions we came what wecall the big Kubernetes VEX plan and thegoal of the big Kubernetes VEX plan isto have a VEX feed and since I'm also umone of the people working on VEX itselfit's really important that we uh I'mvery strict on on what we want Vex to doso we were thinking so how does theperfect and most complete VEX possiblelook like for Kubernetes so the big VEXuh Kubernetes plan uh we are planning onhaving one feed that is itself a mix ofthree different uh three different fixesthe three differe/nt feeds the first oneis vex about vulnerabilities that showup in the Kubernetes dependencies uhthat by itself is in reality two feedsthat are mixed into one one uh siksecurity runs perical scan so go v onthe kubernetes code and now the idea isthat we're capturing that output uh ohso go v check is the analyzer uh fromthe go uh project that gives you thereachability of vulnerabilities so ifyour code has a vulnerability go v cananalyze using the compiler if thevulnerability is actually reachable inyour code um it has native openvexsupport so you can just pass it the flagformat openvex and it'll give you a anopen vex documentand finally uh well so that's one theother so this is coming from go voluntethe other one is a vex from uh generatedfrom maintainer assessments soassessments the true um the most uhsenior maintainers of the of the projectum can if there's a vulnerability in theareas of their code they can uh generatean assessment that says okay even thoughthere's a vulnerability in this gomodule we don't think it affects theproject then sign it and then send it uhso that's that's the first part this iswhere we're at so we have tooling to dothisalready uh and I'll show you the toolsin a little bit they're general purposetools so it's not only for Kubernetes soif you want to issue VEX in yourorganization you can use uh the projectswe're going to present the second one isthe Kubernetes um VEX feed generatedfrom the Kubernetes advisories so whenyou publish an advisory for Kuberneteswhich is own numbering authority so ifKubernetes a project has the authorityto issue its own CVSum we can also generate a vex feed tocomplement the advisories uh the in theway to to think about it is like it'sthe negative of the advisory you issuean advisory and says okay the cubletversion whatever is affected by this newCV but here's all the other vexes thattell you that the other artifacts arenot so the cublary is not affected andso on and finally uh so in the the waywe're doing this is that we have a planand this is uh the next the next uhstage in the project we have a plan torewrite the Kubernetes vex feed in theOSV format which is anothervulnerability format that it'sincubating in the opens and the ideawould be to start from OSV uh write theadvisories in in OSV early and fromthere we can generate the otherdifferent formats that we need to to beable to to generate the the the theproject uh vulnerability feed so one isthe CV feed that gets published wherescanners um find the the the informationof Kubernetes the JSON that gets sent tothe CV database and also the informationthat we capture and publish in therelease notes every time there's asecurity uh release of uh there'ssecurity information in in one of thereleases so once we have once we havethose we can uh feed it into the vexprocessor and generate another one ofthe vex feeds and finally and this iswell we still not need to discuss alittle bit more of this is the CVinformation that gets picked up from ourcontainerbased images usually uh ourcontainer base images some of them areheavy have a lot of dependencies so alot of CVS sometimes show up uh so we'rethinking what to do thereum and then once we have the vex feed wecan take those vex uh statements andpublish them somewhere and somewhere canhave a number of possibilities so wehave the kubernetes we could build likea kubernetes vex feed that's onepossibility we could push them to theregistry with the images so thatscanners can find them um or Aquise alsoalso has this project called Vexov wherethey are trying to host VEX informationfor a number of uh open sourceprojects all right um okay so to tacklethe challenges of issuing VEX in the inthe organization we wrote a new toolcalled Vexflowuh invexflow uh is a tool that gives youa chat ops interface in github issues sothat you can handle uh the vex lifecycle and this is how it works so youtake a repository and then Vexflow willconstantly scan the repository over andover and over and one at at some pointit will pick up a vulnerability in yourcode and once it finds it it goes toGitHub and it o0pens an issue uh withtriage instructions uh and these triageinstructions contain information on howthe maintainers okay it's below that howthe maintainers can interact with it sothat they can either ignore the issueissue a vex statement or just issueanother one of the of the vex statementsso once a maintainer uh so onepossibility is that you just leave itthere don't do anything and we'rethinking maybe it'll just auto closeafter a while or the other one is thatone of the maintainers can come in anddo an assessment say okay this is yesaffecting us or not affecting us thatgets captured into a vex statement umand to solve the issue of um hook andvex we this tool natively supports theowners file that the Kubernetes projectuses so you can define the themaintainers in the owners file and uh itthe the plan is to support like aliasesand everything uh so that's when youwhen you interact with the issue it'lllook to see if you're there and also newnew names that need to issue vexes canbe authorized via pull request becausethey they get um captured there so oncean assessment gets made we capture theassessment convert it we capture thecomment convert it into an um an openvex statement sign it with sixur andthen we push it uh to the github at thestation store uh it's a new feature wellrather more or less new i think it'slike a year old that any project cangenerate at the stations and push it uhso it gets um stored in in in the in theGitHub at the station store and then theissuecloses so this is this is the flow ofhow it works so scan periodicallyvulnerability opens the issue you getcaptured assessments and then it goes sothis is so if you remember the initialuh the vex feeds of kubernetes the firsttwo this is one of one of those and theother is the go bone check so we havethose two feeds one is the vex data thatcomes from this process and the other isthe automated flow from go bone check sothis is a clone of uh VEX uh lab I'mworking on um so this is the first demois going to be about this critical pieceof technology that I calledum it's uh I call this project theinsult connector 2000 so the insultconnector 2000 is a piece of softwarethat uh implements uh in go a proxyconnector but it never connects the onlyits only purpose in life is if you tryto connect it'll give you back an insultum so I hope they're not too spicy forthe honest here um so for example I tryoh no I won't connect no I won't connectso it will never connect but in realitywell obviously the the purpose of thisof this code isto demo the vulnerabilities so you havetwo vulnerabilities in this in this codeuh here I'm importing a module that willpull in one vulnerability and by usinguh this version of the version of uhXNET that I'm using it'll also give meanother one so if Iscan thisproject it shows me two vulnerabilitiesso the OSV scanner is like pairs with gov and lets you know that one issupposedly exploitable and the other isnot now the first one is theoreticallyexploitable but since the only purposeof the insult connector is to give youan insult it will never connect to thenet so it cannot be triggered or you'resafe from from that oneumso overhere in theproject so the the way the way Vexflowworks is that you give it a repositorywhere it will uh open and handle all ofthe issues and metadata uh that itgenerates whichis is this one so I'm setting it up forfor this one so the problem with um sookay so let me go back quickly to theslides all right so now let's I'mrunning out of time so I'll just do thedemo okay so let's say that the authorsof the insult connector wanted to uhgenerate a uh to make the project uhcompliant with the CRA policies that arenotum are not are that they require you notto have anyvulnerabilities uh so the the projectrelease is fairly straightforward youhave a bunch of go files build a binaryand push it out um so now the the waythat the project attempts to comply withthe C the CRA regulation is that wellyou implement a process where you scanyour project and if there is avulnerability you block it and then itum it will stop the the release ifanything gets found so using Vexflow wecan do that so the first one is so thisis supposed to be running on a chrome oralso as a GitHub app um but I have the Ihave the actions uh stopped uh not notthe chron's running so that we can goand slowly see how it works so I'llattempt to run the the Vexflow uh uh theVexflow um uh job so that it updates thethe vulnerabilities in the project nowinternally we link OSV the scanneruh so it has it it's importing it as amodule and running it natively but italso has an example in the codebasewhere you can use it to basically shellout to other scanners if you if youprefer another one all right so thisrun hopefully yeah so it run and it wentand and scanned the codebase and itfound those two vulnerabilities so thefirst one isum the first onehere you'll see that it's the the onefrom uh from from the net module so thisone is not exploitable uh as it isexploitable from the point of view ofthe scanners and go on check but sincethe project the the program neverconnects to the net so it's not it's notuh reallyum it's not really exploitable so whatwe do here is that we issue a vexstatement here thisproject never connects to thenet so in vex when you issue anonexploitable nota affected statementyou also add a justification which isthe second part of the slash commandthere uh so it's not affected whybecause the vulnerable code cannot becontrolled by the adversary and then youcan also add a human statement belowwhich is what I'm putting there what Iput in the comment it's going to get umcaptured written into the into the intothe ve statement and then published as avestatement so that's uh the first one umso in a little bit the bot should replythat it already issued an attestation soif I go in this project to the at thestation store here you'll see that Ialready have an attestation and theattestation type is open vex andthen unfortunately the UI doesn't showyou a lot but it's it's there and now sothat's one that's part of that's one ofthe of the of the others now to simulatethe way we're doing the um the way we'redoing the the vex statements I'm goingto run the other one with go v so thisworkflow runs work uh go v and the onlydifference between go v is that it willgive you like an unsigned document butthis this action will actually umgenerate the attestation sign it andpush it uh to the to the registryso um we're probably going to run out oftime but I'm going to show youhere okay just let it let itfinish so inside of this it's runningthe the go bone check it takes a whilebecause go bone check it's compiling itfrom source uh so once it runs it itwill capture it and then well once youhave all of those at the stations thefinal command is so you run vexflow withthe this assemble subcomand and whatthat subcomand does is that it goesscans your repository fetches out all ofthe vulnerabilities that are stillpresent picks up all of the vestatements that are available for thebranch any that are not in avulnerability get get discarded and thenit assembles a full document with the CVdata that you just uh generated this ismissing the second one but uh you getthe idea so I'm not going to go deeperinto theum into the release process because I'mI'm running out of time and yeah so onceyou have those vexes you can the policyof no vulnerabilities gets done and allof the vulnerabilities are um suppressedonce the scanner picks those documentsup and yeahsoalmost at time so this is this is kindof the gist of it so the the here's thedata about the project so we're still weneed to have a conversation on whenwhere the the VEX flow tool is going toland there's a like on one side we couldhave it on Kubernetes or we could uhdonate it to openvex um we need to Imean we we have to have thatconversation um and then if you'reinterested in joining the VEX talks umwe have meetings every other Monday uhor join the ser release meetings wherewe uh periodically do updates on this uhsub project and yeah thank2025-04-15 21:59:26.700979 ��*�#��AlQEYxCXVkVUhey everyone how are youtoday um my name is Rascoand welcome everyone to CubeCon Londonand before I'm introducing myself Iactually wanted to talk a bit about whatwe are having today in London and maybewhat we had last year in Paris andactually last year I was speaking inCubeCon Paris and I remember walking inthe booth aisles and uh most of thebooths werearound cost optimization and Kubernetesoptimization and stuff like this maybedata optimization and when I walkedtoday and yesterday at CubeCon at thebooths todayum what I saw was a bit differentbecause last year some products Someboosts were around optimization withchatbot of AI right something like thisit was really cute but it was like aquestion is it a real thing is it AIwould be really answering all of ourquestions or is it just a buzzword orjust a hype and I think when I walktoday in CubeCon something elsewas was there it was not just a chatbotwas not just a cute LLM that you writeuh some words and it gives you someanswers it's really different you cansee a lot of products fully AIdriven youcan talk to developers and de DevOpsengineers and platform engineers thattell you that they not write codeanymore not writing YAML anymore theyonlywrite prompts right it's completelydifferentand I know it's it can be a bitoverwhelming to walk in any every prepresentation has this kind of cuteimages or every uh product try to showyou yeah I'm we are doing the best withAI but uh I'm sorry to tell you this uhpresentation is also going to talk aboutAI and most specifically aboutauthorization and access control and howwe can actually control the AI agentsand not only just develop stuff withoutunderstanding what they are and beforediving into the actual uh topic ofaccess control for AI agents I wanted totalk about Isaac Asimov and I'm almostsure that most of you know Isaa2��a�#��yAoCbJdcy3zzAall right um welcome everyone thanks uhfor coming to such a niche topic on thevery last slot of the day um hopefullyum you'll get something some goodtakeaways from the talk um so this talkis about how we're trying to build theand compose the VEX Kubernetes feed uhto talk about so that we can communicatethe vulnerability impact to our endusers uh my name is Adulo Garcia or PCOMthat's my usual pig if you see that onthe internet it's me um so I am asoftware engineer in a,3c Asimovat least uh these books and his storiesso this guy was uh came from Russia toAmerica in 1950 1940 and one of thebooks that he wrote was about robotsactually most of the books that he wrotewas about science fiction and stuff likethis but this one Iroot was about robotsand in one of the stories in this bookhe was writing about the three laws ofrobotics and the first lawis makes sense right do not harm therobot should not harm a human the secondis that the robot must obey a human thenthe third is that the robot shouldpreserve themsel itselfso these laws are pretty simple and wealso ask a lot of question these daysabout these laws right because whatabout chach what about GPTclaude will they always obey us willthey never harm us or at least uh breaksome rules or try to get in intounauthorized actions or leak data sothese questions are really good but theaim of laws quite are not the best fortoday innovation because aim of laws waspretty basic they were like either therobot can do something or not rightthat's not what we here for today weneed to understand what the AI agent cando and what it cannot but it has a lotof context around of it right and thinkaboutit the asim of laws could not handlereally competing uh interests becausewhat about a robot that might harmsomeone in order to save 10 others or Idon't know stuff like thisso this was it for the robotic historylesson and I'm here for in introducingmyself so I'm Rasco Kuan i'm anentpreneur from Israel i'm passionatetraveler and food explorer and for thelast eight or nine years I'm mostly uhdigging into DevOps Kubernetes platformand this is I'm a bit nerdy about it andlike most of you I guess that likesKubernetes and today I'm solving accesscontrol at permit.ioAnd for those of you who are not 100%sure about access control I reallyencourage you to maybe uh take a look atmy previous talk from last year CubeConat Paris and I covered most of the basicstuff around accesscontrol but don't worry we will have theprevious episode kind of the slide and Iwill make a wrap of itand before diving into AI access controlI wanted to talk about why is it so bigwhy is it really a problem accesscontrol and this one is a report fromOASP so for those of you who don't knowOAS OAS is the open web application kindof foundation that releases from time totime kind of reports reports for the top10 risks and top 10 vulnerabilitiesaround web applications and more typesof applications and the last report thatthey've released was in 2021 i know it'sa bit old but don't worry in the nextslide I'm going to fix this and and alsoactually they supposed to release in fewmonths the newer report but in 2021 itwas pretty obvious broken access controlis the first and most risky kind ofvulnerability that we have in the marketand in web application and nativeapplications and few months ago OASPreleased the top 10 LLM application andgenerative AI um risk andvulnerabilities and And in this reportyou might thinkwell it might might be the same asnative applications right what's thedifference between access control in webapplication and AI agents these are thesame and yeah you're right because inevery vulnerability and in every riskyou can see something related toexcessive permissions or unauthorized ACaccess so it'severywhere so I hope this kind ofexplain you why this issue is sobig and to start from the beginning Iwant to maybe go up on the letter slowlyand give an understanding what is reallyaccess control so binary like we talkedabout in asim of uh asim of laws thebinary is pretty simple is either you anadmin or not an admin like everyapplication it's pretty easy tounderstand but going up to the ladder wesee role based access control and youmight know this from kubernetes or otherapplications and role-based accesscontrol is just set up a role assign itto a user then you have access controlfor this user based on roles Right andthis is nice but in most cases for todayapplications you have a bit more umbusiness needs that you need to fillbecause whatabout what about a case when you are youhave a bank application4 and you have acustomer and this customer needs to havesome access to her child right to herchild bank account so this is nice butwhen the child get to be 18 the mothershould not be an admin anymore on thesechild accounts right so these kinds ofrelationship based and attribute-basedaccess control is what we call today FGAwhich is fine grainauthorizationand next on the letter we have AI accesscontrol and this can be also a bitoverwhelming because it's not thatintuitive what does it mean like we haveAI agents and LLM applications and thenwe need to actually understand what theycan do and what's the valueof controlling them so really the firstthing we need to do is understanding thevalue of it and the first value thefirst thing that we need to understandis that like every native application anAI agent has an identity and it hasaccess to data data sources datawarehouses and this excessivepermissions and this excessive accesscan be a bit tricky because AI agentsunlike code unlike The code that webuild has enormous actions that it cando right in our code it's prettydeterministic we decide what the codecan do but when we talk about AI agentsit has a lot of uh data sources accessand also external services access so weneed to in a way to guard rail this AIagent but we need to make sure we don'tharm and we don't delayinnovation and the challenges that weare talking about today when trying tocontrol and get access control on AIagents are first of all promptinjections and for those of you whodon't know prompt injectionsum think about a way that an attackercan just instruct an AI agent to dosomething that we don't want him to dowe don't want it to do so we need someaudit and tracing right we need to makesure that the II agent will have some uauditing around it so we know what theagent will do and we need also to makesure it will not leak any insensitivedata and will not also uh do someunauthorized action and I know it can bea bit uh sounds like a big deal and itis and we in permit.io we uh kind ofimplement and then develop a frameworkfor all of us to understand how we canreally save and make sure that our AIagent will be secured and will be umreally controlled end to end from theprompt that we are giving to the LLM tothe end to the response of it we need tounderstand how we make sure on all ofthe layers we make sure it's secured andwell controlled so In this framework wecall it the for perimeter framework wehave first of all the prompt filteringsecond the rag data protection thesecure external access and then theresponse enforcement and you can thinkabout these four parameters even in aneasy way we can just split these fourinto two and we can think about one thatbasically controls the data so now weunderstand what the LM gets as an inputand also which data it's trying togather and the second part will beaction controls so we need to understandwhere the AI agents is trying to accessto which services and also we need tomake sure the response is well enforcedso diving into the four parimeterframework we can understand that we havefirst of all the prompt filtering rightwe are giving a prompt either it's theuser or the instruction that we madebefore like the system prompt we need tomake sure that um the prompt is wellfiltered so in this case we have Sam I'mtalking about the diagram of course wehave Sam and Sam is uh asking the LLMthe AI agentgenerate an investment advisory i hopeyou can see that so Sam is asking forinvestment advisory and in this case wehave an AI agent that can obviously cando it but only if it is uh if Sam isopted in to this kind of advisory sobefore we are actually doing stuff likebefore the AI agent is actually pullingsome data and uh getting ready theadvisory we are putting as the firstgate kind of a classification agent thatin the next slides I will show how we doit actually and this classificationagent is trying to understand SAM promptso after understanding it we get twoimportant components one of these is theresource and the second is the theaction so in this case we have threecomponents in total one is the u5ser thesecond is the action and the third isthe resource by combining it bycombining these three components we canactuallyunderstand if SAM can actually generatean investment advisory by our policy sowe have the offz that's how we call itthe authorization uh engine and you canreally grab any authorization enginethat you want that available on themarket you have OPA you have open FGAyou have permit you have Opel you havemany kind of authorization engines so sothis is not really important which oneyou choose but to this authorizationengine you just need to put all of threecompon all of these three components andthe policy uh combining this you'll havean evaluation of the policy so in thiscase Sam is opted into this advisoryprogram and the AI agent delivers himthe uh investment advisory and it'sreally important to understand in thisstep that Sam sorry the AI agent doesnot do anything until the prompt isallowedokay the second thing is rag dataprotection so sorry okay so REGG forthose of you who don't know REGG Rag isa kind of a technique a framework for AILLMsthat kind of brings in in a simple waybrings the ability for the LLM to fetchand gather data from external datasources so in this case we have adatabase like a kind of a tabledatabase relational database but in manycases you can meet some data warehousesor even other uh data sourcesand in this step you can see that wehave Dr berto and Drbertos uh is asking does Robert haveupcoming procedures today and in thiscase uh we are showing Dr for Bertos umspecifically because we are showing howimportant is to understand that datashould not be leaked right it's amedical uh data it's a health data ofboth of the doctor and both of theprocedure uh for Robert sorry so we needto make sure none of the data is gettingexposed and leaked so in this case weneed to filter the query we need tofilter thetable and you can seethat also again we the prompt is gettingfiltered and we have the threecomponents that we just talked about wehave the user which is Dr berto we haswe have the resources which is theprocedures of Robert and we have theaction so Dris trying to read these kind ofprocedures and in this case theauthorization engine is trying tounderstand are the resources uh reallybelong to Dr bertos and I'll show youalso in the implementation how we do itbut if there are procedures that are notrelated to Dr robertos we need to filterfilter them out right after filter themout we are passing it on to the AI agentto take care of the rest of theresponse sorry the third step would beactually securing external access toexternal services so for those of youwho heard about MCPS or at least uh knowsomething about it to make it simple MCPis just a server for the LLM to kind ofinterface from the LLM to externalservices so in this case we have the MCPand the MCP can access um a calendarright we have Sam again and Sam istrying to schedule a meeting withRobert in this case we are expecting theAI agent to just schedule a meeting inthe maybe Google uh calendar or any kindof calendar but we first need tounderstand does the AI agent and Sam hasproper access forthis for this calendarso in this case you can see that theauthorization service again getting allof the three components that we justtalked about the re user resource andaction and now the authorization serviceis asking can the AI perform this actionand in in this situation it's prettyeasy yeah of course it can schedule ameeting but what about a case when Samis asking to delete off all of the mailsthat it has in the mailbox or dosomething that is pretty harmful inexternal service so we need to make surethat the policy has the right uh dataand the right access to make sure thesekind of actions would not take part asthe in the AIagent the fourth thing we have is theresponse enforcement so for this examplewe chose Pentic AI which is also a nicetool to kind of make the agent a bitmore uhstructured and in this case you mightthink okay we are done we have theprompt filtered we have um the data thatgot gathered from the AI agent kind offiltered out and also 6the externalservice and external access is alsofiltered so you might think it's donebut actually no this step is the mostimportant one and in this step we'remaking sure that the response the AIagent prepared following our policy soin this case it's really easy to talkabout compliance it's really easy totalk about maybe violent response andstuff like this and in you can see thatSam is asking kind of a weird questionabout the Chinese square and in thiscase we don't want to get Sam the all ofthe answers or at least the answer thatthe AI agent prepared is kind of riskyand does not uh following our policy sowe are denying the policy from we aredenying the response from Sam andtelling Sam that this question was notproperafter talking about the four parimeterframework I can really if you thinkabout it in these four workflow kind ofsteps you can understand how to do iteasily but in the next slides we aregoing to talk about really how toimplement these uh four steps fourparameters and you can choose whateveryou want you can choose all of the toolsthat you have today all of the uhauthorization engines that you havetoday and for this example I chosePentic AI and as you can guess I want tobuild a Kubernetesagent so this one is going to be a bithard for me because I want to read soonesecond okay coolso first of all I made a map for you toto be able to follow the steps and theflows and the parameters that we aregoing to talk about and this in fourslides we are going to develop andcontrol and access control AI agent forKubernetes it's really easy but it'sreally important to also make sure it'swell enforced and well uh controlled soin this slide you can see I'minitializing my AI agent with Pentic AIand I'm using something calleddependency so Pentic AI is thiscomponent calleddependency and it's like a middlewarelike you're using in fastest API or nodeusing this middleware we are able tokind of inject an authorization serviceto our AI agent and use it in severalstepsand you can see that also I'm choosingthe LLM that I want so in this case it'scloud 37 so sonet and I'm giving it alsoa system prompt and in this systemprompt you can see that I'm I'm tellingthis agent you are a Kubernetesoperation assistant and you must followthese steps you should always check theuser permission first only provideoperation guidance if the user hasproper role which is really importantand only attempt resource uh access ifuser has required permission for thenamespace and one thing that is reallyimportant to say here is that this isonly one part of the prompt filteringand this prompt is really nice to getthe AI agent familiar with the conceptthat it has access control but that'snot enough because we can all alwaystrick this AI agent and inject it someother instructionslike don't take care of this and justwith one sentence you can make sure thatall of what we are writing here isredundant so we must enforce this promptfiltering in a more lower and deeper wayso here you can seethat we'll go step by step and you cansee that first of all we are classifyingthe prompt like we we did in the firststep in the first uh slide about promptfiltering and we are classifying theprompt so I'm trying to understand ifthe user is trying to uh changesomething in the cluster or just readingsomething in the cluster afterunderstanding this I have all that Iwant right I have the user I have theuser ID I have what the user wants to doand on what right the user wants to readfor example all the ingresses in theclustergathering all of these components we canjust send it to the authorizationservice and the authorization servicewill uh get a response for us back andyou can see that I even made it evenharden and I'm actually saying that ifthe user my policy that is that if theuser did not explicitly consented toreceive AI uh advisory like we saw inSAM example so if the AI if the user didnot explicitly consented to receive AIgenerated adviceum we are not letting this prompt to getinto the nextstep and in the next step we have therag data protection so in this case ourdata source is Kubernetes right we cansay the data source is the is the CDDBbut we just let's make it simple and wecan say that the data source isbasically our Kubernetes cluster so inthis case we have Sam is asking give meall of the ingresses that has internetaccessand it's a pretty simple request but itcan be a bit riskyumbecause maybe Sam does not have accessto all of these increases and we don'twant to show Sam all of these increasesso again I'm building all of this resourresource request and by the way all ofthe code will be in a in a repo that Iwill share with you uh in a QR codelater but anyway I'm building all of theresources that Semi is uh is trying toget andthen and then later on I'm asking permitto filter the objects so using permitfilter object function we are able toonly get uh only get the resources andthe objects that Sam has access to afterfiltering all the the objects I'mreturning it to the nextstep in this case the next step will beum really securing the external accessand again the external access and theexternal service that we are here for isKubernetesto make thingsshorter because I don't have much timethe the most important thing tounderstand here is that I want that Samuh will only get access to which namespaces that it has access to right idon't want to give Sam all of theobjects in all of the name spaces sofirst of all I'm trying to understandwhich name space is it trying to changeandthen after getting the response frompermit I'm able to get the actions doneby the u resources that Sam hasauthorization to accessto the next step and the last step isthat we need to make sure that theresponse to SAM makes senseright and in this example I chose um tosay that if there is you can see in thewarninguh this response contain Kubernetesoperation that may modify the cluster soyou should add the flag of dry run so Idecided to do this instead of cancellingor disable the response and this isbecause things have done in the previousuh in the previous step all of theactions on the cluster have done on theprevious step so now we are just uh kindof enhancing the response for Samso I'm enhancing the response withletting him know that the the uh actioncontains some Kubernetes operations thatis a bitrisky that was uh the example for thoseof you who want to get into it you canscan thecode we have also another uh example forfinancial advisory if you want to getyour AI agent to do some financialadvisory for you you're more thanwelcomeand we got to the takeawaysuh slidesso I wanted to talk about the future ofAI access control after understandinghow we can make sure our AI agent issecured we need to understand how wemake sure uh we are ready for the nextsteps right and the next steps are ifyou think about it today AI agents areare really nice and maybe doing somestuff with external services but whatwith uh the fact that in few months or afew years AI agents will start talkingwith each other and in this case we needto make sure access control is fullyenforced on these AI agents because wedon't want these AI agents to kind oftalk with each other and do a lot ofunauthorized actions leak data and allof the things we talked before so forthis we need a really good userinterfaces right we need to controlthese agents with a good user interfacesand that leads me to also real-timemonitoring we need to make sure this AIagents workflows that we have in ourcompany in our product are monitoredlive and real time of course also alertsis really nice and important and alsoauditing and tracing like we talked inthe slides before really reallyimportant and with this I will end so Ithink the key takeaways that we need totake from this uh AI access controltopic is that AI agents always needstrong identity and access controlsaround it fga provides necessary thenecessary granularity that we need andyou can use the four parimeter frameworkfor all of your projects to understandhow we can make it secured step by stepso that's it and if you have questionsyou can uh scan this one talk with me inLinkedIn or I'm staying herehave a good day everyone2025-04-15 21:59:27.63180080 Adobe's alreadya heavy user of web assembly It reallyin many ways replaces Flash for us It'sthe successor to Flash Um it's in all ofour web applications Um and I kind of Igot really into it I uh learned aboutthe web assembly system interface Uh thethe the AB the AI for running webassembly outside of thebrowser Uh I meet a bunch of peopleacross many different organizations andbecome involved in a fledgling projectcalled WAMCloudUm so my experience with Fed Ramp sothis is what you're getting out of thisTwo important lessons One Spiffy isawesome It's an invaluable tool forworkload identity Nothing else like itat least then Uh and um and then two uhI'm kind of done personally with Dockerand in a way KubernetesUm and that that's not to say therearen't plenty of awesome use cases forDocker There are U but just for me I wasI was ready to move on with somethingelse Um so today we're going to talkabout uh the what and the why of webassembly We're going to briefintroduction I want to just kind of goover the idea of Woml cloud right theform the ideal form Um then I'm going tohand it over to Jonas He's going to talkabout what we've done with Spiffy He'sgoing to have a demo lessons questionsrecap Okay So web assembly as I saidright initially developed for thebrowser Initially developed to havethings other than HTML and JavaScriptrunning in thebrowser Uh for the purposes of thisdiscussion it's a binary format you cancompile to write once read everywhereYou can think of it as a tiny virtualmachine That helps That's what webassembly is for the purposes of thisdiscussion Okay So um Woml cloud can doa lot of different things It can do uhembedded type stuff if it can do telcotype things but for the purposes ofreally this discussion and from myperspective not that spiffy and womcloud doesn't apply to every use case ofweb assemb of w was cloud but fromcoming from a microservices perspectiveum you know what do I think of if I wereto think what's the ideal platform rightso secure right pretty important so uhit should be sandboxed you should beable to run untrusted code withoutworrying about it you should be able tohave multiple tenants who you don't evenreally need to know about Um and thereshouldn't be any vulnerabilities basedon RPM packages or you know debsUm it should be efficient right ideallyas efficient as possibleUm as I said Docker great tool but itwas created for development initially Itwas you know chude and croups and abunch of other stuff Um it's wasn't madewith production in mind and there's beena lot of great work everyone you've alldone fantastically to make it reallynice but it wasn't made for productionworkloadsUm and three it should be global Ishouldn't have a separate control planefor every single cluster right if ifwe're going to have workloads where thedata is if we're going to have lowlatency for our customers if we're goingto have a really good applicationexperience I'm not going to want tomanage you know 30 to 40 separateKubernetes clusters individually And Iknow there's stuff that goes on withthat right there's been a lot of work inthe last four years around that Butfundamentally you're putting a band-aidonsomething Okay And so why Web Assemblywhat is how does this line up right soit is a capability based security modelcoming from thebrowser Has really quick cold starttimes over a 100 times faster thanDocker to start up And it's portablesmall size portable across architecturescan compile to web assembly anywhere andit'll write for anything and it'll runit and it will uh you can run itanywhere So there's two technologies inWMCcloudnats and WASMtime It's a combination of two reallyamazing technologies revolutionarytechnologiesUm allows you to run small performantefficient all those things from the lastslide And NATS allows you to connecteverythingtogether You get you get WMC cloud fromthose two things NATS and WAM time Nowwe're going to talk about how we getSpiffy andWomcloud While the runtime is secure weneed to get the distribution secure andNAT has a lot of really secure featuresbut we need to kind of take it t9o thenext level Workload identity that Spiffyprovides which is mere was merely niceto have in Kubernetes Uh very useful inFed Ramp with Kubernetes is reallyabsolutely required in ahyperdistributed system like WASMC cloudHere's Jonas All right Thank youColin All right So let's talk a littlebit about what our use cases are forSpiffy Um so first and foremost as Colinmentioned Womcloud is powered bysomething calledNATS If you're notfamiliar with NATS I would highlyrecommend you check it out It's anotherCNCF uh project uh incubating I believeUh so what we have in Womloud isessentially two layers We have a controllayer and we have an RPC mesh layerwhere the idea is that the control layerwill allow you to send messages to thehosts and other components that arerunning Womloud itself And then the RPCmesh layer allows your components tomake calls to providers and othercomponents And provider is essentiallyyou can think of it as a as a binarythat fulfills a purpose It provides aninterface for the components to be ableto call remotely uh or potentially alocal host if it's running next to thecomponent Um but the host itself thecloud host facilitates all thatcommunication on behalf of the componentSo the component is never directlyinteracting with any of these systems Uhit's being uh provided by the hostitself So it's very important that weare able to secure that communicationlayer because if somebody were to ableto uh impersonate for example a host oror one of the control services in themesh um they could potentially do somemalicious things or watch for traffic UmNAS has some really interesting featuresfor debugability and things like thatbut we don't really want somebody to beable to tap into those withoutpermissions Um one of the things we'reworking on or looking at using uh Spiffyfor is uh secretless OCI pools So ifyou're running on a cloud provider youshould be able to essentially present aa spiffy token that's tied to a givencomponent and that should be able to beuh used for pulling down that componentbefore it starts executing So that waywe don't have to store credentials in abunch of places for the uh for thesethings to be uh able to talk todifferent registries potentially thingslike that Um the use case we'll talkabout today or kind of demo demonstratetoday is secretless access to thirdpartyresources So you can imagine that youhave a wide variety of different cloudresources that you may want to talk toLet's say it's an S3 bucket that you'restoring data into Uh or it's an LLMsomewhere Um so those are reallyimportant things to be able to talk towithout having the need to introducehardcoded secrets Um I think you'reprobably picking up on a pattern here Umwe're also looking at using it formutual authentication between the hostand and um providers in once we startintroducing additional layers of uh umhow we can talk to the providers Sotoday it's all overnat but in the futurewe're also looking to expand it topotentially over HTTP or uh t u TCP ifif that's what the provider requires Andthen finally as Colin was talking aboutuh you know forming these clusters thatpotentially across clouds or acrossregions across um basically anything uhdifferent uh edges um it's important forthose uh those uh environments to beable to trust each other and so that'swhere Spiffy is also very helpfulNow uh if we kind of zoom in on the mostbasic deployment um we have a wasn'tcloud host it has a spiffy agent runningnext to it on the same node uh and thatagent is talking to uh server the spireserver sorry um now um there's just oneproblem when you look inside the cloudhost um there's actually a whole bunchof these little virtual machines asColin called them the uh binaries thatare executing and those are completelyinvisible to the uh Spire agent Uh sothat's a little bit of a problem becausein order to get an identity from Spireagent you actually have to be able to uhattest it has to be able to attest tothe workload So at this point you'relike well what what do you do wellthere's some good news Um Spy hassomething called the delegated identityAPI Um and: if you read the kind of thedescription of it um the key words thatI bolded here the delegated API allowsauthorized workloads to obtain SVIDS andbundles on behalf of workloads thatcannot be attested by the Spire agentdirectly That sounds a lot like a usecase that we could use it for So thiswas very very uh I was very delightedwhen I discovered this Now the nextquestion I had is who else is using thishow is this being appliedum well the bad news is that thegenerated SDKs or the SDK for callingthis there this is a little bit out ofdate uh sorry this is a little moreupdate picture than the one I had at thetime but I believe there was basicallyfour known imports Um so not a whole lotof people uh projects to look at UmSelium is maybe the most baked one outof those Um and then the Spy Ha agentthat has since since then shipped was isnow a good another good reference Umwhat's worse is that on the Rust sidebecause we are actually wasn't clouditself is written entirely in Rust Um onthe Rust side we're the only ones usingit So this was a fun time learning uhabout the things uh the way that the SDKfor this actually works uh because thereis nothing to look at Um but the goodnews is that it looks very much verysimilar to what the um actual standardspiffy SDK looks like Uh it's very welldone So there's not a whole lot oflearning Uh it's just that some thingsmay not be tested very well becausethere's nobody else using it So at thispoint I was like sweating a little bitUm though good news on this front Uhthere's this proposal in the spiffyissue tracker that I've been uhfollowing along where they're talkingabout potentially introducing this as anthey're like an official way uh anofficial API into Spiffy itself for thisbecause the thing I didn't mention thedelegated identity API is actually notpart of Spiffy the spec but rather it'san uh a API that's added on top uhthrough the Spire project and and it's areally valuable API So I'm I'm reallyglad that this work is h starting tohappen Um and they they've kind of goneback and forth on exactly how thisshould be done Uh but you know I it'ssuffice to say that hopefully we'll seesomething actually standardized as partof Spiffy as we move forward here Now umso we have a booth here at uh at uhCubeCon uh both a was cloud projectbooth in the project pavilion and also acosmotic booth uh in the vendor villageSo uh we have this wasn't paid demo thatwe're showing It's kind of essentiallyjust demonstrating what you can do withuh web assembly as a whole Um so whatI'm going to do for the purposes of thistalk is actually zoom in on thisspecific set of things here Uh becausewe don't care for what happens after theAPI gateway for the purposes of thistalk So we'll just focus on that bit Andto zoom in on that a little bit morewhat I'll be demonstrating to you isessentially this flow where I have um alocally running installation of WOMcloud uh as well as the applicationdeployed onto it Um I'll make two APIcalls uh one API call uh to fetch anSVID uh from the comp uh from fromwithin the component calling out to thehost that it's running on which then isgoing to call the um Spire agent on theback side And and then what we're goinggoing to do with that um SVID when weget it back hopefully get it back um isto make an HTTP call out to AWS STS uhget back a secure a secure credential uhtemporary credential and then we willuse that So we'll exchange the S SVID uhfor a STS provided security credentialand then we'll use that credential totalk to uhBedrock All right let's see if thisworksUm Wi-Fi Wi-Fi hopefully All right So Imade my life I've cheated a little bit Imade my life a little bit easier Um forthe purposes of this talk I have createduh both back Oh I guess you can't see myeditor That's all right Um nope not thetab I wanted All right So um let's startUm I'll just show you that this is realUm oh my make token CubeCon All right SoI have a command What this is now doingis c um it's calling out to the localhost uh was cloud application the APIgateway application and it's uh sendingan audience saying please mint me a;nSVID uh that is for the audience of uhCubeCon UK 2025 spiffy in practice APIgateway right and it returned thispayload Um let's see if we can take thispayload I should move this over ActuallyI'm going to move a new browser tab overUm and then we'll go to I'm going to goto JWTMS because this is a little easierWe'll plop this Where I should have mademy cursor bigger huh where is mycursor uh all right Here we go So if weplop this JWT that was returned whatyou'll see is the audience that we wereasking for And then we'll see thesubject uh of wasn't cloud dev/ spiffyuh wasn't cloud dev spiffy inpractice/appi gateway So this is tellingus that we're getting back a real svidthat is addressed uh to this imaginarycubecon.uk um website Now what I alsohave is um a pre-baked command for uhcurling for a different token And thistime it's a token that's intended forthe AWS STS service Um and so what I'lldo is I will set this up as aenvironmentvariable Um and then I will make um arequest to bedrock So request print Iapologize I had to make these ahead oftime because I will typo all day and Idon't think you want want to see mefumbling on that So what we're going todo now is we're going to take that SVIDfrom the environment variable we willsend it to AWS STS and then once we geta response back from AWS STS we'll sendit to Bedrock and what Bedrock gave gaveback to us um what we asked Bedrock togive us back is a poem about workloadedity and spiffy so this is Bedrocktheir uh AWS's LM service uh just makingup stuff making up a story about spiffyuh in production which is great love itall right so that is the demoOh I didn't have to even use any uh anybackupsThank you So let's talk about Oh I wrongone Sorry Let me do this Is this allright cool So we saw the demo Great Umso you might be wondering what what didwe just do so just to recap we had a webassembly component that we're calling Uhand if you're interested in seeing thecode I'm happy to show you the code Umit's just basically standard looking goum that we first called to acquire anSVID uh through the token endpoint andthen what we did with that SVID is wesent it back to that service on adifferent endpoint uh for the purpose ofthis demo In a real world you wouldobviously not make two calls but bearwith me Uh we took that SVID we sent itback to the service uh that the servicethen forwarded on to bedrock uh sorrySTS and then S it um the response it gotback from STS it fed back to AWS bedrockuh converse API which made up a storyabout spiffy and workloadity which isdelightful Um and then so let's talk alittle bit about what the lessons we'velearned so far and and by no means arewe ready to go to production with all ofthis This is more of a proof of conceptUm but it you know as you can see itactually does work Um and I believe thisis the first time we're actually broughtSpiffy into components which is prettycool Uh so using uh using Spiffy andSpire from inside uh web assembly Um sowhen I was implementing this what Ilearned is that we had made someinteresting assumptions that reallydon't hold in a world where you're usingSpiffy Um we had we had expected thatthe authentication information the hostshave is longived Well it turns out whenyour tokens get expired as often asevery five minutes uh you need toactually invest time into making thoseum authentication handlers be able torespond to the case where your tokenscredentials have um have been expired Uhyou you've disconnected you have toreestablish the connection So end upshowing up some of that logic and we'llcontinue to improve that logicUm what I also learned in part of thisproject is just how much uhcustomizability that the Spire projecthas built into it It's really incredibleUh there's a plug-in for basicallyanything important that you can do withit Both from if you have a customauthorization or atestation methods thatyou want to implement but also in termsof how you uh potentially publish thingsout or how you notify about changes inthe system things like that So there's alot of customizability surface in thisproject So you're not just stuck withwhat what's given to you out of the boxbut there's plenty of um uh services youcould write And this is based over agRPC based uh protocol So it's actuallysuper easy uh super easy to implement Uhand then finally I I really want to givea shout out to the Spiffy community Umboth the Spiffy Slack itself and theconversations um in there were avaluable resource in terms of trying tounderstand the system and what has beendone before what are people doing uh andhow to approach different problems Umand then also the fact that there's likea whole host of people that are reallyworking in day and in day in and day outon this both in terms of implementing itbut also improving the spec and andmoving things forward I think that's uas someone that was part of the earlydays of Kubernetes I I reallyappreciated that because it's you youhave access to the people who know thistechnology the best Um there's a coupleof things I would love to see come outof the project Um so um my hopes anddreams are that the delegated identityAPI will become part of the spiffy specso we don't have to feel bad about usinga non-standard API Uh not that I feelbad about it because it's literally theonly way we can do this Um and then youknow it would be nice if once it getsstandardized um there being some codeexamples and docs around how to use itWe're hoping that we can create some ofthis ourselves as part of the when weannounce the feature more widely but umthere's I think there's some room forimprovement there And then on a personallevel um right now I can't build our ourhost gets built for Windows and a bunchof other environments and the Windowsbuilds uh we cannot use Spiffy Inspirein uh Windows because the Rust side ofthe world does not support the WindowsSo I'm hoping to probably after thisconference um spend a little bit of timeadding name pipe support into the RustCrate so that we can start using it onWindows and don't have to have optionalbuild flags uh for for that anymore Andthen finally um just recapping things uhI just want to emphasize Spiffy is notjust for VMs and containers You canbring it to web assembly You just haveto kind of work through the uh make itwork for your host essentially but it'sit's really easy to use once you get itthere Um the in my opinion one of thethings I learned from this is is thatthe technology choices you make reallymatter Um I think that's betting onSpiffy is is probably one of the betterthings we've done this well last yearbut this this coming year as well Spiffyhas so it gives you so much uh in termsof what's what's available out of thebox and the fact that the team behindSpiffy or the community behind Spiffyare also continuing to push the standardforward Uh there's a new proposal uhfrom a whimsy uh a working working groupcalled whimsy uh that's adding workloadtokens support and that's being broughtinto Spiffy as well and Spiffy beingpart of the part of the conversation forthat is really amazing Um and then Ithink the last thing I'll say is I Ibelieve that workload entity mattersmore now than ever You know we'regetting all these AI agents and thingslike that running a muk Um so it's kindof important that those things have umhave some kind of identity you can youcan actually make decisions on and Ithink workload identity is going to be ahuge g uh uh or game changer for that soto speak Um so you know we started withuniversal identity here I'll just recapit's for us it's the you know spiffy oror svids give us a secure passport forour wom components as they try to talkout to the world around them So I thinkthat's really really important Uh andthen finally um just want to uh shoutout that if you're interested inlearning more uh we have a really wehave a couple of resources We have agreat uh Slack community as well Uh wehave a very active GitHub um really gooddocs and uh as I mentioned we have a uhbooth at the project pavilion uh in uh6B and then also a booth for cosmonicwhere we have a bunch of the maintainershanging out uh on S680 which is uhbetween S6 and S7 I believe on the onthe on this2025-04-15 21:59:28.445992 ��0�#��Ac0dEL_bBRVUthis uh welcome to spiffy in practiceweb assembly web assembly identity forweb um sorry universal identity for webassembly workloads That'll be the lasttime I make amistake Um my name is Colin Murphy I'm asenior Rust engineer at AdobeI'm Jonas Um I'm Jonas Vio I'm a umsenior software engineer at Cosmonic Andwe're both WAMOM cloudmaintainers All right So I'm going totake you back to a really pleasant timefor all of us March 2020And[Music]um this is the origin story of myinterest both in serverside web assemblyand um my appreciation for spiffy Um atthe time I was the manager for Adobedocument cloud's infrastructureum for services Um and my number one uhpriority at the time was uh Adobe SignforGovCloudUm we had dozens of microservicesProbably many of you are familiar withthis kind of story Dozens ofmicroservices Uh they were HIPPA SOCK 2PCI you name it compliant But Fed Rampis is a very different ballgameUm and uh we'd been using ISTTO I'm veryI'm always trying to you know use thingsthat don't work yet So I was using uhISTTO 1.0 in 2018 I'd been using it forawhile Umuh and we needed FIPS 140-2 complianceSo I needed a vendor I needed somebodyto contractually uh do that And theythey stepped up They did it It wasgreat But there was a problem Um andthat's uh we didn't have Adobe'sidentity management system that's thatcouldn't go to Fed Ramp with us uh toour government cloud Um and uh we haveat Adobe we have the IMS we still do tothis day uh inter you know identitymanagement system originally created forentitlements for people to use ourproducts uh prevent py you know piracyum uh but we also use that for serviceto service calls which not great um butwe couldn't bring that with us um and sowe really this is the first time Ireally got to use spiffy um solo IO hadthis agent this little uh RPM I couldinstall on our non containerizedworkloadsand uh everything just got integrated inreally nicely with the stack So I was Iwas really happyUm so everything was going really wellLook at all of our containersEverybody's moving in the same directionUh everything's going according to planAndthen right yes uh so received some verybad news Fed Ramp in March of 2021 willrequire security scans for CVEes rightuh crit critical vulnerabil vulner CVSDoes anybody know what that is criticalvulnerabilities and exceptions Um youknow uh vulnerabilities usually insystem libraries RPM packages that sortsof thing Uh they're going to requirethat just as we go live so we have to doit now before we go into our quietperiod and uh and all of a sudden that'sthat's really everyone's full-time jobUm uh we have somewhere around a hundreddifferent Docker images That's probablypretty typical for most people now Ihaven't been in the Kubernetes spacereally for four years but you know umwith between our applications and thevarious parts of Kubernetes and uh we doscans we use J J J J J J J J J J J J J JJ J J J J J J Frog Artifactory we havewell over a thousand maybe 2,000vulnerabilities in our images becausenobody at least at that point I I sawthe previous talk this was this was nolonger true but had really thought toscan for CVES and Docker images Um andthis really just became our ourfull-time job We shift priorities tigerteams tracking and remediation of CVEessomething that we're constantly updatingupdating dashboards for our governmentsponsor um trying to just make sure thatwe can get a number low enough And itwas it was our full-time job Instead ofworking on features instead of workingon our backlog instead of working ontechnical debt getting things ready fora product we're just scrambling aroundfor CVS Um and I had always disliked theyou know inherent inefficiencies ofDocker or typical you know run Ccontainerd type containers Um especiallywe were running a lot of Java right Umbut now you know I really becomeconvinced there has to be a better waySo um at this point 2027>is it right so uh a lot offolks who are selling in Europe you'renow responsible for the security of yoursoftware that includes what you arepulling in that means uh no more blamingopen source for the problem open sourceis now your problemtoo so what do we mean right like it'sbeen the same old story every heresupply chain attacks attacks againstopen source attacks against vendorsoftwareuh things like you know uh the XZ utilsissue from uh last year again uh uhthings like the NHS blood tests whichwere compromised um if folks rememberpolyfill.io IO where somebody hadcompromised the DNS uh and things likeCisco Duo and uh from Orange CyberDefense uh there was some research where58% of all UK financial servicesorganizations were the victim of asupply chain attack in 2024 and let's behonest this is going to continue untiluh practices changeor the beatings will continue inevitablyyes until the pain is uh is too muchultimately there is no such thing assecure software there is no form ofsecurity that has proven infinitelyinfallible we deliver features and werace to get those to market because thatis the nature of the universalmarketplace that the internet has givento us so security will always be onestep behind and organizations that hyperprioritize security above features aredoomed to deliver features more slowlytheir market conditions are inevitablymore challenging so weoperate willfully and meaningfully inthisworld those supply chain attacks aretargeted in the same way that stucksethad data on the target system encodedinto the malware into the implant that'swhat we're seeing across supply chainsecurity at large and the CRA by tryingto deal with the fact that botn netsbased in light bulbs took down MK for amulti-billion dollar insurance claim andwiped out physical supply chains thereis a very real issue of patching andupdating that we need to address butwe're all getting tred with the samebrush and more specificity to follow sothis is why the European Union steppedin the US have taken a slightlydifferent approach with um a nicesticker is that a fair approximationyeah in in the European Union there isuh legislative movements with the cyberresiliency act to ensure that creatorsof software are responsible for timelypatching and the nuance that goes withthat sure so uh I won't go into too manydetails uh about about the uh cyberresilience act but really it's abouttaking again the the responsibility awayfrom unpaid uh uh uh maintainers of opensource software and placing them on thepeople who are actually selling thesoftware so if you are making money onthe software it's yourresponsibility but it's a good thingright this is like no more uh uh youknow getting uh if you're like a if youare an open source maintainer thanklessopen source maintainer uh no moregetting an email from somebody's legalteam at a company uh that is valued attrillions of dollars you're now um youknow it's their responsibility to makesure that that the their open source issecureand in addition to that um there are uhuh uh still things that will uh the LFuh comes in as well here where the LF isconsidered a steward and the LF is doinga whole lot of work to essentially sayall the projects underneath the CNCF theuh um open SSF the Linux Foundation moregenerally will be um applying reasonablerules to make it easier for folks whoare consuming open source touh adapt uh to to comply with the CRAwithout having to spend a ton ofmoneyso as we've said security will alwayssit one step behind feature delivery newsoftware inevitably brings untested codepaths bugs with security side effectsthen are called vulnerabilities but it'sthe same same problem different day andso ultimately we don't know what's goingon inproduction this is about as far as we'llgo with the context framing of theproblem but I'm happy to announce thatif you'd like to watch a keynotetomorrow morning with even more CRA init then Mike has one with uh our friendEddie Knight who is one of the co-chairsin Tag Security and the Open SSF as wellthis is an emerging space as a Europeanopen-source software provider artic?lefive which we will be under the opacesof includes a caveat that it willrespond to market conditions and actualbehavior after the implementation of theact the Linux Foundation legal team theOpen SSF Eclipse Foundation Open UKeverybody is looking to respond to thisand um yes it emerging unknown and oftenthe fourth Rumsfeldian quadrant pleasedo come and talk to us if this is apassion or a concern of yoursso with that what software is vulnerablelet's talk about how we can actuallyconcretely deal with thesevulnerabilities when they arise legalissues aside this is not legal advice sowhat software is vulnerable well whichpart of software is not vulnerable onething missing from this list actually isthe AI generated software that you justshipped to production but of courseanything we ingest anything that itdepends upon from an application contextanything it depends upon from anoperating system context in the baseimage all the other layers of yourcontainer binary artifacts that havebeen pulled down from GitHub and justdropped in the underlying platforms thatwe trust for I mean Kubernetes itselfand the orchestrator the hardware thefirmware the entire stack everything isalways at some riskso if all software is vulnerable how dowe optimize our remediation so that thatcritical door metric meanantime toremediation is as low as possiblebecause that is not just a feature ofdelivery benefits it is also ourremediation for vulnerability as well asforbugs so uh that's where guac comes in alittle bit uh so guac or the graph forunderstanding artifact composition uhit's an open- source project underneaththe open SSF and again that's the opensource security foundation for folks whoaren't super familiar with that they'rea sister uh foundation underneath the LFso it's like similar to the CNCF umwhere the CNCF is focused obviously oncloud and cloud native um open SSF isfocused more purely on open- sourcesecurity uh uh there and so uh it's aproject that sort of lives under thereso um before kind of getting into thedetails of of that let's kind of talk alittle bit actually Andy oh yeah uhright tailing practices to contentuh broadly from a dev sec opsperspective we care about theclassification of the systems that we'reoperating under and back to that returnon investment riskreward balance forsecuring a system if a hedge fund has aproprietary algorithm that itself has ahuge and reasonably tangible value ifI'm listing physical project productsvia an e-commerce store probably my riskis related to the reputational damage ofseeing somebody scroll or deface mywebsite versus the actual monetarydamage of the hedge fund losing theiralgorithm so while we're talking aboutthese riskrewards return on investmentbalances you must be the judge of yourown data classification nothing isuniversally applicable but we'll we'llgo through this from the most paranoidperspective possible and one thing toadd on there data is king right ifyou're not tracking it if if you don'tknow about what's going on in yourenvironment you don't know that hey yourhedge fund algorithm was just stolen umif you're not tracking all the thingsthat are going in and knowing oh wowsomebody has changed thewebsite you're you know you're going tofind yourselfsurprised so uh assumptions for thisbrief context setting this is anorganization with a defined SDLC we'reusing a source repository and a buildsystemwe're using GitOps becausewe approve of the inversion of controland the removal of secrets so instead ofpushing directly to production we'regoing through an indirect measure whichis pushing to a appendon system ofrecord which essentially in this case isuh a nonforce pushable git log you couldalso use something like recall this isalso what blockchain is supposed to befor but it's just a Merkel tree so thoseare thebasics and then what do we care aboutwhen we're pulling in dependencies thatwe're defining into thoseartifacts yeah soum how many folks here have done curlpipe bash or sh whatever right most ofus right uh soum that's not a great practiceespecially into uh production systems orinto systems @that are part of your SDLCwhat we should be doing right is weshould be uh verifying the provenence ofthat software did it come from themaintainers I expected it to come fromor is it a typo squad did it come fromuh the maintainers or did somebodyactually uh compromise their source orbuild right this is where things likesig store and tuft come into play andmake it very easy and support for someof these things are now in mpm they'rein homebrew there's discussions to addthem to multiple other um uh packagerepositories like Maven and and andPippi and uh for folks who are familiarwith salsa which is a build framework asecure build framework definitely dothat and that's also supported uh in npmand homebrew and if you're not doingthose things hey build from source umverify through other mechanisms like wasthe the software signed by the rightparties and use other build systems likeyou know basil uh uh uh use trustedpackage uh packages from a vendor anduse things like nyxESBOM is not a four-letter word okay uha lot of folks there's a lot of FUDaround esbombs right uh esbombs are justa means to an end right it's for folkswho are not familiar it's software billof materials it's just a way of anecosystem agnostic way of describing uhthe composition of your software themetadata about your software and thatsort of thing right and also uh as a lotof folks who are currently just using itfor purely compliance reasons sbombs aremostly useless if they just sort of sitin a network share and that's it youshould be doing things with it right youcan analyze them you can and we'll talkabout that in a in a second here but atthe same time whoops I forgot a fewwords at the end there um sbomb is not apanacea sbombs are not enough and alsodon't overload them way too many folksare putting ev like whatever metadatathey can possibly imagine inside ansbomb at that point it no longer becomesa standardized format right you justhave a ton of data that is hard tounderstand what's in uh what is thisthing even describing right but we cankind of put these things together andmake it easy um um to to help with thesupply chain security hereso uh oh if anyone is playing the AIgenerated image drinkinggame best of luck to you on this one uhright when we're talking about the CRAone of these questions is is the buildsystem secure are we going to get solarwinded are we going to get um the the TFsomething action TJ something fromGitHub last week where this supply chainvulnerability proliferated becausedependabot bumped a load of um sorrybumped a load of dependency versionsultimately running speculative buildsand commit uh sorry running speculativebuilds on commit in an untrusted CI ison the maintainers of the projects butbe careful about where your secrets lieand and This comes back into the samepiece when people are providing softwareto an organization how do theyself-certify that they are above the barof reasonable industry best practice sowe have Salsa which is about at astationlevels we're just going to save themetadata for our build stages in case weneed to query the data lake and provepost vulnerability or post exploit thatsomething was actually done to astandard scorecard from the open SSFwhich is again partially self attestedum there's also security baseline fromthe open SSF in toto is a project verydear to my heart that deals with thoseuh interstitial interstitial build stepsigning so what came in what transformsand what came out that is GPG signedwith hashes very simple but brutallyeffective there's nowhere to hide thepoint being for CRA self attistationthose security scorecard securitybaseline these build hygienepractices are the level of detail thatthe Linux Foundation and the theorganizations we're involved in isproposing to the European Union as avalid baseline for security for aproject remains to be seen whether ornot that stands up to uh legalinterrogation which for which I hope I'mnot individually involved butnevertheless as Mike says we can alsorebuild from scratch if we really don'ttrust and then we're into hermeticbuilds and uh essentially keepingAeverything as far from pulling thingsdynamically from the internet uh curl tobash add and and copy and with thatwhat's the solution Mike I hear you askah quark And I'll try and go quicklyhere so we can kind of get into a demowhere we show some of this off thank youso much thank youso what is GUK um so again GU is it's abackronym for the graph forunderstanding artifact compositionreally at a high level all we we'redoing is we're just sort of ingestingall this data about the your softwaresupply chain so things like softwarebill of materials vex documentsscorecard salsa data we're correlatingit with upstream stuff like uh um uh youknow things like clearly defined forlicense information which is an OSIproject uh OSV for vulnerabilityinformation and then that allows you tothen have a whole database you can queryabout stuff like hey where does thisvulnerability live does it impact oneproject or a hundredso um this is a little bit of an a moreuh deeper dive in there but pretty muchat a very very high level you should belooking at using something like GU uhlooking at you know taking an inventoryof your software on an ongoing basisdon't just take a snapshot of an SBOthrow it in a bucket never look at itagain correlate it with stuff like youryou know uh your runtime information youknow use various open source projectslike CubeCape and those sorts of thingsand then if you don't want to run guacand you have a handful of sbombswhatever they're in JSON just use jq useduct db there's lots of different waysyou can just sort of query a a a JSONfor for some of thisinformation and then once you have thatinformation you can do stuff with itright you can create security rules youcan create policies right and you canuse things like uh other CNCF projectslike OPA and key verno right and then ifyou don't have that establish cleargates right you know if you have don'tjust allow somebody to just publish arelease without doingchecks and happily that moves again fromsecurity on the one side to well how arewe actually concretely going toimplement this when we've got a pipelinewe've got pull requests coming throughwhat does our continued integration looklike so to to give uh a high levelexample of a hardened CI/CD pipelineagain optimizing for paranoia here areference supply chain so we generateour software bill of materials that islike software composition analysis whichwe do to scan all our dependencies forCVS except for we're doing thisourselves sbombs can be generated atmultiple phases of the build process ofthe ingestion process never trust ansbomb fully without reverifying itbecause somebody could then be bypassingyour software security scanning vexdocuments vulnerability exploitabilityexchange it's taken me years to get mytongue around that twister thesedocuments say that's a CVE and we're notgoing to fix it because it's actuallynot the right classification or that's aCVE it's mitigated in this pull requestthese are live documents that provideadditional metadataaround the reason that your softwarevulnerability scanner is lit up red likea Christmas tree solzer assistationsthose vulnerability scans these all getdata logged data leaked we generate themso we end up with an artifactunderstanding across the entireorganization which gives us thisqueryable document format data set sothat when a CV happens on ChristmasEve the individual responders have asingle registry of all of thosedependencies and they know where to lookwhat needs patching if you think aboutthe log for shell inversion and you knowwe've been working on this problem forfor many years and the headlinevulnerabilities that melted the internethave come out repeatedly and there willbe more it's supply chain is is the newblack in manyways these vulnerabilities tend to turnup at the most inopportune times becausethere are higher order adversaries whoknow when different holidays occur let'ssay so reducing the amount of toilrework and vulnerability assessment thatmust be universally performed by everycompany that's exposed to this aroundthe world is the value proposition ofthis level of uh Binterrogation so uhlive demo in progress what are we goingto do we are going to uh artificiallycreate a CV a CVE rather uh we'll issuea pull request use that to verify thatthe software still builds automaticallywe're now going to loop in flux so we'retaking the security teams angle we'retaking the intersection where they bumpthe dependency with the development teamwe're taking the continuous integrationportion where we verify that all thetests still work and this softwarechange is not going to burn the hedgefund and then finally merge and deployand return to our uh our Christmasdinner or whatever it is that we'redoing um did you know I've actually thisslide is slightly out of order becausethe live demo is three slides ahead sowhy is this important and what have wechanged configuration is part of thesupplychain i've eaten buried the lead herevery slightlyso th this is actually based upon um aflux reference architecture excuse meI'm now multiple slides in the wrongdirection i've slightly rearranged theseright so before the live demo of thelast slide that you saw uh we are opensourcing something new for Flux todaywhich is using OCI artifacts in place ofgit so this is githops without git anduh the pathy gitless githops i'm sureyou'll agree is the easiest way to referto this why is this beneficial we'veremoved network routes from GitHub andGitLab and internal systems and externalsystems from production we've removedthe cloning burden of pulling down anentire git repo sparse or otherwise andwe're using OCI artifacts so there's nowa universal set of security controls wecan apply to the configuration to theapplication these are going to be signedwe can trace those signatures back tothe build systems and the provenence sowe can identify is this verified to bethe thing it's supposed to be and did itcome from the place that we thought itdid there's uh there's a whole host ofall sorts of things that we've opensourced as part of it uh it's up thereon the control plane website we've alsogot cluster sync for workload identitywe have got recursive bootstrap so onecommand will boot a management cluster ahub and spoke cluster or hub and spokemembers and then shards arbitrarilycomplex airgapped hundreds and hundredsof clusters can be deployed from oneplace and all sorts of quality of lifeenhancements for administrators of verylarge clusters we've also integrated thecommon expression language kubernetesnow supports admission control covero isis another instance of of a tool thatdoes this common expression language isanother way of low-level interrogationof metadata in policy it's awesome superI'm super stoked that it's landed inKubernetes we now support that in Fluxas well so what does that mean it meansyou deploy a cluster you drop clusteradmin inside a thread somebody getsaccess to your git repo which is goingto get bundled up into an OCI artifactthat gets pulled by the cluster they'vetried to escalate to cluster adminsecondary cell policy says there's nomore cluster admin available in thisname space we're alreadydeployed and way more than than oneslide can fit the multi-tenency lockdownextensions of that so universal controletc etc um plenty more in the blog postand uh as always Stefan and the teamhave put an incredible amount ofdocumentation together beforehand plentymore is available um go to the demo andwe're racing through to the demo we'llput these slides up forposterity problem space supply chaineverything's terrible this is one way tofix it that's the slide that I waslooking for right please bear with meyeah you got to just escape out of theuh Yeah let's kill this guy[Music]yep goodokay so let's see what our Kubernetescluster currently looks like we shouldhave Guac deployed the Wi-Fi here hasbeen notoriously flaky so uh crash loopback off on the certifierwonderful okay so we have uh with acouple ofexceptions guac running on a clusterlet's have a look at um what we haverunning uh we've got a customizationhere and what apps are we running theregreat okay so we can seeuh that we're running a pod info binaryhere or a chart rather so this is theapplication that currently has no CVsand we will artificially introduce a CV2and show you the remediation process andhow that works so digging in to see whatthat lookslike we haven't done it yetright so we are we are up and runningthereuh okay and we'll just pop briefly intothe repository here to see if that'sgoing to work full screennope okay and and that's the chart thatwe're running on the sideso that's the context that's the clusterimagine we're building the sbomb forthat during the CI/CD run we're going touse Trivy which is the de facto tool umone of the best open source securitytools that has everexisted so what are we doing we'regenerating anSPDX SBOM in JSON format and we're usingpod info whichis a pod information tool I suppose verysimply uh wonderful so that probablyneedsuh Stefan's name in front of it there wego okay so now we've gotum podinfo right so very simpleespads and relationships between thingsthere's the fundamental supply chainsecurity artifact if you like so we cannow run a vulnerability scan or let GUdo it for us and build thegraph so we will use Guac toupload that podinfo lovely he says collector maybeforwards and ports while I'm on the waythrough apologiesokaythere we go so we've now taken that SBOMwe've pushed it into Guac so it isindexed and it's available for us toquery with a host of of differenttooling it's got a GraphQL API on thefront but maybe you want to talk aboutthat very briefly on set up um so justpretty much sbombs are in there itparses it out turns it into a giantgraph if you have lots of sbombs you canconnect uh all your various differentsoftware projects over time and space soyou can go and say "Hey I have a 100projects great you have 100 sbombs." Youcan find common vulnerabilities commondependencies that sort of thing and thenas you go create new versions you cansee what got fixed what didn't getfixed thank you right so we've uploadedto Quark is this the right package wellwe checked repo digests we can correlatethem there because we have contentaddressability and they are the sameum in the meantime we've received asecurity notification we know thatthere's a vulnerability here so we aregoing to search with Trivy through thepod info sbomb for Viper now Viper is anincredible piece of software um what amI doing there i should have put Stefan'sname in front of all of his imagesapologies okayokay so we can see that Viper is beingused as a dependency for thisapplication um andthat that fear is what we can use Guacto verify across the entire estate soinstead of doing that on a case-byasebasis for each image we can use Guac toget an error for all connections but toscrape everything that we'veingested and then here we will mark thepackage as malicious now of course it isnot really malicious but we are justgoing to pretend it is so poor Viperit's not really malicious and now we canuse Guac again to query for bad packagesum and poor old Viperuh181 uh right there's In the interest oftime I'm not going to try and debug whyI've failed to get a network connectionum but running through with this so whathappens next this vulnerability thentriggers the pendot there is a PR comingthrough u which would be for example onthis repo hesays very quicklyum that PR is then speculatively mergedwe can then run our tests to ensure thatwe don't break our productionenvironment andmerge with a human in the loop with asignature so we can verify that we havenot broken the things that broadly isthe demo i realize we are running alittle bit closer to time than uh weshould have been uh so withthat we're summarizing veryquickly he says able to click on it umcommon expression language is amazingthese slides will be available it'snative for admission control forKubernetes and for Flux multi-tenantpolicies there's a load of escalationprevention and multi-tenantum compassionate tooling that we'veadded uh Timony is a whole new worldwhere Q is used for non-uring completeHelm charts so you can reason about whatyou deploy beforehand and thank you verymuch for your extended attention2025-04-15 21:59:29.166081 R R�� #��QAD21yF0E-v2svery good afternoon everybody welcome toMind the Gap the the most uh British punwe could attempt for the talk contentbridging supply chain policy withGitless GitOps and Guac i'm Andy uhClicker is here let's move that outslightly and it's not quiteworking you're focused on the wrongthing yeah thank you thankyou there we go hellouh despite the shaky start I am in factCEO at control plane co-chair in tagsecurity in years gone by that is now anemeritus position and CISO at open UKwhere we focus on open source openhardware and open data and attempting toprevent regulatory bodies andgovernments from blowing their own feetoff with things like the CRA which we'lltalk about more and this is my dearfriend hi I'm uh Mike Lieberman i'mco-founder and CTO of Cusari uh done awhole bunch of stuff in open source uham a tech lead in tag security i'm agoverning board member of the opensource security foundation or open SSFas well as a TAC member and a maintainerof various open source projects like GUand Salsa and also there's a book thatif uh folks areinterested right so what are we going totalk about today you may have heard ofthe cyber resilience act that we'llfeature intermittently throughout GUACthe graph for understanding artifactcomposition and how to manage CVS wherethat is relevant to GitOps fluxdeclarative paradigms and ultimatelysecurity a live demo which I hope you'reas excited about as Iam and then how do we tighten that evenfurther with the things that wouldn'tfit into a half hour talk and then someof the extra things that we can seearound Timony and Q on the edge of theecosystemso uh the cyber resilience act uh okayso not going to go too deep into thisbut what =E of tools that it hasaccess to which we'll talk about um andgo off and actually do the worknecessary for us um which is incrediblyexciting right we can ask it to do tasksthat that were otherwise quite laborintensive Um and we can give it theinformation it needs to make those thosedecisions So lots of autonomy That's theimportant point The ability to interactwith the environment I think this isquite unique So rather than justreferring to a trained LLM uh whereperhaps you know the knowledge isperhaps out of date it can rely on someproviding it some extra information Umit's goal oriented um and it's got somememory and we'll talk a bit more aboutthisSo I think one of the things that one ofthe places I would recommend reading ifyou haven't already is you know buildingeffective agents uh by anthropic um agreat read It talks about all thevarious different models that they'reseeing and of course this is evolvingall the time but I really like thisdescription that it uses which is itdescribes you know a type of agent as anaugmented LLM I think that's a reallygood way of putting it which iseffectively it takes input and it hasaccess to a variety of things which cancomplement effectively uh its responseum in order to you know really drivetowards the goal of answering ourquestion or delivering on the task thatwe ask it to do and those are mostlyretrieval So retrieval has been aroundfor some time We've always been able toadd sort of so-called rag um where wecan take some existing knowledge of acesthat we have maybe internal knowledgebases and we can complement the LLM Sowe can give it some like somewhatorganizational specific domain specificknowledge that it can use Uh and sothat's actually been something we'vebeen able to do for a little while Umwe've also been able to provide it withthe ability to have memory uh as well Sorather than having to go to the LLM eachand every time um you can actually use amemory caching layer and this canobviously save on save on tokens whichis pretty advantageous um but it canalso you know remember the insight itcan remember the context of theconversation that we're having with theLLM But what is interesting now is thatwe're beginning to see um these LLMs beequipped with tools which we can provideaccess to Um and there's a lot of I'msure you you may have seen some of thework that's being done with MCP which isa you know a protocol that's beingchampioned and being developed in theopen by anthropic Um but what it enablesyou to do now is to start plugging inthese tools um that we have and that'sreally quite interesting because we canstart giving it access not just to likeuh corporate corporates a corpus ofmaterial but we can also start providingaccess to things like APIs things likedatabases our actual systems and so wecan actually ask the LLM to call thoseand actually use it as we would right ashumans or as as other systems wouldwhich is really good which is reallyyeahexciting and so you know specificallythinking about tools well you know theyhave very specific functions right justknow much like APIs and they can becalled when required which is reallyinteresting so the LLM can reason aboutthe task and then it can decide when touse these at the right time um and itcan interpret those results you knowit's it's you machine it's youresponding in machine code you know it'sreturning uh messages that can then beinterpreted by uh the agent um and theycan be used for like particular tasksintegrating into the workflowBut effectively you know what they'redoing effectively are the tasks that wewould be doing but in that kind of loopand determining when it's needed And soI sort of thought like okay well hearinga little bit about this particularlyfrom some of the the partners that we'reworking with all right actually howwould we go about building one of theseuh and actually how would we think aboutand this is where we're going to get toin this talk is like how would we thinkabout securing these things and so yeslike everybody else I vibe codedsomething rather rapidly not something Iwould think about putting intoproFduction but I sort of thought like ifI was to do this and I was somethingquite canonical um so to keep thingssimple for the purposes demonstrationbut if I was to build like a chat agenta customer service assistant you knowwe're all probably used to probablyusing these or trying to engage these UmI I think some of the legacy versions ofthese chat bots have been reallyfrustrating I don't know about you butwhenever I've tried to engage with themum it's been a really frustratingexperience They often don't understanduh and normally speaking they end upjust basically sending you to a humanoperator to deal with your inquiry Umbut I think with the advent of some ofthese newer generation models and LLMs Ithink we actually have the ability toreally enhance those agents quitesignificantly Um and we can a we cangive them access to informationreal-time information um that they canreason about and and actually provide areally sort of meaningful response to usas a as a user Um and of course thatwill hopefully be really assistive um toto companies So I thought about buildingthis one um and what I wanted to do isattach a couple of different uh tools uhalso you may heard these referred to asfunctions uh as well and so I verysimply wanted to represent like anorders database So if I wanted thecustomer service agent to be able torespond to me understand my request umand go off and actually check up on theorder itself and also the status of theorder I wanted to give it access to APIsthat it could use to do that Um and inmy case I used obviously lots of choiceshere but I plugged into Vertex AI withGoogle and I used the Gemini Flash modelUm of which of course there are many butand this is the prompt that I gave it Umso I gave it a yeah pretty reasonableprompt I mean um really wanted to giveit its role like understand uh how whatit was what it was doing and how itwould go about achieving that as well uhand so I gave it a set of uh tools inthis case kept it really simple like getorders and get order details um that ithad access to So informed it about thosetools and how it could use them um andyou know the sequence that it thateffectively it should follow So um Imust say I'm a beginner at these promptsso I'm sure there's better that I can doCome and let me know later if you thinkthere might be something to add and toenhance this um in in order to actuallybuild this um there's a variety ofdifferent frameworksum that you can use and I think thischanges by the week Yeah there's there'sthere's lots of new tools in the opensource that you can use to build theseagents Um and I mean very simply couldbuild one without a framework if youreally wanted to but I thought using aframework might be really useful UmLangraph is um from the Langchainproject Um so if you're familiar withusing Langchain already Langraph is yourability to go build out a graph-basedagent Um it's not quite a DAG um becauseit has these cycles which are reallyuseful So you can uh have the chatbotdetermine the tool that it needs to useand then it can choose to use likemultiple tools right it may choose onethat may fulfill the request of the userbut it may also need to uh you knowreview the response from the user andthen actually go away and use anothertool So this actually suits that patternreally really well Like it's built forthat Uh and so that's a kind of verysimple representation of what it lookslike right it has nodes it has edges Thenodes are the components effectively Theedges are the decision-making edges ifyou like Uh pretty much so pretty simpleto represent build this and you canbuild this in Python which is what I didUm there are other frameworks aroundthere I think notably there's I thinkautogen from Microsoft uh as an exampleUm there's Kuru AI There's a manner ofthe all manner of them out there But Iuse I use Lraph to do this But what Irealized is that when building this uhand sort of putting my sort of securityhat on I realized that I was storing upa huge amount of risk when building thisprototype Very easy to get startedlocally I could build this on my machineI could hGave I could give it all thetokens that it needed access Um butactually if you were to put this intoproduction there are lots of concerns Umand hopefully I'm sure some of you mayhave already come across these as you'vebeen building agents when you knowyou've been putting into production thesecurity teams have been asking you haveyou thought about authentication haveyou thought about authorization um andthe questions that answer the questionsthat answer sorry to that maybe no um soreally what I wanted to think about hereis like what are the things that we needto consider but also like how might wego about using some of the CNCF tools todo that pretty pretty much so number onewell we've We've got this risk with uhthe token that we use to access the LLMfor a start So normally this is a verylong livedtoken Yeah you are downloading it fromyeah your provider You're putting givingit access to your code and then forevermore you might be using the same tokenuntil such time you rotate it but yeahgenerally speaking it's longived Um sonumber one that's kind of the cause of aconcern particularly if you're puttingit in an environment that's not yoursright okay on your laptop where you haveyou know protection of that key of thatof that key but as soon as you put itinto another environment this is aconcern that you're going to have rightwhich is that number one it's availableto other workload potentially um and itneeds some form of rotation um the otherpoint is also around the tools and thisis like of particular interest here likewe are now giving the LLM the ability torecommend the tool that it wants to useand then go actually callUm pretty much it would inform the agentThe agent will actually do the work ofgoing to that tool perhaps interfacingwith an API Um but we now need some wayof being able to identify the agent Uhand we've got the challenge ofidentifying the user as well that havemade this original request righteffectively it's made being made onbehalf of that user So this is reallychallenging they um and I think a lot ofthe examples certainly a lot of theexamples I've seen online generally umthey uh don't really have concern forthis right they have just pretty openaccess to these APIs you know just likeyou point it at a URL and perhaps youdon't even have any form ofauthentication at all and certainly noform of like very fine grainedauthorization where you're limiting theaccess that the agent has to the datathat it should be able to see to fulfillthat request right it shouldn't haveaccess to the entire orders database itshouldn't have access to the entireentire customer database and so thatthat is a concern So how would how wouldwe do this in the uh CNCF um well wewe've got existing tools and standardsright to do this there's a lot of thingsthat already exist There's a lot of opensource projects that can help us to plugsome of these gaps right we can usechoose to use them to help us to get sofar There's some challenges still whichI'll highlight Um but for the most partwe've actually got a quite you know anumber of these and these are justseveral right there's the the CNCFlandscape looks huge and there's there'sa large number of them that I'm sure wecould also employ but these arecertainly the ones I've been familiarwith that I've decided to use here Um soone notably that I want to talk about umis Fify and this is something I've beenworking on uh and with for some time andjust a quick show of hands anyone has isanyone familiar with Spiffy is anyoneusing Fify wow Okay Good a good numberof you Fantastic Okay so for those ofyou who don't know what Spiffy is it'san acronym of course Uh it's a secureproduction identity framework foreveryone Um it's a relatively matureopen source project It's actually one ofthe graduated CNCF projects Um and it'sused by a number of end users inproduction at scale Um it's a set ofstandards So you might be familiar thatSpiffy is the standards are thestandards the APIs and Spire is thereference implementation So if you wantto go to choose to use this today Spireis what you need to go and use AndactuallHy under the hood of all of thisI'm using Spire as well It's it'spowering some of the identity whichwe'll see shortly Um so differentaspects to Spiffy I won't go into hugeamount detail There's some great talksthat cover this uh at previous CubeConsI'm sure there will be at future couponstoo Um but just at a really high levelwhat does it provide us it provides uslike APIs and standards for being ableto identify software and for a givenidentity So it's got a a set of APIs forissuance It's got a set of APIs forbeing able to validate uh that identityfor being able to federate it betweendifferent environments And thenultimately how you represent thatcryptographically Um and we'll see thereis there are two different ways to dothat currently in the project Ultimatelyit lands in an X59 certificate or or aJWT token Um and we'll see how we cankind of use these uh throughout Just avery high high level how it works foranyone that's not already used this notnot already aware Um it introduces likebasically has a server agent uharchitecture Um and it differs from someof the existing techniques that youmight be familiar with with grantingidentity at the moment in that you don'tgive like longived access keys This issomething that you know traditionallywe've been used to doing right it'sminting tokens and putting tokens intoour code perhaps and obtaining it from avault but instead it enables theworkload to talk to a node local agentThis is like an agent that runs on everyone of the nodes in in a say Kubernetescluster Um and the agent is thenresponsible for basically doing aprocess of what's called attestationIt's a process of like proving uh theworkloads if you like provenence Um andthe agent is able to do that using avariety of means It can do that bylooking at for instance the KubernetesAPI which is a you know if you like asource of truth um you know is myworkload has it been registered with theAPI and now does it exist in the placewhere we expect it to be um and there'sother forms ofation that it can do butthe point being is that that is done outof band the agent runs out of band it'sable to prove something about theworkload um for instance you might evenwant to extend it to whether thecontainer image has been signed there anumber of different attestations thatyou you can apply um those attestationsare then valid validated against theserver So you basically provide thoseproofs Um server matches those up withwhat it expects whether it expects thisworkload to be running um in the placethat it is uh has been has beenprovisioned Um and if all that matchesthen that's the point at which theidentity is then granted um and theidentity is then given to the workloadSo the workload itself is not actuallyhaving to do it anything right It's justreferring to this node local API Um it'sthen obtains the identity and then itcan use that identity thereafter And theimportant point is it's short-lived butit can use that identity then toestablish secure communications withother workloads in the system Um you caneither use the X509 to establish an MTLSconnection and it could also use the JWTas well uh to convey uh its identityright effectively it's a credential thatit can use Um so this is a really areally high level anyway Um if you'reinterested in reading more Solving theBottom Turtle fantastic book that talksthrough a lot of the motivations for theproject the history uh and thearchitecture and how you would go aboutusing uhit would use this in practice and howdid I use it in practice um well onceyou actually have this identity you canthen start using it to make secureconnections And you might want to dothat in your own code And so you can usethe SDK that the Spiffy project providesin order to integrate it into somethinglike Go for instance and Python which iswhat I used in my case Um but you canalso use proxies to do this for youright so you may not want to build thisuh those smarts into your code but youcan have a proxy do it for you And sosomething like Envoy supports Spiffy outof the box right it's got the ability uhto obtain those credentials via Ithe umSDS protocol Um so that's already builtinto obviously envoy built into as welland this is howto is indeed obtains itsidentity for all its workloads that runsin the mesh And so this is now going toenable us to establish these secureconnections Really important point tomake here is that we've also now got theability to encode the service ID intothat you know that cryptographicidentity And so we've got and you canprobably see here this so-called spiffyID This is a way of representing theidentity uh and you can choose to you umrepresent this in whatever way works foryou effectively Um but everything umlives within in this case a trustboundary or a trust domain So uh the uhmyorg.com example there that you can seeis a so-called trust domain andeverything within that then can be givenan identity Um and that's up to you howyou kind decide to do that So peoplecarve these things up by perhaps theirenvironment Perhaps they might want togive it some key value pairs thatrepresents the labels of the workloadHowever you want to represent it But thegood thing about this is that now thatit's encoded in the cryptographicidentity the X509 stiff or the JWT isyou can start using that ID meaningfullySo you can start you know basicallyauthorizing requests based on that ID Umwhich becomes incredibly usefulSo how how would you apply uh Spiffy uhto sort of this agentic um architectureright how would you take Spiffy and anduse it in the example that I that I havewell the workload when you think aboutit the agent itself is a workload rightit is effectively something that isgoing to run potentially in a Kubernetescluster Doesn't have to be but it isjust a workload Um and so we canactually start just plugging in Spityhere and using it in conjunction withSpire right we can give each of thedifferent components in our system aspiffy identity uh including the agentand including the tools And so now whatwe've got um are you know uniqueidentities they they they're given themas they as they're kind of provisionedin the system Um we can scale them ofcourse um and we've got the ID encodedin that credential Um so first foot ofcall is right let's just credentialeverything in our system right let'sgive it some kind of ID um let's makesure that this ID is shortlived we don'twant to have to the job of having torotate this ourselves and we certainlydon't want tokens that are longived thatcan then be discovered and potentiallypotentially exfiltrated um from thesystem so this is already increasing ourposture quite significantly right fromthe first vibe coded example where Ibuilt this and it didn't really have anythink in the way of any uh protectionsUm so this is like applying thosecredentials The other thing that Spiffyprovides well certainly Spire providesum is it can also operate as an OODCprovider So you can actually uh havevalidation of the tokens that itgenerates and then B gives to theworkloads And so I also use in thisparticular example the JWTS vid umtokens um and those are given to the uhthe tools or the agent um and the agentcan then use that to make a secureconnection to something like Googlecloud So you know the way you might dothis out of the box is using a long liveservice account token You can avoid thatYou can replace it with the JWTS vid andthen you can have Google or anyhyperscaler for that matter but you canhave it um configured to create a trustrelationship with your spire OODCprovider You can host that the jocksendpoint You can then have the trustrelationship established and that willmean that Google uh or anyone else willbe able to trust the tokens and be ableto validate those tokens importantly umand give like access in this case tovertex So I was able to create anintegration here where all of the longlive tokens pretty much were removedfrom this architecture So everything wasshorter uh lived So let's see this kindof in action I don't want going to dothis live because you never know quitehow the Wi-Fi is going to work out Um sothis is just a just a pretty canonicalexample right of of creating a chatbotAnd if anyone's seen this beforeJ they'llprobably see be quite familiar with itBut yeah this is me having theconversation But incidentally um you canhave real fun with these chat bots Um II probably should in maybe I should haveincluded some of the examples of whereyou can really push the the LLM andstart to ask it a little bit more thanit's willing to give up Um but anyway umlet's kind of stick to the script um andfollow um follow it This is metalking to the uh to thechatbot and you can see here and I'llI'll probably just pause it at thispoint and and hopefully hover away soyou can actually see it But yeah whatwe've now is we've made the request tothe chatbot The chatbot has reasonedabout the question that I'm asking it Umand it's now referred to the tools thatit has access to Right so in this casepretty simple It's got two APIs that itcan use it's determined that in order toanswer my question that it needs to goand use one of these tools which is theget orders API And so it's respondedwith a request to use that um and theagent is now able to go off and actuallymake that request um to to the endpointAnd so I can continue theconversation Yes like it's thatparticular order that I'm interested inUm and then it will go away and actuallyfetch more details about the specificorder Um and yeah we'll wait for itYep And there it is And so at this pointyeah certainly the chatbot could becomea lot smarter right um it could go andactually obtain like the informationfrom another delivery agent The pointbeing here is it could like hand offpotentially from one agent to the otherUm so yeah certainly more smarts couldbe added to this that's for sure Um sothat's the example What I wanted to talkabout though is not just like how we canthink about the workloads themselvesright we've managed now to give identityto each of the different components inthe system But really like how do weencompass the user identity so I wasmaking a request to this agent Um theagent should really be using my or havesome uh notion of my uh my identity whenit makes the request to downstreamsystems It what it shouldn't be doing isjust having completely open access toour internal systems right i meanparticularly if the prompts could beengineered to perhaps do things thatperhaps it shouldn't be doing Um andactually there are recent examples inlike the last few days where that's beenpossible with the likes of things likeMCP where you can actually ask it to dothings which are similar to the briefthat it was set um but pretty nefariousUm and so yeah this is what you want toavoid And if we follow all theprinciples of zero trust what we shouldbe doing is taking not only theworkload's identity but also the usersidentity as well and using that uhcombination to make a really fine graindecision uh of the back end Uh so what Idid here is I integrated Dex and youmight have noticed this was this alreadydid have some login So it did alreadyknow about me as a customer Um and itwas able to like pass that through Sonothing sort of unusual here right i wasjust using standard oorthth JWT tokensum using a YDC in case of DEX and I wasable then to pass that through uh to theagent The problem is though what youwon't really want to do or shouldn't bedoing is then propagating that JWT rightthat has specific intent um and it hasyou know of course all the access thatis attached to that user So what wedon't want to be doing is just basicallytaking that JWT and then passing itthrough the system Um what we reallywant to be doing ideally is we want tobe attenuating that access We reallywant to be attenuating to we kind ofmake sure that ultimately when the callis made to the order database that it'sspecific to the user right the agent isnot possible It's not possible for theagent to start seeing things that itshouldn't be seeing about some othercustomer Um and in some of my earlyexamples I did find that it was prettystraightforward to do that right youcould instruct the agent to go ahead andtell me all like orders in the databaseregardless of the customer And of courseit was very willing to do it because ithad the access to toK do it Um so how howdo you do this right well one approachis that you could as I said you couldliterally take the JT and you couldpropagate it in a header and so you'vealready got mutual TLS you could againuse the JT and just pass it through Theproblem you have of course is one oflike token replay Um ideally with thesethings you shouldn't be taking somethinglike that and passing it and propagatingit through the system um particularly innow as you're giving something like theagent access to that token right that itcould could use in well hopefully goodways but potentially more uh nefariousways Um so there is a there are someinteresting like uh projects comingabout that that might help us to addressthis Um and this is not in the CNCF thisis in the IETF Um there is a no workinggroup Um and there's a number ofinteresting projects uh and standardsthat are in development And one thatsort of took uh I guess took my interestfor solving this problem potentially issomething called OOTH transaction tokensAnd so what this enables you to do is itenables you to take the user context andthe request context and effectivelyencapsulate that in another signed tokenAnd so this is really really usefulbecause it enables you to at the pointof entry into the system for instancesome kind of gateway or proxy as youruser basically passes that trustboundary of being external internal umyou can grant this like new token typeyou can mint this new token type whichum has all the context that's needed forthe downstream systems um to be able uhto make a decision So there's an earlyimplementationuh which is called token edes Um you cango and find this It's pretty early It'san implementation Um and the actual umtransaction token or the track um ortrap sorry um is still in draft So it'sprobably worth noting this is not astandard yet is still in draft Um and soif you might be familiar with beyondprod perhaps you might be aware thatthis looks a little bit like perhapsinspired uh by the enduser context tokenAnd so the way that Google operates thisis broadly similar right it kind ofenters the the Google front end TheGoogle front end is responsible fordoing user authentication and it thengenerates that user end user contexttoken much like this transaction tokenand then that gets propagated throughthe system um and that can go throughthe entire call stack um until itreaches ultimately the microser thatwill be delivering you the data um andfulfilling your request pretty much Soyeah broad maybe perhaps broadlyinspired uh by that And so just at avery high level I heard the claps nextdoor so it might mean I'm running out oftime Um the uh how does this integratewell you can effectively you use it atthe point of entry into the system Um soas the request is made um into yoursystem um you can use a gateway I usejust a go you know reverse proxy here Umand yeah you could use envoy You couldyou know you could use all manner ofdifferent proxies to do this The pointbeing is it then integrates with thistransaction token service and the inputthat transaction token service is theaccess token from the user The one thateffectively is um passed from the clientUm you can use that together with theidentity of the gateway itself becauseyou want to obviously apply some kind ofvalidation uh to this request and whatis then given back or minted is a veryshort-lived uh JWT sign token um that isspecific to that user and importantlythe the nature of the request that'sbeen made So you're basically saying atthis point in time this user veryspecifically wanted to fulfill thispurpose Um and it's that token that ifyou like attenuated token that you cannow start passing through your system Nolonger do you pass like the originalaccess token pretty much So this ismoving us towards a model of certainlymaturing our zero trust um applying ouryou know more zero trust principles tothe system Um and now yeah importantlyto point is that your agent your toolhas a can make a very fine graindecision right it has access to theoriginal workload that made the requestand it also has access to the user andthe context Um so it can evaluate all ofthis and you can use like policy todetermine whether it wants to you knowrespond to this request or not Um justvery very quick look Um I'm sure I canlet you go read a bit more detail onlineabout this I don't want to run out oftime as I'm sure I have already Um butthis is like effectively what one wouldlook like Um it's a draft so clearlythings are changing all the time But thepoint being is here though you have thetransaction identifier you have thecontext of the request you have thetransaction uh context and and thepurpose And so this enables you to usethis and you know validate it much likeyou would any other JWT So really what Iwanted to show is that by employing someof these existing tools and techniquesand also some of the you know sort of uhstandards on the horizon we can move tofrom a model of like one of justcompletely implicit trust um which iswhere I started I guess with my vibecoded prototype um and then Iincrementally added these uhcapabilities So I enabled service toservice trust using spiffy I couldcredential everything I could give it umshort-lived cryptographic uh tokens umor certificates and tokens um and Icould then complement it with user trustright i could start bringing into thisthe user and the context of the user andtherequest And I I'm running low on time soI'm going to flick through these nextslides Um happy to share these one wewanted to draw a little bit of attentionto um and would recommend you go andread up a little bit more about this ifyou're interested but there's a lot ofinteresting work that's going on in theon the IETF specifically a group calledwhimsy they're trying to standardize ona lot of these things um there's a lotof new token types which will be reallyinteresting um to solve some of theseproblems particularly when you'reactually passing across trust boundariesmy simple my example today wasrelatively simple like everything wascan everything was within a single trustdomain the reality is that of course youwould be spanning workloads acrossdifferent trust boundaries Um and so youwould need to be propagating thesetokens not just within the context ofthat domain but potentially acrossdomains and so yeah a variety of workgoing on in whimsy uh and also theoorthth groups um as well some prettyinteresting challenges for instancearound you know what happens when theagent needs uh more privilege right whathappens when it needs to actually goaccess a system and it needs the userspermission to do that right there that'ssuper interesting problem uh to go solveum so just in summary and apologies forrunning over um agents are really justeffectively smart workloads you knowthat's that's really the best way tothink about it we've got existing waysthat we can credential we've gotexisting ways that we can apply applyauthorization to this Um lots of toolsand techniques for doing this in theCNCF Um some interesting challenges onthe horizon Uh if this interests youplease get involved These groups are allopen to participation If you're anenduser perhaps you're from a vendorperhaps you just want to get involvedregardless like this is something thatinterests you please do We're going toneed more blueprints We're going to needlots more best practice around thisbecause AI is certainly coming at uspretty fast And so um these challengesare going to need to be something thatwe're going to solve um really soon Sothat's me Thanks so much Thanks forcoming today If you found it interestingor indeed if you didn't find itinteresting please um give some feedbackIt's really really useful to know Umthat's the QR code I'm I believe it'sthe one I was provided Um so hopefullyit works Um yeah but if you want to giveme some feedback please do And I'm goingto stay behind I'm a little I overran abit on time so I'll probably just takequestions here So if you want to comeask questions please do I'm reallyinterested in any kind of real world usecases that you've got and perhaps youknow if some of these things aresomething of interest to you please yeahlet me know2025-04-15 21:59:30.027531 wTiw�� #��IA7KCBigZi_Rkwelcome everybody uh hope you are havinga good time here at CubeCon and um todaywe are going to share how we built a AIHPC cluster on the top of Kubernetes inpublic clouduh so quick introduction my name isChandraan Abdut I'm a productionengineer at Meta and my co-presenterKalen Saledi uh he's a software engineeruh he couldn't be here in person todayuh but for his part of the presentationwe have a recording that we are going toplay uh we both work on uh differentteams R��N�#��SA26p_qvuCy-si'm uh John Shell uh at Control Plane Uhjust started on Monday so I know mostlyabout what the company does through theinterview process Um but we have a boothhere too Uh I'm also a co-chair for theCNCF security tag and a maintainer onwitness and archavista um which are subprojects underneath into um and I'vecontributed to uh things like the salsaproject and other thingsYeah my name is Ka I'm a open sourcesoftware engineer as well I'm amaintainer of tough python tough libraryand uh also uh archive vista an inputproject as well and we are here to talkaboutidentity basedtrust All right So just to prepare youall um we're going to talk about theento project the tough projectsimplementations underneath each of thoseanM��#��=ACvGbwn5ZrFgokay good morning everyone Welcome Ithink we're gonna make a start Thanks somuch for coming Uh my name is uh MattBates I'm a founder at a company calledKofi a UK startup We focus on workloadidentity It's really close to our heartsand all of the open standards and toolsin the CNCF are ones that we use Um sopleasure to be back in London Um I don'tknow if anyone can remember back to thelast time this was in London uh 2016Just a quick show of hands Is anyonehere at the Yeah I thought Ben was YeahUm but yeah the room I think the keynoteroom was uh smaller than this Uh muchsmaller than this in fact So great tosee how far we've come So today I'mgoing to be talking about something I'msure you've probably not heard muchabout at the moment which isAI Um no specifically we're going to betalking about um IM uh for AI agents andsome of the interesting challenges umthat it that it presents So hopefullythis will give you some food for thoughtyou can see some existing toolstechniques that you can use today in theCNCF Um and I'm also going to sort ofshed some light I guess on some of thosechallenges and some of the ongoingefforts within the CNCF um and beyond umfor solving some of these challengesIt's hopefully um super useful for youtoday and to take away with you Um so weI guess we we've all heard about AI Weall use it Um quick show of hands Infact is everybody using AI here iseverybody using generative yeah Okay Iprobably don't need to ask Um so we'reall very used to using like generativeAI generally speaking Um yeah a lot ofus now using it for coding I do probablymore vibe coding than I probably shouldbe doing Um but we're all finding itincredibly useful in our daily lives AndI think by the day by you know by theweek things are changing Really what Iwanted to focus on specifically are ouragents Um and yes this is somegenerative AI from OpenAI Um you mightnotice a few of those throughout thetalk Um but an a AI agent is one thathas is one that's more autonomous rightwe give it a task and rather than justrelying on sort of some of thegenerative AI we've been used to wherewe engage in a conversation it respondsto us um using an LLM uh gives us somekind of text or some kind of other mediaUm the agent can actually have somedecision- making and some reasoningcapability So it can actually understandthe task that we're asking it um and cango away much like a human and can reasonabout all the things that it has itsmemory and setsDNd the sig store project and itsdifferent components Um this is a lot uhI apologize in advance uh hopefully yourbrains don't melt um preparing this oursdid slightly So um the other thing thatthis does not include is uh introductionto uh the cryptography or X509certificates or the underlyingtechnologies that have made all of thesedifferent projects possible Um so wewill be well I will be available for aslong as needed afterwards uh forfollow-on conversations We also have umeach of these projects in the projectpavilion so feel free to stop by thereas well too um in toto and the updateframework have both had people writeentire PhDs on them So we will try to dothem just as we can in in the amount oftime we haveOkay Um let's start about signing and uhhow we used to do in old days actuallywe still doing that So how it works uhin general basically you have a privateand public key uh that uh it's from someperson or some organization and based onthat you use your private key to signsome artifact and uh of course you keepyour uh private key saved and itgenerates for you an um an signatureuh and uh as your private key is savedyou share share your public keybasically and other people can just getthe private key uh the public key and asignature and validate the um theartifact signature So um but what theproblem and how you share uh new trustedkeys uh if you lost your private key orthis private key got leaked So thesolution is quite uh easy You generate anew key you resign the package with thenew key and you publish your uh publickey new public key and you need to tellin a trust way to your users to use thenew public key right And after resigneverything you need to I'll take uh inmind the the amount of artifacts thatyou need to resign everything anduh wait for the the the replication inCDN morse and caches all there Okay Umhow we do uh nowadays based in OIDC uhidentity uh signature basically you wantto sign your package uh on artifact youuse you can use cosign for example anduh six star cosign you will generate uma verification of your identity and withthis verification generate a temporaryuh private uh key that will sign theartifact and it will generate acertificate and the private key isdestroyed You don't need this keyanymore It's what we call ephemeral keysSo with that uh users and client canjust use the certificate uh to validateyour signature in thepackage But um this is was the uh toplevel uh high level how it works Butbehind of this I'll let the John tellabout that Yeah So you can see um to theleft side here developers uh they have athing they would like to sign uhrequesting a certificate which getsauthenticated with uh OIDC Um thatcertificate request goes to Fulsio whichis one of the projects we mentioned umthat provides the signing certificateand also publishes that certificate umin the recor transparency log Then whenthe developer publishes the artifact umthat is signed you're able to downloadit as an end user on the far other sidehere Um and then the end user will gothrough the verification process ofchecking that signature in thetransparency log and also checking thatthe um signing party of the certificateis actually a part of the trust route Sothere's the update framework embeddedunderneath of um Recor and Fio And soonce this happens that signature is inRecor forever It's trusted forever Um itwill always verifyUm and so what are we going to do aboutthat We're going to walk through kind ofan example Um project we have here justin time for CubeCon We were not in arush at all We released V1.0 auto of oursoftware So we uh we tagged it we pushedit it went through our GitHub actionspipeline Uh we got asuccess and there's a Docker image nowSo people can go and start pulling thisDocker image Everything will bewonderful Now one thing we wanted to doto do things the right way in this uhthis software project was to createsalsa build attestations And so um weuse uh witness to wrap the build processand create a salsa build providence umwhich you can see a link to here in thesummary from the GitHub actions NowSalsa if we look at thOis overall set ofthings that can happen throughout yoursupply chain is really focused on thesethings in the middle here And that'sgood but there's you can see there's alot of other triangles not on thisscreen And in addition to that there's awhole bunch of things that are not evenon that threat model right Things thatshould be happeninghopefully that might not be or you haveno way to trust or verify that thosethings have happened So things like yourvulnerability scans static analysis codereviews testing Um I can think of atleast one situation in which uh vendorswriting uh things like kernel modules ordevice drivers on Windows should do moretesting before releasing software Itmight be good to ask them for an accessstation proving they did testing um orlicense compliance Um a lot of thesethings in our pipeline here This is kindof the the end to end for the therelease we have Um you can see we'retaking a lot of these different stepsFor every one of these steps we'recreating an accessstation All right So since we were in alittle bit of a rush for CubeCon we hada bug in our software We need to fix itSo we've got an emergency release hereWe just smashed merge on that PR We'regonna um tag everything in uh therepository push that to the GitHub reposo that we can kick off our releaseprocess Um our customers are absolutelyclamoring for everything Uh we need toget this out the door as fast aspossible So we're going to speed throughWe've got our super mega GitHub actionsbuilders running underneath here toaccelerate the process Um and you cansee we're going to go through and onceagain we're creating att test stationsfor every step along the the line hereEvery one of these attestations aresigned with um a certificate from Falsioso we can verify them and we're going togo through our release process You cansee everything here is green So we'regood right Um well we're going to checkwe have a policy um from the witnessproject And this witness uh projectmakes sure that um the policy wants tomakesure we'll go right back to that thereUmthat allright All right The is going to makesure that our expectations about eachone of these steps matches reality Andso in this case we can see an errormessage about our linting step We have aDocker file llinter in our pipelinelooking for things that we don't want tohave happen in our uh in our Docker fileAnd so somehow in between the previousrelease and this one something changedBut the thing is the attestations wereall valid still The signatures allpassed Everything in the pipeline wasgreen All right So here was the codethat was removed from the PR Um that'sright We needed to take the panic out ofthere for some reason um in order tohave working code But there were someother things in here I don't know I AIgenerated probably uh someone decidedthey needed the container to be root andthey just so happened to know the rightrule to ignore uh from the staticanalysis tool so that that wouldn't failin thepipeline and the policy was able todetect that So um but what happens ifthere's some sort of compromise to thesignature process that you can't detectright Um so if you look at things likeum this whole giant set of componentsthat back that workflow we looked at forthe identity based signing before um ifthere's a compromise to your OIDCprovider um they could someone could beable to take that identity and requestany number of certificates from FalsioUm the way that you would try andmitigate something like that would be byexamining the certificate transparencylog that's behind Falsio um if there areuh a compromise that that way as wellyou can have a recor monitor observingthe entries in the log looking to see ifthings are unexpectedly published underyour name Um if recor itself it's anappendon transparency log Um if someonetries to modify that in some way theum the hashing and the um indexing ofthe entire log will be disrupted it'sexternally um something that could beexternally validated Anyone can look atthe log and uh do that sort ofcomputation So there's a lot ofdifferent ways to handle this but thething is allP it handles is detectingthat there was an issue Um recor as Isaid it's an appendon log You can'tremove any of those entries Uh so whatwe really need here and what we kind ofsaw with the attestations is you needthat extra layer And so by actuallylooking and signing something that hascontents in it that can be expectinspected you can make better decisionsabouttrust or the other thing that's notgoing to show up in any attestation orsignature process or anything else Youjust you've released uh badsoftware All right so let's go ahead andtry and fix thisissue the right way Uh so we're going togo ahead and look at the PR this timebefore just smashing merge We're goingto see that we're removing the ignore uhand also removing the root from thecontainer image Now we're going to goahead and go through the same process tokick off a new releaseSo we'll tag this and push this up TheGitHub actions workflow will kick offagain And as this runs we are creatingthese attestations All of theseattestations are also being stored inarchavista And so that's um when we'reable to do the policyevaluation it indexes and understandsthe relationship between all theseattestations So it's capturing theinputs and the outputs and understandingwhich sets of attestations to use whenwe do this verification process So herewe're going to see this kickoff onceagain Um and hopefully it's going tomake sure that that linting step passesand in this case it does um which isgood So now we've got everything meetingour policyexpectations Now we'll see how weactually distribute this securely YeahUm what we saw here we did uh actuallythree helices and all them are availablein our uh registry but how we avoidum the users to deploy those uh versionsthat are uh with uh the malicious uhpart So what we want to do now is securethe the last part here the distributionof those uh images And the way that wedo uh here is using uh the updateframework uh tough Um so what we do hereum let's go for the next demo part YeahSo we have everything verifiedSignatures are okay as well So what wecan do now is using tough to approve themetadata that has all the artifacts thatdo you want distribute and we see thatone is here and we want to uh remove itSo what we're going to do first um isremovethis to make sure that our uh uhdeployments will not be able to to tofetch this uh version Once it's removedit still required someone to sign Herewe are also using those uh identitysignature Here you see John using uh thesig store to sign the metadata aswell Then uh what you going to see nowthat the users uh will be able to uhdownload uh safely uh the version uh 1.2two verifying the tough metadata verifythat John approved it by the signatureand tough helps that you can rotatethose signatures as wellSo well uh now what we want to do uh isremove 1.0 as well We can avoid the userto use the old version uh that iscompromised with the bug Um then we signit the newmetadata We can see that we have nowjust 1.2 And now if you try to downloaduh the uh buggy version you're not ableto do that So um I I I cannot tell toomuch about tough in that talk but whowant to uh listen more I will be runningto the my next talk right after this oneabout that will be about tough but I canexplain the important part here that wasthe client that we use how tough clientworks uh in a big picture you have thetoughclient and uh you we store yourartifacts in anywhere in our case Herewe are using a GitHub uh containerregistry but could be uh the docker hubor harbor or even your JROG in yourorganization and you have the toughmetadata that's usually stored outsideof uh this uh the same storage uh butyou can have leave it together there'sno uh problem about security on thataspect So all the tough client thecontains the initial root metadata thatcontains what are the keys that can signuh my metadata and uh what are thesignatures already there So it isusually embedded in the the tough clientYou distribute that The advantage youcan never need to go to your client andsay which keys they should trust or nolonger trust The metadata can validatethaQt How it's done here An examplefollowing our uh demo Let's say that youwant to download the version 1.2 that uhwe released What happens is the toughwill check the tough client will checkfor the new versions of the toughmetadata and uh this check uh um is doneby uh the client and the root uh uhmetadata can validate uh the next one Itcan see if the the signatures are goodor not Even the rotation of keys you seethat it allow it can verify if therotation of keys was done in a safetyway or not and what are the currentsignatures after there's no newmetadatas uh uh to for the root thatcontain the keys um it will download uhother metadata that I will not go indetails here but basically it's the timestamp that will guarantee that your youruh artifact storage where is your oryour registry it's a uhupdated and also the consistence of youruh your repository And the next stepactually the last step is to verifyingyourum container if it's the container thatyou are expecting Basically it will uhdownload the artifact uh and after thevalidations of signatures it downloadthe artifact and uh can check forexample the the hash and the size ofyour image what is signed in themetadata So it will if it's it passesyou get that okay you can go and deploythis image saved and um what happenswhen uh you you are not assigning likewe deleted for example uh the version1.1 right uh but in the registry isstill there it's still available orimagine that you have CDN caches or uhdistributed uh this image is still therehow tough will secure you uh the thingis that in the metadata this version isnot signed so it will not allow you toum download it so it's not a safe oneYeah Yeah So um kind of going back to tothe beginning of everything um and theseidentity based signatures the idea ofjust say signing your container image isnot enough Right We looked at differentways in which we can add additionallayers on top of that to understandwhether or not you really should trustthat signature or really that artifactbecause the only thing this signature isdoing is saying this is a thing that Isigned It's not saying much else besidesthat Right we get a little bit closerwith things like salsa build providencewhere you can say this is a thing thatI've signed that came from this locationSo if you have a package that's in uhthe npm registry or pippi or somethinglike that that's disconnected from theGitHub repository How did it get fromthe GitHub repository to there It's alsobuilt providence can prove that Butthings like whether or not the code issafe secure um that's not necessarilyanswered We saw we added a bunch ofsteps in to our pipeline to address someof those concerns by doing things likelinting And in order to have that chainthat set of attestations and verify themall together we have to have a policyAnd so that was the witness policy ableto detect someone modified the lintingstep And we knew that that shouldn'tverify the policy That software wasreleased in this case but it was neveractually distributed through the securerelease process And then once we hadeverything secured being able to thenrelease it uh the right way with theupdate framework Uh that's a really goodway for organizations to decide on theirown what software is safe to consume Soum yeah that's it Yeah Yeah So I knowthat was a lot We have uh time forquestions if uh any of that wascomprehensible and there's anyparticular uh part of it folks want todive into a bit more[Applause]Hi Can this be run fully offline orwhich part needs an online connectivityUh yeah you can run all of this umoffline Um the so the there's a lot todeploy offline I I suppose Um so fromwhat we've uh showed storing the accessstations in archavista you can run thatlocally um and store that there The sigstore instance to run offline is alittle bit harder One of the things thatSIG store is moving towards is the uhbundle format that can be stored offlineand it has things like an inclusionproof rather than having to look up theentry in the transparency log as well asa um assigned timestamp And so thatallows you to ensure the certificate wasvalid at the point in time the artifactwas signed That way you can do thatvalidation um without having to call tothe transparency log And then tough isoften deployed in in aircraftenvironments And in fact um the updateframework has a a variant of it uhcalled Optane or Optane which is whatpowers a very large majority of all ofthe automobile updates over the air Soyou can imagine a car parked in a garagefor months at a time uh not turned onstill needs to figure out how to getupdates after it comes back online YeahNo Yeah And you you can also bring themetadata to uh to validate offline asone If you have the initial metadata uhin in the place already you can bringthe uh the whole metadata offline andyou have ways to deploy to remove likethe time stamp and snapshot or notremove them fully but trust them in a ina more uh offline way like not withoffen updates Yeah Okay Thank youHi thank you for the talk Uh can yougive a bit more detail on the firstexample when you run the witness verifyWhat exactly happened there Why did itfailYeah Um so let's see here We'll do alittle live We'll see if the internet isworking Uh so all of this is in a uhGitHub project uh that's public You cansee the runs there Uh even though we hadrecorded uh video um there is a a policyfile in here Um and inside of this itlays out all the different steps that weexpect to have happen And one of thesteps is this linting step And inside ofthe linting step um there's different uhgroupings of the attestations And thisis uh called the command run attestationAnd so it captures the um the parametersthat are used the inputs um the outputsto standard out standard error Itcaptures all of that and then you canwrite a reg policy against that So thisRIGO policy here um I'm not sure ifthere's a basically what it is lookingfor um in that rego policy is that thecommand executed matches theexpectations which doesn't allow for anyd- ignore flagsSo you did run the linting but becausethey were ignore flags this failed theverification later right Correct YeahThanks Yeah Yeah And so if you wouldhave the um the root properties in thatdocker file the llinter would fail andif the llinter would fail then thepipeline would fail and the developerswould see something is wrong with thisIt never would have built the containerimageBut in that case why not just fail theCI Uh so this was maybe a way to avoidgetting the CI to fail So since the thecode review wasn't sufficient to knowthat this would be essentially aworkaround and so you're able todecouple the policies you create fromthe the pipelines and you may not evenhave control over that if you want towrite policies just as a consumer ofthis You may have different expectationsor standards about what you want to havehappen So it's a good way or if you wantto have these sort of policies as anorganization for software developedinternally you don't want to stop yourdeveloper pipelines right You don't wantto be a roadblock for them being able todo uh their work dayto-day And so havingthat policy happen later on uh is a goodidea sometimes Yeah And it was a wayalso to demonstrate that if you didn'twrote a good uh policy in your pipelineit could anyway So you need to have away to secure you after that Anywaygot like two more minutes if anybody hasany other questionsHi Um so the the last point that you'reverifying and the thing that kind ofsays it all yes or no uh I trust it ornot is that last registry signing oncethe image or the the binary is in thethe registry that'swhen I I think uh use six star identityessentially So um at the very end whenwe were doing the secure download thatwas with a tough client Um so that theinformation about whether or not wecould download something was stored inthe tough metadata and so the client isable to query against that andunderstand if it should pull the imagefrom the the registry Yeah And thesignatures there actually are to sec toprevent that someone could modify thismetadata to uh allow you to download itBut without the signatures from SIGstore for example it would not uh permitit2025-04-15 21:59:30.862679Sbut our team collaborate togetherto uh build and uh operate uh AHPC thecluster for research in publiccloud so today we'll go overuh like overview of a IML training howit looks like what are the uh ininfrastructure requirements for runninguh AIPC workloads uh running slum oncommunities and uh node node life cyclemanagement and uh some of the storageaspects of the clusterso with that I'm going to uh play arecording uh so first part of thesession is called by Kalyan so I'm justgoing to uh start the uh videorecording thank you Chandan uh helloeveryone my name is Kalyan i'm asoftware engineer in researchinfrastructure team at MA sorry Icouldn't be there in person at CubeConthis year but I'm uh I partnered withChandan to help you understand what MLtraining and research space looks likeand what does it take to build aperformant and uh uh highly productiveresearch cluster in the cloud usingKubernetes so let's move on and try tounderstand what ML training is at a highlevelthis is a oversimplified picture but youcan imagine a training loop typically inPyTorchum deployed on a bunch of uh GPUs righteach of these GPUs is operating on dataum training data samples and iterativelycomputing and refining the weights thisgoes on for a while eventually resultingin a set of weights that that would becalled a model right the model is goingto be released for production or foropen sourcing right uh to give you um uhwhat are the examples of models likellama mistral GPT photo you all might befamiliar with right um before weunderstand what does it take to build uhML research cluster in the cloud it'svery important to understand what arehow does it differ from traditionalworkloads right uh I'll focus on threedimensions here number one runtime orlifetime of the job as opposed to webservices where the requests take a fewmilliseconds or hundreds of millisecondsright here the jobs run for any anywherefrom hours to weeks to months in somecases that means you the time span thatyou're looking at is really differentthe second dimension is job size andsemantics uh the semantics are reallyimportant here um you can you have thisall or nothing semantics say you have aum 16 GPU job if any one of the GPUs isunhealthy or is suffering from some sortof infra failure the entire job isimpacted even if the other 15 arehealthy you'll have you're effectivelyrestarting the job right this ch this isquite different from the uh traditionalweb services where each node isindependent of the other and if a if aparticular node is unhealthy you canretry the request on a different nodeand still satisfy therequest right as you can see given thelifetime of the job and the job sizesemantics reliability becomes that muchmore important in the context oftraining uh workload infrastructureright um yeah so let's we we have anvery high level understanding of MLtraining right with that let's figureout what is an ML research life cyclewhich is actually quite early uh in inthe if you in the entire ML life cycleif you see these guys are um theseresearchers are creating are innovatingin the model architecture or the modelcapabilities right let's go through atypical research life cycleresearchers are um reading papersunderstanding the latest modelinnovations they're reproducing theresults from open source or they arestarting with a state-of-the-art modelthat they have and trying to identifyfuture directions they eventually comeup with some proposals right um based onthe directions identified the proposalsresult in a in in they move on toexperimentation phase which may involveablations sweeps hyperparameter sweepsand u new model architecture innovationswhere you need to modify the code andand prove that the architecture workseventually when everything settles downyou go into your scale run mode meaningyou're now ready to launch your workloadon on on hundreds or thousands of GPUsbased on your based on the modelsrequire compute requirements right oncethis process finishes the final resultis going to be like published model orproductionized model or released to opensource community the code and Tthe modelright that's what we're looking at um ifyou followed the space models have beenrapidly evolving getting bigger and moreefficient of late and uh and if you lookif you look at this curve the modelcomplexity in terms of u G-flopsexpressed is is is a really uh dramaticillustration of the complexity growthright we we can kind of safely assumethat the curve is not going tomeaningfully shift downwards and beprepared to handle the scaling needs ofML training in thefuture okay um in before we figure outhow before we figure out how to build aresearch cluster we need to understandwhat why researcher experience isparamount to our design right um as wetalked about it researcher experience uhin in the researcher experience slide umthese uh there is a lot of iterationinvolved right effectively iterationspeed is what is unlocking innovation ifyou're able to experiment more and proveor disprove your hypothesis you're ableto move farther along in in in thedirection that you want to um that youhave identified right um researcherssimply stated researcher taking an ideato experiment should be uh as fast asand as simple as possible the more thebarriers the slower the innovation rightthe second part of researcher experienceI would like to call out is um uh sincethis is research and experimentationthere is bound to be failuresresearcher is going to face a ton offailures and the key aspect here is canwe separate the research failure versusthe infraure right and that's one aspectthe second part is uh can we give firstclass monitoring and node metric noderesource metrics so that the researcheris able to track how is my model doingand how am I using the compute and theresources underneathRight um the third aspect of researcherexperience uh is training reliabilitythe more the interruptions the the morethe researcher is wasting time debuggingthose failures and figuring out is it acode bug or is it an infra bug so we weneed to we have a goal to keepinterruptions to a minimum right andsupport longunning jobs at massive scaleokay we talked about a uh we talkedabout reliability already but let'sreally understand why distributedtraining reliability is is such achallenge in this picture you're seeingfour servers with two GPUs each runninga workload running a ML um trainingworkload right slum is dispatching thetasks to the workers uh to the serversright and the workers are pullingtraining data from the data storeright there are different trainingaspects but uh one thing one key thingimport one key thing to understand hereis that all workers need to be in aconsistent and consistent in terms ofconfiguration if you have divergesdivergences among these nodes you don'thave control over the outcomes right andthe nodes need to be healthy while inorder for the job to progress in orderfor the training to make forwardprogress um so this is again goes backto the reliability and why reliabilityis so important the outer circle thatyou see is the workers communicatingamong themselves which means you need tohave reliable compute storage and thenetwork if any of these components areweak or unreliable your your job isgetting less productive time trainingtime on the cluster and that's reallykey hereuh one um this might be you knowsomewhat obvious to most people but theother nonobvious aspect is the sort oflockstep training approach that iscurrently in vogue is is is going toresult in load spikes on centralcomponents and services in the componentlet me explain what I mean say all theseuh let's take a multi,000 GPU job rightwhen we when the job decides to take acheckpoint there's going to be massiveload on the storage servers when thecheckpoint is written right and thenright after the checkpoint is writtenyou will have near zero activity on onthe file system and then back again whenit comes to the next checkpoint thereare multiple examples of this load spikenature so it's very important to designthe cluster in a way that can sustainthat can handle these loadspikes um one other aspect ofreliability is which we also alluded toin the researcher experience part is theUattribution of failures is critical andis also a challenge right imagine amulti-week job that's running onthousand GPUs and if your same infracomponent is failing the job but we areunable to root cause it is going to bereally expensive right you need to beable to have good observability in a toenable you to pinpoint what component isproducing job symptoms and how you canuh mitigate or uh improve close the gapsAll right so we we have a sort of apicture we knew we we we talked aboutwhat ML training is what researcherexperience what researcher experienceshould be now let's talk about whatwhere did we start before going to thecloud right we this is the picture thatyou're seeing is is a realcluster that was customuilt and launchedin 202122and we had thousands of uh DGX serverswith GPUs uh with CentOS running on themand uh orchestrated by slum slurm is thescheduleuler that we used it had thecluster had directly managed backendnetwork fabric multiple pabytes of flashstorage and purpose-built storage toaccelerate training a purpose-builtstorage service to accelerate trainingright now this is awesome this is directcontrol that we had but can we uh takethis and the challenge is can we takethis and replicate it on the cloud maybeeven do better than physicalcluster um to to understand what we aretalking about this is what a typical uhslum research cluster looks like from aresearcher's point of view they don'tsee the rest of the complexitythe researcher is logging on to a abunch of nodes one of these nodes andthey have access to the control planewhich lets them submit jobs to thecluster and those jobs are running onthe worker nodes rightand let's take our requirements and anduh let's project it on on onto the cloudright what does an ideal ML researchcluster in the cloud let's come up withthat problem statement we want the sameexperience of researchers able to log inget access to control plane submit theirjob batch jobs to the queue and slumbeing uh uh the dispatching agent whichdistributes the workload onto thecompute nodes of course you have thestorage which helps you do the data uhload the data training samples and uhsupport checkpointing right this is whatwe want but on the characteristics sidewe want near bare metal performance wedon't want to compromise the researcherexperience in and we don't want to slowdown the research velocity right we wantto be able to support longunning jobswith heavy data exchange among theworkers right and demanding data loadingrequirements with terabytes of data ifnot pabytes of data being pulled by theGPUs right all this has to beaccomplished while keeping things stableand performant one of fundamentalrequirements that we lay down for anycluster that we uh consume that weoperate onis there is no perfect reliability whenit comes to GPUs or these nodes right uhour job becomes can we find the badnodes fast and isolate them and shieldthe workload from the infra failures andand this is a goal that we want toaccomplish even on the cloud and youwill see later in the presentation as tohow we accomplish that okay now that Ihave laid down the um case and and uh wehave a problem statement in front of usI'm going to hand it over to Chandan tosee how we solved this problems andbuilt multiple research clusters in thecloud leveragingKubernetes over to youChandan okayso so we just saw like how we wanted tohave like reserve built in the cloud soin the public cloud so when we startedwe we did not like started usingKubernetes at the uh from the start sowe we use something like this so uh wehave a meta instance u so in meta wehave uh something called meta instancewhich is just a machine image uh that weuse across the public cloud providers uhso we take the uh cloud provider basedimages uh host images and then we uh putour own layer on top of it so here inthe hello you can see the containerizeduh meta instance layer so pretty mucheverything runs as a container uheverything that we manage but on the topof that uh our the different teams atmeta can put their own applications andrun it on the public cloud so in thiscase Vuh on the CPU nodes typically werun the control plane components thoseare easy to containerize and run it sowe already have those containerized uhbut uh the slumd component that runs onthe GPU node and runs actual uhworkloads that was not containerizedjust because it was it handles a lot oflike resource allocation uh cgroupmanipulation and also it needs uh baremetal access uh or direct access to thehost to uh control the resources andallocated to uh to the jobs so as we sothis was our first implementation at thetime the footprint was not that big inthe public cloud but slowly we startedgrowing there were like multipleclusters uh we also started growingacross multiple cloud providers as welluh so this was clearly not the solutionthat we uh would that will scale uh forthe for the future requirements so weknew like we needed a better umcontainer orchestration orchestrationmechanism and Kubernetes was obviouschoice because of being industrystandard and was a across multiple cloudproviders so we thoughtlike we if we change the architecture wewill do something like this so whereinwe'll run everything as a community spotall the control plane components all theinfrastructure components required forlogging and metrics etc and then we canjust leave slumd as it is uh running onthe host so that we don't have to dealwith the complexity of resourceallocation at the sum level and getinvolved into uh process of uhresearcher submitting a job and thenthat job getting allocated on thekubernetes on the on the GPU node uh butwe ended up implementing something likethis uh so now cluster is divided intotwo differentparts on the right hand side uhtypically uh we run u things likeingress and ingress gateway on top ofcommunities there are a few otherinfrastructure components that run therethat are common that are uh reusable uhacross various clusters uh and thenthere is a left hand side where weactually run the GPU cluster and all thelike all the all the components that arerequired to run the workflow includingthe login environment control planecomponents and all the all thecomponents that we need for uh uh needto get the telemetry out of the out ofthe cluster so reason for doing thissplit was the right hand side prettymuch like stays the same we can reuse itacross multiple cloud providers we don'thave to customize it at all on the lefthand side because we are running onKubernetes now uh we can take that andrun it across multiple cloud providersbut we still have to do somecustomization based on the like crowdprovider or specific needs of thecluster uh specific hardwareetc so uh only thing we need in betweenis the L3 connectivity so that like theyare uh connected over the network and uhthat's about itso as we moved on to Kubernetes and umlike it it kind of like made our lifeeasier as infrastructure team to manageinfrastructure across various cloudproviders um we did not want uh wewanted to make sure the researcherseither we provide a better experiencefor the uh researchers or at least wematch it to what was it before we movedto communitiesuh so typically when researchers log inuh they just run slums uh commands to uhrun their workflows and they so wewanted to keep that same so because onthe Kubernetes now um you have to runlike Helm cube cube kettle uh thingslike that to interact with the uhKubernetes cluster uh we do not wantedto have researchers learn all that orchange their workflows across variousenvironments uh uh so we wanted tocompletely avoidthat so uh to do that what we did islike we came up with a custom CLI thatresearchers use to login into thecluster uh so that's custom CLI uh inthe back end it just creates a small uhhelm release uh custom resourcedefinition and then we have a flux hemcontroller that is running in the insidethe cluster so whenever researchers login that he release object is created andthen dynamically a login pod is createdfor them and um when that pod is up andrunning uh SSH session is launched anduh researchers uh get into the loginenvironment and they can uh from thereon they can just run their normalworkflow thWey don't even realize they'reinside the cluster they don't have torun anything uh they don't have to learnanything about Kubernetesso uh there are so we also run uh Kyoadmission controller so that is the uhweb hook that you can that you can useuh that's so the admission controllermake sure uh the Henry is uh is createdas per as we expected it to be so forexample if I go and start likecustomizing it to get elevatedprivileges inside the cluster uh itwon't allow that and also I it won'tallow me to create a login pod for withsome other user's name things like thatso uh if you if you if it validates thepolicy it will reject the uh that uhobject and it will not let you logininto the environment so with this now uhlet's look at let's compare it to likehow it was before so before u we had avirtual machines that uh were sharedbetween the various uh users and um nowthey get dedicated pod with thededicated resource allocation so now wehave better security posture and also itavoids problems like noisy neighbor soat the end at the end like we we endedup giving them a better experienceuh infrastructure component managementalso uh changed a lot it become a loteasier so now you can see here wealready talked about the slum uh so werun few more things on the cluster as acommunity demonstrate our deployments umas I mentioned earlier we typically justuse the cloud provider uh provided hostimages and we don't like to maintain anycustomization or have to deal with anycustomization at the host level but wecouldn't really avoid it entirely so wehad to do certain things for example uhin the yellow layer there you can seecisco demon set that runs across thecluster configures the kernel parametersthat are required to run our uh ourworkloads uh it also shows the automountdemon set um that mounts the NFS volumesacross the host providing the consistentaccess to the file systems we'll see alittle bit more about the automountdemon set later in the presentation uhwe also run various telemetry componentscombination of open source vendorprovided and custom build so as Kenmentioned um it was important to have alike very strong observability story sothat if something fails uh we can easilyuh triage why it failed uh whether it isa application problem or in problem andif it is impro problem uh how howquickly we can we can solve itso we have some customuilt componentssuch as some job matrix u core dumper uhwhich uh takes a core dump of u uhwhenever process is terminate uhunexpectedly uh we also have fewcomponents that are like customkubernetes controller so in this examplehere I have added user identity andservice identity controller and there isalso search manager search manager is aopen source tool for managing search onkubernetes uh in our case we use X509based uh searchs which are signed byprivate CA and uh to be able to do thatand then we have our own policies thatapply when uh some like when the userrequests a search uh so with the customuh controllers now we can in uh we canintegrate with our own tooling uh soutilizing both uh so utilizing opensource tooling with some custom layer ontop of that to be able to integrate withour own uh our own uh infrastructurecomponents there are a bunch of otherthings that are run that run asinfrastructure components and nowearlier pretty much like our team usedto manage that uh working with otherteams now it become little bit more uhit's become more independent they canjust we can just give them the namespace they can run their stuff and uh wedon't have to get in between they canautomate u uh using our helm automationtooling uh deployment of u deployment ofwhatever components that need to be runon thecluster uh so as Kalan mentioned earlierit was important for us to make sure uhnodes are consistent um so let's say ifresearcher submitted a job across likeuh 400 nodes all those nodes should havethe consistent uh configuration and consconsistent image versions etc so to beable to do that what uh what we have isour everything is handled by our helmdeployment automation so whenever thereis a change uh in the helm chart oXrimage that gets applied on kubernetesand the custom controller running on thekubernetes um first thing it does it itstarts a rolling update on the slumdnodes and um it updatesuh it drains all the nodes in thecluster so at that point uh slum makessure uh those nodes don't get any newjobs allocated and it lets it finishexisting jobs that are running on the onthe node so once uh the node finishesthe uh existing job it it it goes intothe drain state and then uh thekubernetes controller then uh replacesthe slum pod with the new version andthen um undrains them so now it becomethat node becomes available again backin the cluster for uh to pick up thenext workload we run uh a few telemetryuh health checks that run uh in aKubernetes control plane as well as uhon the slum to be able to detect faultynodes and then those nodes are kind oflike tracked and then our support teamuh triages it and in some cases we haveto take it back to the cloud provider uhto get it fixed or replacedso for the storage um because um AI HPCworkloads are like data uh dataintensive we needed to make sure weprovide many of storage options startingfrom simple posic storage to um AIaccelerated uh object storageso for non-posic storage such as objectstore we did not have to do any changesbecause we moved to kubernetes becausepretty much uh it just depends on theclient and um uh they just make the sameAPI call they used to make before butforNFS because uh if we look at theKubernetes and how Kubernetes handleslike volume mounts uh using CI driversposted volume claims and volume mountsetc uh anytime you want to new mount anew volume you need a part restart so uhthat that is not possible uh in our casebecause we don't want to disrupt AIworkloads that are running there wewanted to have a way where we can uhwithout any disruptions uh introduce newfile systems or take away existing filesystems when they are not needed so weended up implementing something likethis so uh so as I talked aboutautomount demonstrate a little bitearlier so automount demon set runsacross the cluster on every single nodeand it makes sure uh it mountsNest volumes so sorry Nest file systemsso those file systems are categorizedinto various categories such ascheckpoints data set infra uh and thosecategories are parent directories uh onthe host and inside that parentdirectories those gets mounted and thenthose parent directories are bound bindmounted inside the slumd and login podso that whenever new file systems areadded nothing changes for uh slumd andlogin part but those file systems becomeavailable for the uh for the[Music]workloads so to to conclude so fromonremises to public cloud across uhacross multiple cloud providers was madeby possible by moving to Kubernetes uhas um as we kind of like iterated overuh the iteration speed so now by becausewe are on Kubernetes um various teamsare can handle their componentsindependentlyuh we are able to like fast we are ableto deliver things faster in the cloudmaking more services new featuresavailable faster uh unlocking the innovuhinnovation and uh we are also did nothave to sacrifice on research experiencethat was uh very very important for usuh from performance Once reliabilitysides uh we we did not had to so we kindof like met the requirement that was setforth uh even though we moved onKubernetes or in some cases we havebetter observative story because we areable to utilize um tooling from cloudproviders as well as open source as wellas custom build uh running HPC onKubernetes has challenges at scale uhbut most of those we found veryaddressable but sometime but still therethere are it's not like painfreeenvironment so there are still somescaling issues that you will you willrun into uh Kubernetes ecosystem uhallows us to like move fast uh acrossmultiple pro cloud providers is justobvious like it just uh vastly supporteduh industry standard so it's much easierfor uh maj easier to adopt and reiterateonokay sorry okay so that that brings usto the uh end of the presentpresentation uh I just wanted to alsothank uh the open source community formaking Yall these various componentsavailable uh that made us possible tomove to Kubernetes and uh in in veryshort period of time[Applause]with that um that concludes the sessiontoday so with that we I can take somequestions i'll be also available here orin the lobby later if you have uh if youhave questions as well[Applause]thank you very much for an amazing talki have a couple questions aboutreliability for the cloud uh publiccloud solution you presented uh firstone is about how do you distinguishbetween uh workload level issues whichalways happen at scale and the infralevel issues and the second is when youexclude certain nodes from the clusterbecause of the failures do youoverprovision on your sides to accountfor them or do you expect your cloudprovider to compensate for this costsomehow yeah so uh regarding the firstquestion so yeah it can be tricky uh andwe might not always get it right rightaway uh but we have obserity story builtin to kind of like to help and we arestill working on like making it betterso for one instance like I can maybegive you an example so in in some caseswhat happens is in the public cloud uhwe found a problem with the node we sendit back to the cloud provider it comesback they don't find any issues with itit comes back to the cluster again thatnode kind of like keeps failing keepsfailing so we kind of like trackfailures on the particular node acrossuh across various jobs and kind of liketry and find okay so this is a like bada bad node that caused uh x number offailures in last 30 days so kind of liketake like keeping the bad actors away uhis kind of like approach uh we haveimplementing one more one more trickything in the public cloud is uh the samehardware can come back as a as adifferent host so we are also currentlylike looking into better ways to trackthat using GPU IDs and uh serial serialids and so that like okay we don'tdepend on the actual physical locationon the host or actual uh name of thehost but uh we can really track down bythe uh uh by the by the GPU IDs and uhand things like thati have a question about your W2 or W3 Idon't remember about your slam D how howdo you deploy that one you you deploy itin the bare metal environment it seemsto me uh yeah so it was not clear likeentirely a bare metal environment so itwas more on like something like AWS EC2so it has a thin uh virtualization layeron top of it so but we just run but youother components is actually part of thecontainer part of the Kubernetes rightyeah yeah but this one is separate yeahso they it just runs on the OS like asystem you need uh so but when theylaunch job what kind of a like when wefinally launch the job to the workingnode what does it look like yeah so inthe slum pretty much uh in the in theboth the environments when somebodysubmits a job they log into the slum andthey submit the job and then the slumdkind of like spins up the uh taskprocess on the worker node uh and thatexecutes the actual workload so nomatter you are running inside the pod oryou are running as a process on the hostthat process doesn't doesn't change muchuh and then also about the deployment uhin that way still use SAF and separatordeployment on this node only uh yeah soin new model we don't use chef we justrun everything as a Kubernetesdemonstrate because we do only smallthings on the host level uh we don'treally need a config management uh andhave to deal with uh like differencesbetween various cloud providersdifferent versions of kernel and thingslike thatokay thank youuh hi I have just one questions uh haveyou considered some open sourcekubernetes operator which managed slarminside of kubernetes for instanceoperator or something like that yes weare reeuting that as well u so yeah innear future uh we will pe out on that Ithink uh but yeah that's that'ssomething we have been like discussinginternallyokayhi uh great talk so I mean like when youhave these long running jobs and youhave to do system level updates right umthere might be uh a period where youhave to wait for that job to finishbefore you get the feedback uh how doyou handle that uh portion of it yeahyes so uh on the Kubernetes becauseum now slumd runs as a job on the as apod so let's say if you want to do asecurity patch on the host uh thatdoesn't need a restart so you can stilldo it without affecting the slum becauseslumd runs in its own name space uh uhand uh uh it doesn't affect it so that'swhy like everything running uh alongsideof like slumd pods on that particularhost they can continue to update thingsbut uh workload running inside the slumdpod doesn't see those changes so itstill see the whatever there whatever isthere there on the slamd some things canbe tricky so we have to avoid uh forexample like upgrading uh or changingkernel uh parameters or stuff like thator applying a patchSobut overall like the things we donormally such as for example hey we runa let's say a metric agent that needsupdate we can update stuff like thatwithout without having to worry aboutaffecting the workload that runs insidethe slumpod thankshey thanks for the talk it was reallynice um and yeah I just had a fewquestions so number one I think you saidfocusing on the researcher experiencewas paramount and I noticed you said youcreate like a fake CLI in a sense tohelp do that so why did you insist on uhcontinuing to use slurm shall we say uhif you were kind of uh able to interjectin that point and create clayeah whouh so question is like why we insistedto use slum and why not kubernetes yeahso for example today I've uh well inthis week I've seen sorts of thingsabout uh training pipelines you knowcubeflow qflow pipelines and otherthings like that so you know why not gothat approach and create things in theCLI to yesso I think slum islike like very widely uh used tool toolfor doing AI research ai is basicallykind of like research um and um at leastin the near term we did not had uh plansto like change that and then pretty muchlike lot of people working in acadeacade uh academic research theytypically have a lot of experienceworking with slum and uh it alsoprovides uh way better uh controls forrunning longunning bad jobs so that issomething that um uh we did not want itto change but what you're saying is alsolike we have been like considering so wemight end up like having another uhcontroller that runs on kubernetes butwe can keep slum for people who arecomfortable with using slum and then butthere are there are there are certainusers they want to kind of like like useuh kubernetes or some other uhkubernetes way of doing things basicallyso in future we might end up doing bothnice yeah I can totally understand thefeeling of sticking to slurm to you knowkind of get that researchers buy in thesecond question is actually very relatedto what you were saying at the end wouldyou be able to make this work for uh acombination of on-prem clusters andpublic cloud and I assume that you knowhelps the companies stay quite agile aswell yeahso for internal like we have very maturetooling for internal like uh handlinginternal like whatever runs in our datacenters we already have like automationwe have like uh so we don't need to likeuh do much customization there but inthe public cloud what what happens isbecause uh we are kind of like limitedby what public cloud has to provide andthen also kind of like implementsomething hybrid that works with ourinternal tooling so that's why likeKubernetes helps there but in on the forthe like private clusters we alreadyhave uh a good tooling that integrateswell with our own infra so it's notnecessarily we have to customize thatyeah fair enough uh so what I mean is uhhaving a common interface that would bethe same for the private cloud or thepublic cloud has that ever beenconsidered not really is it no notreally yeah cool uh because like thereare various groups various like you knowit's little bit complex uh yeah but uhbecauselike yeah the training workflows aredifferent there are differentrequirements and stuff like that that'sperfectly fine thank you so muchpleasure thank youso thank you everyone for joining2025-04-15 21:59:31.683834[es and also uh during myexperience I have met a lot of end usersthey are kind of restructuring their uhresource pools as their uh businessgrows so previously a lot of businessteams they would uh independently uhbuild and uh manage their own uh clusterand later on they went to the uhstructure that have a uh shared infrateam to you know uh uh provide the uhorder maintenance and also share thetechnology and also why not a few just ahypers scale uh clusters a lot of userssaid they are It's just very hard torebuild and also you know maintaining alarge scale clusters can be very hard sothat's how the story begins right so umuh this is just a very uh briefarchitecture of the uh multicluster AIplatform so uh actually for theincluster scheduling incluster workloadmanagement uh we know that uh uh thevolcano project has done a very good jobuh it help provide a lot offunctionality uh to support uh AI batchworkload friendly struggling uh uhscheduling features as well as uh Qfeatures and like uh resource sharingamong cues and alsouh priority based on cues and thecapacity management sort of things atthe multicluster uh actually we can callit just a federated layer uh we can useuh commada to manage the uh cluster uhhealthy state and also uh propagate theworkloads to the clusters according toyour uh preference uh whether you wouldlike to uh whether you would like theworkload to be kinddivided or just scheduled to some of thecluster or uh replicated uh in the caseof managing AI workloads most of thecase we will just find one of thecluster to get it uh scheduledespecially for training workloadsand uh uh during my uh uh discussion andcollaboration with a lot of uh end userswe found that actually there are some ofthe assumptions and the design uhprinciples very important and that's uhbeing kind of proven multiple times frommultiple users so um two-level system iskind of unavoidableso with that the layered uh architectureneed to have specific f uh focus looselycoupled i mean the federation layer thefederated control plan layer morefocuses on the intercluster coordinationthing while the member clusters they aremore kind of focusing the inclusterthing and they can be highly autonomousright so uh for the federated controlplan we are kind of not able to storethe whole uh the full detail of the uhmember cluster status because otherwisewhy not we just have a hypers scalecluster right uh and also like thefederated scheduling um you know uhshould not replace theincluster scheduling it's more likely uhjust collaborating with the in clusteruh scheduling and also the scheduling uhin the multicluster layer is quiteexpensive you find a place to spin upyour workload send it to that clusterand the incluster scheduler made thedecision and it turned out no and thenyou kind of evict it and return againit's quite takes long and sometimes evennot able to uh get things fixed so weneed to optimize the first timeattempt um so with the limited time uh Iwill just cover three uh majorchallenges I've have been working onfirst is about the tradeoff of the uhscheduling second is about the uh failover uh between clusters for theworkloads and the the third part is uhthe uh the queueing at the uh federatedlayerso uh as I said that actually we are notable to easily kind of uh store or catchall the uh cluster status details weneed to kind of balance between uh youknow the footprint the overhead for thecontrol plan and also we need to take uhhighly uh the efficiency and thethroughput and also scheduling latencyinto consideration so uh in kamada webasically have two major options one isthe uh resource model it's kind of uh uhcompress this cluster status in thecontrol plan and another one is the uhresource uh the scheduleuler estimatorso the uh resource model actually wekind of use grades to uh quantify the uhcluster basically uh node status into uhuh uh some of the groups and you can asyou can see on the right there it's anexample and also um and also uh this isactually better you know uh uh for thelike ap uh CPU or memoryincentive workloads like big data andalso it has mor\e uh kind of uh betterthroughput because the the schedulinglat latency is better all the things youneed isinsideuler but you know actually the theaccuracy versus uh efficiency uhtrade-off varies depending on a lot ofuh different effects so um for theestimator it's like actually estimatoris quite similar as the um inclusterulerit just not make the final decision butit will still go through all the uhscheduling or algorithms that youenabled in your cluster so it uh willneed to watch the member cluster tocollect all the uh the basically nodeand the pod status uh in memory and whenthe commander scheduleuler doingscheduling uh we will trigger uh RPCcall to each of the uh uh estimator andcalculate the uh max available replicasas you can see this would take uh longertime uh when doing the schedulingdecision and also uh you know it's kindof still um storing the whole uh clusterstatus in the control plan so theactually the footprint can be a littlebit more extensiveexpensive however actually uh you knowwhen you are doing on the uh AIworkloads especially you know for bothuh training and also uh inference uh wethink that it's uh it's a it's a worth aworthy uh tradeoff because we don't wantwe don't really want waste any of the uhGPU resources and also uh like todaymore and more uh distributed trainingdistributed inference uh requirementcoming on uh coming out uh we need moreawareness of the uh incluster networktopology that's why we uh need the uhestimator way for the uh multiclusterscheduling for AI workloadsyeah so uh the second part is about uhthe cluster failover for uh workloadsUh actually uh in uh during the uhproduction uh uh uh resourcefulmanagement we found that it's actuallyquite uh complicated you know to uhreally uh determine if there's any uhcluster failure you know and uh in mostof the time the failure are just uhpartial cluster issues not uh criticalical failures so uh people really don'twant to kind of drain the uh cluster orevict most of the workloads right andalso uh we need to uh distinguish thedisconnection and the the actual failurestatus and uh um typically we also thinkthat multicluster high availability uhuh consideration may result in moreredundant uh resources we need to thinkabout how we uh really want to deal withthe cluster failover if you don't haveenough uh resource thefailover may you know uh amplify the uhfailure from one cluster to another thatwill be kind of unacceptableso um that's why we were thinking how wecan uh you knowuh revise the uh cluster failovermechanismuh to from uh tent based eviction to amore flexible way to do it so that's ourkind of revised one so uh we uh thinkthat the cluster status may havemultiple resources uh we can just uh youknowuh uh check the cluster lease uh that'sa quite an easy way to to uh check ifthe cluster is online and we still needto kind of uh extensible uh to have aextensible way to let people implementsome uh you know well-known clusterissues and make them to uh become uhconditions on the cluster and also uh ifyou have like infrastructure provideryou may want to uh have some of the outof tree mechanism to core some uh tocall some API to uh help uh really foundthe the the problem and uh for the tentfor the tent uh before we already haveuh like the no schedule tent and the noexecute tent at the at the uh uhfederated layer but the thing is thatactually the new no execute tent isquite dangerous because once you add anyno excute tent without a predefinedtoleration on your workloads it cancause all them all of them being evictedso that's why we are thinking uh can weadd a a little bit more you know uhum uh type of uh uh a type of tent thatis in the middle that we don't wantreally evict most of the applicationsbut just a few of them so uh with thatwe actually added uh the uh preferredthe no execute the uh the interestingthing is that this tent don't actuallycause any of the eviction because youcan uh it doesn't require any tolerationto uh let the workload stay on the uhcluster so then how we can you knowtrigger the uh some of the workload thatuh don't want stay wh]en the uh part uhpartial cluster issues uh appeared uh sowe are using the uh failover failed inthe uh propagation policy to define thatuh with that uh when the uh preferred noexcute uh tent uh appeared on thecluster um the eviction manager will uhuh trigger the eviction process and forthe both no execute and the no preferredno execute tent triggered eviction wewill uh basically incure the uh uh theresource binding to the eviction queueand we know that uh it's very importantyou know uh to make sure the eviction ofthe workload may result in a workloadrunning in another healthy place so uhin the uh new design of the evictionqueue we are thinking that we canactually uh import the uh algorithms ofthe commander scheduler to do a uhpre-scheduling to check if it'suh uh possible to get the workloadrunning otherwise we will just give theev give up the eviction and also uh uhin the uh kamada uh eviction managementuh process we already have the gracefuluh eviction me mechanism so we just uhuh uh reusethat so uh as I said uh uh that's themost part of the uh theimprovement for uh as I uh said the likethe the cluster status detectionuh we change we will change from uh youknow uh checking the whole clustersinstead of uh uh two uh the clusterleaves to reduce the uh pressure to theAPI server and also we will add thecluster uh problem detector to uh uhsimplify you know the customization ofuh uh cluster status uh collection andalso we are currently working on uhdesigning and explicit API forcustomizing the policy how you want touh tent your cluster by uh certain setof conditions and also you can choosethe type of uh tent you want to add sothat gives more flexibilityuh for you to uh to you know kind ofreally manage the eviction uh behaviorof your wholesystem yeah and uh in the in a longerfuture uh we would also take theworkload priority into consideration somake sure even though you don't haveenough resource in the rest healthyclusters you can still tryuh uh migrate the most high priorityworkload to the healthy clusters so uh Ijust provided link uh in the slides youcan uh check out later onyeah so uh this is the actually the uhnew API for uh you know uh defining theeviction behavior migration behavior uhthat reflects the preferred no uhexecutedtent yeah uh so third uh third uhchallenge I would like to discuss isabout queuing uh workloads and actuallyuh we have met a lot of uh adopters theyare trying to use a uh multi-clusterplatform to manage their uh workworkloads like training and also like uhuh computing jobs as well as uh thisyear more and more uh inferenceworkloads are uh making plan to move tothis architecture but as we all knowthat kind of uh the uh scheduleulerframework have uh a lot of issues rightum so here are some of them uh basicallysmaller jobs always get implicitlyhigher priority because they are alwayseasier to get scheduled and the secondthing is that the more jobs you createdbecause the queue of the queue insidethe scheduleuler that would resultingmore resource youconsume uh and when there's more uh moreand more different type of resourcesthings get uh more even more uhcomplicated uh in the talk uh just uhbefore uh the last one actually uh thespeakers are already uh give anintroduction about the uh priority basedscheduling inside kamada uh that reallyhelps a lot but still not enough whenyou are providing your uh platform tomultiple like business teams and uh uhyou would run into limited priorityclass and for those workloads at thesame priority class they still met theissues uh of the uh fairnessthing also another another uh big issueis that uh we tended to consider servinghas higher priority than uh than the uhtraining workloads but we don't reallywant to uh preempt them because thatwould result in waste of GPU uh fromtime wise right and also um currentlycommander scheduler already taking a lotof uh resourceuh perspective uh conditions intoconsideration when it's doing schedulingbut uh as we know that in the cluster weare we are also trying uh like Qmanagement and also uh resource sharingbetween cues how we can uh enable it inthe federated layer and make the twolevel basically decision making alignwith each other so that's uh the uhthings we want to uh improve and uh aswe can see that actually uh volcanoalready implemented the queue conceptand also like the volcano job inside thecluster and uh uh with the uh uh commadadesign it's quite actually quite easy toenable it in the uh federated controlplan because commada don't require irenew APIs you can just create any uh anyuhCRDs so the little bit different thingis that actually when you create volcanojob uh you you also need to enable thevolcano global to make sure uh it willbe uh mapped into a queue at thefederated layer and also uh actually uhwhen uh volcano uh global doing dealingwith the uhthe uh the order of the uh scheduling uhthe different workloads uh we just incuethe resource binding and also this isactually relying on the uh schedulingsuspension feature uh just released inthe commander latest the version to uhto make sure we get higher priorityuh jobs being scheduled byKamada and also uh in the uh graph youcan see the uh there's a example that wecan uh implement fair sharing between uhcues so one of the little bit differentthing considering with the uh volcanoinside single cluster is that uh todaywe don't uh schedule only part of the ofthe job uh that's a kind of restrictionwhen we are implementing uh thefederated scheduling so you are not ableto uh achieve fair sharing among joblevel between uh different workloadso this is a overview of the uh volcanoglobal and uh the first wish uh versionwe implementeduh just a very basic fair sharing and uhin the future we plan to uh introducemore algorithms and also uh we arethinking about maybe we can uh implementlike hierarchical queue and also uh alot of advanced um queue managementfeaturesand also uh set up alignment uh with thecapacity resource sharing uh between theincluster volcano and the federatedlayer uh clustervolcano so that's all about my uhsharing so just a very uh uh uh uh Ithink still uh the whole work is veryinitial and uh if you have any feedbackand questions welcome to join thecommunity and then also help usprioritize our work thanks[Applause]so any questions welcomeAll right uh if no questions uh let'ssave the time for beers thank youohyes microphone yes i have one questionum have you considered to use thevirtual cublet to simplify theschedulinguh assuming every node could be virtualevery cluster could be a virtual nodeand even porting your new proposal forthe kindum I don't remember the conditional uhtaint to the SCG nod for example becauseI think on a nod today we can't have a non taint based on condition it'soutcoded on the cublet based onpre-built conditions but this extensioncan make sense also on a node andconsidering every cluster has a virtualnode with a lot of resources a lot ofthings also prevent also solve thisproblem of cache what you were speakingabout because you have every precomputednode on your master cluster and you seethem as static nodes mhm uh yeah uh uhthat that's a very good question soactually in the early days when we aredesigning the multicluster uharchitecture uh uh virtual node is oneof the option um uh that why we choosethe uh the federated way so basicallyconsidering I meanuh managing directly the uh clusters isbecause that if you do go through thevirtual coat way you are kind of uhscheduling pods a pod to a cluster youstill need to kind of resum the workloadinformation when it goes to the membercluster API server right if I want toschedule a safe set a deployment or evena custom uh resource workload definitionthe pod API is not that ideal for youknow uh carrying carrying uh the extrainformation ah yeah because in fact onlythe pod will move to the node but yeah Isee what you mean so the deployment andset will remain in the master clusteryeah and only the pod will move yeahmost of the fails are remain the sameyeah ah yeah yeah yeah okay yeah makessense thank you but the conditionaltaint still makes sense in node no uhnode level right yeah yes yes okay thankyou yes2025-04-15 21:59:32.440252 ��\�#��oA6hWoA4jEk5Mall right are we live we're live yayokay thank you all for coming everyonehaving a good time at CubeCon so farsecond day yes yes all right i'm goingto match yourenthusiasm okay um I'm here today totalk to you about open platform forenterprise AI um AI pipelines with Opiathis is a a project that we argue abouthow to pronounce the name i pronounce itwrong all the time it's actually OPA soI'll say OPIA you all will remember thatum I'm Melissa McKay i'm the head ofdevelop_��Y�#��iA4pBhVVrCHyMum hello everyone thanks for joining thetalk today i am Kevin Juan uh currentlyworking on uh multiple projects on theCNCF and my personal background is moreuh about uh scheduling and uh actually Istarted uh contributing to Kubernetes inthe early days and during recent years Ihave been working on the volcano projectas well as uh Kamala project so today Iwould like to share uh my thoughts andmy uh experience working on the uhmulticluster AI uh infrastructureespecially for you know a lot of uh day2optimization so uh a little bitbackground of uh the talk why we need amulticluster actually uh we all knowthat there are kind of multiplebackgrounds some of the company theyhave uh multiple uh regional uh businessit results in physically uh divideddistributed clusters and also some ofthe users may run into the you knowhardware procurement uh cycle issue sothey have to kind of extend theirworkload to the public cloud so they canuh rapidly uh rapidly get some of the uhuh uh resourcZ`er relations at JROG and I amalso a member of the OPA technicalsteering committee and I'm here withEzeka most known by easy ease um opensource AI evangelist on Intel and I'malso Lei LF AI and data chair and memberand board member um yeah I'm I'm I usedto say open platform for enterprise AIinstead of OP that's a lot easier yeahopar or OPAokay so let me let's get startedokay so first thing um we're going tostart with the end at the beginning herebecause we have a fun treat for you thisis the first time that I've ever donethis so I'm kind of excited about itwe'll make improvements later on uh ifwe do this again but I've got a coupleitems for you after this session there'sonly a hundred of these i can't see allof you i don't know if we have more thana hundred but if we do you're you guysare going to have to race to Sonia inthe back she'll be at this exit on theback right my right and she will hand acard out to you this card has a secretmessage on itin order to read the secret message youcan one cheat and figure it out on yourown or you can play the game have fun goto the JROG booth grab a filter to readthemessage and then head over to the Intelbooth to get a prize all right or theoption for a prize so just something funfor you to do little scavenger hunt umwe'd love to see you at the booths aswell so uh yeah all right moving on uhthe next thing that I want to do is uhsomething crazy we're going to startwith a demo immediately instead ofwaiting at the endcool allright so how many of you knew thatCubeCon has a kids day prior to theeventcool yeah yeah that's not surprising isaw like a couple hands there yeah it'skind of it's a relatively new thing thisis the third time that I've done it iwas a volunteer this time and I was veryfortunate to have my daughter who's acollege student and another friend ofhers also a college student to teach twouh workshops at this little kids eventuh it's for uh younger children from 8to 14 and it's to inspire this nextgeneration of technologists right andthe idea is to come up with some funthings for them to do so that um theycan you know be like us someday sittingin these chairs listening to talks likethis it's going to be great anyway um soI have a little program here it's just aregular kind of like a chat GBT likesystem and I'm just going to typein whatis kidsday now what do you expect I'm gonna geti just gave you a bunch of informationabout Kids Day it was here in London itwas for a group of kids from 8 to14 and it it's an event of CubeCon but Iget a bunch more information herebecause this system has so muchinformation in it it's just going topull out what it thinks kids day is soobviously Kids Day is a special occasioncelebrated to honor and appreciatechildren but this isn't exactly the kidsday that I was talking to you about soin order to fix this so AI is not as assmart as we think it is right yes likeYeah so this isn't helpful at all thisis why I I don't like AI don't likeusing it it doesn't work for me okayokay all right one of the mostfrustrating things is making sure thatyou get the answer you'reexpecting so in order to solve thisproblem for this particular program I amactually going to give it some contextand in this case I'm going to provideit alink this is theevents page for CubeCon i'm going toprovide thisURL and then I'm going to ask myquestionagain what is kids daythis time I get the information that I'mlooking for it is a a technology eventat CubeCon for children's 8 to 14 umfocusing on open source technologies bythe way this was really fun uh we had amicro:bit workshop and we had a Robloxworkshop which was a ton of funso this is an example what we just sawhereand you can keep making questions aboutwhat is happening here because wealready provide some context some kindof so if I like to know for instancewhat is the workshop and you can startinteracting and you're just focusinglike the answer um so this time I askwhere is kids day and this time it tellsme it's in Excel in London yeah this isan updated information right so okay Solet me switch tothe stopmirroring just coolnice sao we finished thedemo show thisokay so what you just witnessed that isan example of rag how many of you knowwhat rag meanseverybody knows what rag means okay sofor those of you that don't I'm going topass it to easy or you can know it butnot the specific details on how it worksunderneath which is basically what hwhat is happening underneath is you aremaking a question right let's supposethat we don't have this entire this coolsystem where you have a UI and and youupload everything you can actually ifyou provide a new prompt and you can sayum answer the following question basedon this context which is actually if Iknow let's suppose that I know thecontext I will say kids day is a one dayevent it's held in London XL and it'sfor ages of 18 and between eight and 14and the question will be what kids dothey so I'm providing all the context onthe on the prompt this is not howbecause we don't actually know theanswer right so you're looking forsomeone so you are instructing the LLMyou are providing the context we'll seelater how this context is provided andyou are providing a question right so ifyou give that to an LLM the LLM willstart making an answer and will say getready for kids day and so on and weactually give them like an instructionto say try to advertise it to me so themodel wasn't trained on that informationthe model doesn't have this informationbut it's actually making up an answer umbased on just on the context that youare providing so what happens so thefirst thing that happened was retrievalthat's the R and rag so the brain inthis case is the context that weprovided it the URL from the knowledgethat's the knowledge base so that's theR that is the R the A augmented so wegave it some clues so instead of justrandomly asking what is kids day whichgave me a completely unrelated answer inthe beginning I now get context withthat prompt I give it instructions andthen it generates a response based onthat data that I provided it this is ragright so if now I mean that's fun butlet's go a bit deeper now and for thoseof you that know rag we know that withrag we have multiple pieces we have theembeddings we have to convert there's aprocess we now go for the entire processbut just as a high level you need tohave your information the context that Iprovide before that's not a context thatI will be giving to the user I need toextract this content from my from myknowledge base or from my localknowledge base so This is why this isvery useful so I will need a kind ofvector database which is actually mylocal vector database or myURL and there is some process involvedon that like the embeddings isconverting the text to vectors thevectors that's going to the to thevector database and once you are makingthe question which is what's happeningon the top it's also converted tovectors and those vectors are comparedbetween doing the similarity search andwill give you the most similar documentsso this kids day is a one day event andkids day is held on London XL and so onthis is what we have in the site andactually the retriever is extractingthose documents that are most similar tomy question and are creating a newprompt underneath and you are gettingthat answer but it's actually if westart thinking we can see the boxesright if we we are all at CubeCon rightso we know what what what a microsservice mean so if you like to deploythose kind of examples we would like tohave microservices we would like to havecontainers to specifically use or do orperform perform one particular actionright so we're starting to go a bit onthe details and why we need to have thatwe all know why we need to usecontainers or microservices right sothis is basically but those componentscan be a multiple components not justone component yes definitely there was aLego session yesterday anybody in herego to that oh I didn't know it yeahthere was there was you all missed out iI heard it was a ton of fun um in factit was a full house they had to turnpeople away um but I have a bucket ofLegos at home and um it is fun sometimesto be able to build like your own thingum but it isb a matter of figuring outwhat you want to build and then you haveto find the particular pieces thatmatter or that's that's going toaccomplish what you want to do stickthem all together maybe you do a lot oftrial and error um there's so manyoptions and you will find if you go tothe sponsorship floor you're going tosee lots of demonstrations lots ofexamples of all differentimplementations and different ways ofproviding this solution that we justshowed you uh for various use cases umit is a lot of like I said trial anderror reinventing the wheel um wouldn'tit be nice if we could do somethingbetter without so much chaos and putsomething together so that we can nothave to start from scratch every timeyeah I mean we are we love open sourcewe love the community and try to buildnot to have one company doing everythingjust let's try to build somethingtogether using the community so this isactually how we we build when we once webuild this if you're are buildingsomething on jai how many of you aredeploying or or working on with genaiactually okay not a lot okay so you knowthat this is a challenge so you all liketo build this kind of beautiful chatbotthat we present at the beginning and youdon't know where to start so it's can becomplicated you have multiple optionsmultiple companies creating thing so youdon't know which path you need to followSo what do we need it's we needstandards we need a way to do this u tomake it easier for everyone to start umI had an experience in my past where Ihad to do a lot of like video work and Irecognized immediately how manydifferent standards there were out thereand how difficult it was to do that andpart of that job for me was to figureout those differences and for ourspecific use cases build adapters andbuild you know um specific use cases foror specific uh components for all ofthese different use cases uh that's aproblem right it's easier when thingsjust fit together we've all have anagreement we have bigger pieces to workwith we can just put them together andwe're happy about itum having standards like that is youknow it's an abstraction layer um it'sgood for us it's good for us yeah yeahand it's also easy to build likesomething right we need those componentsbut in this example that I showed beforewhy we need standards is actuallybecause ji is not just one model it'snot just one part one LLM that you needto use very optimize you actually needto use multiple components and it'sgetting more and more advanced you haveagents that the agents are starting totalk with other tools and how all thosecomponents work together is getting moreand more complex so you have multiplechanges uh multiple challenges sorry inin the data how you index the data howyou do the changing how you preparethose vectors how you select the rightmodel for you are you going to useremote inference on open AI are yougoing to do something local with an SLMprobably because you don't have youdon't need to spend money with a verysmart model because probably you don'tneed it and sometimes it's okay if youlike to use it in your local environmentwhich frameworks are going to use aregoing to use hiding are going to use anyother kind of framework and thedeployment we need to care about thehardware efficiency which methods aregoing to to use for the retrieval theevaluation of the LLMs how you canevaluate that so these are multiplechallenges that we may face when westart to deploy this and this is kind ofwhat we end up yeah exactly this is uhwhat I end up with quite a lot uh quiteoften at home with my bucket of Legos issomething creative something fun fun tobuild of course um as developers we'realways you know looking for the next newthing and putting things together but alot of organizations do not have the uhcapability of investing in all of thistrial and error and even just looking atthis this is a wonderful Legomonstrosity i like it i think it'sbeautiful but what would happen if youneeded to build another layer forexample on top of this you'd have totake pieces apart redo it uh that kindof thing so when we're talking aboutcode and wce're talking about all thedifferent components we're talking aboutmore money more developers having to beinvolved in this process uh fixing itevery time maintaining it is is umdifficult so if we already have a set ofstandard building blocks we're going tobe able to get much faster further andthese organizations that do not want toinvest all their time just to find outthat maybe this particular use casedoesn't even work for them they're goingto be able to find that out faster andfail faster i like the monster though sobut we have a solution i mean we arebuilding a solution as a community forsure yes and how about an open sourcesolution something that all of us canuse all of us can benefit on and um wecan all contribute to we all havedifferent use cases we all havedifferent scenarios that we want toaddress uh this is an opportunity for usto get this part out of the way thedifficult part so that we can all moveon to something better something greaterand this is why we have OPA OPA openplatform for enterprise right yes yeahcool so the the um this is part of theLFAI organization under the LinuxFoundation and um Oh yeah sorry yeahyeah so first of all we've we've come upwith a set of these composable buildingblocks uh that was step one figuring outwhat pieces actually make sense thatapply to most use cases that we can allbenefit from yeah and this is that meansintegrating LMS data stores agents ummultiple protocols that you are usingwhen you are building this kind ofapplications this is basically what I mementioning before about it's not justone component one optimize it's multiplecomponents you need to put them alltogether in a solutionyeah and it's not just rag use casesit's not just um you know there's otherthings too there's patterns that we canapply across the board on severaldifferent use cases translation codegeneration and even multimodalum applications or solutions uhincluding both both text and images yeahand also the evaluation part sometimeswe don't care that I mean we care aboutlater like the evaluations but it'ssomething that we need to start when youare deploying those applications youneed to start thinking really at thebeginning how can I assess that how canI evaluate the model is performing wellif my if my rug um is performing wellit's not being biased and so on so weneed those kind of extra tools and thesetools is something that is alsoavailable um in the project and it'sjust I mean this project was launchedone year ago uh it started as a as anIntel project two years ago and it wasdonated to Foundation and now it's notjust Intel now it's Intel and more than55 partners that they are contributingto the part to to the project each ofthem for instance are contributing whatthey do for instance for vectordatabases you you would like to have thechoices to use I don't know minio or redhat or open search or red shift um oropen open search or any other vectordatabases so the main goal of thisproject is let's build together let'savoid the vendor locking let's avoidbeing focused in just one company evenIntel is there AMD is contributing and55 companies just and it's growing andthe list is growing and growing andgrowing yeah a lot of excitement aboutthis project and I w also joined when itfirst started oh we have a partner yeahyes um and so I'm here today just totell you about just our you know ourspecial use case why is JROG involvedwhy are we even interested in this thisuh project um one of the biggest reasonsis just think about the vision you knowwe want to enable enterprises to developand deploy genai solutions um uh we wantto focus on security safety scalabilitycost efficiency and agility we lovedemos we love playing with things welove proof of concepts but the realityis is most of us go to work every day wehave to meet regulations we have to meetcertain requirements we have certainbars of safety that we need to meet uhwe may be um you know I mean like we'rejust working behind a a um a systemwhere you can't even access the theinternet you know there there's someregulations that we need to meet so whatI'm concernedd about and what I wasreally excited about is uh theopportunity to get involved in thebeginning um in in the subject that Eziebrought up earlier like the evaluationsfiguring out um making sure that whatwe're building is actually efficientthat it's the best for our use case andthat we're going to save money doingthis and also be safe doing this um JROGis specifically interested becauseum obviously we are uh u artifactstorage um uh and management uh withArtifactory is one of you know ourflagship product so we are concernedwith that we want to make sure thatpeople have access to this and you'reable to to uh manage and store yourcontainers appropriately for thisprojectum we also offer model storage andmanagement so we kind of we have anopinionated way of versioning models andu making sure that they are safe andalso scanning uh the security part ofthis is a big piece for us so we wantedto make sure that we provideinstructions uh for anyone interested inusing this this project to be able toput this actually in an enterpriseenvironment okay so I will showsomething uh we have five minutes so Iwill show you a quick demo more than ademo it's more like explain a bit theproject um so let meseehereum cool so I will not show this again sofine so the project is of course is onis on is on GitHub and you have multipleprojects within the project of course wehave the geni examples we have genicicoms and geni infra what they areis the project has more than 20blueprints so if you like to deploy thisexample as I show before with all thecomponents and everything if you like tobuild an agent Q&A or a chat Q&A ordocumization or doc indexing and so onyou can go to each of those examples andyou have recipes or blueprints in bothin docker compos files or in kubernetesyou'll have the helm chart a bit ofexplanation on how on how it works butactually is the end to end example youhave here a sample on how to use it ofcourse you don't have the 100% optionsbecause if you have 20 vector databases20 lms you need to have I don't know 4060 Helm charts available so if you ifyou know how to work with helmcharts youcan easily plug and play which componentyou like to use so this is what you willsee in the genai examples if you like togo to the component itself because youyou will say I don't like this componenthow it is built i would like to addguard roles i would like to add anyother components you can go to geni comsand you will see the entire list of thecomponents that were contributed to toto the project divided by datapreparation by embeddings by vectorstores by third party you have multiplemultiple options for you to choose so ifyou like to build a specific componentand you would like to contribute it ofcourse you can contribute it or you thisis where you will see the options youhave to pick in your health chart andgeniinfraos is the all part of theevaluations geni infra is is actually ifyou are working with kubernetes ascubecon right h you can have terraformsfor some providers for instance let'ssuppose for AWS you have a terraformyou will have a terraform also um forAzure and you will see all the hemcharts by use cases here right so thisis basically the project and what I havehere to show real quick is this examplethat you see here I actually have itdeploy it let me yeah so if I would liketo see the bots this is a cubernetes umdevelopment and these are the bots thatare running of course I didn't mentionin the in the main example the UI so itgives you an UI it gives you an engine XI didn't mention the in the engine X andbut of course we need an engine X toexpose those services and also there isa mega service which is actually theorchestration orchestrator thatorchestrate how the communication isdone between between the containers Iwill just show one and each of thosemicroservices they perform one functionright so let's suppose I would like tosee what the embeddings are doing whichis actually converting ingvectors words or phrases to vectorssince I cannot go directly to thisbecause they are all exposed through theengine X I will need to connect first tothe engine X and from the engine X Iwill just curl to see to the embeddingmicroservices and I will make a questionhere what is what was deep learning andthis actually will pick the text andwill convert the text to a vector andthis is thevector it's a big vector right so fineit's a512 i will not do to any of them becausewe know we don't have 20 minutes to todo it but just to let you know that eachof those components and what OPAprovides is the specification on howthose components should talk each otherand how you should build is thosecontainers someone that most companiesor partners or customers are doing isthey are picking this project and theyare building applications on top of thatthis is kind of the infrastructure layerso you can build if you like to buildyour own application for healthcare orfor finance probably the vector databaseis where your information or you will toyou need to use a specific model so thisis where you start to pick this modeland fine-tuning or to provide somethingas a service i don't know but this isthe project this is the open sourceproject forthat so letme let me go back tothemirroring allright so finish the demo cool coollots more that can be done here there'sa lot more sets of instructions forexample those Lego instructions that Ilove to work through that are very clearguaranteed on what you're going to getwhen you're done uh there's a lot morethat you can contribute to here yeah andone cool example is the prediction guardthat was contributed by a partner andthis is what we love like submit yourRPR if you like to provide somethingthis is a great example that wasprovided now the project has guard railsthanks to this to this personlots of other opportunities i just wantto tell you some of the working groupsthat are currently operating right nowwe have a developer experience workinggroup end users evaluation that's theone I really like uh community workinggroup we have a research working groupand also a security working group yeahand the security is growing i meanwithout the confination computingconfinition containers this work is aspecific this working group isspecifically working on that so if youlike to participate submit an issuecontribute provide feedback again is anopen source project as every open sourceproject has improvements to be done soum this is basically the goal what we'vedone if you like to put your hands on wehave we created a workshop that is inAWS um so it basically lands for you anEKS cluster and deploys the example foryou it's a guide it's very educationalworkshop it explains all the thing thatI explained for the microservices itexplains how is the architecture buildand so on so this is the first one weare trying to make to all the serviceproviders like for Azure for GCP h havethis kind of same examples so if you putyour hand if you want to put your handsin you can you can go therewe just want to build a safe place a lotof us are new at this a lot of us arenew at these types of applications wewant to experiment we want to see whatwe can get out of them with ourorganizations this is a chance for youto do that easily with a set of thingsthat you can already use right out ofthe gate yeah um please connect i meandon't be shy we are we are open to talkto connect go to both boots or connectus on LinkedIn Twitter X um you have theQR codes here we didn't mention at thebeginning we should mention that at thebeginning so if you want to to connectthese are thelinks um don't forget to take your cardsto the booth and you need both when youhit the Intel booth you need the cardand the filter the filter the little redcard you'll get at the JFro booth sohave fun with that thank you so much foryour2025-04-15 21:59:33.418945fhe model repository right you can mountand from your local host on this simpleexample but it can be anywhere right inthe distributed system additionally youset up service right listening to theport and where you want to get and uhlisten and receive the request like hereright we have GIC protocol port and aswell as the HTTPone so the third step right you have togenerate some workload and send theworkload and a very nice tool is calleda performance and analyzer it cangenerate the request sent to the tritonand the inference server it's alsosupport tensson flow and the torch seras well so they're all different modeand uh you can and send the request andfrom the like the specified concurrencylevel and the specified the read or youcan also the interval right how you wantto send and the the request also how youwant to measure the performance here issome sample and the output right howmany requests you send also the latencydifferent percentile percentile 50percentile 99 percentile and detail andalso the different concurrency level sothere are tons of differentconfigurations you can try and uh checkout the the reference for how to usethis and performance and analyzeryeah if you want to run a containerizedand a client generation this is verysimple example file you can see and uhthe the most important thing you specifyhere is you want to send a gc or httprequest and also which model you want tosend the request to as you see and earlyso for the Triton server we can populatethe the Triton inference server with abunch of model here specify this andrest that 50 as the model and alsorequest here is very simple just specifyI want to send a constant and request200 and request per second okay anotherthing I want to mention yeah recently aspart of the performance uh analyzer andthey are new and workload generatecalled GI PF so it's a specific tool formeasure and do the benchmark for thegenerative AI models so it's still underactive development but please check outif you're interesting it can provide alot of the metrics and the data for theJI specific and uh workloads yeah likethe time to first token and then theinter latency between tokens are thething it could be very and help helpfuluseful tools to understand the behaviorof this and JI and the modelinference so as part of the benchmark weknow right monitoring tool and otherthings is important so of course if youare familiar with the GPU setup the mostuseful or you should consider to useNvidia SMI Nvidia system managementinterface to collect the data and it canshow you this is A100 which each loadhave eight and GPU devices right likethe GPU utilization temperature memoryusage or other thing another usefultools I used and from open source calledGPU stats I think it's a very and usefultools can show you real time right theeach of the GPU devices temperature GPUutilization as well as the memoryconsumptions so you can try it so I alsolist some of the referencehere okay so use cases actually we usethis and Triton and the inference serveruh my colleague Kevin Cruz and I gave apresentation last CubeCon at CubeCon soclick city and we do a benchmark studycompare different GPU share strategylike time slice ing and MPS multiprocessservices compare the trade off and uh uhdifferent sharing strategy for differentworkload so if I interesting check outand we gain a lot of the insight byusing this and Triton uh uh benchmarkand uh inference server so okay so nextlet me see hopefully I still have timeokay I will try to give a live demo evenI recorded and I know live demo is riskybut uh I think it's probably moreinteresting so let me see if Ican make it workso I hope can you still hear mehello okay so now anduh this is actually on my uh workstationso like I mentioned you need andpopulate a model first so here I alreadyknow populate a bunch of the models andlike I show you Ernie and uh the modeland uh you can see each model right anduh yeah there will be a model like thelabel andfile yeah whatever and uh yeah youdefine it then under the model and youhave I I I only for this one have aversion one right if ygou look at that'sthe real model right it's quite big anduh Yeah that's basically what you needand we have a bunch of model alreadyhere so now and I deploy this modelright and uh as I showearlier yeah this is the YAML file verystraightforward and there are alreadyimage provided and in our Nvidia andregistry and you specify the the mountof the model and a bunch of metrics youwant to connect it and specify the portand other thing okayokayas you see it's running okay let's seeand uhOkay that's the the log of the serverTriton server as you see here a bunch ofthe model already populated right wehave like uh uh nine or different modelsalso with hue right it's listening tothe different port for different requestokay now let's see the client anduh okay the client was very quite simplethis example one right you just run alsotheir containerized version you just runthe performance and uh analyzer specifythe port you want to send also the modelname And I want to measure the 95percentile and use GPC protocol requestrate is 200 right okay let's seeand usethe yeah Nvidia SMI you can see theutilization here right both and arerunning and alsoyeah that's the tool you can you you cansee because I send the client send aconstant request so the GP utilizationis 21 22% yeah I just run a very shortperiod of time it's done now if you goback to look at the output in theclient okay you will see this and thedetail of the like this one of coursethe batch size one you can specify otherthing how many request is greater sendlike 3,600 request or differentpercentile of the latency and throughputor other thing so this is just the basicset of the benchmark and hopefully youfind yeah it's quite straightforwardthat containerized the version youpopulate the model you just start atriton server you can use thisperformance analyzer to generate aworkload collect the detail of theinformation okay so now I will anduh switch back and give it to Chen topresent theOkay the foundation model performancehello um so today I'm bringing a projectuh initially uh developed by uh IBMresearch and we uh actually open sourceit in uh the open source community andlater donated to the uh serving workinggroup uh to consolidate like opensourcecommunity effort so as you already seethat you showed the general workflow ofthe benchmarking process you need toprepare the server deployment you needto prepare the load testing uh jobdeployment you need to collect the datanecessary data so this framework isexactly help you to ease that process byuh providing you a pro uh programminginterface in Python and let's go throughthe details of it so um what is fmpperfright uh it's a python basedbenchmarking tooling library designedspecifically for uh evaluatingperformance uh efficiency energyefficiency of uh large language model orgenai serving frameworks such as the uhserver VM server or TJ server uh whichis the IBM folk version of TGI serverand then why we want to use FMerve uhbecause it kind of provide you a simplebut very powerful Python API so you canuh so it allows you to deploy inferenceservers as Kubernetes deployment andservices without the hassle to defineall your YAML deployment uh service uhYAML etc and then it gave you the simplePython interface to create differenttypes of load testing and within theKubernetes or open shift environment uhso why we target Kubernetes so of coursethis is KubeC everybody is concernedabout Kubernetes but the initialobjective was is um Kubernetesinvolvement actually allow you to doconsistent and reproducible uhperformance benchmarking and testingacross different types of serversdifferent types of load testers and alsodifferent types ofinfrastructure so this is the high levelarchitecture of FMF basically you have amodule called FMerve library which umyou can use uh in a simple Python scriptto configure what server you want tobenchmark what types of load tester youwant and then uh how long you want torun the experiment right and then umbehind the scene it actually used theKubernetes uh client library to allowyou to to deploy different serverdeployment uh creathe correspondingservices mounted the necessarypersistent volumes and here wespecifically uh by default it will mountthe uh slashmodels for your model um uhfiles and the slash request fordifferent types of data traces you wantto do the load testing and then um forthe uh for the server deployment rightnow we already support uh VM and TJS andwe expect to support more like Tritonand uh other servers for example SGAN souh here is a list of initialcontributors from IBM research i reallywant to thank uh to them i'm the uh oneof the first adopter of this uh FMPF uhlibrary it helps me a lot in terms ofbenchmarking uh performance acrossdifferent hardware uh servers anddifferent versions of the servers foroptimizationsso um key features of FFM perf uh firstit's um automated deployment usingsimple python code I will show theexample and then it provide uh a simpleload testers for different scenarioslike homogeneous workload scenariohegenius workload scenario and also uhwe have some models behind the scene uhbased on our real production traces uhso you have more realistic workload uhtesting and then right now it supportmultiple inference servers and we willexpand the list uh in the future andthen we extract uh LM specific metricsas Yan already mentioned the importantthing uh for example time to first tokeninter token latencies throughput andthen uh in addition we actually collectthe metrics from GPU as well using thetool uh you mentioned so we have someenergy efficiency metrics uh availableas well and uh because you have a simpleuh Python library you can write yourscript and then it can also help you forexample you are developing andoptimizing a server framework and youcan also uh use this library to writeyour own Python script to integrate withyour CI/CD pipeline so every version ofoptimization you did you can kind ofcompare and automate the performancebenchmarking for your serversso um I will then uh dive into how canwe start using it is very very simple wejust need to define three types of umPython classes the first one is clusterof course they support local and remoteuh Kubernetes servers as long as yourclient laptop has access to the allconnection to your clusters uh you candefine it using for example the uh loadcube config uh in Kubernetes client toinitialize the uh cluster uh class andthen the second one is workload specworkload spec allows you to define likewhat load testers you want to use forexample here we use the default FM perfuh uh load tester uh image but you canchange it to uh Triton client librariesor other types of load tester librariesand it it provides three types of umworkflow specs including homogeneousworkload spec uh which just defines theuh fixed input output tokens uh for thestressing uh testing purposes so it'smore consistent and reproduciblebenchmarking use cases and then you candefine the heterogeneous workloads backthat allows you to uh plug in differentrandom distributions for your uh inputoutput tokens for example and then it'sideal for testing performance under somediverse request patterns and then againrealistic workflows back is uh somethingwe contributed to the open source fromuh FMerf uh it really uh learned the thetrace patterns from our production logsand then we provide statistical modelingfitted to those uh production logs so uhyou see more realisticpythons so the third one is model specof course you want to understand whichmodel you want to benchmark and thensome configurations of models whetheryou want to enable uh for example thequantization uh some differentconfigurations in model compilations andum uh we specifically right now supportuh TJS model spec and VR model specbecause they allow different inputargument configurations for the serversand um of course you can also uh definefor example what GPUs you want tobenchmark this uh server on uh what isthe CPU and memory allocations you wantthe deployment to be so those are thejust three simple uh class definitionsor declarations you want to have in yourPython script and then the only functionyou need to uh call is called runbenchmark so basiically you pass thecluster model back workflow spec intothis function and then you configure howmany times you want to repeat theexperiments and then uh how manyconcurrent users you want to emulate orsimulate in your benchmarking and thenhow long you want to run the experimentsabout and then uh the all the FM perflibrary will help you go through theworkflow of deploying a model servergenerated the workload and runsevaluator parts to uh collect the dataand finally you just need to check thedata so and then uh for the for theresults we actually provide two types ofuh results one is the CSV uh summary ofall your experiments uh in differentreputations what are the statistics ofuh for example throughoot prefilllatencies um inter token latencies andtune latencies etc and then if you wantto get into the details we also have uhthe the load tester pod will actuallywrite some detailed information aboutthe time stamp of each token and then uhthe uh the the inter token latencies ina JSON file so you can later review allthoseresultssookay so compared to popular those uhpopular benchmarks you may heard of likeML perveum uh the the distinct capabilitiesFMerve is providing is really umKubernetes integration so Kubernetesinvolvement allow you to do the repeateduh reproducible results of benchmarkingand then for example ML perf is morefocused on giving you some accuracy andthe latency constraint and then letdifferent uh submitters to uh improvetheir optimizations uh to to to showwhich server has better performancehowever uh those are not not necessarilyeasy to reproduce and then LM Perf umrun load test again the existing uh umservices especially commercial servicesso it doesn't give you the choice todeploy uh your own um optimized serverversions and then fmperf is really uhfor yourself to compare for exampledifferent versions of your inferenceservers and different versions of yourhardwares uh to actually give you anidea on whether you are uh improvingyour server performance or you are uhyou got some regressions in your serverperformance so there are also a lot ofother open-source um um benchmarkingtools from uh different communities uhwe actually in the uh working group ofserving we started all of those and thenuh we decided to consolidate the effortinto a standard way uh to contribute andthen FMerve we actually donate FMP tothis community so uh we help othercompanies or other contributors to notrepeat the same tooling process likecreating YAML uh for for the deploymentand jobs and then um so the the uniquepart is uh we um we we focus on the codefirst approach so we provide codelibrary to easily launch the benbenchmarking and then uh we we want tothis class to be extendable tensible sowe can easily extended to more types ofservers right and then we focus on uhkubernetes uh nativeintegrations so I will show a quickdemo on how we can easilyuse fmperfSo here we uh have some local foldersmounted to the current cluster uh on themount models and mount uh request andthen we first just check if the uh theboth volumes are properlymounted and then here I I will show asimple like VM uh benchmarking script soyou can see all you need to do is todefine the clusters back workloads backinitialize the cluster right configuresome security context andthen configure how long you want to runthe experiment and just run benchmarkthose are all the things you need todo then we run it you can see like inthe cube control get parts monitoring uhwe already uh started creating the uhdeployment of this VMserver and then of course it takes awhile for the VM server to boot up andwhen it's ready it will continuechecking your workload backconfigurations and startum anotherjob right here is the FM generator sothis workload spec is using the uh thethe native FM loadtesters and after configuring what dataset you want to import into your loadtesters we will launch a pod calledevaluator the evaluator is really justsending the request to the server andthen summarizing theresults so we actually launched twoexperiments here one is emulating oneuser keeps continuous sending requestthe other is two concurrent userssendingrequest so here are the results of thesummary this is the CSV um and then youcan configure a parameter calledrepetation um so you can conduct thesame experiment multiple times with thesame configuration you get so you getvery robust performance benchmarkingresults and then here is a detailed uhthe detailed result JSON file that givesyou each tokens timestamp uh decodingtimes amount prevailing time stamp andhow long it takes so this is really aquick demo and the the demo video isavailable on YouTube as well um so as Imentioned so we we actually donate thiswe actually donate this uh library tothe inference perf uh project and rightnow it supports the python library itsupports the VRM uh model server and TJSmodel server and then um I will ignoreall of those as uh you may already heardit from uh uh from the working group ofserving talk yesterday and in the futurebecause uh the serving is not just aserver if you think about it I it's ainference cluster and then we in theopen source have a lot of uhorchestrators inference clusterorchestrators like production stack AIbrickor even Dynamo right so in the future wewant to expand this uh benchmarkinglibraries to not only uh benchmarkingthe server performance but alsodifferent use cases like long context QAsummarization agent and then our focuswill be expanded to uh not only theserver performance optimizations butwhat kind of routers performance weshould have and then whether we can uhbetter reuse the KV cache etc okay Iwill hand it back to Yan okaythanks Ch and for giving so yeah we arerunning out of time i just quick mentionif you want do some GPU and intensiveworkload benchmark the GPU burn is avery useful tool you can try it out it'sopen source also have the containerizedversion unfortunately I don't think Ihave time to demo it but verystraightforward you can specify how muchmemory and uh it will consume also howlong you want to run right just try itand you also can use the monitoring toolI mentioned earlier like the Nvidia SMIor GPU stats to collect all thisinformation we also use this benchmarkto study the different GPU sharingstrategy in our previous cubecon pleasecheck it out so I will skip the demo soalso there are other benchmarks andtrain already compare and uh uh I'd liketo highlight It's a few of them verycompre comprehensive set of the MLbenchmark for all the machine learningHPC and generative AI is called ML proofbenchmark also and recently Nvidiaannounced the DJX cloud benchmarkingservice please check it out as well sookay to summarize our talk uhbenchmarking is essential to forunderstanding optimized AI workload sothe key takeaway is there already avariety and a set of tools available andalso benchmark and software so we shouldtry it but moving forward I I I think wewe need and probably and provide thesoftware tool can do all this morerealistic end to end performance rightit's not just about training orinferencing and it's the end solutionseven data processing post-processingother thing another thing is not onlyright the synthetic workload for what ifanalysis also can replay and some realtraces that's important also simulationemulination are used for technicalfor deeper and insight and finally Ialso want to mention right DI is the newAPI to request allocated GPU and youshould take a look at it and uh finallyI want to highlight the AI workload anduh not only about performance rightreliability and feed recovery veryimportant how to support that andevaluate this and feature and we be acritical features hopefully communitycan work together to improve it finallyall this tool you have seen so far rightare publicly available and definitely weshould make it open source and thecommunity can use it and work togetherto enhance and developing the newfeatures so yeah we list a bunch of thereference gear and uh you can take alook so I will be and at NVIDIA booththis afternoon yeah also we're availableand after the talk so please come talkto us and uh yeah any question commentsor feedback thank you very much2025-04-15 21:59:34.286676 �x�!#��'ACK7Il4ZiqTAokay time is almost up and let's startUh good afternoon everyone Welcome to behere It's my pleasure to uh have thisopportunity to share the testing oflarge scale edge computing using KU edgeas well as the work we have done toensure the stability at a scaleUh before starting the presentation letme briefly introduce Yani and uh myselfYani is the uh software quality engineerof doc and she is also the chair of siktesting of kubage community Uh my nameis Ba uh and I workedk��[� #��mAOnqzoBf7dUEgood morning welcome to our session i'mYen Chen fromNvidia okay let's echo from another roomuh I'm working and with the Nvidia DJXcloud to build Kubernetes optimized AIinfrastructure platform and for AI andGPU workload hello I'm Chen uh a seniorresearch scientist from uh IBM researchi've been a very active contributors tokubernetes and six kubernetes includingthe autoscaling scheduling community andnow uh I'm contributing to the workinggroup of serving as well uh like thisbenchmarking toolokay so in today's session we are goingto walk through how to run modelinference benchmarking use the Tritoninference server and uh FM performancetools we will also cover how to run theGPU intensive workload in this talkwe'll also cover some of the monitoringtool like the Nvidia SMI and anothertool called GPU stats uh we'll brieflyand uh overview give an overview of allthe benchmark tools hopefully and uh youwill find this session helpfulokay let me start it and uh so firstlythe Triton inference server is verypopular and uh inference server and it'sopen source tool can run and uh a bunchof different models and uh from like uhthe typical and uh uh all this andpopular models and you can use it itsupported x86 and ARM also provide uh uhsome client library and the toolsincluding uh running on the edge andmobile devices uh also recently Nvidiaannounced and advanced version calledDynamo it extend and uh optimize theTriton uh with advanced features forimproved scalability and performanceso in order to set up a triton uhinference so there are three and keysteps the firstly and you need to createa model repository and populate and withall kind of different models I willcover next the some details how to dothat then the second is deploy theTriton server instance itself also andTriton provide and there are some clientor workload generation tool and theperformance uh collection tool uh coreperformance analyzer can collect sendrequest and collect all this performancedata so I will go through each of themone by one so for model and uhuh deployment and Triton support and alldifferent models from PyTorch and uh tolike the standard OMX models and not VLMso some example models like Nama andFCON and other thing so if you look atthe the the structure of the modelrepository basically and you canpopulate and deploy and all differentmodels so each model is a subdirectoryyou name and the models under each modelthere are configuration file also youcan have multiple versions for each typeof models so for example just one twothree name it so on the right side hereis the examples for example you havetext detection model version one rightyou have another one is text recognitionand uh yeah under this directory is thereal model file and uh yeah there arethe Triton supports like I mentioned alldifferent and uh a variety and uh uhtype or format of the models so this isyou populate it and the models so a lotof models and available online you canjust download it and uh yeah store anduh in a storage or filesystem so the second step is uh set upconfigure the Triton server and we havecontainerized the version so the Tritonserver right and have some client API reuh receive the request it support boththe HTTP and gRPC protocol then insidetheir flow process request send back thetheresponse so this is a sample the YAMLfile and very simple and the mostimportant thing right is you set up theTriton server ports it support both thelike the HTTP protocol which port welistening to also the RP GIC the port sothen you specify most important thing istel at Huawei now anduh I'm currently also a maintainer ofKubage community Um uh is a beauty dueto the scheduling conflicts Yi is unableto attend this conference in person Soshe interested in me to share thistopic Okayuh let's talk let's talk about some uhbackground uh with the rapid developmentof 5G networks uh industrial IoT AI andother fields edge computing has become atrade leading the digital transformationuh future scenarios such as smart citiessmart transportation smart health careand intelligent manufacturing ing arebecome more familiar to people Uh andedge computing has garnered muchattention Uh Gartner has said uh pointedout in that in 2023 the number of the uhsmart devices at edge is more than 20times the that of traditional IT devicesand uh uh by 2028 the integration ofsensors storage computing and advancedAI cap capabilities in edge devices willgrow steadilyuh due to the diverse types and largestnumbers of IoT devices the increase inthe scale of IoT device uh connectionsbrings significant challenges to unifiedmanagement and operation and maintainersUh at the same time users of the Kubagecommunity have also impressed demandsfor managing niscale edge computing edgenodes and edge applications For examplein the high-speed tour station projectbased on Kubage nearly 100,000uh 100,000 edge nodes and over 500,000edgeapplications are connected at highwaytour stations across the country As theproject involves the scale of edge nodesand applications will continue to growThe vehicle cloud collaborativecollaborative management platform builtusing Kubage is the first cloud edge andintegrated architecture in theautomotiveindustry enabling rapid software upgradeand integration integration for softwaredefined vehicles In this platform eachcar is treated as an edge nodes and thescale of edge nodes is expected to reachmillions Uh before dive into how Kubageachieves large scale management andtesting We let me first give a briefintroduction about the Kubage projectThe kub edge community is the industry'sfirst cloudnative edge computingframework and it is designed for ageedge cloud collaboration in in edgecomputing scenarios Uh it focuses on uhproviding a consistent experience forcloud age resource collaboration datacollaboration device collaboration andintelligentcollaboration Uh so far has get over8,000 stars and 2,200 forks on GitHuband uh uh we also have more than1,500 contributors from more than uh 100country organizations worldwide andthere there is also a exciting news inOctober last year Kubage has become theCNCF graduatedproject Okay Uh this is the architectureof Kubage which covers the cloud edgeand devices Uh which means in otherwords cloud edge and devices areconnected together seamlessly withKubagearchitecture Uh in the cloud uh Kubageis built upon the native Kubernetesuh we don't make any changes toKubernetes So users can use the nativeKubernetes APIs to manage edgeresources and anduh in the cloud we have a comp cloudcomponent which called cloud core andthe corresponding C uh component at edgeis called edge core Cloud core at thecloud will list watch the uh res thecluster and get the results you uhespecially the edge resources and sendthe message to theedge Uh we the cloud and edge the cloudand edge cause uh com connection isusing the uh cloud edge messagemessaging channel uh def defined inkubage Uh it is a stable uh channel Wewill introduce itlater And the uh for the edge core forthe edge core we integrates anightweight kublet uh we it retains thecore compilities of kubernet whileremoving features are not used on theedge The native kubernet interfaces arefully supported at the in the edge coreenabling the management of edgecontainers use CI integr interface Onthe right is the devices side uh wherewe support device devices uh based onvarious protocols can connect to theedge core and uh to communicate with theclusterOkay first we need to define uh theservice level indicators and the servicelevel objectivesuh which me which called SIS and SOSSince Kubage is developed based on thenative Kubernetes we reused someKubernmetes defined SEOIS and SOS such asAPI coencyuh the API core nancy uh demand the SLOsuch as requires 99% of latency to beless than one second and for longstreaming read only APIs the the P99latency should be less than one to uh 30seconds depending on the scope Uhadditionally we also defined the portstand up net latencyuh uh requiring the latency for statusports to be less than five five secondsonly when our testing results meet theSEO can we con that our cluster meet thecorresponding scabilitiesrequirement Okay uh in clust inKubernetes clusters there are manyresources such as nodes ports namespaces and custom cids Therefore scascalability cannot simply be equal tothe number of uh the nodes in actualkubernetes clusters Various dimensionsare not comp completely inindependent Uh stretching one dimensionoften compresses others For example ifwe deployed a large number of ports itin a cluster it might name it way to uhcreation create the otherresources Uh so we need to choose thetesting dimension according to ourspecific use usagescenarios and Kubernetes also recommendssome scalability dimensions andthresholds For example Kubernetessupports a cluster with 5,000 nodesand 150,000 ports However the scenariosuh faced by Kubage and Kubernetes arenot diff are different Uh testing alldimension is challenging So uh for thistest we choose aclassic application scenarios for kageuh based on meeting the LCL SLI and SL Owe we tested a single class manager100,000 edge nodes and 1 millionports Next let me introduce some ourtesting tools The first one is the HMAKuh HMAC is similar to the cube mark andis used for scalability testing ofkubageclusters In the di diagram there are twocl clusters The Kubernetes masterclusters uh represents the actualtesting cluster and the right onecluster HMAC Kubernetes master uhrepresents the virtual uh cluster Uhwhen we depo deploy the HMAC ports uhthe port start and it will register as avirtual edge node in the HMAC master Sowith the edge mask edge mask edge markertool we can uh we we can uh use limitedresource to simulate simulate largescale kubernetes and kubage clusterSo uh use this tool we can expo exposecluster management issue uh only rightarise in ncllscenarios and the another testing toolis the cluster node tool It is anopensource kubernetes node testing tooluh it tests the SLIS SLOs's defined byKubernetes to verify uh whether thecluster meets the v various servicequality standards Uh cluster node 2 alsoprovides v vronized data for clusterissue diagnosis and performers optimoptimizationuh it generate a class a Kubernetesclass performance report with detailedresults for various performancemetrics and this is the final deploymentmodel for the test Uh the Kubernetesmanagement place uses a single masterdeployment with etc coup API servercoupe schedule and kuba container Uheach deployed as a single instancesinstance Uh the kage management plandeploy deployed the five instance ofcloud core The master node IP connectsto the coupe API server and thesourceboard node balancer exposesservice Virtual edge nodes can rerandomly connected to the one of thecloud core instances with the nodebalancer and the right is some versionof of our uh the tester in thekubernetes and the k edgeUh here are some specific configurationused in the testing including the uh QPSand burst par parameters for variouscomponents such as Koopa schedule Koopacontroller manager and uh uh since cloudcore aggregates and processes messagefrom edge it has many uh resourcehandling workers config here So uh wedefined some uh such as update portstatus workers here If you want to knowmore about the detailed configurationyou can refer to Kubage scalabilitytestingreport Okay this is the uh the finaltester results The chart shows the APIlatency matrix compared to the SOS weintro introduced earlier For example thelatency for modification APIs is lessthan 100microscond and the read only API latencyis less than 1 second The result meetthe SEOrequirements And uh this is the uh retesting result of the port standup nancyUm here uh the the diagram you can youcan see the nensy for each stage of portstandup and operation The top row isshow is the end to end port standupnancy with uhP99 uh of 4,000 and4,087 microsconds and the SEOrequirement of the port standup uh Nancyis 5,000 micro uh microsconds So we meetthe SEO'srequirement The over the overall testresults shows that both API core nancyand port standard nancy meet the SEO'srequirement uh defined by the KubernetescommunitySo we can say Kubernetes and Kubagecluster can stablysupport 100,000 edge nodes online andmanage over 1 million portsuh in actual in actual productionenvironment due to the network securityuh partition management and other issueas edge nodes and the cloud network arenot always connected with each other andmay connector on demand for maintainersBased on the online or offline ratio ofedge nodes the scale of the edge nodesmanaged by a single cluster can beexpanded And additionallyuh by using data shiding technology tostore different resources in thecorresponding ATCD storage even largerscales can be achievedIn the remaining time let me explain howKubage achieves stable operation at ascale Uh first Kubage uses an efficientcloud edge message channel As we allknow in native Kubernetes cluster everynode must next watch its own resourcesthe ports assigned to it and all servicemetadata With the increase in nodes andthe ports in large scale cluster thenumber of the nest watch requests growsuh growsvery grows very rapidly So uhsignificant increase the node on the kubapi server Uh kuba edge uses a by b bydirection by directional multi multipostcloud edge channel that supports websocket and quick protocolsEd core initiates connections to cloudcore and cloud core will need to watchthe uh change to kubernetes resourcesand uh pro proactively push the metadatato edge core So the through the cloudedgechannel this is this will reduce thenode on the coupe api server andimproves the cluster performanceAnd and second kage ensures reliable andincremental cloud edge data transtransmission in scenarios with complexage network uh topologies and uh nocommunication and no communication faceschallenge like high ny and intermittentdisconnected disconnections and frequentand reconnections When the networkreconnect and edge nodes reconnect tothe cloudnarist request from the edge to node canoverwhelm the coupe API server Kubageuses an incremental data push push modeland uh clot core will uh clot core willuh record the latest metadata versionsuccessfully send to the edge Uponrecollection only incremental data willsend to the edge solving the problem offorest request and improving statstability in high latency low qualitynetworks Additionallyuh in the edge we have a we componentedge core on the edge side trims thenative kubernet remove unused featurelike uh inra volumes and cloud providersuh it compresses status information anduh uh optimize resource usage So eachcore require only 70 megabytes of memoryat minimum Uh this reduces the commcommunication pressure and assuresstable operations even in high latencyand high jitter environmentsAnd currently we have only tested ncalenodes and ports Moving forward we willfocus on edge devices edge cloud messagemessaging and edge service meshes Foredgespecific scenarios such as ncale networkdisconnections and high latency networksWe will introduce new SLIS and andSLOs's to evaluate service quality andconduct further nclltesting Okay that's all about mypresentation today Thank you for yourattention If you have any questionwelcome to visit the kubage both at 14Band communicate with us Thank you again2025-04-15 21:59:35.124042omance we'll tryto propose couple of ideas how we cannavigateand hopefully maybe we could uh diffuseit uh at the end I have couple ofresults uh from performance metrics sowe we could see how it turns out sooverall Kubernetes is awesome like youcan take your containers you can ship iti it it works really well like we havepretty big conference i think it'stestament to that but not always it canturn out how you can think um somethings are harder sometimes there areincidents but I think in my experiencethere are something there are some areasthat you're just destined to failunfortunately that are not welldocumented that are still a tribalknowledge that is spent betweenmaintainers and if you don't ask them uhyou might get uh hurt so for last yearI've been looking into the API serverperformance and hopefully we can we candiffuse some of the problems to uhtoday so the problem that I want to talkabout to you is the uh problematic usageor abusive usage or of CRDs andoperators so uh in one of the incidentsuh that uh I I was working on uh on GKEwe we had a pretty small cluster uh itseemed pretty pretty small relatively toevery the things that we run butstill customer upgraded as a operatorand everything worked for two weeks andjust when everyone want you alreadythought that we are done uh the memoryspiked 20 timesso what happened like how and how whywhy did did it spike every every likecouple of days and totally took down thecontrolplane uh and how at like 50 node clustercan can go down well Kubernetesapparently supports 5,000 nodes so since1.6 six Kubernetes runs uh a weekly testthat has 5,000 nodes 15,000 pots or150,000 pots 10,000 services andeverything so it should work right uhthe problem is Kubernetes tests its ownscalability focusing its own mainpipelines and its own concerns so whenyou go into CRDs you might find that theproject that you have picked from theshelf never tested anything beyond like10nodes uh and a lot have happened since1.6 so now everything is a CRD uh andyeah anyone can create a new uh whichmeans there can be a lot of uh thingsthat are lower qualityuh one ofthe maybe unique patterns aboutKubernetes that people forget is thatcompared to traditional application likeSQL uh you're in Kubernetes you're notinterested in a single user or a singleor some data related to this user uh inKubernetes we we treat we we still havea a database like CD but reality is uhthe the consumers of API are controlledcontrollers and controllers what they dois they reconcile so to reconcile theyneed to find old state understand itmake a decision and transfer to the newstate but they are really processingeverything that happens in in in clusterso it's not one to one user comes or oneclient comes and makes one request andthey fetch one one row of data you haveone controller that comes and fetches150,000 pots how do you handle it whatKubernetes did uh solike I could say or personallyKubernetes was originally not reallydesigned for uh large data uh or atleast for the CRDs that we have today uhthat going back to the incident itturned out that customer had 100 or 500megabytes of data in one of theirresources how do you how do you handlethat you have a controller or a clientthat every couple of seconds fetches 500uh megabytes of data and each of thoserequests was allocating gigabytes andgigabytes so even though uh Kuberneteshas limits has quotas uh each object canbe one megabyte or over one megabyte insize and there is you youyou can easily uh you can easily get thesize of a resource or size of all thepots or uh or CRDs or your config mapsto gigabytes in size and reallyKubernetes overload production doesn'twork here um AP uh uh APF so APIpriority and fairness mostly accountsfor the CPU usage and doesn't accountfor memory so those five 400 megagigabytes of data that was allocated wasjust coming from couple of controllersthat were consistently listing a lot ofdata and they were getting them but withthe cost of API serveroming uh so let's deep dive why uh howthose 400 gigabytes came out so firstwhen you make a list from the client uhbe uh kupbernetes client the request goesthrough API server is directly fetchedfrom the CD uh API server just transfersthe request from a list to a range whichis just a similar but different se alittle bit different semanticallyuh at CD then fetches the data reads itfrom the disk and serial uh desserializes it so it can uh prepare theresponse next you need to uh serializethe response into protobecause uses gRPC so it needs to writethe protobuff serialization it sendsthis data into the API server that thenreceives one like in this example half agigabyte of data and loads it also tomemory so now we have threeallocations uh if you want to use thedata you cannot use the just bytes or ablob of blob of bytes you need to decodeit so uh API server needs to decodes thedata and again which requires copying ituh to um yeah copying thememory uh sometimes depending on your uhrequest uh API server can in the storagelayer can do some filtering so that forexample when a node or a cubelet fetchesuh pots on its node will not return themuh it will not get back all the pots inthe cluster it would just get subset ofthem but still in most cases forcontrollers filtering doesn't reallyfilter it's just a pass through and atthe end to for client to receive the theresponse they need API server needs toencode the data back again so if wecount we have five allocations that's alot if you want to fetch a gigabyte youwill get five gigabytes an API uh ifthere you have multiple clients likethis and they'remisbehaving unfortunately Kuberneteswill not protect you and those 500 GBwere just 10 clients or 20 clients thatwere fetching this this memory and allocpushing API server to allocate andallocate uh to mitigate this Kubernetesuses something like called caching ituses caching uh if you want uh there wasa previous talk my madav uh about exactdetails about the how cache works butlet's look at it again uh quicklyuh so cache is a uh component in APIserver that has its own storage which itpersist uh which has uh which stores thesubset of states from CD and this uhcache is filled by API server just rrequesting all the the data uh data fromCD decoding it once and storing inmemory so cache has already readyalready uh has uh prepared a response ifuh for a request that is ready to be orit's already decoded and it's ready tobeused if uh because it we cannot do itfor each request and the data changes wecache also uh maintains an open watch toadd CD that will send uh updates on eachchange so instead of if we have half agigabyte of data still updates might bejust one megabyte because one pod changeor one change in the object will um willsend only one event which will have justone megabyte so we are now just decodingone megabyte at the time which is muchbetter than beforeuh so if there is a request from theclient we will just go to the cache andfrom the storage uh API server will justneed to encode the data one uh the dataso resulting in just oneallocations isn't it great like cachingis great we got back from five to justone allocation is it enoughunfortunately no if you look at the fulltable uh from all possible arguments inuh API I server only two cases usereally cache and two cases use cache butwithout ignore a limit so if you youcannot uh you cannot limit the data youcan just get uh gigabytes of response sothat's a lot of reds um making it veryhard for most users to to know what whatkind of performance uh they're gettingbecause they don't know if they'rehitting cache if or not uh and the topline is the default configuration if yourequest by default uh uh default emptyrequest without any parameters tokubernetes it will be delegated toCD um so this is a state of Kubernetes 1let's see if we can uh improve it souh so how we can uh if the situation isso bad is there any way that we couldnavigate it so if we uh if there is afull minefield of wrong decisions thatwe can wrong parameters that we can pickhow we can know what path to takebetween the the minds uh so first stepwould be reading the manual i I assumethis is CubeCon conference everyone hereread documentation Kubernetesdqocumentation everyone read your cloudprovider documentation because itbasically says you should not do uh whatI shown before you should not have hugeresources you should limit your sizes uhand this is consistent with therecommendation of six scalability ofusing list list watch pattern this isconsistent with scalability uh trying toset resource version using protobuff toreduce allocations uh GK also recommendsuh limiting the storage and I personallyI would say like you should you shouldreally watch out for the sizes of theobjects to make APF workbetter uh second option would be to runyour own scalability tests like if youcannot uh if you cannot read or knowwhere the minds are can you have a toolto to find it and the answer is yes youcan run your own scalability test toverify before you go to productionbefore you upgrade anything that therewas no huge regression and until youreally run uh a similar environment anduntil you validate the dimens dimensionsthat uh that you run in production youdon't really know there could be just aa a small bug or small change in a oneline change in the PR or betweenversions or one small change betweenversions of operator that you run thatcould totally go from using a cache tosending requests to CD resulting in hugeamount of allocations uh so I think weshould encourage all the projects inopen source to run to set their ownscalability goals and ask for budgetfrom CNCF to to do the testing like ifwe care about our production why weshould run scale test everything our ownit's an open source project we shouldagree as a community to set a commongoal that will be maintained throughoutall the releasesso if the two first options failed canwe can we diffuse can we diffuse andremove all the mines uh can we live withthe can we improve the situation andmake it available so uh to everyone soeveryone can understand the scalabilityuh and not need to to uh to read read abook about thisuh but before we go into fixingKubernetes let's uh uh first we need totake a look what is a resource versionso a resource version works as a uhlogical clock a global int 64 uh thatevery time anything happens to CD uh orkubernetes any write any update uh thisthis number will be inc uh increased sothat we know uh we have a single globalorder of everything that happens in thesystem uh this brings us to the semanticof the list so when you have a uh a aoperator or a controller or a userasking about a list they really askabout some particular meaning uh or someparticular snapshot of the time soKubernetes like uh and CD support notonly getting the newest data from fromlatest state but it allows uh they allowyou to pick uh pick historical values sosnapshots of the state from before uhyou could askuh you can ask Kubernetes about anystate of the word that it was aware sobasically uh ask hey give me anyresponse any from any time that you haveyou could restrict the staleness of thedata by asking that the response will benot older than some particular last uhtime so for example from last whatyou've seen uh you could also uh askabout the most recent so normally formost databases you usually ask for thedefault what is the state of the neweststate that is available and this is thedefault when you send a request inKubernetes and you can ask abouthistorical sit about state of uh ofKubernetes from uh some exact resourceversion which would be uh askingKubernetes to go back in time and checkwhat was a state of Kubernetes I knowfrom couple of minutes agothere is also a continuation which whichis um a mechanism that you need tosupport pagenation you need to be ableit's basically extension of the exact umsemantic that you allow not only askabout some older snapshot but also withsome um shift so you don't you don'tneed to read the same keys so you youwant to continue the list uh but from umyou want to continue list from some oldsnapshot but you don't want to see thesame keys at the beginning so you couldsay it's atraingmachine so let's look how thosesemantics work on the uh caching so ifyou ask about any resource version youwill get just any data trhat is currentlyavailable in the cache assuming thecache is initiated and ready to servedata uh at CD could be far be beyondforward maybe the watch hasn't yet beenupdated so the cache can be stale butyou will ask you will get a responsefrom the cache that's easy and cheapway you can ask about the not older thanand it will just make Kubernetes waituntil the resource version is uh in thecache is fresh enough and when it is uhit it will respond but here therequirement that there needs to be otherclient that will update the resource foryou so first there needs to be a clientthat will send an update and will makeincrease the revision in net CD thatwill cause a new event to go to thecache and update the resource versionhere to 120 which is more than onerequest at one uh revision uh orresource version sorry uh 100 and thatwill allow uh the weight to stop and youwill get back theresponse so we have two things that wecan easily serve from the cache whatabout the last three so those are thenewest improvements in Kubernetes uhsince 130 so let's look how we can uhhow those problems were solsolved to get mo most recent let's lookhow CD does it so when you if youdisable cache in Kubernetes um the mostrecent uh request will be just serserved from CD and CD will serve youwhat is the latest state in it so cachecould be stale but HCD will always knowwhat is the latest state of it so thisgoes to the back to the definitions orto the previous slide of um API serverworking as just as a passthrough but on cache side how do we knowwhat is the freshest data incd like wehave some version resource version incdhow we know this is the newest resourceversion in the in the whole cluster uhso to do that we need to first we maybewe could ask CD like let's ask CD whatis its newest version so we can make asimple uh very cheap request to tocd andask us ask it to give us the resourceversion and then we know that we need towait for uh if it responds with resourceversion 100 we know that we need to waituntil uh re revision hits 100uh and second to not need to wait forany other things in the system so anyother update we also need to uh do somefixes in the watch so when we know weneed we are waiting for resource version100 and current cache is not up to dateenough we can go to watch and maybe pokeit to to ask for progress so uh at CDfor long long time I think like since3.3 uh supports uh watch progressnotification so the client here APIserver can request an update uh on uhoncd if you are aware of the bookmarkingmechanism in kubernetes it's the sameprinciple just on kubernetessite so if um maybe there is no updatein HCD but still if we requested aprogress HCD is happy to give us uh thethe the confirmation that yes you are upto date maybe there was no event that'sand no maybe not any data didn't changebut still it can inform the cache thatit's up to date with the resourceversion and this allows us to uh tofulfill the check of uh minimal requiredresource version and get a response toourclient so we are done with uh the firstone uh or the third one uh and thisfeature is called uh consistent readsfrom cache and it's implemented inKubernetes 131 bydefault uh and another case that is muchharder is what do we do if a clientrequest about exact resource version sohere at CD can magically go back in timeand and understand if user is askingabout the old resource version and hasstill the data to to serve uh the clientso it will be able to send uh a resourceversion with uh for equal 42 but oncache side we don't have thisinformation we don't store all data wedon't know how how uh cache shouldshould how cache looked like fiveminutes ago so can weimplement that for uh to to do that weneed snapshots but before we go tosnapshots we need something thatsupportssnapshotting so in Kubernetes uh 32 weintroduced an a separate a newreimplementation of the storage forcache which is based on bit trees andbites are awesome so not only it brought25% performance improvement reduceallocations by uh 15% but alsointroduced a new superpower we cansnapshot and clone the cache so not onlywe gest one superhero we can get at anymoment as many as we want souh so like I said we can clone the cacheso if the so if there is a watch eventwe no longer need to update this thestorage and lose the historical pointlike previous version we can now getthis information and clone the storagebefore we apply a new changeand so and we can do it multiple timesso if we get resource version 20 we cansave it as a snapshot with resourceversion 20 if we get another event wecan clone this this storage and get andupdate it with with new event thatincreases the revision to 42 and now wehave two two snapshots and we can do itagain and again on every and this allowsus to have a full history of everythingthat happened all the snapshots of allthe states of CD in memory ready to uhdecoded from HCD and ready to be servedso if a client request about resourceversion uh 42 we can just simply go tothis snapshotand if we don't have resource versionbecause we I watch could be or watchcache could be not initialized or couldhas a cleanup mechanism and we might notbe able we will still delegate therequest to CD because we want to be wedon't want break our users we want to beconsistent with the previous behavioruh so this feature is called list fromsnapshot and it's now alpha in 133 whichmeans it's still not uh enabled bydefault and we are validating it uh itin our testing and we are validating itwant maybe you to help us validate it tomake sure that it's it's uh productionready but hopefully we can solve uh wecan bring it to beta and make it itdefault in34 uh so this solves for us the case ofexact and the continuation like Imentioned continuation just is uh uh uma case additional case for the exactwith additional preuh additional key uh or shipped in akeys that we read and this works up to75 seconds because we clean up the watchcache uh after after the events areolder than 75 secondswhich is enough uh for like 90% ofcases so can wecelebrate almostuh if you read JSON code or if you evertested API server even serving fromcache will require you to to do moreallocationsuh uh because JSON was not really alsodesigned to handle large data uh so ifyou send if you request from API server1 gigabyte or you will uh API servermemory will increase by aroundtwo so instead of having just one uhallocations in the encoder we'll getwe'll double that and that's also notgreat so maybe we could fix that uh theanswer is yes in uh there is a uh cap uhstreaming collection encoding in 133that is enabled by default so itimmediately went to uh to beta and itbasically reimplementation of JSON andprotocoding uh and this makes it reallyefficient uh maybe just to give you hopeuh we we do this custom this customimplementation is assumed to betemporary because Golang uh is workingon a better implementation of JSONencoder version two that will solve thisproblem but we want to bring thebenefits to you a quickie quick quickerso the next Kubernetes release will notrequire any additional memory forencoding at least for JSON uh and protoprotobuff so this instead of writing uhinstead of uh taking half a gigabyte ofpayload and encoding into one huge giguh gigabyte blob it will just streamobjects one by one making it makingallocations extreme much muchlower i promise performance results thisis our official benchmark for for thisfor all of those those efforts uh sobenchmarking a list has gone down from70 GB of memory usage to justthree uh so to summarize uh caching ishard uh it took Kubernetes a very longtime and still there is some work to dobut we will get it right and thanks tothat we will the memory will problemswill disappearbefore 131 uh you need the defaultrequests to Kubernetes needed uh were orwere served from HCD making it not veryefficient but thanks for to 131improvements now even default requestwill uh not require as much memory asbefore uh and streaming in 131 willalmost eliminate uh any memoryallocations in in Kubernetes and CD uhthanks Thanks to the streaming list andhopefully we'll get the uh list fromstorage working soon in the next releaseand that would make Kubernetes fullysafe uh from memory perspective so thankyou that's all from me uh this is afeedback QR code and uh we are open toquestions[Music][Applause]hi um I was just kind of wondering solike if we're having these noisyneighbor problems when we're using CRDsinside of the cluster like I guessthat's sort of the perspective that Ihave here where like you know the thememory usage here is exceeding ourexpectations and what the how the APIserver works like why do we encouragemovement towards like CRDs and sort ofusing Kubernetes as like a low codesolution for creating and managingresources than like externalizing thesethings like when should you andshouldn't you use CRDs I guessI think or extending Kubernetes isawesome and a lot of work has been putinto that it's still I would say uhthere was I think the assumption isthere were not many project that use CRDdo scale testing and Kubernetes itselfbecause it doesn't have default uh CRDlet's say example CRD that it will scaletest because it doesn't want toopinionate it then you get into theproblem CRDs are not scale tested and wedelegated to the other projects uh asfor doing it externally uh I like thereare options to serve uh extendKubernetes uh by running a separate APIserver or your custom reimplement of APIserver it just turned out to be too hardfor most users so this it works some forsome use cases like like metric serverbut uh but that but most users just wantdon't want to manage CD they don't wantto manage write their own API serverthey just want to inject so uh I don'tknow if we don't encourage it like itbut it's basically still work I wouldsay it's still working in progress andwe we need to catch up performance needsto catch up for the biggest projectsthat currently exist in in the ecosystemthank youso about the JSON uh encoding relatedperformance gains and memory gains Iguess um are you pre-enccoding the listresponse already or are you justconverting the pod list response to aJSON stream response like you know Iguess one object per line sort of thingso this is not relatedto object per line or maybe do you meanwatch list effort so at the end the um Iguess memory usage dropped a big dropthat you got in the API server was thatrelated to um JSON encoding yeah yesthat that is the results fromimplementing JSON custom JSON encoder isee so that JSON encoder basically is itlike just stitching the objects togetherin a custom encoder or is it like justnot allocating too much me i'm trying tofigure out why is it so before we hadone structure like list list pod listlet's say and within this pod list youcan have gigabyte of data within thearray that has items right so instead ofdoing that uh I I the customimplementation is basically detecting ohuh or trying to recognize what kind ofobject it is if it's a list object andhas this fields then we will do customimplementation of the at least the toplevel and it will justSo reimplement the JSON as you likewrite token by token and then when cometo list it will write each thingseparately so Kubernetes by by its ownsupports the writer uh interfaceu but and technically JSON too but ifyou look underneath the implementationJSON it just takes that whole liststructure and encodes it in memory intobytes and then passes it to the writerso which means it it doesn't do releasestreaming makes sense so inside thecache uh I guess watch cache are thepods themselves already stored asencoded JSON or are they like in just uhno because of uh conversions andeverything you need to you store them umdecoded from CD because it could be youcould store data incdro in now in cboardwhich is alternative uh also encoder uhso you and you don't know what clientwill request so watch basically youstore decoded objects like uh inKubernetes native structures and thenwhen you encode you you you base it onthe client headers and request2025-04-15 21:59:36.003339 ��;�$#��-AufY_JFPpzRIuh just a quick overview of the agendaWe're going to talk about autoscaling Uhwe're going to talk about something wecall make before break internally Uh andthen we're going to talk about what isthe context behind this migration andhow we actually ended up executingthis So for those who don't know whatclick house is uh it's an open sourcecolumn oriented distributed databaseIt's an OLAP database Uh so if you wantto store like pabytes of data and uhkind of do like really{��c�##��}A6l5zCt5QsdYhello helloeveryone well that's a massive crowd souh welcome to this session a hugecluster or multiclusters identifying thebottleneck we'll try to squeeze as muchas we can in the 30 minutes i hope someof your questions get answered else weare here for today tomorrow as well ummy name is Sam Paruk um I'm from LoftLabs uh creators of vCluster and I'mjoined by I'm principal developeradvocate um founder of um cube simplifybuild safe I'm a cubestronaut you canconnect with me at the sai impart on anyof the social platforms I'm prettyactive over there and I'm joinedbyI'm from China and I work in docloud uhI'm a member of kernetes steeringcommittee and also a crew maintainerYeahu so we have a small uh kind of um not aquiz obviously but we just want to takesome idea of where things are so if youcan just scan and uh there are just twoquestions if you can just it's aanonymous votes no sign up and stuff ihope there is no sign up uh so anonymousvote if you can just do it that wouldhelp us understand u��l�"#��ASdLLOcNZN5Eokay hello everyone and welcome to mytalk about uh Kubernetes API performanceuh I'm Marik Sharkovich i'm the SIG leadof ATC and uh and I contribute to alsoAPI machinery unfortunately Maddaf wasnot able to to join us today so I willbe hope I'm enough to talk about theperformance of API uh so the plan fortoday is explain what is the uh theminefield in the title we will try tounderstand on example why um why you caneasily uh get hurt if you don'tunderstand the API perfornvthe dynamics of uKubernetes clusters the size ofKubernetes clusters how many Kubernetesclusters you areusing all right we do have 44participants responding 39 that's coolokay we'll not we'll not take much timethough uh so we'll just see the resultsof where weare well that's a very good number of uhpeople having Kubernetes clustersgreater than 100nodes and out of 239 it's 40% which is alike very massivenumber let's go to the next oneand having more than 30 clusters in yourorganization well that's again a veryhuge number so everyone has those hugemulticluster and big cluster problemswell that's cool uh that's what we arehere for and let's get started let meshare the slides againyou can take thisuh uh this is a page from thekubernetes.io uh it's aboutconsideration for light clusters uh Ithink most of you may uh have alreadysee this uh uh here we recommended tofor uh 5,000 nodes uh this is a testednumber by kubernetesci uh by sikscalability and uh this this means thoseuh this member is tested already testedand meanwhile these numbers here uh havenot been updated for a long time so uhhere we have um some updates oruh and uh for the previous public blocksuh early days uh open eye has scalingtheir to uh two uh 2,00 and 500 nodes inuh2018 and uh later uh we can see uhthere's ungroup and openi and also thebayer uh they have about uh uh 10k or uh15k uh nodes uh for their cluster and uhuh later for bounce they use crew crewbrain to scale to 20k nodes and uh lastuh last year in kcon North America uhTKE has announced this scaling to the65kuh nodes and there's a demo yesterday ifyou see uh you are in their demolater uh and also before the before thiswe have uh asked a question uh and inthe Reddit uh this is a very hot uhtopic uh I think most of uh most of youmay uh want to know uh the bottlenecksand the pinpoints here uh and here'ssome feedbacks from the Reddits uh uhlike the events like cast scaling and DSalso IP tables or uh endpoint sync anduh uh also one thing is mentioned thatuh no node number is not the only uhgood meter for the class size uhsometimes we need to the total objectnumber uh which may be more accurate uhand also uh some mentioned that the forthe large cluster they need somemulti-tenency solutions that is one ofthe pinpointsuh here's are some uh general beneck ifyou uh want to build a big cluster uhlike the server benecks like the EDCthis is talked many times uh and also uhDS and storage challenges are uh one ofthe key uh issue on bottleneck and uhnode management and monitoring is alsouh a problem when uh at scaleuh so uh here's a pro uh here's thequestion why we want a huge cluster uhlike uh like our city uh there's uh inthe top top 10 uh buildings in the worldI think there's three uh or two inShanghai uh we we built the uh higher uhbuildings uh uh and also uh people likeuh huge clusters sometimes and uh thismay reduce the maintenance costs and uhuh centralized the management andenforce the policies uh and also uh uhyou can uh uh this uh when you when youhave multi rooms uh so your cluster iscan will not fail for the single rooffiller uh those are the pros and uh forthe cons you you can see that for uh fornative kubernetes cluster if you use thenip space it's not uh it's a little peruh tenant isolation and uh also you mayencounter the large blaster uh radiusand uh also uh some users may needdifferent versions so the end user oryour team may have have uh have notflexibility uh and and also you youshould uh do the uh version selection uhuh if uh as we only support uh uh forfor the control line is version n andthe node should be uh newer than uh nminus three so that's another problemalso uh the scaling challenges and theuh custom node configurations issomething uh is some sometimes is therequirements and also if you use someoperators they may have the version uhuh dependency so uh and and also thismay depends to uh on the uh Kubernetesrelease uh release period uh and thesupport well uh one version afterrelease uh the community may uh uh thelong long-term support is only about oneyear now and maybe it will be two yewarbut uh this will uh depends on the workgroup uh LTS uhlater and I think the biggest problemhere is the flexibility and the uh largeblasterradius uh and uh uh when you want to usea light cluster with challenges andsolutions uh uh this uh I give a list uhmaybe not details and uh we may uhmention some details of the uh solutionsin the following pages uh and maybe notall of them and uh we will leave thelinks and what we attach in the slidesand you can check it later uh so here weshould mention that for the API serverpressure we may use the rate limit andAPF and also cachions uh something likethat and for scheduling you uhscheduleuler you should uh choose aproper uh schedule for you and also dosome testing based on your scenario andalso the disc uh disculer and uh uh foreach city uh it's one of the uh key keyproblem we we should uh address uh thiswill be mentioned later uh and for ETtuning and the the heartbeat andsnapshot related and ET also there thereare some some EDCD replacement like crewbrain like Kane and uh this will betalked later and also you may have someuh red cache like casper uh or somethingelse so you uh this can uh makes thepressure lower and uh also uh uh youshould uh test those performance of allyour management components and operatorsuh and also sometimes you you should uhsee your GPU and uh different GPUs andand CPUs attackctor and uh for thenetwork uh uh uh CI is uh is one problemand gway and also DS S is uh sometimesyou should uh uh no you uh you shouldnot uh you should use some cache orscaling or or some uh replacement likeuh host alias for some uh sometimes tomakes the DS server uh in a good stateand also there's uh IP tables change uhthe current uh we have the NF tableswhich will be G in uh next release ofKubernetes uh this this month uh andalso you should take care with the podIP range and also your node IPs and alsosometimes if for the large cluster itcross rooms you you should uh uh knowthose uh affinity or something like thatuh and for storage uh it's a big topicso I may not mention much about storagehere uh and also for autoscaling uh thecarpenter is joined the kubernetescommunity and also uh for for users whouh the scenario HPA cannot cover you mayuse koda uh and uh also you should uh domore test with the large cluster beforeyou extend your uh cluster to more nodesuh so quark is a good choice and alsothere's group mark uh when you want tohave a real glet quark is with uhuh fake hlet or something like that souh but uh this we will mention later aswell there's some details and also forlight cluster you may have a localregistry like harbor or some CR proxythere's many choices here uh I mentionedseveral solutions but this is not all ofthem uh as you can find more in thesense landscape uh and here's someKubernetes community updates the uh Ijust uh uh I talked with uh Wtech uh hehe has a keynote which is a virtualkeynote so I I think uh many people maymiss it so I give the uh keynote linkhere uh he he he talks about the uh 15uh thousand nodes cluster feature ofKubernetes uh 2021 or 2022uh and here's a list of caps which isrelated to uh the light clusterespecially the ECD servers overrides uhfor sharding uh this this really helpsuh and just I talked with ECD manager uhhe mentioned that the uh if use thiswith the node list uh uh the list is forfor separated uh ECD uh cluster then youthen the cluster can uh be around uh uh30kuh nodes and and uh uh below uh beforein 20 uh 19 we we have the nodeheartbeat heartbeats and also uh laterthe watch bookmark and uh uh the endendpoint slice is also very earlierchange and uh you can see the list uhthere's many changes in recent years uh2023 and 2024 and also in recent uh inthe latest uh release uh in the currentrecycle is uh one um 33 uh there aremany changes which is related to uhcaching or streaming or stuff somethinguh uh like that uh there's many updatesrecently so if you want to build a largelarge cluster uh maybe you can try totest on the uh recent versions and andalso the crew proxy uh NF tables will beGA in this releaseuh this this is about etc etcd uhx forthe native etc there's some improvementsespecially in the 3.5uh uh Wilson and Justin has an uh videoabout this uh for the performance changeof ECD this version and uh uh you shouldtake care that there's a minimum ofrecommended EDC version for 3.5uh you should use the patch uh after sixuh because there are some criticalissues uh and and uh recently the Googlereplaced with Spanner they can uh havemore uh morenodes uh and uh uh before we talk aboutthe replacement solutions we can uh wecan check uh how can we reduce thepressure on ECD uh first is about theAPI server uh use the limited render uhand use the resa and also some readcache uh you can see the uh attackctoruh those those things can uh help you toreduce the pressure for example if youhave a demo set which connect to APIserver and with uh a lot of uh API callsit it is the API call is expensivesometime for the demo set so we shouldavoid using uh it or uh and also uhthere's some validation policies andlike OPA criminal and uh uh uh uhvalidation or something like that uh toreduce the EDCD uh pressure and uh justmentioned EDCD overrides the serversthis is generally used in for events anduh uh now we we can see the list is oneof the uh pinpoint so sometimes we useit for list leases and uh in some caseif your operator or you have the uhcustom uh API you you may use it in uhfor your uh customer API group uh thisalso helps and sometimes the port is oneof the uh largest uh uh object it it canalso you can use more than two uh ECDclustersuh and uh then about KLET this is not uhquite related to uh uh because Klet willuh will connect API server so their intand QPS API burst may you may need totuning them and also the node updatefrequency uh this is for the n list uhand for ECD there's a tuning page uh inthe official website uh you may have theuh uh better uh uh disk and also thenetwork also the heartbeatins uh and for the replacement groupbrain is a is a uh uh is a open sourceproject for uh for by dance uh they theyuse it with 20k uh nodes uh and theywill have a talk in kon cloud nativeChina in Hong Kong uh this join and uhuh uh but there's some uh current theproject has some limitation because uhbecause some uh API related uh I forgotthe details uh it is only supportkubernetes1 and al also there Uh if you want abetter performance with their solutionyou need a special patch which is notopen source yet but they plan to opensource it later uh and I hope uh andlook forward to their update uh so wemay wait for their updates lateruh and for other solutions like Ken uhin last kon North America the there's atopic for with Ken and uh this is aninteresting topic you can check uh butKen is always used in some uh edgescenario so for the uh large cluster wewe need sometest and also you should uh uh uh beforeeverything you should test your uh testyour uh cluster with huge clusters butit needs so many resource so quark is agood choice to uh fix uh quite a lot ofnodes this is used uh quite uh in manyscenario at first quark is used in thescheduling uh performance test uhyesterday there's a talk about this uhuh so you can you can check if yourscheduleuler uh works well on the largecluster uh and also quark is used insome uh uh UI related uh uh project likeuh uh the the head uh headlamp uh alsoyou use the quark to uh mock some uhsome uh some nodes and some status uhand also you can test your controlplane uh so uh so the key takeaway hereis uh okay uh the key takeaway here isuh the replacement solutions and thecommunity updates and uh also you neededto do the performance testbefore and what's more uh multi-tenencyand multicluster uh is uhthe popular popular questions and whynot multi small cluster uh to uh thismay uh be mentioned by uh my co-speakeruh so I I I think uh uh this is similarto the question why uh we use a lightcuster it it's just uh uh vice versa andalso uh last year there's a multiclusterapplication management radar uh thismentioned the projects related tomulticluster uh so you can check thoseprojects for your solutionawesome uh so I hope you remember thisparticular slide um liyke the poor tenantisolation uh large blast radius uhreduced end user flexibility um and theversion selection stuff so let's try tosee if we can simplify that or we cansolve this challenges so that with thepoints specified for the huge clusterthis can also uh work like we can findsome solution for the cons which arethere so what we are doing right now isu if we have three teams three differentteams who want to apply uh or deploytheir applications everybody does youknow cubectl apply hyphen f theirdeployment.l AML file it talks to theAPI server of the um Kubernetes and thenit creates the replica set um you knowand the deployments uh for each team uhthe problem here is sharing yes we aresharing which is good uh we are tryingto share the same host cluster but theproblem here is it's the same API serverthat is getting hitand what do we do so we create somethingnative which is called Kubernetesnamespaces so in Kubernetes there's aconcept of Kubernetes namespaces and inKubernetes namespaces there are a lot ofthings that you can do like there are alot of resources which are namespacescoped uh for example if you can seethere is port security um you knowpolicies which are port securityadmissions uh enforce um restricted umthat you can do there is network policythat you can uh do to restrict thetraffic from one namespace to anothernamespace and you can curate it in a waythat a a tenant in a namespace A and atenant in a namespace B do not talk toeach other at all but they are able totalk to the API server uh so this issomething it looks like uh and thenthere is other like resource kota uhlimit ranges all that stuff is namespacescoped um so what where we are now umwhat we are doing and what people aredoing and what you have specified in thesurvey as well uh there are manyclusters there are big clusters and thenthere are many clusters for differentteams uh there's cluster per teamthere's cluster per environment and thenfor each Kubernetes clusters that iscreated if you want uh certificates ifyou know if want uh all that things soyou will be installing search manageryou will be installing ingresscontroller all over through all theclusters because there's no other wayuh key issues with the namespace basedapproach is uh it becomes very difficultwhen it comes to different or multipletenants let's say team A wants threenamespaces team B want two name spaceshow do you distribute the cube configfiles how do you do the cluster levelresources crds is a big pain and how doyou test different versions of CRD let'ssay uh you want to test out uh CRD uh ofdifferent versions for your operatorsyou want CI testing uh on differentversions of Kubernetes clusters that isanother issue which is there uh sodifferent version of your applicationson different versions of Kubernetesclusters is also a pain which isdifficult to manage with uh namespacesthis is a very interesting spectrum uhthat I would like to cover um which isthat this was um image was created bylearn k8 so credits goes to them so uhif you if you see we have uh on the lefthand side uh we have name spaces that wealready discussed on the right hand sideand on the next as we move on uh so thenext part is namespace as a service uhwhere what it means is all theconstructs of the name spaces like theresource kota limit ranges networkpolicies are grouped and then they areintelligently provisioned um for exampleyou have the projects like uh capsulethat can do that hierarchical namespaces um yeah they're able to do thatbut it's not that much maintained um andthen you have the uh Kubernetes uh APIuh as a service uh where you have acertain type of proxy in front of theAPI server that does the filtering ofwhat user is allowed to see and what isnot not allowed to see for example uhfor the previous model that I mentionednamespace uh as a service it's difficultstill for the CRDs because you can onlylimit by uh the namespace scopedresources and when I'm talking about theKubernetes API as a service kind ofscenario which is which falls for the umthe third layer so you have the capsuleproxy uh over here kcp doesz some somesort of this as well falls into thisparticular category this helps you uh inhaving those CRDs as well limited to theview of particular team let's say team Aonly needs to see uh certain customresources team B only see to needs tosee certain persistent volumes so theywill be able to do that it's a bit hackyuh but yeah it works um and then youhave uh let's talk about the dedicatedclusters first so there is a thing uhwhat you can do is here dedicatedcluster means still each user gets theirown cluster but uh it's it's stillexpensive and um uh but it's more of thelike strongest isolation that you needso if you need hyper strong isolationsyou need to go for the kind of separatecommunities clusters so here um thereare some uh projects uh that helps youdo that uh but here you have to havethose control um the for example youhave karmmada over there uh you havecube slice so what happens is if youhave karmmada as a cluster so you willbe having a control karmmada controlplane as one cluster and then you willbe having multiple um teams multipleteams will be having multiple kubernetesclusters each cluster will be having anagent that connects back to the karmmadaand then the deployments happen in thatparticular flow so that is one way andthere are multiple tools in this um thisspace as well on the on the right handside now two interesting pieces whichwhich is very interesting for thekubernetes multi-tenency one is controlplane as a service internal one isexternal here what we are trying to sayis uh running control plane as a pod sowe are running control plane as a podcoming to the external one where you seeu you know more of the kama g or thevcluster pro or uh k0s moton or uh thehypershift uh so uh you will be having amanagement control plane in themanagement control plane you will becreating the control plane pod then youwill be having the worker nodes that youset up for each of the worker nodesyou'll uh connect those cublet uh you'llhave the cublet and stuff and thenyou'll connect back to the control planeuh pod in the management cluster Nowthis becomes one cluster if you wantanother cluster for team B what you dois you spin up another virtual machineinstall cublet over there connect to thecontrol plane uh you create anothercontrol plane pod for team B on thecontrol the control plane uh cluster andthen you are able to connect now thissort of scenario is uh useful whenyou're talking about like it's mentionedexternal when you are providing kind ofkubernetes as a service when you want toprovide that as a service it this modelkind of works well and uh the last oneis where you don't h don't want anotherkubernetes cluster you have existingkubernetes cluster and you just want toreuse that uh so you have the controlplane pods running on the clusters fordifferent teams and they will sync theresources is in the uh in the hostcluster so that's how it is that's thecomplete spectrum um so these threethings we need in multi-tenency with thecomplete spectrum one of the tool uh oneof the pro another problems is uh theresource sharing so even if you have allthe stuff that I mentioned previouslyall the teams we do not want them tohave separate ingress controller managerand all those components at least thereare certain set of components whichshould be shared across uh you knowcertain policies falco however noenginex um controller search managerwhich is there on the host cluster and Ido not provision them again that was oneof the pain points of the huge clusteruh this is where I would like tointroduce one project um and obviouslywe cannot discuss all from the spectrumso one is vcluster what vcluster helpsyou do is v cluster is a open sourceproject that helps you create virtualkubernetes clusters uh on top of thehost kubernetes cluster and uh it's thecluster that you get it's a cube configfile that you are getting and the modelis same uh it runs a control plane poduh which is running as a stateful set onthe host cluster and then when the teamgets it let's say this is team A team Bso team A will be only able to deployworkloads to team A's cluster and notable to see team B and they get a CNCFcertified distributed Kubernetes clusterteam B will be able to deploy anythinguh team A can have Argo CD version V1team B can argo CD version v1.2 team Acan have a Kubernetes cluster withversion 1.31 team B can have a clusterwith Kubernetes version 1.32 team B candecide to upgrade their Kubernetesversion to1.33 team A can say I do not want toupgrade it to 1.32 or 33 so CRD testingyou can have your different versions ofuh operators that you can install ondifferent virtual clusters and rememberthis is the same big cluster that we aretalking about but we are having multipleclusters so it solves many of thechallenges that we discussed if we havethe uh bigger kubernetes clusters sothat's how it works i talked about thesyner when I was explaining that so thatthere is a syncer component and most ofthe tooling in that area like K3k uhvcluster uh they will be having thissyncer component so um or a way to syncthe resources so in vclustluster whathappens is uh like you said a user whenthe user runs the command it goes to thecontrol plane of the virtual cluster sothere's no performance overhead and uhas soon as you create everything there'sa synch that copies it on the hostcluster and then the regular schedulingtakes place um on the host cluster whichis there so I have a quick demo uh it'lltake just one minute um you can you cantry the demo out on your own as well letme just showthat oh okay I need toclose this yeah so cubecdl getnodes so I already have a three node uhcubernetes cluster which is running uhwhat I do is I'll just create vclustercreate demoert um and just pass avcluster yaml file what I want to showhere is how the vcluster is created andhow the behavior looks like right now ithas started to create and if we see hereand cubecdl get pod a so we have all thepods which are running this is the hostcluster so as a as admin of the hostcluster obviously you see all the podswhich are running um but what ishappening is you can see there is astateful set so this is the controlplane pod that is coming up on the hostcluster for the virtual cluster that isuh being created so vcluster is beingready the pod uh is in the status so itis you knowuh hoping to run yeah it's in thecontainer creating state and soon whatwill happen is suddenly I do cubectl getpod hyphen a and I don't see any otherpod i only see one pod which is thenetworking pod why that happened thathappened because uh the cluster uhcreated a new context and itautomatically switched to the newcontext so that's the cube config filethat you can hand over to the team soyou can apply u theapplications you canapplyissuer and you can create and you see onthis cluster I haven't installed ingresscontroller I haven't installed managerbecause I have synced those in thevclustery file apply hyphenfcertificate and cubectl apply f ingressthis so I hope I have deployed theissuer certificate ingress and theapplication yeah all these componentsare created so if we do cubecdl getinguh it it might take some time for thecertificate to propagate but at leasttheum link should work let me see if I canscroll back into the screen okay it hascome and if I paste well welcome toCubeCon2025 thank you uhand so in the end we just want to saythat uh there is use that there is thereare huge clusters benefit for that thereare small clusters benefit for that asPaco mentioned um there is fine-tuningthat can be done for the huge clusterswith different set of components thereis multi-tenency with the completespectrum which has different tools andcater to different use cases so based onwhat is your requirement what do youwant to use uh you will be using fromthat particular spectrum one of thetools I showed vluster vcluster we havea booth uh of vcluster if anybody wantsto talk more you can find me in my fancyjacket uh and at the vcluster booth i'llbe happy to answer any questions i hopethey don't kick us out because we arealready four minutes over uh we'll bearound for the questions until the nextspeaker comes in and thank you so much2025-04-15 21:59:36.766990| fast aggregationsand analytical queries on top of themyou can use click house it's open sourceuh we also have a cloud version uh whichuh our team is also part of managingthat cloud and it's got all the fullymanaged features serverless idlingseparation of compute and storage uh sowhat happens here is like uh we havepods running inside these click houseinstances uh and these pods are attachedto PVCs that have metadata in them theactual data is stored on object storageas is the trend these days with uh youknow fully managed serverless databasesUh and then we have like a bring yourown cloud offering if you don't wantdata to ever leave your uhnetworkSo uh like uh most uh uh people who wantto manage like uh something onKubernetes a service or even a databasewe started with like a stateful set uhuh started with our operator which inturn is managing a stateful set which inturn is of course managing the pods anduh they have the PVC attached tothem Uh I'm going to do some like a lotof context sharing I'm going to rapidlydump a lot of information here becauseuh the the background context for themigration is actually a talk I did lastyear at CubeCon Paris that's calledfantastic ordinals So uh it's going togo really quickly So if you uh kind ofif you have trouble following pleasecheck out that talk because in that oneI go really in depth about the problemwith stateful sets and then how we cameabout solving those problems Right Sothis is just a quick plug for thebackground context Uh okay so uh reallyquickly autoscaling So autoscaling ingeneral in Kubernetes today requires podevictions That's because uh every timeyou want to resize a pod you want tochange a request or the limits you needto kind of uh you know restart the poditself And that is done inside our cloudusing the autoscaler right Which isresponsible for evicting the pod andthen a controller is going to go overand then resubmit that pod In this casethis controller could be a stateful setcontroller could be a deploymentcontroller whatever right Or could ifyou're managing bare pods yourself couldalso just be your owncode And this is a general high levelarchitecture of when we do autoscalingSo we have recommendations we haveuserdefined resource limits For exampleyou want to have a minimum size or amaximum size And all that information isgoing to uh come in uh the autoscaler isgoing to react to that information goingto start doing pod evictions And thenwhat's happening here is that you cansee there's a mutating web hook when thecontroller actually resubmits that podUh the web hook is going to interceptthat pod spec and it's going to mutateit And then for example let's say uh thecustomer decides that hey I want to havelike an 16 GB pod instead of the 8 onethe web hook is going to be responsiblefor mutating that pod spec and then whenthe pod gets submitted it's going tocome up with the new size and that's howyou do vertical scaling inside uh youknow that that's how basically thestandard VPA alsoworks vertical scaling so the problem isuh thus right so what we want invertical scaling is we want to have likeuh let's say three replicas of one sizeand then you want to go uh to adifferent size for all of thesereplicas and when I say replicas I'mtalking about the database replicas andall and if you If you're familiar withPostgress or other kind of databasesthis is not that kind of a scenariowhere you have like a primary withreplicas following All of these uh nodesare equally capable of ingesting dataand reading data Uh that's because it'sa multim masteraster kind of a setup ifyou want to think about it in that inthat direction Uh so break firstvertical scaling this is what's standardtoday in Kubernetes Uh that's uh slowand it's disruptive And why do we whatdo we mean by big break first That'sbecause if you maintain a pot disruptionbudget uh in our case for example wehave a max unavailable equals one you'realways going to have to do a rollingrestart right so uh the image I showedear earlier here like when the poteviction is happening you're not goingto evict all of your pots at the sa}metime to do vertical scaling rightbecause otherwise you'll have downtimeso you're going to do it like as safelyas possible you're going to start withone replica you're going to wait for itto safely terminate all the longrunningqueries if you have a backup running onthat part you're going to get rid ofthat and then uh gradually you're goingto restart all of that and that processis extremely slow and painful It's goingto put more pressure on the existingreplicas which are suddenly taking a lotmore traffic So if you think about itthe exact moment when you want morecapacity is the exact moment whereyou're taking away capacity because youwant to resize and that's terrible andnaturally we maintain some extraoverhead on our pods to combat that uhuh that extra pressure in terms of CPUandmemory So uh and by the way just quicklywhy is this possible Uh why is this theonly way to scale today That's becauseof some limitations with stateful setsAnd I'm going to refer to the talk I didlast year that go really deep into thelimitation of stateful sets So the theway to solve this is to do somethingcalled make before break which is verysimple right You add capacity uh interms of like uh like these extra threenew parts and then you take away theexisting capacity the old replicas youjust take them away and that's reallyfast right Because the minute customerwants to have like extra capacitybecause maybe their utilization isclimbing you can just really quickly addnodes uh without disrupting anythingbefore So there's no process of rollingrestart that's happening here It's notgoing to put any extra pressure Ifanything it's going to give youtemporarily even more capacity than youneeded and then you're just going totake away the old old replica onesSo this is what we wanted to do and theway we did it was we just implementedmulti-stateful set because the vanillastateful set as I said has some certainlimitations because of the way ordinalswork It cannot do this sort of so thisidea this very simple idea is notpossible today using a single statefulset So we had to kind of refactor ourclickos operator to do multiple statefulsets and this is how it looks like Sothe idea is to have one stateful set forevery pod Uh so before you you we had asingle stateful set and the transitionwas actually this right So each singlestateful set is responsible for managinga pod it by itself and this gives us alot of control and flexibility and thisand now we have the cap capability to domake before break So this is how it workSo we have three stateful sets managingthree pods You kind of do a makeoperation which adds a new pod As youcan see it's a bigger size You break theexisting pod that one goes away Rightvery simple idea but it's not like Isaid not possible with stateful sets Sowith all this context we're going totalk about the migration Now what whatdo I mean when I say migration Theproblem is we had we now have twodifferent uh code paths inside ouroperator and we have thousands andthousands of production clusters whichare still using this old uh approachRight So we want to move all of our uhcustomers uh from this approach to thisone And this is the migration that we'retalking aboutYeah And there are some of therequirements of this migration which isyou cannot have downtime Uh uh you stillwant to continue to serve it's anorchestration change right you're not uhof course you're uh in future you youwant to do make before break So that'san extra capability that you want togive the customers but at the same timeuh downtime was basically unacceptablefor us we need to do some orchestrationsome organization you know a lot ofExcel sheets there Uh a lot ofvisibility for our customers Uh and ofcourse the database itself is not likeI'm talking about pods as if they'relike you seamlessly going in and out Butuh the database itself was not preparedto be this elastic So there were a lotof challenges uh that we uh encounteredalong the way And of course uh beingable to roll back or even mitigateissues as they come into while migrationis happening for critical customers thatwas als~o like an important requirementfor us So again the migration itself wasalso like an MBB style migration whichis you have a single stateful set whichis managing let's say three replicas andthen you immediately add three newreplicas as three three new statefulsetsright and then you synchronize all thisreplicas So uh the sync step which isbasically copying over of the catalogand some database internal operationsthat are happening and after thesynchronization is complete you can justbreak the single stateful set that goesaway and now you are kind of left withthree single stateful So you go from theold way of orchestrating to the new onewhile at at no point in time did youhave a downtime you always had at leastthree replicas available in the clusterand you were always servingqueries So that's kind of the generalidea of how we uh designed uh thismigration flow and there were multiplepieces of this puzzle right So we uhlike it's not enough like this codecannot live inside our uh Kubernetesoperator uh because it's a one-timeoperation per cluster and then it it'lljust be throwaway code So we needed tocome up with a way to design is to havelike sufficient decoupling and thensufficient orchestration as well So wehave an like an API which is kind ofresponsible for doing things like uhsetting up your autoscalingconfiguration setting up your idlingconfiguration Uh we use temporal I don'tknow how many people are familiar withtemporal Hopefully many of you are Uhit's an orchestration system that weused uh to kind of uh you know handlethese migrations as they happen inbatches And we wrote a custom migrationcontroller which was responsible fordoing all the things we couldn't doinside our own operatorSo quickly uh maintenance this was avery simple idea that you put a clusterinside maintenance mode where youdisable certain operations So if youthink about it while this migration ishappening while you're adding andremoving replicas for this criticaloperation you don't want the exist thecustomer to kind of suddenly uh you knowdecide oh I want to scale my clusterbecause all the all those operations aregoing to be influencing what's happeninginside the pods anyway So you want todisable those operations but you don'twant to disable inserts and selects Sowe implemented something we call partialmigration uh which basically removed oursto virtual service and then uh uhactually it didn't that was the onlything it didn't do right uh but itdisabled uh from the autoscaling and theidling backups perspective all of thoseoperations were disabled but the insertsand selects were stillrunning and like I said we used temporaluh this is a really powerful tool uhthat came in really handy because itgives it gave us durable execution wehad to write a lot of business logic tokind of you know uh save the initialstate of the cluster perform themigration which changes the state itselfand then reset the state after themigration is complete uh to whateverconfiguration uh was uh before themigration even started We had to dothings like you know catch configurationdrift if there's a human interventionthat's required send a notification toslack uh you know uh pause the migrationwhile some someone on call goes andchecks that those things So a lot ofthose things were made really easy andpossible with temporal because of itslike seamless way to kind of do durableexecution So all the inputs and outputsare preserved failure detection retriesyou get all of these things basicallyout of thebox and this is how it looked like So wehad like a parent uh workflow which isbasically like a glorified cron job ifyou want to think about it in a simplefashion It's like goes and checks uhwhat's happening uh what clusters are inmaintenance mode and what clusters areuh eligible for a migration today Solet's say there are 20 of them todaybecause you have you you want to do itlike in small batches You don't want todo it uh you know immediately You don'twant to migrate thousand cluster That'sgoing to cause capacity issues in all ofyour regions Uh so every day you likehave a select a certain small batch ofclusters go over them and kick off likea child workflow uh for those clustersAnd this this workflow is going tobasically uh uh look at our uh clickhouse cluster custom resource It's goingto start reconciling and kick off somemigration controller jobs Uh we're goingto go into what migration controller isall about because that's really theheart of what is doing these migrationsBut this is the orchestration a veryhigh level orchestration view of how weactually manage thesemigrations I'm going to hand it over toJamie to talk about the controllerThank you Thank you Manish I justcurious how many people know click houseIf you could put your hands up becauseI'm going to be talking about some veryclick house specific things during thisOkay A fair number of people Uh so yeahfirst we'll start with the migrationcontroller and discuss a bit the actualimplementation of that and how we didthatSo I just want to first highlight thatthis was a test for click house in termsof being elastic in its scaling in termsof being cloudnative adding and removingreplicas as you saw from these examplesof make before break and in specificallytwo things in click house One is what wecall syncing the catalog which you canthink of as all your databases and yourtables and the other thing is the actualdata Now as Manish mentioned in thebeginning on our cloud product that datais in S3 but there is metadata uh at thetime of these migrations at least inreplicated merge tree that is on a localEBS volume and if that data is notsynced to other replicas before you dropa replica you could have data loss Intheory the data is still on S3 buttrying to piece together you know whichblobs on S3 are for which customer wouldbe very difficult And the other thing isit was a test for our implementation ofthis make before break in our KubernetesoperatorSo starting with the requirements wewanted something simple where we couldfor example add a label to a CR usingthat temporal workflow and thismigration could start If needed we couldreverse this process and we could havetwo states at the same time So we haveour single STS code path and ouroperator We have the multi-sts code pathWe wanted to live in both of thesestates and ideally not implementbackwards compatibility in our operatorto be able to reverse this to handlethese two states And so we came up withthe idea to have a second controller amigration controller that takes oversome of this work So at a high level wehave a CR that's reconciled by twocontrollers now which is an unusualpattern in Kubernetes So we had a mutexto ensure if there was a reconcile stillhappening from our regular Kubernetesoperator that finished and there wasonly one reconcile happening at a timeAnd so we have our single stateful setreplicas and our multistateful setreplicas managed in some senseseparatelyNow that's not quite true So I'll give amore honest picture on this one The waythat we actually implemented ourmigration controller is we embedded thereconciler of our main Kubernetesoperator into the migration controllerbecause a lot of the logic to forexample add replicas sync that state thestate of click house remove replicasthat's going to be the same in themigration controller So we actuallycalled those functions from themigration controller and called theoperator when needed But any specificmigration logic we didn't have topollute our codebase in the operator andmake it backwards compatible We couldjust put all that orchestration andlogic into the migrationcontroller Now just to summarize that wecan think of the migration controller asa sort of wrapper controller over ourKubernetes operator with additionallogic to orchestrate things and we canhandle all those specific things So wemake we sync and we break and thatprocess with two different singlestateful set and multistateful setreplicas and being able to reverse thisprocess is completely handled by themigrationcontroller Now I want to go into somechallenges when we actually started toroll this out We wrote a lot of testsand we wrote integration tests unittests m tests We have somethi�ng internalcalled stress house We migrated a lot ofinternal instances But when we startedto migrate customers we encountered someinteresting issues So the first one I'llgo into is an implementation problemthat we had in multi-STS and the way wewere implementing this in our Kubernetesoperator So I'm showing you here a tworeplica cluster on the left and we do amake we add another two replicas on theright and you see here that the firsttwo replicas are in zones A and B andthe new replicas have been put in zonesC and C Now as many cloud serviceproviders do we try and maintain uh azone balance We have three availabilityzones and we want no more than a maxskew of one Now currently this isactually uh completely valid You have ab c uh you have two replicas in c butthat's not max of more than one Soeverything's great But now as we'vedescribed this process of breaking theold replicas we have to break thereplicas that are in zones A and B Andnow you're left with two replicas inzone C and C And now this is what wedon't want So we now if zone C goes downwe loseavailability Uh a similar problemhappened to us with idling with in termsof our zone balance So as Manishmentioned we idle clusters that don'thave usage and so they scale down tozero Now when you unidle those instancesand at the same time for example start amigration or do a horizontal scale outyou can have replicas schedule in a zonethat at that time is valid but thereplicas that existed prior that have aPVC and are not scheduled once they aretrying to schedule they won't be able tobecause now the the zone isimbalanced So how did we fix this We wefixed it in our operator by zone pinningand tracking the zones It's alsopossible to solve it in other ways moreKubernetes ways you could say with ummatch label selectors for example but uhwith the way our operator is writtenthis was a convenient way for us to doit Now I'm going to go into some of theclick house specific issues Uh so clickhouse has a nice feature where you canhave some external system like postgressor rabbit mq ornats and you can have atable engine in click house itself andat time of creation it makes a lot ofsense for click house to check that thisexternal system is available when you'retrying to create that table So you tryto create a postgress table engine clickhouse checks is that even can I evenreach this uh postgress instance thatyou're trying to create the table fromNow when you add a replica to ClickHouse and you sync the catalog it triesto create those tables And of course itwas unable to because in some cases thecustomer no longer has that externalservice or perhaps they've just shut itdown temporarily But now it makes a lotless sense to not be able to create thattable because you're not really creatingit So this was a problem we encounteredwith the migrations and something thathas made ClickHouse more cloudnative andelastic So these are now secondarycreate queries that don't throw the sameerrors and do the same validationbecause they're not creating the tablefor the first time but they're justsyncing it to a newreplica Similar but completely internalto click house Materialized views is apopular feature where you have somesource table and some destination tableand you have some trigger uh that willmutate that data into your destinationtable and at the time of creation againyou want to check that that source tableexists But at some point someone mightdelete that source table but still wantthe materialized view which iscompletely fine But again click housetried to do validation Does the sourcetable exist when syncing it to a newreplica and that failed So that kind ofmade click house again more cloudnativeNow this this one's quite interesting Soagain we've got a make before breakhappening On the left we have our oldreplicas On the right we have our newreplicas Uh so we've got our singlestateful sets numbered S1 to S3 andwe've got our multistateful sets M1 toM3 And currently they're they're bothactive receiving customer insertsEverything's great We now we're doingthe sync step So we're trying to we'rerunning a command system sync replicalightweight which syncs all the partmetadata from the break-in replicas toone of the new replicas So we see we'rerunning it on M2 in this case but we'regetting all the data synced from S1 toS3 but we're also getting the datasynced from M1 and M3 in this case Nowwhat's the problem with that Well acustomer could have continuous insertsand they could be inserting into M1 andM3 when you run this command Andpractically this command would probablytime out and you'd have to rerun it abunch of times In theory if you hadcontinuous inserts forever this would bean unbreakable replica You would neverbe able to succeed this query And so youcould never finish the sync step So whatdid we do We added a from modifier intoclick house So you can do system syncgraphical lightweight from in this caseS1 to S3 which is our single statefulset pods And now we don't need to waitfor those inserts that are going to M1to M3 because those replicas they're notgoing to be dropped now They're they'resticking around Everything'sfine DNS issues Yes we had DNS issues inthis migration So click house has adistributed DDL queue Uh onclusterqueries go into this So for examplecreate table oncluster create databaseon cluster these kinds of queries And atthe time these queries are written thehost names of the replicas available inthe cluster are recorded in this queueSo for example we've got S1 to S3 arerecorded in this case Now let's say itwas unfinished on S1 and we then droppedthat replica which is fine because itsucceeded on the other two replicas wedon't need it to succeed on all three ofthem Uh but now click house when ittries to read that query from the queueand execute it it's unable to resolve S1that replica doesn't exist anymorebecause we dropped it And so this was anissue that uh we encountered trying touh you know deal with DNS issues Sosystem table copy this is something thatwe actually found late in the game ofour migrations So system tables are areplica local table There's many of themYou have log tables and non-log tablesAnd these are used for introspection Soyou might want to look at the query logto see your history of queries You mightwant to look at the metric log You cansee CPU and memory statistics and otherthings to see what's going on in yourdatabase Now these are completely localto the replica as I said And the problemis with this make before break patternyou're actually going to lose thosesystem tables You'll lose all thathistory And many customers use this forobservability and general introspectioninto theircluster Now how did we solve this So forthe migrations themselves we had a stopgap solution where we literally selectinserted from the break-in replicas tothe new replicas and just copied thisdata Obviously you can imagine thisdoesn't scale very well and we couldn'tdo this when we do this for generalautoscaling purposes in customersinstances And so the fix for this thatwe implemented was a new disk typecalled S3 plain new writable that movesall this uh data and metadata onto S3and allows you to attach these discs Soyou effectively do a zero copy attach Soyou just attach the discs from the oldreplicas to the new ones instead ofcopying this dataSo I I'm going to end on this idea thatmigrations are kind of a boring thing Ithink generally and they're quite hardand it's something that many many of usas software engineers have to deal witha lot of projects of software is youbuild something new and then you need tomigrate the old thing to this new thingand I think it's quite challengingespecially if you have thousands ofcustomers each with their ownrequirements and you need to make surethat for example if they don't want tomigrate on a particular date even ifit's a live migration like we hadthere's still concerns there's yourproduct reputation what if somethinggoes wrong so just leaving on thatthought so thank you uh if you want tochat to us we've got a booth you cancome over um if If you're interested inlearning more we also have a blog postand and we're hiring2025-04-15 21:59:37.597368�d find somethingthat was more cloudagnostic and the big question is can wehave an infrastructure abstraction so wecan consider different regions um in inthe same way and of course I mean I'msure you know the answer the answer feltuh very clear at the time it was it wasKubernetes right and the idea is thatKubernetes offers us a unified way tointeract with infra right you cansoftware packaging in a unique way withcontainers uh you have a unique way todefine application rollout strategiesand updates and you have a way to defineresources that will be god agnostic andof course uh this was working on all theproviders we were considering at theSo so far I mean our applications wererunning in in in nodes and what wewanted to move to is to move for toapplications uh running in containers ininKubernetes and there are additionalreasons right to move to Kubernetes As Iwas saying uh abstracting the infra wasuh the first one and the most importantone for us But something that wasimportant to very similar to usingmultiple clouds is that our customerswere running were starting to runKubernetes in 2018 and we were not andso uh in order to make our integrationand our support for communities betterit made a lot of sense for us to alsorun on Kubernetes In addition thecommunity was uh pretty big at the timeIt's even bigger now which means like wehave a lot of people to talk to when wehave challenges or issuesAnd finally and it's important too rightIt was much easier to hire people in theinfraspace when we were running oncommunities at the time So now we havethe answer Uh let's go right However nowthat we've said that we want to doKubernetes we need to figure out howwe're going to do it And I'm sure youremember Kubernetes the hard way from KCHigh Tower but it can be a bitoverwhelming at the beginning right youhave a lot of concept you need to um toto care aboutIn 2018 when we started of course weconsidered using managed Kubernetesright uh all cloud providers hadofferings so EKS G and AKS howeverremember we were early 2018 right and atthat time the landscape was prettydifferentum EKS and AKS were very young and stillinbeta and while G was uh generallyavailable it was limited to 500 notcluster And at the time we already knewwe needed much biggerclusters In addition um manageKubernetes clusters uh come with a lotof things that are that a cloud providerput in them right They're veryopinionated uh because they each cloudprovider has its own way of implementingthings They're very similar but they'redifferent enough that it's a bit painfulto build an abstraction on top of themAnd we put a few example there likenetworking is different Uh operatingsystems are different SupportedKubernetes versions can be different Andsomething we also cared about and and westill care about is um this manageoffering offer very limitedobservability on the control plane Andif you've run at scale I'm sure you knowthat uh when you have an issue you needto understand what's happening with APIservers at CD which you can't see whenyou have a manageroffering So we decided to do itourselvesOver the years we went through multipleiterations So we started pretty simpleright uh we deployed the control planeof our cluster in virtual machines andall the worker nodes were also virtualmachines You can recognize here the maincomponents of the controlplane and we managed all this with withglue Uh I mean we were familiar withterraform and civilian packer and ofcourse shellcript So we we had a lot ofthose and now that we had clusters wecould deployapplications but remember I mean we wehad already we started we were startingto have two regions of the time and weknew and we knew we're soon going tohave more than more than two and we needto provide a way for applications todeploy to all theseregions to make things even more complexclusters have limit right I mentioned500 nodes before it turns out we'retrying to limit our clusters to 5,000nodes but even with 5000 nodes In someregions we need more than one And if wewant the schema to be more diagram to bemore complete uh regions actually ha�veeach have multipleclusters which makes the life ofapplication developers a bit trickyright Because uh they want to deploytheir application in all the regions wesupport at data dog but it's starting tobe difficult because they need to figureout which cluster they deploy to Theyneed to figure out if regions areslightly different from one another Andand also in terms of workflow right Ifyou do a deployment on a region itbreaks maybe you want to stop and notcontinue to another region So we had tofind a way to make this easy for forapplication And the we actuallyimplemented a solution uh that is thatwe call the software uh deliveryplatform And here is a quickillustration of how it works right Sowhen uh application developers want todeploy their application what they do isthey use the platform And in thisexample here uh we're going to updateapplication A from version blue toversion green And what happens is wellthe application team is going to callplatform and tell update it to thisversion And what's going to happen isactually this platform is going to starta temporal workflow where we encode allthe logic todeploy And to give you an example of atypical workflow what's going to happenis going to start deploying to firstregion here Asia in the US And thenwe're going to check that everything isworking fine by looking at monitors andalso metrics And if everything is okaysometimes we also want to wait for sometime to make sure that things are arestable enough If everything is fine weproceed and we continue to the secondregion Right So in that example herewe've now updated uh the application Aon GCP And finally we do the same thingwhere we verify everything and if it'sokay we proceed with AWS last check Andnow we're happy right the um applicationhas beenupdated So we now have a good story forapplications right But forinfrastructure components we we don'thave a good story there right Because asI was saying we're uh application teamtend to care about a single cluster ineach region But as an infra team weactually have to care about all theclusters which means updates are prettypainful right Because imagine when youhave dozens or hundreds of clusters Andso our operating model is slightlydifferent from application team becausein in our case uh we're using terraformusing enable and so updating the controlplane means running terapform contentupdates and this is one of the big shiftwe we did a few years back where wedecided to run command in a differentway So if you remember uh the way our towere looking like right everything wasworking was running in VM and managedwith as I was saying terraform onpacker and what we realized and andwe're not the only ones to do this rightis that instead of running applicationin this cluster we could also actuallyrun control pen components if you cansee here at the bottom of the slideinstead of having application Aapplication B we know we now have podsrunning Kubernetes control pencomponentsAnd what's interesting there is now ifwe want to update a control plane we canactually use the same primitive we useto deploy application which is we canjust use helm and just update and rollout our control plane Of course in thisexample we have a single cluster but youcan actually have multiple ones right Sojust to summarize this slide on the topyou have the control plane of a parentcluster which we still have to managewith virtual machines because we have tostart somewhere And just lower you havethe worker node from this cluster wherewe actually run the control planes ofthe child clusters right And all this isrunning as pods and we have multiple wecan run control plane for multipleclusters in the same parents And finallyon the left hand side of the slide youhave the actual worker nodes for alltheseclusters And this is where we are todayAnd as you can see we still have tomanage some Terraform and some aniblecode But in terms of orders of magnitudewe manage dozens of clusters uh withKubernetes now hundreds even but only afew have to be managed with Terraformand AnibleStill we want to make this better andone of one o�f the thing we have in mindis maybe we can run the parent controlplane in manage Kubernetes in which caseeverything would be manage the same Wewould just have to manage our clusterwith withKubernetes So if you remember I saidearlier that manage Kubernetes offeringsuh were pretty different and hard to andhard to use This is a bit of a specificcase right because it's only a singleteam that has to care about it the teamthat is managing Kubernetes and alsoit's a very specific single use case andso it's easier to adapt to to to thedifferent needs So it is much better towhat we had before right So instead ofrunning terapform and now what we can dois we can just have this ugly for loophere where we just updating with helm orourcluster It's good but it's not ideal Butif you remember we actually have a toolthat is allowing us to deployapplication across multiple region withlogic to deployit It turned out if instead of deployingapplication we try and deploy uh controlpen components What we can do here andI'm going to take the example of ATD iswe can use exactly the same type ofworkflow to update control pencomponents and here we're going to seehow we update ATCD from blue to greenand you're going to see that it'sexactly what we're using before forapplications So we can leverage thetooling that we've built for other teamsand this is very similar right and firstwe're going to update in a region thenwe're going to move to a second onewhere slightly differently we caniterate over multiple clusters in aregion and and then define a regionwhere we have threeclusters So let's recap right we nowhave a way to deploy our applications inmultiple regions and and and cluster anddifferent providers and we have a goodway to abstract our application uh usingKubernetes However one big questionremains which is how good is thisabstraction and Maxim is gonna talkaboutit Thank you Laur So yes so as you cansee uh as Lauren stated we have a know away to help us unify how we run anddeploy cub application across uhmultiple cloud providers But onequestion remains indeed like how fardoes this abstraction actually goes andwhere does it start leaking and so asunder kubernetes we are still heavilyrelying under on cloud native primitivessuch as compute storage or network Thisis where things starts to gets trickywhen you look into making that work overmultiple providers because whileKubernetes gives us a consistent APIthat all sales infrastructure group andapplication teams can leverage uh inorder to interact with uh the underlyinginfrastructure remains widely differentfrom one provider to the other So let'sdig into uh each one of them a bit andlet's start with compute So at thecompute layer there are two keyresources that we care about mainly podsand nodes Pods are are a new way to umuh deploy our application whilst uh thenodes represents the actual underlyingcompute resources on which pods arebeing scheduled on Hopefully I'm notmaking learning you anything there Butif you recall beforehand uh before wewere using Kubernetes every app we wereoperating had uh a native implementationover dedicated instances that weredynamically scaled through uh well inthe case of AWS autoscaling groups Andas uh everyone in the engineering groupat data log was very familiar with thisapproach Uh we decided to get uh themigration much easier on everyone handsto start uh with a very similar one byhaving a single pod per node And themain uh advantage of that well of coursebeing able to migrate uh a bit morequickly but also preventing ourselveswith any kind of upcoming uh quirks thatKubernetes brings along the way such aslike uh having applications competingfor the same resources over the samenode and potentially also keeping thepossibilities that user were alreadyused to such as having local disks orany kind of accelerators that needs touh be solely accessed by a singleapplicationAnd here as well similarly as before ourgeneral idea was to continue relying onthe same underlying uh primitives thatwell services that the providers wasoffering us such as autoscaling groupsBut from a men�tability perspectivethough uh we had to find a way toefficiently manage all those scalinggroups uh among those providers And thisis where Kubernetes can come quite handybecause if you are missing anabstraction well you can build ityourself uh thanks to custom resourcedefinitions and this is what uh byheavily inspiring ourselves from whatthe cloud providers do for theirrespective uh managed Kubernetesofferings uh decided to do as well So weintroduced this concept of nodegroupoups that uh we therefore offeredto our users and that way they couldhave a way to uh declare the uh uhinstances that they were expecting tohave to run their workloads in a unifiedfashion and the idea there was to reallykeep it simple for them Uh they declarewhat they need and we take care of uhimplementing then implementing it on ourhand So from there on when uh users werelooking to deploy their applicationsthey would start by uh defining nodegroups uh that represents what they needand uh ourselves we would be we operatea controller on each uh cluster uh thatcan reconcile uh those configured nodegroups uh basically making the necessaryAPI calls to the underlying provider uhin the case of AWS for example creatinguh the associated autoscaling groups andfinally uh we decided to leverage thecluster autoscaler in order to determinehow many nodes for we need for each ofwithin each of these groups according tothe amount of pods that the user wasexpecting to have to schedule And uh thecool thing with that is that well thecluster autoscaler came up out of thebox with support for all the providerswe were aiming to operate on and uh wehad to make uh our controller do thesame But as long as we have that we cantechnically support uh any kind ofprovider while having a a unifiedabstraction for theusers And thanks to this approach butmost importantly thanks to the hugeeffort from the entire engineering groupat data dog at some point we eventuallymanaged to uh migrate uh the entirenessof our applications overKubernetes And as you can imagine thoughuh this solution started uh to quicklyshow its limits Uh of course uh theabstraction uh was quite uh leaky as uhusers would still have to uh configuretheir uh instance types Figuring out onwhich cloud providers they reside on andalso in terms of efficiency Uh figuringout what instance types you need foryour workload is a hard task to do andif you well uh especially on one wellespecially on three providers where uhyou may not have the equivalence uh fromone to another And uh last but not leastin terms of scalability as well it p uhquite uh some challenges as uh for uhany kind of pod that you need toschedule you need a new node So it addsup in terms of latency and when youstart managing hundreds or thousands ofuh autoscaler uh within your AWS or GCPor uh Asia accounts uh it starts toquickly put pressure over the API uh thecloud providerAPIs And so for those who started withKubernetes more recently this might seemobvious But of course we want our usersto focus on pods rather than nodes rightUh so in order to move from where westarted to a more future proof approachuh we had to start somewhere And itbegan by ensuring that every pod uhdefined uh were uh every pod specs wereembracing this u uh habit of specifyingme resources requests and limits And asyou can see here as well and here aswell I'm not learning you anything butuh this is a much more abstracted way torepresent what uh our application needsin terms of resources rather thanspecifying a particular instance typeAnd so once we had consistent pot specsbeing set uh among u uh our apps we wereable to start flipping the model andstart managing node groups on behalf ofour users So application teams that wereready to make the move uh having updatedtheir node spec we could happily onboardthem onto those set of management groupsthat uh infrastructure team would belooking after Whilst others who stillneeded to have like a strong ownershipof the instance they are um operating onand willing to maintain that themselvescould remain uh with that approach Butaside from the maintenabi�lity on bothside uh what it brought us as well thethe fact that we could manage those nthose groups as part of theinfrastructure uh group Uh it uh allowedus to start introducing new instancecategories for instance such as varyingCPU architecture or instances withaccelerators And whilst nothing wastechnically preventing uh well users ofdedicated note groups from doing sothese option were being made accessiblewithout any kind of effort on their ownwhich made uh the overall approach uhquite appealing for everyone And on topof that if we were to go even deeperthan this we could example start usinguh different hardware generation thatallows us to figure out what's the mostcost effective instance types given aparticular need right And all of thatwithout interfering uh with theapplication team'sworkflows which leads us to our thirdand last step uh in that process whichwas we were now being able to focus onto how can we efficiently size ourinstance fleet and I'm not going to riskmyself at paraphrasing what ourcolleague Rahul and Ben did yesterday uhin this exact room actually uh so I hopethat the recording will be availablesoon but they dig into that uhparticular uh topic in quite somedepth and of course there is much morethings to say about compute but uh wealso have stuff to say about the otherthings so I'll continue with storage nowsimilarly here uh there is a fewkubernetes primitives that we can relateto uh when we talk about storage uh asan abstraction in kubernetes what we aremainly referring to usually is how canwe provide uh our reports with access toblock uh to block devices And when youoperate in the cloud there are many twoways you can have those uh block devicesEither they are local to the instancesthat the pods are being scheduled on orthey are remotely managed by the cloudproviders and that way they can followup the life cycle of your application ifthe pod moves from one node to the otherThankfully here containers have beenaround for a while now and the communityhas been looking into ways tostandardize how can we uh approach thisfrom an even broader perspective thankubernetes actually and thanks to thecontainer storage interface now we canuh rely on providers to implementdrivers that satisfies the spec of uhthe standardization and the great thingwith that is that uh we get out of thebox support for most of the well all ofthe providers that in our case we werelooking for but as interestingly as aswell We have more cap operationalcapabilities such as well thanks toadditional uh abstractions uh we can uhpotentially snapshot uh thevolumes On the other hand uh what we'vediscovered along the way is thatmanaging block devices among multiplecloud provider is uh quite challenges Uhit has a whole set of challenges Firstof all the offering is uh often quiteinconsistent from one another uh whichmeans that uh it's very hard to offer aunified user experience uh among themall Um on top of that uh as I was sayingeach uh provider is building their ownis maintaining their own driver Somaturity can largely differ from oneanother So same here in terms ofscalability we do not necessarily havethe same experience from one provider tothe other And last but not least uhmigrating or any kind of migrationeffort is often an afterthought uhwhenever there is a new feature comingup or whenever there is an an updatethat needs to go through uh quite oftenit involves a lot of manual effort to uhto get it through On top of that uh ifwe jump back early uh 2018 uh CSIsupport in Kubernetes was uh not a thingyet So that means that back thenKubernetes was uh implementing thoseabstraction on its own way uh whicheventually led us to have to migratetowards the CSI way uh a bit later Andhere as well if you're interested intolearning more about this our colleaguesAntoan and Batis did an amazing talklast year at CubeCon where they talkabout the intricacies of uh this uhmigration And finally uh what aboutnetwork uh here as well multiple facetsto comprehend but most importantly whenwe talk about networking in communitiesuh at least at data log what we werelo�oking for was to oop sorry to have ourpods being able to communicate with eachother whether they were uh beingscheduled on the same cluster or amongdifferent clusters and in more generallyuh we were looking for our ports to havethe possibility to access any kind of uhentities on our network So at the timeit was very common to use overlaynetworks to do so Uh but uh ourselves atdata dog uh we decided to not go thattowards that path and we went straightaway with addressing our pods uhnatively over uh each cloud provider'suh respective software defined networksand the main reason behind this is thatwe were looking to avoid uh theperformance overhead that uh we wouldhave got by having our packets packetsflowing through a tunnel uh we also feltthat uh in general it would be morereliable because there would be lessmoving pieces in our data path and lastbut not least uh as I mentioned beforewe were looking to have connectivitybetween two pods from one cluster toanother and that's what we could get asuh every single pod had an IP on theunderlying network they could have crossnetworkconnectivity on the challenging partthough uh something that we've beenfacing from the beginning and that we'vebeen uh continuously uh oh sorry nogetting ahead of myself Uh the uh mainchallenge when we implemented that partwas that we had to deeply uh integratewith each of the um cloud providers uhSDNimplementation and back then uh theecosystem in 2018 in terms of uh networkplugins was uh fairly limited comparedto today uh and so we had to use twodifferent approaches uh and pluginsdepending on the provider we were uhrunning on and this is why uh when westarted looking into working onto Asiawe decided to shift towards a moreunified approach and we picked Celiumfor that purpose what it brought us hereas well was the out of the box supportfor all the providers we were lookingfor more capabilities such as networkpolicy enforcement better routing uhthanks to the BPF perks and thereplacement of cube proxy and also thepossibility to collaborate with anamazing open source community and thisis what I was looking to get to beforeUh the challenge back then and that weare still looking into today mostly isthe scalability one Uh the scale atwhich we operate really leads us towardsuh yeah the intricacies of the solutionsSo yeah whether it's within a cluster oramong clusters that's uh what we aim tosolve And if you are here as wellinterested into the matter uh we have uhmade a talk with Amos and myself acouple of years ago at CDMCON where weuh dug into the matter and with thatI'll leave it back toR So um Maxim told us about um how wecould abstract compute storage and andnetworking um and the challenges associassociated with that But of course uh wewe need we also need higher levelprimitives right and cloud providershave a lot of high level offerings highlevel services and we have a few on thisslides However because we wanted to runon multiple providers we had additionalconstraints right um we only could wecould only consider uh high levelservices with common API acrossproviders posgress being uh one forinstance It had to support our scale andalso last but not least it has to be ithad to be efficient in terms of costbecause we run a pretty significantfootprintBased on this constraint uh we picked afew a few services right we picked RDSfor posgress elasticasheach for radiusuh and load balances and and objectstorage however and very similar to theother lower level offering um there arethere are similar offerings betweenproviders but they are slightlydifferent which means after a few yearsuh our decisions have evolved on on thistopic so the first one is in terms ofobject storage uh we use S3 GC s andblob storage and we're very happy withall of them right I mean they are verysimilar they have very similarinterfaces and behaviors and we're we'revery happy with them and we keep usingthem in terms of load balances uh itwould seem that load balancer aresimilar enough but it turns out if youlook at L7 balancing there are actuallya lot of differences between providersright and that's why over the yearswe've actually uh went down a layer andwe're now only considering layer fourload balancing and Anything that has todo with L7 properties we actually manageourself with withenvoy and and finally for database andand I'm saying I'm posgress and readyhere um we started with manage offeringwe starting with RDS and cloud SQL andand it was working right but in the enduh after some time we had too many uhtoo many small issues where thebehaviors of the different providerswere um different enough that it wasstarting to be painful because while inp in in in theory It was supposed to besimilar between providers and we providean abstraction to our users All of themhad to know the exact differencesbetween all the providers and theconstraints of all the providers Sobecause it became too painful weactually implemented the ser theservices ourself as a at a platformlevel We we learned a lot of of lessonsuh along the way right And one of thething we want to say is that thechallenges we face we talked a lot aboutthe technical challenges but thechallenges are not only technicalUm the first thing is well uh if youwant to run Kubernetes in multiple cloudproviders at scale uh you have todevelop expertise you have to understandexactly how Kubernetes works and andthese intricacies of the small issuesyou can get but you also have tounderstand how the cloud providers workIn addition uh you also very likely needto build strong relationship withproviders because well if you have ifyou encounter a performance issue or anAPI issue you need to be able to to workwiththem And of course as you're offering aplatform to the rest of your company youneed to be able to help them and trainthem to use the platform bestAnd and finally as always when you'rebuilding a platform uh we do the prettygood job at addressing uh 80% of the usecases but we still have team with veryspecific use cases where the platform weoffer is is not perfect yet but we we'reworking on it and we're trying toiterate to support more and more usecases And yes I mean Kubernetes offerpretty compelling abstraction but it haslimits Uh it's a great way to describeapplication deployments but it requiresa lot of effort to make it workAnd and remember I mean if you run on uhon on a single even on a single cloudbut even it's even more true on multiplecloud is as soon as you start pushingthe limits the underlying uh theunderlying implementation is going toleak and you will have to understand itIdeally you'd have only a few people uhthat have to understand it but you youstill have to understandit And and finally something that'simportant too is while managed servicesare getting more and more mature what'shappening is providers are trying tomake their offering the best And so alot of the things that happening thereare are not shared with the communitythan and is happening internally Andit's much better But it mean that uh thebehavior is becoming slightly more andmore different between differentproviders In terms of perspective thereare a few things where uh we're thinkingabout uh we mentioned using it managekubernetes clusters for uh control planeWe have a few other use cases that whereit's the use cases are specific enoughand unique enough where it makes sensefor us to make the effort to adaptapplications to multiple manageofferings So we're looking into thatWe're also looking at new abstractionand capabilities Uh Maxi mentioned GPUearlier of course and GPU is a topic andwe have to support GPU for our users andand finally I mean we also have toimprove our efficiency because we needto control our costsFinal thoughts I mean we we talked a lotabout how it was painful with Kubernetesand it's suddenly challenging butwithout it it would definitely had beenuh much harder and and we learned a lotalong theway And and on this uh on this slide uhuh thank we want to to thank you all andof course we won't have time forquestion because we're already a bitlate but we'll be around if you havequestions and we're and you can reach uhus on on on Slack too Thank you2025-04-15 21:59:38.413186 ��:�%#��+AiCAFXF5ECtohello Heyeveryone Thank you very much for uhstaying with us up until the very lastsession of the day Highly appreciate itUm so my name is Maxim Vizeno I'm heretoday alongside Laurai We are bothworking for data dog as part of theinfrastructure engineering group And ifwe're here with you today is to talk toyou about uh our story on how we operateour infrastructure over not only one butthree cloud providers now and uh howKubernetes helped us uh in that journeyUh we'll be happy to take questionsafter the talk I'm not sure we'll haveenough time but happy to take themoffline Otherwise you can always find uson the Kubernetes Slack And in case youdo not know about data dog we are asoftware as a service obserobservability platform uh that help ourand we aim to help our customers obtainbetter visibility into theirinfrastructure andapplications And to give you a bit ofsense of scale of uh what we operate atwe have about 800 integration as oftoday and we handle more than 10trillions events on a dailybasis Uh if we're here today is to talkabout Kubernetes So on that front uh wehave hundreds of clusters uh tens ofthousands of nodes and hundreds ofthousands of pods and uh they are all uhliving in the cloud uh among three of uhthree providers public cloud providersnamely Azure AWS andGCP and if we did that uh spoiler alertuh it wasn't because it was fun uhsurprisingly perhaps but we had somereasons to do so first uh our customerswere looking for us to be uh close tothem Uh there is also this uhinteresting philosophy that we have atdata dog that we call dog fooding whichis basically looking to experiment uh asmuch as we can in order to potentiallyprovide solutions for that are um morepertinent for our customers if it solvesour problem potentially it solves theirsand of course uh more locations tooperate from and also a good startingpoint to establish partnerships withthose providers And uh so to get startedI'll hand it over to Lauron who will diginto will get you into where we startedand uh where we aretoday So let's start with a bit ofhistory Um until 2018 dead dog wasrunning on a single region uh in the USon AWS uh what we could refer to asclassic and everything was operated withvirtual machines managed by chef andapplication were deployed withcapistrano In 2018 we decided to expandand we created our first region outsideof the US on GCP in Europe And over theyears we added more and more regions Sowe started diversification by adding uhedge data centers which we were fromwhich we run synthetics and browsertests uh we had our firstcertifications and in 2021 we had a bigexpansion where we created uh three newregions one uh for government agency inthe US and we continued last year I meantwo years ago we opened our first regionin Asia inTokyo and we continue added uh edgeregions and this is where we stand todayas of the first quarter of 2025To summarize this in in a few numbers uhwe currently operate over threedifferent cloud providers Uh we have sixmain uh large regions and we have 29edgelocations To make that work we we had tofind a solution right And of course asMaxim was saying one of the reason we'resharing it here today is because we'reusing Kubernetes to make itwork And as a reminder uh this is how westarted at data dog right Everything wasrunning on AWS It was simple enough uhwhen we wanted to deploy applications wewere using a set of tools terraform chefcapistrano some packer that was easywhen we added our first region outsideof uh AWS uh we decided to pick GCP inEurope and we had to decide what to donext right so were we going to rewriteeverything we had written for AWS orwhere we going to try an��developerproductivity means to you and then Iwant your hot question whe that to askfor a developer not about fruit and vegwe'll get there later um but what is themost interesting or your favorite ormost important question for you to ask adeveloper if you would like to startthank youyes hi uh my name is Aaya Yeah I'mcurrently a VP of engineering at uhOceler um uh Oceler was founded by thefounder of Kafka uh and uh prior to thatI worked um uh for Netflix and GitHub umthe famous teams that built uh platformand infrastructure for GitHub copilot umand uh the hot question for so what isdeveloper productivity how do you defineit and then also I just want your hotquestion that you like to ask developersor to be asked when you were a developeroh yeah sure uh what is developerproductivity for me personally I thinkit's uh the time to impact um I don'treally you know care um how many linesof code or whatever you know people havewritten or whatn not uh in fact we uhwe'll talk about that later but um yeahreducing the friction uh whileonboarding um and you know you just umyeah making everybody productive uh asfast as possible and reaching what thecustomers um need uh and gettingeverything shipped out there and thehard hard question yeah what hardquestion fun question whatever you wantbecause it's about asking yourdevelopers and to give you how do youmeasure developer productivity butcontinue okay so I'm gonna uh like Iasked the audience that's no whatquestion do you love to ask developersum has J really helped improve your uhdeveloper productivity that's alwayslike there's always a debate whenever Igo uh but yes yeah thank youhi folks really good to be here seventhCubeCon right it's different um but yeahI'm Helen Greel uh VP of engineering atcompany called Multiverse which is atech company that identifies closes andprevents skill gaps by providingpersonalized learning on the job andpreviously I've spent a decade workingwith uh platform teams of all sortsincluding backstage at Spotify uh soI've done all sorts of like black magicum and and frameworks of measuringdeveloper productivity and I'm reallyexcited to be here and to share it withyou all and to also learn from from thisamazing panel in terms of my definitionof developer productivity it's quiteclose to what aaya had uh so it's timeto value basically which is in my mind afunction of like efficiency uh meaningin your day-to-day and satisfaction withyour day-to-day job as well as clarityon direction so those like threeultimate pillars of like getting tovalue faster uh when it comes to thefavorite question I I uh like to askdevelopers I had a chance actually toask that question quite a bit as like Ijust started in in the new job threemonths ago so um the question that flipsthe script in my opinion is like what isthis like what is one thing basicallythat hurts your day-to-day uh flow thatyou think leadership doesn't see ordoesn't realize which surfaces like verysurprising insights sometimes becauseit's like either the small things thatpile up or the things that arecompletely unrelated to like developertools and developer productivity justprocess you know um so yeah that'sthat's my favorite Um over to you Cad hieveryone uh I'm Cat Morris i'm productmanager at Centaso we build Craticswhich is an open source platformframework so very into that productivityhow do you reduce the cognitive loadspace um before that I was a platformproduct manager at Thoughtworks for fiveor six years um so I've been doing thisa while and to me developer productivityI'm a little bit fast flow teamtopologies uh nerd here so it's allabout reducing that toil to improve flowso what's the waste in people'sday-to-day jobs what are the things thatget in the way of doing somethingeffectively or efficiently so that'sthat's where my head's at um and Iactually asked someone shout out to Philif he's in the audience this question oflike what would you want me to ask youbecause I'm going to get asked "What'syour favorite question?" And he said"Whoa that's really hard i'm going toneed to think about it." And I'm like"That �should be my favorite questionthat's great like what do you wish I'dasked you with respect to developerproductivity what's the question thatyou wish I'd have been able to ask youtoday?" And I think you would get somereally interesting insights from fromyour teams on thatright hi everyone so it's CubeCon and myvoice is not in the room with us but Ihope you'll bear with me my name isLaura Taco i'm the CTO at DX dx is adeveloper intelligence platform thathelps companies measure and improvedeveloper productivity i'm also theco-author of the DX Core framework whichis a unified framework about measuringdeveloper productivity we worked withthe researchers behind Dora Space Devoxto create a unified framework ah thereit is if you want to see it I'll talkabout this I'm later I'm sure um so Iguess the answer to how do I definedeveloper productivity is right there onthe screen um it's not about lines ofcode it's not about one particular thingit's about this very complex world ofsoftware development and impact so weneed to be going fast but we also needto be high quality we need to have agreat developer experience we need tohave uh impact on the business um and itreally productivity really only happenswhen we have all of those thingstogether in terms of my hot questionthis is a question that I ask actuallyit's very multi-purpose um and I asked acouple of my friends this today in factis like what could we do that would makeyour life 2% better um because I thinkthat we really undervalue small changeand slow change as an industryespecially like in this world of AIwhere every two and a half minutesthere's something brand new but we canfocus on what's the 2% change that wecan make today because tomorrow we canmake another 2% change and it doesn'ttake long for all those changes to havea really really um impactful impact onon your team so um don't think about the100% better just think about the 2%better also it gets rid of this myth ofthe 10x developer and you have to dothis you have to do this you need to bethis percentage better but if we alllook at ways we can improve ourlifestyle or which let's be honest worklife balance is not a thing in tech thenmaybe 2% isn't bad at all and maybe it'san app there's no such thing as a 10xdeveloper no but 10x organizations doexist 10x teams even and actually Ithink maybe what we're all trying to sayis like developer productivity shouldmake um an ordinary developer should beable to be really successful in yourorganization and charity majors just hada really outstanding article um that I'drecommend that you all read about indefense of the average developers butwhen we're talking about developerproductivity we want an every everydaynormal developer to be able to haveaccess to outstanding tools lessfriction and be able to really umproduce outstanding results forcustomers and really should besustainable right yeah and it's onlysustainable if we think about it asteams not those individuals or anythinglike that so Laura since you've beendoing this measuring for a whilenowso what are let's start with thenegative but we'll reach the positiveand then we'll go to AI which we'll seeif that's negative or positivebut why is this a controversial topicwhy is developer productivity somethingthat maybe some of the developers in theroom are like I don't want to be moredeveloper i don't want to be moreproductive you're counting my codeyou're going to fire me especially inthis time of layoffs and then why is itthen hard to measure yeah I mean trustis paramount here um if any of you haveused like um clockwise for example thattells you how long you've been inmeetings or whatever i would opt to getthat data about myself but I don't wantanyone else looking at that data youknow I maybe track my macros so that Iknow if I'm getting enough protein but Idon't want that data sent to my doctorevery day right and so a lot of thisdeveloper productivity hesitancy is verynatural normal because I don't want tobe spied on i also don't want to spy onother people um so intention is reallyimportant here one of the reasons thatit's so diff�icult to measure is that allof you know that it's we can't we canharvest an acre of wheat or a kilogramof wheat and know how much it sells fori can't tell you how much value a lineof code has i don't I can't tell you howmany euros my semicolon can sell for umin software and so when we try to makereally clear parallels between likeagriculture manufacturing and apply thatto software development it just comesacross as like first of all we'reworkers in the machine and it reallyundervalues the knowledge work that weall do the creativity the complexity ofsoftware development which is why it'sbeen so challenging to find a measurewe've been in search of this one onemetric that matters for a long time andjust research has shown us time and timeagain that it just doesn't exist wouldsatisfaction be a measure because likeAtlassian doesn't even call it developerproductivity they measure for developerjoy it it's definitely an importantcomponent but at the same time we alldon't work for coding clubs right wework and operate in the context of abusiness and we have to be realisticabout that that it is important to havedeveloper satisfaction and retention isimportant attrition is important but wealso have to balance that with thebusiness needs in operating efficientlyand having the right impact for our endusersyeah I have another take on this and Iagree with everything Laura said um soat Oscillar we build AI based uh basedrisk decisioning these complex workflowsthat um you know capture what a dataanalyst is trying to do like if 2,000 or20,000 data analysts take like two weeksor a month we want to make sure thatit's done in seconds or minutes not evenhours sometimes and we also look atanti-money laundering account takeoverslike all the good stuff uh that'shappening every day and with AI theattacks have become extremely complex umwhile I was building something a decadeago at uh into it I I had seen risk andfraud but these are like so complicateduh we have to have an AI a good AI agenttackle uh the bad AI agent in real timeso that actually happens right um sowhile we are doing all of these thingsuh we noticed that one team was umparticularly lagging uh with respect tospeed to ship and I I was wonderingwhat's happening because they're allstellar developers there right so therewere no clear signals so the answerturned out to be that they were creatingall of these complex workflowsum that would speed up everybodyeverything for everybody but the problemwas that they were spending a lot oftime doing backward compatibilitytesting of the injection uh uh you knowpipelines and all of the work that goesin and capturing all the edge uh casesand thinking what ifs and when we are inthe geni space right now right when weare coding something we we have to alsopredict how we going to get hacked howwe going to you know tackle somethingthat hasn't happened yet where industryhasn't benchmarked something yet and weare you know we have the first moreadvantage or disadvantage however youlook at it uh so while we are capturingall of these things it's not like youknow you can't penalize them for uhdeveloping slowly we also have tocapture how how long like all the theentire um developer life cycle to trulymeasure what they bring to the table sosecurity and testing were also twodifferent dimensionsyeah and just to respond to that I thinkone like one other hot take on like whythis is controversial right to yourpoint like nobody really looks atdeveloper productivity metric until theproduct pace is not really there so thisis when you start digging and thenyou're sort of coming into the spacefrom like a defensive position like howdo I defend myself uh and that like whatmakes it like slightly tricky of adynamic to to navigate and this is why Iguess I'm harping on like the outcomesversus uh units like the output becauseoutput might still be there but theoutcome might not um so yeah that's justkind of I guess the the tricky partabout itso I'm going to be that guy and bring upthat it's a system like so so everythingwe're doing is in a system right likethe organization is� a system thesoftware engineering is a system um andI've been working with a lot ofarchitects at the minute and they havethis concept called counterintuitivenesswhich is like we find a problem but wefix it in the wrong way it's thatclassic the team's moving really slowlet's add another engineer to speed themup and it never works and it's becausewe've got this bias to fixing things inthe wrong way because we want to fix theproblem we're desperate to do it so likeyou said we only really look at thiswhen the product velocity is too slowthat we're not achieving the rightoutcomes so then you put all this focuson the one thing that you think is thecause but it might not be because it'sthe whole thing fitting together soactually taking a step back and thinkingabout how are all of these piecesfitting together might help it could bethat your product objectives areabsolute pants so pick new ones we don'tknow so we've had from the audience ofthank you for being such a participativeaudience already and Andrew hi Andrewout there uh thank you very popularlysaid shouldn't we be measuringorganizational productivity notindividual productivity and I would sayyeah but there's other levels in betweenI was just about to say I thinkdeveloper productivity is a pants namefor what this is if I may borrow thelocal parliamentThank you um because it puts theemphasis on the individual developerwhich is not what developer productivityis about whatsoever it's about systemsand improving the system and so rightout of the gate when we use languagelike developer productivity we'realienating our core population which isdevelopers because I know that many ofyou if not all of you in this audienceand you're here today because you areresponsible for measuring and reportingon efficiency and developer productivitywhen we use language like developerproductivity upwards to the sea levelseuite loves this language but when weuse it down into the development teamsit can be really alienating and it'sokay if you have to do translation likeyou don't have to come up with like thethe one word to rule them all i oftenyou know to me developer productivitydeveloper experience are so so relateddeveloper joy developer thriving there'sall of these different vocabulary wordsthat are talking about the same thing umand so if this is a problem that you'rebutdding up against like shouldn't we bemeasuring organizations instead ofindividuals you might just need tofine-tune your word smithing a littlebit um and come up with betterterminology to describe actually what itis that you're doing but I wish we couldhave a doover with developerproductivity i think it's too late nowi also want to add to what Laura justsaid um I I often struggle um you knowused to struggle at least initially umwhen I had to explain the business valueof helping each otherout organizationallyuh how uh will the infrastructure teamunlock value for people how will Janaiimprove the lives so infrastructure whatyou know we were looking at this uh helmpulum uh go deployer and all of that andI was like you know our backendengineers are struggling why don't wemake sure that we can deploy whateverinfrastructure we want in minutes cuzit's taking you know an hour and it'sreally slowing us down and they unlockthat value so developers helpingdevelopers here and then genai uh whenwe were looking at um you know likelet's say uh you have this famous orinfamous quarterly uh pentest report andit's 100 pages long and we were lookingat all of these things and I you knowhow do we triage that how will we helpuh our own organization understand frontend needs to do this xyz things to makesure that we are less vulnerable becausesecurity is not an afterthought it's itshould be in every part of your CI/CDlife cycle right so then the jai teamworked on something that could triagethat and uh basically help us even solveand mitigate some of those things uh theunknown unknowns so we also as leadershave to make sure that organizationallywe are building that culture where eachteam has been given enough scaffoldingenough uh developer tools and y�ou knowis encouraged to build that for theother developers as well so that we canall improve together you know I oftenfind that when it's you measuredeveloper productivity um I always thinklike so what like what's the point whatare you measuring it for what are yougoing to do about it and sometimes wehire these very smart people to solveproblems maybe they are the best placeto solve the problems as well likethinking about it more of like how areyou going to fix it so to your pointhelping developers developers helpingdevelopers give them the problem tosolve let them solve it themselves in away that makes sense for their teambecause then at least you know they'regonna use that thing if they've designedit the way that they want to that'sthat's where my head comes with likesome of those developer productivityversus or productivity if you keep itwithin the developer space and they canfix it themselves it kind of feels alittle bit nicer rather than using it asan organizational metric with that uhone of the questions we have is how doyou convince stakeholders that investingtime and money in developer experiencewill generate more income in the longrun so how do you then also with thathow do youthendecide okay but the developers have tofix their own stuff before they buildsomething that goes to market right awayhow do you that is a stakeholder becauseeveryone everybody's pressure is from updown rightand I wanted to bring up this slide umfor that question so um everything we doI'm I'm a huge fan of uh auditablemeasurable metrics uh that are not likeyou know derived from a textbook theseare all practical uh stuff right nothypothetical um so how do you measurethis uh and uh no offense to any productmanagers here But it's it was reallyhard to to justify why uh you know uh weneed to actually invest in uh buildingthe devx or the you know explain thecodegen impact until we actually putthis into uh effect so uh I rememberasking this uh PM um to give me u amonth to basically t tackle somethingand he kept on going and you know um isso you just want a month to handle techdebt tech debt this is not tech debt I'mjust uh you know like helping all thedevelopers improve and this is uhbasically a part of this is what I usedto actually explain how and why we needto basically unlock the value and if yougive me a month i'm going to speed upyour uh delivery or um the thing that uhyou want in September uh and push it upto June and see how it goes and he heldme to the fire if we didn't deliver inJune we delivered in August but still awin is a win but yeah uh the metricshere speak for themselves i'm a huge fanof low cognitive load and fast feedbackloops uh that's basically you know whatI wanted to say so with the idea becauseI see a lot of questions still coming innot we're not advocating I don't thinkany on the stage are advocating that weare measuring at the individual levelandunan anonymizing it like we're it's justnot our vibe i can say that right now soas the co-author of one of theseframeworks and we work with theresearchers behind space Dora Devx I cantell you right now and I'll officiallysay it do not measure this on anindividual level if you are using thesemetrics to do stack stack ranking ofengineers based on their PR output thatis not what they were designed to dothat's weaponizing and that's going togo downhill and do long-lasting damageto your organization i can't say thatplainly enough i will say like we knowwe know the the conundrum that some ofyou find yourselves in because you'rebeing asked to do that and I thinkthat's what's really challenging aboutthose of you that are in this room thatknow that it's the wrong way to do itbut you're you have pressure put on youto do it um and so whenever we'retalking about so for example DX core 4if we're talking about PRs or diffs perengineer this is always at an aggregatelevel if we're talking about um yeahother like cycle time we're neverlooking at how long it takes aparticular like one developer tocomplete work or how you know oneindividual ticket it's all aboutaggregate and trends because aga�in we'remeasuring on the system level to improvethe system we're not using this forindividual performance management that'sa whole other class of of problems sowith that in mind because how do wemeasure developer productivity becausewe're not against the idea of askingindividual developers questions whetherin an interview or pizza give them foodis kind of a trick news flash uh ordoing regular surveys or pulse checks orquick little Slackbot things we're notagainst that so how do you measuredeveloper productivity like we know nowwe've settled on this idea that I thinkas a collective in this room based onyour great participationuh we've settled on this idea that it'snot okay to make people accountable atthe individuallevel organizational productivitymatters i think team productivitymatters but what do you measure therecould be even space is a bit makes you abit confused of how many what are wemeasuring and I guess just maybe for thesake of like this audience beingplatform engineers like a lot of youguys I think kind of floating in in inthat space I think it might make senseto kind of bring that angle which islike going back to like measuringsystems like how do we sort of increasethe throughput in the system as opposedto throughput of individual uh so thethe kind of measures that I think makesense in that context is perhaps likethe self-s serve approach like how fastit is for you to get what you wantwithout like three four levels of likeapprovals handovers that kind of stufflike how easy it is to potentially likescaffold a new thing like how manypeople do you have to interact with toscaffold a new component those type ofthings and um like if you reduce it byjust 2% by 10% by 15% you can like 100%be incremental about it that's thebeauty of like working with throughputin the system like it doesn't have to beall or nothing it can be like in phasescome on Laura you knowum your life's work my life's work yeahi think um bringing Yeah bringingdevelopers into the conversation therethere's no people in your organizationthat know more about the friction that'sstanding in the way of you knowefficiency than the developers whointeract with these tools and to ignoreto to purposely not survey them or orinteract with them to get thequalitative and self-reported feedbackis kind of missing half of the wholeproblem um you know we can talk about uhsystems and workflow to uh metricscoming from tools like to to measureadoption and those are really valuableto have but they become exponentiallymore valuable when we can marry themtogether with survey data and data fromthe experiences of your developers soputting them together really unlocks alot of uh I would call like directionalinsight to figure out what to do liketalking about backstage for example orthe the IDPs that you're all using theseare like Swiss Army knives that can doso so so so many things and we need tofigure out what to do first and in a lotof ways developer productivity you knowor development system productivity datawe'll call that oh boy um we're a littlecursed here um it's it's kind of likeuser research for platform engineeringteams right this is really existentialyou need to know where it hurts the mostso that you can direct your investmentat the thing that's going to have thehighest ROIyeah I want to bring back the slide soum what uh worked well for us in thepast uh at least uh with the teams Iwork with uh um is asking thosequestions more often um every oneononeevery um backend or team sync that wehave every week uh just to you knownormalize that conversation how how umhow do they feel about you knowum what what did we do in terms ofimproving the audit log of pipelineanalysis uh to how do you enjoy codingmore to like you know like did our uhdid we actually compromise our SLAs'suptime downtime how many like how did weimprove from last week or yesterday uhinstead of like waiting for quarterlyI'm not a huge fan of quarterly whateverwhatever the the uh semianual surveythat goes out it's too late and in inthe era of geni that's too late uhcompanies have to catch up right and thecontext is not there so this isbasically how I measure developerproductivity we just uh you knowbasically for each team front end backend like did you improve workflows didyou improve onboarding uh with thescaffolding that we provided with thepaper pad that we have does yourorganization coming back to organizationdoes your organization feel like a gatedsuburb B or does it feel like uh youknow an a pretty new city where you youwould love to explore on your ownwithout being shackled and if if we seeand improve this outcome and if if youare better than what you were a month ortwo months or two days ago then weactually have improved and then as we'rewinding down to the last minute or twowe could talk about this and we do forhours at a time which isgreat because it is true that this is atime of tighter budgetsand which is silly that companies arenot investing in their developersbecause they're really expensive so andyou deserve it but with that in mindwhat is some sort of metric that youshould be measuring that can helpcompanies understand that you need toinvest in developer experience and inmeasuring it even because you can'timprove what you don't measureyeah I think maybe I'm not the rightperson to talk about like developerexperience index with like having Lauraon stageoh weBut yeah like I guess as a user of DXright at a company we use DX that's kindof like if you're looking for one metricwhich is sort of silly but that's kindof what lot a lot of people do and thatgets you in the door like in in some ofthe like senior leadership discussionsof like what is one metric that I shouldbe looking at um I think developerexperience index has both like the thesort of the dollar value translation asto like why should you care right likewhat what's the impact on my bottom lineum as well as like a measured fourpillarkind of foundation to it uh so like youhave a balanced view and also it answersas to like why you should care right uhbut yeah I'm sure you have things to sayabout it Laurum another thing to think about and thismight be a bit producty of me is thecost of delay so if your teams are notbeing as effective as they could be thatis an opportunity for other companies toovertake you because if you're beingslow someone else is going to be quickso figuring out how long things aretaking what is that what are you givingup by doing it what can't you do becauseyou're too slow and by investing in thatproductivity you get that cost of delaylower and you can be more successful asan organizationyeah uh um in the previous slide uh theuh there was one mention of revenue perengineeruh that is something that we look atbecause engineering and R&D is apercentage of revenue they're bothrelated so I you know I just wanted toconcur with what you were saying and uhyeah that's that's and let's be honestfrustrated developers it's expensive notonly to have developers but to hire thembecause the ridiculous hiring processesweSo these are really costs of justknowing retention and what's frustratingdevelopers has a bottom line so I thankyou my fantasticpanelists and I thank you the greataudience please do connect with us onLinkedIn we are all five of us sharingabout this topic every day we're reallyexcited about it and we hope you mayleave a little more excited and ifyou're on the side of as you're adeveloper please make sure you giveyourself a voice that you can feelspeaking up and if not look for anothercompany because if you feel like yourdevelopers as a manager don't feelcomfortable speaking up well you gotwhole other problems with psychologicalsafety and this isn't going to work soyou got to get there thank you all verymuch thank you[Applause]2025-04-15 21:59:39.205634 ||��q�'#��AzZ7bDPZMCqYi'm going to go ahead and get started umthis is probably the most people I'vespoken to in terms of a conference uh soI'm definitely going to mess up at somepoint so please forgive me if I do uh myname is Bob Walker i'm the field CTO atOctopus Deploy and I'm really excited tobe here because I get a chance to talkto you about how we as a company nowprogressively deliver changes to ourcustomers who are hosted on Kubernetesusing a Canary style deployment as wellas feature flags now before I jump intothat just a little bit about me aspeople continue to stream in uh first upyou might be seeing field CTO um almosteveryone blows right through the fieldpart of that and just thinks I'm a CTOi'm not a CTO i am in still anindividual contributor i can't set thedirection of a company i can't hireanybody i really can't tell anyone to doanything but uh my job as a field CTO isto go out and speak at differentconferences uh put together differentthought pieces talk to differentcustomers uh do a whole bunch of otherdifferent fun stuff like that uh but Idid achieve a very important distinctionlast year where scammers are now usingme for fishing attacks to my own companyso that's great um all my contactinformation is up there uh that QR codesends you to my LinkedIn profile if youwant to connect with me please feel freeto go ahead and do that now you canprobably tell by my accent that I'm notfrom around here i live in OmahaNebraska um if you haven't heard ofOmaha most people haven't i don't blameyou uh we just went over a millionpeople uh that's where it is on the mapi live there with my wife and we havetwo cats and a dog and if you're curiousabout what on earth does Nebraska looklike uh that's what it looks like umwhen I'm not spending time with my wifeor my pets or at conferences or talkingto customers I'm going to be out on thebike uh so we have a lot of rollinghills doesn't really show up super wellin that picture uh but it's a lot offarmland and a lot of big open air and Ipromise we do have cities um that's justan old railsto trails conversion trailso while I've worked at a number ofdifferent companies in fact I startedoff as a developer way back in the earlydaysofnet and as much as I'd love to tellyou about that story primarily thisstory is going to focus in on myexperience within Octopus Deploynow the goals for this particularsession today is I want to provide youwith solutions that you can apply inyour day-to-day based on the some of thechallenges that we faced and this isjust working under the assumption thatyour use case is fairly similar i alsowant to debunk the myth that canarydeployments slow so solely means thatyou're routing traffic becaus��G�&#��EAgycxQT3DHIUwelcome everyoneum and we're gonna put up a little slideout please scan that if you have theintent or thought to read write or voteon questions because you can do all ofthose thingsso yeah welcome everyone i know it's theend of day two unless you've been heresince Saturday so maybe you're day fiveyou're different vibes for everyone sothank you so much i would like towelcome each of our panelists tointroduce themselves and then also asyou introduceyourself define what ��e we'redoing a canary style deployment and I'llget into why that's important and then Ialso want to talk a little bit about whyit's so important for you to separatedeploying new versions of your code fromreleasing different functionality andthen finally this is not going to be asales pitch about Octopus deploy i willshow you like a couple of screenshotsbut it will make sense in that uh butreally but you will have to understandwhat we're doing uh just to understandour challenges so you're going to haveto learn a little bit about that i doapologize so let's just jump right intothatas a company we started off as aself-hosted option we'd give ourcustomers an MSI happy days and then astime went on you know because we startedin June of2012 our customers started asking for aSAS offering it's like yes this makes alot of sense but we're not 100% sure ifthere's going to be a significant amountof demand and so we waited in until wekind of reached that critical mass andthen in July of 2018 so almost six yearslater from our original release weoffered our uh SAS offering so a littlebit about our architecture in caseanyone's curious we leverage cell-basedarchitecture and so what it is is eachone of our customers they get adedicated instance of our application ordedicated copy of our application theyget a I I have up there a dedicated poda namespace they also get a dedicateddatabase as well as a dedicatedfileshare and the big reason why we didthis is this allows us to scale up anddown the compute resources per customerbecause we have some customers who needto do tens of deployments per day and wehave others that need to do thousands ofdeployments per day and we also haveanother what we call cell that's on topof that which we call a reef and thereef is the Kubernetes cluster the AzureSQL server and Azure storageso as a result of this we have 1,600instances give or take at any point intime hosted across three regions so thatmeans 1,600 copies of our applicationare running in production and that'sjust for our SAS offeringtoday if you're curious about ourconfiguration uh we actually have eightreefs which means eight Kubernetesclusters i think a ninth is about tocome online uh we can host about 3 to500 instances per reef uh there's nohard limit it's just kind of the way itworked itself out um we're leveragingvirtual machine scale sets so we haveabout three to six seven nodes give ortake uh but these are pretty bignodes now some other fun challenges forus is that we still offer self-hostedversion but the self-hosted version isalways an N minus one so in thisparticular screenshot we first deployed2024.4 4 in September and we didn'toffer a self-hosted version for thatsame version 2024.4 until December sonow that we're all hopefully familiarwith how everything is working I'm goingto talk a little bit about some of theself-inflicted challenges that wecreated forourselves the first one is is because wesell a deployment tool customersreasonably have the expectation that oursoftware will be running when they needto deploy to production and I thinkthat's fairly reasonablebut not everyone deploys to productionat the same time some customers theylike to deploy to production you know inthe middle of the night others like todeploy to production at 7 p.m orthroughout the day but we need to beable to do things like upgrade to thelatest version perform some basicmaintenance like database backups andtaking file shares or anything alongthose lines and so what this representsis we have customers who can be upgradedthroughout the entire day so we allowedour customers to select a maintenancewindow which basically allows themexcuse me allows us to perform thenecessary maintenance where the customersaid we're not going to do anyproduction deployments during that timeso what this chart represents is all ofour customers um and the maintenancewindows that they picked now you mightbe wondering why the huge spike atmidnight uh that's the default option soyou know most people don't know tochange it until they actually getupgraded in the middle of productiondeploymen�t and it's pretty easy toconfigure that maintenance window i havemine configured to be pretty long andI'll get into that but it's justsomething that you can do through theuserinterface the other big challenge thatwe love to do is we deploy very veryvery frequently um I picked thisscreenshot in particular because you cansee that we actually had two productiondeployments on the same day now thechallenge of that is that as soon as wedo a new production deployment what thatmeans is we need to restart that wholelife cycle so that 24-hour clockrestartsitself in addition to that we're alsoreleasing a lot of new features andfunctionality throughout the quarterwe're not just doing a big bangdeployment for all of our customers sofor example deployment freezes wasreleased at the beginning of the quarterbut get protection rules was released atthe end of the quarter for ourcustomers now the challenge that wefaced is that we were a self-hostedcompany for so long and our mindsetstill had that self-hosted approach andat the time when we first released ourSAS offering when anytime we wanted tomake a change in the olden days is wetest everything on our staff instancesmake sure everything's looking good runthe appropriate tests uh automated testsand everything like that and then wewould release to our self-hostedcustomers but our self-hosted customersthey had their own upgrade cadence someof them took a couple days before theyupgraded to the latest version somewould take six months uh we still havecustomers running two-year-old versionsthat are self-hosted so that's a that'sa fun game toplay but with our SAS offering thatmultimonth release window shrunk down to24hours and so if we wanted to startupgrading say at 10 a.m brisbane timebecause um we're based out of Brisbaneum and that's where the vast majority ofour cloud platform team lives they wantto start upgrading say at 10 a.m whichis midnightit takes us again a full 24 hours butone of the very first things that we uhexcuse me encountered with our SASoffering was if a critical code bug wasfound we were still upgrading all theremaining instances and so everyone gotthat bug which is a bit of a problemespecially if we could prevent theremaining you know couple hundred orthousand customers and not blockthem so I want to talk a little bitabout the very first solution that wecreatedso the very first thing that we did iswe said well let's adopt a canary styledeployment and if you remember the wholepoint of a canary style deployment comesfrom that old story which was minerswould bring canaries into the coal mineand they did that because the canary hadmore uh more sensitive lung system Iguess than us and if it died that was itsign that you needed to get out of thereand so we kind of did the same thing wesaid okay let's first deploy to ourstaff customers our staff instancesthen our insiders those are like ourMVPs after a day making sure everythinglooks good and then we would deploy toour early adopters seven days after thatand then assuming the early adopterssaid everything was good then wedeployed after another seven days to ourstablebranch and then we had those lagardsthat still you know we had a couple ofthem that didn't happen very often butit's just worth pointing out and so whatthat really meant was is it took us 15days to upgrade every one of ourcustomers the other thing that weimplemented was we needed an ability tostop that whole release cycle and sothroughout our research um we cameacross the concept of the Andon cordwhich was pioneered by Toyota and sowhat we were doing is we're starting totreat our deployment pipeline as anassembly line and so Toyota famouslypioneered the concept of an andon cordand the andon cord anyone can pull theandon cord at any point in the assemblyline and when they pull it that stopsthe assembly line and the whole point ofthis is to prevent a small issue maybe amachine is misconfigured maybe theydon't have the appropriate parts tobecoming a massive issue which then theyend up having all of these recalls andeverything like that uh the vastmajority of andon c�hord polls was aresult of they just don't have theappropriateparts so we implemented something veryvery similar where we wanted to have anenthusiastic promoter do the work wherewe're always promoting to production soone of the big changes that we made wasin our mindset which was when you lookat canary deployments the assumptiontypically is is that something is goingto fail it's always going to fail wellwe we took the assumption of somethingis it's basically always going to workuntil it doesn't work and so we havethis whole workflow that we've kind ofput together where basically we say yesI think we're ready to deploy let'scontinuously deploy as fast aspossible but with the andon chord we hadthe capability to stop that and anyonecan pull that we've created thepsychological safety where anyone canpull that andon cord in fact I pulledthe andon chord um two weeks ago um andall the people asked was "Hey why didyou pull the cord?" I provided them theappropriate feedback they said "Yep thatabsolutely makes sense." Um they foundit was actually due to a very particularconfiguration with my instance but westill had I still have the psychologicalsafety to pull thecord so what that looks like is now if acritical bug is reported we can go aheadand stop upgrading all of ourinstances which really helped us out ifwe found that critical bugi'm not going to read this to you i'mjust going to give you the cliff notesand this worked really well for us iwant to say for a couple of years but weslowly started seeing problems with thisnow remember the best case scenario is15 days to get somethingupgraded now in this particular case umthe very first bullet point says it was46 days ago since we did our lastupgrade and the next upgrade is likelyto occur within about a week so we'retalking well over 50 days betweendeployments based on the way we do ourand chords and canary style deploymentsthat was obviously not a good thingespecially if we had critical bugs if wehad a CVE that was sitting out therethat would beawful and the biggest issue that wefound was that bugs were still making itall the way up to our stable ring evenafter 15 days and I think this is trulythe problem if you look at Canarydeployments as a way to derisk newfeatures andfunctionality because truth be told thestaff the insiders the early adoptersremember the early adopter list wasalways static it didn't cover everypossible code path we couldn't predictevery possible different configurationthat someone would have different usecases anything like that and I thinkthat's fundamentally the problem whenyou look at canary deployments as thesole solutionbecause critical bugs still made it tostable and that was the biggest problemthat wehad and as a result we had weeks beforewe had a stablerelease um I asked AI to generate thatimage in case anyone's wondering wherethat came fromum the number one reason for all of theissues that we came up with was newfeatures and functionality because it'svery unlikely for code to just suddenlystop working unless you change somethingbe it the infrastructure um yourapplication host or your code code willgenerally continue towork and so as developers you know we'retypically the ones that cause our mostproblems i still consider myself adeveloperso I want to talk a little bit about ournext generation solution that we'reusingtoday so the very first thing that wedid is we shortened the amount of timeit takes to upgrade every one of ourcustomers instead of 15 days we're downtoabout 5 days atmost so the biggest other change that wemade was we said our main deployinstance so shockingly we use octopusdeploy to deploy octopus deploy it's aweird snake eating the tail type ofsituation but we upgrade our main deployinstance first that is the very firstthing to that is upgraded and if itdoesn't work there then it is not goingto work for any of our customers and sowe start dog fooding almostimmediately and then when we want todeploy to our canary customers we'rerandomly selecting 5% of thosecustomers and it's always rotating andthey're updated after 3 days at mosttypica�lly it's about one or two days butat most is it three days that's kind ofour you know line in the sand so tospeak so that means the majority ofusers are in the stable ring i want tosay about 93 to 4% of our instances arein the stable ring at any given point intimebut that didn't solve the new featuresand functionalityso this was the Octopus deploy userinterface at the start of2024 this was the Octopus deployinterface at the end of 2024 now imagineif we're doing a Canary style deploymentwhere you'd sign in and you'd see thisinterface you'd sign out the next dayand then you'd see this interface as auser you'd be pulling your hair outtrying to figure out what on earth isgoingon and they then you'd go back to theold interface sorry I didn't realize Ihad thatanimation the biggest challenge that wehad for any new features andfunctionality is we needed to gatherfeedback from our users before updatingeverybody and that's really at the heartof Canary deployments when you thinkabout it which is you want to gatherfeedback from a subset of your user baseeven if it's this is broken that'sfeedback we need to know that before weupdateeverybody the approach we were takingwhen I first started at Octopus Deployand I'm sure a lot of people in the roomprobably have this as well is these longrunning feature branches and we had oneof our feature branches lasted over ayear believe it or not and they weredutifully merging into you know fromMaine repeatedly but when they mergedinto Maine that was a I'm glad I wasn'ta developer at the time let's just saythatwhat we wanted to do is we wanted tohave a trunkbased development life cycleso what this means is we had to havevery small short-lived feature branchesand we always wanted to have main bedeployable at any given point in time soif we wanted to make a change just makeit on a very like maybe two days threedays at most a week and then merge thatin so we're making very smallincremental changeswell this means we had to change how wedeploy new features and functionalityand this is where open feature comes infor us because with open features we'reallowed to create these segments and thesegments however you want to define themcan be whatever they can be based onlocation if you're doing an internalapplication it could be a team if it'sbased on like it could be based on usagehistory or it could be you know whatenvironment you're deploying to i justrealized I had a user sign up like Iguess it's the fonts um soanyway what we would do is we'dinitially deploy with the feature flagturned off to production we'd have itturned on for our lower environments butagain we're always merging in to main soif we were adding a new piece offunctionality like that new userinterface I'll just go back to that weadded the new user interface and wemerged that into main but it was hiddenbehind a feature flag and then we'readding more and more and morefunctionality to that throughout manymonths and then once we reach a pointwhere we feel confident then we'll goahead and turn it on for our staffinstances and even then it's not forevery one of our staff instances it'sthe majority ofthem and then afterthat we will start deploying to ourearly adopters now what's really coolabout having these feature toggles andthese feature flags is we can now havemultiple lists of early adopters this issomething we can never get with Canarydeployments just by itself where wewould have you know say we wanted tohave this group of customers they couldpreview this new feature uh we'd havethis other group of customers they couldpreview that feature we can never getthat with true Canary deployments we'donly be able to do all or nothing typeof thingbut now we have right now we have fourseparate early adopter programs runningconcurrently and they all have differentlists uh except for my instance becauseI keep asking to get added to the uh thenew features andfunctionality but then what we woulddo is once we felt confident about thenew functionality based on what theearly adopters that feedback uh based onour staff users uh based on anythingelse that we �can think of then we startenabling for the rest of our customersnow it says 30 33% 66 100 it's notnearly as uh controlled as that as I'dlike to say oftentimes what it really isis it's about 5% and then 100% but itreally depends on the feature so a goodfriend it depends because if it's apretty high risk feature like that newuser interface it was very controlled itwas 5% 10% 15% and so onbut one of the best advantages that wegot out of this is that if our usersstarted reporting issues we could goahead andpause for the deployments and then thatgave us the flexibility and the you knowbasically the chance to take a take abreath realize what's going on do someanalysis making sure okay is this asomething that's going to causeeverybody to have problems or is this avery unique configurationand if it's a configuration what wecould do is we can continue to 100% butwe can exclude those customers forgetting that functionality and thisallows us to again get more and morefeedback and make sure the use thefunctionality is going to work for ouruserbase so in case anyone's curious um thisis what it looks like in terms of ourfeature flagging interface that we puttogether um each of those little uh Idon't know how well you can read thatbut each of those little uh chickchiplets I guess is what we call itchicklets um that represents aparticular customer's instance um so I Ipicked this because I can highlight it'sjust my instance or some of our staffinstances that we've turned it on forspecific customersnow let's talk about some of theadvantages that we've gained from thatoutside of having this this greatcontrol over releasing new features andfunctionality by doing this approachwhere it's combined with both ourSLOs's have increased to 99.9% unlessAzure decides to have issues again umbut as you can see in February is 100%and January is 99.24% 24% that's reallyspecific and then you know earliermonths there's 100% or near99.9% that is that is excluding any ofthe scheduled maintenance or the the themaintenance windows that we were talkingabout earlier but if even if we includedthe maintenance windows we're talking99.9% so it allowed our uptime to bereally really high per customerthis also allowed us to decrease ourlead time tochanges especially for these smallincremental changes that we're makingbefore again we're talking 15 days nowwe're down to four days and three hoursand our goal is actually to get thateven faster to be down to one day that'sour end goal and you can actually seethat we got really close with one of ourreleases in uh is it 2025 4242 so we'regetting there we're getting therei want to spend the remaining bit oftime talking about some of the remainingchallenges because it's not allsunshines and rainbows that might be avery American phraseum one of the challenges that we stillhave is that detecting issues is still abig problem because when you release anew feature there's going to be errormessages that get logged but not allerror messages are the same is it a userthat's just pushing the same button overand over and over again is it due to amisconfiguration but it only impacts oneparticular customer or two particularcustomers or is it a pretty big deal andit's impacting everybody now we aredoing analysis on all of our errormessages so we know all the common onesso when a new error message appears uhwe're made aware of it and we startreaching out to the appropriate teamsbut again how often does that happen isit a significant thing so right now ittakes about 5 days and 20 hours again itgoes back to lead time to changes beforewe start detecting any sort of issue soit's not perfect but once we detect theissue we can fix it in under an hourbecause most of the time it's just dueto some weirdconfiguration but that beingsaid high impact issues still make it toproduction and we call these it's a verywordy way of phrasing it high severitydefect escape rate and what thisbasically means is an issue that causesa sev one or a sev2 uh incident isbasically what we call it and so wetrack those every quarter we're alwaysreporting on them i think it's actuallyevery month every month we're reportingon it um I pulled the one from Decemberjust because it actually had some highseverity default escape rates uh you cansee um we try to learn from all of ourmistakes i mean we're not perfect we'rehumans and so what we'll do is we'll doa retro and it's always a blamelessretro to see what caused those problemsbut again they still make it toproduction but we want to make sure thatit's as little as possible you can seewe only had two uh forDecember the other big thing is that ifyou want to implement feature toggles itdoes require a very disciplined approachyou have to adopt the expand andcontract pattern so what I mean by thatis you expand your code base to includethe new feature toggle um all theappropriate if then statements thattypically go along with it all of yourunit tests to test both the uh it'senabled or disabled or however you wantto code it and you also have to realizethat those toggles they might live for acouple weeks or they might live forseveral months like we did with the newuser interfaceand then you also need to have thediscipline to go back through and cleanup your feature toggles um if anyoneattended Open Feature Summit a coupledays ago I believe it's Dinatrace saidthat they had 1,200 feature toggles thathadn't been used in like two years orsomething like that it's a crazy highnumber so you need to have thatdiscipline as developers or as platformengineers to go tell your developers arewe still using these feature toggles weneed to turn these off and we need tostart deleting that codeand then one of the other challengesthat we have is when should we usefeature toggles because there are somefeatures that we add that are extremelyhigh risk and it's almost a no-braineryes absolutely we're going to 100% dothat so uh the new user interface is areally good example um we're adding somenew uh functionality around templating100% that would be it's almost ano-brainer like why wouldn't you do thatbut then we have some features that arevery low impact and it's almost like abrand new page well does it make a lotof sense to put a feature toggle aroundthat and so that's kind of the theconstant tension that we're always goingto have anytime you want to do somethinglikethis and then finally for us our featuretoggles that we put into place they'refor SAS only if you have a SAS productas well asself-hostedthen for us our forcing mechanism to dothe contraction is going to be when werelease a version for our SAS c excuseme for our self-hosted customers becausea lot of our self-hosted customersthey're self-hosted for a reason um Iknow there's a we have a couplecustomers in Germany for example who bylaw cannot connect up to like any sortof cloud provider or send us telemetryor do anything like that which is fineso this is just something that we'llalways have to struggle with becauseeven if the vast majority of ourcustomers move up to our SAS product westill have about 25% that will probablynever move to cloud for variety ofreasons so to go back through the goalsfor the session hopefully I provided yousome solutions that you can apply sothink about always you know alwaysassume that your code is good and it'sready to deploy versus always assumingyour code is bad um leverage canarystyle deployments if you can doingsomething where you can kind of controlit especially if you can control if youhave like a customer base where you havemultiple copies as well as leveragefeature toggles or feature flags andleverage open feature wheneverpossible uh debunking the myth canarydeployments means routing traffic as youcan see we're doing canary styledeployments it's that canary in the coalmine but we're not routing any sort oftraffic we're just doing it at acustomer level and obviously why youneed to separate deploying new code fromreleasing functionality becausereleasing no functionality is the numberone cause of all our severe issues andthen hopefully you didn't think this wasa sales pitch so thank you very much iwill be out there for any any morequestions[Applause]2025-04-15 21:59:40.119001� give more insight to the previouspoints and these ones but uh just tohighlight some extra benefits we got iscluster creation speed which webasically increased by roughly 15 or 20times and then obviously securitybenchmark improvements which is as abank it makes sense to improve thosenumbers ingeneral so the biggest one here and weif you read the abstract you kind ofhave the answer to it we approximatelysave85% by moving our stuff to on-prem andthis is the cost for compute andstorage we took all the cost we couldfigure out from our running our own datacenter the manpower network coolingpower everything and we took it into afivey year period and compared to whatwe pay for what we are running publiccloud that's 85% savingone example I want to give you you canbuy one server for 20,000 euro write itoff over five years that's333 amonth that one computer gives you 32physical cores 1.5 terb of RAM you canwant a lot of workload on that one and Iwould encourage you to look up the sameamount of power for a public cloud VMyeah so the next benefit is clustercreation speed i'll leave the video inthe background that's basically how fastwe can make a cluster in our on premisessetup and uh below you can see a pictureof how it is in our current manageclusters and you will see the differenceand why it's a good benefit so I'll justfocus on the below picture as you cansee the deployments might take roughly30 minutes this is just a basic clusterit's not it's not githubs it's notapplications nothing just a basiccluster with the nodes in it and uh 30minutes it doesn't impact you actuallydayto-day if in your production clusterssure but you have multiple shortlivedclusters that have to be rebuilt everyday and uh let's say we have 20 of themand a team wants to deploy them one byone then that's a big chunk of the daythat's just gets lost because of that sobecause of that we relied on somethingthat our cloud provider uh supportedwhich is stop and start feature ofclusters uh which basically allowed youto stop the cluster meaning all thevirtual machines scaled down to zerostate was stored in CD you save moneybasically win-win and then the clusterstart is uh two or three times faster soit should technically be good but theproblem is the stop and start feature isreliably unreliable from what we've seenbecause the state does not alwaysactually match the reality and we've hadfew cases where the team that isdeploying the clusters they see oh heyyou know our cluster is broken pleasefix and then we're like okay let let'ssee what's the situation we see thatthere's maybe no pods running or they'repending and whatnot and then we couldsee that the uh web hooks like uh theones from Kivero or linkerd are blockingthe pods so we're like trying to figureout okay what's up why is that andthat's because nothing answers thosecalls and that's because the in thiscase linker D pods are not there becauseit's being blocked by the linkerd webhooks so there's a bit of a yeah loopthere which sometimes happen it's alsonot very stable so you had the situationwhere the policy blocks itself becausethe workload is not even there uh soyeah so that uh messes up and uh wastesa lot of manpower aswell and then we've had a weird case ofghost pods uh because of this statemismatch as well we've had situations aswell where we see the pod is runningit's healthy uh the service is there allgood but the website or or whatever isbehind it is not answering and we'retrying to figure out okay what's up uhand then you see the pod is marked asrunning but the node it is marked asrunning on does not exist because thenode has been removed on the stop partso we had few pods that or the APIserver thinks it is there because it'sstored in HCD that it's actually therebut it was not properly cleaned up soand the controller uh and theulerdoesn't actually bring a new pair upbecause it's well it's technically therebut in reality it's it's not it's anillusion and a and a ghost as we calledit so we needed to build some customsolutions uh as as the ghostbusters webuilt a basically a chron job that runsevery five� minutes called like ghostspot destroyer or something so it justchecks all the pods and see if there's anode that's not there and just delete itand then uh we introduce one more uhnode termination handler which basicallywould help to to prevent thesesituations as well because these alsohappen on uh other types of node that'scan suddenly disappear yeah spot spotyes exactly yeah and then on the extrafeatures that happened is the securitybenchmarks and transparency which for meas a developer is is great and awesomebecause I like to change and see thingshow they actually work because in cloudthey're often abstracted away or notvisible at all so the extra v visibilityhelps and and it can be seen more inhand with the CIS benchmark improvementsso this is what we increased it in liketwo or three weeks basically almostdoubled it uh in our on premises uhbecause in cloud providers at least inours you cannot change anything incontrol plane and maybe your CISbenchmark uh tool you know still reportsthose those issues but you yeah cannotdo anything and now we can and theremaining percentages are basically ifyou're familiar with CIS benchmark thoseare the five do something rules uh thatare mostly about accesses uh maybe somecontainer specifications and so on sothose are things we still have to ironout but the major part of the actualcontrol plane we could fix and changeone of those things is CD encryptionwhich obviously everyone should have andis a good thing and you might think aswe thought that maybe it's by default inyour cloud provider please check becausefor us it was not enabled by default andenabling it is also a bit tricky and andsome features were only in a previewmode which also does not bringconfidence uh they relied on uhobviously their own uh uh diskencryptions uh inside the data sensorswhich we also have so on top of that weat least could uh doc encryption and uhyou know be morecompliant uh one more feature that mightnot sound something spectacular is uh wehad control over service accountcertificates i'll explain a bit later onwhy it's actually important and why itplayed a big role for usso one of the things is that when youmove on there is the lack of externalsupport depending on the solution youchoose of course but one of the questionhas been asked is this what if the cloudprovider can help what do youdo we actually after I've been runningfor four years in public cloud weactually do a lot of the thingsourselves it's rarely that we reach outto the provider itself um even ifthere's something very critical wrong wedo most of our stuff ourselves so thathad given the confidence that we canactually do this and we have thecompetence to actually do itso initially um and I maybe some of youin the crowd from my sales I want totalk to you after but it was um actuallyidea coined in Ventia 2022 at CubeConwhere there was a keynote from my salespabout running over 1,000 cluster withcluster APIum with with cluster API so that was thefirst uh idea that was born there andthat's also what we're using to do thison premise installation this is actuallya picture of the first PC cluster uhfrom our own data center so we um wedecided to uh go with uh some serversand then a storage systemand the thing was that should we go withum storage insight as an VSAN or a CI asa hypercon converge um but the thing isthat we don't know the growth of thedifferent parameters meaning that howmuch compute do we need how much storagedo we need so in order to kind of growwith the load coming in It's much betterto have that flexibility that we justcan put more hard drives into the sandand we get more storage and put morehost on when we need more computecapacity so that's what we went withanother added benefit to that is thatyou can boom around the VMs andinfrastructure can do the maintenancewithout uh reaching out to usanother thing is that we already have aplatform running in the public cloudwhere we use uh GitHubs to uh boot tobootstrap it and the only things we areonly changing is the things in the topyou are seeing so we instead of thepublic cloud we use now� VMware i knowthere could be some discussion aboutthat but it's a good fit for ourorganization we use Caby and we use theCSI driver that's many of the things wehave been changing in order to do ouron-remise move and then we you use therest so here's overview over uh what arethe different components that we switchso we use STEX for the uh single sign onwe useharbor and Mino i will let you pick thatone up uh Candice V came in because weone of the things I I will get back tothat uh why we choose that core DNS fortheDNS this about the persistent volumepersist persistent volumes we have anextra thing to consider there um becauseof our data center set up I will comeback to that yeah and we know how itgoes with the plans not according to theplans so we still needed to change acouple of things one One of the thingsis we needed to replace Mino with Rookyou might ask why some of you might knowthis year the Dora kicked in high andstrong so we needed to really look atour licenses and the AGPL free licensewhich comes with the open source miniowas a was a killer for us uh because oftheir hard copy left uh approach so wewanted to avoid that for some liabilityreasons and enterprise minio was out ofthe question due to costs uh so we wentwith Rook because it still supports theS3 which we needed so it fit pretty welland we already used it for rewrite mestorage as uh we had already mentioneduh and to manage Rook and and Miniobefore we initially we made our ownoperator around minio to manage usersbuckets policies and and whatnot uh thenwe found some issues and and maybe tomake it easier to maintain we used acrossplane provider there's oneavailable uh it was slightly dead to behonest they didn't reply to my PR forlike four or five months so we forked itand extended it how how we wanted uh butyeah now for rook we needed to redo itagain uh so we made a there is alsotechnically a safe provider for uh foruh Rook via crossplane but the problemwith it doesn't work with privateregistries so we cannot really use it sowe just made our own provider fromscratch uh in a senseagain uh one of the things we also hadto change as I highlighted before is theservice account certificates this iswhere it comes into play we use workloadidentity in our workloads to communicatewith external resources like uh vaultsdatabases Kafka and so on and uh it hasa thing called managed identity which inour architecture setup it was one perenvironment and it would be sharedacross the clusters in that environmentand it comes with a problem because itcan only have 20 attached uh federatedcredentials meaning only 20 clusterscould use it which usually would be finebut in our test environments as Imentioned we have dynamic test clustersand they can easily go above 20 and uhwe needed then to fight for the slotsand or maybe make some custom solutionsaround it but in on premises we don'thave this issue as we can generate andreuse the same certificate in the forexample for all our dev clusters testclusters and so on and what this meansin the workload identity we can then useonly one credentials for the whole devand the whole uh test separately so wewill not face this limit uh uh yeahanytime soon or everhopefully yeah uh one more thing that wehave to consider which P will talk a bitmore about is the we would need tospread the control plane across datacenters which is something cloudproviders do but we needed to figure outhow we do it and then how we spread itout uh nicely because we have three datacenters in all of free there will becontrol plane on but only on two uhactual uh application workloads will berunningon and then the last thing uh might beslightly controversial to some uh we'vehad different talks with people is wereplaced Hashi Corp with sealed secretsobviously as you some some might knowthis means we only replaced for thesecret management part there are stilluse cases for maybe key management anduh we had some uh applications wherethey have autogenerated secrets whichwas supposed to automatically go to avault and then developers never see itso sealed secrets that doesn't �fit butfor the other cases we replace it withsealed secret uh there were few reasonsfor that uh Hashior itself it's good butwe wanted to use we wanted to store thevault itself inside the cluster um andit's not super friendly for in a in aGitHubs world because you needed tomanually go in and seal and unseal thevault which is uh yeah not great and wedon't want to use any manual steps thereare some other tools like bank vaultsthat technically does it for you butit's also slightly outdated and doesn'tfollow uh or has fallen off the cadenceof Hashiqua and the third but not leastalso is again cost to reduce the cost asmuch as wecan yeah and then for the migration uhwe didn't want to wait for 5 years untileverything is done and perfect and thenmove people we wanted to move them asfast as I can but also saving the costas much as we can so there were somedownsides we accepted and be like okaysure we'll deal with it once we you knowmove everything and then we'll clean itup later one of them is the workloadidentity as I mentioned will still useit uh because there are still maybe someuh vaults or uh or uh yeah databases andKafka and so on that they might stillhave so we still have a reliance on ourpublic cloud uh for that part and as Ihighlighted before is uh for someworkloads that have key uh managementthey still use uh keys inside the publiccloud and then also if they have thisautomation or if the developers are justso extremely stubborn that we're likeokay sure we'll deal with you maybe in ahalf a year then we'll bring you overAnd uh as I mentioned slightly before wehave different than availability zonesin our data centers while we have threeof them we use them all free activelyonly in our pre-production andproduction environments uh which meansthat in our dev test it's all hosted inone while that might not be an issue uhfor the initial migration it does bringsome bumps mostly because we have helmcharts that already use topology spreadconstraint topology spread constraintsfor zones uh and you know how do youspread if you don't even have that labelbecause you have only one data centerone zone so our quick solution was wekind of faked it and we just addedlabels for some nodes so it couldactually spread even though it's not areal zone in thatcase yeah and then uh for the workloadmigration it is going on uh don't tellper I actually stole this from ourofficial sheet this is the official ourwaves how we split out so theterminology especially the last one isreal uh it's not something I just madeup uh it's split in multiple parts anduh as I mentioned before we are in themiddle of moving our critical pieces uhas we also call them the big spenders uhso they make uh you know extra reasonsto to movethem and uh yeah uh the migration itselfaside from the components we needed tochange that we highlighted before it'sactually pretty smooth there's not beenmany hiccups due to Kubernetes itselfand GitHubs because Kubernetes it's yeahas I mentioned in the key keynotes aswell it just works it's awesome amazingand with GitHubs you just point it to adifferent cluster it also works and inour current setup we have uh to makedevelop ers change as few things aspossible we also have config maps in ineach namespace with some variables orenvironment variables that they coulduse which for example points to uhingress URL or or where Kafka is locatedand so on so they don't have to evenworry about it and they could justreuse basically move to on-prem andchange as few things as possible perryeven had uh some presentation where hewanted to show how to move things toonrem and he forgot to change our thingsettings from public cloud to on-premand the demo still worked because itmostly used the environment variablesfrom our config mapyes so one of the things that Kellismentioned was that we need to have umhigh availability for our productionsetup and that imposes some challengesin how do you route the traffic from thekind of outside and in um we um we havemultiple data center and that was a bigchallenge and I will not say we havecompletely solved it yet to and it's notthat it's not solution for it's morelike how do we make our legacy kind ofnetwork conform with all this and makeit a kubernetes way so it's automatedand people can do it the Kubernetes waythat we still need to iron small thingsout um also the fact that the differentdata centers we have they don't sharethe same subnet they also impose somechannels when you have IPs and whatnotum and if you're used to being a publiccloud you can you kind of have one IPfor your let's say your ingress and itcan float kind of around the differentuh zones we cannot do this here sothat's different channels in that yeahcome forward with our on network yeahand one challenge we also have or uhwe're a bit on the edge about it butthis is a rough picture how we have ourcluster set up we use cluster API formanaging them and uh yeah we have a itbasically all starts from a kind clustereither be local or from a pipeline whichwill create our registry cluster as wecall it our grandmother cluster becauseit that's the one that actually makesthe management cluster that would managethe cluster for our workloads uh but theproblem is that once that kind clusteris done it will move the state andeverything to the registry cluster so itbecomes self-managed it it works now butit brings some potential issues that heyyou know it's a one point of failure andthen maybe something goes wrong maybedelscaler plays a bad game with you andthen uh goes haywire or you misconfigureand the registry cluster down and andwhatnot so we'll still monitor how thisis performing because we might maybe goback on it and not have the registrycluster be self-managed but be just a aregular cluster that you make forexample with cubadm or something likethat and it would be solely justmanaging other clusters and not itself ithink there was another talk yesterdayabout the chicken egg problem in allthis so yeah we need yeah from theanother Swiss bank yeah so this is stilla hot issue we need to figure out yes sothe vision for the future being agnosticuh meaning that we will probably not goback to manage cluster in public cloudwe will use the cluster API to deployKubernetes on raw VMs in order to havethe same setup anywhereum and another thing is to run uh lookinto cubeword i really see a lot ofpotentials in cub word in order to kindof look at our legacy world as K startedout saying we have a lot of thingsrunning on Windows and I think therecould be a lot of potential let's see ifwe can leverage some of the samefeatures uh we get capabilities we getin Kubernetes like DNS uh certificatesand load balancing all this is a hasslein the old world I can tell you that ARMCPU if the hypervisor couldn't supportit I would really like to go there butwe are not there yet um and the mainreason for that is there the moreperformance in it and the cost is betterso when the hyper is there I will gladlyhop on tothat yeah and then now that we'reclosing to the end some people might askyou know after this should everyone moveis is this the way new way to go uh nouh it's it's not that simple righteveryone should consider it a bit moreuh because it it really depends on theuse case and you really should evaluateyourself if this fits for you because Ithink Saxo is one of the rare caseswhere it actually fit so perfectlybecause as P mentioned we already had anexisting theme already existing datacenters we already had the the groundsthe ground works were already done uh soit made things much much easier and theexperience was there so it was almostlike plug and play and then you know ifyou're like a very small company like astartup a couple of people or don't havethe experience it's probably not uhworth it or if you're uh yeah as inquotes truly global it's a vague termbut if you would have I don't know haveto serve 60 countries and be in like 80data centers across them and you don'thave the sophisticated teams foreverything to be in all places andmanage it also becomes difficult that'swhere cloud plays a good role but in insuccess to LinkedInDon't forget to rate us and thank youall for being here2025-04-15 21:59:40.842579 ��a�(#��yAZfp94fOMcwEgood afternoon everybody and welcome toour session hope you're still fresh it'sstill it's a little late in the day butwe want to talk about breaking free fromcloud banking on self-hostedkubernetes first up about me I'mPopensson I live inCopenhagen love to travel the world ijoined Saxo around six years ago and nowserve as the head of container platformengineeringand besides that I'm also the chairmanof Cloud Native Denmark that'sorganizing the two previous KCD Denmarkand now we changed the name to CloudNative Denmark and we will have aconference later this year one of thethings we do for our conference is thatwe donate the profit to charity tofoster young tenants intoit yeah my name is Carl Sak Gribbles ialso joined Saxobank only last year butI've built Kubernetes clusters beforefor companies such as VUX and Accentureuh spoiler I'm on the way to I'm a cubesor not and on the way to golden cubes ornot at this point I'm a bit unsure ifthere's 14 or 17 certificates now basedoff yesterday's announcement yeah we'llhave to see how far I am yeah and uh I'mnot from Denmark but I reside in Denmarkcurrently so I'm from Ria Latiayes about Sax Bank we are an onlinetrading platform we are founded in 92 socompared to all banks we are prettyyoung uh we have 100 USD billions inclient asset 1.2 million customers a lotof financial instruments that you cantrade bonds stocks CFDs and whatnot andwe are too big to fail as they saidyeah so couple of disclaimers before wecontinue as that some have said titlemight have been a bit too catchy ormaybe misleading so to clear some thingsout most of the core banking itself isnot run on Kubernetes it's run on it'snot even on Linux it's on the good olduh Windows machines uh but it's not likewe don't have any critical workloadsthere's a lot of critical componentsstill running on Kubernetes so if thosesuddenly disappeared clients wouldnotice and it would not be good for thebank uh one more important thing this isnot an advertisement tool like for oragainst public cloud private cloudwhatnot this is look at it as ourexperience our story our case study andwhatnot and uh the migration is notfully finished but we are in the middleof migrating our most critical uhcomponents at this point intime yes so what was the initialmotivation to begin this journey firstof all we are a bank compliance is a bigthing and for that we need to have anexit plan for exiting uh public cloud ifwe needto we also need to we experience some ofthe um running four years in public downnow we have actually see somereliability issues it's not always goodthere's some limitation even in thedoubt but the biggest kicker is the costand that's also the title of it therewas the that moved things along andkickstarted thingsfaster yeah few more things we gained wewill�� have at this point4667 backend services running and uh1955 data data pipelinesrunning okay okay okay this going to bea long day so let's go let's get on withit then let's actually realize that I'mnot going to do this alone uh so let'sget a team together we assemble the keypeople needed in order to reach bothdisciplines in order to reach the folksin order to apply the fix etc so yeah weset up this war room sorry for thereference but anyway that's what we'restill calling it uh what would we needwell we need a chat where we can findout their sharings etc etc of course weneed a virtual meeting let's get goinglet's create a plan for how to fix oursoftware fleetthis plan will include let's make sureeverybody are on board and fix their because in Spotify we thinkownership is very important so you ownand you operate your own stuff okay coolso we set up a public communicationchannel we set up another chat whichthey can be in we set up a meeting thatthey can jump in and understand theproblem the questions whatever they haveand also we reach out to everybody wesend a mail saying this is urgent andimportant is pretty scary and in orderto not get caught in everybody's ignorelist with all capitals I'm adding alittle bit please fix itStefan again hypothetical but anywaylet's roll with the story uh so yeah andeverybody doesn't read mail all the timewe page them to get into it we startcommunicating with everybody we makesure that they know about this we makesure that they do the changes so pleaseapply this trivial fix in a one line inyour dependencymanagement and get somebody to review itas well right because that's we do we'reresponsible we review all thecode okay i could argue this wouldn't bethe case we we shouldn't review itbecause it's a typical break the glasschange but anyway that's what typicallydo good goodgood uh aside from the ones we didn'treach with thepage so let's make sure we get in intouch with all the owners that aren't oncall or didn't read the mail or wherethe owner feel no longer is true or thesquad no longer a team no longer existsetc yeah there's there's always a bunchof longtail when you when you're tryingto update at least for us when we updatein complete fleet so that's that's oneof thechallenges but anyway you do that youget to a place I'm sure you also startassessing what was the how much of theproblem was there any was there anyactual security breach in this case oror did we manage etc you bring this tothe incident review the postmortemanalysis you say good work team becausedamn a good work we did lots of work atleast uh we try to assess the impact didsomebody actually breach us or not umand then we also asked a very importantquestion what should we do for next timeto beready okay cool two obvious things popsto mind let's be proactive nexttime may not help for this zero day butit may help for a bunch of other ones sofair enough uh and also let's put someautomation in place so we can do thisquicker next time yeah that sounds goodright we had already been around thisone so so when we came to to thisincident we were slightly more preparedthat was good h we're going to still dothe same thing is this a problem ofcourse it's a problem is a problem forus yes it's still a problem for us weunderstand it we try out the fix weupgrade we test we deploy it but then westart doing something differently we donot try to communicate with everybodyinstead we started doing what we call afleet shift we prepare the change in away that we can apply it across thefleet that's the main difference hereright and it still applies to all ourservices and and pipelines of course andthat's good then it's a major deal andand this is not a silver bullet it'sjust taking it e making it easier westill need a team to get together ithink we actually were as much as 50people working on this even though wehad the automation uh so we assembledthat team we set up the infrastructurewe create a plan on how to approach thisbut since we had already startedcreating this fleet shift in Swedenbecause that's that's where we'relocated halfway is when US is� coming inthey're ready and we just starting toroll it out but still we communicatewith everybody right we send out a mailbut it's not do fix this it is okay wewe we have this incoming right so beaware be uh be on your toes in casesomething happens we skip the we canskip the paging part we can skip the uhreaching out to them um to some extentwe run the fleet shift we apply thechange to all our services all our datapipelines and we automerge it and onautomerging it's being continuouslydeployed it's deployedautomatically but now I need to behonest um yeah we don't hedge on oneverything on one one thing here so wealso asked all owners to make the changeso we let it go in parallel not sure wewould do this now three years later tobe honest okay cool uh so that's how wedid it and then we came to completingthe work still a long tail okay fairenough uh but instead of thehypothetical multiple days chaos we weredone after 11 hours with 80% of allcomponents they were updated they weredeployed they were running in productionwithoutincidents um two days later we wereactually 100% no sorry 99% done uh fivemore user repos may or may not have beendeployed we still needed to handlethem okay cool and now we come to theincident review and yeah still good workright we we we did a bunch of work um wedid luckily not luckily fortunately atleast not have an impact so that wasgood in this case as far as we know atleast um but we also ask this questionwhy did we not reach 100% with ourautomation what can we do in order toget there all the way and there weremultiple reasons and one of them beingCI flaky or or uh for numerous reasonsbut one which was pretty important thatcaught our eyes was that fleet shift botwhich is the thing applying the changeactually wasdisabledh this is not only about automationthere's something more to it so weneeded to work a little bit on ourculture and um you know don't waste agood crisis we use this one in order tochange how we were working on it thiswas the pivotalmoment uh yeah so taking a couple ofsteps back here why are we at all in thebusiness of fleet management why do wethink it's so important to be able towork on the complete software fleet weare a microservices shop we have a bunchof teams i think we are around 700 teamsright now each team owns one to twothree system you know it from the systemarchitecture uh we have here inbackstage um we have a bunch ofcomponents inside each system why splitbetween squads and systems well becausesystems and software live a lot longerthan teams right we learned this conveylaw work so yeah a system is a bunch ofcomponents services websites etc insidethese there are a bunch of instancesthis is how we reach about reason aboutour fleet i think we have around 200,000software entities in ourcatalog okay cool so this is what weneed to work with um we concluded thatwe had a a microser shop then where eachteam owns their stuff again ownership isstill important uh and we were growingthis were somewhere around 2017 I thinkbut I'm not going to take poison on thatone uh we uh we concluded that yeah weare growing faster in our software fleetthan we are growing engineering cultureand and engineers were also growing upuntil recent years uh but still oursoftware grew a lot faster and to put anumber another number of this uh thenumber of lines of code each engineeringSpotify owns has doubled in the last twoyears so that's a problem so what do youdo with such a problem well you investin taking care of it Another way to lookat this how long time does it take totake care of our fleet this we'relooking at a standard migration we movedfrom Java I don't remember we say Java 9to Java 11 just to put an old number uhand we started working on it in Marchand then we went component by componentand updated it and uh 10 months laterwe're close to done good job right butnot super snappy i I can say that soyeah how do we change this this is thereason why we came to we need automationand we need to change from a componentfirst mindset applying our S sur umthoughts to a fleet first mindset whenwe want to mak�e a change and it's goodwe want to make it across thefleet cool let's build some automationso we we as you saw we can make a changewe call it the fleet shift you might bethe author of fleet shift you go in andyou create ituh not super uh interesting yet butanyway it has a target this is ourdependency update shift it has 2560 allour production services right now uh asa target it updates the bomb bomb is abill of material it's the it's thedependencies we have which we share incommon and update accordingly thanks tothis we actually have more or less asingle version policy in our poly repoworld okay cool so you create this fleetshift and and what does it do under thehoodwell this is the oh sorry this is thechange we're going to be making uhupdate the all the all the updateddependencies uh by modifying the versionof the bomb in allrepos under the surface this servicewill take the manifest you just saw uhor the or the summary you just saw andit will uh do exactly what a human woulddo when it was tasked with this it wouldcreate a couple of Kubernetes jobs oneper repo that job would clone the repoit would uh set it up and launch aDocker container which makes the codetransform or or the text transform inthis case um once done with that it pushthe changes it push the changes to abranch and then up u push it up toGitHub okay cool and then it does thatfor each uh eachrepo what Fleet Shift then does is watchthe progress on this one and keep trackof it once you're done with the compl uhwith the change it creates a pullrequest and there we go right each teamget a pull request awesome uh yeah tookcare of it well a lot simpler right sowe have a list of changes again this was200 something uh all the repos all thePRs createdall good aside from a little bit ofproblem because we're doing this maybeonce or twice a week on each repo thereare a lot of code reviews huh so whatcan we do to this well I mentioned italready during the incident review wewant toautomate change right how can I letanybody else or how could I let uhautomation go make change to mycomponentsi need to be prepared for that one and Ineed to allow that to happen okay fairenough so let's work a little bit withthe readiness of the fleet so what we dowe create a u create we we we realizewhat do we need in order to be readywell we have to have tests they need torun they need to have a decent coverageand they also need to pass so we alwayshave a releasable main that's good goodpractices anyway so so no surprises butlet's really work through it for allcomponents and then we also need to havea little bit of monitoring set upalerting setup etc uh and that's allgood we also need to be able to makethose changes and if everybody dowhatever they like it's very cumbersomefor us in platform to make the changesto whatever technology they have so thiswas the second time we realized thatyeah it's really good if we use the sametechnology so we introduced thisstandard terms and for those who haveseen Spotify from the trenches or agilefrom the trenches we're pretty famousfor being autonomous we still want to dothat we still want to make our owndecisions but maybe just maybe itdoesn't help if we make a new technologychoice every day so that was the changewe introduced uh so we identify whichtechnologies are in use the green onesand which is our bless technologies thegolden ones so yeah Golden Tech allowsus to do a couple of things itsimplifies for building software for ourengineers and it unlocks us to run fleetmanagement really really good in turnthen here comes the promise back we canmaintain all these components for freeonce you're golden you stay golden i'mgoing to do a knockknock on that one andalso going to realize there are someexceptions but but that's the motive atleast we take care of you and weactually also use fleet management tolevel up the technologies and use thestandard one so that's how we do themigrations cool and then we realizedthat okay this was good this was aneffort um a campaign we made to geteverybody on fleet management the crisiswas used we really really saw the valu�eof fleet management in this part uh butwe want to stay this way because we wantto continue to do it so then we kind ofintroduced this golden state concept aswell so here's the things we want uh aservice to do the technologists includedusing the bomb you just saw also thebase image also using the same kind ofservice which uses the same kind oflogging uh we call it Apollo internallyit's not the same Apollo service thatexists out there in a couple of flavorsand we also use what we call theinfrastructure because this doesn't haveto stop with just the uh softwarecomponents there are also a bunch ofother things we can fleet manage what ifwe can take care of all the resourcesthe databases the autoscaler everythingelse locus uh brilliant name or not Idon't know anyway it's our cachingservice caching layer um and when westarting to declare this it's just atext replacement right so that's whyfleet management and updating componentswere not enough also roll out githopsfor all ourresources which makes it possible tomake changes in this case we we changedit not to the locus but to the big tablethe second example uh we changed how weconfigured the big table clusters andsaved a bunch of uh saved a bunch ofeuros actually by just reducing thenumber of minimal nodes and cranking upthe scalability so we can handle a bitmore load because we have verified thatthat works and now we in platform uhaddress that across thefleet okay cool so then we have theautomation in place we have worked a biton the readiness so the culture iscoming there how does it work then wellyou see the same picture in the middlehere we have the fleet shift which umpushed the changes to GitHub when a PRis uh available ci runs of course beforemerge and one that is passing we havethe auto merger at top of the right onethe one that will be continuouslyreviewing for PRs by uh fleet managementand if this shift is one that we canautomerge it will doso once it merge CI runs passing thetest pass it on the deployment systemthe deployment system start deploying itall good we have uh another servicewhich is making sure that some guardrates are still nice we're lettingautomation go wild here uh that itdoesn't start breaking so we roll it outcohort by cohort and if we see anyissues out of the ordinary that's whenwe break it or we alert the feedmanagement team okay cool so thisactually put us in the spot where wecould handle this this incident uh a lotbetter and the and the uh developerscoming to the realization that this isfine okay it was cool when I got the PRautomated but also pretty boring to readthem now I can just go back in time andif something happens to my service Iwill look at the history and that's theonly team I need to interfere uh andthat's really really good we've trainedthe developers at Spotify me included uhto to not really care about the fleetmanagement PRs anymore actually slightlytoo much because we're having a we'rehaving a new problem and that's whenthey start failing people are still notlooking at it but that's we we'll get tothat we we'll sortthat cool so yeah where are we now we'remaking a couple of different changes uhwe started out with targeting repos polyrepo one one repo one component all goodyou feed it a list of names pretty soonwe realized yeah we need a data setwhich you populate with a list of namesuh and then that was all good for us inplatform who wanted to take the time toprepare the fleet uh the the data setetc but the next level was then we couldintegrate this in our code search systemthen you could just find I mean in theoriginal case you could search for logforj 1.4.12 4.12 I think it was doesn'tmatter uh and uh you would find it youwould also find it by metadata I'm onlycare about the services and the languageis Java uh and then I could make thischange and then we also embarked on thejourney of monor repos we have had ourclient code in monor repos for for manyyears uh here is not really good atapplying the change at the root of therepo this is where we need it on acomponent type based on the componenttypes and based on repo and path becausethat's how you identify a component inmonor repo so cool that's what we cantarget meaning one component at a timeor one system at a time we see a coupleof different changes we want to makingthe first one was the simple change justthe text to a reg x that's how we gotthe dependencies updated that's how wefixed the the cost for bigtableum but we also want to take care of moreso what we saw that we were we weregetting a lot out of fleet management inthe simple one and then we compared toothers and say but what but what elsewhat other changes should we be doingetc what more can we do so we startedgoing to code changes maybe we updatethe the beam version in our uh copipelines okay fair enough and here wego into a transform abs transformmeaning we changing source code we canupdate APIs we can deprecate APIs we canalso handle breaking version updateswith thisCool we also go into code cleanups weidentify a problem and we apply thechange across thefleet and of course this is good but ofcourse we want to do more of this sowhere we are exploring right now iswhere so many of us are what can we dowith AI how can we get these nice uhLLMs to make the more complex changesfor us but not just in our ID but alsoapply it across the fleet let them loseand do it um to all the fleet becausethey love simple repetitive tasks withvery clear instructions and what webelieve what we're exploring now what webelieve is that we can move in a coupleof uh migrations code migrations that wehave not been able to do beforewe also recognize that some we can docompletely we're never going to gowithout human review here or never ishouldn't say never we're not goingwithout human review now uh but even ifwe cannot make it completely if we canmake it 90% of it that's actually a bigwin as well so that's what we're playingwith rightnow so did this help us yeah the timefor doing migrations you remember forthe Java one it was 10 months uh howlong how far did we reach for 70% is thewhat this graph is showing how many ofthe migrations have we completed or howhow long does it take to complete 70% ofthe migration and we see as of uh 21this is becoming fairly low in the inthe order of one to two weeksuh also and that means that we'rechanging we're actually shipping newfeatures to our to our developers inthis time framewe also recognize that um we can expandthe usage of it here you see the graphof how many human PRs we're makingideally now freeing up time for thehumans so they get to make moreinteresting PRs and that is slowly goingup which is good of course but we seethe bots going a lot faster so 3:1 uhratio bots versushumans of course on humans making theinterest of one not we're going to denythat but what we're also seeing and as Imentioned with uh code u code searchintegration we're having more and morechange makers so it was not only us inplatform that benefited from thisplatform it is also the platform teamsoutside in the organization the featureplatforms like hey I'm changing my APIand it's the one that does what I loveto clean up our production environmentuh renaming the metadata just making ittidier so there's a lot of change makersgrowing And uh the promise to start withhere was how much time can we free upfor our engineers how can we set themfree to do the feature engineering workright and we calculate this in a prettypretty cheesy way but anyway each PRwhich is trivial we call that 30 minutesof work compared if you got a male andthe more complex one could be calculatedas two too much as much as two hourswhich is very far from true if you wouldhave done it yourself but but it's okayuh and in um in the end of 2024 we hadsaved as much as if 355 engineers wereworking full-time during that year uh oftime if you're interested in learningmore uh this is where you get to knowmore uh we have produced a podcast onthis i think it's NGN who's beinginterviewed our our chief architect uhand we've also produced a couple of blogposts on this one so please feel free toreach out and read themand that's what I wanted to share thankyou[Applause]2025-04-15 21:59:41.619023 ��_�)#��uAZr7y27HpII4hello everybody let's see if we can getsome yeah we have it on st screen aswell so I wanted to try out with alittle bit of storytelling here soimagine waking up picking up your phonenot that you have it next to your bedbut anyway let's pick it up there andthis is the first feed you read happento work at Spotify as well this ispretty scary actually so we havesomething that can execute code in ourback end all our backend services yesthey are Java services that's a prettythat's a pretty daunting way to wake upis not true because this blog coming upa couple of days later uh but of coursesomebody was woken up by this securityincident and I don't know because Ididn't see it but I can imagine the facewas basically like this this is going tobe a longday uh so yeah that was a problem acouple of years ago we know it as thelog forj incident it's important to usin Spotify because it was a pivotalmoment on how we operate ourfleet my name is Stefan i want to sharea little bit on how we were at thestorytelling style not ever going to be100% true but anyway how we were workingsuch an incident how we could have beenworking it earlier because again this iswhy we are working on fleet managementhow we take the pain away from most ofourengineers so yeah how would you do thathow what would happen you wake up youtry to understand a problem right firstquestion is this a problemyeah it's bad somebody can executewhatever they code they like in yourback end that's bad it's okay it's aproblem it's a problem is it a problemfor us though no we're fine rightbecause we're using another loggingsystem so we're actually It's all goodit's allgood aside from the fact that we're alsousing libraries which are using log forjso yeah maybe not that fine anyway uhand also we have many disciplinesanother one is the data discipline andwe have a we have a data processingframework called Shio which is alsorunning on the JVM it's in Scola but thelibrary is log forj is in Java so yeahyeah this is kind of a problem okay coolso the next thing is of course to reallyread the incident understand the fixapply it it was pretty easy in this casethere is a next version this nextversion has solved the problem justupdate the version of it test it deployit there you'redone also understand the scope of theproblem because again this was for ourbackend services and we have we havemore than two it actually turned outthat we��nched this sevenyears ago And today in Fidelity we haveover 600 houses across the organizationwhere we've given people the freedom togo and build what they need for theorganization Now Rachel has anabsolutely fantastic video on this inthe QR code you'll see in the cornerthere So highly recommend you go awayand take some time to watch that if youget a moment But with this comes ashared responsibility model The cloudplatform team would go and deliver thefoundation So if you think of that as ahouse it would be the water man cominginto the house But then the app teamswould then be responsible for buildingthat house and everything inside thathouse So if that water man bursts that'sabsolutely the cloud platform team'sresponsibility to go fix But if a tapstops working in a bathroom or a kitchenor you have a leak for whatever reasonthat's the app team'sresponsibility And the app teams lovedit Absolutely loved it It drove hugeamounts of innovation in theorganization and we had some fantasticoutcomes And if we start to look at tohow our cloud journey then went First itaccelerated but as time went on itstarted to slow and slow and slow untilit really peted out We're thinking toourselves what's going on We we've giveneverybody the dream here like they cango and build their own applications Whyis it slowing down It's like any goodproduct owner We had to go and sit downand listen to our customers So when wewent and met them they're all looking alittle bit like this We're like what'swrong We've given you the dream here Youcan do anything you want in the cloudwithin reason We set the safeguards andcontrols but you can go and build allthose fantastic applications And thenone of the appdev team leaders comealong to me and he said Dean I wouldlove to focus on building sorry focus oncore business delivery and not worry toomuch about the underlying infrastructureNot because I don't want to because Iwould rather like the experts to manageit and I manage my business applicationsI think it's fair to say when I firstheard that I was like "This guy doesn'tknow what he's talking about We've givenyou everything you could want What isgoing on here?" But actually we heardthis message again and again and againAnd then we had to start thinking do ourapp dev teams really want to worry aboutinfrastructure Are the two and a halfthousand developers we got really cutout for managing infrastructure Is thatthe best use of their time And actuallyso if it's not the best use of theirtime how can we deliver something thatlets them focus on their apps and notall of that infrastructure that theydon't want to manage today So we wentaway and thought about this We've gotthe cloud analogy What else could we doSo I introduced the cloud containerhotel So building on that analogy wethought to ourselves what if we built ahotel this time Rather than justdelivering the foundations we'd providethe entire hotel for the organization Wewould provide hotel rooms that we wouldthen give out to our app devs team butall of those hotels would look the sameand be very opinionated So this time ifa a leak occurred in the kitchen or thebathroom or one of those hotel roomsbecause they all look the we would be ina position as a platform engineeringteam to take away and fix that we couldthen let our customers focus ondelivering business value but we wouldthen go and focus on all the platformengineering and infrastructure wentthere around thatSo we went back to the app devs and saidwhat do you think of the hotel model andit was party time again at Fidelity Thethe hotel was everything they wanted Itcan now start to give them the time tofocus on the business applications andnot worry so much about everything thatcomes with that like the infrastructureSo that was all great Everyone's all thedevelopers are off and happy and excitedand partying over somewhere Back behindthe scenes we're scratching our headsgoing "Right we've sold the sold thehotel but how are we going to do this?"Fortunately for me I had Rachel in theteam So Rachel how are we going to dothis So Dean had the really lovely jobof� selling the vision which got everyonevery excited and I had the slightly moredifficult job of figuring out how wewere going to deliver on that vision SoI wanted to talk a little bit about whywe made the choices that we did So Deanhas mentioned that we started with CloudFoundry and actually our developers forthe most part really loved this It's athin viable platform because it's veryopinionated and it helped have thatstandardization across the organizationBut over time we recognized that thecloud foundry community was diminishingand over the last 10 years given thenumber of you in the audience I thinkit's fair to say that Kubernetes hasdominated the market So we needed tobear that in mind We work with a lot ofthird parties as any large organizationprobably does We don't build everythingfrom scratch We do buy and we do havemerges between those pieces of softwareand we found that lots of these thirdparties were expecting Kubernetes to bewithin our organization be it theyproviding Helm charts or something ofthat nature Also from a career vitalityperspective we want to have the bestengineers and to have the best engineersyou probably want to be using the besttechnologies and we felt that thatcareer vitality would come withKubernetes and probably not with CloudFoundry Finally we had several businessapplications that were already tied inwith certain products for example theBloomberg API and they were alsoexpecting Kubernetes to run So ourinitial use cases were Kubernetesheavy MVP If you have ever done asoftware engineering ticket as anengineer or perhaps you're a productowner with a white page problem you knowhow difficult it is to decide whatminimum really means to you in yourminimum viable product We are aregulated company So we also have someguardrails that we need to consider whenwe're thinking of the context of howwe're building Sometimes in a largeorganization like ours it's who shoutsthe loudest And we wanted to make surethat we avoided that and instead in thatlovely binomial distribution want totarget that 80% in the middle So how doI do that Well I think about ouracceptance criteria And by virtue ofhaving a platform for me by definitionit's going to be multi-tenantparticularly if we want to have thosestandardized rooms in the hotel thatDean has already told you about So forme that MVP comes down to the corefunctionality of Kubernetes And what isthat Well you want to host and run yourcontainer somewhere and you want to haveall of that resiliency and scaling thatKubernetes is famous for And for us ourguardrail and constraint was networksecurity and network isolationSo it's no surprise then that theplatform team was the one tinkering withKubernetes to provide said platform withsaid network isolation However thisoperating model didn't go down all ofthat well with some of our appdevelopers who got quite excited thatthey would be playing with KubernetesAnd when they found out that instead myteam would be doing the playing and theywould simply be popping their containeron top there was a little bit ofdissatisfaction which is hopefullyindicated by this uh lovingly generatedAIimage Right So I've got to startbuilding I've mentioned that MVP Whatwas that definition We could do a wholetalk on this and more but I wanted tospend just one moment to give a littlebit of a thank you to a few of the CNCFprojects that we use It is cloud nativecon 2 after all So we've got Psyllium asour CNI We're using Kyverno for policyenforcement and Carpenter is handlingour scaling to zero which is reallyquite good for our costsaving which as abusiness veryimportant So I've now built thiswonderful MVP So Dean back to you So Iget a phone call from Rachel one day Shesays it's ready Great Can I have a lookbefore we show the customers And shegoes "Da." I was like "Wow Uh maybe thisisn't what I had in mind." So I've wentback through all the epics and thestories and looked at MVP and gone"Actually Rachel you've built exactlywhat we said we would do for MP MVP Notsure this is what the customers have inmind though." And when we showed it tothem it was "Wow� one-star motel ratherthan five-star hotel like maybe we gotall excited about." But actually so havewe have upsold this a little bit toomuch And should we have been surprisedMaybe not So we coined this the hotelhype cycle And we're looking at thisthrough the eyes of our customers anddevelopers here And should we have beensurprised It's like any other productlaunch really So if we think of ourhotel here and actually at number onehere when we had that engineeringtrigger that's when we were hearing thatthere was too much noise around managinginfrastructure for our for our app devsin Fidelity today So we sold the dreamof the hotel and that really was thatpeak of expectation that we were goingto have this fantastic hotel servicethat was going to solve all ofFidelity's development issues Rachel'steam then busily went away andengineered their MVP service and thenvery much very quickly we hit the troughthat is actually the motel which is whatwe went live with on day one And for methis is the really really importantpoint At this point here the hotel couldhave crumbled and fallen down butactually we were very lucky to have ourfirst customers coming on board who werevery vocal which might been difficult tohear but actually the feedback they gaveus allowed to drive that path ofenlightenment where actually we couldstart to deliver some of the capabilitythat start to take the from a motelactually to a hotel For me right now Ilike to think we're at this finishedviable platform place where actually weprovide a lot of the capability we allowneed for our developers to move forwardBut I'm not sure our developers are evergoing to be fully happy with what wehave So Rachel that's through the eyesof our customers What did it look likefor your team though Wellunsurprisingly it also looks like theGartner hype cycle but this time fromthe lens of platform engineering When wetalk about platforms we talk a lot aboutdeveloper experience I feel likesometimes we forget the engineers butultimately they are developers too andtheir satisfaction within theorganization is also quite important Sowe introduced Kubernetes and we weresuper excited so excited career vitalityan actual green field project How oftendo you get to go into an organizationwith lots of money and get to designsomething completely from scratchexactly how you want it All of the bestpractices that you wanted to see But werealized quite quickly that our customerbase was Cloud Foundry users And so weslooped down into the trough ofdisillusionment very fast Cloud Foundryis a paz a platform as a service out ofthe box Kubernetes is not And so Irealized what we were trying to do wasrebuild Cloud Foundry this time inKubernetes Not sure exactly why we wouldwant to do that There's a lot ofcomplexity in doing so And I realizedthat barrier to entry was going to bequite high We had a new operating modeland the team that had previously beenrunning the cloud factory platformexpected DevOps in a purist fashion Itwas you build it you own it you run itwe don't touch your app And that's notquite what we're going for with thehotel operating model So we need toadjust our expectations And we blurredwhat was previously quite a cleardelineation between platform andapplication and shifted it slightly moretowards the applicationdeveloper So features are finallydriving enlightenment We've finally beenable to add some of those extra featuresthat are bringing a bit moresatisfaction from our customers But Deanand I are asking ourselves how far canwe go My platform team can only offer somany services and still have that twopizza model So at some point it's goingto plateau And I haven't told you whatfive is because we're not really surejust yet So I'm often quoted in mypersonal life as saying thatdissatisfaction lies in the gap betweenexpectations And that's what I'm tryingto show you on screen So both of thesefamilies have visioned for a hotel thatthey want to stay in for their holidayAnd as you can see they look quitedifferent That doesn't necessarily meanthat one is more valid than the otherbut it does mean that if� they go to eachother's hotel they're probably not goingto enjoy the experience My definitionand understanding of MVP was to build aninfrastructure platform that abstractedthat infrastructure from the customersIt wasn't to write their apps It wasn'tnecessary to offer guidance onarchitecture for their applications Andit also wasn't necessary to buildservices that might be downstream thatan application wants to use But thedeveloper thought they were gettingeverything as a hotel service the fullfive-star all-inclusive resort and we'renot quite ready yet to providethat So why are we not ready to providethat Well we're spending a lot of timerunning the platform We build on AmazonEKS which is fantastic and that hastaken some of the lifting away from usBut as you probably all know you needadd-ons from the CNCF to really makeyour Kubernetes platform flourish I'vementioned three of the ones that we usealready but we use quite a lot of otheropen- source tools as well My team looka little bit like this at the momentwhich is something I think about on adaily basis And we have a lot of testflakes I'm assuming a technical audiencebut to be really clear by flake I mean atest that is sometimes passing sometimesfailing and there hasn't been any changeto the configuration or code So you'renot really sure whether this is a falsepositive What this is indicative of isan environment that's not quite stableThat keeps me up as a product owner Sowe need to spend some time fixing thatWe're also doing continuous integrationin a purest fashion as well where we'repulling the latest changes immediatelyfrom the internet in the open sourcecommunity straight into a developmentcluster and we're running our full suiteof tests We actually destroy and rebuildthe cluster every time So our PRpipeline it's taking quite a long timeto run The more add-ons that we add themore tests we need to run the moreflakes there are And we have anoperations model where you do one day onQ which basically means as an engineerthat it's your job to deal with all ofthis So a little graph Now ignore thepercentages I've kind of made aguesstimate just to show you a nice linefor illustrative purposes only And I'vegot at the bottom here your time inmonths So this is from staging anddevelopment being built through tonon-production go live at zero which isthe motel that we've spoken about allthe way through to prod and then wherewe are now about 10 months later we hadquite a steep rise in operations andwhat you might call BAU for running theplatform until the point that we've gonelive that's simply because we'rebuilding out and adding more featuresand there's a linear relationshipbetween the number of features and thenumber of tests that you require It thenplateaued for a little bit while we gotcomfortable and slowly came down But asa product owner I'm kind of hoping thatthis this time spent on looking afterthe platform was going to come down abit more quickly so we could startbuilding other features So where you'vegot that steeper part on the graphthat's why we made a conscious effortsays talk about this in retrospectivesand have dedicated epics forthis But where do we go from here Wellmost likely it's going to look somethinglike this But there's also a risk thatit looks like two other scenariosSo the worrying future is that we add ontoo many features and we simply can'tkeep up with the maintenance and thatlooks something like this In an idealworld we continually bring down thenumber of flakes and time spent onoperations But at the moment it's onlydeclining on that shallow bit here Sowe'll have to give you another talk in ayear's time and let you know how we'redoing Right So I have a background inphysics and it's tradition for me to getsome mention of physics into my talk SoI'm coining this not the conservation ofenergy but instead the conservation ofcognitive load Now that probably soundsa little bit counterintuitive because wewant to get rid of cognitive load WhatI'm saying is much like energy it canneither be created nor destroyed So howfar can we actually abstract ourinfrastruc�ture because that cognitiveload has to go somewhereSo we built for the acceptance criteriaon here for the application teams butwe've sold the hotel model so well thatnow we have infrastructure use caseswanting to run on our platform as wellTo give you an example of some of thoseit might be scanning software orsomething that we're using forauthentication So not your IAS levelinfrastructure but other services thatcould indeed run onKubernetes But we've got a few softwareengineering challenges to think abouthere Blast radius If we're hostingeverything in that one cluster and itgoes down I'm probably going to have myhead on the chopping block and Deanprobably too but also chicken and eggdependencies which is what we like totalk about a lot in the organization Butwhat we mean there is cyclicaldependencies So worst case scenario ifwe have to rebuild that cluster but theauthentication that's requires aprerequisite is hosted on that clusterwe can't get going atall Right So to talk about this inlayers uh Daniel Bryant from Centasooften talks about the three layers fromhis Java background So I've kind ofstolen this from him You've got your IASat the bottom which is AWS and then youcan see where I thought I was going tosit in the middle there which isKubernetes and these otherinfrastructure services that I'vementioned would be parallel to us in thestack The applications would sit ontop Unfortunately everyone else thoughtit would look a little bit like thiswhere we Kubernetes would do all of theheavy lifting hence that getting alittle bit bigger and we would haveinfrastructure server sitting on top ofus and then applications on top ofthat We need to move that cognitive loadsomewhere My team simply cannot run allof the services We're only about 15people So instead what we're spiking outat the moment so this is sort of hot offthe press is how can we divide thatcognitive load and split it amongmultiple teams So I've indicated thathere by the potential for a multiclusterapproach where we have clusters fordifferent use cases Naturally thatreduces the blast radius but it alsomeans my team is not on the hook foreverythingRight I think Dean you want to ask himif we're going to deliver on our promiseSo Rachel are we going to deliver on ourpromise of that five-star hotel for ourcustomersWow it's a good question Let's let's getinto it So why don't I kick off bysaying where I think we were at thebeginning when we went live So we'vedrawn a draft graph here looking overfive different pillars of securityresiliency scaling capability and thedeveloper experience As we mentionedFidelity look after 950 billion ofpeople's money And actually security isa total non-negotiable as part of whatwe offered For us security we couldn'tcut any corners And we also had to taketime to upskill the security teams inthe organization as to what we weredoing Kubernetes was brand new toFidelity So for us we spent huge amountsof time and effort investing on thatarea of security And actually that comeat the detriment of all of the otherareas like resiliency capability in DevXWe were very lucky just naturally usingKubernetes comes with great scalingcapability But really we didn't reallydeliver on that glossy hotel experiencefrom a for a customer because we put somuch time and effort into other areas SoRachel do you want to maybe talk aboutwhere you think we are now Yeah pleasego ahead So I'm relieved to be able totell you that we've brought resiliencyand scaling up to match where we werewith security But you're probably seeingquite clearly that capability anddeveloper experience are lagging behindAnd unfortunately that's the realityThis does create a problem because wesell the platform typically as about thedeveloper experience But really it'sit's second to business value And forthe business we also have audit andcompliance which if you work within aregulated industry is simplynon-negotiable And we spent so much timeon those first three pillars thatcapability and developer experience islagging behind So be aware of that ifyou're starting this from scratch We arenow brin�ging this forward and our focusfor the next 12 months will absolutelybe on capability and developerexperience But I need to be mindful as aproduct owner that as we expand thosefeatures I don't introduce moreoperational work that I talked aboutearlier Thank you Rachel So the realityof is actually this is sort of where welook right now There is absolutely agolden path for our customers anddevelopers to go forward and have usedthe hotel for example The hotelsgradually increasing star by star bystar but there's still some way to go Ifyou stay on that golden path all's goingto be okay But it looks a little bitscary if you deviate off that path atthe minute But have we done the rightthing Did we go live too early with whatwe called MVP And that's a questionwe've asked ourselves a lot I think theconcern is we could have spentengineering uh forever engineering theperfect hotel but actually the feedbackthat we got from going live with MP MVPwas things that was never ever part ofthe conversation earlier on the featuresand capabilities that have come sincewe've gone live have actually drivenwhere we've got the platform to todayand actually the the difficulty in thoseconversations and the the feedback wegot around how challenging it was to usethe platform We would never have got theplatforms where it was today withoutthat So it leaves the team in this sortof position right now We got CloudFoundry on the left and we haven'treally mentioned this today but Rachel'steam are having to keep Cloud Foundryalive at the same time And we got thenew Kubernetes hotel on the right Forthe team we want to move away from CloudFoundry but it's going to be there forprobably another 6 n 12 months or soIt's been in the organization for 12 13years and there's lots of people verycomfortable with that wild west stylehotel maybe there's still fires everynow and again that the team have to putout and that distracts them from beingable to work on those features of theKubernetes hotel So we do need to put alot of time and effort into supportingour customers transitioning from the oldhotel to the new hotel so we can reallyfocus our time and effort on KubernetesFor us at Fidelity we're reallyfortunate to have a fantastic communitylike we've seen this week here atKubeCon The C community at Fidelityaround Kubernetes has been huge andactually many of our customers migratingalready spend time helping othercustomers on that journey and we alsohave an enablement function within ourengineering group which also has playeda big part in thatmigration So I think to wrap things upfor around five things we'd ask you toremember from today If you too weregoing on a similar journey we absolutelycouldn't have built the hotel withoutthe open source community and some ofthose plugins that Rachel mentionedtoday But beware and be ready for thetime and effort that comes in that youneed to contribute and support thoseplugins within your platform I thinkit's fair to say we probably oversoldthe hotel model a little bit too muchand we probably should have spent moretime staying closer to the uh customersas we went through that MVP uh cycle andcontinually reinforcing what MVP wasgoing to look like But what was reallyreally important was the feedback theygave us So it's always important to haveyour first guinea pig customers ready tocome on board to your platform and giveyou that immediate feedback And thenfinally I think the big point thatRachel made you can't abstractinfrastructure Whilst we might be ableto take it away from our customers it'sstill Rachel's team that get left withit at the end of the day So a big thumbsup to Rachel's team for the work they'vetaken on And I'm sure they're going totake on loads more into the future YeahRachel Most likelyAnd then I guess there's still things wedon't know For us the infrastructurecluster we're still a little bit on thefence Are we actually fueling badbehavior within the organization byrunning an infrastructure cluster forinfrastructure teams that don't want torun infrastructure I think whilst werecognize some of the capabilities thatwill� run on that cluster will benefitour customers in the long run likecontainer scanning for example we'restill not sure whether that's the rightthing we should be doing and maybeactually they should be running theirown hotel for theirinfrastructure And I think there's stillwe don't know what we don't know We'vestill got several thousand workloads tomigrate onto the hotel and there's boundto be things that we've not reallythought about yet So again there's stilla big unknown for us going forwards So Ithink that's everything we wanted toshare with you today I think we've gotfive minutes left So happy to take anyquestions I think there's a microphonesomewhere in the middle of the roomEqually uh there's the QR code for ourtalk today if you'd like to give anyfeedback Thank you[Applause]Um thank you for your talk Um so youstarted off with a raw vanillaKubernetes cluster and then you built aum motel and then a hotel Can youdescribe some of the features that gotyou to motel on hotel So we're buildingon top of Amazon EKS and those three uhCNCF projects that I've mentioned werethose motel versions So Selium for theCNI Kyverno for policy enforcement andcarpenter for scaling So I was reallylooking at this from a veryinfrastructure lens where we needed tomake sure that it was secure andresilient We weren't really thinkingabout other application features Sowhile moving towards the hotel we'reoffering sort of ancillary services likea database as a service for example Soas part of your declarative config youcan state that you would also require adatabase of flavor X of size Y and thatgets sent off with a FIFO key to apipeline at the back end that anotherteam is running as a concession and thatgets delivered into your namespace Sowhen I talk about these other servicesit's things that aren't traditionallypart of Kubernetes but you would needinside your cluster or interfacing withyour cluster for part of yourapplication development ThankyouAny moreNo Thanks very much for your time Thankyou every Oh one more One moreUh I have one more quick question Uh youmentioned about the declarative configuh for the apps Uh so do you deploy nowall the apps with Helm or like you planto use something else like Argo or anyof those Yeah absolutely So we run acentralized Argo service for ourcustomers as part of the hotel model butpeople are also free to use other formsof CI/CD in the organization So we haveJenkins within the org that people aretypically using to do their builds butmostly within the platform for all ofthose use cases I mentioned that requirehelmet is using Argo Cool Jhi thank you That was very good presojust a quick question on how do youbalance new features versus the timethat you're spending on maintenance andhow do you prioritize as you as you havemore customers as you say going to yourhotel I imagine the number of featuresrequests is increasing but also themaintenance work so how do you strikethat balance absolutely that's probablythe most difficult part of my job at themoment and I'm becoming quite famous forsaying no so what I do with Gareth atthe front who is my uh lovely BA we'retrying to make sure that we split ourtime across a number of epics that suitsour velocity with always having one epicspare for platform maintenance So loveto our own code base as an example Theway that we're looking at that in termsof prioritization is really quitemercenary It's business value So we'redoing something called big room planningAll of the different areas of thedepartment So data services securityenablement developer platformengineering all come together and saythis thing is most important to me It'sgoing to save x number of hours for thedeveloper It costs X amount of moneyThis is the perceived benefit If thisbenefit is bigger than this one thisgets prioritized This doesn't Thank youHi Um how much do you extract the setupand settings for your developers Like dothey have to have a lot of Kubernetesknowledge to get into the hotel or canthey just plug and play And so it'ssomewhere in the middle and I think we'dlike to make that more and moreabstracted over time Um the reality ofrunning Kubernetes is that you do needsome understanding of it to design yourapplication to run effectively inKubernetes We're quite fortunate thatmost of our use cases are webapplications So you can design that as a12-actor app Um but you do need tounderstand like the CLI So we want teamsto be able to go and troubleshoot whatthey're doing So cube cuttle would be anabsolute requirement When we sold thevision we sort of said to seniorleadership that there would be like noknowledge required The reality is thereis a baseline level of knowledge that'shere and we'd like to bring that down Uhwe are also introducing a portal Mypersonal view is that the orchestrationlayer should come first and then youlayer the portal on top but we arelooking to do some of that abstractionby having a single entry point usingbackstageI I think some of maybe the plugins likeCarpenter for example you know our teamshave had to get comfortable withhorizontal and vertical scaling andthose are sort of principles they needto understand out of Kubernetes to beable to accommodate things likecarpenter running Yeah of course ThankyouH uh thanks for the talkUm you mentioned that perhaps youthought um you released the MVP like tooearly Um did you find that when you wereon boarding like your first customersdid they still work with like with theold practices and with the old habitsand did that kind of corrupt like theprocess of rolling that out and um youknow what you thought the the bestpractices for this new platform shouldbe and did you manage I'll take that oneto start with So I think what I've cometo realize on my journey as a productowner is that customers will never befully happy and I need to accept that Iwill never be completely liked for thatfact So if you think about the originalan analogy of an MVP as being a caryou've come to me and said "Rachel Iwant a Formula 1 car and I go away for afull year I build you what I think isthe perfect car It's red it's shiny it'ssuper fast You know Vstappen likes todrive it." You get in the steeringwheel's really far away and you're like"Well I I can't drive this." RightYou're Lawson You've been droppedActually if I'd given you the chassiswe'd immediately have seen that thatsteering wheel was too far away but youstill wouldn't have been happy with mebecause I've given you a chassis not anF1 car So I don't think we released MVPtoo early because we would never havegot that feedback that has made theplatform that hosts our applications nowBut regardless of where we released thatMVP someone was always going to beunhappy So to any product owners ortechnical leads in the room you need tobe comfortable with the fact that you'rekind of parenting the organization inthat sense and you're doing stuff forthe long run for the good of theorganization but you might not beappreciated for doing itThanks Um so how did you kind of guidethe developers to use your new practicesand things like that Do you want to talkabout enablement Yeah so we're quitefortunate We briefly mentioned it in thetalk within uh our enterpriseengineering function in the organizationWe have an enablement function So whilstwe're the platform engineering team wealso have an enablement function So theywork very closely with us to understandwhat our platforms look like and thenthey really roll their sleeves up andget their hands dirty with our customersand appdev teams So they're actuallyable to inject themselves or parachutethemselves into those app dev teams andtry and steer them in the rightdirection Equally they give us feedbackand go Dean Rachel that's not workingvery well You know why are we doing itlike that We need to and they can helpinfluence the direction the platformgoes So that enablement function on topof our platform engineering group reallyplay a huge part there And they've gotsoftware devs in that group So they'repeople that do have technical backgroundand expertise that are equipped toadvise best practiceOkay Thanks All right I think we're attime So thank you very much everybody[Applause]2025-04-15 21:59:42.367136 �o��S�+#��]Ay0JgZ-hQ-Bogood morning um feels great to see um alot of you here in this talk uh where wetalk about how do you extend KRM beyondKubernetes workloads and reach to astate likeKRM++ umso so I'm MJ i'm one of the maintainersat KCP i work for Casti we basically dofunky things with Kubernetesi am Navarun i do some Kubernetesmaintenance and I also contribute to KCPon the side and you can find me on thesesocials so let's get started with thetalk so we're going to see three thingsin this talk in part one we're going tosee um what is scarm if you are notreally familiar with it or um don't knowthe gotchas or nitty-g gritties of itand we will come to know like why it gotsuch success then we will explore howdoes KCP add to it and how uh the KCPproject is pushing for the nextevolution of KRM and later on uh we willtry to show you a demo of how do youcreate custom APIs without using CDs butstill the Kubernetes API machinerythroughKCP so let's get into a problem like whyare we here and why are APIs so hard todo so imagine you are a tech companyrunning a hybrid cloud setup and youwant to create infra resources likecompute or some networking stuff or somestorage things so you'd be interactingwith a lot of different cloud providersif you are on multiple clouds theproblem that comes is consistency acrossproviders every provider has their owndifferent APIsinteroperability between them and thenthe whole ecosystem how do you buildthings when you try to integrate all ofit it surely becomes a littleoverwhelming and at that point you mightask yourself a lot of questions what arewe even doing and often why we do thatis we really li��h�*#��AahANKkTT-yothank you very much for coming to ourtalk today on building a five-starKubernetes hotel We'll introduceourselves and then dive straight inbecause we want to make the most of the30 minutes So my name is Rachel Wanottand I'm the technical product owner forthe Kubernetes platform at FidelityInternational I'll let him introducehimself but this is Dean Fuller who hasthe wonderful job of looking after meThank you Rachel So hi everybody I'm uhDean Fuller I look after what is knownas developer platform engineering atFidelity Some of you might have heard ofFidelity International You may havepensions or personal investments withthem but Fidelity have been around since1969 We operate in 27 countries and welook after 950 billion of people's moneySo Fidelity have actually been on ourcloud journey since 2013 And that firststarted with Cloud Foundry on premiseand it's actually been hugely successfulin the organization Over the last sevenyears we've been on our public cloudjourney in the way of AWS and Azure Andthat supports the two and a halfthousand developers we have at Fidelitydelivering all the capabilities for ourcustomers and internal systems So todayyou're going to hear me and Rachel talkabout our customers We follow theproduct model here at Fidelity So whenwe're talking about customers we'retalking about our internal app devs atFidelity So let's get started on whatthat cloud model and operating modellooks like at Fidelity So as I say wewent into cloud seven years ago and wevery much did it on a very purist DevOpsapproach and the way we framed it is thecloud house effectively Me and myplatform engineering teams in in theorganization would go and build thefoundations for the house and we wouldthen hand that over to our customers tobe going and be able to build theirhouses that might host the customerapplications or the internal businessapplications And it was hugelysuccessful when we lau��ke reinventing things welike to build our own new standardsbut then should we as a tech company whoknows what their business goals are whattheir business metrics are you shouldideally do what you know the best youknow your area of business you knowwhere is your value proposition and youshould start to stop inventing outsidethose areas and you need to start seeingsomething which is more like a standardand used by a lot of peopleso we do really have an answer that weshould do not try to reinvent if thereare existing patterns that you can useand there comes the Kubernetes resourcemodel you all have been using it throughthe Kubernetes API it's basically theAPI surface of Kubernetes and how thingsare done what are the key properties ofthose first declariveness so you don'task the Kubernetes API to do things stepby step you don't ask it to hey can youstart creating a pod of this and thisbasically say one by one what youessentially do is you tell it directlyhey I want to create a pod with thisimage with this resources and here iswhere it should be located it'sbasically the declariveness ofKRM and when you start creating thoseresources using KRM you have a structurewhich is pretty consistent across thewhole surface and it has been consistentacross um probably decade now so what doyou see when you create a Kubernetesresource you have an API version youhave a kind what kind of API it is thenyou have certain properties of theresource like its name some metadataassociated like annotations labels thenyou have two key properties what is thestate of the resource that you want whenI explained about the pod that's yourdesired state and then you expect thesystem to return you back like stampyour resource with something like astatus now all of this is known to usright it's sounds very trivial now andwhen you try to create any object in theKubernetes control plane you essentiallyum have this specific format no matterwhat you are dealing with um other thanprobably two outliers config map andsecrets but uh maybe we can skip thosewhen talking about this and it justworks right nobody questions thestructure anymore everyone is quite okaywithit and the next important part is whenyou tell the system that this is thestate Idesire there are some processes runningasynchronously reading your objectsreading your resources and then tryingto reach that desired state if it canreach that it will stamp back with somegood successful statusesor if it is not um able to reach thatstate due to some error condition orsome unfulfilled um states or anythingit will report back the same status backto um the resource itself so these threethings make the Kubernetes resourcemodel really powerful for your use casesbut do you think it's that easy like docan you just start writing some go typesuse kubernetes code generator and thencreate crdsum does sound easy but then you can doreally u weird things with it so thereare some pitfalls there are some gotchasthat you have to know when writing uh kmAPIs some quotes from Albert Einsteinum so let's come to the things that youshould avoid when writing KM APIs firstmixing the business logic into the CRDSdesign so we talked about spec andstatus right two very segregatedu parts of the object sometimes whathappens is we might try to put too muchlogic into spec that's not somethingthat we should do we should just putthings which mention how the desiredstate of the system should look like ina spec and not too much of proceduralknowledge and not clearly separating thespeck and status for example here in thecase of service right when you uh whenyou ask the Kubernetes API to create aservice um do you really give clusterAPI cluster IP always not really you saylike what port I need to expose and itcomes and stamps cluster IP in the specwhereas should it should it have been inthe status i mean this is all debatablebut ideally your desired state should belike obviously you can specify your owncluster IP but should the field havebeen called like desired cluster IP andthen you have the actual cluster IP in uthe status so things like this yo�u needto think about when you design your umKRM API and then if you ignoreestablished API conventions you againrun into pitfalls for example here whenyou define a CRD you tell what kind itshould be named when you do keep codleget so and so um for example here if yousee DB connection it should ideally beideally have been named capital Dcapital B capital C and then the rest ofit but if you see it it is not thatclear how do I read and then whenspecifying the plural it should ideallybe a plural name and not just a singularagain which creates more confusion umpretty um interesting example is if youhave an um API called sheep what do youdefine the plural as um should sheepsshould be flock againquestions um and then when I talk aboutuh creating like writing a go type likegostruct putting fields you shouldreally leverage the things that KRM isgiving you like open API schemavalidation then you have when youproperly tag your fields correctly theopen API uh schema will have enoughinformation for your uh API clients toactually derive information from theobject itself things like if you want todo some pattern validation or do somesort of simple validation or maybe evenlet's take it a little bit ahead with ucell expression validations you canessentially run uh small expressions aspart of the Kubernetes API serversadmission process to validate youruh custom resources so things like thisreally make it powerful but you need touse those if you don't use those thenyou again run intoproblems um and then again like classicexample like if your if your request wasactually not fulfilled but you arereturning a 200 okay with the payloadbeing status code 400 and status beingbad request that's not how you shouldwrite your APIs so this kind this kindof pitfalls are again to be avoidedand then let's talk about evolution ofAPIs let's say you wrote your API oncebut you keep on adding fields thereneeds to be a proper upgrade strategythere needs to be proper versioningconverting between two versions thosekind of problems you run into right soin in the KCV project we uh ran into asimilar situation where we have thisworkspace API but it has been around forlike since the inception of the projectbut we have been adding new things to itum recently we made a small breakingchange not for consumers but anyone whoconsumes the um workspace API through umlike the golibrary that started a that started adiscussion like hey I have the nextversion v1 alpha 2 but then is that thecorrectone how what are the differences betweenv1 alpha 1 and v1 alpha 2 a simpleconversation started a thread of 309replies and I'm pretty sure by the endof the conference we would reach like500 replies on the same thread so it'svery essential to actually discussthings when you upgrade your APIs orcreate new versions on how and you youneed to really think how your consumersare going to consume the APIum sometimes we really don't know how todo it but then we have to discuss uhwith everyone all the stakeholders andas you saw here like things escalatedreally quickly I think um once thethread was created by probably a coupleof hours there were 100 replies in therelike discussing how this evolutionshould work out and how how do weactually do the conversionstrategy then again another pitfall islike we have been talking about APIs allthroughout right but what about thereconciliation rule so Kubernetesprovides you some tools and resources towrite your own reconcilersum if you if you wrote reconilers likelet's say six years back like if you seesome uh Kubernetes controllers probablyall of them are migrated now really oldcontrollers they're written in like workq and by manually creating workerprocesses managingthem that pattern is now slowly evolvingum if you really need to use controllerruntime um so here you see some goodpatterns right when you are writing yourreconciliation loop there might be acertain set of actions that you want umthe controller to do now what you can dois you can atomically divide each ofthese operations into its own reconcilernow what it does is each of thesereconcilers can be� easy to write easy totest easy to mock and the the completereconciler would be very easy to read byanyone and when you aggregate statusesacross it it's very easy for you to alsolike atomically think um okay there wasan error here now what kind of error itwas was it something which isrecoverable was it something like astate of the system which is transientbut if I recue it down the line maybe 5minutes later I would be able tosuccessfully process the object so youneed to think about all of this you needtoaggressively think about how do youreconcile um do you result in a propererror state do you result in a transienterror state or do you result in a uhgood state as part of that reconcileiteration um here are some examples likewhat can we do in um each of thesecases now there are a lot of otherpatterns that you can learn from theKubernetes code base or a lot of otherprojects in the CNCF ecosystem who aredoing KRM APIs um a very neat examplehere is like how do you do the committerpattern um I would just tell you it inlike 30 seconds the committed patternessentially disallows you to change thespec and the status in the samereconcile loop there are obviously somehacks around it but those are notsuggested to be followed ideally in thesame reconcile loop you should onlychange one and a controller shouldideally only change the status and notthe spec um that's how things shouldwork and there are other patterns thatyou can learn from controller runtimecube builder they all help you to writeum your reconcilerseffectively but how is CRM better thanwhat you have in your company now youhave a custom solution that you wrote umthat works for you but you might alsothink like hey Navarun is talking aboutcreating a 16th standard and I alreadyhave the 15th one why should I go to thenext one actually it's not the 16standard it's the same thing that youhave been using allthroughout as consumers of likeKubernetes or other CNCF projects youmight be using one or more KRM styleAPIs already in your tech stack andthat's why it's nothing new that you'retrying to do it's just a it's juststicking to the um the general flow thateveryone is using so why not just stickwith something that you are alreadyusing so now MJ will take over so let'sswitch the gears a bit so now KCP andhow it makes it better so let's say yousold you want to use KRM you want tobuild your internal cloud or externalcloud for your organization and you kindof usually end up with something likethis this is usual microser architecturewhich you would have in your platform orcloud you want kubernetes kubernetesmicroser calls virtual machine microservirtual machine calls compute microserthat need storage networking and youhave this mesh of interconnectivity likethis is nothing new like everybody whoworks for hypervisors knows you run thisalready everybody who runs for a biggercorporate has this so this is look likethis so okay we still sold to KRM let'suse Kubernetes model to implementsomething like this so the flow as Imentioned would be like request flow anda pattern so that looks good it's likewe've been doing this for a decade nowbut usually what you build is this nowyou have a second region now you want toserve your US customers Europeancustomers maybe some customer wantsisolated environmentand at some point you reached this pointwhere now you started with oneKubernetes cluster many operatorseverybody's reconciling everythingKubernetes team popped up and said likewe are too big now let's do our ownKubernetes cluster now you're going totalkcrossclusters and now you've beencreating like CRS and CRDs acrossclusters threeilers and there's manyprojects out there who's doing thatstuff and is it bad no it's like we'vebeen doing this for a decade Nowtotally the same flow would be here likeif the compute team needs a storage mostcase storage cluster contains somenaming pattern how you separate tenantsby name spaces where you create yourrequests like storage requests computerequests it get implemented so you getthis reconciliation mesh betweenmultipleclusters and once things start to movelik�e storage team updates an API maybewe add new field did some breakingchange things break down so okay westill want to use KM we still want touse Kubernetes but like how would you dothatso what we have here is we have manyclusters problem like why KRM well KRMitself as API standard is amazing it'sis great everybody used this for thedecade it's vessel its ability tofragmentate tenency isolation and havean API management behind it it's not sowe have an API standard we don't have away maybe we do but we don't have a wayto manage it alternatives is Bclusterslike you say okay we don't want to havemany clusters let's run pod per tenantbut that doesn't solve the problems itreduces the costs but you still havemany clusters problems API management isnot still addressed we have a genericcontrol plane which is a kiddo projectunder kcp which does very similar asvclusters but again same challengesstays you just save some few bucks andmaybe operationalcosts so what's the elephant in alike we need strong multi-tenency forCFDs for the APIs we need API managementinterconnectivity for those things likeall theAPI things which we don't at this pointso KCP Kubernetes control Kuberneteslike control planes so if the Kubernetescluster looks like this where yourtenants is name spaces you have airbackrules some magiccontrollers in a KCP land this would belike This KCP is just yet anotherapplication running on your Kubernetescluster it's not a replacement to yourKubernetes clusters it's just a flavorof Kubernetes API server which allowsyou to do stronger multi-tenency and APImanagement so previous picture which wehadbefore so how the same architecturewould look like in a KCP land is everyteam gets its own box where we canmanage the APIs we can do own things butwe still operate a single control planeso now we kind of trying to eliminatethe many clusters problem and we createthis we did that in a KCP community andit's like okay so what's next so weadded this abilityto life cycle APIs too so we gotstronger multi-tenency we split thosebox aparts we still have in that samebox application lessfragmentations and we've working stillworking it's very much work in progressabout how to support multiple versionsof APIs and how we do that and this isvery important thing is like in a normalKubernetes cluster when you install aCRD you install a CRD API extensionserver accepts it installs it say it'slike done and the CRS live in the samecluster so it's a one torelationship what we did we split thislogic we say API provider ownsCRDs instantiations of those CRs on theconsumer side and we put a boundary inthe middle we call it API bindings andAPI exports what this means now you canhave a one to many relationship whereKubernetes cluster is very much onetooneyou have a API provider API resourceschema which represents a CRDS and youhave a consumers which representCRS on top ofthat tenency we isolate them in avirtual API servers where now your teamC can become a consumer of team DB APIposgress SQL database and I'm rushing abit because I want to get to the demo towant to show that in actionso some of you might seen this alreadythis is the tree representation of KCPworkspace tree this means this is howyou organize your virtual API serversyou would say your application towerwhich has application platform X manageKubernetes your compute tower which hasvirtual machines networking and thingslike that so it's a logicalorganizational structure represented ina in a tree structure and that would bebasically what's what you built so youhave differentteams interacting with single controlplane calling to each other you havethis medium to do all these thingsso and that's not a new thing like ifyou look this and it's like oh this isvery much overwhelming like why I shoulddo that like you doing this daily I betevery one of you like this is a free APIfrom Google cloud like clusters API aSQL admin API for creating manageddatabases and a compute to create VMslike it's it's literally represents thepicture here so while we havethis we use it daily we might not see itmentally but the control pattern and howwe do is out there when you I bet whenyou create a Google manage computeengine it talks to the compute instanceto get VMs it talks to database to getsome etc flavor stuff and it constructsa managed service hence how it works sowhat is KCP began to replay it it's asingle platform to manage multipleKubernetes resources APIs andtenants by shifting API surface out ofKubernetes cluster like all your computeall your controllers and everythingstill runs in your cube clusters theydon't go away they just run a workloadlike any other workload but what we dowe externalize the APIs they shift a bitleft or right depends on yourperspective and this is how we enable tobuild platforms basically so nowdatabase as a service example so thisexample was done yesterday in a workshopwith had about GCP so if you if you wantto replicate it just go to the thisrepository there is a very muchwell-driven instruction how to dothat mr robert sitting in second wrotethe most of the writing so kudos to himso let's try so demo god so I have Ihave two tenantshere so two tenants basically twological clusters you will see they havea different URLs is it enough should Imake itbigger i make it a little bigger yeplet's see how it goes so so my tenantone is a consumer of database as aservice so if I do get clusters thatrepresent the database I have a databasecluster which if I get out which hasvery overwhelming spec which kind ofshould be needcleaning if I find the type I can seethat that's a posgress database i have atenant two if I do getclusters it's same cluster name sameobject type and how this object type nowis appeared in this cluster it has thisuh API binding which we talked about soI have a API binding to the posgressdatabaseteam like I will not go into details howthe providers are wired up andeverything like you feel free to go intothis uh workshop or other talks and thatbut how this is now represented at theprovider side like a database team ownsa Kubernetes cluster and if you noticehere like what I have is I have twodifferent namespacesrunning same name databases is outthere so basically what and if I dolet's say still have few minutes solet's try to improvisesomething API clust consumer one[Music]cluster so I basically rename itIt's always interesting typing in a livedemos if I create a new one okay secretsare present you know consumer one i seethat that's basically getting being setup and on the side I can see that tenantnamespace representing here is beinginstantiated so that's very highlevel but idea isthat you have your compute cluster whereall the database team spins up thethings and everything and you have manyconsumers which don't talk to each otherthey interact they own virtualenvironments and in the same way thatconsumer could be other team usingdatabases to build something else sowe basically did a control plane as aservice and database as a service so KCPhas quite a lot of talks already in aYouTube channel and everywhere so we hada tutorial yesterday which we had linkedwhich if you like the database asservice you can go in there and do ityourself you are here now we have onemore talk related to KCP which is willbe very interesting i would stronglyrecommend to go into that it's about anew kid on a blog uh how to buildmulticluster controllers controllerruntime[Music]so and so that's basically itand platform engineering in general isreaching a I would say a quite amilestone now because more and morepeople are reaching this point where wehave many cluster problems so if youlike what you see if you want to joinus come and talk to us we have stickersyou can find us at KubernetesSlackpev or Twitter X and your feedbackis very much welcome any final wordsthis one over the cool so we still haveI think few minutes for questions ifthere is any um just to add one thing ifin case you are uh interested to go tothe like do the demo yourself uh the oneeasy link is you can go todocs.kcb.io/contribum that's a website with all the demosand all the presentations that have beendone for KCB in the recentpast cool2025-04-15 21:59:43.133032�e difficult If something goes wrongmid update stopping or reverting theprocess isn't always easy Without asolid rollback strategy teams maystruggle to rec recover quickly leadingextended outages onitability So how do we fix this how dowe make Kubernetes updates smooth andsafe for that I will hand over to Ba whois going walk you through how toautomate Kubernetes updates the rightwayOkay great When it comes to upgradingyour cluster there are two parts thatneed to be upgrade The master and thenodes The master need to be updatedfirst and then the node can followBefore we dive into the kipanda toreplace nodes let's take a look at theprocess Let's say we want to upgrade theKubernetes war from1.26 to 1.27 20 27 our note pool hasthree nodes which are severaldepartments and parts running insideit When upgrading nodes there are fewdifferent strategies that you can useThere are two that I want to focusrowing update and magregation use usenote pool The simple way to update yourKubernetes nodes is rowing update So arowing updates works in the followingway One by one a node is dragging codenso that there are no more parts runningon that node The node is deleted and anew node is created with the updateKubernetes version Once the node is upand running the next node is update andso on soforth It does has a few drybacks Onedrybacks is you actually get one lessnodes of a capacity in your cluster Ifthe existing nodes didn't meet therequirements of the evicted ports theywill they would be re rescheduled Sothis can cause a lot of disruption toyourapplications But this issue is easilyresolved by scaling up your notebook tounder another uh cap uh capacity andthen scaling it back down once theupgrade iscompleted As you can see a new nodes isadded to the cluster and runs the thedesired worm It is 1.20 27 in ourcase But what about the exiting nodes weneed to drive one of the old nodes tominim to minimize the disruption torunning applicationsThis options will move all therescheduable PS from the old node to thereplacement or other exiting nodes inthe cluster It depends on thescheduling result performed performed bythe Kubernetes scheduleuler In order tomake sure the pause can be placed on thenew nodes we need to similar the scalingto launch a suitable machine from thenode pool and place some metadata to thenode Hand hand isuh manual is hand What is the kipanawhat what the kana does kanda take asnapshot from thecluster to build a relation relationshipfrom between the nodes and pause Placethe place the pause on the on the on theon the existing nodes If ifcount they uh he will get a templatefrom the note pool and uh place placethis part and place this part to thenode once the old once the old node isis is for for foruh is calculated The the no theorder the the older node order can bedeleted Repeat repeat it until all thenodes in the cluster have beenupgrade Kandanda provide a featurecalled draft to update your nodes likewe just described You can only updatethe node pool and the node class objectIt will take care of the rest Thanks tothe kipanda the process of upgradingnodes is mucheasier Node pool says a construct on thenode that can be created by the kandaand the ps can run on the noseAdditional it also allows the pause torequest request the nodes based on theinstant type OS or other attributes byaddingsomespecification to the Kubernetesdeploymentmanifest A node class is where youdefine which subnet security group AIAMI group AMI role the nodes will willuse Hand use node claim to manage thelife circle of Kubernetesnodes It serve as a request for thecapacityC Kapanda create a node clam in responsefor to create a to to pre privation anda dis disruptionneed This image show how the kanda worksWhatever kipanda create a node claim itask the cloud provider to create ainstance registerlinked the created nodes with the nodeclam and waiting for the node and isresource to be readyIf a node iscreatedis if the node claim is deleted the nodewill bedeleted and the and the part on on onthis nodes will will be evated by thekander If you want to or upgrade to runfaster we can increase node disruptionbudget which means more nodes are areadded The upgrade operation is possibleat the same time However it means alsomore nodes arejoined at the same timeA powder disruption disruption budget isa Kubernetes resource that specificatethe minimum number of P that must mustremain available during a reuh disruption It it helps maintainmaintainapplication stablestably by preventing too many ports frombeing unavailable at the same time If anode is marked as dis disruption with areason kanda will check where if thenode contains a part that cannot bedisrupted before theydrag the node If yes the node will beselected as a continident until the partis deletedIf a single republic department is notprotected by aPTB it will cause a disruption on the onapplication once is it isevicted The rowing update proposal isnot accepted by the kanda due to somedrybacks For example the departmentreference a volume which cannot beattached on different nodes at the sametime for SLO said there are no betterway to hand the hand it in this caseThat's all Thank youOkay And uh as we have already discussedCarpenter plays a crucial role inautomating Kubernetes scaling But howdoes it actually work across differentcloud providers as we all know uh capinitially carpender is developed by AWSand is donated to CNF in 2023 and fornow the uh uh community is become uh actactive So let's take a closer look Rightnow Capanta supports the PI providersyou see on the slides and at Cloud Palaiwe have contributed the Alibaba cloudprovider and we are now actively workingon adding GCP support Uh the GCPintegration is still in development andit will be available in next two monthsSo staytun and but uh like any other truthsCapanda isn't perfect Let's talk aboutsome of its current limitations Firstthere's no gradual update control RightNote updates happen without intermediatestate checks or fine grade controlswhich means you can pause the updates inprogress This can lead to unin intendeddisruptions Second Capanta doesn't havea built-in fallback strategy Ifsomething goes wrong like an AMIincompatibility that crashes a criticalservice There's no automated rollbackThat makes recovery much harder andslower Finally in certain scenariosespecially for migration sensitiveapplications like real time gamingCapanta doesn't account for the besttiming for node migrations withoutcareful coordination Scaling down at thewrong moment can disruptservices Okay now let's talk how weextend its features At Cloud AI weoffered a manage Capender cloud servicewith intelligent features that takecloud management to the next level Firstintelligent node selection Instead ofjust provisioning nodes based on basicrules Cloud Pay analyzes workloadcharacteristics cost data and CPUarchitectures across AWS Asia and GoogleCloud This ensures you get the most costeffective instances while balancingperformance and stability Second sportautomation with advanced interruptionprediction Sport instances are great forcost savings but they have come with abig risk sudden termination often withjust a few minutes notice But Cloud Palichanged that by predicting interruptionsup to 120 minutes in advance usingmachine learning trained on historicalpatterns But prediction alone isn'tenough and we pair it with automatedmigration strategies similing workloadsbefore an instance is reclaimed Thismeans you can use spot instances withconfidence So to sums up cloudi buildson capendo strength giving you smarterscaling uh better management and mostcost savings without compromisingreliability instead of just provisioningresources It helps you provision theright resources at the right time So umthat wraps up our talk If you'd like todive deeper here are some useful linksfor Capanda Also these slides arealready available online So feel tocheck them out later And uh we have aCapenta kiosk at uh project pavilion Andif you have any questions welcome to ourbooth tomorrow morning and ask anyquestions you like Yeah And uh here'sthe feedback QR code We would love tohear from you Thank you2025-04-15 21:59:43.955344 �g�,#��ArAIcQvKBuZAgithub name is Ki I'm I mainly uh focuson Kubernetes project and SQL storageYou if you have any question about theKanda you can send me in Kubernetes flagThank you Okay And today we are excitedto talk about something that everyKubernetes user has to do with updatesAnd how many of you have ever run intoissues when updating Kubernetes clustersraise yourhand So lots of uh so we all know thatuh upgrading Kubernetes can sometimesfeel like a headache Even small updatescan cause issues like inconsistent nodeconfigurations downtime or compatibilityproblems But what if you could automatethe this process and make it seamlesswith zero downtime and that's exactlywhat we will dive intotoday And before we get into theautomation details let's first talkabout why it's critical to keep yourKubernetes data plan updated especiallywhen it running on the public cloudFirst one common reason for updating theuh data plant is the need to modify nodeconfigurations such as tweaking kubletstartup parameters or upgrading the OSversion used by the nodes Keepingconfigurations up to date ensures smoothoperation and workloadcompatibility And second new kubletversions come come with security patchesperformance improvements and newfeatures Skipping updates can lead tovulnerabilities and reduce clusterreliability Lastly Kubernetes releasesevolve rapidly As we all know if thedata plan lacks behind incompatibilitiescan erase leading to unexpected failuresThis sync helps avoid that and keeps theentire cluster runningsmoothly and clearly updating theKubernetes data plane is important butthey also come with challenges If nothandled properly things can go wrongsometimes with serious consequencesLet's take a look at some of the risksinvolved in updating the data plan Firstif not all nodes are updated Differencesin Kub Kubalite versions or OSconfigurations can lead to unpredictablebehavior and application failures Secondif you manage several clusters manuallyupgrading each data plan can be atedious and resource inensive taskdelaying deployments and consumingvaluable engineering hoursThird there's always a risk of servicedowntime during the update If workloadsare not properly drained or rescheduleddisruptions can occur affecting serservice uh available Finally roll backcan b��ellyou a bit about crossplane Yes thank youUm yeah and I wanted to start from afundamental concept and that's theconcept of a control plane Um we firststarted seeing softwaredefined controlplanes come up during the rise of theinternet during the '9s Um when weneeded to manage you know millions ofroutes and we're having explosive growthSo um there were a couple of key thingsabout control planes that defi came fromthat era There was something as a anorthbound interface and this was acommon interface that we h had thatcould be used uh for all different typesof clients And then there's this idea ofa southbound interface which is whereall the commands were sent to thedevices and all the routers Um today wehave probably one of the most popularcontrol planes on the platin andkubernetes um you know it's very welland you know we have an API server inplace of that northbound interface andthen we have things like controllersthat are the southbound interface rightso um the idea of crossplane is thatwhat if we took key technologies withinkubernetes and use it to make auniversal control plane uh can we manageeverything with kubernetes um and whatcrossplane does is it's an extension tokubernetes and it uses core technologieslike cds custom resource definitions andcontrollers and he uses that to expandextend Kubernetes reach to be able tomanage almost anything I'm going to talkabout a few key things here Crossplaneis a massive platform and there's a lotto talk about but I'm just going to talkabout a few key concepts here um and youknow how they fit in with you knowbuilding a regulated platformSo the number one thing to understandabout crossplane is this idea of amanaged resource and what a managedresource does is it match it maps oneobject in Kubernetes to one objectsomewhere else outside the cluster andin this case we have something calledyou know we're using a bucket here andso on crossplane the bucket is modeledas a CRD and every single thing that'sin the cloud providers API is present inthe CRD as much as we can um and as partof that there's a controller that runscontinuously um that looks at theKubernetes API server and whenever auser asks for something like newinfrastructure it will go talk to thecloud providers API and provision it Umso you could see here the two keyconcepts of Kubernetes that we're usingWe're using custom resource definitionsto map resources and controllers tocontinually reconcile and provisionthose The next thing is because we'rebuilding complex infrastructure we needto combine resources Um so we have anidea of comp composition and this iswhere we take multiple managed resourcesand we bundle them together Um and thenwe present this to an end user And theclosest uh analog I could get to thisthat I like to use is something like therestaurant right so when you're creatinga restaurant um you're defining thingsthat people can order right so that'syour custom resource definition We callit a composite in crossplane It's an XRDbut this is the menu And as a platformengineer you define these things andthey're all Kubernetes native Um then wehave the composition which is our set ofrecipes right we put those all togetherto combine and create a meal So um interms of building a platform these arethe components that crossplane uses Umwe're going to have one part where youas a platform engineer are defining APIsand these are XRD and then thecomposition and in order to create thosethings because we know when you'redefining things like uh Kubernetesclusters or creating databases there's alot of things to configure Crossplane umuses something called functions Um sothis is uh it's like a serverlessinfrastructure that we have and youcould use any programming language tocreate the desired state Um so whathappens here is that when you run afunction crossplane will pass all thedata into your function and you couldwrite it in any language and then it'spart of your function You can see uphere the top example is something in KCLWe're creating an item and the bottomone is in go templating So if you'veever used Helm uh this is about 95�% thesame Uh you could also write things inGo or Python Um I think there's a sheetC car SDK forfunctions So the just the key componentshere just a quick summary why would youuse crossplane as your base um the firstone is the architecture is very good Umwe have everything has specialized andthere single ownership of everything Sofunctions define your desired stateproviders talk to different APIs Yourcomposition is what you provide tocombine resources and your XRD is yourAPI which is Kubernetes native CRD Umcrossplane controllers run continuouslySo they always reconcile drift detectionUm everything that you do in crossplaneis a CRD So all the things like if youuse uh thirdparty tools like Cerno or ifyou're using inbound Kubernetes APIthings like Cell um you could provideAPIs to your end users that support allthat Um and a really strong part ofbecause we're based on a platform ofKubernetes is that a lot of theoperation things in crossplane are verywell handled Things like uh ourproviders emit metrics Um there's goodobservability Um and we do things thatare um actually kind of cool likeworkload identity because everythingacross plane runs as a pod We could dokind all kinds of OIDC trust to preventsecrets from being you know staticsecrets from being used So that is avery brief introduction to crossplaneI'll pass it backThanks Steve Um I'll talk to you aboutthe control plane architecture thatwe've deployed in Mcquaryy Um so as Imentioned the two main goals of whatwe're doing with the new control planewas to enable new services at speed sothat developers the app developers couldget a new service within a day or twoessentially and start testing with it umin our regulated environment And thenext thing was that the developerexperience was just really enjoyable fordevelopers They didn't get stuck doingall the the boring manual stuff Um andit was really simple for them to deployinto the cloud They'd just love it Umand how did we want to achieve these twothings so at a high level we werethinking of automating all of ourcontrols um and remove all the manualsteps wherever they were We wanted tomake it easier for engineers not justplatform engineers but just engineersacross Mcquaryy um app engineers even tocontribute towards new compositions umso that they could uh I guess enable thenew services faster um by doing uh byusing things like YAML etc that was away for us to um allow more engineers tocontribute because the the knowledgecurve wasn't as high to use crossplaneWe wanted to free up the time from ourplatform teams um towards uh enablementwork and by reducing the toil and justmaintaining ourpipelines And we also just wanted touplift the security by doing thecontinuous reconciliation um between thedesired and the target state um andshift left all of the security andcompliance feedback as much as possibleso that it was faster for developers touh deploy claims So if you look at thatarchitecture diagram just up there it'spretty high level That's the controlplane Um it's sitting in a a Kubernetescluster we're actually using EKS and AWSUm for the IAS we're using crossplane Sothe infrared code tool is crossplane ForCI we're using Argo workflows and CDwe're using Argo CD Um securityconstraints we're using Gatekeeper Andfor our observability stack so loggingand monitoring we're using Graphfana andLoki We've also got a custom providerthat we created for integration with ourexisting um AWS pipeline Um so that'sspecific to deploying onto AWS and wecreated a few custom Golang and PythonAPIs as well um because we need tointerface with external services inMcQuary like our developer portal changemanagement tools etcWhat were the key design decisions thatwe made um I mean there was quite a fewbut I guess the three main ones werewhat was the IA tool that we'd use Wealready had Terraform that was beingused in McQuery Um we had another KRMtool that we were using called KCC forGCP Um and then there was Crossplanewhich we knew about So the decision forcrossplane was basically based on wewanted to continue using a KRM tool itwas really easy to integra�te with um ourinfrastructure and we also benefitedfrom that in-built continuousreconciliation that you get withKubernetes Um and we also wanted thatcoverage across all of the clouds andsort of using different tools fordifferent clouds Um so that's wherecrossplane kind of ticked all of theboxes and the others didn't at thatpoint The next decision was the CI/CDtools we were using Um so Argo is partof the CNCF you guys would know that andit's a very familiar tool for use withcrossplane when you go online and lookat the crossplane doco the documentationsorry it's an Australian slangum uh you can find instructions for howto actually use Argo CD with crossplaneArgo workflows with crossplane so that'sreally handy um and there's lots offlexibility with Argo workflows tointegrate with different um aspectsdifferent tools and also with Argo botswhich I'll get into a little bit morelater um so that was uh not a very harddecision So the last decision is justabout the control plane and I guess isit a green field control plane does itsit on its own and um deploy resourcesthat are basically brand newapplications that don't integrate withanything else or is it a integrated umcontrol plane that allows the resourcesdeployed to talk to all the existingapplications resources So I've justtermed it green field versus brown fieldI'll get into a bit of um I guess thisdecision because this created probablythe biggest um amount of work um on topof what we already had to do to set uptheinfrastructure The benefit of a greenfield is it was faster It would havebeen faster to deliver a new controlplane We didn't have to worry aboutintegrating with the old legacy stackand the decisions we made could be justbased on the latest technology availableUm but in terms of the um the use casesand the impact it would have in mqueryit just meant that the um applicationteams that had brand new applicationsand didn't necessarily need to betalking with their existing ecosystemsonly they would really be finding usefor this control plane Um and we wantedto make sure that whatever we builtcould be leveraged by everyone who justneeded new services So that's wherebrownfield came inThis kind of shows what we'veimplemented on the left side Um so thisis specific to AWS again So we have ourexisting AWS pipeline which manages allthe foundational um AWS organization andthe network VPC subnets thebootstrapping of the accounts and the AMand it also does that enablement ofservices that we already had enabled umlike IAS primarily some databases etcAnd then the new control plane um sortof leverages all of the good work thatthe pipeline does with um establishingthat foundation Um but it also is usedto enable all the brand new servicesthat we get from AWS Um and it uses acontinuous deployment which our pipelinedoesn't because it's a pipeline It's nota control plane essentially Um and thenon the right side uh essentially this iswhat we've got now because it'sintegrated the two control planes um areaware of each other Um the resourcesthat are managed by the two differentcontrol planes are aware of each otherIf you have an application and you'vegot a server and a database for examplethat you manage with the existingpipeline um those uh services can talkto uh the new services that you deploylike a Fargate ECS Fargate for examplethat are managed by the new controlplane But it's still really just one umseamless environment that theapplication manages anduplifts How did we do this integrationand what was involved there werebasically three main um integrationservices that we created um using Go Thefirst one was for just identity So it'san identity API and we use that toretrieve the resource identifiers um ofresources managed across both theexisting pipeline and the new controlplane The second is access management Soall of the policy requirements that weneed to set for the the new resources ofthe new control plane um that would needto be um aware of the policy uhrequirements of the existing resourcesSo that API that access management APIhelps us to do that to define the policyAnd lastly th�e compliance API So thatactually um is feeding back thecompliance of the target state cloudenvironment making sure that it'scompliant and uh the control planes candeploy new changes into the resourcesinto them If the environment is notcompliant then there's no change that'sallowed unless you do an exemptionprocess So we we built those three uhGolang uh integration services as partof our controlplane The impact that this had as Imentioned before was primarily ondeveloper experience The um the impactof this control plane meant that all ofthe application developers out therecould use could leverage what we providewith this control plane and if theywanted a new service from the cloud tothen integrate with their existingapplications they could use thisservice Another big one for developer umexperience that we did was um a way wefound to do to give even more earlyfeedback to our uh our developers ontheir deployment pull requests So whenthey have a claim and they specify whatneeds to be deployed um this workflowallows us to give them feedback just onthe pull request So in their git repo umand they don't need to actually do anycalls to cloud APIs or anything likethat to um be confirmed that theirdeployment will besuccessful So I'll walk you through howwe did this So this uses one of the Argoworkflows We just call it a bot of botsworkflow We call it bot of bots becauseit's workflow chains together uh aseries of Argo bots which areessentially ephemeral pods that are spunup and um they just run a series ofchecks that we define in the bot andonce that's done the pod is umterminated and so it's not a longunningservice on the cluster It saves um savesresources running on the cluster thatway So this bottom bots workflow istriggered by a new pull request or justan update to a pull request So when youhave a claim which is the deploymentspecificationum there's a pull request that's open uhthere's a web hook that's triggered inthe back of the repo and that web hookcalls that Argo workflow So the workflowis um then will sync that pull requestdata and it'll run the first bot on thatdata So the first bot to run is thepolicy as code uh bot and it's basbasically checking all the securityrequirements that we have um againstthat claim It makes sure that all theconfiguration defined in the claimsatisfies the securityrequirements The second bot to run wecall it the compliance bot and that'sthe compliance checking of that targetenvironment and make sure that ourgovernance rules are being satisfied onthe environment that the claim is goingto be deployed in And the last one is ahydrator bot This is um particularly fordeveloper experience So it hydrates alot of information we don't need thedevelopers to put in like VPC ids forexample and other application specificum identifiers etc um you know DNSspecific um information uh route 53 idsetc That means the developers can justmake a really pretty simple claimspecification and all of those specificdetails can be hydratedin So the feedback from what these botsum run and output is directly on thepull request as I mentioned So if allthe bots are successful you can see iton the pull request comment And if itfails then on the comment we'll say thisbot has failed and this is the actualissue and you have an option to fix thatraise another pull request and then thebot will be triggered again um everytime that the pull request has changedSo once all the bots are successful umthen it'll give that input into the pullrequest the the reviewer can um approvethe pull request based on that output umand then hit merge and upon merge um thecommit is actually signed through um thegit commit signing feature and thatsigned commit is um passed to Argo CDArgo CD passes it to crossplane and thenthe deploymenthappens The signing commit is an extrasecurity gate that we have If the commitisn't signed it just means that thebottlebots has either not beensuccessful or that it has not even beenrun and so they're trying to bypassthosechecks This just shows a pull requestIt's a pretty simple pull request thatwe raised and the output of the bots areshown as successful on the comments It'sreally nice because if I need to reviewpull requests and I need to see what'sokay for me to approve um I can checkthe outputs of the bots see that they'reall good they run and hitapprove We've had pretty good feedbackfor um from the developers on thisparticular aspect In the past with thepipeline they had to actually wait forthe deployment to hit um the cloudplatform and then see if there's afailure etc or if there's a securityissue with their claim specification Andthis um bot because it just works on thecode in the pull request it's a lotfaster for them It just runs within afew secondsessentially You can learn a lot moreabout what I've just said Um we've got aseries a blog series that we've writtenup Um you can uh scan this QR code withyour phone or just hit McQuaryengineering blog and you'll read aboutyou can read about the control planeblogs and several other topics in thereSo yeah feel free or look on my LinkedInyou can find the links there as well SoI've talked a lot I think um and the keythings that I've talked about is that atMcquaryy we had moved on to the cloudmore than 70% onto the cloud We wereusing pipelines and we wanted to use thelatest tech out there to uplift ourexperience and also make the new serviceenablement a lot faster by allowing moreengineers to contribute um and also justmaking um everything more automated sothere was less manual work to do that weused a lot of the modern technologystack that was available such ascrossplane um we leveraged CNCF in thecommunity a lot uh and we also made surethat everything we used was supported bya community So it was easier apart fromthe custom um APIs that we did have tocreate because then it was easier for usto keep everything updated um such asvulnerabilities etc I'm sure you'veheard in other talks that it's a pain tokeep up with all the CVEs that areconstantly um released out there So uhit's a lot easiernow What's next for us uh we always wantto focus on how do we make things fasterSo new composition development um we'relooking at creating templates for newcomposition development So it's not sobespoke So engineers out there just needto look at a you know template This isthe basic things that you need to enablein a composition for it to work for ourspecifications Um and they just need toadd in the specific uh providerinformation for that compositionWe're going to investigate um vclustersas well to see if we can um use ourclusters in a more uh segmented way andmake testing um a little bit more fasteras well Potentially use blue greentesting with vclusters Um and there'salso crossplane v2 We'll see what what'suh released in crossplane v2 and if wecan leverage thatUm and yeah so we're always looking atyou know what's next and how to makethings better So I'm really appreciativeof everyone who's come here on a lateThursday right before we finish up Andthank you so much And before I hand overto Steve um yeah let me know if you haveany more questions Happy to talk I'llhand over to Steve now Yeah we have onemore slideUm so um what we're demoing here atKubeCon and if you come to our boothtomorrow you'll be able to see it Um thenext version of crossplane which isbasically based on our learnings overthe last few years and uh it's reallyabout use cases like this like enablingpeople to develop applications better uhenabling more isolation between tenencyUm and there's a few things there likecrossplane will be much easier to managethings like deployments and servicesyou'll be able to move everything to anamespace so like every team can haveall their applications and uh manageresources bundled together Um yeah andit's backwards compatible withcrossplane one So we're very excitedwith it It's preview We're showing it atthe crossplane booth and tomorrow two ofthe core maintainers of crossplane aregoing to be speaking at 11:00 andthey'll be doing a deep dive So ifyou're interested I would encourage youto go I think at that we are we're donewith that Time Yeah Thank you Yeah Thankyou very much Bye2025-04-15 21:59:44.843884 BB��!�-#��yATELnK0PrKHUhi everyone nice to meet you Thanks forcoming to our talk today Uh we'll betalking about the um the journey toevolving our cloud control plane inMcquaryy group I'm Prrenita Praine andI'm a director of engineering inMcquaryy I've got a bit of experience inplatform engineering in cloud platformsessentially That's my background for thepast 10 years I'll be presenting withSteven Hello Yes I'm Steven Purley aprincipal solutions architect at Upboundand a member of the crossplane communitySo just a little bit about Mcquaryy ifyou don't know already So at Mcquaryywe're a financial services group So weprovide asset management investmentbanking retail banking services We'reheadquartered in Sydney Australia but weare global So we've got an office inLondon um all all around the worldreally in Americas as well Uh I'm basedin Melbourne Um so it's a pretty funplace to work in There's about 20,000 orso um employees so fairly big for acompany Uh we really love to say thatyou know we're empowering our peopleboth our staff as well as our customersto innovate and invest for a betterfuture Um in saying that about 10 yearsago we started uh uh moving all our techstack onto the public cloud We startedwith AWS and now eventually we've gottenon to GCP as well as Microsoft Azure Sowe're firmly multicloud right now Iwould say we're about 70% or more atleast in the public cloud and we'restill looking at moving some of our oldstack into the public cloud because westarted quite a while ago Um we'remostly I is right now and mostly sittingin AWS but we are looking to uplift thatmove on to pads and SAS solutions um andalso become more more firmly multicloudas well with the properstrategy We found that over the 10 yearsbecause we've just been focused onmoving all of our stuff into the cloudetc Um our pipeline that we'd used toget onto the cloud was uh getting moreand more cumbersome to maintain Um wewere finding that there was a bit moretoil for the platform engineers infixing um what we needed to fix to getit working to support our productionapplications Um and we also had becausewe started with AWS moved on to GCPAzure we had different solutions fordeployment into the three clouds So thatwas a little bit more operations umheavy We wanted just one multi cloudsolution Um so then we started thinkingabout how do we change this what do wewant to do to uplift what we're doing inthe cloud to make it easier for ourdevelopers to really give a greatdeveloper experience both for ourplatform developers but also ourapplication developersum and also primarily make it reallyfast for us to consume the new cloudservices that we're getting um releasedum basically every week So in McCorywe're highly regulated and when we wantto consume a new cloud service fordevelopment and test we need to makesure that we've got some securityboundaries and guardrails around thatservice Um so that's what I mean byservice enablement just FYI So we wantedto really have super fast serviceenablement We started thinking aboutwhat are the principles that we want todeploy to get this amazing uplift intoour cloud journey Um and that's wherecontrol planes came in We like the ideaof that continuous reconciliation youget with a control plane Um and we alsowanted to make just one multicloud uhdeployment um solution for all threeclouds um make it a little bit moresimple for our developers to maintain Umwe wanted to automate as much aspossible So use the GitOps principles umwidely across uh all of our stack andpotentially just keep to cloud nativewhere we could so that it was easier tokeep our code secure Um we didn't haveto do so much custom um upgrades etc tokeep to the latest standardsUm so that's where crossplane came inand I'll hand over to Steve just to t��bernetes for a while uh I've been apart of the CNCF technical oversightcommittee and in the past I've also beenthe co-chair for CubeCoin um and yeahthat that's kind of how I started withKubernetes was through an internship theGoogle summer of code internship programstarted contributing to it using itjoined a lot of uh vendors that werebuilding products on top for Kuberneteson top of Kubernetes and that's how gotstarted perfect cool everyone I'm Casperum I currently work not as a Kubernetesengineer actually I changed role umafter we submitted this um so I'mactually doing dev world now so it's akind of a different game but I used tobe a staff platform engineer at acompany called Lunar and I got into thisactually when I was writing my masterthesis in 2015 we've heard about thismicroservices stuff and uh thought heythat could be fun to write somethingabout that and figure out how do weactually run all of these things andthen we we found this project online ofsomeone from Finland a 16-year-old guyapparently did something of migrating aKubernetes to to ARM and I said thatthat that could be a fun thing to to doa master thesis on running microserviceson a Raspberry Pi Kubernetes cluster soSo that's what we did back then andthat's how I sort of got into uh to thisgame master's thesis it sounds amazingso we've got like Shane coming from likeyou know joining his uh friend andcolleague at Shopify and then Nikitastarting from an internship you are froma masterademic sort of a backgroundgetting into Kubernetes what about youMicah yeah um I'm Micah Hler i'm aprincipal engineer at AWS uh I am alsonow a Kubernetes contributor uh I'm onthe Kubernetes security committee um aswell as a co-chair for SIG O uh I gotstarted with Kubernetes as a Kubernetesengineer way back in 2016 i ran my firstfirst Kubernetes cluster and this was beon on AWS at the time um and this wasbefore cops and before even reallycubeatom um so it was hand rolling itout and then this guy wrote this blogpost called uh or this this GitHub umabout Kubernetes and his name was Kelseyand I was like what what is this thinghe built and I was like oh yeah okaythat's what I just built that's great umbut I was working at a small startup iwas part of a twoerson DevOps org uh orteam I guess uh and uh with supportingdevelopers and we were using cloud andbuilding small services and I was likeokay I know container because this is2016 docker had just come out right um Iwas like we're going to use docker we'regoing to use containers uh we need toorchestrate this and at the time therewere a lot of alternatives there wasmessos and I tried messos and amazon'sproduct and um docker swarm was a thingif for those of you who remember andthen I tried Kubernetes and I was likeoh my goodness this is solving theproblems that I have and uh that that'sreally where I got my start and thenstarted contributing to it as uh uh notat AWS but as uh working at a startupperfect so you've been one of thewarriors from the containerorchestration wars that's right allright so now that you know what thisdiscussion is going to be like um we'llwe'll dive right into it so I am Rajas iwork at Broadcom doing all thingsKubernetes uh I'm also a contributor ofKubernetes and currently active in CNCFtechnical advisory group for runtimeworking group artificial intelligence soon and so forth all right cool so let'sdive into this right like help meunderstand uh what do you think is aKubernetes engineer because when whenthis term comes to my mind um I I wouldsay that it's something wherein ifyou're touching any aspect of Kubernetesbe it uh contributing to upstream or umusing using it for like downstreamthings or being an end user ofKubernetes or advocating for Kubernetesthat would I would still term that as aKubernetes engineer but how would youall define it like what's what comes toyour mind I think If you've spent enoughtime with Kubernetes to get frustratedby it you are a Kubernetes engineer uhbut just to show our fans who like whohere thinks that they are a Kubernetesengineerokay all right that's that's a goodnumber i I th�ink I almost think of itlike a car right like what are you a cardo are you involved in a a car right doyou do you build cars do you sell carsdo you work with cars do you drive carsdo you drive cars right yeah no I meanif you're if you're driving Kubernetesyou are a in my opinion you're aKubernetes engineer and driving being ifyou touch it i think that's a greatgreat uh great exampleyeah maybe another question for theaudience how how many of you are sort ofidentifying as a platform engineerinsteadoh that's uh more hands actually becauseI think that's that's also a kind of aninteresting discussion right is what isa Kubernetes engineer and what is thedifference from a platform engineer andis there a difference um I think it alsoI think this really depends on thecompany that you work for how big it ishow many employees you have i guess themore people you have the the morespecialized you become and then the morecloser you're probably becoming to onlytouching Kubernetes and then you'reprobably closer to being like a real Idon't know Kubernetes engineer insteadof a platform engineer with which ismore about like you know we were talkingabout Lego just before like puttingbricks together and sort of building aplatform and uh making it easy fordevelopers so I think for me that's sortof the differenceyeah i don't think a Kubernetes engineeractually exists and and I don't meanthat you're all figments of myimagination of course i think it's justsomething that you're doing at a momentin time but really what your focus is issolving some kind of a business problemwith technology and I think a lot ofthat gets lost and you know you tend tobe especially if you are a more juniorrole or if you're at a very largecompany with very prescriptive rolesthen your day-to-day might be editingYAML files and then uh trying to figureout why things aren't working when theydon't and silently celebrating i hopewhen when things do work uh and as youkind of move up or as you're in a moreflexible organization you might pivotfrom just solving a very specificproblem with a tool that you're givenKubernetes uh to kind of questioning howyou're going to solve a a largerbusiness problem with Kubernetes and soyou might be moving from uh just editingthose YAML files to considering wellevery time somebody forgets to indentthis YAML file my application is exposedto the internet so instead of just goingthrough and indenting their AML filesfor them now there's opportunity for meto solve a bigger business problem i cancreate automation i can create aplatform and that's going to indent theAML files for them and then as youcontinue to move on you think well whyare they editing these YAML files at allwhy why don't we automate that for themand then you've got a more matureplatform and at some point you become ina place where you can ask if you knowKubernetes is even really the right toolfor this job uh and if so what can yoube doing to make sure that it stays thatway how can you improve Kubernetes sothat it continues to mature it continuesto be the tool that people can use forthat and and I've been really excited tosee that Kubernetes for every problem itintroduces it seems to also introduce asolution but then it also introducesanother problem to that solution so uhyou can continue down that path forquite a while and the more experienceyou gain the more you're able to adaptthe tool that you have to the things youneed it for but I I really think thatthere's there's a key there to beingable to decide if it's the right tool atall absolutely i think we touched upon abase like a bunch of things but whatcomes through this is there's a spectrumright depends where you are at yourcareer which company you work at what'sthe scale of the company what do youwant to get out of Kubernetes whatproblem are you trying to solve withKubernetes and then you can decide whatkind of a Kubernetes engineer you areright and I think one of the things thatyou brought up was like the wholebusiness aspect of it do you thinkKubernetes engineers should also bethinking about the business side of theproduct �or um business goals of theiremployer yeah especially as you kind ofmature more and advance in your career Ithink you need to be more aware of thosecertainly there's a place for people whohave a real knack for the technical sideand very little interest in the businessside and and I hope that that can bekind of protected and we can keep thosepeople doing what they're good at toobut they don't exist in a vacuum and soI think if you're not solving realproblems with Kubernetes then your timeis limited you're not going to be ableto just keep doing that forever so Ithink as you move on it is prettyimportant to consider what it's actuallydoing absolutely yep umI I just wanted to add to that so Ithink all engineers should know aboutthis so you cannot build somethingwithout knowing how that's going to getused or if or how that's going to affectyour customers otherwise you just don'tknow what you're building is right orwrong so understanding that businessperspective is important it's alsoimportant when you're talking toexecutives because they want businesssolutions they don't want engineeringsolutions so having that kind of mindsetwhere you at least able to switchbetween both sides that's important yeahyeah uh maybe just also a small commenti think we as as tech people right arein many caseslooking initially at the tech uh ratherthan the problemum at least I I've been doing that inthe past and I also when I came backfrom a conference it's always hey weneed to try out this new project but I'mnot really sure what problem it actuallysolves or what value it will bring tothe company but I just saw it and itlooks freaking awesome right absolutelyyeah so so yeah I think we definitelyneed to think about the value and andstart with the problem instead of thetech and then figure out what willactually you know solve that problemyeah I think tech is a tool but at theend of the day it's about problemsolving that yeah exactly you need toyou need to provide some value to yourbusiness it's after all who is I guesspaying paying for for you being anemployee so yeah right micah do you wantto add anything to that yeah i think umas you know most of us even probably inthis room right we all work somewhere uhuh on your name tag it says where youwork and and I think that that justreflects the nature of what we do as anindustry right we are solving problemsfor an end use case um unless um this isnot a crypto conference where we're allmaybe not even all but some of usindependently wealthy and just doing itfor fun right we're we're doing this forfor a point so I think that that doesmatter Um and as as called out like tovarying degrees if you are moretechnical oriented and that's just whatyou do for fun and that's great um but Ithink all of us have some some part ofus that that uh has to be aware of whatis what matters to our businessyeah I think problem solving kind of uhis the theme or that that's somethingthat I would take away from this pointof discussion okay moving on um whatmisconceptions do you see associatedwith the term Kubernetes engineeri think there's a misconception that allKubernetes engineers love YAML i I don'tthink that's the case how how many hereactually love YAMLoh that's a few hands okay that's greati think that's a mis misconception thatthat that we all love YAML because atleast for us when when I was working atat the former company at Lunar we weretrying to really really hide all thecomplexities of kub all the Kubernetesaway from our engineers and and reallytry to to make it as as simple for forwhoever was using this also for otherteams that were like building platformson top of what we were providing so Ithink I think it's a misconception thatwe all love YAML yeah I think this wasalso called out in uh the keynotesyesterday that we need to be thinkingabout the user experience of Kubernetesin this next decade i think one othermisconception that uh is is seems commonin my experience is that uh once youhave the label Kubernetes applied to youwhich probably most of us here do uheveryone thinks that that is your hammerand everything lo�oks like uh a nail andyou're going to only use Kubernetes tosolve those problems right and I thinkuh being pragmatic about problem solvingKubernetes does solve some real problemsobviously as we said it creates more toobut it also isn't the solution to everyproblem um yes control you know the thecube API and the cube resource model andwriting controllers is and CRDs is verypowerful and you can do a lot with thatit doesn't solve every problem youwouldn't want to build your uh system ofrecord on Kubernetes for your businessapplication necessarily but it's a greatinfrastructure tool so I think uh that'sthat's probably another commonmisconception that I see is that peoplethink oh you're with Kubernetes youdrink the Kool-Aid and you think thatcan solve every problem absolutely yeahshameyeah so in 2020 I think the loudestpeople on the internet all had themisconception that you need Kubernetesfor everything and in 2025 I'm worriedthat the loudest people on the internetall have the misconception that youdon't need Kubernetes anywhere or thatit's always the wrong tool or that itadds complexity unnecessarily and sowhen you see somebody and they taketheir Ubuntu VM and it's running theirSQL database and that's all it does andthey move it to Kubernetes they have abad experience and then they think "Ohthis this Kubernetes thing is terriblewhy do I need a deployment why do I needa stateful set why do I need all thisbacking storage this is just ridiculousi I can't do anything more easily orfaster or better than I could before."But if you look at a problem ofdeploying a web application that'sstateless and you want it to be able toscale up very rapidly you want it to beable to scale down very rapidly and youwant to be able to change it a thousandtimes a day then Kubernetes isabsolutely perfect and so I wouldn'tblame a you know a Phillips screwdriverfor not being a slotted or a flatheadscrewdriver it's just one tool for aspecific purpose and if you use it forthat purpose I think Kubernetes does avery good job and for you know most ofmy career it's been improving andgetting better at that so I'm excited tosee that and I hope that people willkind of look at it for what it isinstead of holding it up to be a kind ofcure all for all of your web-basedproblems or on the other hand just beinga you know a kind of relic of a zerointerest rate environment that we werein before yeah I think there are nuancesthat we need to consider we can't be atthe extremes of the spectrum earliertoday I was talking to Dims and he wastelling me that we are in this decade ofadoption right and not just innovationso it's it's it's that aspect that we'regoing to focus on like what problem arewe solving does Kubernetes fit in overhere what part of Kubernetes fits in andthings like that awesome all right uhmoving on like can we get a quick showof hands if anyone of y'all have beenlike uh pinged or like called at like3:00 a.m in the morning because acluster went down or someone tried to doan upgrade and uh there was an APIdeprecation and things like that and nowit wouldn't upgrade the upgrade failedokay we have quite a few people goingthrough the Kubernetes debacles uh sohave you all been involved in any of theproduction failures were you responsiblefor one do you want to talk about any ofthose stories yeah sure uh yeah I'd beenat Shopify for maybe seven or eightmonths and I mentioned before a lot ofthe tooling wasn't as sophisticated itwasn't as mature as it is right now andone thing that we we had was uh theseproxies that help they had to keep aconnection open for some hours and itwas this really arduous process to docluster upgrades and being on theinfrastructure security team I was kindof championing these these patchrollouts because we want to get rid ofthe vulnerabilities that are in the oldversion and uh one thing that is alwaysfrustrating about Kubernetes is thatthere's very little backwardscompatibility i think they promise kindof one minor version will still becompatibleish and sometimes that's noteven true um so these cluster upgradeswere always a pretty diff�icult taskespecially with this kind of longtimething that I had to worry about fromsome business logic that's much muchbetter now anyway so the way I was goingabout this was I'd have to go throughevery single node in our entireenvironment and it was thousands and Iwould have to cordinon them to make surethat no new workloads got deployed thereand then I'd have to wait six hours andthen I'd have to go drain them and so Iwas using one script that would veryvery quickly do the coordining and so Ijust ran it all in parallel for eachcluster then I had a another kind oflike oneliner for bash that I wassupposed to be using for draining and itwas supposed to do those sequentiallyand I can you can probably see wherethis is going i I accidentally used theparallel one for the the really badthing and the the the sequential one forthe thing that wasn't so bad and so itjust all of a sudden a whole clusterjust disappeared and we had pretty goodresilience but losing an entire clusteris not something that most mostenvironments can handle we didn't havethe tooling we have now to just shifttraffic somewhere else so that wasextremely embarrassing for me luckily wedid a retro we looked at what went wrongand it's like well why why is this newguy running these commands from hislaptop to do everything anyway and so wewe kind of learned from that and overthe years we become much moresophisticated that couldn't happen todaythere's just no way that we would beable to do that bash is not always yourfriend right it's not yeah i think thethe important aspect that's I'm takingaway from this is we're all humans atthe end of the day right so it's it'simportant that you're calling this outthat hey it was a mistake I committedright so it's fine like you know it'sjust a cluster we can fix that uh soit's it's especially important for thepeople getting into the community orattending CubeCon for the first time itit can be intimidating yesterday it wascalled out in the keynotes that someonedoing k get paw can be like intimidatingfor some people so it's it it's vitalthat we focus on the human failures aswell and uh the human error aspect ofthis so thanks for calling that outyeah I I I have have a not not acomplete similar story but but somethingalso done in production uh also this wasalso a little bit early on before wealso had GitHubs um before the processeseverything was yeah we also had a lot ofbatch scripting but but that day I wasdoing things uh with the good old cubecontrol i needed to uh upgrade the uhengineext in English controller and thensuddenly all of the pots justdisappeared and no traffic was coming into uh to our environment which was againa similar situation very embarrassingtrying to scramble trying to fixeverything figure out what is going onuh something with an apply a pend uhfigure out at the end I could justreplace the deployment and theneverything came back up again um so yeahI I've also tried that uh on multipleoccasions um but but yeah after this uhGitHubs came and it sort of made this alittle bit easier for us and and alsoremoved us from actually doing cubecontrol applies in the cluster and stufflike that but yeah that's a great callfor GitHubs as well because uh it's it'sthe um maturity that we've shown as acommunity right over a period of 10years how much we've evolved to automatethings to minimize these human errorsyeah um one of one of the things that wealso was struggling a little bit with inyour in early days was also reallyaccording draining the the nodes uhdoing it in a like with with good timein between just to make sure thateverything was sort of uh coming upagain and going down and so we also liketried to do some some manual scriptingand like writing a a go integrationbecause we were missing actually thingsin cups back then uh like to do rollingupgrades with actual time in between thenodes and stuff like that so yeah wewe've also done a lot of hacky things touh to sort of get around the quirkinessof uh of Kubernetes totallyoh um I just want to offer anotherperspective from a maintainer ofKubernetes or contributor to Kuber�netesperspective so one of my jobs I've had asituation where we had a job that was uhmanaging an authentication service forsome of our customers clusters and so onuh and the pods for that job would justnot show up like no matter what uhconfiguration changes we made they wouldstart but then they would just likecrash immediately so like we we weretrying so hard but no matter what logswe checked what configuration changes wedid it would just not work um so afterdigging in a little bit more we thoughtlike why not just check the Kubernetescodebase check the job controller try tounderstand what's it actually doingbehind the scenes uh and that's when wefound out that there was a recent rewell a change in a feature gate thatwent in uh somewhat recently that madeuh that so like instead of exponentialback off it was a linear backoff thatwas happening so it meant that it waslike this uh the service was just notresilient enough whenever uh theinfrastructure uh or the there was theresome infrastructure issues um so that'swhen we had to jump in and actually fixthe bug in the controller itself uh andthat was important to us at that time umwe could create a issue on theKubernetes repo wait for the maintainersto come and fix it but it was really uhit was not trivial right so this was achange in the core controller itself andwe could wait for that to happen but Ireally didn't have the time for that tothat fix to land um so in cases likethis it's kind of important to get yourhands dirty uh get involved a little bitmore deeply in Kubernetes uh and show upthere and maybe Mika you have probablysaw like that from also from themaintainer sort of perspective I thinkit was one of the earlier contributionsin Kubernetes for me where uh Kuberneteshas security issues surprise um the uhthere was a a issue where you could portforward uh to a to a node but if youoverrode the node's IP address with theuh cloud credential IP address you couldport forward to the cloud credentials ofthe API server it was really fun um andI so I wrote a fix for that and in thecode I was I was earlier in my career iwas write I wrote the fix and I was likehey I think there might be even in thisfix there might be an issue and someonewho was reviewing it was like no that'llnever happen and like who would everhave a crazy DNS server that replieswith different results for one requestversus the other um and then it turnsout uh a year or two later I'm on thesecurity committee in Kubernetes and weget a security report and there's a timeof check time of use bug at a oh I wrotethat code great not only in fixing a CVEI also created one um and that was lessof a page in the middle of the night butstill a uh oh no we have to fix thatquickly um and and get that out to thecommunity as well as a fixthese are great points right like uh thewhole community perspective as well soone of the things that I did as part ofthis panel discussion is ask fellowKubernetes engineers from the communitymaintainers end users on what questionswould they have to uh these panelistsand one thing that came up was um how doyou keep track of changes that go inupstream and I think the thethe topics that we talked about just nowhaving a presence in the Kubernetesupstream community digging through thecode base or having influence as well inthe community having a meaningful seatin the community matters but at the sametime how do you keep a track of changesthat go in KKit's hard uh I I think the e the mosthardest way but the best way is to getinvolved uh in some form or capacity inthe project uh and the easiest way isprobably to do what you all are alreadydoing now is attend cubecons becausethis is where you get the big picture uhotherwise like join the mailing list iwouldn't say I mean slack channel isshort but there's a lot of noise thatgoes through so I think mailing listsand uh such avenues would be the bestyeah I think the probably some of thebest methods to learn about what's goingon is if it depends on how granular doyou want to get do you want to know theday-to-day of what's being developed andwhat might be developed that's whereyou're going to get involved in aspecial interest group or a SIG um youcan read the SIG notes you can to justget a uh get up to speed on what'salready been discussed they're all ononline on uh Google Docs um if you wanta little bit uh higher level the releasenotes even for in in progress branchesis a great way and then finally if youkind of just want a very high levelthere's a release blog on the Kubernetesblog for every release and it's great ithas a great summary it's not so superdetailed but it has a great summary iread it personally every release justbecause I'm involved in very specificparts of the project project but I'm notas deeply involved in SIG node orsomewhere else and it's fun for me tolearn and that's that's the place whereI go to learnyeah I I just want to plus one on on allof this um what I what we did initiallyon was really we we created a meetupgroup also just to discuss things sohave have peers to actually discuss whatwhat is going on what is the next thingwhat are you doing why are you doingthis and get feedback on what how we areseeing the world i think that was reallyimportant in our early Kubernetesjourney to be involved in community inthat way for just having some localpeers to actually discuss the thingswith uh as well was important for us butbut definitely second the uh the otherpoints as well on reading the releasenotes is a is a really good oneabsolutely uh but from an end userperspective I'm just curious uh howoften do you all upgrade your Kubernetesclusters oh we've got a great story forthat now so after I broke everythingyears ago um we tried our best to stopbreaking things so often so now it's allGitOps for those kinds of changes andthen kind of on top of the regularGitOps there's another layer uh and I'veI've actually seen this with cloudproviders there's kind of like fleets ofclusters now as well as the individualclusters and so you can have redundancyfor redundancy and we make a lot of useof automation and so the original changeuh is a thing that you can click or uhopen in a PR and then that triggers someautomation that's going to spin up newthings to replace the old things andthere's a whole elaborate system of soaktimes and automated checks to make sureokay uh Shane missed the you know thethe breaking change in the patch noteshe didn't follow the advice of thispanel and so he uh decided to go aheadwith this upgrade without changing thething that breaks he also didn't talk tothe other people who are using it and sonow there's a problem well as soon as itgoes out it's going to hit a stagingcluster hopefully break there and if notit's going to a Canary cluster and sowe'll have the absolute minimumdisruption possible and so we don't seea lot of issues with our clusterupgrades now but uh it does require afair bit of additional tooling uh aplatform if anyone's familiar with thoseuh built kind of on top of Kubernetes soI do hope some of these things thatwe've worked on will make their way backinto upstream and we'll see some ofthose trends uh become universal andeveryone will get the benefit from theminstead of individual companies doing itthemselves sure definitely testing andum havingum in a GitHub story definitely makessense all right I have a bunch of morequestions but surely we're running shortof time maybe we should do a version ofthis of a day in the life of platformengineer in Atlanta later this year uhbut uh I think uh we have to get to theend of this discussion we're runningshort on time but if the audience has uhquestions we'll be happy to take thoseuh there's a microphone over there andwe should be able to take a question ortwono questions okay looks like I exhaustedall the set of questions then okay allright okay maybe we'll do a version ofthis in Atlanta we don't know but thishas been a pleasure talking to you all ithink we had a bunch of insights uh onuh attain the life of Kubernetesengineer what happens behind the scenesthe different aspects of the spectrumand the nuances involved so thank you somuch uh thank you for attending2025-04-15 21:59:45.570981 ""��9�.#��)AQhTlZs4m59wgood afternoon everyone um welcome tothis panel discussionuh on a day in the life of a Kubernetesengineer uh to set some context uh weusually talk about our successes withKubernetes how we're running multipleworkloads off off late AI workloads andwhatnot but seldom do we focus on thepeople who keep these clusters up andrunning and the failures that they runinto and all the other issues that theyrun into so to shed some light on thatto talk about the real issues of aKubernetes engineer what is a Kubernetesengineer um what what skills do you needto be a Kubernetes engineer uh wethought of having this open discussionand I have an esteemed set of panelistsover here with me uh so why don't youall start introducing yourselves and atthe same time tell us moreabout how you got to be like aKubernetes engineer like why did youdecide to be a Kubernetes engineer youwant to go ahead Shane okay cool so I'mShane and I started my role as aKubernetes engineer after working for amanaged security services department inthis big consulting company and Iactually really liked my team quite abit there but uh most of our clientswere really riskaverse organizations andthey definitely didn't want Kubernetesthey uh they were hesitant to try newthings they wanted established bestpractices very you know experiencejustifiable those kinds of things and Ihad a friend who I'd previously workedworked with and he'd left the militaryabout the same time as me and gone towork at this company called Shopify andhe was always sending these messagesabout these cool things he was workingon this new fangled Kubernetes thing andall this cool stuff they were trying andthey were going to be moving to thecloud and they were collaborating withthis team from Google with uh GregCastle and Maya Kazowrosseski and thesethese great people and so I I was seeingall of these things that he was doingand I really wanted to be a part of itso I ended up joining Shopify in 2017and that was when Kubernetes was prettynew and a lot of the things that we havenow didn't exist and it was reallyreally exciting to get to kind of buildthose with them perfect i think just theother day some of us were talking about2017uh there was no concept of a controlleror a controller was just being formedthere was it was very difficult to get acontroller up and running and now herewe are like building on top of so manycontrollers and so on and so forth it'slovely uh Na you want to I don't thinkthere were actually controllers at thetime i think it was they weren't thereweren't even CRDs i think it was likethird party resources uh but anyway II'm Nikita i am a principal engineer atBroadcom uh I got started by building alot of features for CRDs in Kubernetesitself so I've been a maintainer forKu��the noisy neighborproblem those are the first principlesthat we're going to continue to go backto in today's talk but first a quickquestion for you and I'll throw someswag if you take those phones out orkeep your phones out and hit that QRcode let me know who you are um uh uhfor a couple of the questions today youwill have the ability to select multipleanswers uh for other questions you'll uhjust be able to select one um so I'lljust give you guys a quick sec just so Iunderstand who's here in the audienceand hopefully I don't surprise somebodytoo much with those t-shirts uh if youget more than one uh please go ahead andum and get them out that was even thewrong question u but this was thequestion for later where do you run yourcompute uh and it seems like onrem andAWS are the big winners here we'll comeback to that information later um butlet's start with the venerable containerat this point I'm sure everybodyunderstands that containers are layersof information uh but uh we oftent timesum still suffer from securityvulnerabilities and let's get startedwith our first hacking demo uh what I'vedone is I've created a simple web shellhere and embedded it in a variety ofcontainers to make a point and whatwe're going to do is we're going to lookat code that we've all seen before it'scode like log 4j code that um uh isabused and not used the way that it wasintended it um takes advantage of acommon weakness that we find incontainers which is uh a violation ofleast privilege bringing too many thingsinto a container here so some of us haveeven written code this bad so what I'vedone is I've created a couple differentcontainers for this fake company calledpersonfax personfax is the best personfacts they're instant they're fresh uhand I've embedded it in a variety ofcontainers here uh one with Fedora onewith iuntu uh one um uh with Alpine andone uh from scratch to show that youknow we could have more minimal andminimal containers but I want to justmake sure that we all align on um how umhow bad this can be and so with thiswebshell I can actually just um uh typeuh you know forward slashcomand here atthe exact um handler and I can then passcontainer or uh pass commands so if thiswere in your cloud and I were a threadactor I might do something like um let'sgo maybe download uh the AWS CLE uh tothis computer or CLI i'm not sure whoI've offended with that pronunciation ornot and then I might try to do somethinglike let's make sure that we got itdownloadedokay uh this is the silliest web shelluh but it does indeed work uh it's notthat much unlike like the early versionsof China Chopper if you've seen it andthen you can see I've got this CLIdownloaded and I'm not going to gothrough the entire exploitation processhere of of what would happen what did Ido wrong here uh but um you could unzipthis instruct the command and then startprobing your infrastructure or look forum lateral pivots around and let's tryto get this unzipped because I thinkthat's like the the neat thinghere up the CLIokay all right there we go all right sojust to demonstrate we could take thisexample a little bit further and startpivoting laterally through ourinfrastructure if you'd likeinstructions for doing that in yourcloud and abusing uh STS and things likethat I'm happy to get with you umafterwards but really what I'm trying tojust show is this idea uh thatcontainers oftent times um let us downin their fundamental properties in oneof their designs my only goal here is todeploy a function to the web and yet I'mthinking about operating systems i mayonly need to use file in and file outfor my function but I'm ending up umgiving my thread actor access to thingslike um uh uh like uh TCP and UDPsockets and things like that it's one ofthe things that we find if you startmonitoring these things it's a verycommon thing that's abused in containersand this is one of the fundamental flawsthat we're going to we're going to leadto when we talk about why we need abetter abstraction for platformengineering and tech well what aboutcontainerized platforms you know there'sKubernetes but �there's there's honestlysome other technologies that are gettingreally good things like AWS Lambda hasbeen great for a long time or can nativeyou know when we think about these umthings from a technology point of viewwe've all lived through these epics oftech you know going back 20 years we allstarted with the same idea of I have asingle application mapped to a fullcomputer this is crazy i have all thisunused capacity all of this overhead andthe the question we asked ourselves ateach one of these fundamental epics washow do I platformize this how do I sharethis layer how do I get multi-tenencyand higher density to save money to savetime to save cost and whether we'retalking about CPUs with VMs and VMwarewith data centers with uh AWS and havinga shared um data center or withoperating systems and orchestration withcontainers and Kubernetes we're stillasking ourselves that same question howdo we get multi-tenant how do we savemoney how do we platformize um so let'sdo a quick example of one of thefundamental weaknesses that stillremains in this layer um and we're goingto take a look at Kative if you're notfamiliar with Canative it is fabuloussoftware um it originally was out of umuh Google uh and um it it's designed tobe sort of a lightweight functions as aservice um it's usually deployed on topof uh Kubernetes so what I've done hereis I've just pulled out um a canativeand I'm running uh a couple functionshere including a node function that hasscaled to zero so it's no longer um inthe pods uh memory here this is a commonproblem that we want to demonstrate thefirst was security the second here isthe cold start problem of what happenswhen we have to uh start a containerfrom scratch that hasn't been loadedtechnologies like Canative and AdisLambda still use containers underneaththe hood and even highly optimizedthey're still essentially slower than anetwork request so this very first hityou can see it took 1.2 seconds and if Irun additional hits they go much muchfaster you can see this one is afraction of a millisecond if I hit itagain here locally it's about 1millisecond um and so these can theseplatforms are very good but they sufferfrom this problem called the cold startproblem now why do you really care aboutthat because there's a hundred tools outthere on the floor that will help you tomanage this problem um they will helpyou to cost optimize they'll help you toautoscale but you still can't get pastthe fundamental inherent properties thatcontainers take longer than a networkrequest to start and that means that ifyou want your container to be availableit needs to be running whether it'sbeing used or not and what this leads tois the high cost of cloud computing whateverybody complains about because ofyour SLAs's your SLOs's if you're notfamiliar with those words that's the Ineed to run multiple copies so that I'mreliable I'm redundant I'm available inow have to run more copies or perhapsfor data locality for GDPR you're nowspinning up more and more regions soyour cost per application is gettingworse with Canative and and CanabisLambda are not bad technologies they aregreat technologies but they're held backby the uh the abstractions that we'veembraced 10 years uh 10 years ago now atthispoint so this high cost of of today'scontainers um uh is uh this second uhbig problem and it uh also impacts thedensity that you can run at uh here onmy laptop if I were to continue to scaleI found that I could lock up my localMacBook now that's not reallyrepresentative of a production use casewhere you're optimizing it on Linux butit is um representative of the problemthat if you're running hundreds ofcontainers on a Kubernetes node you'retypically doing pretty good it meansyou've got your images you know rightsized down to pretty small you're notusing a bunch of 2 gig Java images orsomething like that but this density isnow the third problem that we have withcontainers they are inherently insecurethey're open by default the cold startproblem and this density problem that wehave here and this cold start problem uhuh is something that uh pe�ople like AWShave been optimizing uh for years buteven they have trouble getting it past anetwork request which is about 30seconds now you've got a little bit of apreview here of coming attractions whenwe get to Web Assembly one of WebAssembly's unique uh properties uh thatwe have um enabled through going back tofirst principles is this incrediblysmall um uh cold start time it's sosmall in fact that it's less than anetwork request even if we were in thesame data rack uh so it's an incrediblypowerful um uh technology when we get toit here but let's talk about the finaluh and the biggest problem with the waythat we build today's platforms and Ilike to refer to it as the original sinof platform engineering now uh in theaudience we've got an audience full ofplatform engineers here and you are allgood people and you're very kind and youall want to help your developers to gofaster and to check all of thoseenterprise boxes so that they can umjust do what they need to do buildfunctions and deploy them uh but the uhthe golden template ends up becomingthis original sin of platformengineering uh what happens is is we'llpull together something typically JavaSpring Boot and we'll include all of ourenterprise goo it'll have messagingit'll have one Kafka one Kubernetes onelake uh one hotel all of our corporateservices that we have in here but assoon as a developer copies um thattemplate they are on their own and weend up in the world's worst positionwhere we have 5,000 teams that eachpatch one vulnerability the samevulnerability one time and what wereally want is a world where one teamcan patch 5,000 vulnerabilities at onceand the problem's even worse because ourcode is not compatible acrossapplication frameworks it's notcompatible across languages and whileyes containers and Kubernetes aretheoretically portable what happens isis that they get anchored into a placeand a time into a control plane ifyou're implementing and building on youknow VMware Broadcom today you arelikely stuck there even if you're incontainers because you're using all ofthese local platforms platform engineersdon't build a single platform weintegrate and orchestrate platforms ofplatforms and this opportunity uh meansthat we end up stuck in one place withthis original sin of platformengineering so while people complainabout the high cost of of today's cloudsand uh cold starts and idleinfrastructure and security what they'rereally complaining about are these firstprincipal problems with the design ofthe platforms themselves and theinherent cases that by defaultcontainers are open and it does not meanthat you cannot lock down yourcontainers or minimize them throughwonderful projects or doing things likefrom scratch containers containerssuffer from high cold start times whichleads to the high cost of idleinfrastructure they're often bloatedwhich leads to lower density per uh persystem they're anchored in place and thekiller for me what I found most painfulin my previous role uh owning platformengineering is that I'll have 5,000teams fixing the same vulnerability onetime so how do we solve this problem nowuh may I ask who's heard of WebAssembly okay I can't see any of you soI'll just guess some of you maybe nonemaybe all of you i can't really see withthe lights to be honest uh Web Assemblyis kind of a hard technology to wrapyour brain around and I hope uh in thenext 10 minutes uh that I can help youto really understand it very deeply andwhy you should think about it as anotheralternative with containers with virtualmachines and with clouds um for your uhon premise infrastructure so webassembly is a bite code target it maysound a little bit like Java it's notall that different but one of itsfundamental reasons that it's differentis that it's open it's a W3C standard onthe same level as HTML CSS andJavaScript uh it has been adopted at apace like which we've seen nothing elsebe adopted it already has billions ofusers you're actually probably using itright now when you should be listeningto me if you got Google Docs open you'reusing Web Assembly if you're on a Zoomcall� and you've got your backgroundfilter you're using Web Assembly in theback watching Amazon Prime Video you'reusing Web Assembly they don't target8,000 unique devices or build 8,000binaries for Amazon Prime they targetWeb Assembly think of it like the thetiniest virtual machine but it's avirtual machine and an ecosystem that isdesigned with uh first principles nowlike a lot of good things it started inthe browser so it already works in V8 uhwhich is Chrome it already works inSafari uh it already works in uh SpiderMonkey which is Firefox and you can evenrun it on your servers with projectslike CNCF Wasml Cloud which isincubating we've got around 350 uhenterprise contributors there and thereare even a couple of other orchestratorsuh uh and runtimes that are also in theCNCF so you've got a couple optionshere now it's neat because there's astandard that's called a web assemblycomponent and a web assembly componentis a like a really tiny virtual machinewith um a neat uh property we declarethe inputs and the outputs when we buildthem and what this means is that we knowhow to lock it down we know that if itonly needs file IO or standard in andstandard out we don't even give itaccess to streams we don't even give itaccess to fork and other uh things likethe posics type system calls now uh it'sreally easy to compose um two webassembly modules together and you'relike Liam that's dumb i can compose Rustand Go today but how do you do it youbuild a container you set up a webserver you set up gRPC and if it happensin milliseconds you're like I'm awesomeweb assembly is neat because it'slanguage interoperable so when we have astring in Rust or a string in Go they'relifted and lowered so that they're thesame and it's smart so if you've got twothings in Rust it doesn't even shortcutthem it just connects them uh directlythere and when we compose Web Assemblyuh we have this share nothing linking wehave separate stackbased virtualmachines that are interfaced driven justlike Lego blocks if it fits you know uhuh I get uh and it all happens within asingle process so these uh web assemblycrossprocess accesses happen innanoseconds they're lightning fastespecially when we compare them totraditional sort of microservices thatwould happen over a network boundaryeven even on a local host or somethinglikethat so I think you can probably alreadysee where we're going with this webassembly is a little bit like Legoblocks what if we have a platform Legoblock like a platform harness that theplatform engineers can maintain and whatif we give our users a really tinyharness that they can maintain andthat's exactly uh what we're doing inCNCF WASM cloud is that sort of approachwhere we can minimize um embrace theprinciple of least effort of least codeof least surface area and give ourdevelopers these tiny functions now wehave the advantage of every one of theselittle Lego blocks as you'll see shortlyis versioned uh so standard in has aversion standard out has a version youcan create your own contracts for thingslike business contracts or GPIO orwhatever your business case is and thoseall get versioned so you can start tomaintain them independently and we'reone step closer to this world where oneteam can maintain 5,000 apps at oncewe'll do a demo in a momentso this is our vision for what platformengineering looks like with web assemblyand CNCF WASM cloud uh you have languageinteroperable platform harnesses thatare maintained centrally empoweringdevelopers to choose their businesslogic approach in the language of theirchoice what's neat about web assembly isbecause these are these highlystructured interfaces is that we candesign them to be pluggable so in theWOM cloud community we've createdpluggable interfaces for common platformengineering components things like keyvalue uh things like authentication andsecrets so with secrets for example youcan choose to use Hashi Corp vault maybeyou're on your old Broadcom interfaceyou migrate to AWS and you want to useAWS secret manager hot swap at theplatform layer the developers don't evenknow maybe you want to use Kube�rnetessecrets you can swap that in and soforth uh and you're not stopped fromdoing tight couplings you know um peopletalk about all the amazing power of ValKey um uh and the new performance uhgains that they're getting out of it ifyou've looked at the options for Valkythere's many maybe a maybe a generic keyvalue interface doesn't let you accessthose options you're not prevented atall from tightly coupling to specificinterfaces and we've had vendors uh likeCloudflare who partner with Fortune50banks uh implement uh their owninterfaces in the Woml cloud ecosystemso that platform engineers can stillchoose to tightly couple with aninterface uh if you'd liketo so um so what does this world looklike now I can suddenly take mycomponents I can run them acrossdifferent language run times and becausethey're pluggable I can easily run themon any cloud on any edge or anyKubernetes including your own that's apowerful statement when we're looking atit so let's look at a demo now we've umall these slides will be open by the wayi see a bunch of people taking picturesi'll put them up on the slides share iwant you to have them i want you to takethis message and be able to communicateit to your teams i'm an open source guyby heart i do work at a company thathelps companies do this but I'm moreconcerned about you know building thecommunity and the effort in theecosystem in CNCF cloud so this is a a asample application we're going to lookat and we're going to uh dive throughthose problems we saw with containersand look at them through the lens of webassembly and we'll see how uh by goingback to first principles web assemblychanges this now this app is a littlecomplicated because we're using it inmultiple demos this week but the parts Iwant you to look at are this littleplatform harness piece and this usercomponent now um uh the theory behindthis app is that it's some sort of uhrules engine uh uh for interbanktransfers uh and uh any bank can posttheir own rules into the platform solocally here in my laptop I can actuallyrun 10,000 of these tiny littlecomponents and I use less than a gig ofmemory uh you'll see in a moment uh whenwe um take a look at them so uh let'sdive into uh the app here i think thedemo's here all right great so uh thiscode is up on uh GitHub already uh butI'm going to look at um one of thedeveloper harnesses first and it's ait's a very simple uh file when you'relooking at it the developer is justwriting Go code and it looks likethere's not a lot of Go code herebecause there's not because we'reactually importing these platformharnesses where we have authenticationand tracing and messaging and uh HTTPand all of those things that may havevulnerabilities that we want to maintainfor all of the developers here and whatI find to be uh really neat about thisapplication uh is uh the fundamentalsize here so I've pulled up and compiledone of these things here we could do iton disk either way uh where am I wom paymy camerabuild all right is that okay in thesize-wise in the back of the room thereum now there's a couple different WASMcomponents here that are compiled andone of these is 7 kilobyt i said 7kilobyt that is the user component andwe when we compose it together with theharness it's still only 265kilob really really tiny now why is thisso much smaller than if I put this uhcompiled this as a Go app like I did inthe web shell even if I put it in a fromscratch container it's still a few megsor if I put it into an Alpine containerit's even bigger or in a from iuntu oryou know whatever image you're usingit's because the only thing I have hereare the functions necessary for thatparticular app there's no additionalinterfaces there's uh uh nothing beyondthe lease that we need to get here andwhen we think about why this wasdesigned this way Web Assembly wasoriginally built and designed for thebrowsers the backstory is actually kindof amazing this um a great developerover at Google named Alone Sakai had oneof those ideas that is so dumb it'sbrilliant he said "What would happen ifI cross-compiled CC code transpiled itto JavaScript?" A�nd everyone in thisroom remembers this day it was in 2012you were on Reddit maybe you're still anold slashdaughter not sure about yourhistory and right at the top of the pagewas run Doom in your browser and youwere like "Man they're running Doom oneverything these days." And you clickedon it but this felt different because itresized for the window and it wasperformant you were like "Whoa what theheck is this?" And then you look behindthe scenes because you're curious youknow you're you're into tech and youjust love to understand how things workand you're like"AmJS transpiling seated JavaScript thisis an abomination." And it was anabomination but it was the start of abrilliant idea the browser enginemanufacturers at Firefox and at Chromestarted to optimize those hot paths andin 2016 a standard was formed called webassembly and since then a group of us umuh out of the bite code alliance andmany other organizations this is open sothere are uh literally tens of thousandsof contributors that are helping tobuild this uh have been uh creating theecosystem and the community to enablethese results this is supported by yourcompilers you know all of your statictools that compile to LLVM Rust C C++compile right down uh Go and Tiny Gohave support uh we can uh compile Pythonon a limited basis and last week Oracleeven announced that they're working on aJava port for web assembly which istruly the white whale um that we'll getto so these components are incrediblysmall and uh when we spin them up uh uhthey um uh you know they give us theability to um to build platforms atscales that were previously notfathomable so here I'm looking at um uhuh web assembly clusters um on CNCF WASMcloud that are running across AWS Okamiuh and my on-prem and we're using thesort of pluggability of web assemblyunderneath the hood here uh in order toenable this really high uh multi-tenencyso I've got two WASM cloud hosts whichare just executables you can put them ina in a container we ship them with CRDsuh for all the different layers and anoperator you know following bestpractices so you can use uh uh coupubecontrol um right out of disk uh in orderto get these things running uh uhthey're compatible with Argo GitOps umall of your other uh uh types ofsolutions uh and you can easily deploythem um uh out here so I've got uh anumber of components here that arerunning and I can set limits on them andresources just like I would um uh myother technology here so here I'm I'mlimited some of these interfaces to uhto be you know 100 and if I actuallylook um at the uh at the at the sort offake company here I think I can do I cansend $100 in uh great British in USdollars uh but there's a rule that'swritten so that if I try to send it ingreat British pounds it should fail outuh because there's a rule there um sotransaction failed validationunsupported currency uh so I have a alonger demo there with Argo if you'dlike to see it come to the booth i amshort on time so I'm going to make surethat I get through the rest of mycontenthere so uh maybe next question i'llthrow a couple more swag what are themost popular languages that you're usingat workgood research question here this is anice little purple platform engineeringshirt um see I told you if you sat upfront I was not going to get these veryfar maybe try to get one a littlefurther thistime there we go maybe another one righthere all right uh and I hope hopefullythat's yoursize uh now you can see uh a bunch ofplatform engineers go is there but Iwould have guessed Java was going to beour big winner here and I'm notsurprised my conservative estimation isis that Java makes up around 60 to 65%of the enterprise microser market withwhat what we're seeing on um onKubernetes uh with uh Python or Go beingum a solid number twohere so what did we see with that umwith that tenant um evaluator we sawreally tiny pieces of code uh that thedevelopers wrote this is what itcompiles down to from an interfaceperspective now what I'd like you tonote from the interfaces here is thatyou can see that this interface doesn'thave all of PZIX you know the 500 systemcalls that are available to it it hasstandard in it has standard out it hasstreams it has some file pre-opens andthings like that but the platformharness itself when we bring that inbrings in a couple of really criticalcapabilities messaging as well as uhHTTP and I'll call out a hidden dividendof of this approach is is that let'simagine that you wanted to createadditional triggers for this for thesefunctions you can simply create triggersfor SQS or whatever triggers you wantand reuse that same uh that same harnessbecause the harness plugs into aninterface that I'm calling validationand on the other side of validation iswhere I'm sourcing my request if that'snot clear to you I can whiteboard somethings with you um uh uhafterwards so this is a very powerfulapproach so if we com so if we comparethis to um uh why not containers and whyweb assembly we now have a platform thatis capability based I start with nothingin my sandbox and I grant it onecapability at a time I can scale to zerowith zero cold starts um they'reincredibly small so I can get tremendousdensity in my platforms they're portablenot just in theory but but across bothclouds as well as different capabilitiesbecause they're pluggable and instead ofthis world where 5,000 developers fixthe same vulnerability one at a time Ican move to a world where one team canfix 5,000 applications atonce so uh let's maybe do one more demoand then I might have uh time for onequestion uh what about more complicatedthings you know I've got theseinterfacesuh in this particular demo which I'm notgoing to execute live uh but I canafterwards if somebody wants to see iti've got an unchecked uh file inputcould we leverage the power of these ofthis uh component model in order tovirtualize that interface yes we can sohere's another view on what we're doinghere underneath the hood we have ourplatform harness that imports you knowHTTP and Postgress and all of thesedatabases running on CNCF was cloud adeveloper is able to import it uh and umuh but in this case the developer doesneed access to the file system they'recreating we're uploading files and we'rechanging them well in this case we canuse a technique called WY vert whichwould enable us to actually take one ofthose interfaces and virtualize it intothe file system so uh uh developers arecompletely unaware that they're evenrunning with a virtual file system maybewhen they're developing locally weattach them to local disk and uh andwhen we run them on our platform wevirtualize it it's the same sort ofcomponent composition uh we take ourcode um uh we're going to take that filesystem pre-opens and we can use thatcommand at the top in order tovirtualize it out and then when I runthis at disk um this is open if you wantto play with this little exploit lateror you can have chatgpt create it justlike I did yourself uh um you canvirtualize ityourself in closing this is a reallypowerful approach and this is nottheoretical uh uh at the last kubecon wehad this amazing talk from AmericanExpress who came out and talked aboutwhy They selected CNCF Wazcloud fortheir platform engineering what weretheir requirements and how they wereusing the platform harness approach justa couple weeks ago we were surprised byfour blog posts by the Norwegian FoodSafety Agency which talked aboutleveraging CNCF WASM cloud for buildinguh their platforms uh and uh we have aproject uh from a group of telos uh in ain a thing called the uh TM forum uhthat had just crossed a milliondownloads uh this is people like Orangeum Vodafone and others that are buildingtheir platforms on CNCF WA and cloud ireally look forward to getting to knowyou we'll back be back at the cosmicbooth afterwards for questions we've gotthe WMCloud booth out on the um out onthe pavilion floor as well if you cometo womcloud.com you can click join ourSlack you can come and hang out and askask any questions you want the wholecore team is in there i think I'm out oftime for questions uh but I will beavailable here on the side of stagethank you so much for your time2025-04-15 21:59:46.417668 ��v�/#��#AHrO5KVMQfHsuh thank you everyone for coming today ahuge thank you to uh the CNCF uh thevolunteers and the staff and everyone ittakes to put on today's talk i reallyappreciate some of your time uh todaywe're uh I have a talk put together umthat was originally titled um you candeploy a thousand apps to Kubernetes butcan you keep them up to date my name isLiam Rand Randall i'm founder and CEO ofCosmonic where we create enterprise webassembly i'm co-creator of CNCFWASMCloud former VP of innovation atCapital 1 and a longtime open- sourcecompany builder i built the very firstuh Kubernetes company in 2014 acquiredby Capital 1 uh before that I didCororeite which is built on Broiquethat's a unicorn i'm also an investor ina number of open source companies builtaround OSQuery Cloud Custodian and manymore i'm incredibly passionate aboutopen source i love long-distance uhswimming i've got three kids and I livein Washington DC if you'd like to talkto me about wine-making I'm availableafter the show aswell uh let's get started uh with a fewprinciples about platform engineering umuh some things that we want to align onuh and maybe review a quick agenda uh sowe're going to start by having an honestlook about the state of containers andorchestrators they've gotten pretty goodbut there's still some gaps that arefundamentally fundamentally inherent inthe technology we want to make sure thatwe align on then we're going to talkabout what is web assembly we're goingto talk about the platform harnesspattern uh look at a couple referenceapps uh and then hack a few things liveon stage which I think is always fun i'ma big fan of the live demo you know wecan all live and die by that togethertoday but I've got I'm going to try fivetimes so we'll see what happens uh someof the principles that we want to alignto today are the principles of leastprivilege secure by default uh a limitedblast radius uh the talk is directed atplatform engineers so I'm assuming thatmany of you think very deeply aboutthings like blast radius how do I handlethundering herds and ��team to catch them with uh products wedon't have that we expect the same theteam that builds it to operate it overtime so as well as um so it's fundingit's how the team is put together howthe team operates um and it's thelongevity as well which is reallyimportantawesome and so um Colin um related tothat so I've decided I'm going to builda platform i am going to uh I I got thesign off to kind of treat it as aproduct within the or organization sohow do I really make sure that myplatform aligns with my business goalswell for me it's been really interestingwatching the evolution of platformengineering over the past several yearsum I think we're we're watching and whatwe're seeing is we're seeing uh thedevelopment of the competency ofsoftware engineering into theinfrastructure space so thetraditionally the infrastructure teamsor the core IT team space um typicallywe'll see that um we need infrastructureto be able to run and maintain a systemthat developers can use or other membersof the organization to keep thingsrunning keep things afloat and whatbusinesses are starting to realize andand maybe it's because we're gettingmore technical across all domains isthat there's more that we can do andthis thing that we're building maybe canend up being a a core piece of ourbusiness and a core engine of ourbusiness instead of just being a thingthat keeps things alive right um andpart of the the challenge when we'retrying to justify and get investment forthe platform and if if any of you havehave uh started to consume and use thethe platform maturity model you'llyou'll see that investment is one ofthose big aspects and measurement is oneof those big aspects and um engineersare finding now too that we need tostart to understand how to sell thingsinternally also so we have to puttogether a business case oftentimes forthe organization and we have to findways to deliver business value and so weneed to find ways to translate uhtechnical problems and challenges intothings that can benefit the business orkill two birds with one stone or orthings there so as as part as far aslike planning the platform and thosethings there we we really need to toinvest in our understanding of how the bbusiness operates and what moves theneedle for the business and use theplatform and develop the platform andits capabilities to make sure that theyfacilitate growth of those particularareas that way when uh non-technicalstakeholders at a company observe thedevelopment and growth of a platform andchoose what to invest in um and thetechnical employees choose what toinvest in and where things need to gothey're both moving the same needles atthe same time so that the platform cando that but in order to do that wereally need to understand who our usersare in the organization like even allthe way across to business uh usersright um but that takes a lot of extraeffort too but aligning on businessgoals and understanding that having aperspective from technical and also thethe business side that's really the keyto start with well and those businessgoals um you know they need to be beyondjust the I want to deliver somethingfaster right they need to be beyond a abusiness wants to outperform theircompetitor they want to increase annualrecurring revenue and so reallyunderstanding that technical or sorrythe business terms and what the boardcares about and what your sea levelscares about like when you are looking atyour platform as a product that's whatyou really need to align it for becausewe all know platforms aren't necessarilyinexpensive so um to the point aroundpeople Valentina you talked about youknow your personas uh do you want tooffer some about what you need to thinkabout with personas yeah absolutely sowhen we think about platform as aproduct right we start thinking aboutwhy we start thinking about why becausewe are building a product and who areour our users who are our personas we wetalk about alignment business alignmentand how that aligns also into thebusiness and the company or goals thatwe have uh so you will probably willstart identifying your end users yourc�ustomers and how the platform benefitsthem and then building like you knowyour metrics around that which is greatbut also one of the things I feel likesometimes we miss is the people whobenefit and interact with the platformdirectly or indirectly and that could bemany teams it could be the developersthat interact with the platform it couldbe directly or through an IDP or also adata science or a security team sounderstanding all those personas is veryimportant to understand their needs andstart building upon that well and Ithink just to add to that it's alsobringing them along for the ride sodon't just decide you're going toimplement a platform and then do allthese things and roll it out for eachone of those personas you want to beincorporating them into what you'rebuilding because it's a big culturalshift in an organization yeah exactlyand also will help to the future no whenyou start thinking on iterations and howthat evolve it will really help you toconnect with that audience okay so nowwe're going to talk a bit aboutimplementation and so we're going toback to you Valentina and we're going totalk around self-service right soeveryone's here self-service you canjust use this and you're never going tohave to talk to DevOps or SRRES and it'sgoing to be amazing um so how do youlike how do you how do successful teamsactually make that self-service happenyeah no it's a great question it's myfavorite topic i'm actually working onthis for like the last six months sosome of the things is first definingwhat self-service means right and theidea is to remove blockers to acceleratesoftware development uh to removecognitive load so but in order to doself-service going back to the personaswho are we serving who are the ones thatare benefit from something that I findout when working with customers thatsometimes we don't look at the processesso something like you know a valuestream mapping could help to startmeasuring the processes to understandwhere are the gaps how much somethingtakes versus how much uh how muchsomething I need to wait for so that isvery important because it could help youfind where the blockers are but alsounderstanding the user needs soconnecting with those personas willreally help you what is that they neednow and also what they may need in thefuture and one of the things that beenworking lately is not only aboutapplications developers sometimes need aplayground to start trying newapplications what if I want to startbuilding my first AI application shouldI use a development cluster where is itsafe maybe I need like an ephemeralcluster something that I can spin upquickly and it doesn't spend so muchresources what if I need to configure aspecific name space maybe withobservability or service mesh oranything how I can enable all thosedevelopers or these personas in reallycreating and building faster so I thinkthe self-service approach is really away also for platform engineering toscale so how many platform engineerengineers we havetoday great so to all of you if you'rethinking well right now I have you knowa couple of data scientists one whathappened with that start growing rightthe same with developers how you howmuch time you are spending and enablingeach developer or each application whatif you take that operational knowledgeyou put it into best practices you takeyour organizational guard rails and youput it into templates and then you letfor example I don't know GOPS or I don'tknow uh whatever tool that you are usingin order to use like configuration codeso you can ensure that you always havethe current state you want in yourcluster so start thinking how you canscale your knowledge to serve betteryour organizationand so Simon I'm going to throw thisquestion to you so I've self-servicei've thought about my personas i'mbuilding something and it's going to begreat because everyone's going to dostuff on their own and I'm not going tohave to do anything and I have thesetemplates so wonderful but what aboutsecurity and governance and all of thoseparts of building your platform yeah weneed to go and support th all o�f thosethree aspects within our platforms thereneeds to be a technical implementationas well of policy um governance and andsecurity so easy examples that we knowfrom from financial services for exampleis that often there is a split that umif you work on if you are a developeryou cannot directly release toproduction right there always needs tobe additional approvals that need totake place now there has to be a way atechnical means of gatekeeping that sowe might have um in the case of githopswho can approve a PR that allows us togo and merge a branch that leads toproduction so the the point is is thatthe platform is going to need to to umallow us to enforce policy and we mayalso do have to do the supportingengineering to um cater for that as webuild up our platforms as well awesomeanything that you want to add to thatyeah I I go back and I think about thepersonas and I I think aboutplatform engineering versus platform asa product and kind of the the growth andthe shift into that space and securityis is a great example because we wethink of we start with a developmentplatform maybe or we're just trying toget apps to production but the the realchallenge is we start to recognize andwe say hey uh we have entirely new usersets we have security teams rightobviously we're talking about uh we evenhave compliance teams uh we have uhalready have observability or S sur wealready have developers now we'reintroducing hey maybe what about projectmanagers and life cycle what about uhagain the business stakeholders heckeven I need to be able to ship apps tosalespeople how does sales people or thebusiness analytics folks get in therehow do they access this informationright what's the gateway and the the waythat they connect securely and do allthat we start to grow this massive uhset and the the platform needs to growwith it but when we as we do that as wedevelop the and and grow as platformengineers into a deep understanding ofeach of those domains and each thingeach one of those users needs toaccomplish it's really it's our duty todesign the product to be able tofacilitate interaction between thoseteams without them really knowing itright we have to meet them and even Ifyou haven't implemented UIUX processesin your platform please do um it's thenext step for you probably um with withwhat we see with all the talks andeverything but um building thatunderstanding and finding thoseopportunities for overlap that's reallywhere you start to hit um amplified ROIright right and that's where you reallystart to drive and that that adoptionand that investment because whensomebody can self-s serve and use theplatform and it helps them work withsomebody else without them even knowingthat they're connecting like just thenature of you doing your job now ismoving the business forward i thinkthat's when we really hit the nicepowerpoints well and that is conveyingthe importance right you talked aboutROI you talked about you know you'redriving business change like that iswhat you should be communicating whenyou're talking about thisplatform just kind of moving on a littlebit so um Simon I'm going to go to youon this one so how do you keep theplatform requirements in sync with likeall the changing needs within anorganization and then again conveyingthose changes and that investment to thebusiness yeah so you've brought youryour users are along with you for theride um we we have to bring our ourusers with us they're going to bestarting to use the platform right fromthe very beginning as we bring in PC'sMVPs and as we increase the level ofmaturity they're with us what we do needto see is we need to get to ensure thatum that development teams that arerelying on a platform are able to getfeature requests quickly and easilythrough to the platform developersthat's really important um so um ifthere is too much of a gap between theconsumers of a platform and the teamsthat are actually engineering theplatform then then we start to see itnot meet needs so there needs to be areally um important dialogue there umthe other thing too is um measure it soum um is is we �do need to report to ourour management structures around how isthe platform beingused are releases actually happening orare people finding ways around it sokeeping that reporting going umdorometrics they're they're a place tostart build a framework that works foryou on that well and that's a greatpoint again around conveying thisbusiness value or this value of yourplatform it's what are your metrics soevery organization is going to havegoals they're going to have KPIs andOKRs and it's all going to kind ofwaterfall down but what are the metricsthat you should be Colin kind ofexplaining to the business when you'remeasuring a platform successthis is a really challenging one for alot of organizations um a lot of timeswhen we get started in helping uh anorganization take kind of take this turnand implement a platform strategy wefind that uh they don't even really knowwhat to measure but it's because theydon't know how they do their work theydon't have anything defined as far aspolicy or the way the business shouldrun they don't have that goodunderstanding so going through that UIUXprocess is a great way to surface thatbut also to determine what's valuable toyour users and that's really the keything right and this is not a newconcept and that's why I love theplatform as a product turn and thatassociation because if you're building aSAS product or anything else any otherproduct you need to be able tocommunicate that ROI and you need to beable to say hey this this product issuccessful because it delivers X for howmany dollars and how are we going toprice it and how do we do the costanalysis right all that stuff um but thekey thing I think the simplest way tothink about it um you as a a platformteam are probably initially going tothink of success metrics as what is myuptime right how many users do I havehow many people have I forced into thisplatform right and mandated that theyneed to use this thing but the we needto encourage them to adopt the platformbut be able to show and almost drivecompetition between some teams but theeasiest way to encourage adoption of theplatform and also show that you have anunderstanding is to create success memetrics that are based on what theconsuming teams what makes the consumingteam successful and the success of theplatform and the product should be arollup of the metrics that make theteams successful that way if you have ateam that's adopting and a team thatisn't then you can say hey our users onaverage increase their ROI by 50% andthe other teams are going to say "Holycrap how did you do that?" And you say"Well I used the platform that yourefused to use." And they're going todemand and say "I want to get on therebut it we work a little bit of adifferent way." And it'll start thisdialogue in an exchange where you get toinform or the the we as the user get toinform the team on what our needs areand we can start to build that value andwe should set a roll up of those successmetrics so we have two different teamsthat might have two different sets ofmetrics but we can measure what successlooks like by quantifying thoseindividually and then just rolling it upand um the platform maturity model thatCollins worked length on does give somemetrics that you can useit it defines how to walk throughcreating and understanding your metricsis really the key thing um like anycompany we're all unique and different aplatform essentially is a company and anorganization so you need to start at thebeginning and understand what makes youtick and define metrics based on thatand just grow through maturity if if youskip steps and go say what makes otherteams successful let me copy that youwon't effectively implement that so inthe maturity model I'd encourage you toread through each of the aspects ittalks through adoption uh investmentmeasurement and others and start at thebeginning and if you miss out on onsmall key things um don't advanceyourself be pessimistic don't advanceyourself to the next level of maturitygo solve that problem it'll give youthose high impact loweffort things andand go from there awesome so we have afew minutes left really as you know asyou leave here you know we just wantedto get everyone thinking about okay ifmy platform is a product you know you'reall building a you know your company hasa product whatever that might be andyou're trying to sell it and you'retrying to get users and you're trying tomake money off of it and you're tryingto reduce risk you want your platform tobe doing the same thing and whetherthose users are internal externalwhatnot you're looking at how do we geteverybody using it how do we reduce riskand how do we make the platform makemoney right and that is through enablingyour developers to make money uh todevelop that new feature um so in inkind of my final question as we move onis really open to all because itwouldn't be a talk without mentioning AIso we have AI we have new tech like howdoes a platform like how does it supportthese new workloads these new ways ofworking um and how does this productmindset help you be ready for that andit's like one minute each valentina doyou want to start yeah so Mike yeah sothank you yeah so one thing to rememberis when we talk about platform as aproduct we we want to be agile right andwe want to be able to iterate iterateover each feature and also have a goodfeedback loop between your personas sowhen we're thinking about AI we reallyneed to discover who these personas arein your organization from AI engineer MLops and also data scientists so lookinginto the different personas uh andunderstanding what are theirrequirements and looking into forexample it's a service approach how wecan help them uh because for exampledata scientists they want to you knowbuild the models and do model trainingdon't necessarily look into a new thingwhich is you know learning kubernetes uhwhich is amazing we all love kubernetesbut you know everyone wants to focus onwhat they do best so how we can enableall those peoples to remove kind of uthat complexity by allowing them toreally focus on what they should bedoing which is in this case likebuilding models but also looking intothe new AI uh type of applications thatare coming and the different tools andhow we can bring all those toolstogether into the platform in a secureway because that will be important weall want to have control over what isthere so um so yeah I think it's that ySimon quick 30 seconds oh um I'm goingto be really um controversial and Ithink it we really can't do AI withoutthe capabilities of cloud native um Ilook at it very much technically i seeAI as being um brilliant functionalityand cloud native platforms provide usthe non-functional capabilities that weabsolutely require and I think um just aa very quick pointum if you're familiar with teamtopologies um consultation okayfacilitation and then X as a service weneed to be moving as quickly as we canto to X as a service that and I think umthat's relevant for AI because thebecause the iteration and the speed isgoing to be so important and Colin finalwords I think the AI story is uhelevating the need for platformengineering um AI really is has openedour eyes to a new form of value deliveryum and we need to well we all know weneed to protect our data and everythingelse there so on the technical side weknow how to a we know how what we needto do to run AI right but at the end ofthe day AI is a value delivery storyit's an application development story aswell we can run the best models the mostoptimized models but if we can't getthat into the hands of the right usersto use it the right way if they can'tuse it and access it it's it's moot soplatforms is about getting the righttools to the right people in the rightway so they can do their jobs awesomeand we popped up here some resources aswe've mentioned we um a number of us puttogether the cloudnative maturity modelthe platform engineering maturity modelall of this piece of content helps giveideas around business outcomes metricsall the things that you will need toconsider as you're building yourplatform as a product so thank you allfor being on the panel and thank you allfor coming2025-04-15 21:59:47.200358 DD��0#��uAKtW4HkonQHUhello everyone um thank you for comingto our session around conveying theimportance of platform as a product umso for those in your room obviouslyyou're interested in this topic um weare going to talk about platform as aproduct so it's this engineering conceptjust to make sure we're all aligned asto what we're talking about today thatbasically says we're going to run ourinternal platform the same way that weare running our product development umand it's about making the frameworks andthe core systems all kind of workingtogether supporting business processesum and and all of that so I'm DanielleCook um I'm a CNCF ambassador i've beenhelping people adopt cloudnativetechnologies since2016ish um and I'd like to get the restof the panelists to introduce themselvesand then tell us one common problem thatyou are aware of with platformssounds good hi everybody i'm ColinGriffin uh I'm a founder and CEO of aplatform ccentric software developmentcompany called Crumbware uh I'm also aco-chair of the CNCF uh platformsworking group so glad to be here uhtoday um I got into platform engineeringfrom the application development sidebuilding applications and supportingapplications and it ending up becoming adebugger uh for infrastructure teams totell them why their infrastructure wasbroken and not my application uh so thatgrew into platform engineering practicebut awesome my focus uh my name is SimonFer and um I'm a an architect and I doengineering work in uh financialservices here in the city of London umand um I've been involved in theplatform space i'm an engineer a aninfrastructure engineer by backgroundand with the adoption of uh cloud nativeI've seen that that change over timethank you hello uh I'm ValentinaRodriguez i'm based in New York i'm aprincipal architect at Red Hat but mybackground started as a developer like Idon't know 20 years ago and I did thatfor 15 years and then I I joined Red Hatand started working on platforms um andthat will I started looking into thedifferent personas and how the businessconnects with the technology and helpingdevelopers uh you know take theadvantage of the platform also yeahstarting to learn more about howorganizations adapted and what they weretheir blockers and challenges so Ibecame kind of passionate trying to helpcustomers and different organizations inthis space so I've been working withthese amazing people here for I don'tknow a couple of years a year I don'tknow uh but also a contributor of theQflow project i'm a KCD organizer andyeah I think that's all so thank you andwelcome awesome so we're gonna jumpright into our questions we've kind ofput this together in a few differentways we're going to talk about theplatform as a product approach we'regoing to talk about managing itimplementing it um and then kind ofiterations of it and then where you gofor the future so we're going to kickoff now um starting with Simon soobviously loads of people that we meetand um even some of the sessions we'vetalked about you will talk about I'mgoing to start this project i'm gonnacreate a platform and that's how you'rethinking about it but what's thedifference between talking about aplatform as a product or just platformas a project yep so the um for me umprojects are time bound they're scopedthey often have a set budget umrequirements for a specific outcome aretraditionally defined right up front andthey really adopt a waterfall typemindset um so um with a a productproducts tend to be uh they're iterativeum they um requirements come in overover time we have consistent teams aswell that go and uh deliver productswhen we run a project we often see we'vegot a project team creates the um theartifacts systems whatever throws themover the fence and there's no operations��walk youthrough LinkedIn scale because withoutputting things in perspective I don'tthink you'll get a full picture of whywe're building what we're building firstof all I assume you're all familiar withLinkedIn uh we have over a billionmembers on our site and these membersare served by some 3,000 services andsome of these are databases streamprocessing services or microservices andthese systems run on about 500,000 plusservers and these servers are again amix of Kubernetes as well as as well asour in-house orchestrator and theseapplications run on one and a halfmillion containers uh it's a pretty goodnumber and they all get deployed prettyfrequently we have we're servingthousands of engineers internally atLinkedIn and they're doing deploysmultiple times a day on theseapplications and one interesting thingabout us is that we do not run on acloud provider we have our own datacenters and we run everything on baremetal we do not do virtualization uh formany reasons one of them beingperformance so today we'll walk youthrough our compute platform uh you hereyou'll see three layers at the bottom wehave an infrastructure as a servicelayer you're not going to see anyKubernetes in that box and on top ofthat we have a Kubernetes clustermanagement layer which are the conceptsthat you're kind of familiar with youknow there are clusters pools nodes thatkind of stuff and we'll talk a littlebit about uh some of the controllersthat we built there and on top uh mostof the talk we're going to dedicate thistalk to the workload platformlayer but first let's start with thebottom two layers so I'm going to coverthese one by one uh I want to start withthe infrastructure as a service layerand then move on to the Kubernetescluster management so in this case inthe infrastructure as a service layer uhyou saw a few boxes and let's go throughthem one by one first we have aninventory manager you can think of theinventory manager as a metadata store ofwhich machines we have in our datacenters so for example if a data centertechni technician comes into the datacenter and racks a machine in well thatmachine has to show up in a database forus to use and that's the machine nameyou know what are the characteristicswhat is the hardware what's the CPUmemory GPU that kind of stuff uh and ontop of that we have this componentcalled compute broker compute broker isbasically kind of like the heart andbrain of our uh data center and machinelayer uh it's a centralized componentthat everyone talks to it's not akubernetes operator it's actually just agRPC service it's a pretty simple one ithelps you manage the pools and addsremove capacity to the pools uh ifyou're using a cloud provider you canthink of it as your you know virtualmachine scale set or you knowautoscaling uh instance group API and soon so forth uh so the pools that we haveare composed of heterogeneous hardwareum we don't really get fixated on likespecific machine types and so on butthese machine types are largelyinterchangeable within a pool and we'lltalk a little bit about this shortly soeach pool specifies something called anode profile which is basically what doyou want out of a machine and we kind ofcapture these minimum requirements asthe lowest common denominator toconfigure a pool and the compute brokeritself is also the source of truth formachine allocation and maintenanceoperations like if a machine belongs toa a pool compute broker is basically thedatabase for that and if I let's saywant to take a machine out formaintenance or if I want to you know uhchange the hardware on that machine Icome to the compute broker to schedule amaintenance operation and lastly we haveu host health remediation well it turnsout if you have that many machines somemachines are always going to be failingin your data centers uh so we have builtautomated systems to take machines outof rotation um service them do upgradeson them and so on and so forth so themaintenance orchestrator that we buildactually has some decent ideas fromKubernetes itself uh it's anorchestrator that goes through our fleetand upgrades our fleet� because againwhen you run bare metal you have toworry about kernel upgrades securityupgrades and soon so the first concept that I'll talkabout in the maintenance domain is theconcept of maintenance zones these arenot like your availability zones they'reactually just software update domainsfor us so when we roll out a softwareupdate like a kernel upgrade through ourfleet uh we stripe our data center to 20different parts let's we call thatmaintenance zone 1 2 3 all the way to 20and we roll them out roll these softwareupdates out one by one by doing so wekind of ensure that we're not takingmore than let's say 5% of the capacityin a particular data center out at atime so if we have a bad kernel thatgoes out and somehow you know themachines don't come back online we onlyuh take down 5% of the fleet that wehave uh now the compute pools that wecreate that runs our applications thepools themselves have machinesparticipating from each of thesemaintenance zones and so when you lookat a pool uh it's striped evenly acrossour data center however that said theKubernetes clusters themselves are stillvery much a fault domain right when youdeploy a bad operator or when you messup your web hook configuration you'rekind of breaking that entire cluster sothere's that uh I want to talk a littlebit about the coordinated maintenanceoperation uh operations that we have soas I mentioned mentioned earlier uhhumans are not in the loop for doingmaintenance in our fleets all oursystems are rather automated ifsomething goes wrong again the systemstalk to each other to figure out whatthey need to do so we have this conceptof a disruption disruption basicallymeans you want to take the control fromKubernetes or whoever is using thatmachine away from that person and giveit to a maintenance actor so we have twotypes of disruptions in our fleet one ofthem is planned maintenance operationsthis one this is basically you know yourregular OS kernel upgrades your you knowswitch upgrades that are happening inyour racks hardware u decommissioningoperations that we have or maybe thecublet upgrades themselves as well andthe unpl unplanned operations as you canimagine there you know when stuff goesreally wrong and we need to get thismachine out of the way so these areusually the hostile remediationoperations so this picture kind of showsyou um what happens when we receive adisruption so in this case you have ifseveral actors a machine disruptor comesin creates a disruption and thisdisruption gets sent to the computeworker so now comput broker knows aboutthis disruption and our clustermanagement layer observes that okaysomebody wants to take this node awaylimit core don't drain the node and weapprove the disruption we give a thumbsup the maintenance actor sees that itdoes its thing but whatever it needs todo either reboot the machine reimagewhatever it needs to do and then theyremove the disruption and then we putthe machine back in rotation so all thatisautomated now I want to talk a littlebit about how we organize our Kubernetesclusters i bet you're all kind ofcurious about that uh we don't use anyspecific Kubernetes DRO we just take theopen source bits we configure them inour own way we don't use any cube atomor cluster API uh a lot of the featuresin cluster API are actually stuff thatwe don't uh need uh we use a singleprovisioner you know we we don't run oncloud providers and so on so our stackis pretty simplistic in this regard andwe want to keep it simple uh so thisworks well with our clusterbootstrapping stack we we also tune alot of flags in API server and CD um interms of cluster size which is I assumeis a hot topic we'd like to push ourclusters to 5K and we want to actuallypush them a little further and the mainmotivator behind that is to reduce thehardware fragmentation across clusterslike if I need certain hardware used bya certain customer I want to kind ofkeep it all in the same place andanother reason is uh we want to do thatis um clusters are multi-tenant andcustomers grow in place you deploy anapp to a cluster well we want to give ita little bit� of a wiggle room to growand uh our clusters are pretty mixed upwe run stateless stateful uh batchoperate batch workloads all in the samecluster and the cublet upgrades that wehave in our fleet again uh as I coveredearlier they happen as part of the OSmaintenance we regularly upgrade uhcublet itself and we have a centralizedhub cluster to manage all the otherclusters that we have and manage whichapplication goes where which Ronach isgoing to talk a little bit shortlyso we use the Kubernetes resourcemanagement uh concept we use the KRMAPIs we've built our in-housecontrollers to manage the pools andclusters that we have uh here's a prettysimple example uh we modeled theKubernetes pools as a custom resource inour management cluster and these changesportion of them gets converted into acompute broker custom resource and thenthe comput broker pool gets synchronizedto the gRPC API that comput brokeroffers that I mentioned earlier uhhowever you use this custom resource tospecify other like Kubernetes relatedsettings for example what demon sets doI have what node labels do I have rightand um when you model things this wayadjusting capacity on a pool becomesreally easy you just change an integeron a YAML file you submit it and thenyou wait for a minute and suddenly youhave bare metal machines showing up inyour cluster prettymagical i want to touch on how we scaleKubernetes itself um as I mentioned wewere running pretty large clusters andthe clusters the control planesthemselves are shared resourcestechnically if your control plane goesdown it's bad time for basicallyeverybody right um so when our customersbring their own operators into ourecosystem we heavily review theirarbback so we can kind of control whatthey can and cannot do and on top ofthat we use the APF which is APIpriority and fairness to ensure everyoneis kind of staying within their limitsnow is a hard topic um again it doesn'thave a lot of these cool mechanisms torestrict load u and CD also happens tobe the first bottleneck when you try togrow your clusters um as you submit moreevents you know there's more pod turnurnand all that like yourd will getoverloaded uh so one thing that we'vedone is we've increased storage uh limitfrom 8 gigs to 16 gigs and we'replanning to potentially increase itfurther uh we rund on SSDs so uh basedon our stress testing this seems to beokay and uh we also built our internalbackup restore system so in casesomething goes uh catastrophically wrongwe are going to resort to that andcontroller scalability also remains tobe uh an active topic that we're tryingto figure out because controllers we runmany of them and you can't infinitelyscale controllers actually funny enoughright now there's another talk happeningelsewhere about horizontal cu controllerscaling and it's not our sole problemright now so uh this bothers us quite abit and we're working on it and I'llpass it to Ronak to talk about workloadplatforms so for the rest of the talkwe'll focus on our workload platformlayers so we use Kubernetes to run ourstateless stateful workloads as well asjobs now we don't want our appdevelopers to be Kubernetes experts orworry about how to craft the exactdeployment spec so what we do is wecreate custom resources which wecreatively named as LI deployments andall stateful sets so users only haveaccess to these two custom resourceswhen they're deploying statelessstateful services uh these are exposedto the users and they don't have anypermissions to any other nativeKubernetes resources at the same time wealso want to cater to other platformteams like Spark machine learninginfrastructure teams who want to buildtheir own platform on top of Kubernetesso we also support those teams byproviding volcano as a batch schedulerto help with things like fair schedulinggang scheduling and we have built ourown regional job queue as well as kotasystems the reason we built our own isbecause the systems that exist involcano today they are cluster scopedwhereas we have hundreds of clustersrunning within a region and we need kasto be effective across all of those forthe� rest of the talk we'll focus more onhow we run services onKubernetes now about a decade agoLinkedIn started building its owncontainer runtime and auler this isbefore Docker and Kubernetes days wehave been running all of our statelessand stateful applications on top of thatsystem and it has been serving us prettywell so far over the last few years westarted migration to Kubernetes and wewant to do this migration without anydowntime to the life site we want to dothis centrally with automation withalmost no user involvement and at leastfor the stateless applications and wewant to challenge legacy requirements aswe go ahead so that we can reduce techdebt along the way at this point we havemore than half our stateless fleetrunning on Kubernetes and we haveseveral stateful systems running inproduction on Kubernetes aswell now before we go into how we runthese services on Kubernetes it'simportant to talk about our internalservice infrastructure at LinkedIn wedon't use several Kubernetes featureslike services DNS configs and secretmanagement or network policies and thereasons are twofold one these featureswork really well within the boundary ofa single cluster but when you spanacross hundreds of clusters the overheadbecomes really large for example makingKubernetes services work across all ofthese cluster boundaries becomescomplicated really fast the other thingis we have an equivalent of all of theseofferings at LinkedIn which have beenscaled to all of our regions and theyhave been hardened over the years so wehave been developing these offeringsindependent to Kubernetes and then weintegrate Kubernetes with it to run ourapplications ontop so Kubernetes becomes the primarypod orchestrator for us and we don't useseveral of these Kubernetes featuresthat I mentioned but we reallyappreciate the flexibility that itoffers so that we can extend Kubernetesfor our use casesnow let's first talk about stateful onKubernetes uh it's been an ongoing memethat Kubernetes does not really work forstateful systems especially the onesthat store data on local disk now wehave several data systems like Kafkasome of you might have heard of that uhEspresso Wen Pino etc all of thesesystems store data on local disk and wedo local disk because of performancereasons these systems are replicatedthey're sharded so any part eviction orupdate requires coordination simple PDBsor stateful sets don't quite cut it atthe same time we don't want to ask allof our data systems teams running datasystems to run learn Kubernetes andwrite Kubernetes controllers so what wedid instead is build a generic statefulcontroller uh that integrates with shardmanagers that teams rely on now statefulteams are really good at managing datasystems they know how to build theseshard managers our controller interactswith these shard managers through ourcustom protocol and handles things likepod updates version updates and eveneviction and the maintenance aspectsthat Ameth mentioned earlier u we gave atalk on how we built uh this operatorlast year in cubecon North America andwe also published a blog post we highlyencourage you to check that out to learnmore u let's talk about stateless onKubernetes so on as for stateless andkubernetes a user would typicallyspecify their simple spec in 10 linesfor the alli deployment custom resourcethat we have we also offer various partorchestration capabilities in it likecanary for instance where you can say Iwant to ramp a change only up to 10%validate that it looks good and then goforward we use clone set under the hoodinstead of deployments because clone setoffers volume claim templates so that wecan use a PVC pod which you cannot dowith deployments and then we translatethis clone set to about 500 plus line ofpodsp spec we rely on init containerspretty heavily to integrate with therest of our infrastructure we createdefault volume mounts environmentvariable so that an application can geteverything it needs to run in Kuberneteswithout having to worry too much aboutchanging it going from our legacy systemonto the new stacklet's go through a few user w�orkflows umthe first thing that an app developerdoes at LinkedIn is they would writetheir own manifest they would check itinto their repo as part of our buildprocess we also publish these Helencharts in the Helm repo now because weare serving several thousand engineerswe want to focus a lot more onvalidation and prevent people frommaking mistakes even before the code ischecked into production so what happensis when you have a PR and let's say youdon't specify a field that is mandatoryin the manifest we have a GitHub actionthat would call out saying hey you'remissing memory there for example andit'll block the PR from getting mergedthere are several such checks which runand beyond schema checks what we also dois we have similar checks that we do inour web hooks we run them on the PRsthemselves so even before the requesthits the API server you already see inyour PR what are all the things thatyou're going to be blocked by this hasresulted in significant reduction ofuser support load for us because nowusers can see what's wrong and they cango and fixit um we use a namespace per applicationthe reason we do that is because all ofour permissions are based on namespacesand a user only has access to namespaceswhich has applications that they own ifa namespace for an application does notexist what we do is we create for thefirst time and we route that to acluster so let's walk through thisworkflow for a second a user will neverbe requesting a namespace they don'teven know about namespaces they don'tneed to know about it they go to ourdeployment orchestration service andthey'll say I want to deploy myapplication if we get to know that thenamespace does not exist for this app weget information like what is the appname what tenant meaning is a statelessor stateful application and the appowners get to specify what node profilethey want so mentioned earlier a nodeprofile is basically a combination ofwhat is the machine type you want to useand hardware configuration you want torun on so we get that information wehave a controller at the hub which alsogets information about yourauthorization rules from our regionalauthorization service what it'll do nextis it will find a cluster which has apool that matches the node profile thatyou have and if one doesn't exist it'llcreate one and it'll propagate the namespace as well as role bindings to thatparticularcluster the next step that happens isthe app getting deployed now thedeployment orchestration servicespecifies the application version passesit to Argo CD within the cluster argo CDwill sync all the manifests from theHelm repo apply to the API server andthen our controllers take over ourcontrollers interact with all of thoseservices mentioned above those areregional services deployed and allcontrollers running in every clustertalk to them and then we translate thesecustom resources to the part specs thatI mentioned earlier we have several logsand events that get emitted as part ofour controllers and applications all oftheir make all of them make their waythrough Kafka to Azure data explorerwhat we do is we create defaultdashboard for every application so youcould simply go to a link type in yourapp name and you get a bunch of eventslogs as well as metrics like how muchCPU memory your application is using howmany times your containers arerestarting why they're restarting howmany replicas are desired and how manyare available so all of this is given tothe app owners out of thebox uh a quick note on Argo City uh sowe run Argo City one per cluster so itmanages all applications within thatcluster uh it has served us pretty wellso far while the scale was small butwhat we have started to see is that asthe number of objects in the clustergrows the application sync times havebeen going really high the other problemwe start seeing is that as the number ofreplicas within an application grow thehealth status sync is taking too longand this has resulted into not the idealuser experience that we want and it'sbecoming more of an operational word forus so what we're doing now is we'rewriting our own G�itOps engine and we'llbe replacing Argo CD for majority of theapp deployments we'll still use it forsome of the infra pieces but notnecessarily the end userapplications uh let's talk aboutfailures and categorizations for asecond now going back to again servingseveral thousand users u if we don'tspecify why an application rolloutfailed we would be getting a lot ofsupport requests so what we do is uh foranything that is a validation error assoon as the manifest is attempted to beapplied we fail it right away specifyingwhy it's a bad manifest and why thedeployment will fail if it's a terminalerror then we fail it on the firstreconciliation specifying it's aterminal error and if it's a transienterror or an application health issue inthat case we use progress deadlineseconds we implemented our own progressdeadline seconds on top of our customresources what it does is that if itobserves there is no progress made for agiven amount of time then it fails thatdeployment in addition to failing thatrollout what we also do is we categorizewhat the failure is and whether it's anapp issue or an infra issue so if yousee in the conditions so what we do isall of these app categorization theymake their way to our object conditionsand we show user uh user readable errorssaying this application failed becauseyou have 18 parts which are unhealthy onthis version and this is an app categoryall of these condition we don't expectusers to necessarily go and query everysingle object so what happens is ourdeployment orchestration service willtake all of this data and send this tothe UI so a user goes to the UI clicksdeploy deploy fail they actually seeexactly why it failed if they went andinspected the object in the cluster theysee the exact same thingnow because we have custom resources andwe have application running acrossseveral hundred clusters we don't wantusers to have to figure out where theapp is how to query it how many parts tolist so we provide a cubectl plugin thatapplication owners rely on the semanticsof the plug-in are very similar tocubectl but instead of cubectl a userwould type cubectl in and then theywould do something like get pods getalli deployments get ally stateful setsum what we do is behind the scenes wefigure out where the app is running andwe scatter gather all of that data anddisplay it to the user in addition tothe default commands that a cubectlplug-in would provide we also have somecustom things like status so a user cantype cubectl and status my allideployment the alli deployment name uhyou don't see the full picture here buton the right side that is kind of thesummary we present to the user we saythis is the alli deployment name theseare the desired specs you have whetherit's healthy or unhealthy and if it'sunhealthy here's why and all of thisinformation is propagated to the user wealso have a system that runs within eachregion it's watching all of these customresources pods and nodes and it collectsthem in a central database we built acustom UI on top that users go to todebug their applications so instead ofhaving to just use a CLI a user cansimply go to a UI and get a bird's eyeview of everything at the same time thatUI doesn't hit our API servers whichallows us to scale API server andcontrol plane it hits that internal APIthat we've built which is collecting allof thisinformation now we also spent a lot oftime on building API guardrails so ifcubectl deleting a resource would resultin an outage then we have deleteprotection for it so this includes allof our custom resources that areuserfacing every single namespace evenfor admins and custom resources that webuilt ourselves we also have otherprotections in place to prevent usersfrom having an incident by just fatfingering for example i'm sure everyonewould have had gone through a time whereinstead of thousand you press 10 uh andpressed enter and something bad happenedso we prevent scale down operations inone shot beyond x% and typically that x%relies on how big your application is wealso have an upper bound on how much maxsurge and canary percentage you can use�so that you don't go over your kotaso what's next um so so far we have beenintentionally running each applicationwithin one cluster that's been anintentional choice to keep operationalsimplicity what we want to do movingforward is federate these workloadsacross multiple clusters we want toalign our clusters with maintenancezones so we get two sets of alignmentsone we don't have a single clusterfailure taking down an entireapplication the second part is everyroll out we do as part of data plane orcontrol plane it also only impacts up to5% of an application and no more it alsoallows an application to grow way beyondits cluster size and we have severalapplications that would grow beyond acluster size uh it also helps withmachine types getting fragmented acrossclusters so let's say if I need fiveGPUs three are in one cluster two arethe other one if we fra federate aworkload we can actually utilize all ofthose machines too now by default weactually allow applications to burst sothey can use more CPU if it's availableon the host while this helps withefficiency we have found thatapplications that care about performanceit's not enough it doesn't give thempredictability and it sometimes resultsinto noisy neighbor problems so what wedoing is we rolling out CPU pinning forthese applications and one thing we dois we only pin physical cores we don'tpin logical coursees so it gives us moreisolation than otherwise we're alsobuilding uh part CNI or IPv VLAN basedpart IPs where every pod gets an IPv6address that is globally routable withinour entire production regions so stillwe don't have to worry about clustertocluster routing it's all just a flatnetwork and cubeception uh we want torun Kubernetes control plane as pods ina management cluster this makes managingclusters super easy uh especially whenwe have several of them in a region andallows us to stack components resultingin more efficiencyuh let's talk about migration lessonsreal quick uh so one thing which we havelearned so far is always start early andmake incremental progress because whenyou're migrating something from a decadeor legacy system to Kubernetes there areseveral things that you'll find that youdidn't expect to u and I'm sure we havewe all have fair share of thoselearnings um one thing which we alsorealized is migrating the first 50% isthe easiest part of the job and theremaining 50% like that's a long tailwhich takes really long we want to alsobe pragmatic about the tech that wechoose to solve because if we chose tosolve everything along the way then wewouldn't necessarily finish themigration at all so one example is wewanted to generate container images foreach application instead of having andteaching every app owner about how towrite docker files we have an automatedsystem that actually automaticallygenerates a docker file createscontainer images on the fly and that'swhat we usewe also want to be intentional aboutwhat Kubernetes features we use andexpose to the users if we expose rawKubernetes to users it wouldn'tnecessarily work for ourscale invest in guardrails when you'reexposing APIs to users because theywould use it in every which way beyondwhat you would expect and I'm sure weall have experienced that uh and developgood user guides and self-s servetroubleshooting it goes a long way andreduces support loadsignificantly lastly uh thank you somuch for coming to the talk i think wehave some time for questions and there'sa mic in the middle and also we havesome other resources we encourage you togo and check out thank you so muchright yeah go ahead thanks a lot for thetalk i have a quick question youmentioned uh um that you have to tuneAPI 7 and flags i wanted to I waswondering if you had considered umstoring the events the communitiesevents to another cluster and if youhave already done that if you went backfrom it why yeah I can take that sorryyeah I can take that one um so we we docurrently run uh separate CD eventsclusters for at least one of theclusters that we have um but we'reactually planning to merge it backbecause we are not liking having two CDsum t�hey're you know you can't use thesame disc if you're running two CDs sideby side you're just going to you're notgoing get that much performance out ofit so instead we're going the route ofincreasing the database size okay greatanother question I had was the operatingsystem that you use on your on your baremetal boxes which one uh so it's allLinux so we run just Linux are you areyou asking specifically about whichdistribution of Linux we run yeahexactly yeah uh we run Azure Linux whichone sorry azure Linux so Azure has aLinux offering so we learn Azure Linuxyeah thanks a lot thank youthanks for the talk uh crazy to see thescale at which Kubernetes can run reallycool um so you are using uh open sourcetechnology but we also saw a couple ofin-house uhdevelopments uh have you considered tomake it generic and donate it to theopen source community especially for methe uh the cube ADM cluster installationtechnology sounds really interesting souh I would like to know more about itbut but I can imagine some of it isproprietary but also maybe you wouldlike to donate something make it genericwhat are your considerations uh in thatyeah yeah i would say that like we um inthe past we considered like donatingparts of the stuff that we have but uhfor example we developed an in-house HCDoperator but right as we were doing thatthe an open source HCD operator uheffort just started so we're you know uhtrying to participate to that as much aspossible but for the most part a lot ofthe cluster management stack that wehave is bespoke for a reason thatwouldn't widely apply and there areactually really cool tools out therethat let you do stuff like cubeceptionor you know cluster API is actuallypretty good for most use casesthanksand so you mentioned earlier that you'reusing arbback for your access controlyou know that works pretty well in aname space code but the moment you needto go beyond that you know somehow youlose that fine grain access and I wantedto ask do you use any other web hooks oranything else so we actually consideredwriting our own uh web hook forauthorization so instead of relying onkumar's arbback at all we basically havethis web hook basedauthorization so far every applicationlives within its own namespace anapplication can also have multiplenamespaces because you can say I want apro cluster a test cluster so you gettwo different namespaces with thosedifferent tags u because we haveapplications bound within that namespacethe arbbacks kind of work out for us sothat's not been a problem uh what we dothough is we don't necessarily treatthat as source of truth what we do is wehave that regional service so we justsync all the arbback rules on a periodicbasis and that's that's how we applypermissions but if you need to go beyondthat then yes webbook is the way to gouh we have several other webbooks fordifferent validation checks but not forarbback thank youuh thanks for the talk uh I wonder howyou solve the ingress and uh useinteractive user access and machine tomachine access between services so u inour production fleet we have flatnetwork meaning every machine can talkto each other every application can talkto each other our service discoverysystem relies on that as a prerequisiteso today every application can talk toeach other and there's MTLS andauthorization in the middle meaning youneed to use your certificates or bettertokens and then you can only talk to aservice if you have the permissions totalk to that API so in today's case itdoesn't matter whether your applicationis running on cluster A or cluster B thenetwork is flat and they can talk toeach other the way it works is our datacenter network is controlled by us andbecause we have control from all the waythrough which rack we run the machine onand how we deploy applications we we canmanage some of that what about theinteractive user access like through thebrowser to these apps yeah sointeractive user apps in terms ofproduction if you're talking to anapplication in production you can'tconnect to it directly uh we usejumphost for example but typically usersare not talking� to applications inproduction through a web UI all of theseapplications we deploy they're servinglink.com we have several internalapplications to expose those we haveinternal proxies that we run but theyare not cumulative controllers uh so wehave a traffic team that managesregional proxies any application thatneeds to be exposed to the browserbecomes a backend on this traffic proxylayer and that's how we interact with itthanks thankshi thanks for the great talk so myquestion is related to uh one of theinitiatives which we are trying to do inour uh company so have you ever triedupgrading the Kubernetes uh componentsalong with the applications which isrunning on top of it like control planeas well as data plane so all the timeyou want to take that i don't fully getthe question actually you want to takethat uh so I just to make sure we getthe question right uh you're askingwhether we do a Kubernetes control planeand data plane upgrades while theapplications are still running yeah andand uh some of the application you canassume that although it's not advisablebut application is running on controlplanes as well uh so none of ourapplications are on control planes uhbut we def we always do in place updatesfor our control plane and data planes uhso there are let's just say two parts toit for data plane it goes back to whatAmeth mentioned earlier the maintenancetrain would go and deploy new cubitversions across the board while theapplications are there so we drain themachine upgrade the machine bring itback into rotation for control plane wehave our custom orchestrator right nowthat'll go and update API server forexample in a rolling phase in a rollingfashion so once we check 10% of it looksgood it goes forward and theapplications continue running it's likeinertia objects in motion stay in motionobjects at stay stay at rest so controlplane or data plane upgrades don'taffect the applications at all does itrequire downtime not at allokay uh it will be very good if we canshare some links or documents the way weyou're doing it that will be veryhelpful for us thank you happy to chatthank youthere might be a talk for North Americamaybe oh yeahhi thanks for the talk um I'm curiousabout the freedom of your teams aboutscalability so you define uh the namespaces for the teams and how what is thelimits for the teams uh limits in termsof autoscalability in resources CPUmemory uh so there's no specific limitin terms Okay the limits are definedbased on two factors one is how manyreplicas you need to serve the site andthen the app owners choose how much CPUand memory they need u typically whathappens is we have another system whichis called autoite sizing so it's notreactive autoscaling it's proactiveautoscaling so majority of ourapplication owners don't have to worryabout how many replicas to use what wetypically do is that we observe sitetraffic over time we develop a model tosee how many how much QPS does thisspecific application need to handle whatits CPU memory usage looks like and wehave a decent idea of how much QPS weexpect to see next week or two weeksfrom now that system will actually gotalk to the LI deployment API here andupdate the replica count dynamically andthe application owners don't need toworry about that okay do you react ifthere's like a overuse of the resourceswe have quota systems in place so iflet's say for example an applicationstarts using beyond we don't necessarilyenforce it in the sense that theapplication is killed but theapplication owner receives warnings thatyou're going above Kota thank youhello thanks for the presentation myquestion more about bare metal server uhyou know the management it was always itis always challenging you know to thethe bare metal server you need to get inthe right nickart you know you need toknow the this the nickname and Linux hasa always recommendation persistentnicknaming you know this and also forthe disk so I think you didn't use theSDA or SDAB naming instead of maybe WWNI mean it's all you need to alsoautomate right so how do you manage uhthat kind of thing yeah I I mean we wedo manage them basically like we have toconfigure the hosts exactly the rightway but like stuff like storage forexample we develop our own CSI driversto be able to you know allocate um spacefor containers to use on the host sosome of that is a little bit you knowtamed with the help of Kubernetes butyeah we do have a host configurationlayer we use um puppet internally and weuse puppet extensively to um configure alot of stuff like you know severalkernel settings um I mean yeah are likepackages running on the dro so on and soforth but my question uh regarding thisnick how do you detect you know thewhich nick card is actually online youknow for example bare metal server has amultiple nick cards and you need to thethe pick the right one to you know toadd it to the cluster So in this casewhen a machine is dragged in our datacenters for example it has one neckassociated with it uh not multiple neckcards and I I would say we have aseparate team who takes care of takingcare of how machines are not justconfigured in a data centers but managedover time u they would be a betterexperts to tell you in detail on how todo it uh but typically when a machineshows up there's one interface attachedto one neck and we know exactly what IPaddress it has okay thankshello hi um thanks for the great talkand uh especially the scale youmentioned in terms of servers likecurrently LinkedIn has more than 500,000servers so is LinkedIn uh managing itsown data centers and if yes then so likeLinkedIn is used globally like allaround the world so how you guys aredealing with latency like you're usingsome CDN kind of services or is it likeyou're have data centers like atdifferent locations so this is somethingI'm curious to know uh so there aremultiple data centers hosting all of ourapplications and then there are manynumber of pops or edge networks uh thatwe rely on to reduce that latency to youso if you're let's say talking tolinkedin.com it'll go to the nearestedge site and then it gets routed to ourdata centers cool thanks thanks for yourtimeawesome i guess we'll take the lastquestion and yeah put it up there hithanks for the talk i do have a questionat some point of time you chose to haveinhouse solution instead of choosinglike vunder solution or also what isyours inkubernetes at what moment do you know ifit was a success and how do you collectfeedback batch on your own work becauseyou don't have anyone doing uh similarstuff so how do you know it has been thegood choice and is there occurs alreadythat you need to change or you come backagain on your decision and choose to useKubernetes stuff yeah I would say likewe try to not you know lock ourselvesinto a corner and we don't we try not tomake decisions that we can't back out ofso that that's pretty critical to us uhso we try to experiment with otherthings um for for most intents andpurposes anything that comes from theopen source into our ecosystem we haveto be experts on that like we got bit bya lot of open source components we hadincidents because of them and so as aresult we started to become reallyselective and really choose picky and westarted to read the source code and westarted to understand okay well here'san open source component what are itsfailure modes what is its scaling storylike how um can it accommodate this sizeof cluster and so on so forth sobasically Um I would say there are keyareas where we are able to successfullybring in something from the open sourceit does exactly what we want and wecontribute back to it uh clone setoperator for example is a good exampleof that like it works really well for usbut there there has been several othercomponents that you know we still kindof hang on to but we're aware of theirlimitations and we kind of work aroundthem uh so developing in-house has alsohasn't been the worst thing i think acompany of our size and our you knowengineering culture it's kind ofreasonable for us to kind of own some ofour fate especially in key areas wherefinancially it also makes sense yeahokay thanks all right thank you so muchthank you and2025-04-15 21:59:47.963808 �F��3#��mAy21i3lG2jUMokay I think we'll get started helloeveryone maybe don't have to use the miccan you hear me in theback all right hii do for stream yes sorry peoplewatching online i apologize uh my nameis Schneider core maintainer of theDapper project streaming committeemember also uh CTO co-founder at DIGGrid and together with me we haveRoberto yes so my name is RobertoRodriguez um I'm also a new Dappermaintainer creator of the Dapper Agentsproject and I'm also part of the Nvidiauh team um so I'm going to be doing howto enable autonomy how to enable AIagents into security operations todefend Nvidia's you know productsokay I'm going to take the first sectionof ourtalk let's start with community updatesthe Dapper project has been through alot uh it's gained new contributors newend users um and one of the mostimportant things that we've seen duringthe last year is that now we're agraduated CNCF project since November 12uh 2024 and this is really a testamentto the validity of the project it'sbroad user adoption the number ofcontributors that are contributing tothe project a�� 2#��wARw4c7lmdyFshelloeveryone um welcome everyone to our talklet's start it's really amazing to seeso many people coming to our talk todayuh looks like this topic of renamingmetrics is really uh you know importantum so it's very relevant and hopefullythat's why you are here um so yeah wehave really exciting content for you solet's let's go let's not waste our timeso why we are here today uh imagine thefollowing scenario you have a countermetric that is exposed and scraped youknow from some example applicationexporter or production application it'svery important right this metric is veryimportant you use it in on importantalert dashboard maybe it's using youknow are you are using it forautosۂ�D�1#��?AdDkXFuy45EAall right um hey everyone welcome to ourtalk uh in the next 30 minutes we'llwalk you through how we are building ascalable compute platform all the wayfrom metal bare metal to apps we arepart of the central comput team atLinkedIn serving thousands of engineersuh we don't use Kubernetes in thetraditional way but we extend itextremely heavily um just a littlewarning we'll have bunch of slides bunchof text and bunch of pictures feel freeto take snapshots and we'll share slidestowards the end as well uh all rightbefore we get started a little bit aboutus i'm Ron i'm based out of Toronto thisis my third CubeCon first one as aspeaker and 2022 Detroit was my firstone outside of Kubernetes I love to playracket sports specifically badminton ifsomeone wants to play uh and I also hosta podcast called Soft Misadventuresyeah I'm one of the top followers of RonNext podcast uh hello everyone my nameis Amit i'm uh coming here from Seattlei think this is my seventh or eighthCubeCon i actually I couldn't count uh Ithink this is my third time giving atalk outside Kubernetes I like gardeningin house plans but actually stilloutside Kubernetes I like to developcubecado plugins for you all uh maybeyou're using a few ofthem so I want to I want to ��caling and you know and SLO toolsand so on and if this query stopsworking you have essentially aproduction incident uh for example ifyou use it in autoscaling if this queryis you know broken then you cannot scaleyour workloads for customers you cannotstart things so it's super serious thingimagine that and it happens rightsuddenly after the underlying you knowserver application that was giving usthose metrics um this applicationrestarts and suddenly after that wedon't have the query working again sowhat happened after some stressfuldebugging you kind of like take a lookand you realize that the applicationwith this key metric was upgraded to anew version and this new versionactually renamed a metric and made asmall you know naming changes to themetric um so what's worse is not only ametric name but actually a couple oflabels right so it change you knowinteger to my number and um category toclass and then label value changethrough you know to get uppercase justbecause maybe you know it was clearerand and easier to read um in ourexperience it's actually a very verycommon scenario um and it it causes alot of friction for both consumers ofcourse users who have their queriesbroken suddenly out of nowhere but alsofor producers for maintainers like uswho are afraid to rename metricsliterally because it can cause possiblythose issues right and the honest truthabout this talk is that when we proposedit in October um we maybe had a smallvision how to fix it how to improve itum but we also knew that the currentsolutions to solve this are kind of weakor not existing so We were kind of likeokay this talk will be a little bit ofmoaning and sadness but actually itsurprised even us that in the CNCF CNCFecosystem both open telemetry andPrometheus communities worked on aalready there are existing corefundamental pieces that combine togetherallows something truly truly magicalright so imagine there is a way to pinyour quer metrics to a certain versionlike you do with your code dependenciesright like you do with your G branchlike you do with your HTTP requests sotoday talk is exactly about that we wantto show you a prototype of of a solutionto this problem in Prometheio ecosystemum using several pieces uh from opentelemetry project as well and and giveyou a chance to give us a feedback as aPrometheus maintainers developers uh ifthis is something you would like to seein a next version if it's somethingyou'd like to do with your metrics so umthis is the plan for the next uh youknow 20 minutes and we do this in athree sections first we'll talk aboutwhy renames even are you knowproblematic why we have to do them uhhow to handle um you know the renames inthe current state of the things and whatcould be possible future uh short-termshort-term future actually in Prometheusuh but before that a short introductionI'm here with Ariana hello everybody myname is Ariana and uh I am a Prometheusmaintainer i'm a software engineer at SAstreaming well starting Monday wish meluck and uh um yeah uh I have abackground in the musicbusiness yeah so my hobbies well nothobbies like real passions are uhsynthesizers and history of art apartfrom coding of course what about you BKsure my name is Bwami Podka but you cancall me Bartekch um I'm I work at Googlei'm a tech lead for a Google managedPrometheio service i maintainPrometheio's project and several youknow ecosystem project like clientGolang for example um I love generallygo and and you know I have maintainother projects I co-author Tanos projectas well I'm active in the CNCF and uh Ialso wrote a book called efficient goand um I I just love solutions that areyou know efficient and andpragmatic and recently I also startmotorcycleycling so some kind ofadventures uh I really recommend thoseokay so let'sAbsolutely so let's start by addressingour first question which is why it ishard to rename matrix but if you thinkof it it's not like renaming matrix perse that it's hard right it's likeavoiding the bad consequences of therenaming um so but before we delvefurther into the matter let's establishwhat we mean by renami�ng matrix whatwhat does mean so changing a metric namequite obviously falls into this categorywell but also changing the label namesor label values with some semanticequivalent of course is also part of ametric renaming but then there's alsothe particular case where you might wantto change the the unit in the metricname and then you would have to changeof course also the label value which iseven more difficult because then youalso have to change the sample uh valuebecause the unit has changed so now thatwe know what what we mean by by metricrenames uh let's see why this is nottrivial so when you rename a metric uhthis breaks everything relying on thaton that original metric like userqueries like dashboards like automationlike alerts recordings you know etc lotlots of things so rem renaming metricsalso not trivial because whenever youhave like a matrix overall it'sdifficult to communicate this to the enduser in a in an effective and timelymanner and no matter like how long thegrace period you give your end users nomatter how big that is how long that isthen in any case some is going to betaken by surprise and you don't know howum you know the consequences are goingto be and the extent of thoseconsequences um that's unpredictablevery often and things are even moredifficult when your end users havedistributed systems because then youhave to sync many components like forexample the client part the collectionstorage part the consumptions part likethe dashboards and everything and thusneeds fixing in multiple places you knowin a way that everything is uhharmonious and synchronized and that'snot trivial at all so after having toldyou about all of these problems youmight wonder but we really have to dothis like really reallyuh often yes you really have because uhsometimes it's really more thansometimes it's really inevitable likefor example when you have to uh applynew recommendationsum and adapting new you know adapting tonaming conventions like base units andsuffixes you see we have a lot nicecornucopia format uh on or when you haveto uh switch from uh you know differentdifferent kind of of metric metricecosystems like from Prometheus to opentelemetry or the other way around andyou see that the syntax of the naming ofthe metric is incompatible i mean youhave to deal with those dots somehow anduh sometimes you might you might youknow happen to name your metric in a waythat you think it makes sense but thenmaybe you use like in this example aprefix uh sorry a suffix that isactually like kind of reserved like inopen matrix uh uh let let's let'spretend well not pretend this isactually a Java uh uh a Java umapplication mean and uh if this isinstrumented with Prometheus client Javafor example and you have like uh thismetric ending with underscore createdclient Java is going to trim that and ifyou have another metric that was calledexactly the same but without that suffixof course there is a collision so youhave to find some sort of a schamotagein this case it's like adding anotherunderscore at the end of the underscorecreated Um in any case you have to dothat and uh then sometimes I mean yourespect all the conventions all the allthe the formal things uh all the suffixand all the rules and etc but then youreally you really realize I should havenamed this metric better this does notconvey the meaning that it has I meanthis is right like no but but then youknow uh this is something that weexperienced like last year when we weredoing some housekeeping and client golanpermits client golan we were like doingsome housekeeping um in the uh goruntime metrics that the client supportsand we realized that there were reallymetrics that you know this name doesn'treally make exactly sense it could benamed really better like this one youread go GC duration seconds GC standsfor garbage collection and you thinkwhat does this measure the cycle theduration of the garbage collection cycleright no it does not do that it'sactually the duration of the stop theword phase of the garbage collectioncycle but that's misleading right so wereally really reall�y wanted to changethat but we got scared we got scaredbecause we realized like the themagnitude of the repercussions uhbecause of all the users that clientGolang has and since it was like okayit's compliant for all the rest of nolet's not do that but this gives megives me the access to lead you into uhstraight into the nextsection which is uh what strategies canwe use uh this day right now to mitigatethe unwanted effects uh of rename somigration strategies basically so wecame up with uh five categories um underwhich the the main um um the maincurrently available implementationstrategies fall into conceptually sothis is not like an official taxonomy byall means just something that we came upwith and made sense like a way ofgrouping uh these different strategiesso let's go through them uh and theirpros and cons asanticipated the first is the no changestrategy so when you have like a stableproject that is really popular likeclient Golang uh you might not want toto to change the matrix because changingwould be way worse than actually notchanging uh we for example in clientGolan have uh like on top of dynamic goruntime metrics which are which canchange every time there is a new goversion we have like a sort ofprotection layer so that with stabilitytest so that the changes that every timecan can be uh made to the go runtimematrix uh on the go team side do not propropagate automatically into our clientuh also um like open telemetry uh has uhum you know the same thing happeningwith semantic convention they have uhstability uh tests and frameworks tomake sure that you deprecate um an oldmink instead of renaming it um and alsouh Kubernetes the Kubernetes frameworkuh for SIG instrumentation also has uhum this kind oftests but if when you do decide to gofor a change what are the choicesavailable well the first one is you knowthe most intuitive probably documentedchange so one notable example again isthe Kubernetes uh SIG instrumentationframework that generates uh documentsrelated to uh metric changes and theirstability automatically uh opentelemetry semantic convention same likean analogous thing um they also you alsocan generate documentation fromdefinition and their stability and uhdeprecation level so what is the prohere of course it's like a centralizedsource of truth and it's also at thesame time a way to communicate to yourusers like you know uh at the same timebut doesn't that doesn't do thepractical work i mean you have that butthen your metrics still stay unchangedso you have to do the work yourselfsomehow so what do you have if you canif you don't want to choose that thenmaybe you could rely on translating uhall the versions of the same matrix uhof the same matrix to a certain knownversion like you would do for example inPrometheus um you know with reabelingvia this uh um parameter uh calledmetric re label config and uh I thinkthose data dog does something uh similarwith vector but in any case um the theadvantage here is that you have the sameversion in storage so that's good butthen what what are drawbacks that thisis like a kind of a let's saysuperficial kind of thing because okayit uh it it is like affecting the labelsand the metadata uh but it does nottouch the samples and also like if youwere to switch to uh query time versusuh write time that would not be easy atall so what's next you could also go foruh writing all known versions of thesame matrix an example of which inPrometheus is uh uh doing this withrecord new rule um the pro is that okayyou can still use the old and the newnames um and this allows you to um youknow for a for kind of a gradualtransition between the two but theproblem is that during this migrationthat by the way you don't really knowlike how long it's going to last uh youhave to account for double the storagebecause you will have to to to makespace for both things at the same timefor for as long as the migration lastsand of course uh that gets even worsewhen uh there is more than one updatefor a single metric and this is reallyma really super manual as a burden andit's not trick it's very tri�cky inpractice so the last current strategythat we identified uh and we called likewe called it like version read basicallyor we categorize it under version readis uh um the practice for which you pina metric to a certain version when youquery it so in Prometheus it wouldcorrespond for example to making queriesusing the or operatorum as illustrated this is not superrecommended of course or you could evenum like in a very naive way like usinguh you could also use um um labelreplace uh on the name lab on the on thename because name under the hood is alabel itself a special label but still alabel and doing something like a bityeah very unorthodox probably like Idon't know label replace old metric namenew metric name old metric somethinglike that and this probably like reallysounds farfetched but in any case atleast Theoretically the pros would be noneed for right time changes um clientstorage can ingest whatever no need fordouble right cost but of course asalready hinted at the cons like for uhversion version done this way as thingsstand nowum as far as this or solution or labelreplace solution is that the burden themanual burden is high it's impossible toassert for all co for all consumers andof course it's can be very inaccurate insome cases when there are transtransitions when the two things areoverlapping so to recap as you can seeno strategyum nowadays at the current state iswithout significant drawbackshowever the version read has enormouspotential we found out but we need tomake it automatic so is there a way toavoid those promql complexities andmanual operations in other words can wedo better can we devise a way to makerename seamless in Prometheus what doyou think B good question Ariana so I wefeel there is there is and it's actuallypretty uh pretty accessible so let'srecap you know um how to implement aversion read in Prometheus as it sendsin automatic way so what we learned sofar right when the metric is renamed inprime use well the moment the changesare deployed your queries suddenly juststop working hiddenly as well so it'snot only crashing they're not crashingthere it's actually the results are nottrue anymore so it's actually much worsescenario to be in so how we solve thiswe first well have to establish somekind of schema some kind of definitionsomething that captures the things thatcan change right like metric name likeunit label name label values uh if youhave like maybe a set of known valuesand even a type right and it could be aseasy as a YAML file something that canbe clearly versioned and referenced umfrom for all of those differentcomponents and this alone is extremelyextremely valuable for various thingsyou can generate documentation out of ityou know what what metrics yourorganization or application has you canvalidate schemas you can you know um andyou know validate them test them to seewhat is the impact of certain changesbut moreover you can easily generate SDKcode from the schema to a differentlanguages right so for example here wegenerated this schema into go uh youknow go code with promeuse client Golanguh which we maintain with Ariana andthis code is not only not not only likeautomatically done for us but it's moretype- safe you can notice that labelsare not strings anymore you can kind ofhave a type of it um it's actuallyfaster there's potential to to have itfaster and uh have faster implementationof the instrumentation itself and it'slike just less um less room for mistakesbut for our naming problem there is evenbigger benefit the moment let's say namechanges we can kind of capture that asanother version let's say 1.1 from thiswe can really trivally generate atransformation or change lock file thattells consumer consumers how to upgradeand downgrade this metric the uhdowngrade the shape of this metric rightfor the same semantic ID let's say umsimilar to any kind of schema migrationyou would see in a posgress or any otherrelational database this transformationlogic actually works pretty well formore complex transformations as well sowe can capture upgrade and downgrade uhfor metric names� units actually if wehave a value transformation because thethe moment unit changes you have toreally map uh your samples to adifferent unit as well um label namechanges label value changes in future wecan potentially explore even you knowchanges like label splits and labelmerges and this works for multiplechanges layers on top of it very easilyas well so let's see how schema helpsfor our renaming scenario right um whenyour metric will be schematized and asyou know promeius generally wereschemaless now you can have some of yourmetrics important metrics pointing tocertain schema you'll be able to importautogenerated code for this metric inyour go in your code which will assurethe correct schema and also assure thecorrect reference to the schema thenevery series in every stage of the uh ofgeneration collection storage um andconsumption follows the schema and haveits version reference for example as apromuse label now with the schemainformation available for consumersusers could be able to maybe uh whenthey are making queries for importantdashboards and alerts and tools can pina metric used in this queries to acertain schema and a version and thisway the moment a new definition comes inuh just on the on the on the YAML fileand imagine that nothing is and and itrenamed let's say a metric even if thereis literally no deployment change nocode change it just kind of like thedefinition just upgraded you will beable to kind of reference a new name uha new version of and use new name ofthis metric immediately in your consumconsumption for example in yourdashboard and it will work becausePrometer or any other back end will beable to fetch change log transformationand do the conversion on thefly and then when application starts toadopt this version slowly uh it doesn'tmatter from the consumption side thiswill still work as it as it is and theneven when a new version that never seenbefore and never consumption never evencaptured ex you know will be will bethere you know things will still work soit allows essentially a huge flexibilityon on on both producer and consumerRight um and you know allows multiplecases okay so how to do it in practiceuh how to have it in the nextPromeaterius release if you really wantto so we think there is practical answerand and we have to combine a followingelements uh to implement this so we needa schema definition transformation weneed reference syntax and schema engineimplementation lots of work yeah butturns out we have actually many of thosethings fundamental things already inexisting CNCF ecosystem so we needschema right but we don't need toactually redesign the schema yaml specand and and have the whole ecosystem forthat because open telemetry exists rightof course and we have a standardizedschema spec uh for all the telemetrypieces not only metric actually on topof schema itself open telemetry alreadydefines uh hundreds of metrics thatanybody can reuse so the schema isalready used um and they could be usedin primitus ecosystem um in fact opentelemetry has a rich tool set to performoperations on top of the schema um sothe main tool is called weaver it'sactually called written in rust but it'svery flexible and it's already able tovalidate schemas generate documentationcode transformations and so on and youknow two maintainers Lauren and Joshhave had an amazing talk two days ago soonce it is on uh on YouTube please checkit out the tool is amazing and we areusing it in ourprototype so yes you totally can definePrometheus metrics using Prometheusnaming convention in open telemetryschema so of course it's a little bitmaybe um sometimes awkward because youknow in Prometheus we have labels notattributes and so on we have uh metrictypes not instrument but you know therecould be some um overlay with more kindof like promeuse friendly names but youcould use to it and actually um it'spretty pragmatic one note is that I kindof tried to make it uh more pragmatic sothis this yaml is kind of simplifiedversion of open telemetry schema and wepropose to add a special flag in rassorry in weaver uh do you know simp�le soto make it kind of pragmatic for promuseusers who knows maybe they will acceptthat um but I think there is it is notnot super difficult to do that as aPrometheus user who may be used to umdefining metric in the hard-coded coderight so it's not too bad um thanks ofthat well we need to havetransformations so I was showing you a acertain YAML format but of course opentelemetry already thought about this andthere is a transformation formatunfortunately is limited there are nolimitations to this format it was justnot designed it was designed for a firstiteration and you know open telemetryknows about this and there are differentproposals to improve it um and I just wejust came up with some new format thatcould be a new version new format foropen telemetry it could be somethingonly for prime use what I really triedto what we really tried to do with thisformat particularly is to have itefficient for consumer use so we we madeit spec specifically useful for uhbackends to quickly look it up on thefly so we really we really wanted tomake it very efficient and pragmatic umso anyway we come with this i think it'spretty useful um we'll be discussingwith open telemetry community what couldbe the eventual version uh but forprototypes we could totally use this andautogenerate that from open telemetryschema right now we need syntax uh sorryreference syntax so how we referencethis schema you know somehow users couldbe able to pin their queries and howstorage uh will be able to capture thelink between metric and the schema thisalready exists as well right right soopen telemetry have a schema URL whichis this notion of telling essentiallywhere is your schema definitions andschema artifacts as well the generatedartifacts and you use it usually viasome you know GitHub repo and you canfetch it very quickly and um youreference it by some whatever um URL andthe version as well so uh we can totallyleverage that we could put that as alabel and the moment query is having thespecial selector which like normalselector but with schema special metricit will actually do the magic it willactually do the umtranslation uh maybe one one more onething to note is that there are moreefficient ways so I'll be proposing somemaybe improvements to this ideally it'snot a schema URL is a schema ID butthose are new since I think a schema URLis good enough for a first iteration umand yeah finally we have to combineeverything into something that uses allof it and in fact we already implementedthat it's kind of around thousand linesand uh the schema engine itself and it'sas easy as adding a one component incode between promql engine in Prometheusand storage layer that do thetransformation in a in a transparent wayand here we go so let's do the trickystuff and do live demo what do you thinkshould we do ityes so let me start uh to mirrormirroring let's see how that goes andlet's do this oh beautiful so let me alittle bit make it smaller yeah okay sohere we go this is our query that brokeright and uh why it broke because thereis a new version of the same metriccalled you know with a change name andthen my number in class so naive waythat Ariana mentioned would be to addthe or statement right yes and we saidthis is not ideal for various thingsit's just complex and of course lookwhat happened it's actually a differentseries and because not only but becauseof metric name but also because of uhlabel names that are totally conflictingso it's not ideal and also you don'twant manually to do this so instead youcan tell uh Prometheus in this case thathey actually my uh metric has a schemaand it's actually old metric from onezero schema and everything works rightso it's on the fly super fast um and youcan totally use a new version at somepoint you like actually I like the newversion right so let me use a newversion I have to actually bump theversion of the schema actually versiontwo and then I have to rename my labelsuh because of course it's a new versionand someone decided it would be goodidea to change so why not I can do thatas well so this is um already yeahpretty pretty flexible and the main themain kind of like source of of thisthose possibilities is this kind of umtransformation file which allows meallows kind of promuse to go and findthe exact spot of what's query desiredto have and how to get there and how tokind of translate matchers I'm selectingthrough in storage and then how totranslate something that comes from thestorage back to desired uh version soit's totally doable it's totally fastand I want to show you one more thing sohere are here's a histogram thatactually changed um a unit which is notso trivial because it's not only metricname that changed but actually of coursea value right and of course millisecondsyou can see the values are quite largebecause there are milliseconds and thenseconds the new version is kind of uhyou know in seconds so the value issmaller so you can totally represent oldmetric as a new metric so I want onlyseconds because that's recommendedactually have base units and this willtotally work and you can confirm that bychecking you know where this metriccomes from it comes from do from theapplication who exposes one zero metricsand we totally expose one zero metric asone two dotmetric and value change aswell so um it just you know translatedon demand so you can do amazing amazingthings with that so back to our talk ifI can do it yes there's by the way thereis a full video on YouTube I alreadyshare on linkadin you can totallyrewatch again uh for the demo but let'sgo I don't want to actually do this howto switch do this yes okay bonus part sothere is this kind of funny interestingthing happening in the ecosystem forlike last two years so Prometheus hasits own metric convention right it hasthose suffixes it has underscores Andopen telemetry at some point designed adifferent metric convention uh namingconvention which is incompatible withprime use right they had the reasonsit's okay they decided to use dots whichare not allowed which used to be notallowed and decided to not use suffixeswhich in our opinion as a prime usemaintainers essentially removes uh manymany advantages of using promql in yamlfiles because when you look on the yamlfile you you don't have a system thatwill tell you oh this is a seconds orthis is a counter Like so having that inname is actually crucial but ecosystemwants to have this name and that's okayand this solution actually allows um asingle metric have two versions and youcan decide if open telemetry users uhwant to kind of have exactly the sameopen tele open telemetry metric um theycan totally use a version that has thismetric uh and it would translate to thisopen telemetry version of it and thenyou know uh if uh there is Prometheusfan and like super strict person wholoves this um user experience ofPrometheus naming they can use that aswell so that's amazing because we are wecan support bothso we are uh almost at the end and uhwhat are the learnings here wherebasically the learnings are that uh youknow renaming matrix is actually isactually very hard and uh it's nottrivial then uh you alsoyeah having having schema for your forfor your metric is actually a very coolthing is does a lot of benefits weaverCLI from open telemetry is absolutelyamazing and uh these schemas actuallywork for Prometheus too and so of courseyou know there's some work to be donetowards uh uh having more simplicity inthis more um having more efficient uformat transformation etc and uh toimprove the schema URL as as BK saidwith going towards like an schema ID butwe can work together towards you knowthis seamless renames in Prometheus andthe ecosystem for a betterworld and uh yeah uh we're going to beat the Prometheus booth um yeah we sothat's it thank you very much first ofall and we'll be here I don't think wehave time for questions so we'll be herefor a bit and then Prometheus boothtoday and tomorrow and please check thedemo please check the pull requestreview it so we can have that featureplease let us know if you if you'd liketo put schema on your Prometheus metricsand and really Yeah thank you so muchthank you for having us2025-04-15 21:59:48.651216�nd um it's been a a journeyto get there but if you ever contributedto Dapper opened an issue communicatedwith maintainers you have our thanks umbeyond this we are one of the fastestgrowing projects in CNCF currently 14thlargest over 700,000 Docker hub pools amonth uh more than 300,000 unique docviews and our Discord community recentlysurpassed 8,000 members and we'reclosing up on 4,000 individualcontributors this is great and um beyondall that the most important thing for usis to actually measure end user adoptionso this has been pretty much off thecharts um the Python SDK grew 151% Ibelieve um year-over-year uh which isvery much fueled by AI workloads thatwe'll talk about in a second umnetcontinues to see almost I I think it'sactually over 200,000 uh downloads perweek now um which is pretty great and ofcourse we have other languages usingDapper but we don't have uh great waysto track them this is obviously publicmetrics that we're taking um we'reseeing amazing uh use cases this is myown personal favorite nasa runningDapper on the International SpaceStation um what it's doing there isDapper is being used as the message buslayer to take pictures of astronautsuits um and then fire those off to amachine learning model that decides ifthe suit needs fixing in this case wecan see the glove so essentially checkfor tear and wear um this is theultimate edge deployment in our opinionall the way to companies like Grafana umthat built their uh secure supply chainon top of Dapper on AWS uh usinggraphana cloud this is essentially thebasis for their entire infrastructurestack scanning container images forCVEes and there's many more we'vereleased dozens of these case studiesthese are uh just two that Iparticularly like recently we'vereleased Dapper 115 a really importantrelease because uh it saw some prettyinteresting additions and alsoimprovements to the Dapper project we'veadded conversation API which is a wayfor developers to securely and reliablytalk to underlying LLMs and of coursethis isn't just a pass throughessentially when your application talksto any of these models using Dapper youcan apply circuit breakers and retriesand timeouts and you can kickauthentication middleware you can doauthorization all before talking to theactual LLM and on top of that we addenterprise features like sensitive dataoffiscation which means if you want tomake sure that the user prompt which canreally be anything doesn't contain anysort of personal information like creditcard numbers um social security numbersnames emails Dapper will do that for youand that's a very hard problem to solvealso we can offiscate data coming out ofthe LLM if you have your own uhpersonally trained model and you'reafraid you might have put some data thatshouldn't be there um Dapper canactually make sure that we offiscatethat data going out and then promptcaching which is great a lot of L&Mproviders do have prompt caching butwithout prompt caching you still need tosend the request out of your cluster soyou still need to pay for networkcharges when hitting those and even ifyou hit the cache you will still payabout 50% of the charge um for thatcloud provider with Dapper all of thatis gone you don't need to do thatanymore all of the caches are actuallysaved in the Dapper sidecore locallylatency is extremely low because nothingever leaves your cluster um not evenyour your VPC so um this conversationAPI is new and it's garnered a lot ofattention we've released with theseproviders that you can see here but eversince we've released we actually hadcommunity contributions to add um uh uhGoogle Gemini and Olama to Dapper socome 116 at least these models will alsobe introduced our community is veryenthusiastic about adding themuh workflows has now become stableworkflows has certainly been the mostinteresting building block fordevelopers since we first announced itin Dapper 112 there's many largecompanies that took it to productioneven though we told them not to take itinto production which is interesting umjust because of the value ads that itprovides now it's production ready sono�w you can take into production ifyou're watching and you did take it intoproduction and you're having issues thenthese issues should be gone now ifyou're still having issues please cometalk to us um the maintainers we like tofix things so um please if you findworkflows interesting if you didn't useworkflows up until now because it wasn'tstable um please go ahead and take themout for a spin we support Java Python GoC um JavaScript TypeScript and PHP iscoming it's we have a communitycontributor who's creating a PHP SDK forworkflows it's going to be reallyinteresting to see how the web meets umthese durable workflows and we're goingto cover them uh more soon it's nowready for pretty much anything you wantto do it's the programming model is sogeneral purpose you can essentially useit to do anything from data processingto notifications to timers and remindersto even writing your own Kubernetes likeserver and also AI agentic systems whichwe will cover more in this talk um sowhat do workflows allow you to do itallows you to create very complexdistributed systems pattern in a veryeasy way so task chaining for example iswhere you have one activity that'sfollowed by the other that's followed byothers and we take the input from oneand we feed it back again um one examplethat I like to use is a morning routineactually used it yesterday so I reallylike using it and you essentially wakeup you hit the shower you brush yourteeth and then you go make your coffeeif for example you couldn't make yourcoffee you don't want need to go all theway back to the shower this isessentially the guarantee that Dappergives you you can fire off tens ofthousands of those activities dapper hasa very smart efficient and efficient uhevent sourcing mechanism which allows itto track the state of each individualactivity it'll know where it needs topick off again and it will continue theexecution from essentially where it leftoff uh fan infinout that's your standardmap reduceuc like pattern you can fireoff again tens of thousands of these umin parallel so imagine you're firing1,000 batch jobs with Dapper and onefails and 999 succeeds dapper willactually track the state for each andevery one of those and it'll make surethat only the one that failed actuallyretries so this can lower costs you ifyou have IO operations or you're hittingyou know things like LM providers forexample which cause you to pay Dapperwill make sure that well that justdoesn't happen because it will not kickoff again uh activities that actuallyrun to completion successfully and thenyou can aggregate those results themonitor pattern allows you to createdurable timers so it's something likeyou know an alarm clock for applicationsremind me in one week one year Dappersince 115 can actually sleep to uminfinity so you can have something thatwill just wait for years and years andthen wake up and it's a loted time andhit your application and this is durablethe cluster goes down the database goesdown this thing willremain so um it's a pretty nicescheduling semantics for lots ofapplications external system interactionthis is something extremely importantbecause this is essentially a human inthe loop or today humans and AI agentsin the loop where your system needs toexecute a workflow and then pausebecause it's waiting on an externalevent to come in something like a humanapproval someone needs to approve a formor an order that went through or youwant to give it off to an AA agentthat's going to make a decision that'sgoing to resume the workflow so thatapproval event can actually come fromany system um and again Dapper can sleepfor a long while and survive networkinterruptions pod disruptions completeshutdowns and once the system gets backup and the approval um is being uhforwarded to Dapper the workflow willcontinue so now let's switch gears andtalk about how all of that relates to AIagents um the very basic concept of anagent is that it's autonomous and thisis why they're actually useful um youhave a user here now these users grantedthey can be processes they can be otheragents in this case it's a �user and weissue a prompt into the agent and thenthe agent kicks off a workflow weessentially reason about what we want todo or what the agent wants to do becauseit's not really us and then it plans umthen that plan goes into an executionphase which is called action and to beable to deliver the best action theagent needs to choose the correct toolthat is best for the job then it willessentially fire that off and observethe result feed it back to the LLM sendit back to the user and hopefullyeverything uh executed to completionsuccessfully and those tools caninteract with anything it can be yourinternal systems it can be externalsystems um it can be pretty much uhsystems that even give you a dynamiclist of tools while the ex execution isgoing so you don't necessarily know inadvance which tool you're going to be uhkicking off and so these workflows areextremely important but workflows aswe've come to know them since prettymuch software has existed are verydifferent from agentic workflows becauseregular workflows are structured theyare predefined predetermined steps weessentially code stuff um and we knowhow the code's going to work we expectthat if we launch the same code againwe're going to get the same resultbecause otherwise it would have beeninsanity Einstein said so thisrule-based orchestration um doesn't havereal-time adaptation yes parameters canchange the data can change but theworkflow remains the same the number ofsteps remains the same the the toolsthat you use and hardcode into yourworkflow they remain the same it'srepeatable and it's structured and thisis what we've been spending the last 40years writing software for software thatmakes these structured executionsreliable and secure but now everything'sa little bit flipping on its headbecause agentic workflows are not thatanymore imagine your brain as adeveloper was connected to the code youwrite and you examining the executionstate and the input output state at anygiven stage and then making decisionsbased on what it is decisions that arenot hard-coded so agenda is essentiallylike that it'll take uh blends it blendsworkflows structured workflows with AIdecision-driven making which means thatit can make a choice about what activityto run next and that might not be theexecution that it ran last this thing ishighly unpredictable um it usesreasoning which we don't fullyunderstand um to dynamically choose thebest path based on the incoming data andall that needs to enable autonomybecause agents are only really useful ifthey're autonomous but the more autonomythey have the more things can go wrongbecause the more autonomy we give themthe more choices they have the morechoices the more permutations ofdifferent situations and activities thatthat can occur um and so this becomes aproblem because as I said we've beenwriting software to deal with structuredexecution and now we need to deal withsomething completely different which isadaptive and intelligent and one of themain things about agents is that it'snot really hitting production systemsyet but it will soon and when you takethem into production you're going to berunning into issues keeping those thingsreliable making sure that if you have200 operations now that the AI agentessentially kicked off that they allcomplete in in the time that uh theyneed to complete it by or that theyretry based on where it left off becauseyour workflow might be dynamic in natureit might not be item potent um you mightneed you know things like exactly onceexecutions of these systems and the moreresponsibilities you assign to it themore can go wrong so having a very uhresilient foundation and infrastructureto these AI agents is crucial and if welook at a lot of AI agent frameworks outthere today um they're great for demosgreat for PC's of course I'm not goingto name any names but you know it's goodto get things going but they don't havethe same level of durability and uhresiliency that Dapper has withworkflows so how do we combine these twothings how do we combine these easyinterfaces for developers to write AIagents with these um dura�ble workflowsthat Dapper has in a way that isadaptive to things that are dynamicallychanging um there's other top challengeshere except for reliability which we'vetalked about because again withreliability the chances of failureincreases with the level of autonomy umsecurity is another one how do weencrypt communication between agents ifyou think about agents they're like theperfect microservices right everydeveloper tries to decide how to bestmodel their application and encapsulatetheir business logic and we want to dodomain driven design where weencapsulate all of our domain into asingle class or a bunch of classes or aservice and every time something bleedsthrough the interface this will neverever stop but with a agents it'sactually pretty easy to arrive atarchitectures that really make themreusable and repeatable so you can thinkabout having many agents each one doingits own task and then an orchestratoragent that picks the best agent for thejob and these agents can dynamicallyscale up and down um instead of runninga monolith agent which is going to beextremely problematic especially as itbecomes a single point of failure ifyou've got many different scenarios torun with a single agent if the agentessentially fails um then all of theunderlying activities all of theunderlying executions well they're notkicking off anymore it's it's a veryproblematic noisy neighbor problem sohow do we um actually encrypt theseagent agent communications because wewant to have multiple agents eachbounded to their own context how do wedo authentication between agents that'sa problem um they don't necessarilyadhere to you know standard uh standardprotocols how do you authorize theseagent-to-ag communications and of coursecost efficiency we all want to be verymuch aware of the costs it takes to runour software um and these agents canbecome very very expensive to run interms of CPU and memoryconsumption um in the Dapper project umwe have come up with a project that weannounced recently called Dapper agentsand it is essentially a solution toanswer all of those needs it's anagentic framework built on top of Dapperworkflows it is a unified programmingmodel with Dapper you can use all of theDapper APIs with it it facilitatesmulti-agent collaboration it allows forauthentication authorization all of theDapper APIs and building blocks that youuse for resiliency can be applied tothese agents and you can also runthousands of agents on a single corebecause these uh this agent framework isbuilt on Dapper workflows and Dapperworkflows itself is built on a conceptin Dapper called actors which is a veryvery lightweight process thatencapsulates computing state and Dappercan essentially move millions of thosewithin even a five pod Kubernetescluster with four cores um and so wetalked before also about workflows andfull autonomy versus deterministic wellwith Dapper agents you don't have togive up one architecture for the otheryou can actually choose whether you wantto allow the agent to fully uh fullyautonomously handle the task or if youwant to write your own type of workflowmanually and then connect it to a taskthat is autonomous using a differentagent but Dapper will guaranteedurability all the same it doesn'tmatter if you're giving the agent fullautonomy that will make sure that evenif you connect it to thousands of taskseach one of these tasks will essentiallybe a durable activity that finishes anduh picks up where it left off even inthe face of the most catastrophicfailures you can think of to yourunderlying infrastructure um and to talkto us more about Dapper agents I'm goingto let the original author of it whocontributed it to the Dapper projectRoberto um take us forward thank you allright thank you very much all right solet's go to our next one I guess but umso before I guess we started talkingabout this the Dapper agents project umit was something that first I wanted tofigure out how to build my own agent soas a security researcher I wanted justto learn and kind of like reinvent thewheel a little bit just to learn howevery component was working t�ogether butat the same time as part of research Iwanted to figure out what was out therethat I could use right built on the topof something that has been proven towork in production and that's how I gotto Dapper and the whole ecosystem sowhen you start thinking about agentsthere is multiple components of an agentright you have the concept of memory theconcept of tools state how theycommunicate once again how they plan howdo you save the plan and then distributethat plan across all the other agentsthat are part of your multi- aentcollaboration system right so there's alot of things that we have to considerevery time we start thinking aboutagents now when you think about theworkflow concepts there is actually thisthis is known as an agentic patternright so instead of saying agenticworkflows in the community in the AIcommunity I guess is an agentic patternand there was this one that came out inuh this one was in October 2022 this wasactually before Chat GPT which wasNovember 2022 um and and this was kindof showing that we do not just have totalk to an LLM and get facts all thetime in this iterative loop right can weactually make a system to interact withthe outside world so this agenticpattern was proposed in this paper soyou have to make it think so have thelanguage model think about your questionfigure out how to act in the real worldthrough tools this could be applicationsyou know functions such as you knowpython function for example and then tryto get that observation so that then youcould make the language model try tofigure out do I have to continue thisloop do I pick the next tool or maybeI'm ready to answer the question thatwas a basic basic enentic pattern butwhen you start thinking about thispattern this is a basic workflow rightthis is a set of sequential or justmultiple steps that you can run oneafter the other one trying to get theoutput of the first one the second onethird one so on and then at the end youcreate this loop where the languagemodel is the one that is going to reasonand once again it can continue the loopor it can stop the loop so this issomething that in my opinion in thecommunity we in my opinion we were notthinking about this just as a workflowthat we can simply define how we goingto execute all of this now this took usto multiple different agentic patternswhere you can have your basic loop whereyou execute something you can critiquethat uh input and then you can just goback until you feel comfortable or maybeyou say I want you to critique this uhoutput like 10 times right you can alsoadd planning so you can add an stepwhere you want to make sure that youdefine initial steps and then go throughthe whole loop again the most known ofcourse uh at the bottom left is the toolcalling tool calling means that I justwant to know what tool is the best toolto choose to execute for each iterationand that could be combined with thecritique it can be combined with theplanning before you start that flow oragentic pattern and then at the end theone that is super interesting is can youhave all of these agentic patterns aslet's say microservices and then haveall of them communicate where you candefine the tools for each service youcould say you're going to be for exampleas a security researcher you're going tobe the threat intelligence analystyou're going to be the reverse engineerthe incident responder the securityanalyst you can provide multiple toolsand you can have them all communicate uhright with each other of course when youstart thinking about these patterns onceagain this is what we all see in thecommunity but you can use thefundamentals task chaining fan out fanin all the way to monitor right you canmix them up and you're going to startbuilding your own agentic patterns thatof course caught my attention and when Ifound out about Dapper workflows and allthese different patterns I was like thisis the tool that I need to use to startbuilding my own agentic you knowworkflow um you know engine right and ofcourse Dapper workflow has the durabletask uh you know concept which you knowwhich you know Yon alr�eady explained alittle bit so it was very interesting totry to define my workflows or my agenticpatterns in a easy you know programmingfriendly type of mode as if I waswriting just you know a Python scriptand try to figure out how all of thesetasks or activities were connected rightso when you start thinking about dapperworkflows right um for those that haveused dapper workflows on the left youcan see what it would look like if youjust want to directly use the concept ofa workflow the concept of an activityand then define maybe a client tointeract with you know these providersfor language models in the concept ofDapper agents uh we come up with aconcept of a task and a task is awrapper um around the activity where youcan define I want to run this task withthis you know LLM provider or thisspecific language model and if you canuse the concept of a you know Pythonfunction right using dock strings usinga decorator so it is easy to say I wantto run this task the description of thetask is going to be this and then Dapperagents will take it as a oh this isgoing be an LLM based task let's takethe description of the task as a promptand let's execute it with an LLM that'sthe the fundamentals of trying to usethese concepts create this abstractionlayer where you can define LLM based uhtasks now of course the beauty of thisis that every single agent that youbuild right is connected to multiplecomponents that if you run for examplethis locally you already have access tomultiple things that you can use throughright testing such as you know zipkinyou know reddis to uh you kind of keepthe state of the agent all the way to ofcourse the whole ecosystem of dapperwhich was something that to me I don'thave to you know rewrite once again Ilove to reinvent the wheel to learn butmy main learning thing was I want tolearn how agents work not so much how tobuild all the architecture and all theservices that and all these standardAPIs that I could use to communicatewith multiple things right so that waskind of like the thought process ofcourse just having durable tasks is notenough we want these systems tocommunicate right because we want tomake sure that each one has its ownpersonality its own role its own toolsits own goals for example and for thatwe use the right popsup APIs there's alot of conversations on on creating evena standards into how to how to makethese agents to communicate well thereis cloud events already which is anstandard that we could use and it'spretty good at tracking more metadatathan the things sometimes that I seemultiple frameworks trying to track whenthey communicate so this is just anexample how we can have agents simplyusing that API and try to exchange youknow messages the other concept ofcourse which was amazing right is that Idon't want to think about on buildingmultiple clients and modules to interactwith multiple architectures out there sotrying to make it agnostic to all ofthis was amazing right because there ismultiple components that you need ascore components of an agent that itwould be great just to swap right if I'mworking in different environments so theprevious example think about this that Idon't have to change the logic of howthey communicate i can simply swap themain component in this case if I like touse radius great if somebody else wantsto use other components in here that'salso possible just by swapping the rightcomponents and then at the end we try tomix other concepts of Dapper in thiscase every single agent must have itsown state and the way how it works thisis very basic example is that when youinteract with multiple agents oneapproach is to say I want everyone toknow what's happening at the same timeeveryone so if the one on the left thegreen one is the orchestrator theorchestrator will send a plan toeverybody it's going to send messages toeverybody and then the orchestratorcould say "Now you go next." So he canpoint to a specific agent and now theagent has all the previous messages thehistory of of like what's going on andevery time the agent stops rightfinishes the the execution of the tas�kit can broadcast that message soeverybody is aware of of what's going onso trying to keep uh the thecommunication through popsup and alsouse the state APIs super powerful tostart building all these agentic systemsand let's um go to the demo which Iguess first the the demo is going to bewe're going to use this basic agenticpattern which is uh or yeah let's saythe whole pattern which is the conceptof an orchestrator and then assistanceokay so how does that work uh theorchestrator is going to get the maintask it's going to generate a plan it'sgoing to send a plan it's going to thentrigger agents and agents are going tosend a response back to the orchestratorand that will be broadcasted toeverybody else the orchestrator is goingto figure out if we can pick anotheragent to execute the next step or maybewe're good to go and we just respondback to the user righti want you to think about this as aworkflow for each micros service eachcomponent of this whole ecosystem theorchestrator is going to take the taskas I mentioned before and you can takethe purple um let's say rectangles um asthe tasks right and then you can seethat once it gets to the checking theprogress it's going to decide the modelbehind the orchestrator it's going tosay based on the input based on thecurrent man uh messages in my state Ican continue so we use the continue asnew API I which is part of themonitoring right pattern from fromDapper workflows or we can simply stopand then talk to the user very simplepattern that where we are using multipleum you know Dapper workflow patterns aswell this one is the assistant all theassistant is doing is getting a messagefrom someone in this case theorchestrator it's going to take themessage look at the state of what'sgoing on it's going to execute a taskand this assistant can suggest a tool toexecute or it can simply respond back tothe whoever sent the task if it is aboutto suggest tools to execute it's goingto execute them either in parallel usingthe Dapper workflow fan out fan inparents right or it can execute one toolat a time depending on how the modeldecides to do it and this is the wholeflow back and forth into how this modelthis uh agent is is using the Dapperworkflow tasks um and also some APIslike the continue asnew how do you define an orchestratorthat's on the left very simple wealready have some you know built-inclasses that you can just define anorchestrator provide the attributes thename how it communicates with the stateuh stores and then you can expose it asa service for example so that you canhave someone sending a request to anHTTP endpoint and that's how you wouldtrigger the orchestrator and the conceptof an assistant we also have theassistant class where you can define thename the goal the role instructionstools etc and we pack all of theseworkflow those into these classes soit's easy to to allow you to describeall of this in a type of microser wayand then of course we use concepts suchas uh running multiple apps at the sametime so you can see how we have everysingle uh assistant right pointing toits own Python script with their ownassistant classes and at the end we haveour also workflow orchestrator which isgoing to run the other workflow that hasthe the decision of who speaks next andtry to send the plan etc so let's um runthis real quick the way how this worksis um so let me just make sure I havethisUh hopefully you can you can see it butif you go to the GitHub repo there is aquick starts and there is the DapperYAML files let me just make sure thisone is easy to read maybe I'll do thisas I mentioned before you can justexplore every single application thatit's in the folders and if we go forexample to the Dapper Workflow LLM thisone we can even change the uh the maxiterations you could say I want this torun for 10 iterations and then we areready to execute this but before that uhyou could actually use the you knowbuilt-in um you know docker containers Iguess when you deploy it locally whereare going to keep track of your stateand they're also going to you know wecan see how all of these tasks ar�e beingexecuted so let's run this so we go tothe folder where this exists we're goingto do dapper run f and we're going torun the dapper llm yaml we're going torun this um and then we're going to makethis one a little bigger so you knowthis one uh this one right here istrying to to uh as you can see listeningto the client there is a client that wedefine in order to automatically triggeran event in this case the event waslet's take the ring to Mordor i'm a youknow big fan of Lord of the Rings so allthe microservices were our characterslike Gandalf as a wizard agent thehobbit Froto uh and the orchestrator isjust trying to figure out how tocommunicate with all of them if we go tothe and and get the best plan for how torun and or take the ring to Mordor umbut we're all smarter than the LM andthe agent because what's the easiest wayto take the ring to Mordor the eagle theeagle yes thank you all right exactlythe LM is not going to tell us that soone thing that you can do uh let's makesure I have this in here so one thingthat you can see automatically we'reusing this state to first trying to keeptrack of what were the agents thatactually are part of this wholeecosystem that way everyone tracks oflike who is available to actually speakwe can see if we go back uh here we canhave uh the concept of a beacon channelwhich the beacon channel pretty muchwhat it does uh let me make sure I uhzoom in here a little bit uh so the thebeacon channel pretty much is sayingeveryone who is connected to this uhspecific um let me just make sure thisis bigger to this specific topic so youcan see all the microservices areconnected to it so they're just waitingif anybody broadcasts anything I'm goingto act on it right and all of theseclasses are listening on all of these uhyou know topics right now as this isexecuting right it's kind of goingthrough the whole loop one by one we canuh probably talk a little bit about thiswhere you can see that there isbroadcast messages being validated etcbut one way to look at this is Dapperagents provides a file that is going tobe created for every single applicationthat is being part of this whole agenticsystem and we're going to keep track oflike what's going on and this is allstructure output and this is actuallywas created by the model so let me justgo straight to the first part this is aconcept of a plan for example let mejust make this a little bit bigger soyou can see it there is a concept of aplan uh where is the plan right herewhere it's saying uh step number one weneed to assemble and prepare all thenecessary supplies it the model itselfsays we need to have substeps we need tomake sure that we are provide more um uhinformation into what needs to happenand then this plan is being iteratedover and over being passed across allthe agents and as you can see it's alsosome of them are completed already sowhat does that mean every time we checkprogress we assess the current state ofeveryone we assess the plan and thequestion and we start making the modelto produce updates to the whole plan andthat's so we can say we're completedwe're still in progress we need toprobably give it to another agent andthen as you can see at the bottom if wescroll more down we can see the taskhistory and in this case what that meansis every single agent like GandalfLegololis Gandalf again um all of themare tied to the tasks so the modelitself is a language model is readingthe output of all the services it'sgoing through the whole iterationprocess and it's trying to to to figureout like who did it right uh whoactually was assigned the task and youcan see that it's going step by step andthis file is being updated in real timeone thing that is pretty interesting isthat some tasks this is was actually aninteresting result wherethe unpredictable you know happens forexample some tasks were being completedsubtasks without the the actual agentcompleting it and that's because themodel was assessing that based on whatthese other agents did these other tasksare already completed so we have aprompt that says you need to validate ifwhatever information you have in thecurrent state is going to support otheroperation and all of that is happeningautonomously right and then this one Iwould assume it just finished because myoutput was NA before but this wasupdated in real time and this kind oftell you exactly um you know whathappened at the end producing anstructure report um and all of this ispossible by having all of these systemscommunicating with each other keepingthe plan the state across all of themusing in this case a reddis local dockercontainer and then you can have thisoutput that then you can probably use itto validate how good this agent is at aspecific task or not which is what we doin security we're trying to validate howclose the report was to what a securityanalyst would do for example when theyinvestigate an alert right so havingthis structure output allows us to thentake this and see how efficient it wasor not and that was itso this is the end of our maintainertrack please try out Dapper agents jointhe Dapper Discord um talk to us andwe're happy to talk about whatever umDapper use cases you think you mighthave or yeah just anything else so thankyou so much for coming here thank youfor your time thank you thank youokay you have questions no questions andwe're here for questions we we willremain herehuh like this i think they have a mic inthere yeah question i think it's for therecordingone two three there you go yep okay umthanks for great talk uh really good toto see how um AIS or agents are tackledfrom sort of a microservices perspectiveinstead that's great uh I'm MortonForfang from a consultancy at Norway umanyway I was just wondering you know ifthe the upper part takes care of theplanning that you know I'm theorchestrator and you guys get alleverybody gets the plan right but theseare agents so one of them might come upraise their hand and say "I got a betterbet better plan you know we should stayin Riendell we shouldn't actually go tomortar." Um how does theapproach possibly accommodate a morekind of a dynamic the the plan haschanged kind of scenario that was myquestion yeah it's going to be aniterative loop yeah exactly so we keepthe same concept of an iterative loop inthis pattern that we show we have theorchestrator with the language modeldeciding what to do next what not to donext what updates actually need tohappen so there is already theorchestrator trying to figure out what'sthe best path based on the current umstate and the current tasks that arebeing completed so that that's kind ofbeing taken care of already but I alsolike your your question when it comes towhat if an agent just decides to updateit right that's possible and that's apattern that we're working on which isthe swarm uh pattern where all the theagents themselves based on their owntools their own goals their own rolesthey could actually propose even changesto like what's going on across the wholeecosystem so there is different patternssame concept you just move the reasoningof what's next to the agents versushaving an orchestrator like a managertrying to send the tasks to everybodyand there is of course um you knowpapers that talk about which one mightbe more efficient to specific uh youknow use cases right so I think itdepends on what you're trying to do uhfor us trying to have one orchestratorwith a reasoning model those that arelike for example like 01 03 in in u youknow openai for example uh or deepseekright like you reasoning models aregreat as orchestrators because they canreason a little bit more than maybe youjust want your agents to be the typicalgive me some input i'll decide what whatto run execute it and I'll send you backthe result that's probably moreefficient if you have hundreds of thosetrying to do things and the orchestratorwill be the one you know sharing thingsbut good question you can still you knowpush the reasoning to each uh you knowagent yeahany more questions any more questions wehave the mic in here okay we have themicuh noquestions all rightall right cool well thank you thank youvery much again appreciate it[Applause]2025-04-15 21:59:49.560189�onference at the the GoogleCloud uh headquarters in Sunnyale andthe date is August 26th and we justopened the call for speakers a coupledays ago and so if you are a gRPC userum and you want to come and talk aboutways you're using gRPC um in yourapplication or if you are uh one of thefolks submitting code um for the projectwe would love to have you um you know bepart of the call for papers um if notstay tuned looking forward to theannouncement for um you know the theticket sales and such for the conferencewhen we publish the schedule so wouldlove to have all of you there it's areally really great event um it's it'snice place to come together and hearfrom both maintainers of the project aswell as folks that are that are outthere andusers one of the the things that we'vedone recently is we wanted to puttogether um a a special sort ofannouncement list for really keyannouncementsuh things like outages or securityissues security patches uh things likethat where it's sort of a list that wereally think everyone would like to hearum about those messages and so if youhaven't seen this um the gRPCIO-announceum is is where that's at and we haveanother mailing list that's notmentioned here just without the announceat the end and that's where you can postyour questions and comments and and andsort of speak more socially and sothere's a whole lot of uh messages onthere every day um and that announcementlist is is different than what we havehere you know this is really just corekey messages from the maintainers aboutreleases and and any kind of issues andthings that you you'd need to know so ifyou haven't done so so far uh pleasetake a chance to uh subscribe to themailing listuh one of the last things I wanted to tolet you know is talk about um some ofthe efforts that we're doing around uhmaking a a proposal for graduationum for CNCF so right now we're anincreating project and we would like tobecome a graduated project and one ofthe things uh that we're doing isrewriting the governance um for the GRPCproject um you know we started talkingwith CNCF about graduation uh around ayear ago and one of the the feedbacksthat we got was some really key changesum around the the steering committee andaround other government's items uh thatwould really help make uh you know thethe project governments significantlybetter and so you know don't besurprised when you see those coming uhit's really something that's reallydriven through you know working with theCNCF on how we can make this um a betterproject um for open source and for allof you and so with that I'm going tohand it over to Gina and before shejumps on stage we should all do a roundof applause because Monday will beGina's 10year anniversary at Google sopretty awesomeall right thank you everybody and myname is Gina and I'm one of the JRPCmaintainer so as you may know that umover the past few years we have beencommitted to making the cloud nativeadoption as easy as we can for you soyou can scale your services seamlesslyso process gpc service match streamlinesthe deployment um process by eliminatinglike the manually operational overheadassociated with maintaining and alsolike managing um your sidecar process soalso umprocess drop PC comes with um manyfeatures that allows you to bring theservice match um capabilities into yourjob PCapplications jpc supports variousauthentication and authorizationsmechanisms including like all odds the Jtoken um it can be also integrated withthe load balancing systems to distributethe traffic across the multiple serviceinstances it also provides a reflectionservice that allows the clients todynamically discover the methods and themessages supported by your server and JPPC also comes with a building retry andtimeout mechanisms that can help you touh improve the fault tolerance andreliability of yourservices jrpc also allows the developersto intercept the request and theresponse and last but not least um wesupport the plugins that you can use toexpand the functionality that you needso the Kubernetes gateway API announcedthe gRPC route resource in GA and umit's s�upported it's supported in Googlecloud platform and also other cloudproviders so this powerful featureallows you to easily define um thesophisticated routing rules with theirJPC applications and enabling theprecise control over the request and umcontrol of the how the requests aredirected um across your all of yourbackend serviceswith gRPC route you can leveragecriteria such as like the servicemethods the headers um to match incomingrequest and route themaccordingly this f grand control enablesthe defense of traffic managementstrategies like um canary deployments ABtesting and the traffic splitting ofyour JPC serviceswe are also expanding the observabilitysupport to hotel telemetry so it helpsyou to troubleshoot the problem andimprove the reliability of your JRPCapplications so you can make a betterdecisions about how to artic and alsomanage your systems open collaboratorymetrics are available in all thelanguages that JPC support and you canget this metrics to help you to analyzethe RPClatency QPS error rate and also thingslike the payload sizes so you can findthe full list of the metrics that wesupport at the documentation that I haveon theslides open telemetry tracing is alsoavailable on Jar PC um so we would liketo encourage all of you to give it a tryand let us know if you have anyfeedback we are also excited to sharethat um Stefo session affinity supportis now available inJPCC++ so it is a load balancingtechnique to ensure that all therequests from one particular client arerouted to the same back end this isparticular useful for applications thatmaintain a per session state informationum such as like the shopping cart or theuserprofiles jpc implement the stiflesession affinity using cookies so whenthe first request is sent out the dropclient um will route to a server as anormal request based on your LB policiesand in this example the request onehappens to be go to the server two andthe server two encodes it in identityinto a cookie and then populate thatinformation in the set cookie responseheader within it so we can see that thecookie attached to the responses onehere and um the client will receive theset cookie header in the response anduses the cookie into the to define asession so all the subsequent requestsin the session needs to be populatedwith the same cookie and the jar PC willroute or ensure that all the followingrequests will be sent with that cookiewill be sending to the sameserver so here the client wants to sendanother request um in the same sessionas a request one so it populates therequest two with the cookie returned inthe response one and the drop PC willroute the request to server two based onthecookie so until the cookie expire orlike the server server two goes down umall the requests with the cookie will berouted to the server two and as a resultyou are guaranteed to always have the hehave a warm have a warm cache um of thatsession significantly speed up yourlocations so now um you know like howthe steply will general works with thejar PC and let's take a look how you canconfigure it so we have introduced acustom resource called GCP sessionaffinity policy so in the YAML file youwill set the cookie TTL time in secondsand the session cookie will be expiredat the time that you provide it here sobefore it's get expired the request withthe session cookie are guaranteed to besent into the CN back end and then youcan also set the target reference to tospecify which route or surface that youwant to enable staff session affinityon another cool feature that's launchedrecently is dual stack um backendsupport jpc clients currently supportboth IPv4 and IPv6 um however most ofour language um implementations does notsupport does didn't have support of likeindividual backend having both IPv4 andIPv56addresses so with this new launch um theresolver and the LB policy API willsupport multiple addresses per endpointand happy eyeballs will be used um tominimize the time to determine theaddress so we are happy to announce thatum dual stack backend support isavailable on jar PC across all thelanguages that we have� and you arewelcome to give it a try and let us knowif you have anyfeedback so to further enhance thereliability and the resilience of yourservice we also recently introduce aanother new feature XDS fallback so thiscapability addresses the critical needsof like continuous operationing byallowing you to configure and deploy asecondary control plan alongside withthe primary one so when your primarycontrol plan is not available um itcould be whether due to like the planmaintenance or unexpected network issuesour system PC will automatically switchover to the secondary um control plan sothis automatic uh automatic failovermechanisms ensures you um that yourservice will remains operational and umaccessible by your users without anyinterruption and also minimizing thedowntime and maintaining a consistentuserexperiences all right with that I willhand it over to Richard to talk aboutour next featureall right thank you Ginaso um first thing I'm going to talkabout is a a bundle of features thathave sort of been uh driving towardsproxyless service mesh on cloud run aserverless platform so service meshoperators generally don't run on justkubernetes whether they like it or notum feature teams in their organizationoften run workloads on variousserverless platforms as well such as forexample Google cloud run umunfortunately there historically hasn'tbeen great interoperability betweenthese two types of platforms for servicemesh um for GRC proxy list services inparticular we've seen good support forCloud Run to Kubernetes and in factKelsey High Tower demoed exactly thiswhen proxy list was first introducedback in 2020 but the other directionKubernetes to serverless hasn't workedout nearly as well so with Google CloudRun in mind uh we've designed andimplemented a handful of features thatmake this a firstass experience uh firstwe've added XDS host rewriting thatenables the use of vanity URLs in XDStargets something very common onserverless platforms next we've addedJot tokenbased authorization foroutgoing service mesh mediated RPCs andfinally MTLS O based on Spiffyidentities putting all of these featurestogether uh the integration betweenserverless and proxyless services is nowtruly first class um and you can uh seethese coming to Java and later otherlanguages in the comingmonths so next up is XDS global ratelimiting um a really exciting newcapability with GRC Proctulus and thisis the first time that we're talkingpublicly about it so global ratelimiting is great for situations whereyour servers may frequently beoverwhelmed by requests um most ratelimiting solutions require some kind ofproxy whether it's a gateway ingressproxy or a serverside sidecar proxy uhwith XDS global rate limiting gRPCservers natively implement this ratelimiting functionality directly in theGRC library in conjunction with acontrol plane uh this feature leveragesthe existing XDS protocol for servicemesh as well as a new RLQS protocoldesigned specifically for aggregatingload information from servers and asynchronously programming rate limitingpolicy from a globalized control planethis means that you get the benefits ofglobal rate limiting uh decision-m andalso the benefits of a completelydecentralized dataplane so here's a bird's eye view of howthis works the XDS control plane sendsconfiguration down to the gRPC serverwith instructions to connect to an RLQSserver a service dedicated toprogramming rate limiting decisions ingRPC servers the server will thenconnect to that RLQS control plane uhthe gRPC server groups requests intobuckets based on their metadata and foreach me uh bucket the gRPC server willperiodically send usage reports to theRLQS control plane so for each of thesebuckets um the RQS control plane willsend query per second limits down to thegRPC server and that happens completelyasynchronously if the query per secondrate exceeds the per bucket maximum atany time the gRPC server will rejectthat request with the uh configurablestatus code so putting this all togetheryou have a highly performant gloglobally aware system for managing highvolume systems uh the protocols and GRClibrary implementation described hereare completely open source and a firstROQS control plane implementation willbe coming to GCP cloud service mesh uhwithin the next fewmonths so moving off of service mesh uhanother change I'd like to talk about isprotobuff editions uh think of protobuffeditions as a versioning system forprotobuff features um instead of havingseparate syntaxes like Proto2 and Proto3 with fixed sets of rules uh Protobuffeditions provide snapshots of Protobufffeatures with customizable settings thisapproach ensures forward compatibilityallowing code written in older editionsto work with newer ones by unifyingfeatures and enabling incrementalupdates Protobuff editions make iteasier for developers to keep their codeup to date and maintain flexibility intheir projects proto edition 2023 is thefirst edition and it essentiallycombines the features of Proto2 andProto3 and it's supported in manylanguages there are a few code changesyou'll need to make to adopt Protobuffeditions and you can find more detailsat the public documentation uh in thelink below we're also working on a newCLI tool called Protoiller to help withmigrating these files messages andfields to new values for each feature inaddition to all of this the JRPC team isworking on other improvements to how youdevelop using protocol buffers in thefuture so I want to give one quickpreview at the next edition of Protobuffuh 2024 um you can expect a a few greatcross language features like symbolvisibility and style guide enforcementas well as plenty of language specificimprovements and that is it um there aretons of ways to engage with uh gRPC theproject and gRPC the community we ofcourse have our gRPC.io website thatKevin mentioned earlier we arecontinually revamping with uh improveddocumentation um we of course have ourYouTube channel where we have tons ofgetting started guides um growing everyday um we have meetups both physicallyand virtually um you can always meetwith a maintainer from the GRC uhproject just by going to that link umand of course join the mailing thevarious mailing lists we have so withthat I think we'll move on to[Applause]questions any questionsyeshello um I have a couple so um I thinkthe gRPC docs saidlike if you're happy just keep withProto3 and don't bother with additionsis this do you recognize this statementand is it still true i I think it isbasically true that um you know ifyou're happy today with what you havewith Proto3 keep on living with Proto3um so for the foreseeable future yeahthank you and um for the gRPC um yourcolleague was saying it does nativehotel is this the um stats handler stuffor is this something new i think Imissed the beginning of that questionthere can you can you say it again yeahyou said the gRPC client now has likenative hotel support i think that waswhat you were saying is this so we usedto use an interceptor and then it movedto I think a stats handler is this a newsomething else as well now uh yes wehave uh first class uh hotelinstrumentation so I believe there havebeen thirdparty interceptors in the pastnow we're providing you know very firstclass hotel support um from the GFCproject okay thank youwhere are you sirhello thanks for the talk so I'm opentelemetry good maintainer and I have aquestion regarding the uh the opentelemetry like uh instrumentationuh are there any willingness tobasically uh work in the open telemetrysemantic conventions because currentlythe I I haven't seen any cooperationregarding the semantic conventions RPCand JRPC between gRPC and open telemetryor isn't any plan to contribute back onto follow up because right now the Opentele like the official open telemetry uhgo country instrumentation which issupported by open telemetry divergescompletely for the one implemented by byGoogle and I think it makes a lot ofconfusion for the users which one theyshould use basically yeah I thinkabsolutely there's a willingness tocollaborate um let's let's talk in thehall about thatthankyouany otherquestions all right thank you very much[Applause]2025-04-15 21:59:50.386174 XX��4#��UA0qNOZpdW870all right uh welcome everybody uh myname is Kevin and we're here uh for thegRPC maintainer talk uh going to talk alittle bit about uh what's new ingRPC so before we jump into the maincontent for the session uh moretechnical stuff I did want to share alittle bit of the success stories thatwe've had um with gRPC over the lastyear um what we're seeing right now isuh in in for Java and Apache Maven we'rehaving 1.7 million downloads every weekum for Python also really great numbers24 million weekly downloads uh number 72most downloaded um package over the lastmonth and then finally for for Node uhnpm 13 million weekly downloads and sowe just continue to grow uh want tothank all of you um for your continuedusage of uhgRPC um another thing I wanted to sharewas sort of the the chart of the growthof GitHub stars um we just continuegrowing and growing up and to the rightacross all our various languages um ofgRPC and so this is really great to seeand hope that uh hope that itcontinues one of the feedbacks that weheard a few years ago um at KubeCon uhwas basically that the documentationthat we have was not up to the standardsthat that people wanted to see and thatwhat they expected from open sourceprojects and what they were seeing onother projects they were using and soone of the things that we've done overthe last two years is really double downon a bunch of our documentation effortsuh just this year we added four new uhdocumentation sections uh 39 new videosand a bunch of new code examples and theyear before it was a ton ton more so umwe've really been you know doubling downin that area and made a shift towardtrying to make short consumable YouTubevideos for a lot of the features that weadd to kind of give you an overview getyou started um and see that thatcontent uh one of the other big thingsthat we're doing in gRPCum is adding support for REST uh this isanother one of those areas where we didsee a lot of request uh coming from thecommunity um here at KubeCon and atother conferences uh this was one of themost requested features and so we wantedto share with you the road map of thingsthat we plan to build uh throughout thisyear and one of the things that we'rehoping to do is at gRPC comp which islater in the year in August we arehoping to uh release a preview releasewith a bunch of the features that youcan see there and so this will be sortof a a pre-alpha preview um andhopefully with hands-on code labs um atgRPC comp and then you can see the therest of the the um feature roll out umto theright if you are interested in Rust wewould love to hear from you um if youwant to be part of the the earlyfeedback uh let us know review whatwe're doing we are looking for folks whowant to do that um so if you are a Rustdeveloper or or interested and and wouldlike to um help us along please uh fillout the uh the link there for interestand we'll be reaching out to sharethings withyou um I mentioned uh the GRPC comp werun a big c��umreading from their APIs rather than fromtheir upstreamDNSum as a community we're pretty I meanthere's 387 contributors but activelythere's just a handful of people it'spretty small friendly place so if youare interested in this um and I'll I'llI'll go through in detail some recent umthings that contributors added that areare pretty cool but um the you knowwe're we're very welcoming as a as aproject um and uh it's pretty easy toget to get involvedum and uh I think now I'll hand it offto Yang who's going to talk a little bitabout how you might extend it uh andthen I'll talk about like I said somenew things that are in there and we'llthen we'll leave some time for Q&Aokay thanks J for introduction so I'mgoing to walk through a demo pluginyou're going to say what exactly is thisplug-in doing this plugin is doing avery simple thing uh this is actuallyone of the most requested features bythe community many times people come tous and say okay I have a network mynetwork is like internal network but Ihave a DNS server that's going to serveboth internal network at the outside Ido want a way to differentiate those uhinternal and external networks so ifit's internal I'm going to return one IPaddress for uh DNS requests and if it'sexternal I'm going to return another IPwhich is pretty straightforward rightmany people requested a lot of times uhI'm going to show this how you can dothat with Golang because many peoplecome to us say okay can you make aplug-in or can youuh allow us to uh achieve this goal withsome you know configuration files buteventually we have some debate about ifit's worth effort to write a plug-inbecause the the plug-in the complexityof how exact exactly going to handledifferent configuration is going to be alittle convoluted so we say okay maybewe can show a uh plugin write a plug-inin Golang and see how easy that isactually that one took me I think ittook me one day to get this one done yesand you probably can get everything donewith less than one day maybe a couplehours okay so as as as you can see thisis how this plug-in is going to serveyour uh serve your network DS entry youhave internal network that's going tostart with 172 uh prefix of 172 you haveexternal network that's can be anythingand you have a putting a server sittingon the edge serving both inside andoutside if uh your network is going toreceive a DNS query of the same name andif it's the traffic is coming from theinternal you're going to return1.1.1.1 this is a faked IP anyway rightjust give you example and if it'sexternal you're going to return 8.8.8.8eight fairly straightforward so how dowe achieve this goal so there are onlythree function you need to implement thefirst function is to it's called initfunction acc me function is going toperform a one-time initialization andregister the setup function with caddyand the setup function is a secondfunction you are going to uh handle thatis to pass a configuration file becauseyou do have some very minimalconfiguration to say what you want topass and uh it add a handler and it willbe called once for each of the plugin inthe core file and that's it you'll setup the uh plug-in but finally howexactly your plug-in is going to handleor process DNS requests that's a keyright it's also very fairlystraightforward there's only onefunction called the serve DNS thisfunction will process the DNS request uhafter you receive request you can decidewhat to do you can either uh reply backwith a response or if you say this isnot my domain I don't know what to dowith that you can pass the request tothe next chain of the plug-in so youjust say I don't know how to handle thatsomeone else will take care of that andthat's it uh I'm going to show the codethe whole code of how exactly this isbeing done as you can see the initfunction is straightforward this isnothing just a stop function to say thisis uh demo plugin with a server type ofDNS and the second function that's asetup setup is going to pass the corefile but because this is a demo plug-inwe don't have lot of configurations wejust say let's just pass the simplesta�ndard core file without anything okayso that's several lines of code againuh now we move to the served DNS whichis the key of this plug-in so howexactly you're going to do that uh youtake a state which is going to berequest you get a state and then with astate You can get the name of the querythat's a Q name now you can decide whatto do with that in this function as youcan see we are going to reply witheither1.1.1.1 or8.8.8.8 depending on the IP address ofthe query the source IP of a query andthat's it as you can see only one twothree four five six seven eight lines ofcode of course that's uh pretty much iti actually added uh print line functionjust to let us see how exactly this isgoing to play out but if you try thisone you're going to see it fairlystraightforward and that's it like Isaid I spent one day but you probablycan spend maybe like a 20 minutes to geteverything doneright okay so now you say okay I have aplug-in how exactly am I supposed toconfigure that because we said this is ademo plug-in so you really don't have alot of configuration if it happens thatyou do need a lot of reconfigurationsyou may need to make a littlecomplicated on the setup part passer butfor now it's just a demo plugin you setup one line to say I'm going to invoke ademo plugin that's it the core file anduh that's it uh the finally you're goingto say how exactly am I going to buildthat the building of the uh cordingplug-in is a little I'm going to sayit's a little convoluted unfortunatelyyou had to configure the setup uh fileof your cording you add one line to sayyou are going to invoke this demo pluginyou also need to copy a source code butthere are some easy way to make ithappen uh if you copy this line a littlelittle strange line with a Docker youcan just uh run this command and it willbuild the binary uh once the the uh yoursource code is compiled you're going tosee a binary called cordiness on yourlocal directory and you can just runthat so that's uh pretty much it uh wedo have the code available incoordinates repo that's uh uh github.comcoding as demo and you can take a lookand the total lamb code is 80 lines andthat's it okaythanks Yang um yeah so you can see it'sit's super easy to extend which means umwe do have a bunch of external pluginsif you go to our website you'll see um asection for external plugins wherepeople have built plugins you can pullthose in and build your own cords fewlittle notes that we don't have in thedeck but probably should you know itthey're statically compiled in there'sno dynamic loading of plugins and twothe ordering of plugins is fixed in therequest what you put in yourconfiguration file of your how yourplugins doesn't matter at all um betweenplugins like within a plugin if you havemultiple directives for a plugin itmight that order might matter but umthat means that if you want to changethat ordering for some reason you haveto recompile so those are some littlegotchas but otherwise it's super easy toextend um to give you an idea of some ofthe things that uh that have been goingon lately um so we just released I thinktwo days ago yesterday maybe um1.12.1 and um the probably the key thingI want to call out in that is it ituh back in in 1.11.4 four um so aboutsix months ago I think um we we we brokesomething in Kubernetes um which is theway the PTR records which are thereverse DNS records are created for umfor for different service endpoints andso we reverted that change um so if youare using uh cordiness in yourKubernetes environment um and you're ona version before 1.11.4 four you'regoing to want to skip all the way to to1.12.1 and not not go anywhere inbetween um just an FY a little warningthere um but over the course of the lastyear or so we've added a few differentinteresting um plugins one is uh thetimeouts plugin which just allows you toconfigure some more options um bunch ofchanges to other plugins but what I'mgoing to drill into next is the themultisocket plugin and I'll explain thethe problem this solvesso we've had a long-standing issue incordns where we sort of hit a we hit alimit uh a QPS l�imit when no matter howmuch more CPU you give to cords uh youcan't get more QPS out of it in fact asyou can see on this little graph yourQPS starts to decline um and so we wecould we weren't sure what thebottleneck is and as I mentioned likeit's a friendly friendly place so weactually had a brand new contributorcome in um this is now maybe about ayear and a half ago because this onetook a while to figure out but um theycame in and started uh asking questionsshowing graphs and trying to figure outwhat was going on and a couple othereven new contributors jumped in and andover the course of a few months um wenoticed a a few things um and I'm goingto show a little bit of the internalstructure of cordiness to show how kindof this evolved um internally right wehave a thing we call a server um and itit's what handles a given port so whenyou in your core file this basically umcorrespondsto one stanza that's got a a port um adomain and a port andum that one server object basically hasa routine that's pulling um that'spulling requests you know off the theUDP socket or or whatever it may be andthen it's doing them out to go routinesand um dispatches them to go routinesand then the actual resolution chain soso I guess maybe we didn't say itearlier but when we talk about pluginsthe way the the DNS resolutions happensin in cordns is we have this list ofplugins they're either enabled ordisabled based upon what's in the corefile and when we get a request we justhand to the first one in the in thechain and then it calls the next pluginwhich calls the next plugin which callsthe next plugin which allows any andthen and then there's a reverse right asthey return you get another opportunityto to affect the request so this is thisplug-in chain depending on what pluginsyou have enabled that could be very fastor very slow and um so you know in orderto you know parallelize the requests theserver picks the requests off dispatchesthem in go routines so parallel inparallel running execution threadseffectively um and uhlets lets them all operateconcurrently um so this is where we'rewe're hitting this is the sort ofinternal structure and where we'rehitting the thethe QPS limit and so during all thisexperimentation somebody said well let'ssee what happens if we create twoservers what because we could do thiswithout even changing code we can justdo this in the configuration file and sothey just opened up another socket andjust you know duplicated theirenvironments and then just sprayedtraffic across both of those ports andlo and behold the throughput doubled umand so it became clear that the b thethe the bottleneck was in that in thatthe server couldn't pull things off ofthe theum out of the socket fast enough anddispatch them fast enough uh to to get alarger a better QPS than than what wehadso what we decided to do is the nextexperiment was okay if you use so reuseport which is a sort of kernel featurethat lets multiple sockets be attachedto that same port um you can you knowinternally without you can serve thingsstill off the same port 53 um but youcan do it with multiple sockets thekernel will distribute the packetsacross those different sockets for youdon't need to do anything it's actuallysuper simple on the internal side tooandum so that's that's what we implementedand then uh what we found was that umthe number of sockets that we have umreally corresponds best to the sort ofsimplest default is uh just for everyfor every processor you have you knowfor every CPU you have create one socketand when we did and so we use this thisuh if you're familiar withgo limits its concurrency based on anenvironment variable but Uber has thisthis uh go module you can pull in whichautomatically in a containerizedenvironment sets that appropriatelyso we pulled that in and I can show youthis was the result of course with thesingle threaded environment and thenum with two sockets uh you can see thatall the way up through five CPUs we'reseeing a you know basically linearum a linear path with the with growthwith of QPS um three sockets foursockets five sockets so re�ally we getnow really good vertical scaling as longas you configure the multisocketsappropriatelyum you can scale to as many CPUs as youwant and you get basically linear linearQPS scaling until you hit other limitsright so so I think if you figure it's512 bytes per UDB packet you hit a yousaturate a one gig nick at around 250kQPS maybe um most likely if you'rerunning in a cloud environment they'regoing to cap you before that anyway soum those are the kind of limits youstart having to worry aboutum so this was released in 1.12 um itrequires SRU port so this is not on bydefault because that's a it might not bethere in your kernel or or you mightneed some configuration to make thathappen but it's uh all you need to do toenable it is put multisocket in yourconfig file and it will automaticallyadjust based upon the number of uh coresyou give to that container andum you should see better performanceum so that's really all we had then hadto discuss other than uh you know Q&A soum please uh come join us at accordingus it's super like I said super friendlysmall small world and uh easy to becomea a really uh make a significantimpact anyquestions yes sir so one of the biggestproblems with thescale is basedlittle amplificationthank you so currently the main problemwith coordinates is the scaling becausecoordinates is scaled based on thenumber of nodes and not by the number ofrequests is there any planning onchanging itso that's a that's a great question uhyeah that's a great question butactually I was I was thinking about thislast night we should maybe add a slideon this but so our yes our current sortof default recommendation is use acluster proportional autoscaler which isnot really a very good um recommendationespecially now that we have thisvertical better vertical scalingum so you know I guess it's more aboutthe recommendation at this point umdefinitely you'd be better off lookingat say CPU utilization um especiallywith this configuration now you alsofrankly probably can get away with soone of the one of the big problems wehave uh we see repeatedly is early inKubernetesum life uh a decision was madeto to make it easy for um developers asthey're developing to like feed in a DNSrequest for a service that's like justthe service name right and if theservice is in your same namespaceum then you just need that name and andyou can kind of allows you anindirection where you can move thatservice around and your manifest doesn'thave to change or whatever but what thatled to is this this we call the end dotsproblem where the the way DNS resolutionworks is um on the client side in theresolve.com file so this is beforecordiness is involved at all there's a aconfiguration option called n dots andit and and there's a uh a search pathand basically any query that has likefewer than n dots dots in the query andit's not fully qualified so it doesn'tend in a dot will go through that searchpath so what does that mean it means ifyou put foo and you feed that into yourprocess as your service namethen then the the resolver on the clientside will start trying names in thatsearch path and in kubernetes the searchpath is long it's like you first you tryfooser name then you funamespace andthen you try like or fu namespaces svcand then you like like right you end upbasically an any DNS name that's notfully qualifiedresults in six queries before it hitsthe right one if it's for an externalservice at least six um if it's for anddepending on if it's for an internalservice it's going to hit anywhere fromone to six depending on if it's in adifferent name space or whatever sothat's like an ampl query amplificationand that has nothing to do with cordsthat's about how Kubernetes configuresthe resolve.com but it leads to thisscale problem that you're bringing upbecause if you are doing a lot ofqueries for external services your yournumber of like we see this with like saylike buckets right then so therecommendations I would give you on thescale issue is first if you can do itmost people can't is make everybody puta a fully qualified domain name and whenI say fully qualified I don't meanending in.com I mean ending in.comdot right that last dot is the mostcritical piece because it tells it don'tuse the search path so that that cutsimmediately any external queries in toone sixth of what they would beotherwise um most people can't enforcethat easily um so in that case I wouldsay now that we have this multisocketthing just run two cords and give themlike you know eight CPUs or whatever andfor most people that's going to be finebut otherwise I'm going to scale suggestscaling based upon uh CPU utilizationand do horizontally so give it a bunchof or or a vertical pod autoscaler ihaven't thought it all through but thatthat would be my recommendation the theother scale factor and I'm sorry I'mgoing on long on this topic the theother limit um you have to worry aboutis the number of services in yourcluster which directly affects thememory used by cordns so in your memorylimits um if you're if you have a lot ofservices or a lot of endpoints inheadless services then that's that'sbasically what the memory correlatesto any otherquestions are we out of time or ohthere's another question hereuh we still have a lot of time on thatsame topic um we're using node local DNSon our side to like kind of cache at theat the node level um like how would youcompare like that um that new capacitythat you have now to vertical scalecompared to node local DNS greatquestion um so node local DNSsolves similar and but also differentproblems additional problems as well iwould recommend in general you do wantto run node local DNS but it it's ifyou've got a lot of nodes it's actuallymore expensive right you're usingresources on every node but it um theproblems it solves is one it uh itshortcuts any requests to likesubdomains or external domains I believeit does external domains too um and itwill look those up directly and cachethemum the other big thing it does isit we had this problem in the highestscale environments where um sointernally the kernel the kernel has aconnection tracking table and it's got alimited number of slots and if you havemake a TCP connection and then you closethe TCP connection it it there's there'sa close it disconnects and so it removesthe entry but UDP connections don'tbecause they're connectionless itdoesn't know when to remove the entryfrom the table so it lets them time outSo when basically when that entry whenyou're it's like 75 seconds or somereally long time period so you if youhave a lot of queries going out you fillup that connection table and you startlike literally dropping packets right soso you see failures in your resolutionat that point so node local DNS fixesthat because it turns off connectiontracking for um those UDP connectionsgoing out for DNS and it it actuallyupgrades connections from to theupstream cords to use TCP instead so sonode local DNS is cordns it's just aspecial build of it with most of the thefeatures stripped out to make it smallerbut um so I would recommend yes you useit the only downsides that I can thinkof are one um you have to run it onevery node if you've got 50,000 nodeswell most people don't have 50,000 ifyou've got 10,000 5,000 that might bemore expensive than running a few smalla few cord DNS's um toum the default TTL for services is 5seconds in a centralized core DNS youcan turn off DNS cache for those becausewe already have them in memory i thinkwe we probably do in our recommendationsso you don't have a a delay when podscome and go out of a service of 5seconds whereas if it's in a local cachenode local DNS cache will keep that soyou could that's not much of an issuebut there's a little issue there andthenum it's a single point of failure onyour node because we don't run twocordiness instances in node local DNSjust one but it's only on that node soit's scoped to the node so that's notthat badeither does that answer your questionokay any otherquestions we have uh three and a halfminutes lots of timeokay well we'll be uh right over here ifyou need us and um happy to answer anyany any additional questions should youhave2025-04-15 21:59:51.137747 ��G�5#��EAW3f5Ks0j2Q8allright hello everybody um welcomeum we're gonna talk about core DNS todayso I hope you're in the right place ummy name is John Belame i'm with Googleand uh uh my name is Yang uh I'm workingfor Cord DNSyeah umso you're probably a little bit familiarwith cordns at least because most of youare here for Kubernetes and uh core DNSis the default DNS server in uh inKubernetes um but it is it is more thanthat it is um a general purpose DNSserver pretty much everything but arecursive DNS server although there'sways to make it that tooum so it's an authoritative DNS serverprimarily but it's um you know it it's alittle different than than yourtraditional uh bind type of thingbecause we are written in Go which is amemory safe language uh a lot of peoplehad a lot of CVES in bind and uh so oneof the reasons that cordiness exists isfor that but more interestingly itexists because of the level offlexibility it offers so it's reallyreally easy to build integrations ofcordns with other backends so sure wesupport zone files sure we supportKubernetes as a backend um but it'ssuper easy to build other support and uhin a few minutes Yang is going to showyou how to do that um but uh uhessentially you know it's just a few youknow obviously it depends on howcomplicated your back end is but it'sjust a few lines of code in general uhtoday because of it so easy we support abunch of different backends um with as Isaid file being the biggest one zonefiles and then kubernetes actuallyprobably kubernetes is more widely usedthan file um which means we back usingthe kubernetes api um And uh but you canalso do back directly um be backeddirectly by like your cloud provers uhDNS service so essentially we canprovide a caching layer um for those ��tin the CPU or memory load or maybe theload is coming a little bit after muchwiser would be to actually scale theapplication based on the metrics comingfrom Rabbit MQ Kafka Prometheus you nameit how we can do that yeah you guess itright we plug it into the mix and withthis solution we can uh we can scrapethe metrics directly from the externalsource and then scale the workloaddirectly we can also scale to zeroYeah it's your turn nowNo I will hold your microphone So let'sshow you ademo The good part of KDA is that youdon't need to be aware about complexstuff related with custommetric serversand how Kubernetes handles metrics andand the management of the metrics forthe HPA Can you see myNo we cannot see your beepIt is what it is It happens Now I'mshowing the right screen and instead ofusing an HPA with boring custom metricswith adapters and a lot of boring stuffyou just need to deploy Keta with inthis case I have a an authenticationsource just for the secret It's notrocket science At the end of the day weare not putting humans on Mars So it'sjust a reference to a secret And I'mgoing to deploy thedemo And if I go to the demonnamespace Now I have start I I havepublished some messages on a rabbit QAnd then automatically it has startedfrom zero to one And eventually it willscale now for now for two to four Thenit will scale to eight and so on and soon and until all the messages areconsumedPerfect Nice I can dance a bit if youwant in No no no Please don't Thanks Iappreciate it Now the P the messages arebeing consumed and in a couple ofseconds when all the messages end wecould see how it scaled to zeroagain No it doesn't work It is what itis But you are the cutest support forthe microphone that I have Thankyou What's your wife about itNice So it's finishing Let me check howclose is It's almost thereAnd in just a second no problem We cancontinue with our session because justanother second Blah blah blah blah Nahwe have 24 minutes left 24 No no problembro No problem We are almost there N Idon't Yeah Scaling to zero again So wecan reduce the money I was I was afraidabout doing a demo that doesn't workjust in the beginning to be honest Let'scontinue with a shift F5Perfect Okay so this is Ka in action Ihope it was somehow understandable Sowhat is Kada Ka is a project that uhbuilts on top of Kubernetes Uh we try toreuse Kubernetes internals You can scaleany workloads uh Kubernetes jobswhatever based on different metrics notjust CPU memory We have 70 plusdifferent uh event sources So Rabbit MQPrometheuswhatever Uh we have a community a lot oflot of folks using it We have some goodcontributors Uh we have a Ka user surveySo if you are K user please fill thesurvey so we can know uh we canunderstand what is your opinion on theproject So architecture it's your turnKa it's my turn Okay Oh thanksThank you verymuch So marketing stuff for the CTOtechnical part for the SR right the lifethe world So the point is how it ka howhow keta works basically ka is built ontop of kubernetes is it's not replacingthe kubernetes mechanisms for theautoscaling is just extending them whybecause keta doesn't replace the hpabasically keta extends the APA the theHPA thanks to ex to another extensionfor metrics through externalmetrics that can be done thanks to thescale object scale object is just agrapper on top on top of the HPA toextend to HPA capabilities But one thingthat we have noticed over the years isthat not all the workloads can behorizontal horizontallyautoscaled relying on the HPA because ifyour workload have some kind of stateand can't be randomly killed during thescaleown process probably you will facewith troubles for instance if you haveyour GitHub runners or any kind of CIrunners that doesn't match with anhorizontal autoscaling process as it isbecause you cannot guarantee that Theguild bot is the empty pot For that wedeveloped the scale job which it'salmost the same but instead of relyingon the HPA and the horizontal potautoscaler controller we spawn from KDAdifferent jobs So the job has to endbefore it's removed Apart of theseapproaches KDA ext�ends the horizontalautoscaling capabilities supporting thescale to zero which is something thatyou could say but Kubernetes doessupportit Yes but not you need to enable afilter gate and you need to provide yourcustom metric in your own way So youneed to deal with all the problems andboring stuff related with custommetricsserver APIIn this case KDA does it on yourbehalf and how an ancal object looksbecause HPA is like a common and and awell-known manifest within KubernetesBut if you check the current scaleobject approach probably there are a lotof similarities and a lot and a lot ofthings that you can relate with the HPAThere is a scale target ref There is themax and minimal replicas that you wantAnd instead of metrics you have triggersWhich is a trigger A trigger is thescaler that you want to use And you canuse one or multiple triggers Doesn'tmatter You can include all theinformation that you need to make yourautoscaling accurate with yournecessities not just CPU or memoryand or all excuse me and almost the samefor the jobs but with some smalldifference Instead of referring aworkload in in general you can specifyyour job spec and keta will spawn jobsbased on your rules during the scalingprocess As I said this can bring youmore power if your warloads have a stateor if you are running some long runningexecution You need to keep the processalive until it ends It's not just astateless application Probably this isyour best fit for your necessities It'slike a chron job on steroids basicallyYeah it's like a good chronjob features Nice featuresThis year has been a year probably withnot super fancy features but all thosefeatures that you could need for reallyreally enterprise scenarios What I meanprobably if you or if you check otheryears we could have said no 30 newscalar sources and a lot of out methodsextra created But nowadays in enterpriseapplication in distributed and reallyhuge application you need a goodobservability you need a good eventingabout the the things that are happeningWe have been working on cloud events onsupporting the cloud event specificationin order to publish all the things thatare happening in Kella but wherever youneed them just publishing them throughHTTP through as grid and we are workingon new new targets that you can use toreceive old events related with ketawherever you needthem Also we have been working onimproving observability in terms ofaligning all the metrics that we exposeand all the the flows that you can usewith the with the community standardsbecause if you work with Prometheus orat least in my case when I work withPrometheus I don't like to see differentkind of h metrics namedlike execution hours no hour is not acommunity standard the communitystandard is second underscore second Sowhy should I work with ours Why KDAcomes here to break the standards andmake the things that they want No wehave refactor all those metrics We havebeen working on aligning all thestandards with the community and alsoYeah Yeah Yeah Do you want to continue Ineed to drink No continue please YeahYeah Yeah Perfect Yes sirWe have been working also on improvingthe experience when you work with KDA todetect the problems as soon as theyhappen Working on admission web hooksadmission web hooks were there last yearbut we have improved what we whichchecks we do We have improved multiplefixes that we have discovered over theyear and please go ahead goahead We have been working also onimproving all the authentication stuffsWe have improved the AWS Thanks Iappreciate it Yeah Yeah I'll continue Sobasically a lot of improvements on theauthentication area because autoscalingis one part but uh securing theauthentication and credentials and thiskind of stuff is also important becausewhen we are talking to these externalservices you need sometimes or usuallyyou need authentication So we need aproper way how to uh how to communicateto the services Okay So this was a avery brief uh brief uh overview of KDAand some new features and new stuff Uhlet's talk about some best practicesactually which might be interesting toyou guys So th�e first thing it might beobvious please use HPA scaling behaviorWhat is it It's a it's a feature nativeto HPA which we are using under the hoodbut is important This setting isimportant for one to end scaling Itbasically controls how HPA responds tothe actual load So with the with itsproper setting on this on this on thisconfiguration we can avoid fluctuationson the replicas So number of replicas Soyou see high load low load So you canhave a stabilization window policies etcetc You can also define how many potsyou would like to add a single iterationSo basically when we are creating morepots where we are scaling out we canchange this kind of settings So this isvery very important setting and thiscould really change the change the uhthe stuff for you if you configure itproperly The other other let's say bestpractice or maybe uh I would say one ofthe most repeating questions we got fromour uh from our users So they say okaywe are using chron scaler So what whatis chron scaler In chron scaler you candefine a schedule and you can basicallytell okay at this during this schedule Iwould like to have this amount ofreplicas This is useful I don't know ifyou would like to scale out the workloaduh different uh in a different ways uhduring the day So for example work hoursand during the night you would like toscale a little bit differently And thetypical thing that usually people do isto define that they would like to scaleto zero outside of working hours Thebest way how to do that is not to definethe design replicas to zero in the chronscaler because the how the chron scalerworks it tells hpa when it should scaleSo it should be the opposite So in thechron scaler we define the uh theminimum that we would like to haveoutside of this of this zero schedule Sothis is the this is the important thingtodo actually you can use this featurethat you are going to explain to achievethe behavior of the previous slide justin case Exactly And uh the very uh thisis also let's say newer feature maybenot everybody's aware of it It's calledscaly modifiers What is it about Ketainternally is relying on HPA to drivethe scaling and you can as mentioned youcan define multiple different scalersThe different the default behavior isthat the scaler that reports the largestvalue is been selected to to drive thescaling So imagine you have two scalersOne scaler give you metric to scale totwo replicas The other scaler tell youuh to scale to five replicas So HPA bydefault will select the five as themaximum We cannot do about it Forexample you would like to have averageYou have like the minimum or stuff likethat U because it's hard to modify thisin the HPA because the way how HPA worksUh we develop scaling modifiers whereyou can um you can define the neededbehavior So you can specify amathematical formula or nestedconditions and you can really play withthe with a different metrics So you canuse different sources different uhtriggers and scale them out properly forI'll show you it uh this kind of stuffin actionNice This time I have doneitThanks So in this case we have a supersuper simple scale object Easy peasyjust with two different chrome triggersfor the demo Doesn't doesn't matterwhich trigger you use And if we checkthe metrics if we go to the HPA we cansee that there are two metrics with onein average This is just a super examplebut if you want to add them instead ofexecuting a max between them because youare mixing some provider Q and anotherprovider Q whatever you want to applyHPA doesn't support it As Siniacexplained you can go to your scaleobject and you can define an advancedsection scaling modifiers where you candirectly write down your custom formulaand KDA will do the magic under the hoodto provide the scaling as you want Inthis case I'm going to say scaler a plusinstead offourplus for 42 Easy peasy And I'm going toapplyit And now I'm going to check again theHPA The same HPA has automatically beenupdated Now instead of of seeing twotriggers I see one composite trigger andthe result of it is one + one for twofour is the value This is a super stupidexamp�le but it show the power of thisfeature because for the previous casewhen I want to scale to zero I can justdefine zero as the chron and multiplyfor the working hours that I want toscale or one for it or those kind offormulas The limit is your imaginationin this case Yeah perfect And also Iwould like to highlight that thisfeature is not useful only when you havemultiple scalers or triggers only If youhave just one one trigger and you wouldlike to just modify the modify thesetting on this particular trigger youcan do it with with this uh with thispowerful powerful engine So if you ifyou haven't u used it please try tothink about it because it's verypowerful Okay So this was the demo rightDemo No no no no no Enough Okay so thiswas the best practices some just thefrom our observation in the communitywhat people usually ask what is thecommon issue and um this is all niceright autoscaling works everything isperfect ka is perfect kubernetes isperfect right actually but it is not sowith the autoscaling and uh especiallyin a larger scale you can you can see uhyou can see bunch of challenges andbunch of issues in the environment and Iwill try to list a few of them so thefirst fun This is a really good goodquic actually and uh uh so as wediscussed ka supports multiple differentuh event sources one of them isPrometheus and I would say this is oneof the most popular scar that we haveout there because it's like a flexibleit allows you to do a lot of lot ofthings under the hood because you havealready some monitoring solution into inyour system right so why don't use themetrics to drive the scaling as wellthis is good for a lot of cases this isgood but there are certain scenarios Iwould start the start with the first andthe permits is just example It could beany other monitoring solution It couldbe data dog data trace Basically youhave the metrics stored somewhereoutside the cluster and you are justpulling them So the first problem is ifyou look at the diagram you can see thatthe workload is you know exposing somemetrics and our Prometheus instance isscraping them So first we need to sendthe metrics to Prometheus and then KDAcan scrape those metrics from Prometheusagain So there is like this delay whichcould be matter of seconds maybesometimes a minute and it depends on theconfiguration For some cases this iscompletely okay because you are doingsome uh some complex permitous querythat uh takes average over a long timeBut if you need more real time realtimetraffic this is a problem because youare introducing the delay for thescaling So it would be better to insteaduh scrape those metrics directly fromthe workload The other problem with thissolution is that especially larger scaleif you have many clusters and forexample single uh let's say umobservability tool or prometers instancesomewhere outside the cluster you areyou are causing lot of traffic outsidethe cluster and back into the clusterimagine you have application you havethe metric in the in the in thekubernetes cluster then you send themetric there in the somewhere in thewild and then back to keta sometimes itcould be uh cross zone traffic you knowcrosszone traffic is uh is expensive aswell on a on on the call providers rightSo you might you might cause a lot oflot of traffic unnecessary traffic intoyour system Other thing since you aretrying to trying to uh be more real timeso you try to you know configure thescraping periods to be you know as lowas possible This is doable Maybe yourprometers instance is capable ofhandling This is fine But once you startoverloading this primes instance you arestarting getting uh connection times inKa This is very very typical typicalissue we see from users that they arecomplaining okay we are getting lot ofthis context that line exceed it andstuff like that So yeah thank you So andthis is this is really issue and youshould you should really think aboutwhen you were designing your solutionhow you where you host the host themetrics The other thing which is kind ofrelated is that uh you are yourapplication is scale based on some somemetrics and� these applications it couldbe multiple different applications andthey are all same talking to the same uhinfrastructure piece It could be thedatabase or your service that you arehosting yourself You might actually buythis autoscaling stuff which is awesomeYou might overload this database So youshould think about really how toproperly configure the database maybetry to scale the infra somehow littlebit or control how many replicas of eachapplications uh can can connect to thesame instanceThe other other thing clusterautoscalers because if you would like toenable autoscaling of kubernetes it'simportant to correctly set up both portlevel autoscaling and the clusterautoscaling It needs to be every timeyou know configured together becausethis is the key key to uh enabledynamicity Cluster autoscaler is muchslower than pot autoscaler because ittake much longer to schedule a new nodeSo you should really think about how tohow to mitigate this solution There arebunch of ways how we can do that It's alittle bit harder because clusterautoscalers they don't have any API uhfor us to tell cluster autoscaler okaylet's start scaling out creating newnode so this is a little bit harderbecause there is no APIs but there areways how to how to achievethat the other thing AI is everywhereright so you are hosting your or forexample you would you wouldn't like tohost your host the models in in someservice in for example open AI orsomewhere else you would like to hostthe models in your kubernetes cluster inyour infra for compliance securityissues or maybe cost So autoscalingapplies also to these workloads and it'seven more important because GPUs areexpensive So we need to properly thinkabout using the proper metrics to toscale out the the applications to havethe good latency and throughput Irecommend you a good talk on this onthis on this problem They are using uhketa kurf to try to mitigate thisproblem and it's tomorrow uh the lasttalk of the day Definitely recommendthis stuff if you are interested inrunning your AI workloads on KubernetesOkay So for the road map uh we have apredictive scaling which is something weare trying to do in a long time What isthe what is the uh benefit Ka is gettinglot of these metrics about yourworkloads So the idea is uh to to getthose metrics and then based on thepatterns we see in those metricsanticipate the load So we can scale alittle bit in advance before the actualload is happening Uh alsoauthentications the stuff we wediscussed before it's important So uhthis is the stuff also we would like toextend and will continueYeah and we are in in talk with AVM justto extend new platforms uh moreplatforms for KDA just to support moreuse cases and the but don't troll me Ahtrollers want to troll The last but notleast the HTTP scaling All the thingsthat we have explained fit really reallynice with asynchronous processes Butwhen you are dealing with synchronousprocesses like HTTP request you need todo the things in a different way Youcannot just no the request is on thequeue we will process it Not if you wantto scale h fromzero We are in hurry Please pass thenext slide HR will contactyou This is more or less the draft Ifyou see the blue the blue are the colorsYeah The green part is your applicationThe blue part is the client the thecustomer's calling And the red part ishow KDA works Keta can interact in themiddle of your workload withoutmodifying your workload So it's not sointrusive in terms of needing to changeall your deployment way just to use theHTTP uh the HTTP scaling features Andbasically it will deploy the interceptorin the middle Yeah Yeah I'm on it Sothanks Welcome Yeah I would like to addthat this is also applies to the stuff Imentioned before because HTTP traffic ismore um more you know it's import morespecial Yeah And uh the real time aspectis more special because you have thetimeouts and stuff like that So you needreally need to be fast when you arescaling based on the HTTP traffic Soending with the last demo If I go to theHTTPdemo there isn't any pot here serving myrequest But as soon as Ibrowse I could� need please work Yeahit's working I can seethat Oh I open the ground chat Isee the pot here So it has a scale andwhen the warload is not needed it willjust scale to zero after some time So ifI come here again and refresh I will seeexactly the same behavior scaling outreprocessing the request and scaling insaving money in the pro in the processAnd it has been all Thanks everybody forjoining us today for this session If youhave any question this is the moment Ifnot if you are afraid about askingpublicly you can reach us during theprocess you can ping us by Slack byhoweverThank youHi thank you That was really nice Youmentioned somethingabout cluster autoscaler and I was awareI I know what KDA does maybe not so wellafter that presentation but youmentioned something about clusterautoscaler How would Kada work with youknow a concept like cluster to scalingYeah So so we we have a P on that atKify We build a um we are using clusterAPI because cluster autoscalerscarpenter cluster autoscalers it doesn'thave an API to tell it to add a new newnode So what we did uh we have a clusterAPI defined on a on a on a separate nodegroup So you can run your carpenter orcluster skill on the cluster youseparate you create a separate uh nodegroup and we can target this node groupto create new nodes based on thosemetrics all the metrics we discussed Soyou can specify for example I can seethis this the benefit is that the crossauto scale carpenter this kind of it'skind of reactive so it based on thebased on the resource uh utilization andschedulable pots that that's that's allyeah so we try to do that uh in a uhfaster way through through cluster APIso you said that's in PLC right now yeahyeah we can we can talk about it lateron but we have we have PC on that thankyou thank you very muchanybody else has a question there is aI'm sorry for for the guyHi a question regarding the interceptorsfor HTTP traffic Can uh you alsoutilize existing interceptors like froma service mesh because I think theservice mesh would be kind ofinterceptor anyway So yes yes yes and nobecause for especially for the scalefrom 0 to one you need to hold the firstrequest and catch it and wait until theapplication is scaled out which is bitharder with with the with aninterceptor any other questionanybody we have thehe's sorry for that singleMike do you want to take that oneHi Uh I know that cluster autoscalerrecently introduced something calledprovisioning requests Sorry can you hearme Um cluster autoscaler recentlyintroduced something called provisioningrequests Um which I know like Q is usingto be to basically like submit uhupscales like before a job is runningI'm curious if you've like thought aboutintegrating provisioning requests withKada at all because you talked a littlebit about like an API to likepre-upscale Um I know it's a little bitmore challenging with predictive scalingto Kada but I'm curious if you'veexplored provisioning requests at allYes Yes we we did but nothing nothingconcrete at the moment but yeah we we weuh we we saw that Yeah So good goodinput YeahYeah I I will let you do the mic YeahSo you you mentioned earlier that Kadaworks also with other custom resourcesother than deployments Is that rightYeah Yeah Uh so do you know a use casewhere it is used for CI workloads forexample with GitHub actions self-hostedrunners does it work well with thosekind of situations So so you mean likethe worker like in the GitHub uh likea runners So I have GitHub action selffor runners running as ports in my butbut they are managed with a customresource Uh so can I use keta toautoscale them The yeah the question isif KDA can manage custom resourcedefinition or can scale custom resourcesbecause there are he has a use casewhere they use a custom resour anoperator based on a custom resource tospin up some workloads there The answeris yes Keta can scale whatever thatsupport/scale is an extension that youneed to implement on your CRDs and assoon as your CRD implement SLS scaleeasy pec will do it Doesn't matter whichCRD is is the the affected one If itimplements last scale ka will doit any[Music]other Uh thank you for the presentationUm does the HTTP autoscaler also take umopen websockets connections into accountI'm not fully sure right now becausethere is a in progress PR for supportingis not merged yet No no it is not likethe the HTTP I don't that we presentedIt's still alpha beta version I wouldsay So it is so the websockets aremissing there YeahYeah I remember it's not merge expectingend to end test before merging thefeature Yep It's not true ported thewebsocket yetThe mic does the extensions are in in orGPA GPC based stuff what what exactlythe extensions are So runs inside thekada binary or outside or add-ons whatwhat how how the whole framework actualextension framework works What whichextension thescalers all of them are already built inand you can extend them through a G agRPC interface that we expose You canuse any built-in scaler the six the morethan 70 scalers that we said are builtin and you can use it without any otherchange or you can develop your own gRPCserver exposing the interface that webring and extend with any business logicthat you want if the current scalers andthe scaling modifiers feature if theydon't match your own necessities you caneven develop your own server and connectit to KDA using your server to to managethe business logic and relying on KDA tomanage the Kubernetes part of theintegration So you manage the challengeuh of delay when you're scraping the thematrix from Prometheussomewhere far from the workload Thenwhat maybe I missed this during thepresentation but what's yourrecommendation to target this challengeUh there is not a single recommendationThere are multiple ways how to you cando that Um one of the things is uh weare also posing uh open telemetry scalerSo instead of sending those metrics overto the Prometheus uh we deploy smallopen telemetry collector and it directlyfetches the metrics from your workloadsSo it doesn't leave the the Kubernetescluster and it's been directly pushed uhpushed to Ka So this is one of the oneof the solutions how you can avoid thatUh the other is maybe you can try thedifferent scaler already built in Kathat would fit the purpose betterbecause the the Prometheus instancemight be overloaded And the last um lastrecommendation on this front is that wehave also a feature which is called uhmatrix caching And this uh feature cannot improve the delay but it couldlittle bit uh ease the load that'scoming to the Prometheus because we canreduce the number of requests coming tothe Prometheus So uh I recommend usingthis one And about this uh autocollector PLC when when this autocollector PLC is going to we have wehave it open in our in our repositorylike in the in the Kify repository andwe will be probably thinking aboutmoving it into the Ka Ka organization ifwe discuss that and it will be separateseparate add-on basically so it won't beback in Keta but it will be there So ifyou you know we can talk about thisoffline I supposeOkay Okay There is a question right Dowe have a one Yeah Do we have one moremic OkayJust wondering if uh you guys had plansfor any more AWS services integratingwith thoseDo you have an example of of a servicedoes not cover itIt's supported Is is there SQS You willask it No just any others that you guyswere working No it's this is usuallydriven by users Yeah actually we don'thave any road map but it's acommunitydriven project If you have anyservice that you would like to seesupported just open a feature requestThe point is that the amount ofpossibilities is almost infinite acrossall the services ac across all theprojects So we implement them as soon assomeone actually need them instead ofjust implementing a thousand of them Wejust implement on demand or even if youneed it you can open the the the is thethe feature request and you can open aPR adding the support and it will beimplemented for sure Okay So I think wewe need to we need to end here So if youhave more questions feel free to reachout to us I'll be at Karify booth or CEwill be around the venue So thank youfor for attendance[Applause]2025-04-15 21:59:51.904699 @@��6#��mA317rLOIKfDQthanks everybody for joining us today inthis session about KDA and how to unlocknew visions and how to use KDA inproduction First of all I'm going tointroduce myself I'm Jorge Dorado I workas principal SR atCRM the international half of SWARSgroup and I'm one of KDA maintainers I'malso CNCF ambassador and Microsoft MVPand he is my employee Nice Introduceyourself Thank you Horge Thank you GreatSo Horge is done for for today and therest of the talk is on me So my name isBeign I know it's hard to pronounce anddon't feel bad about it U I'm also KAmaintainer Uh I'm with the project sincethe inception So uh already even it wasbefore the CNCF donation So we togetherwork with a couple of other folks on KaUm on on top of that I'm also founderand CTO uh of Kifi It's a company builtaround KDA and we provide enterpriseautoscaling solutions uh around Kada forour customers So let's talk about Katoday This is the agenda we have I hopewe will be able to get it in time So wewill have some interaction talk aboutsome features uh discuss best practicesand some challenges that you can facewith uh dealing with autoscaling Sobefore we start uh may I ask uh theaudience if you can raise your hand ifyou already use SCADA in productionNice It's like a more than half right Uhis there anybody who doesn't know whatKDA is or maybe just heard the name butis not aware of the capabilities Okay wehave bunch of folks and uh is thereanybody who knows what KA is but doesn'tuse it would like to use it Okay perfectSo we have a coverage for all thestuff Uh so let's start rightSo why do we need to use SCADA Why do weneed to use auto scaring For the folksthat already use SCADA this will berepetition of the already known but Iwould like to cover this kind of stuffin a in a short time So why would weneed autoscaling on our clusters Firstwe would like to save resources becausewe would like to run only the workloadsuh when they are needed or and also atthe same time we would like to improveperformance of our applications How wecan do that We can enable autoscalingbecause we can autoscale our workloadsbased on the actual demand So let's takea look at this problem We have aKubernetes application This applicationis consuming some data from someexternal source It in this case it couldbe Rabbit MQ So the consumer applicationis consuming data and we would like toautoscale it because we just discussedthat autoscaling is great right So thenaive approach sorry uh would be to plugHPA into the mix HPA is already built inKubernetes It works well It's perfecttool and what it does it monitors CPUmemory usage on the application andbased on these metrics it scales out theapplication This is great for a lot ofuse cases but for a lot of use casesthis might not be the most idealsolution Why Because consuming messagesfrom the external source uh doesn'treflect in the in the resourcesconsumption on the workload What does itmean our application is pulling datafrom rabbit MQ and this doesn't reflec��oymentscaling and management of storage it'seasy to consume you can use PVs storageclasses and also have CSI driver foraccess and it's completely open sourceso looking into how uh the architecturelooks like it has three uh three uhlayers the first one is Rook whichdeploys and manages SE storage then wehave CSI which mounts and provisionsstorage for you and lastly uh uh and wehave SE which is uh the data layer whereuh your application's data is gettingstored so uh no uh the question comeswhy we chose SEF uh uh the it's a compit's an open- source distributed storagesolution complementing Velvitz Rook'sphilosophy and it really ticks all theboxes if you see uh you want to haveblock storage you can have that you canhave shared file system you can have S3like object storage and uh adding tothat it it's very scalable you can scaleup and out very easily without anydowntime it's thin provision so if thestorage capacity used would be based onactuals and not on how many uh PVCs youare creating so uh looking into the scauh scalability numbers we have seenmultipetabytes to uh uh rook clusteralready with uh with the recent clusterreporting around 250 nodes used1,800 OSDs uh each having three terabyteuh of NVME amounting to around 5.2 twopabyte storage and uh if you see the SEpublic telemetry numbers it's alreadyreaching exabytes and that's not just uhuh the numbers can uh be many times ofthis uh because few few people enablepublic telemetry and SE is also veryperformant if you configure it correctlyuh you can see the blog uh for moredetails it's already um uh reaching to 1TBTE per second uh speed as well And uhum now you have a complete solution uhusing rook and where can you run itshort answer is anywhere Kubernetes runso you can run it in a cloud providerenvironment with using EV EBS orpersistent disk or you can have your ownuh on-prem bare metal uh data centerusing it um with SD SSDs and HDDs formore performance and control or you canalso have a hybrid and multi cloudenvironment mix match for storage uhacross cloud and on-prem for resilienceand uh uh if you See the advantages ofdeploying Rook in cloud environment uhthe challenge people generally face incloud environment is they fear data lossuh as there are not availability zonesrook replicates and distributes storageuh uh data uh across as for you and uhthere can be limitations on the ppersistent volumes per node in some ofthe cloud providers or there could bepoor performance of small PVs rook hasan uh um intelligent and optimized uh umobject store data placement and uh itvirtually gives you unlimited sto uh ununlimited scaling and you can also havecl uh crosscloud support uh as you seein the next slide uh you can have anexternal uh rrooksef cluster and haveyour kubernetes client clusters uhaccessing this unified uh uh rrookf ukubernetes etes cluster for storagedemands and uh looking into the newfeatures that we had we had a recentrelease in uh December 2024 1.16 withthat with this we have mirroring nowavailable for redos name spaces therewere improvements in object storage withS3 uh storage class improvements and youcan also access audit logs for S3 nowand uh we have now direct multiple uhmultis network integration ation so nomore holder pods needed we also supportReef and Squid SE versions and in thenext release that we plan 1.17 in April2025 we want to have CSI operatorenabled by default uh allow Rookoperator to run in multiple namespacesindependently and support multiple SEclusters we want to have SEC CSI versionuh3.14 integration and uh have some uhobject uh bucket claim improvementsplanned for greater control over bucketswe also have uh uh u external monitorsupport for two uh data center scenariosalso we plan to support SE's latestversion tentacle and uh if you want touh download and get started with usinguh Rook you can use a Helm chart orlatest release from Docker Quay or GCRuh now in the next slides we'll coverCSI driver over to you Mu thank you DAso let's talk about CSI driver like CSIdriver like we host uh three projects uhbasically three drivers in it like oneis FFS RBD and NFS so it's a singleproject which holds like u three driverstogether so this the architecture lookslike we have like a controller plug-inwhich runs as a deployment and we have anode plug-in runs as a demon set thedeployment um runs with like uh HA it'salways like uh replica 2 which isresponsible for volume snapshot groupsnapshot creation deletion volumeexpansion all the stuff so whenever youwant or wherever you want to mount uhthe PC to our application we need to runthe node plug-in port that isresponsible for mounting and unmountingthe PVC they run like as a demon set oneper node what there are like few key keyfeatures of CSI driver like we CSIdriver like we use OMAP extensively forthe state maintenance in SEC CSI even ifit restarts like we don't want to leaveany garbage values so we use go for theAPI calls to talk to SE to get a betterperformance because reuse theconnections which is connected to apools so a single CSI driver is able totalk to multiple SEC cluster we don'tneed to run CSI driver per se clustersso we have an isolation uh at the SElevel like we could use redos name spaceor sub volume group to get amulti-tenency as well uh uh there arelike we support thin provisioning forRBD we support RWX block mode this isfor VMs we support RW file system fordatabases for SFS we support RX filesystem and NFS we support RX which is ontop of CFS we support PVC encryption forboth SFS and RBD we have tested withvarious KMSs like such as example WAAzure like uh IBM HPCS like bring yourown key secrets etc so we also supportonline PVC expansion for all the threedrivers oh we support volume snapshotfor SFS RBD and NFS we recently addedsupport for volume group snapshot forRBD and CFS uh the PVC clone issupported for all three drivers so therewas some problem with CFS backupsbecause FFS clones are like costlyoperation it's full copy so weimplemented something for uhspecifically backup tools where they canclone into an ROX so they can just mountand copy the data to some remote site sothis was one of the use case uh solvedfor SFS uh we support topology basedprovisioning like uh read from the re uhnearest OST so that we get betterperformance we support staticprovisioning entry uh migration as wellwe we support kernel mounts and userspace mount for both CFS and RBDso we have everything with CSI so weneed something on top of it like uhsupport like in SE we have replicationsetc so we built something similar to CSIit's called CSI add-on so it runs in asimilar manner where we have onecontroller and site car running with theCSI driver so in RBD like when youdelete an file from a PVC so that in theback end the space will still beoccupied so we have built a communitiesway of mechanism like user can seecreate a CR so in the back end we run FSstream or RBD specify to just uh makesure like we have a balance between thedata and like what you see on the userand the back end we support networkfence class this is like recently addedone uh where like on all the nodes likewe will display like what's the IPvisible on SE cluster and this is veryuseful for disaster recovery especiallyfor RWO PCs where they want to move toanother node so we don't want to haveany problem with the data and also it'suseful for like where you want to moveyour application from one cluster toanother cluster so they can you can failover fence the whole cluster uh and youwill have a data so we have we supportlike PVC encryption for surface RBD weadded a key rotation policy as wellwhere like user can run like uh kind ofa chron job for rotating the keys forthe encrypted PVCs uh this is one of theimportant feature for CS add-on that'svolume replication we supportreplication of a PVC and uh replicationof a group as well so this is like cubuh cubernetes way like we have a CR soyou don't need to do any manual steps inlike to promote the remote and all theoperation just one state change it sothat you will be able to do the GSrecovery from like one cluster toanother cluster easily so we have CSIoperator so we want to move away thelike all the CSI functionality or themaintenance from Rook to CSI itself sothat we export the CR for the user sothat with a minimal configuration youcould have your own CSI drivers createdwe support Helm installations and uhKubernetes charts as well uh some of thefuture road maps for CSI driver is likewe plan to support NFS um shallow volumeas well because NFS uses SFS underneathwe have the same problem we have plannedto support brows authentication for NFSvolume so a volume group snapshot forRBD though it's supported with upstreamuh main SE we are waiting for a releaseuh as well to make it a GI feature so wesupport mirroring uh group snapshot weare working on chain block tracking forRBDPCs for better backup we have plannedto support QoS for RBDPs as well this isone of the long waiting feature so I'llhand it over to Adam to talk aboutobject store yep thank youum object storage uh SE implementsobject storage in Rados gatewayapplication or RJW uh RJW is a webserver implementing S3 and Swiftprotocols and using SE as a storagelayer uh in a simple case user define SEobject store custom resource withdesired number of RGW instances andparameters for SE RBD pools uh and rootwill create airbd pool in se and um eddeployment inkubernates uh another advanced scenariois a multiffrontend deployment uh itallows to have dedicated edge doubledeployments with different configurationbut serving the same data for examplehave a separate deployment to host onlyS3 only swift or admin APIs have adedicated deployment per customer forbetter load balancing or a dedicatedinstance to perform garbage collectionum in this case it will be disabled foruserf facing instances for performancereasons uh here you can see uh objectstore custom resource with a long listof parameters which can be divided intwo groups uh the first one is a front-end configuration it configuresdeployment like desired number number ofreplicas uh resources enabled protocolsuh also certificates domains and so onuh and the second group is a backendconfig telling where and how store datain SE uh Rook allows to move backendconfiguration into separate customresource and referate from object storein this way we can have a second objectstore using the same back end forexample one on the left uh hosts only S3API and the second one only swift uh butthe same data so you will be able toaccess objects um from the same S3bucket or swift container uh now let'sexplore advanced backend configurationspecifically pool placement and storageclasses so here is again uh SE objectstore custom resource uh with thebackend uh configuration zoomed in uhyou can see a list of pool placement uheach placement has a name uh and the setof pools uh the first one is metadatapool it's responsible for storing uhbucket index andmetadata we configured pool to store iton SSD in three replicas we also have adata pool storing object payload on HDDin three replicas and we also defined uhstorage class reduced redundancy andoverride data pool with HDD single uhit's also storing object payload on HDDbut only in a single replica now S3clients can create bucket and refer ourplacement as S3 region uh and if userfor example put a object in such abucket it will end up in HDD3 replicapool and if user set storage class uhreduce redundancy it will be stored in asingle replican rook allows to have uhany number of pool placement and anynumber of storage classes per placementuh now let's see how Rook provisionsobject storage to other Kubernetesapplications uh the most popular way isto use object bucket claim it's similarpattern to PVC uh here user createsobject bucket claim with the storageclass name bucket name and optionaluh storage quer uh then rrook createsactual bucket and credentials in se andprovides secret and config map withconnectioninformation another way is to usecontainer object storage interface orcoy uh its open specification currentlyin alpha version it defines howorchestrators like kubernates canprovision storage from providers like orfor example uh cloud providers likeAWS uh it defines admin custom resourcesbucket class and bucket access classdescribing uh storage quality of serviceand uh access permissions there is alsouser custom resource to actually requestthe bucket and access for it and finallyRook also implements coy driver uhcreating bucket and credentials and umcreating a secret uh for application toconnect uh the bucket uh now uh Travisokay thanks everyone so we've heard alot about the features of Rook umstorage is a huge topic too and we'rejust scratching on the surface but onething that's really important to know isthat of course your data is is importantit's critical and Rook does everythingand Seth does everything we can toprotect your data so it's important tounderstand well what do you need to doas system admin to to make sure the datastays safe what do you need to thinkabout during your cluster maintenancetasks because it's not something thatyou can just deploy and expect to workindefinitely it it takes some attentionand and someplanning so as you're doing yourmaintenance operations uh you need tomaintain the nodes you need to updateKubernetes you need to to do to do thisand that uh so what can the clustersurvive what can I take down and whatwill keep running what can I expect atmy data layer um to keep running what'sthe severity of my outages if somethinggoes really wrong and what can thecluster survive and one of the basic uhthings that's important to to configurethen and plan for is your topology inthe cluster so your topology being howdo I lay out my data center do I havemultiple data centers zones racks uhhosts how many uh devices per each nodethere's there are lots of ways to do itand I'll just touch on some of theconsiderations today so this pictureshows us uh a zone topology where okaygenerally you'll have three zones youneed at least three zones you can havemore and each with within each zone youcan have multiple hosts and each ofthose hosts can have multiple OSDs whereone sethosd maps to a single uhdisk u so as as you place your data inthe cluster generally you'll you'llconfigure with three replicas that's thedefault replic recommendation So Sephwill place one replica of the dataacross each zone if you're running inthe cloud that means you you get areplica across each zone and the data isavailable from any of those zones asDepica touched onearlier um and so what happens now as asmaintenance operations need to happen asthings go down uh what can we survive solet's say we need to do ma maintenanceon a zone there's a a network outage ina zone there's something that needs tohappen in one of the zones and let's sayeverything in zone A goes down whathappens well from your data perspectivethe cluster is fully online seph is ableto still serve the data for reads andwrites and and everything will beworking and and staying online sothere's no downtime there's no outagethere's no data loss even with having afull zone that is down for some time oreven permanently no no data loss nodowntime uh what if it gets worse whatif we have an outage where two zones godown this is where because the data isreplicated across all three zones thedata itself is still safe so zone Cstill contains all of the data uh butwhat does this mean for availabilitywhere the data the cluster is down nowrites or reads are allowed at thispoint so you'll need to get one of thezones back online in order to to bringthe the cluster back online but at leastthe data is safe and there are measuresyou can take to bring the cluster backonline with that single zone if theother two are completelylost um but that's how you know the datais safe as long as the data on one zoneis safeum so for your various maintenanceoperations how do we how do we helpguide this this process so poddisruption bud budgets are kubernetesmethod for signaling how much can I takedown at atime so Rook will manage the PTBsdynamically to say oh it looks likeyou're draining this node let me uhadjust the pod disruption budgets sothat I'll allow any node in that uh inthat zone to go down so so back on thisthis picture here if we see you'redraining host X Rook will adjust thePDBs to allow host Y to also go down atthe same time the PDBs however willblock zone B or zone C from going downat the same time so that while you'redoing no maintenance as long as the PTBsare respected that you won't expect anydowntime during those maintenanceoperations so it is so the PDBs aretopology aware and the PTBs will helpguide you through through that to keepavailability another type of maintenanceis when you're upgrading so if you'reupgrading Rook itself Rook will uh willdo it in a rolling manner so rollingupgrades automatically so that you don'thave any downtime and it's only one atmost one failure domain affected at atime so here's kind of a case study soin the last month or so working with auser where kind of the worst casescenario during one of their maintenanceoperations the control plane and all theworker nodes were accidentally imaged ohwhoops i just rebooted all my nodes andthey all got a new image uh that'srather disastrous kubernetes is goneright so what do we do is there any hopeto recover the data well the data ispersisted to disk bySeph and so the question became well howcan we get the data back from the disceven though Kubernetes and all itsmetadata is completely lost how can webring it back up well in this case wewere able to help them with with a lotof manual operations admittedly to beable to get their data out theircritical data they were able to bringback up from you know from from thediscs that they were still availablebring up basically a a temporaryKubernetes cluster mount just the thedevices RBD or SFS volumes that theyneeded and extract the critical dataso we're we're working on documentingthis case and adding it to our our Rickdocumentation so that if anybody hitsthis case it is possible to to recoverwe do have some steps in this already inour disaster recovery guide in thedocumentation but there's plenty of roomfor improvement and and we're clarifyingthat so as far as tooling what can youdo during your maintenance operations uhwe do have a coupe cuddle plugin whichis essentially a CLI tool for runningdifferent maintenance tasks on yourcluster these are things that don'treally fit the model of CRDs where youneed to do a one-off maintenance taskinstead of expressing desired state inthe CRDs so there are things like youneed to you've got some downtime youneed to restore Monorum to that singlezone that that is only remaining youmight need to remove OSDs you might needto uh perform other maintenanceoperations for you know advanced thingswith MonzoOSDs and if there are other maintenanceoperations that that you've seen youneed to to run in your cluster we'd loveto know well what what would you like tosee automated because this tool is isthe perfect place to run those one-offtasks where um if it's not a CRDsetting and just to finish up I want totalk about project health how does thework project work how does how do wevalue the community well really you knowsince the project was started I thinkwe're coming up on our ninth anniversarynow amazing it's been that long butcommunity has always been important tous uh we've got over 400 contributors tothe to the project out of all those Iforget how many thousands they said inthe in the keynotes this morning toKubernetes but had a lot of lot ofdownloads we've we graduated CNCFgraduation is almost five years ago nowin 2020 and thanks to all ourcontributors we've got uh currentlyKaiso Ibosu IBM and Red Hat and Outboundas contributing with maintainers andwe're always looking for new maintainersif you get involved in the project uh wehave a process to you know to advancemaintainers and those who are active inthe projectum the stability you know being that weprovide a data plan data plane stabilityis of course critical so we've wedeclared Rook stable six years ago thereare many upstream users running inproduction many downstream deploymentsand products running on top of it aswell so just happy to say um it's it'sgreat to see so many people running itinproduction how often do we release uh wedo have a minor release about every fourmonths kind of to follow the spirit ofKubernetes itself which which releasesabout every four months we do have the1.17 release planned uh in the nextcouple of weeks and on a just to have acadence of patch releases you know aboutevery two weeks we will release arelease a patch or on demand if there isa criticalneed what's our road map for Rook uhwell essentially you know sometimes wetry to say well here's a here's afeature list here's what we want to dobut ultimately our philosophy is wealways want to support the latest Sethfeatures that the data plane isproviding we always want to make surewe're integrated with the latestKubernetes features as there are alwaysnew features coming out and to make theSeth the best storage platform forKubernetes that it can be as Kuberneteskeeps growing and growing we've seen youknow everybody needs storage and theyneed they need it to be the best it canbe uh if you're still in London in acouple of months there is Seth Days uhhere's a link to go register for thatand then I think we've got some time forquestions so thanks everyone please cometo the project pavilion we do have aRook booth for the first half of the dayuntil 3 p.m today for example uh so comefind us there and and we'll be therehappy to answer questions if we don'thave time right after this so thank[Applause]you anyonethe mic is walking around we got one infronthere hi thank you for the talk um noobquestion but what's the performancepenalty compared to running like just inlocal storage compared to running it inSE with Rook in a container storage mhmso the performance of local storage justthe local disk compared to running withwith Rook and stuff yeah like thethroughput and maybe the latencies alsobecause of this distributeduh I it probably has a penalty hit yesso you know with a local disc of courseyou don't have any latency with networkor other things so that will be thehighest performance of a local disc butyeah when you introduce Seth whereyou've got multiple replicas beingreplicated to other nodes Seph is fullyconsistent and so it does requirecommitting to at least two of the threereplicas before it will act the anyrights so the yeah so there's a networkperformance to write to at least oneother node and and even yeah two nodesso there is a performance there itdepends exactly on the numbers as far asyour hardware and network and um wedon't have numbers published becausethere's so many so much variety in inthat performance but I does anybody elsewant to comment on that or there's ablog linked in the presentation as wellyou can take a look into how to uh getthe performance number for your use caseyou can use fio you can have benchmarkthe local disk and then run uh fio withu maybe RBD or the workload and then seethe numbers for yourself yeahthanks yeah another comment comment onthat also is that Seph is as adistributed file system essentially adistributed storage it works best whenthere are many clients because manyclients can distribute the load acrossmany nodes so the bigger the cluster themore clients the better overallthroughput you'll definitely see if youjust have one client it yeah you're notgoing to be as impressed with theperformance probably but the biggerthings are that that's where it shinesoh is that about time okay one more onemore yeahum I have a question to your case studyso if only the Kubernetes cluster isnuked but the data is still available onthe disks wouldn't anc backup be enoughto get the data back because youtheoretically spin up a new cluster withsimilar number of nodes and then withthe CD backup you could be able torecover the Kubernetes resources youneed to get the data yes correct withthat if that city is backed up and youcan essentially bring back Kubernetesthen it's it's a much simpler recoveryproblem and not difficult in this casethey even um if they had anc backup theycouldn't recover it something wentcompletely wrong with with allcd backupand but that yes that will improve yourlife definitely if you can back up at CDand restore it in that case okay thankyouokay If no other questions feel free tocome up and we'll be here for a fewminutes thank you2025-04-15 21:59:52.703315 � R��~�8#��3AjikiO3CC7Zwhi everyone i'm u very happy to be heredid you enjoy coupon so[Music]far oh okay i did enjoy CubeCon a lotlast night it was it was reallygood and uh yeah this coupon is veryspecial to mei want to thank CNCF for for puttingthistogether after four years of workingwith Sanscar uh on Flagger on Flux uhsometimes day and night he is from Indiaand from Romania Bucharest there is uhsome overlap between us but we managedto do it um this is the first time everwe met in person soyay yeah so today we are going to talkabout Flux ecosystem the tools aroundFlux um don't worry we are going to talkabout Flux a little bit uh at the endwith the road map and everything but umthis session is focused around thetooling we've built around flux toextend it to enrichuh the way you do githubs um and thefirstum there are two major projects in theflax organization um there is the fluxproject in in its current form versiontwo is the you know the continuousdelivery tool powered by GitHub toolkityou know all of that that's why you'rehere so not going to insist that much onthis um and there is also flagger um itis a sub project inside the organizationbut it can be used independently it's astandalone tool it has been developedfor Isix yearsum more than that seven almost sevenokay seven yearsum it's uh something that's like anadd-on to Flux if you uh mix it togetherbut as I said works with any other toolout there uh that can�� 7#��QAxGywrHPAMmsuh welcome everyone uh we are gatheredhere to have Rook storage for Kubernetestalk i'm Dupa from Clyiso i have beenworking with Rukf for almost 5 years nowuh Madu yeah hello i'm Madu Raja i workfor IBM so I'm one of the maintainer ofSEC CSI SECSI operator CSON also takecare of integrating CSI with RO yeahhello I'm Artum also working for Clyojoint project uh last year mostlyworking on object storage parthi I'm Travis Nielsen one of theoriginal creators of the Rook projecthappy to hear and talking more aboutRookso to get started uh we'll have theagenda for the day is we'll have a shortintroduction to Rook and Seph then we'lllook into what new new features areadded to Rook then we'll have a shortoverview of SEC CSI driver then a majoruse case for Rook which is objectstorage lastly we'll see whatmaintenance considerations you shouldhave while working with Rookf so to getstarted on introduction to Rook uh wheredid it all begin uh so coming to thequestion if you're using storage forKubernetes you generally go for thecloud providers but why not have thestorage in your data centers why thestorage is not as scalable as theapplications you use uh is there even astorage solution that's doing it so withthat question we uh begin to design astorage platform and uh theconsiderations we had was we didn't wantto reinvent the wheel we just wanted tohave the uh best way to run storage inKubernetes and for that we chose SEwhich has been running in production foralmost a decade now so uh looking intowhat Rook is uh Rook essentially is justa Kubernetes operator that automatesself storage for you it's completelycloudnative management so you can useCRDs uh or uh for handling depl� do continuousdelivery and it helpsyou at the end of the road of everythingthat happens in your pipeline so canthink of flagger as the thing that inyour production system is right therebefore your user uh users get tointeract with the new version of yourapp and tries really hard to you knowprotect you from uhmistakes uh it does things likeautomatically roll back um it it can youknow shift traffic from one side to theother uh it implements a lotof deployment strategies that Kubernetesitself does not uh is not capable of thethe Kubernetes uh deployment controllerso I can think about if if flux isa is an extension of Kubernetes API intermsof watching the external world being agit repository or a container registryor even an S3 bucket or a helm chart ina container registry so flux does thatwhich is the outside world then itbrings all those changes in inside theclusterif those changes would have beenautomatically deployed to all your usersand you had a problem in there you willbe affecting everyone rightso flagger solves this partwhere let's say in um in this workflowplug scans the container registry seesoh there is a new version of anapplication let me go into git andupdate the image tag from version one toversiontwo and so this is step one what fluximage automation does it just commits toget and you can run flux imageautomation on not in your productioncluster So you can run it outside andyou should be running it outside um it'slike a dependabot if you want for forKubernetes workloads then the other partof Flux is hey there is a change in thedesired state now I need to deploy a newversion so it applies that change on theclusterthen flagger if you've set up flagger onthe cluster it will say oh you want todeploy a new version i'm not going tolet the Kubernetes deployment controllerkick off i'm going to take over andbased on a policy it's a Canary customresource uh Sanskar will will show youthat later on uh based on that policyFlagger will starts to shift traffic alittle bit towards the new version umtest it out with metrics and so on umand at the end of that analysis it saysokay according to your defined SLOs'sthe new version is good now I'm going toroll out to to the whole user base andthat's how uh the deployment ends sobasically flagger disconnects uhcontinuous deployment from the actualrelease right if you if with acontinuous delivery tool you youcontinuously deploy changesum Flagger does it in a disconnected waywhere releasing becomes something elseand releasing means doing itprogressively doing it with guard delaysso you know your your users are lessaffected byum by mistakes now we have we have thisuh in place now with flux and flaggerand and Sanskar will talk about flaggerafter that I'm going to demo flagger foryou today after that I'm going to talkabout um another ecosystem tool calledflux operator which acts at the CI leveluh up there when you do a commit changeyou open a pull request with that changeand maybe you don't want to end up inproduction to actually see that whateveris happening on the Kubernetes clusteris not that good uh and basically fluxoperator can watch pull requests watchthe registry and create for you FMLenvironments where you can actually testthe thing before you merge it but youknow as as always doesn't matter howmuch testing we are doing on our testtesting environment staging environmentthere is always the chance we are goingto break production so flaggeries is acore part of that and Sanskar please uhgo ahead uh right so just a quick showof hands how many of you are likeactively using flagger uh in yourKubernetes clusters okay I see a fewhands so so there is some bunch ofexciting stuff that happened over thelast few months uh we added support forwe were the first ones to add supportfor gateway API and in uh true fashionin that we added support for AWS gatewayAPI as well and there are a bunch ofother things that we did you can seethem on the screen but there are twomain things that we're really excited touh introduce here one is support for Knative um K native has been on our roadmap for a very long time uh a lot ofusers love K native because it's oftheir whole serverless architecture umand integrating with K native is achallenge because they have their owndeployment model right they don't usecub they use kubernetes deployments butthey have their own layer around it umso it was a bit of a challenge to getthere uh but we managed to do it uh andthe way it works is uh is actually a bitdifferent than how other uh networkingtechnologies work so if you were to usethis with let's say a gateway API youwould define some networking object likea route or something which can uh youknow spread HTTP traffic here instead oftargeting the Kubernetes deployment youtarget the K native service itself andit's actually very easy to get startedyou deploy a K native service and thenyou create this K object and then you'reoff to the races um and uh by the waythis was a community uh contributedfeature so a big shout out to thecommunity for contributing thisfeature um and there is one more thingwhich I would like to talk about andit's a sort of a problem uh but beforewe dive into the problem I wanted toexplain how we do progressive deliverytoday um so if you think about it it'spretty simple right you have a primaryworkload and a canary workload in thisdiagram the blue one is the primary oneand the green one is the canary um sowhat you do is you slowly and slowlystart shifting traffic towards the greenone so first at 5% then at 10% and thenuntil you're satisfied that the newversion is working perfectly fine it'sall good uh you keep shifting trafficand then you start directing all of yourtraffic u to the greendeployment um so Flagger does all ofthis for you it automates it fully itcan hook into Prometheus a bunch ofother metric providers uh fullyautomates this for you um but oneasterisk here is that the weights areactually random uh as in you can'tdefine which user gets routed to what umdeployment so uh uh so if we apply thatuh analogy to this next slide so on theleft hand side it's the old Spotify UIand the right hand one this I pulled offGoogle and the right one one is ascreenshot of my Spotify uh applicationum so if you think about it if you if ifthe team were to apply progressivedelivery to this uh scenario a userwould first get directed to the lefthand side UI which is the old one rightand then during the uh fa during thelife cycle of a canary analysis theywould they would end up you know seeingthe right hand side but then they wouldalso switch back to the left hand sidebecause traffic distribution is randomyou can't really control who sees whatum that's a big problem like you don'twant users to have an inconsistent userexperience you want them like you wantthem to have a consistent userexperience where they are comfortablewith what they're seeing so but how dowe solve this right um so we came upwith a new sort of deployment strategywhich combines weighted routing withsession affinity so session affinity isnothing new it's been well establishedwe know that if you want users to uhcontinuously hit a particular workloador particular service you use sessionaffinity which introduces stickiness uhright so what we thought of was what ifwe combine weighted traffic routing withsession affinity so that you still getthe benefit of you know slowly shiftingtraffic from your uh current deploymentto your new deployment so that you don'taffect all your users all at once butonce if a user is served by the Canaryworkload which is the new versionthey're always served by the Canaryworkload so once they see the right handside UI they will never see the lefthand side UI and that's that's the kindof the user experience you want to offerto the users right um so I'm going toquickly demo this um and I hope the demogods are with metoday um okay okayso thank you uh let me know this font isvisible okay um so here I have a canaryobject already defined in a Kubernetescluster um there's a bunch of stuffwhich we can like gloss over but one onething I really want to highlight here isuh this particular section here which uhdefines the cookiename yep yeah so this cookie name iswhat flagger will inject into theresponses uh served by the uh canaryworkload and that's how you actually dothe stickiness thing um and flaggerfully automates this all you need todefine is a cookie name and flagger willgenerate random value cookie valuesduring the life cycle of a canaryanalysis and here we can see that we aretargeting this pod info deployment whichis our target workload so let's see itin action right um so I have STTOingress gateway API set up already onthis uh Kubernetes cluster and what I'mgoing to do is uh I'm going to I'mactually running6.0.0 as you can see here and I'm goingto go ahead and uh set it to 6.0.1 sobasically trigger a canary roll outright and I'm going to wait for Flaggerto do its magic so what's happeningbehind the scenes is like Flagger isconfiguring the uh HTTP route with thecorrect uh response headers and thecookies and everything to make sure thatthe sess session affinity uh takes placeso right now it's scaling up the canaryworkload uh which is running the newversion that is 6.0.1 so uh let's justwait while it doesthat and I'm going to go ahead and So Ihave two incognito tabs here uh to likedisplay the cookie stuff because cookiescan uh get weird if you have the sametab um so I'm going to go aheadand load it up here and here aswell okay so as you can see here it'sboth of them are like uh were beingserved by the primary or initially butnow they're being served by the canaryand now if I like try to reload this italways hits the canary even though ifyou were to inspect the HTTP route uh alot of the traffic distribution istowards the primary workload but yetonce uh but yet because uh this windowand this window as well hit the canaryworkload sorry hit the canary workloadit always hits the canary workload youalways get served by the new version andthis is kind of the behavior we wantusers to experienceyeah so the trick here was we want toyou know progressively move traffic fromone site to another but we don't wantusers to jump backso yeah it was quite challenging becauseyou now we have like two types ofinjecting cookies we inject cookies onthe canary and you give it a name thenflagger you set it random but it's notrandom generates a unique cookie namefor each canary session so every timeyou start a newdeployment everything starts from zeroand then that's how we can pin users toparticular versions and the lastaddition in flagger we also create acookie for the primary so we can do finegrain load balancing between the twouh yeah this is uh this is something uhSanscar was working a lot on uh thisworks with gateway API it works with andother other ingresses correctly Okay yepanything which supports gateway APIworks with that and if you use STOvirtual services works with that as wellyeah we are I am extremely happy aboutgateway API uh when we started flaggerwe had to integrate every single APIlike API uh solo IO API there are somany otheringress yeah and gate API basicallyuh unifies all the way you can definenetworking i if there is tomorrow a newservice service mesher or a new ingressor a new CNI out there that plays theservice mesh role uh we don't need tochange flagger anymore so that was agreat win for theum cloud native community having thistype of API that um you know usimplifies how other tools can programtraffic because this is what flagger ina way is is a traffic programmerprograms the traffic but it alsointeracts with metrics and so on thankyou very much for uh for the demo giveit a clap to[Applause]Sanskar so okay what else we have uh inthe Flux system i want to give a majorshout out to the headline team uhthey've been working I think it's oneyear now they they've been working onthis uh flux plug-in for headlamp if youdon't know what headlamp is it's a uhKubernetes dashboard it used to be asandbox project uh but they announced atthis con they are moving underKubernetes SIG so they are now part ofthe Kubernetes project uh underSIGUI which tells that you know headlapis successful uh the community isgrowing um they are uh improving day byday the the flux plugin um rece ntly theyreleased uh uh an overview page for fluxuh I've also seen in beta flagger umintegration in this dashboard so you canactually you see caneries moving and soon um my request to you is if you are aPL user maybe like me you don't use UISit's okayuh give this a tryum bring back feedback to the headlineteam they'll they'll definitelyappreciate it uh so yeah there is a linkthere go and read our blog post uh andyeah if if any of you catch the keynotetoday morning they uh demoed headlampwith flux as well so that means it'shere to stay and yeah the integration ispretty strong yeah cool okayso how much time do I have you have 12minutes 5 minutes 12 minutes 12 minutesokay plenty of time okay so I'm uh I'mgoing to say a little bit about my uhpet project in the last year uh it'scalled flux operatorum we've been building this for I thinkI think one year uh it's me and otherFlux maintainers uh under control planeit's an open source project and we westarted this adventure by uh you knowthe desire of open shift users to beable to use flux in an open shift nativewayum so okay he said okay I'm going tocreate an operator even if I was likewhy are we creating an operator whenflux is already an operator it was likeokayuh so yeah we created flux operator uhthe first the first goal of this was tolist it on the operators hub so you knowit fully integrates with open shiftecosystem uh you do click ops you clickthere and you install it or I don't knowyou can configure it uh with asubscription uh and that's how westartedso before the flux operator how peoplewere deploying and and bootstrappingflux was with the flux CLI or with theterapform provider so we had to come upwith a new abstraction for uh you knowallowing a declarative model forbootstrappingfluxand we after after we we released it werealized that many other users were likeI'm not running on open shift but I likethis idea I want to I want to switch toit so it kind of grew from thereum and we we we ended up in a placewhere flux operatorum streamlines uh how you can deployflux on hundreds of clusters is uh howyou can optimize flux is very easy to dosharding all the advanced configurationsthat you think of flux you can do itwith the flux operator very very easy sowe kind of worked on the user experienceof of that part of the of the storyuh so we have this new API called the uhflux instance thatum is the thing that allows you to moveaway from git like to maybe feel like uhstrange to say that but githubs does notmean you need to run the git server inyour production systemor at times you don't want to depend onyouruh hosted git or your self-hosted git orwhatever you are using to be that youknow single point of failure when you doanything in production and how we howyou can decouple git from from yourproduction systems especially aroundcontinuous delivery fuse flux is throughuh OC artifacts so the idea is you stilluse git and for collaboration for uhchange management for um history forcontrol but flux no longer goes and lookat looks at your git repo and you youcan run in your CI a simple commandcalled flux push so the flux cli haslike docker uh commands for interactingwith a containerregistry you modify something in it youdo uh you do a a commit a push and thenyou run flux push that stores the statemaybe you could sign it uh so flux canverify that artifact so you store thatstate in a container registry so then ifgit goes down and you want to roll backor something you can just act on thetags of a image of an OCI image rightyou can still do roll backs you canstill change your system even if g istemporarily down Um another great usecase for for uh for using OCI as thedesired state is when you run in uh airgap environments um flux operator wasone of the first adopters of the tool uhsoon as we we launched OCI support forit were organizations that had this uhum they are highly regulated they mustrun in an air gap environment even ifit's in a cloud it's still you know kindof airgapped i don't know cloud air gapdoesn't quite sound right but anyway youknow what I'm talking about if you workin th at uh in those organizations umanother thing thatum we we wanted to improve was automaticupgrades flux is mature enough we haveproved our users in the last year thatwe really care about backwardscompatibility and we we even today youcan upgrade from flux 0 something to thelatest version and it will work so basedon this we said we need to makeautomatically upgrades better how howare you going to secure your systems ifyou are keeping them up to date if youif your system is veryold a CV somewhere it will affect youlike it's imminent there is there is nodoubt about it the the longer um youknow you run something in yourinfrastructure and flux runs as criticalinfrastructure is cluster admin is thefirst thing you deploy after you createyour cluster so it's like reallysensitive anything there can actuallyaffect and take control of your wholecluster so what's the solution to thisyou should always keep up with thelatest release and PL operator makesthis very simple uh you have their specdistribution version you give it aserver expression and that's it and youforget about it and flux operator willkeep flux up to date and I I have hereare two examples on you know how you canenable um persistent storage for fluxinternal artifactsmulti-tenency all the things that youcurrently you can do with the CLI thereis no issue you have a bunch of flagsyou also need some um customizedoverlays on the git side so instead ofdoing all of that here you replace tensof thousands of YAML that you store inyour repo as the flux uh system here youjust have this I don't know couple oflines of of configuration and you canchange it you can enable shardingdisable sharding upgrade downgrade andand the flux operator will do the rightthings for you another another goal forthis was to reduce the chanceof suicide like it's very it's veryeasy to destroy your wholecluster uh if you delete something inthe flux system uh directory in your gitrepoum there are various way[Music]of yeah because flux is so flexible weyou can do whatever including tellingflux hey please delete yourself and thewhole clusterand yeah uh with flux operator you it'sactually hard to do itum reallyhard uh it requires a lot of uh you youyou need to try really hard to to beable to do that kind of stuff so yes italso adds some guard rays right we wantto better protect our users especiallybecause it's flux operator you'll berunning fully automated is like githubson autopilot mode so an autopilot shouldalso be defensible right it should alsoprotect you from doing uh bad thingsokay so that was the the first API withwith we started with making flux easierto install running it at scale beingsimple to configure it and we got to apoint where we said we want to buildhigh level APIs on top of Flux whichdon't actually make sense to build themto extend the current flux APIs therewas a lot a lot of requests for examplefor a flax hand release to be able todeploy from a pull request and stufflike are we going to modify the Helmrelease API which should be a reflectionof what Helm does and talk to what is apull request is not even a git conceptit's a GitHub concept gitlab has mergerequests right so and and flux talks gitit doesn't talk GitHub gitlab and so onSo with with flux operator and this newthis new API we kind of said okay we wecan add here the fancy things that arevery scope to whatever service you arerunning and the goal of the resource setAPI is to give you this selfservice uhenvironments feature where yeahself-service can mean many things tomany people and in the spirit of flux wedidn't want to have the research set APIbeing highly opinatedso we don't haveum we don't impose you the idea what anFML environment means for you in thiscase an FML environment means you createa new hand release in a existing nameaceevery time someone opens a pull requestadd a label uh with the input provideryou tell flux operator hey scan thisrepo watch for pull requests that have alabel and with the resource editorcreate a hand release object for each pfor each pull request that has a labeluh and use the um git sha of the pullrequest as the uh image tag right so youyou can change code you can changeconfiguration in a pull request you openup a request your CI builds the imageand then then you can deployboth the code change in the containerthat's packaged in the container butalso the configuration change which ispackaged in the Helmchartand this exampleis it it implies that you have a previewserver you have a name space there foreach app and you deploy there using helmall the in all the instances for everypull request and it follows the pullrequest and so on but this may not workfor everybody um some people may want tocreate a new cluster how would you dothat you would add here in the resourceset a cluster API definition and thenthe hand release will have the target tothat cluster so it's up to you or youadd I don't know cross crossplane uh uhdefinition that creates the cluster andso on so it'snot it does not forces you into tellingyou what how you do FML environmentsgives you the tool to build that stuffand if you want you can also create anAspace per pull request or a cluster ora Hen release or no Helm at all and youonly work with with flux customizationand so onum so yeah it's um I'm I'm quite excitedabout this new way of doing uh of beingable to test things before you mergethem it was I think it was one of themost high requested feature in Flux likepeople are saying I don't want to mergeI don't want to see it running on acluster so yeah we are we finally havethis capability uh so I please give it atry tell us how it wentum still uh early days we we want toimprove things we want to expand it moreit only works with GitHub and GitLab fornow if you have other Gitservices you know open an issue maybecontribute uh yeah so that was that wasthe flux operator how much time i thinkwe have like 30 seconds we can okaythrough this okay okay so to endum these are the things that we'veshipped in flux 2.5 uh the latestrelease uh for many of you running onGitHub uh you know the request alwayswas we want app authentification wedon't want to link flux to a personalaccess token um one of the fluxmaintainers uh Deepty uh she works atAsia Arc uh she she contributes a lot toFlax she helps us uh you know with withFlux maintenance she actuallyimplemented GitHubauthentification uh and yeah give it atry if you're on GitHub now you candecouple Flux from personal accesstokens uh another thing that we've beenworking on and this is a long long-termrequest are having having a way todefine custom hell checks with common uhexpression languages so now we can dothat with flux and it works withcrossplane cluster API you can do manyadvanced things when you do with fluxyou deal with flux dependencies when yousay I want first to create the clusterthen some other things I want to checkthose other things are there uh so nowyou you have an an expression languagewhich is very cool uh it's beenintegrated in in Kubernetes is part ofthe policy um u policy agent thebuilt-in policy agent and we are we areworking on getting more integrationswith cell influx we we think this is thefuture of the flux APIs if you want toextend flux in some way we will betrying it to you know uh do it usingcell which is quite powerful and moreflexible that add you know hundreds offields to the APIs we we we kind ofthink cell will will work really greatin the future for usum yeah and we we also[Music]uh completed the OCI part of FL so youknow you youyou have notification controller whichcan go into git and say check thiscommit was successfully reconciled onthe cluster but if you use OCI as anintermediate layer we couldn't do thatand now we can we can with metadata soeven if you use a container registry asthe desired state Flux can go back andsay even if it was reconciled from acontainer registry and go back to G andsay yes this commit was completedsuccessfully or it has a route um yeahplease check out our road map there aremany many things coming this year we arehighly focusing on on futures um yeahthat was it thank you very much for uhfor being with us thank you[Music][Applause]2025-04-15 21:59:53.557106y so TLS helps us on encrypting theevents but what happens if you want tokind of protect your broker that noteverybody can send events to it so forexample define some access rules or somepolicies and protect your broker thatonly source XY Z can send events to ittherefore we need first authenticationto verify where the request comes fromor who sent it and finally authorizationto implement thepolicies and for authentication as saidit helps us to verify who sent therequest and therefore we use we areusing or integrated open ID connectto help us or to ensure that we canverify who who sent the requestand this is also done on on two sidesfirst addressables they provide theso-called OIDC audience in the statusthis is an autogenerated unique fieldand it will be automaticallyprepopulated for you when the OIDCauthentication feature isenabled event producers on the otherhand they also create or they create aservice account uh service account underthe hood for you also automatically andthis new service account serves as theiridentity um when sending events so umwhen event producer now sends an requestor sends wants to send an event itrequests a new ODC token for theiridentity service account and for theaudience of the target and attaches itto the to the request in the in theheader and then the addressable on theother hand when it receives an an arequest then it checks the authorizationheader or the the token when it's okayit passes it through and continues withthe um event processingwhen it's not okay um for examplebecause the token expired or because itwas for the wrong audience then itdeclines it by a41 on the resource side looks thefollowing that um on the source side onthe event centers we now have the theidentity of the service account umprovided in the status or serviceaccount namefield and for the addressables as theyhave their audience this is alsoautomatically populated for you it isthe status address audience field andit's of course also in the addressesfield as we saw in the TLS slides butfor simplicity we have ithereokay uh sorry yes okay so now we knowwho sent an request but withauthorization we can now or aauthorization should help us to restrictwho can send events to your resourcesand we do this by um as we added a newum custom resource called event policyand in this um event policy you canspecify who can send events to whom andthe event policy has two mains mainfields first the spec.2 where you canspecify which resources are going to beprotected um for example you can do thisby providing a direct reference as inthis case to a broker or you can also dothis by a selector by selecting multipleum resources which are kind of protectedby this eventpolicy also as a side note when the spectwo is empty then the event policyapplies to all resources in thenamespace and the other importantsection is the spec from section whereyou specify who is allowed to sendevents uh to the above resources and youcan also do this by specifying a directreference as we do here for ping sourceand this can also be come from anothernamespace and you also can uh directlyspecify the service account names or theidentities via this sub field and a nicething about a sub field is that itallows you to use prefixes followed by astar like in the second sub um and thatway you can for example allow allrequests from all namespaces from orfrom all service accounts from from anamespace and this third section are thefilters and they are optional and theyallow you to even shrink it more downwhich events are allowed for examplethat way you can restrict that onlycertain event types are allowed to besent to yourresource okay let's uh check on a quickexample how this looks like imagine wehave namespace one having uh ping sourceone and namespace 2 having ping source 2and a broker and this broker isprotected by an event policy and inevent policy we define that only pingsource one from namespace one is allowedto send events to obviously meaning thatping source one is or ping source 2 isnot allowed to send events to thebroker now you might wonder what happensif you do n't have defined any eventpolicy for someresources for that case we define somefallbackmechanisms or some default authorizationmodes which jump in in case no eventpolicy is in place for a resourceso and there we have three differentoptions you can choose from first allowall which means that yeah all requestsare allowed if no event policy is inplace deny all meaning that all requestsare denied by default which kind offorces the user to create an eventpolicy and assign it to the resource andthe third option is the allow samenamespace option which means that onlyrequests from the same name space aswhere the resource lives in is allowedto be sent and as this is default umthis could be seen as some breakingchange but as you can simply update orset the default authorization mode umyou will befine okay so going back to the previousexample where we had the two name spacesand a broker um in this example we don'thave an event policy which would meanthe broker would fall back to thedefault authorization mode which is theallow same namespace um policy whichwould then mean that only requests fromthe same namespace would be allowed tothe broker in this case this is only thecase for Pingosource 2 and as Pinkourone is in another namespace as thebroker it would be declined and bydeclined I mean they would receive afour or three statuscode okay that's it from for securitynow let's check what wehave or which kind of news we have onthe integration sideum previously we had only or can nativeeventing was kind of limited to a smallnumber of built-in event sources andthings for example we had the API serversource or the ping source and when userswanted to connect to third partyservices they needed to create forexample or to leverage the containersource or write their own source and ofcourse this is some some burden and ummaintenance work of course you couldalso use some third party sources butsome of them are not maintained verywell anymore or even got shut downbut maybe you know the Apache camelproject which have and the Apache camelproject they have the camellets whichrun Kubernetes and provide you someconnectors to various third partyservices for example service to AWS orto Google cloud services or um messageservices or JMSworkers and so we thought okay it wouldbe nice to have support for connectingto third party services natively in Knative eventing core and this is donenow by the new so-called integrationsource and integration sync and thosetwo new sources or those new sources andsync allow you now to get events andsend events from AWS S3 um SQS SNS orDyn Dynamo DBstreams and it's now super easy for youfor example if you want to receiveevents from AWS SQS you simply createintegration source add the SQSconfiguration options the O secretspecify the sync as you usually do for asource and you're done afterwards yoursync in this case the broker willreceive the events fromSQS similar if you want to send eventsback to SQS you simply create anintegration sync specify the AWS SQSconfiguration options the O secret andyou're done the sync will create an anstatus where you can send send events toand that's it and by having thisnatively inum K native eventing you also get thisin the in the usual release cycles andalso it um supports for exampleintegrating into existing systems whatwe have like or um so that for exampleit it comes with TLS support out of thebox okay that's itfor eventing integration so handing overtoum eventdiscovery so the first part is likeunderstanding why we talking about eventdiscovery and so as we saw like when webuilding event- driven systems usuallywe connect to all sorts of systems thatare like either public cloud services orlike private internal system servicesand um the idea is that to know how toconsume them we need to have either gooddocumentations in way that every fieldis like documented and well known or wehave like schemas aswell and um so far like what we had wehad like on the documentation website wehad like a long list and you had to digthrough each individual source tounderstand what is was producing whatare the kinds of events I'm going to beable to consume with that source and andwhat's also the shape of that event uhin a way that I can you know uh dializeit and do something useful withit and yeah that's the idea So what do Ido next when I visit the documentationis not very helpful usually these daysyou know we go to charge BT or cloud orwhatever you like we ask it for examplein this game in this case I'm askinglike what is the type of event I canconsume from AWS DB streams um it's kindof very lengthy at some point is kind ofbeinguseful and it gives me kind of anexample shape so maybe we don't need todo anything you know we have LLMs we wecan solve some of these problems theissue comes when you know if you assumethat public services you know they arewell doumented LLM maybe can help on onsome of those internal systems are notthat well documented we know that youknow um documentation is not first classcitizen in many like legacy orenterprise systems and so we need sometools for that and so probably um that'sthat'sbasically the other solution that wehave currently is basically create afunctions for that event consume it andlog it and maybe extract the schema ofthat event in that way that's prettymuch what everybody's doingso that's why we created this API it'scalled event type API this was presentfor a while inventing uh we kind ofevolved it in a way that you candescribe basically uh a reference to aresource that is emitting thatparticular type and also the attributesand in these attributes as you can seeyou have the name and the value thevalue can also be some sort of templateas you can see in the source value thereis a variable ID in there so that youcan describe kind of complex and thenthere is we usually in in theseattributes list you can put whatever butwe recommend at least type source anddata schema because data schema is alsoa link to your schema registry that youmay have you know people use I don'tknow confluent or there are like someother open source it could be also agithub repo somewhere on your privateum so that that's That's the idea oflike having these APIs like a cube APInatively we can consume it and so asusual you can list it with CLI cubectlget event type and get the list um theother issue is that how do we get likethese event types created you either cancreate them manually you know you cantype the full YAML create them in thecluster and there is also another way ofcreating them we create themautomatically as you go through thebroker as we saw in the previous slidein this slide as they go through thebroker we basically create these APIsforyou so that you can get a head startlet's say and this is that's the idealike as the event goes through thebroker we create that CRD um event typethat event type CRD for for you um thereare obviously downsides to do this Andyou know uh but this is is at least agood start experience you can kind ofget it going at least in development andthen maybe in production you just turnit off the automatic creation and useyou know the the manual oneuh okay so last year we had also part ofthis we presented this at cubecon na sowe have we had also a demo a full demoif you're like using a backstage thisthere is a plug-in for backstage you canvisualize your catalog with this APIit's kind of cool go watch it um it'scalled uh what even discoveringKubernetes uh it's on YouTube so thereis a demothere uh job sync jobs sync is isanother feature we have been working onand uh the idea is that when you uh workon like lambda type of systems or likeserverless type of systems you have ahard limit on how long that processingis can take you know it can be in theorder of minutes uh let's say but thereare like lots of workloads where you youhave this processing and it's going totake long time to kind of process andcomplete and so that's the idea of jobsync is really to be able to have arequest coming in to an event and thendo basically offload these uh longunningprocessing in the backgroundum so the idea is that I have an eventsource so you can use whatever whateversource you're using and as you can sawas you can see so uh as we discussedearlier we can this can come from anysystem and basically create uh when theevent comes in to job sync you can whatwe do is basically create a job for youand the event is going to be mounted onthe file system on that path and so youcan read that event as a JSON and thendo your processing and this canbasically takes as long as you you needtoum another thing that I want to talkabout is that part of this is um apattern basically on now we see peopleuse this uh job sync also to create likecomplex workflow that are like eventdriven so very scalable in a way and theidea of um is the idea is that beforeshutting down your job basicallycompleting your job is to produce a newevent so that that event can alsotrigger further processing basicallyOkay so um to do that we we also reusesome of like existing features uh ifyou're not familiar with sync bindingthe idea of sync commanding is to injectan addressinto a resource that looks like a pod soit could be a deployment a job it couldbe stateful sets could be demon sets umup to you or even custom resources is toinject this as an environment variableinject an address as an environmentvariable we call this case sync and sobasically before completing your jobyou're just going to fire a new eventand that event is going to basically sayI've completed my processing with thisuh with this event and so that you canbasically further consume that event forfor more processingeventtransform so uh we we didn't do like anintroduction but like uh eventing prettymuch is based on cloud events cloudevents is a CNCF also is mainly focusedon specifications but also um we weproduce a bunch of SDKs for differentlanguages so Python Java Node or uhothers and uh it's a specification whatwhat it's really trying to do is tostandardize how you transport eventsacross different protocols or differentformats um in eventingwe primarily we support binary HTTP soit means that as a compact form and theattributes of the event are in theheaders and the body of the request isthe datapayload and then also there is astructured JSON this is a little bitless efficient because we need todialize the JSON when we need to applyfilters um but you know you can use bothof these and adopting adopting thisformat is um is relative s relativelysimple you need to add a bunch ofattributes but it's still something youneed to do additional the idea is thatfor with even transform you want to likebring systems like that already existingeasily into uh eventing and so have thiskind of more cloud native processing foreven um event driven onsystems and um not only that we don'twant people to write a bunch of codethat is just not even lifting or likekind of transformation stuff um andthat's the idea of even transform maybekind of declarative transformations ofeventsum so we started with JSON obviouslyJSON is very popular for like API or oror similar and the the idea A is that weuse this JSON uh expression language touh transform events from one JSON toanother pretty much um it's relativelysimple idea but is very effective andout with like less code to maintain foryou all this is an expression exampleexpression it looks like JSON but is notas you can see the value these arevariables they are not quoted so umthat's how you can tell it's not JSONum so here I have an example in in myexample uh I have a broker and eventtransform and in this example I'm usingthe broker reply feature in the brokerreply feature you can basically have aprocessing of an event and then replyback with a new event and that event isgoing to be get cued as a normal eventwould be um sent by a different sourcein this case the idea is that I want toextract the the payment method from thedata and put that as a attribute of thecloud event so that I can filter andmaybe have a processing only for likecredit card type ofpayments and this is what it looks likein YAML um it's likeum so in in this is is another exampleon how to use event transform there isanother way of like instead of replyingback with the new event I'm going toforward that to a different sync andthat sync can be also a any sort of HTTPURL could be outside the cluster even inthis case I'm justum uh shaping the event to be propercloudevent and this is what it looks like youhave a if you're familiar with K nativeyou usually have this specs sync andthat sync usually is the basically thetarget of theevent so that was the presentationhopefully it was useful and we have sometime for questions if you have uhotherwise we are going to be at theproject pavon kios3a feel free to leave also feedbackthat's QR code forfeedback thank you[Music]oh yeah thank you very much for thepresentation just one question regardingum is there going to be support forclient uh certificate as well with HTTPSencryption is it going to be both way orjust server side that's question one andquestion two regarding the AWS supportfor um for sources are we is there goingto be support forIRSA so for TLS we don't have plans forlike mutual TLS so also client but youknow feel free to open an issue we justdidn't have the ask uh for uh AWS alsothere was no ask but so I I don't knowthere is no yet planOkay any more question oh yeah uh thankthank you much for your presentation agreat talk um I'm working as a platformengineer like I need to provide Kennedyventing to multiple teams like in in theuh in in the organization i'm um and Iunderstand that like uh the brokerincrease like right now is a share akind of share within control pane righti'm wondering if is there any plan tosupport you know like multi-tendencyokay broker in grace like I mean tospread broker in grace to another namespace thing like that so when when youknow multi-tenency is like comp likeevery organization kind of defines itdifferently the way like currentlyeventing is like designed is likesimilar to how you would consume cubeyou know you have an API server that isshared and that's also what isimplementing the logic behind the brokerthe broker is just a logicalrepresentation there is no power runningbehind the broker usually at least inthe like the multi- the sharedimplementationsokay so that's the idea it's kind ofdepending on like there are someorganizations that say okaymulti-tenency for me means a powerrunning in a specific name space becauseI'm going to segregate that with likenetwork policies for example that is notgoing to work with eventing but if youif you combine with the securityfeatures that we have with like allowsame name space you're going to have apretty good solution like around atleast being secure segregated now we cantalk about like the shared data planethat we have there is no like properlet's say um multi-tenant segregation ina way that you know one tenant canoverload the data plane and it goes downyou know it's similar to cube you knowso there are some solutions um we alsohave if some of the native featuresdon't work and you use service mesh forexample some sort of service mesh youcan also integrate with that and we havenative integration withto because that'sthe most software we have but you canalso do something interesting in thereis it depends on like how you want tooffer that but it's possible we wedefinitely have people using you knoweventing as a shared uh there are likealso CNCF case studies talking aboutthisokay thank you for for thepresentation for the authorization partdo you support uh label basedauthorization like uh networkpolicies for example uh can we acceptonly from the same name space but fromsome sanders with label A or label B doyou see what I mean yeah so we we don'twe don't have that u when you when yousee when you say label A and label Blabel on what in thesender sender being an event source yesso no there is no because the way itworks is basically we create serviceaccounts Kubernetes service accounts andthey live in a specific name space uh sothere is no labeling also becauseeverybody can kind of add labels to tothose so it's it's kind of I'm not surelike what exactly that entails in termsof like the threat model you know of thesystem yeah okay thankyouany morequestions all right thank you so much[Applause]2025-04-15 21:59:54.378898 MM��9#��cA6usWUdJMyHYthanks for joining us um today we weregoing to talk about what we have beenworking on for like the 12 18 months agoum we're going to share what we areworking on like security discoveryintegrations and job sync and also uhevent transform i'm Pangelo and I'msoftware engineer radar i also have umbe the working group lead inventing anduh here's Kristoff yeah hello andwelcome from my side as well my name isKristoff i'm also working at Redhead asa software engineer on serverless topicsand I'm also a maintainer of differentCanative eventingprojects so let's see what we have onagenda for today and what we'll show youfirst we'll give you a quick update onthe latest eventing security featuresincluding support for transportencryption andauthorization afterwards um we show youthe latest updates on eventingintegrations how we for example canconnect to third party services like AWSSQS now natively with eventingafterwards Panchul will give us anupdate and introduction on eventdiscovery and the new chopsync featureand finally we'll give you an overviewabout the brand new event transformwhich helps which help which helps youto kind of reshape your events and whichis an often asked feature from your siteokay let's first start with security incanative eventingby default canative eventing does notenforce security on event deliverymeaning traffic is unencryptedunauthorized andunauthenticated and we all know thisposes risks in not only productionenvironments and enterprises need secureevent- drivenarchitectures and to address those needsKative Eventingprovides in its or provided in itslatest or recent versions three new keysecurity features and also transportencryption via TLS through HTTPendpoints to make sure event traffic isencrypted then authentication via OpenID connect to allow event consumers toverify who sent an event and finallyauthorization via event policies whichplays together withauthentication to allow you to restrictwho can send events to yourresources okay let's start withtransport encryption as said by defaultKative eventing does notum enforce security or does not encryptum events and we all know that as soonas somebody has access to the networkthey could potentially intercept thoseevents and this poses risks and byproviding HTTPS endpoints foraddressables we now can or addressablesnow accept TLS encrypted and events andthis is doneon two sides first addressables providetheir HTTPS endpoints or provide anHTTPS endpoint and depending on thetransport encryption feature flag configconfiguration they provide only an HTTPSendpoint when the configuration is settostrict if the transport encryption modeis set to permissive they provide anHTTP and an HTTPS endpoints or both andwhen it's disabled of course then wefall back to the current behavior whatis providing only an HTTPendpoint and event producers on theother hand they prefer sending events tothe HTTPSendpoint and for the certificategeneration for the HTTPS endpoints weuse cluster manager under the hood so noservice mesh or anything else is neededfrom yourside on the resource side this looks thefollowing um here on on uh example on abroker and addressable we have now thestatus addresses fields maybe you knowfrom before we had only the statusaddress fields but since we can since wenow can have um at least two addressesthe HTTP and the HTTPS one in case it'spermissive um it's now simply the pluralform and there we have the HTTPS addresswith the certificate and the HTTPaddressokaectly umPrometheus scrapes these exportersperiodically pulling the metrics fromthe exporters's endpoints and to makescraping easier there's service to gutdiscovery to automatically find theendpoints to scrape um for example itcan query the Kubernetes API toautomatically discover pods uh servicesand so on to um find the things exposingmetrics and then the data is uh ingestedand stored in a time series databaseTSDB for short where it can then bequeried with the Prometheus querylanguage from QL to be visualized andalerted on which is you know veryimportant for understanding your systemand for ensuring that production isn'ton fireso the latest version of Prometheus 3.0was released in November last year uhseven years after the 2.0 release somajor developments include a new UI LTEcompatibility a new ingestion protocolnative histograms and some breakingchanges to improve querying and also toremove tech debt and we'll be coveringthese areas in the talk five months onthere continues to be additional updatesand a few minor version bumps um so inthis talk we'll also go through some ofthe things that have happened since 3.0as well as looking forward to theexciting things happening in thefuture so with that nice refresher onPrometheus out of the way let's talkabout um one of the most discernablechanges introduced in V3 which is thenew UI and yeah as you can see it isbeautiful you can stare at it for hoursum which makes for a much nicer on callexperience for everyone so this improvedlook and feel is the result of a largestream of work undertaken by Ulius lastyear to um modernize the Prometheus UImaking it less cluttered more consistentand just a lot more easier to navigateum to share a few technical detailsbehind the UI um it has been completelyrewritten using React along with themantine UI framework replacing the oldBootstrap based setup um and so this newstack is much more uh developer friendlyand modern and it also brings in somepowerful features um adapted from theupstream prom lens project making theoverall experience much more umintuitive so you can see uh in the newUI a lot of the query panel and theglobal options are now organized neatlyand um sort of hidden away making it umuch less cluttered and it also sports anew minimalist uh color scheme for thealerts and targets pages instead of whatwas there previously which were sort oflike jarring bars ofcolors um another one of the standoutfeatures is um adapted from prom lensit's the new uh query explanation viewand so it's powered by a brand new umPrometheus API endpoint that um returnsthe abstract syntax tree of your promquery and this view breaks down thatquery tree visually and enriches it withdocumentation in line so that um youknow exactly what you're querying andwhat why you're querying um and we allknow um joins are also super confusingat times so this query explanation viewum helps you too by allowing you to umsort of visualize exactly how you arejoining series together by describingthe match groups on both the left andright hand side um of all of your binaryum operatorsnow Promill syntax errors also become uhkind of easy to spot and fix with thisexplanation view as it points out theissues um with detailed descriptions onof every part of your queryand finally um you also get a completelynew and revamped uh query builder UIthat not only helps you discover yourmetrics and all of the metadataassociated with it but also helps youselect the appropriate labels that youwant and build matches directly from theUI itself to build yourquery and there's still a lot of workbeing done on the UI um we have alreadyhad a bunch of work um done on the graphviews such as vertical grid lines and umclipboard functionality um there arestill in progress work items uh toimplement UTF8 um query autocompcompletion as well as add the ability torender um heat maps and nativehistograms um and we would also love tohear feedback from the community on howusing the new UI fields and what theywould like to um modify or changeum yeah next so let's talk about nativehistograms now so historically if youwanted to measure things like thelatency of requests in Prometheus you'duse classic histograms this is whereeach bucket in the histogram isrepresented by a separate sample with alabel defining the bucket boundary andadditional count and some metrics umsamples too each of these samples isjust a simple float valuea new feature uh native histogramsstores the entire distribution so allthe bucket counts the sum and the numberof observations within a single complexsampletype so native histograms are moreefficient and cheaper to store thanclassic histograms unlike classichistograms where you have to manuallydefine bucket boundaries nativehistograms handle this automatically uhwith preset boundaries based onexponential growth so you avoid possiblychoosing inappropriate bucket sizes foryour data uh you're also able toconfigure the accuracy of nativehistograms if youwant so this shows an example of queringnative histograms via the new PrometheusUI where this graph shows the bucketcounts of native histograms and you cansee the bucket boundaries in the umtablethere and this shows quoting for the50th percentile of a native histogramnote that there are some slight uhchanges to querying for native versusclassic histograms you don't need togroup by the le label when querying fornative histogram forexample so currently native histogramsare still experimental but it'sbasically stable and there shouldn't beany like big breaking changes since 3.0there's been additional work likeimproving how functions aggregate nativehistograms and still to do is finalizingthe text format and some additionalrefinements to edge casesthere's also a heat map UI in progressfor better visualization of thedistribution ofhistograms lastly custom buckets supportfor native histograms is in progresswhile native histograms automaticallydefine bucket boundaries sometimes it'suseful to have the manual control bucketboundaries instead so custom bucketsextends the native histogram format tosupport this case so you can get theefficiency of native histograms with theexplicit bucket boundary configurationof classichistograms these will also make iteasier to migrate from classichistograms to native ones so withscraping able to automatically convertclassic histograms to native ones umwith custom buckets and as a heads upthough this is feature is still veryexperimental at the momentif you're interesting if there is adraft native histogram specificationthat goes into a lot more detail aboutnative histograms and how to use themand how to configure than I havehere um up next let's talk about one ofmy favorite topics which is remote write2.0 so if you're not really familiarwith um Prometheus remote write it mightbe good to first start by explainingwhat remote write 1.0 is or was it is aremote storage protocol used by umPrometheus compatible senders to sendrealtime script metrics um to remotewrite receivers who can then ingest thatin some Prometheus format um time seriesdatabase and so the protocol sendssnappy encoded protobuff messages overHTTP with um backoff retry semanticsbased on the receivers's um responsecodes um Prometheus could send andreceive it and other um scalablemonitoring um tools like Tanos CortexMimmer are capable of ingesting thatdata and writing it to their own umrespective TSDBs and this protocol hasbeen widely adopted not only by thesetools but also by vendors as well andhas proven to be very useful for umreal-time metricstreaming the protoraph definitions forremote write 1.0 into are pretty simpleand look something like this um and sothe samples being uh time stamp andfloat value pairs that are associatedwith a set of labels you can optionallyalso include some metadata in separaterequests and the protoraphs by the wayare also available via um the buffschema registry which that QR code isfor now remote write 1.0 is immensely umsuccessful but it really could be betterin terms of network bandwidth um andefficiency as well as support for uh newPrometheus features and maybe also leaveroom for uh changes in the long run andthe remote right 2.0 iteration doesexactly uh that so as the newspecification dictates we have quite afew changes over 1.2 it's still umprotobuff over HTTP but um remote 2.2 touh now sends data by string interningall the various label data as well asthe metadata which vastly reduces um theactual request size and the networkbandwidth costs the spec now alsoincludes fields for the um latest umPrometheus features such as uh nativehistograms and created timestamps andeven supports um sending relevantmetadata directly alongside the timeseries furthermore we now have explicitpartial write handling directly withinthe spec so receivers can respond withthe exact partial write error statisticsfor different types of samples in thatum remote request using HTTP headers andsenders can use this uh can use theseresponse headers to basically log errorsand retryaccordingly as you can see in the newremote right 2.0 into protobuff um thereis a symbols field in that request whichis essentially a symbols table ummapping all of the labels including umetadata strings and the rest of thefields can then refer to the indices ofthat symbol table um thereby leading tojust much smaller um request bodies andum yes this pro is also available in thebuff schema registry now supportingremote write 2.0 on your existing 1.0receiver does not really mean that youwould need a new endpoint um as thisuses HTTP um content negotiation usingthe content type header so which wouldbasically contain the fully qualifiedname of the protobuff being used and thereceiver can use that to determine theversion of the remote write request umwe have also been hard at work trying tomake this much easier to adopt in thelong run and to that end we have a newexperimental library built directly intothe Prometheus client Golang library umwhich kind of allows you to construct umremote write 2.0 compatible senders andreceivers easily without having tomanually ensure compliance to thecomplicated spec the idea is forprojects including Prometheus and otherdownstream such as Thanos Cortex andMemory to use this library and supportremote write v2 as per the benchmarkingdone by Bartk and Callum the initialauthors of remote write two remote 2.0during the implementation and their promtalk uh you can see the efficiency gainsum you would have by switching to remoterate v2 as well uh both on the bandwidthside as well as on the serialization umlevels and we are only getting startedwith this with the growing adoption ofremote write v2 the next step would beto declare it um non-experimentalthere's also some work going intoactually sending created timestamps viaremote write in Prometheus as well asaddressing certain efficiency issueswith metadata and maybe seeing how thatcan evolve with type and unit metadatalabels and the purpose of having thisnew um flexible protocol is so that wecan experiment more with variouscompressions and maybe even formats likeApache Arrow in thefuture with open telemetry becomingincreasingly important um we've beenactively um committed to ensuringPrometheus is a great backend forhotel in addition to scraping a remotewrite Prometheus now allows you todirectly push metrics in the OTLP formatto the OTLP v1 metrics endpointprometheus will then automaticallytranslate these OTLP metrics into itsnative Prometheus format there's a guideon how to configure this best practicesand information on resource attributehandling on the Prometheus websitelinked in this slide and one thing we'veseen is that out of order tends to be alot more common for data received viathe OTLP endpoint and as part of 3.0 outof order suggest um ingestion inPrometheus was marked asstable also excitingly with 3.0 UTF8support was enabled by defaultpreviously Prometheus metric and labelnames were limited to letters numberscolons and underscores but with recentchanges we now support any UTF8character this makes integrating withhotel nicer uh metrics can now preservedots for example which are commonly usedin hotel metric names and labels ratherthan replacing them with um underscoresso there's a translation strategysetting to configure this um UTF8support of course also means emoji whichis personally the more exciting use casefor myself um note that there is newsyntax for UTF8 uh metric and labelnames metric names with UTF8 um have togo inside the curly braces and metricand label names with these charactershave to bequoted so UTF8 support is still beingrefined as not all tooling supports itor supports it consistently so forexample we are looking for volunteers toadd GTF8 support to the Prometheus Rubyclient and um there's also additionalwork and fine-tuning with OTEL um wenoticed for example recently that withUTF8 turned on the hotel collector wouldbe exposing metric names withunderscores but label names with thedots um and next there's currently aproposal for type adding type and unitof a metric as metadata right now hoteltype and unit are just shoved into themetric name which uh isn't somethingthat everyone likes um and having unitand type C treat especially couldimprove the user experience and allowfor more typwarefunctions to more fully support OTmetric ingestion we're also looking intosupporting delta temporarity inPrometheus so currently there's anexperimental option to convert deltametrics pushed through the OTLP endpointinto cumulative ones to be stored inPrometheus but we're also progressingwith another option that avoids thecurrent version altogether so nativedelta support in other wordsum we've gone through the major featuresfor 3.0 so just to wrap it up we havethis screenshot of an OTEL exponentialhistogram that was ingested via the OTLPendpoint and converted into a Prometheusnative histogram with the dots in thename preserved and of course queried onthe newUI and now we get to the final part ofthe V3 update the breakingchanges with the new major version wesaw an opportunity to make some breakingchanges and clean up Prometheus a bitmost of these changes were focused onhousekeeping removing some old flags andtech debt and also making a fewrefinements toPromQL i could go through everything uhline by line um but that's probably likenot the best uh use of time for everyoneum there is a migration guide that goesinto more detail about everything ifyou're looking to upgrade from V2 thoughso yeah don't worry about reading allthat teeny text up there um yeah I dohave a couple of key call outs thoughbased on issues we've seen people hitwith the update toV3 so the first thing is that range andsubquery selectors are now left openpreviously the sample at the start timestamp was included in the range and nowit isn't so in subqueries where therange and the resolution are the same uhV2 will usually return two points whileV3 only returns one so this can causeissues if the subquery is um wrapped ina function that needs two points likerate uh suddenly you might just not seeanydata the second thing is when scrapingum the implicit fallback to thePrometheus text format has been removedso previously if the content type fromthe target was missing or unrecognizedPrometheus would assume it was using thestandard Prometheus text format howeverthis could lead to metrics in the openmetrics formats to be misped uh withouterroring so therefore V3 will actuallyerror if the format is unclear um we'veseen a few instances where users starthitting this error when upgrading to V3and the ideal fix is to set the rightcontent type but there's also a fallbackscrape protocol setting you canexplicitly configure in Prometheus to goback to the previous behaviorso we also have some interestingcommunity and governance updates um thatmight have been shared a few times butit's always good to reiterate thesethings um so in case you missed it umthe Prometheus team recently agreed toadd 22 new members including the both ofus um a lot of them have beeninstrumental in getting uh V3 out thedoor some are working on things in theecosystem and have been working hard touh make sure that using Prometheus is aholistically enjoyable experience foreveryoneinvolved and we are also updating ourgovernance model a bit so instead ofadding team members rarely and having uhto gain consensus for all smalldecisions um from that many members umwe would have a much smaller steeringcommittee that would make crucialdecisions about the future of thePrometheus project and it will be acommittee that people can uh get electedto as well as be rotated out of umalongside that we are also introducing acontributor uh ladder framework torecognize um key contributors within thecommunity and get them involved muchquicker um so if you have ever gotten aPR merch in any Prometheus repocongratulations you are now acontributor and based on um further uhcontributions going from member tomaintainer is also fully possible andjust a much less uh daunting experiencethan what was there previouslyand if that excites you um and youreally would love to start contributingum and getting more involved well wehave a lot of future work ahead um sobit of a nice segue there we have a lotum going on in our community um and withseveral different changes underway andwe would like you to be a part of thatas well especially if those changes canhelp you in your context um we haveinitiatives to support open telemetryand OTLP signals in a way which makesPrometheus the best backend for OTLPmetrics um the work going on thereinvolves tricky concepts around deltatemporality resource attributes evencreated time stamps and so much more andone of the key features to enable umOTLP which was UTF8 now needs efforts onevery single um Prometheus clientlibrary we have proposals for addingsecret providers to Prometheus orexploring using type and units as actuallabels instead of metadata and there's aton of work going on behind the scenesfor native histograms aroundimplementing custom buckets and forremote trade 2.0 as well and peopledon't stop innovating there um we noweven have a working group to explore a ta new TSTB format powered by Apacheparet and and the reason I mention allof this is to highlight that there'sroom for everyone to contribute um tothese in-depth topics and bring in theirfresh perspectives and experiences andthe easiest way to reach out to us is tobasically explore um the myriad of uhchannels we have on CNCF which are quitea lot and um each has its own topics andyou can also grab us here at uh thePrometheus booth in the project pavilionwhich is the 9A uh kiosk for morein-depthconversation so any questions[Applause][Music]regarding PromQL oh okay uh regardingPromQL were were you considering uh theimplementing the labels into the queryfor example if I have a recording ruleand I would like to have a label withits uhSLO then it would much be easier for meto just replace the um the value thethreshold with the label value insteadof you know hard- coding itso not sure I fully understood yourquestion do you mean like whether youcan add a label to recording rule or Imean uh in such case that I have arecording rule and I have an alertingthreshold and I would like the rule tovalidate as true if it extends the valueof the label were you considering suchcasesi'm not just just to understand so thisis sort of like you have the label valueas the value essentially of the theresult of the uh the metric um I I'm notsure if there's like any activediscussion on this i think it's likeit's actually a case that I was uhlooking into a bit myself or something awhile ago but it was I guess at thatpoint there wasn't like enough ofincentive for us to really look at thatso I guess it's like if there's enoughuse cases um for it then maybe it'ssomething that we we could implement butit's not Yeah actively doneoh there's a questionthere uh at the back uh well furtherdown there's a in themiddle just through therethank you yeah just a question um soboth remote and hotel uh allow to pushmetrics to Prometheus so is there any uhreason to use one over the otherum really depends on what you want tomonitor i guess if your application isinstrumented with ONET it might makesense for you to send OTLP um whereas ifit's instrumented with client Golang orsomething similar it might make sensefor you to just scrape using Prometheusor remotewrite anyoneelse goingonce goingtwice okay thank you[Applause]2025-04-15 21:59:55.129557 {{��h�:#��AKS_rGWazTiohey everyone uh glad to see you all hereat CubeCon London today hopefully youmanaged to catch some good talks alreadyand are just not too sleepy after lunchum so in this talk we will be going totake you taking you through PrometheusV3 what's new in that and talk a bitabout the future work beyond V3cool so uh hi everyone i'm Fiona i'm asoftware engineer at Grafana Labs and uhI mostly work on Prometheus and GrafanaMimir there um in Prometheus I've workedon stuff like out of order nativehistograms as well as helping tocoordinate the V3 release and I'verecently become a Prometheus team memberso uh yay for me I guess um this pictureis from the last time I was here at theExcel Center so that was a couple yearsago and I was doing a sports event andmy watch picked up that I may have beena little stressed so I'm hoping thistalk will be uh less nerve-wracking thanthatyeah and my name is Saswata Mukharji i'ma senior software engineer at Red Hatwhere I work on monitoring platformslargely based around Tanus andPrometheus i'm a maintainer of Tanas aswell and a newly minted Prometheus teammember i also help maintain certainother um CNCF adjacent go tools andlibraries and you can find me as thechasm code pretty much anywhereum so yeah um a little bit of audienceparticipation to start which we all knowand love um so for Prometheus deep divetalks um we like to do this fun exerciseum before we start so um first uh raiseyour hands if you know what Prometheusis oh very nice cool uh keep your handup if you usePrometheus okay and now keep your handup if you use Prometheus atscale awesome um one last question thennow uh put or keep your hand up if youknow what the plural of Prometheusis awesome okay cool some real expertshere I see um but yeah if you didn'tknow it's uh Promethei and there's a funtalk on why um but yeah now we know theplural is Prometheus uh what even is itjust as a bit of a recap um so it's ametricbased monitoring and alertingtoolkit created at SoundCloud in 2012and it joined the CNCF in 2016 as thesecond project to do so after umKubernetes if you're curious about thebackstory um there's a nice documentaryon it on YouTube which goes through itsinception and its journey to becomingthe widely used open source project itis today um what Prometheus offers is arich instrumentation ecosystem datacollection and storage and queryingalerting and visualizationsupport to dive a bit deeper for datacollection you have exporters whichexpose metrics in the Prometheus formatthere are exporters for a bunch ofdifferent things like databases andservers and stuff and then there's somesoftware like Kubernetes that alsoexpose Prometheus metrics dirhere we gocan you hear me okay um so some of thecaps that we've implemented over thepast year huhsorry yes kep is a Kubernetesenhancement proposal which is what isused to implement longunning big featureimplementations that are going to takelonger than just like a single PR and sothey graduate from alpha to beta to GAand uh some of the ones that we'veimplemented over the past six months toa year um is support for custom profilesfor cube kettle debug you know we havesome in the past we had specificsets of permissions and rules that youcould use for cube kettle debug um butnow we have uh the ability for admins topass in whatever they want to be able topatch the stuff in place you can stilluse the legacy tooling because againbackwards compatibility is paramountwith cube pedal but we have the abilityto specialize what you're doing for aspecific debug eventnextplease uh we also have begun to improvethe subplugins uh one of the biggestissues that we have with um the numbersof issues we get opened is the cubekettle create command uh a lot of peoplewant to use that instead of apply and alot of people get frustrated wheneveryou can't create a specific resource orthere aren't enough flags to seteverything for whatever resource hasbeen created because the proliferationof flags makes it very difficult tomaintain and supportthe subcommand um we try our best not toadd new ones um and so this is a meetingin the middle um with the community sothat they can create their own pluginsthat define these things for us nextam I doing this one too yesokay we've also added sub resourcesupport pretty tightly scoped uhso being able to have nice easy to readtable output is one of the great thingsabout cube kettle and we want this to beavailable for certain sub resourcesuh whenever we're uh getting specificthings so like for a deployment we wantto be able to get the status or thescale rather than just getting like thegeneric uh information that you getwhenever you do you run it now so wehave implemented this and uh 133 GA yeahit'll be GA and1.33 nextuh so I should have said this at thestart too we also what we have a lot oftimes at the end for questions anddiscussion so stick around for that andget your questions ready uh KEP 3895 wasinteractive delete for cube control andand I really want to tell the storybehind this one uh for years that we'vebeen working on the project uh What whenyou delete a namespace in cub uh inKubernetes it deletes everything in thenamespace when you delete all of yournamespaces maybe by passing the thewrong-all flag uh it basically wipesyour cluster uh it was never there wasnever a chance for it to ask hey are yousure you want to erase your cluster uhand so we've had lots of issues openedover the the past 10 years or so uhpeople saying hey I shot myself in thefoot uh but this was way too easy for meto do so could we please make it betterand the reality is we couldn't uh we wecan't change the default rightkubernetes has a very strict backwardscompatibility contract and one of thethings that we have contracts on ispipeline runs and so all of a sudden ifwe start prompting users to ask ifthey're sure in a CI pipeline we mayhave broken and paged a whole bunch ofpeople right so we have to be verycareful because people don't read therelease notes and they upgrade anywaysuh so uh this cap actually introduced adash I flag to cube control delete sonow you can actually opt in to deleteconfirmation uh still as a flag it's youknow you have to remember to pass theflag and so thankfully we've added a wayfor you to set this by default yesexactly and this is how the userpreferences are working and this isentirely brand new for feature that wefinally are releasing in a couple moredays weeks um it it is alpha in 133which basically means you have toexplicitly pass an environment variableto use it i'll I'll I'll quickly showyou how this works in a minute uh acouple historical notes we actuallyopened the original issue because asEddie was talking about delete we wantedto in be able to separate the userpreferences from cluster preferences fora while now and actually the originalenhancement was put together in 2022 sowe're like 3 years in um and we've triedover the the past three years to put ittogether um and thankfully Ardat who didamazing work and Jordan who actuallyreviewed and provide valuable feedbackwith regards to the API shape so that itis currently bulletproofuh as we were we will be migrating thisuh this file contents over uh severalversions alpha beta and as Marley wastalkingearlier currently we have two major uhfunctionalities embedded in it and wethink that that should handle roughly 8090% of use cases uh so my ask to eachand every single one of you here um inthis room is please please please giveit a go try it out and and see what'sgoing on whether it's uh supporting allof your use cases whether it is uhcapable of of doing whatever you want ifnot let us know uh 6 CLI Slack email uhemail us in our on our mailing list orjust pop up to one of our uh bi-weeklycalls and let us know we want to know ifit matches your requirements or or notso like I said a a quick demo and whileMach's getting that going right so thisis a separate file that's going to livein your cube directory called a QRC thathas its own configuration forpreferences and Mach is going to give ademo of that what's really nice is ifyou are a administrator for IT or youare provisioning developer laptops orjust anyone on your team you canactually like forcibly put this on theirmachine as part of a provisioningprocess to opt into safer defaults likedelete confirmation and other thingsyeah so here's a a sample um cube rcfile so normally there are two optionshow you can embed the cube rc it will beexplicitly called qrc and it will beplaced in your cube uh directoryalternatively it can be passed through aflag as I was mentioning we have eitherthe ability to define aliases so thefirst example it basically creates analias to a get command which explicitlyruns the get pots and it passes in uh aspecific namespace and a selector i hopethat the the data and and the way we arepresenting those information it ispretty verbose but I'm but we're alsohoping that it is ratherself-explanatory the second example uhbecause the only difference between thetwo examples is how we are um adding thearguments whether it's before or afterthe arguments that you're going to passi'll I'll show exact uh examples aswe're going and at the bottom the thelast two examples is basically overridesso the last example is what Eddie wastalking about forcing the lead to bealways interactive the the one before weare basically switching ourselves toalways use the server side apply so letme quickly show you how this thingworks oh I wanted to do something else iwanted to do getstoreclose okay I'm going to start with thesecond example so uh in this particularone I'm actually appending uh appendingthe the name of the resource and thenalso adding the the name space and I'mtrying to get um a service of that namebut there is no uh service but if Iexplicitly ask for a deployment for thisthing uh then I'm going to getuh deploy of courseuh so I'm getting a deploy so forexample one can create all the resourceswith a particular name and then use thatuh to be able to retrieve uh specificresources but the other one was getstorepot and that will return us uh the potsmatching the the explicit uh request aswe were talking about for the applythat's prettyself-explanatory uh we are going to beserver side applied so we are actuallyinvoking uh the defaulting for theserver side normally you would just onlyuh you would only see that the pod wasuh just applied and not server sideapplied in a similar way if we do uh adelete yes we are getting uh theinvocation about are you sure you wantto remove it which I will uh I willspecifically call out the delete all allbecause there is an all that is not allso please stop using all uh which is analias for resources because the allresources is not all resources in cubeit's a it's all it was all I don't knowlike in the first year maybe and quicklyyeah it it it stopped being all but theflag all basically means it it doesn'teven ask so whenever you're usingwhether that's uh label selectors ormultiple resources when removing youwant to make sure that you are makingsure that whatever you're trying toremove is the thing that you're tryingto removeuh just a couple future things thatwe're thinking about that we would lovehelp uh involvement come hang out withus come give us your feedback use thesethings uh we want to figure out how dowe support multiple cube configs in agreat way uh while we do have contextsthat are are great for you know takinguh different uh configurations from thesame cube config we want to figure outhow do we kind of wrap that and extractthat out another layer where you mightactually have different cube configsthat aren't just contexts so we'relooking for some what you want at a userexperience for that uh we also want todo a new command like apply that isserver side by default uh you should beusing serverside apply it's a greatfeature that uh solves a lot of problemsand will stop you from hurting yourselfand we are looking for uh for namingbecause what we currently struggle withis what we should call it yes and wehave multiple examples the examples arebad we've all agreed that actuate is nota good example for apply v2 also wedon't want to roll anything like applyv2 because that that sucks as well so ifyou have a great example for what theserver side apply name of the commandshould be let us know we are very opento uh to suggestions yeah or any otherthoughts about what you like the commandto look like or the shape beyond justserverside apply again this is one ofthose things that we can't just changethe default behavior because it'll breaka whole bunch of people uh JSON paththat's in cube control folks have usedit before you may realize that it's notactual JSON path uh it's a very smallsubset uh and also what you think isJSON path like uh the length function isactually not part of the spec i could goon about this but we don't have a a fullJSON path implementation that you thinkyou'd want to use so we're looking atwhether or not we should improve that uhor if we should be looking at somealternatives like cell no one groaned somaybe sell yeah uh and then we haveanother feature that we want to work onuh for CRI native container copy rightnow if you use cube control copy youhave to have a tar binary locally and inyour container which stinks uh but thethe reason for that is we've had threeCVEes around we have way too many CVSaround this and that led basically to usseverely limiting the functionality ofthe copy if you remember early days ofcube it actually allowed you to copymultiple files currently we settled on asingle file because we had I rememberworking on three and there were a coupleothers and it was like literally oneafter another we we finished solving oneand it was like oh but you actually cando something else with that and uh Ibelieve that lasted for like a year andthen we even considered droppingentirely cube cuddle copy uh but due tobackwards compatibility where we're likeokay we have to keep it where we justneed to make it secure enough with thecurrent limitation and ideally talk withthe CRI folks and try to add the copyfunctionality natively into Dockercontainerd cryo and whatever otherimplementation exists out there yeah andif if you've opened feature requests orbugs about it and we've closed them andsaid no sorry uh it's because we're allafraid of introducing more CVEes and noone wants to be responsible for that soit's kind of just stuck and frozen uhbut hopefully this way that we push theAPI directly into the CRI so it operatesat the node and cublet level we can doit in a much more secure way where pathtraversal isn't as as scary of a thinguh and last but not least Marleymentioned this earlier we have so manyflags in cubic control for every singlecommand uh we actually had to rewritehow the help text prints out inside ofCobra to make it readable in the firstplace so we're trying very hard to notbe introducing more flags which isreally hard for a CLI tool especially a10-year-old one um so if we say no toflags that's why we we need to find abetter way to make the help textreadable uh and last but not least uhthis is where you can join us uh we haveuh meetings every Wednesday with thetimes that are listed here um we domonthly bug scrubs uh for cube controluh we also do customize uh in those aswell so come hang out with us there youcan find us on Slack our mailing list uhand then this QR code if you scan thatyou can give us a review so they keepletting us have maintainer sessionsbecause it lets us come to CubeCon whichis awesome so give us a good review andI think we have 12 minutes for questionsyeah quick note also the the mailinglist if if you've watched our sessionspreviously the mailing list wasdifferent that has changed as of end oflast year uh the last list is stillthere but we're asking everyone uh tostart using and switch over everyone allthe previous users were automaticallymigrated over to new one we are still uhmonitoring the old list but every singleuh post on the old list has to gothrough our moderation so the less youpost and use the old list the better forus because that means less work uh thenew list is under the entire uhKubernetes orc uh Google managed accountwhich makes it so much simpler for uhfor us and also for the entire org incase of different things happening andmaintainers switching emails and soforth that we've struggled in the pastso I believe that's all we wanted tocover uh do you have any questionssuggestions for comments namesuggestions for new commands orsomething along those lines i'm positivewe want to hear about it do we have amic for the audience weofs Thankyou thank you hello thank you i have aquestion about the state of thecommunity around customize because I sawquite many uh PRs rejected or uh closedwithout any reaction so how many minutesuh there areuh so currently Yugo is our workhorsewhen it comes to customizeuh so be mindful that a lot of thetopics he's handling just by himself ibelieve we have one another personcoming on and off uh from time to timeand helping with customized triaging theissues but if you're interested in um insupporting our maintainers or eventuallybecoming one of the customizedmaintainers uh we welcome you uh it'sbest to talk with you go on Slack that'sthe best way to interact and startworking through uh through the backlogmake sure that the issues are properlydescribed eventually if you have anyissues or something and you can uh learnon the job and eventually raise throughthe ranks and become a co-maintainer foruh for customize yeah and a greatquestion thank you for asking it uh thisgoes for everyone everywhere we're allvolunteers we don't get paid to work onKubernetes uh the last lead of customizeactually got laid off and then decidedto go back to school so we lost her souh talk to your employers see if theycan give you open-source contributiontime and uh come hang out withusnext next question I mean there wasone we have lost a lot of customiz uhcontributors over the past year soplease I'm afraid I've got I've gotanother customized question um there wasthe there was a mention on there aboutuh better integration with Helm what hascustomized got to do with Helm what doesthat mean so any more backgrounduh souh can I say again um so what what whatdoes it mean to integrate customize withHelm better yeah yeah uh currentlycustomizeexecuting her binary on your computer sothat'sthe go exto a custom is executing her binary foryour computer so but that is a thatmean customer pass for your hergenerator transform uh her configurationtouh her to her generator command withputting a parameter toarguments so I plan to uh usinga importing a herm code for customize soI'm stop using a executing him binarydue to asecurity problemand had tomaintain that pass and building a hermis hardokay yeah thank you yeah and we've runinto this for some other stuff too a lotof the Helm uh code is not exported sowe actually can't consume it as librarycode which is why it relies on thebinary being on the system so we'reworking with the Helm maintainers tomake this different for Helm V4 uh sohopefully Hugo can integrate it muchbetter[Music]next questionhow much time do we havehello okay i question about uh CRDsupport in customize especially you canyou can add the open APIuh with a link but that doesn't seem tobe aligned with the API serveruh CRD properties that are used forserver side apply is there any plan tomerge that so as a maintainer of a CRDwe could have the same propertiesdescribing the merge type and the mergekeys you understand thequestionuhcustomize don't allow to communicate forAPI sub to when customized build so Iuh I think uh customize somesubcommand can export open API from QAPserver so please consider to use thatfunctionokaydo we do discovery for customize uh thethe way the customize is currentlyconstructed is that we basically embedthe entire open API the cube API openAPIum it will be nice for sure in the longrun to be able to and probably read thelive uh open API schema from a clusterand rely on that rather thanembeddingum but I don't think we have thecapacity to work on it if that would besomething that you'd be interested inhaving I'm pretty sure that we alwayswelcome contributions and customize toto expand the the piece of code thatcurrently embeds or hard codes the openAPI and replace it with um with thediscover API especially that you couldprobably just reuse a bunch of client uhclient Go code which does it or wealready embed that in in cubectl sotheoretically you could replace that iam I'm not 100% sure how the the readingpart looks like but probably that isalso easily swappable for something alittle bit more dynamic i'm not sayingthat we always have to to read it liveevery single time but something that issimilar what we currently do in cubecuddle where we if I remember correctlythe current TTL is about 6hoursish um and you have toexplicitly tell cube to refresh it whichit does when you request API resourcesand API versions if I remember correctlyin all other cases you basically justhave to um clean the uh cube cachedirectory which will trigger the uh therefresh i hope that answers yourquestion i'm I'm I'm fully aware thatnot100% but um yes we're we're definitelywelcome welcoming contributions on thatfrontuh a question about pruning resourcesespecially new version of pruningresources or part of questions first canI just enable the new version through uhkuberc and the second one is there anyplans to improve uh pruning uh ormultiple name spaces when I need tocreate a custom resource then like useit as a owning resource etc soum the person that was helping actuallyus with the customize and when she gotlaid off she also in in cooperation witha different person different contributorthat also got shifted to different tasksuh they started working on a new uhpruningmechanism um such that we actually canmore reliably prune the resources applythe resources and prune the resources ibelieve that functionality currentlyexists inalpha i talked with uh with them lastCubeCon and this CubeCon there are someuh missing bits that we wrote down inthe enhancements i can't rememberspecifically the number of the cap butif you go through six CLI caps there isone it should be at alpha so you cantest it already uh through environmentvariable there are some missing bits iremember we we merged prior to thisparticular to this uh code freeze thatwas uh before this CubeCon a couple ofminor fixes that will get us uh open usthe path to enabling this by defaultwhich I'm hoping will happen in134 so yes it's a it's a pretty bigtopic we want to move this forward umI'm very hopeful that we will be able topush this forward towards the end ofthis yearyeah I don't think we have time for anymore but thank you all for coming uh andagain if you can't tell we need lots ofhelp uh so talk to your employersespecially if these are tools that youdepend and rely on uh there's no oneelse to work on them so uh come join uswe're we're very friendly people comecrash our meetings you can just show upin the Zoom meeting and stick around andthere's plenty of opportunities that'llcome up to get involved so thank you2025-04-15 21:59:55.895386 u��G�<#��EAbpsclYlGl2sall right we have 30 minutes only so Iwant to start thepresentation one more time welcomeeveryone uh to presentation so uh titleis mastering efficiency in Argo CD anduh we will be talking about how to whatdoes it take to scale Argo CD and howmuch does it cost and I will share withyou you know tips how to reduce yourcloud bill uh before I start quickintroduction so my name is uh AlexanderMatensv don't try to pronounce my lastname uh so I'm one of the Argo projectco-creators i've been working on thisproject for like seven or eight yearsand I'm also co-founder of companycalled Acuity where we run Argo CD forother companies and in the last severalyears we learned several lessons uhabout uh how much money we have to spendto run Argo CD at scale and I'm going toshare this learnings with you andhopefully help you to save some moneyfor yourorganization um here is today's agendait's kind of simple first I want tobasically convince you that this topicwas a conversation so we all know thatif you use Argo CD you know it's cheapand very efficient and until you have torun it at a really large scale and so uhit can get expensive at some point and Iwant to share with you some numbers ofwhat expensive really means so you candecide if it's time for you to investyou know uh some effort into reducingthe build and optimizing and I'm goingto share with you some lessons we'velearned and solutions that you can applyto reduce cost of your ArgoCD umand now it's time to start um so I dowant to start from the statement thatArgo CD is you know it's not going tocost you a lot from the beginning so oneof the most common ways to run Argo CDis to install it into the cluster thatyou plan to manage and uh this is thescreenshot of freshly installed Argo CDon a small cluster and as you can see wehave quite a lot of pods but combinedthey use a fraction of CPU and maybe 10��n�;#��AKQBz7nwWxUEhello thank you for joining us uh myname is Eddie Zaneski i'm joined byMarley Salazar from the United Stateswe're joined by Mache Schulik fromPoland uh and Yugo Kobayashi from Japanuh and so we're here to talk aboutwhat's new with cube control andcustomize and more importantly how youcan help and get involved uh real quickwe are the CLI the SIG CLI leads anyonefamiliar with what a SIG is so SIGs arethe special interest groups they're theparts of the uh Kubernetes project thatown different parts of the codebase uhso we own all the CLI tooling so cubecontrol customize a whole bunch of otherstuff that you've used but don't knowexists uh so it's just the SIGs are allopen meetings there's a SIG for prettymuch every part of the project that youcan just volunteer and get involved withand open source is awesome uh so I'mgoing to hand off to Yugo to talk aboutcustomize hi hi i'm a mentor ofcustomize i explain current status andfuture plan of customize so uh customiztool that'scustomizable and bundledquestion uhsecond we discussed uh remove customizefrom question in that proposal and thatis updated uh in conclusion we decidedto to not to remove customized from cushuh due to all desire to support to theexting pro and back comparity is ourpriority uh nextslide uh we are a customer projectcurrently working on three plans sofirst we consider planning to supportplug-in that is like a current edcontainer care plugins so second we areimplementing the release automation forcustomized binary uh that will enable tomore frequency release for customize uhs we want to improve her support incustomized uh that is okayhit the just hit the button t0megabytes of memory couple of those podsare even optional such as DEX andnotifications controller you can don'teven run them if you don't need to anduh there is a diagram that shows howthose components interact and as you cansee there's basically four of them thatyou would need to install and pay forand so uh generally Argo CD won't costyou anything because highly likely itwill just land on some nodes inside ofyour cluster that have spare memory andCPU and Argo will apply a little bit ofpressure on Kubernetes API server ofyour cluster but it's really hard totranslate it to money so basically it'skind of free uh and so it's free untilyou decide to you know uh use Argo CDmore and use one of the most appealingfeatures of it uh which is calledmulticluster management and so once youtry to manage multiple clusters youwould have to change a bunch of thingsum and kind of high level you would needto create a centralized control planewhich is used usually by multiple uhengineers in your organization to managea lot of infrastructure and there is avery important uh new requirement hereis that this control plane has to behighly available and uh it's not veryspecific to Argo CD but basically if youhave any application that has to beavailable all the time and it has tosurvive upgrades without downtime youwould have to start you know runmultiple instances of everything and uhthis is the same diagram that you justsaw in the previous slide in a so-calledHA mode so if you're not aware Argo CDhas um in official repository twoinstallation manifests and we highlyrecommend to use HA version of manifestsif you install Argo CD in production anduh the result is you would have to runmultiple replicas of pretty much everycomponent um so there is API server reposerver that serve requests we needmultiple replicas because to have arolling update uh we have one statefulcomponent uh radius and radius turnsfrom one port into five ports one isstateful set and a J proxy that balancerequests and you might end up havingmultiple controllers if you managemultiple clusters and this transitionalso requires uh infrastructure thatwould cost you some money i kind of tryto highlight here you know things thatmaybe not as obvious um and willcontribute to your cloud bill um soobviously we need instances to run ArgoCD components that's you know obviouswhat's not obvious is we need at leastthree even if you have beefy instanceswith a lot of memory CPU we need threeof them to kind of satisfy all all podanti- affinity rules and run uh allcomponents in HA mode uh next not youknow unpleasant surprise that you mightrealize a little later uh it's not okayto reuse one of your workload clustersto host Argo CD as well you might regretabout it in the future and the reason issometimes other workloads on a clusteroverload API server and it affects ArgoCD and other way around so it's highlyrecommended to have dedicated controlplane cluster and it kind of the billstarts from several hundred dollars amonth just for HA uh API server usuallymanaged by your cloud provider plusthose three nodes that you need to havefor Argo CD workloads and so last butnot least um you are going to have topay for traffic and it's kind of notobvious because if you use AWS thosecosts will be hidden under this shy EC2other section and but it surprisinglybig like in in multiple cases thisnetwork cost is around 30% of the billso it is significant and as I will sharea little later sometimes it actually canspike and there is but there is way tomitigate the close cost closespike um so next I promise to share somenumbers and this is mostly for you todecide you know at which point it makessense to start looking at at the cost uhand so before basically we uh didn'twant to guess and we just run anexperiment to get some real numbers thatwe can then play with and so this isdescription of what we did uh we usedcreated a GCP cluster uh that we used asa control plane had it had three nodesand we just chose this node size to makesure it fit a pretty big Argo CDinstance that managed a lot of clustersand applications uh next we installedArgo CD on it and deployed 800applications across three managedclusters and we did not count cost ofthose managed clusters because thoserepresent your workload clusters and uhthis is the result this slide is mostlyto you know prove that uh we did run itwe kept it for like almost a week anduh later on we kind of took the mostimportant numbers and we've got this umso a long story short the combined costof that uh whole setup was around $40 aday and so if you multiply it by numberof days you will see that if we kept itrunning for a year we would have to payaround$11,000 and 30% of that cost was uhnetwork traffic i think one importantdetail the traffic most of the trafficwe had to pay for was u over internetand significant part of it was u crossavail um cross availability zone trafficand so our workload clusters wereavailable to control plane over internetso you might make it cheaper if you useI don't know same VPC to connect all ofyour clustersuh and then one way to extrapolate theresults I I do realize it's very kind ofnaive way to you know to try to measurehow much it would cost you if you have abigger setup but that's basically ittracks very precisely with what we'veseen from our experience so uh you canbasically using this simple math you cancalculate that in our experiment atleast one Argo CG application cost us$1.49$49 and so this table is I don't mean toscare you but it kind of shows you thelevel of a problem if you're trying tomanage 3,000 applications you might haveto spend 50K a year on a control planeand for 6,000 I didn't even want to putthis scary six-digit number uh yeah butmy main goal here is to convince youthat at large scale when you manage6,000 applications that usuallytranslates to like close to 100 clustersthen control plane became quite costlyit's like comparable to a salary ofengineer and it does make sense to startworrying about reducing thecost and yeah so now I want to go to themeat of the presentation and share withyou you know what we learned you knowfrom running Argo CD for many years forvarious companies and a lot of thoselearnings uh we kind of translated backto contributions back to open source soArgo CD the latest version is usuallyhas optimizations built in and it it'scheaper to run it but some of thosechanges are not enabled by default andor requires additional configuration forlike various reasons for backwardcompatibility reasons and so I'mfocusing on these type of changes hereand it covers networking compute costand basically a suggestion for you howto reduce the cost on the dedicatedKubernetes control plane serverand okay so I want to start from thefirstum lesson we've learned the hard way ithas a short story behind it so we um atour company we run Argo CD for customersand so one of the customers suddenlystart costing us a lot of money and wehad to do some investigation to figureout what's happening and what we learnedis that we observed uh video streaminglike traffic between applicationcontroller and the control plane clusterand the I wanted to kind of highlightthat um it had nothing to do with howArgo CD is implemented it uses verytraditional approach of you know how allcontrollers in Kubernetes areimplemented so controller periodicallyupdate um CRDs that it manages in thiscase application CRDs in particular ArgoCD it send a tiny patch to everyapplication uh and patch usuallycontains bytes it's like it changes thelast reconation timestamp on anapplication so it's literally bytes uhand also controller watches all theapplication and that where this is wherethe problem uh appeared so we learnedthat for some reason our customers hadgigantic application CRDsuh basically they were close to onemegabyte which is a limit ofKubernetes and we had to understand theuse case why they were doing it and wehad to look for solution and so oh Ididn't explain why exactly 1 GB is aproblem so basically Kubernetes APIserver every time when it receive apatch request it replies with the wholeresourceuh JSON so in case of Argo CD We haveone watch and one patch request and soevery such requ est produces 2 megabytesof traffic and with several thousands ofapplications we did get to a level ofvideo streaming gigabytes uh per minuteand each gigabyte cost us a cent and sowe were paying several hundreds ofdollars per day for one customer so itwas kind of a loss for us um and so welearned from our customers that theyjust wanted to deploy Helm charts andHelm charts were sitting in a OCIregister so for them it was very easy tosimply inline Helm values into theapplication spec which is supported byArgo CD and they had a lot of valuesthat they had to supply so it wasliterally u several kilobytes hundredsof kilobytes of value files and uh ArgoCD has a nice feature uh roll back andso to facilitate roll back every timewhen you sync Argo CD store a copy ofvalue files used for a sync in a historyso we ended up havingapplication resources with severalkilobytes of value files uh stored 10times in the same C and yeah so that wasa problem and we had to to mitigate kindof the problem we had to make a smallchange in application spec so there is afeature of Argo CD that let you controlhow many versions you want to store inhistory for each application so we hadto reduce it to two and our customerluckily didn't mind because they usegithops for rbacks and uh next westarted working with them to convincethem to move those inlined values into aseparate value mml file stored in the grepository uh and another feature ofArgo CD called multiple sources kind offacilitate this transition and basicallythis is a a suggestion for you to watchand monitor the size ofapplication resources in your controlplane cluster and so if you see thatit's close to a megabyte highly likelyyou're already paying a lot of money foruh cross availability zone trafficwithin a clusterum and then next we kind of we were notreally happy because we had to work withthe end user and convince them to useapplication in a slightly different waywe started to look for a more kind ofpermanent solution and we found thatthere is a way to uh dramatically reducenumber of times Kubernetes uhapplication controller patchesapplication CRD and so um to be veryshort we introduced a settinguh which you can see here I'm not sureif you can see my mouse setting iscalled controller resource healthpersist it's in argo CD cmd param cmconfig map and what it does it uh inconfigure controller to stop storinghealth related metadata in a kubernetesuh in application CRD and that metadatais duplicated in a radius and so unlessyou build something very specific aroundargoCD application CRDs without using ArgoCD API server you probably don't evenknow this metadata exist and it's verysafe to just change this plug to trueand very likely you will reduce pressureon your API server and it will reallyreduce amount of traffic that uhcontroller sends toKubernetes uh all right I want to moveto the next suggestion it uh alsorelated to traffic and also related tolarge applications same story we had acustomer who suddenly started sendingway more data and we had to investigateyou know where is this data coming fromand what's the reason and so whathappened is we learned that Argo CD hada kind of inefficiency related toapplications that manage a lot ofKubernetesresources and so if you have anapplication that creates singledeployment but deployment has hundredsof replicas for Argo CD needs tovisualizeuh 101 resource in the user interfacebecause it kind of accounts it tries tovisualize everything including pods andso if you have such applications thenArgo CD updates a set sorry a key inradius and it stores their JSONserialized list of all those resourcesand we realiz realizedthat with the growth of number ofresources traffic grows exponentiallyand the reason is if you have a lot ofpots they change more frequently andthen every size change is bigger in sizeand that's why we had a user who wasmanaging application with like severalthousands of of resources and it wasagain kind of video streaming likesituation and so we've implemented uh anoptimization and encourage youto consider enabling it in your clusteruh so the the! setting is an environmentvariable and the setting is fairly newso there is no good default yet but fromour experience we found that it'sbeneficial to useuh this magical number 50 and so whatthis setting does is it manage itinstruct controller to split informationabout application resources into chunksand instead of storing them all in asingle radius key now we can control howmany resources go into a single keyshard that's how we call it internallyand number 50 showed the best results soit doesn't roll doesn't overload CPU socontroller doesn't use much of uh CPU tohandle the cancellations and it reallyreduce amount of data it sends totraffic um yeah and then internally wenoticed that in in these extreme caseswhere you have large applications itreduced traffic by like 10x umsignificantamount um andfinally so we found that we hadexcessive internet traffic uh fromcontroller to a managed cluster and uhit's this is kind of there is no reallyuh a good wayto reduce this traffic so there is nomagical setting in Argo CD unfortunatelybut I wanted to suggest a workar aroundhere and the reason is uh this trafficis 10 times more expensive than internaltraffic and that's why even smalleroptimization optimizations here kind oftranslate to uh good savings and so thereason for this traffic is controllerArgo application controller watchesmanaged Kubernetes clusters and it triesto catch each and every event um thathappens uh and can potentially updateaffect the state of managed applicationsand so there are known resources thatchanges very frequently and usually noone kind of care about those changes andresources are endpoint and endpointslices and so by default Argo CCD manageand watch these resources because insome cases end users wants to manageendpoints manually but if you don't wantto do it then it's recommended to use asetting called resource exclusions andtell Argo CDto just just don't watch this sourcesand then that might save you tens ofdollars per day just because it will beit's a significant source of of trafficand it just uh makes sense to do notwatch these resources unless you reallywantit and this is this is all suggestions Ihad uh about networking and next I wantto move to several lessons we learnedabout optimizingcompute cost and so one of the mostimpressive change we've made uh here ishere on the screen so as you can see wechanged one line in a radiusconfiguration and it helped us to reduceour production uh in basically almost by50% and so uh what this setting does isin HA mode Argo CD usesradius with a single master node and twoslaves configured to you know replicatechanges and so there is a defaultsetting in radius called replication byclock size which is set to half agigabyte so 500 megabytes and it makessense I guess for lots of applicationsthat's why it's a default but Argo CDuses radius as a throwaway cache and wecould not think of a single H singlesingle scenario whereuh you know inconsistent replicationmight affect anything so radius is usedonly for to visualize data in the UI andit is anyways kind of has a a little lagso it shows you state of your Kubernetescluster with a few seconds delay so ifit's delayed by another fraction of asecond nothing really nothing bad isgoing to happen and so we changed it inall our instances to uh 60 mgabytes 64megabytes and reduced memory requestsand this is like a screenshot of uh thechange we've made so we used to run 100almost 150 nodes and eventually numberof nodes dropped from 150 to 80uh and we were running around 100 ArgoCD applications so this is for you tokind of if you run several dozens ofArgo CD instances for your organizationthere is a chance you can saveuh hundreds of gigabytes of memory andthis change is safe we can say it we'vebeen running this setup for severalmonths now and no complaints yet andit's saving us $2 to $300 a day which isprettysignificant um all right uh I have fewmore suggestions and we still have 7minutes to go through them so nextum another frequent complaint andfrequent consumer of uh uh compute isreposerver and so the reason is thiscomponent is kind of CPU heavy is uhit uh run for us Helm template commandsand customized build and those commandsactually take a lot of memory and CPUand uh another known problem is a reposerver is configured to by default itdoes not process multiple requests for asingle application in a single repo andthe reason is sometimes there are sideeffectsAnd uh for I would say historically justin case repo server is configured to uhonly process one request at a time andso in case you have a git repositorythat has manifests of multipleapplications every time we make a singlecommit Argo CD suddenly needs toregenerate manifests of all theapplications in that repo and it willhandle every application one at a timeand so to provide a good user experienceyou would be forced to run many manyreplicas of repo server and that's thewaste of memory and CPU and so thecommon suggestion we give to everyoneand apply it to ourselves is we enableconcurrent processing and there is twowell-known cases where it's safe uh andthat's why Argo CD has a kind of asetting to explicitly enable it one iscalled controlled by environmentvariable argo CD helm allow concurrencyand it in a in a new version 3.0 zero itwill be set to true by default if yourun I I mean it's not yet releasedthat's why you for for older version youwould need to enable it by default andwe know it's safe and it will save youit will allow you to run less replicasof repo server and save some money andanother frequent reason why where ArgoCD sometimes have to run uh manifestgeneration sequentially is related tocustomize So if youoverride images in uh customized basesettings and image is stored inapplication spec then Argo CD have tomodify files in a local clone of gitrepository and that requires to runuh manifest generation sequentially soif you simply disable if you don't usethis feature don't override images itwill provide better performance and itwill cost you less money and lastsuggestion is uh once you enableparallel generation you can use aparallel limits flag available inreposerver to control u how manymanifest generation you want to run inparallel and there is a very simple ruleof what value you want to uh supply hereuh we found that it makes sense to checkhow much CPU cores and how muchgigabytes of memory you give to uh yourrepo server instances and small smallernumber between these two is per listlimit and the reason is it just makessense to reserve one core for uh onefork exec of helm or customize andusually every invocation takes onegigabyte so that's why that's what weuse at least internallyYeah and uh last suggestion almost ontime so we foundthat reserving and giving the whole umKubernetes cluster to Argo CD is oftenkind of a waste uh and the reason isArgo CD uses Kubernetes API serversimply to store a collection ofapplications and it needs to get accessto a few secrets and the config maps andso if you provision a single Kubernetescluster for each each and every Argo CDinstance that you have it will not use alot of features thatuh production ready Kubernetes clusterprovides such as different controllersand so those there will be some waste ofresources and internally we found thatit's beneficial for us to have a singlehost cluster for multiple Argo CDs butwe configure our Argo CDs to use KS toto store metadata and uh this way it'spossible to use a single host clusterand have safely multiple Argo CDs withvarious versions using K3S clusters thatruns on the same host cluster and it didsave us money uh because we didn't tryto calculate exactly how much moneywe're saving but we are I'm pretty surewe're talking about hundreds of dollarsfor each Argo CD instanceOkay that's all I had and we almost runout of time i'm using you knowopportunity to talk about the book thatour companyuh sponsoring so one of our colleaguesand uh uh an engineer from Red Hat wrotethis book if you want to get it for freei guess yeah that's uh available withvia this uh QR code and thanks a lot ihope this section was useful andhopefully it will help you save somemoney for your company[Applause]2025-04-15 21:59:56.643715#en there could recognize the value ofthe build pack project Okay Um anotherimportant thing to notice here is thecloud native build pack will always keepa well- definfined boundary between eachlayers category right and that allowsyou to offer a feature like rebase Umand that's very powerful also And Adamwill talk a little bit about that um ina few minutes OkayUm so let's see three very quickexamples to give you a very brief ideaof what can n cloud native build packcan do for you Let's start withum uh monor repo application Um we havea backend uh written in go Then we havealso a NodeJS front end uh in TypescriptWe're going to use some paketto uh buildpacks collection uh and we're going touse to the pack cli and we're going torun just a simple command pack buildexample is going to be the name of theapplication image and then uh using thisbuilder petto builder yummy base isbasically a flag to tell pack hey pleaseuse this builder and you going to seewhy and that's important OkayUh let's seeYes Okay You can see that it's verysmall for me here But let me see if Ican pause it OkayNoOkay Sodark Okay I'm sorry Uh okay So whatwe're doing here is basically is youhave the source code I show you like uhthe structure We run pac build thereand what's gonna happen is we're gonnadetect that this is a go and a typscript application and we're going torun the build packs inside the builderthe petto builder and we're going to uhcompile everything right and it takes alittle bit Soum at the end of the whole process anapplication image is going to be pushedI believe I run this one on my local Soit's going to be saved on the duker inthe on the docker demon And basicallyit's going to be composed with all theledgers that I mentioned before We'regoing to have the run image We're goingto have the uh go dependencies thenode.js dependencies the application andeverything Once we have we have donethat what we're going to do is just torun two containers the same image thesame application image with twodifferent entry points once for the backend one for the front end Okay And thenwe're just going to go to the webbrowser open the web application and inthe bottom you will see a little messagethat message came from theuh from the back end running and go Soit was really fast uh because obviouslyI I don't want anything going wrongduring the demo but uh that's theimportant thing on the example Okay it'sthe same application imaged it wasexecuted with two different entry pointsonce to run the uh backend component andwant to run the front-end component OkayAnd the image was just built using thepack build commentRightOkay Shoot Okay Now let's go withanother demo This is a spring bootapplication a Java application Everybodyknows about it In this case what we'regoing to do is run pack build pet clinicAnd I'm going not to use the builderflag Why I'm doing in this waybasically because I just configure mypaci to use the pedto builder as thedefault builder So I don't have to youknow pass through this flag and it'sgoing to be a little more easy fordevelopers to do it What's going behindthe scenes is the same thing We're goingto inspect the code We are going to seethat it's a Java application and we'regoing to apply all the Java build packsfrom the PTO builder and produce theapplication image at the end Then we aregoing to just run it Docker run Nothingmysterious about it And then we're goingto go to the browser and hopefully youwill see the pet clinic application upand running Okay Nice So same commandpack build for two differentapplications using the same builder inthis case right now the third one it's aPython application I generated with AIbecause you know it's fun so if we wantto build it which comment we shoulduse back build right nothing weird solet's do that that's it same builderuh Uh I believe there is just oneuh flag there to say hey use this Pythonversion and that's it We just build thecode I can seeit Oh yeah it's building it Okay And atthe end of that uh we will just run itand see the uh Python application appI'm running Okay So here the importantthing is $you don't need a docker fileThe app image that are being generatedare optimized for production Uh andthat's a benefit of using the ptoo orthe herokuum uh collection of build packsAnd that's something that you don't needto care and when you're using Dockerfiles you do need to care and how tooptimize your application image This issomething that we can get from thecommunity and from the uh specificallyfrom the um Pakto teamOkaysweet So now Aiden it's gonna give youmore helpful insightsFantasticThank you Juan Oh I see the screen hasgone into horriblecontrast Unreadable on the laptopFantastic Anyone knows how to usecomputers please uh let us know how toturn up the um the contrast on this Soas one said at the start we kind ofthink that there are three differentaudiences in the room for who areinterested in build packs There'll besome of you who are application authorsSo in my day job I write applicationsI'm a I'm an enduser of build packsThere will be some of you who are actualBill Pax authors I recognize a few facesin the room Um we know and love you andthank you for everything you've donewith us Um and uh we'll talk to youfolks probably individually at the boothAnd then there's a whole section of thisaudience who are probably what we callplatform operators Um so folks whoactually build images for other teams intheir organizations Um those are thethree audience that we kind of want toupdate today on what's happened in thebuild packs world over the past yearFrom the end usersperspective I appreciate it's slightlycomp complex looking right we'rebillpacks project the cloudnativebuildax project we're at buildpax.io andwe maintain a specification and the specspecification is at that git URL uh youcan change that specification if youwant to come along to our meetings andsuggest improvements to it Uh we alsoprovide the pack and kack tools forpeople to use the bill packs However theactual implementations of the bill packscome from downstream providers So thePTO team it's an open source projectThey maintain a suite of build packs forbuilding various language uh groupsHeroku where some of this technologyoriginally came from They maintain a setof build packs that they use to buildcertain uh language groups that theysupport And then Google use build packson Google Cloud Run So the set of buildpacks that they provide are particularlyoptimized for running for buildingimages to run on Google CloudRun from the changes from end usersperspective over the past year becausethis is a maintainer track talk Um firstof all I know it's late but hands up ifyou've ever heard ofWindows I'm not joking Hands up in theair if you haven't ever heard of WindowsSome people have never heard of WindowsI I doubt that Keep your hands up thoughif you've heard of WindowsContainers right it's a smaller numberthan the people who've heard aboutWindows It doesn't surprise me uhWindows containers on Windows is is is areally interesting technology forMicrosoft to run applications in Windowson containers environment in Windows Wehave dropped support or we've deprecatedsupport for that And the reason thatwe've deprecated support for that is notbecause we don't like it It's just wehad no users Uh there have been zerousers for that over the past number ofyears If you are users who areinterested in those features please comeand talk to us because you know wehaven't removed the code yet and we havethe ability to to undeprecate supportfor that But we had no we have nosupport we have no users for it and itwas taking development time away fromother things So I'm afraid we've justdeprecated support Uh prior to last yearwe had our own magical way of providingim image annotations It worked It wasgreat But now we still support the kindof standard OCI image annotation stuffSo great if you annotate your images ina certain way it's all now supported inbuild packs If you're a longtime buildpack user for the past four or fiveyears and we've been around for about 10years So if you've been using it forfour or five years then uh you'll beaware that we deprecated stacks a fewyears ago and %we've cleaned up thedocumentation around what that means uhnow because stacks have just gone awayfrom pretty much all implementations andasof yesterdayyeah Monday Monday I can't remember whatday it is Uh as of Monday uh we now havean actual road map for the first half ofof 2025 that you can read on um our RC'sAll the features in there are indevelopment So we've only put stuff inthere that we we're fairly confidentwe're going to finish soonish But it isan open source project and people cometo and from the project at their own uhvalition So you know may not be quitethe first half of 2025 You might getdelivered in the second half a bit andbecause your end users uh or some ofyour end users at least you will beinterested in what's going on in thedownstream projects which we've justsummarized two of the change sets herefrom downstream projects So as a buildpacks user I use build packs with thePTO builder or I use build packs withthe Heroku builder to build myapplications Uh the PTO folks have somereally interesting features In the pastyear they've pulled in dependencymirrors I'm a big fan of this featureWhich means is that if I uh mirror PIIor npm or Maven internally in myorganization a lot of us are doing thatthese days because we've gotrequirements to keep control over ourdependencies So we're mirroring thoseinternally in our organizations The bogstandard build packs that I pull fromPTO now will allow me to point the billpacks at the internal mirrors So they'renot going to pull uh dependencies fromupstream And this is a big useful thingfor me And the PTO folks Juandemonstrated the PTO build packs usingtheir Jammy base builder We're slightlyold school but um they've announcedsupport for UBI base builder a rail uhimage Uh if you're a Red Hat customerthen that will probably be of interestto you But they've also announced a newAuntu corebased image uh for things Sotheir builders are going to be based onAbuntu core right up to date Um andthey've got clearer maintainer statusabout the build packs Um one of thethings that they noticed in the pastyear is that they need more maintainersSo things that have activemaintainership are now indicated quiteneatly and things that have less activemaintainership is been uh indicated Itis an open source project They'relooking for maintainers Uh they have abooth downstairs as well So talk to themThey're they're good folks to get to towork with From the Heroku folks it waseasier getting information from thembecause they brought us to dinner lastnight So I'm really happy Um but uh theynow have a really nice new shiny newbuild pack for .NET Uh they've addedpoetry support to their Pythoncollection of build packs So now if youuse poetry instead ofusing simple build or one of the otherPython backends for building then uh youthey've got your back They have a reallyreally neat um new uh build pack forinstalling Debian packages in imagesThis was something that traditionallybuild packs have had poor coverage ofand now I think we're getting to thestage where we understand the problemand we can actually give decentmechanisms for installing operatingsystem uh packages So um I haven'tlooked into this yet but I'm seriouslyintrigued and uh that's what I'm goingto be spending my weekend doing And ofcourse all the Heroku build packs thesedays are multiarchchitecture so they canbuild uh AMD 64 or ARM 64 images When Isay multiarchchitecture we're generallynot talking about Sun or AIX these daysUm but you know we we can get supportfor that if you really wantit So if you're Bill PPX end usersthat's the kind of things that havechanged over the past year If you're aBill Pax author you're probably wellaware of the changes because you'veworked with us quite closely on thingsaround spec changes Um one of the thingsthat you can rely on as a bill packsauthor is we introduced um a a a featurethat we call uh um extensions Iremembered it because it was written onthe slide Um we introduced a featurecalled extensions about a year ago maybetwo years ago at this stage Um that'sgone GA now So it's now part of thestand It's not b&ehind a feature flaganymore So as a build packs author youcan use extensions and rely on them Andthe other thing that we've changed inthe past year is now we have bettersupport for insecure registries whichsounds like an anti-feature but actuallyif you're a build packs author and youjust want to just upload stuff to youryour local registry running on localhost port 5000 then having support forinsecure registries is really neat Wellit just speeds up that development loopfor for build packsauthors But I suspect the vast majorityof you folks in the room are what wecall platform operators If you're kindof a DevOps person if you are what's thenew term we call ourselves platformengineers now don't we uh if you're ifyou're a platform engineer building aplatform if you're the kind of personwho in your company you help otherpeople build OCI images and likew I usethe term OCI image rather than Dockerimage because we tend to talk about thestandard rather than Docker's excellentimplementation but we rely on thestandard Um yeah Um so if if you're oneof those folks who helps others buildimages we would call you a platformoperator Um and we have two tools asJuan said to help you build images forfolks First one is KPAC The K standsfor yes Kubernetes You got it Of courseK is our Kubernetes operator and it'llsit there It will monitor your Gitrepositories I think it might havemercurial support as well but I've neverused it um it'll sit there it'll monitoryour your your git repositories and oncethere's a change to the main branch oryour nominated branch it will just buildan image for you and stick it in yourregistry And it's a really nice way ofbuilding images for for folks It's gotanother other bunch of other advantagesthat I won't go through today but talkto us in the booth tomorrow morning uhif you're interested in the otheradvantages of that Um we were we we wegot to announce it last year didn't wein the EU um that KPAC had released uhuh the latest KPAC version at that stagehad salsa level 3 support So if you'reinto your software supply chain securityarea which a lot of us are these daysbecause of various regulations um thisis the highest level of software supplychain security that you can achieveunder that particular um uh mechanism orthat particular specification of supplychain security So it still has that SESAlevel three compliance and you know it'sa good thing Pack is a CLI that Juandemonstrated at the start Um it doesintegrate with common workflows Soanywhere where you're currently buildingan image using other tools co jib dockerwhatever you can probably just parachutein pack I'll say probably becausenothing's easy is it um but we havedocumentation and examples around usingpack within CircleCI So if you're aCircleCI user we can give you an orb andyou can use pack to build images Or ifyou're using GitHub and if you useGitHub actions to build your images wellwe've got a GitHub action that will letyou use Pack to build images and you cantry it very easily There's very littlerisk in trying it Um GitLab of courseGitLab workflows We've got tooling andwe've got uh examples and documentationto show you how to integrate Pack withthat I have not linked Jenkins toanything because every single Jenkinsinstance I have come across in my entirelife and Jenkins is now older than me Iswear uh is that the Jenkins pipelinesin every organization are different So Ican't tell you one way of integrating itwith Jenkins Suffice to say that I havenot found a Jenkins pipeline yet that Icouldn't put Pack into So you know talkto us in the booth tomorrow please ifyou're interested in integrating PACwith your Jenkins pipelines And we'vealso got Tecton support as well So ifyou're using Tecton as as your yourbuild uh automation infrastructure PACwill just drop in there and generateimages for you So it's cheap to trygiving all the different uh workflowsthat we uh helpsupport Right so as a um DevOps team oras a PL team of platform engineers whatI want so kind of as a platform engineerwhat I want in a build system is controlover the base images I want to give uhe'ngineers the ability to experiment withnew base images But when it comes to ourproduction pipelines I want control overthat The organization wants control overthose base images And as we've kind ofhinted at you can have control over thebase images that you use You can use acustom your own base image by buildingyour own builder We have some samples inour uh we have some demo samples in ourdocumentation that explains how to dothat Please do not use our demo samplesfor anything production related Do usethe PTO or Heroku build packs forproduction stuff Our samples are therereally as sample code but they just showyou how to build your own custom builderand you could reuse the PTO or Herokubuild packs in your custom builder andmaintain that production levelquality It's straightforward to doThere's a buildertoml file Everything'seither toml or yaml these days Andinstead of using the um uh the the buildand run images that's mentioned in thatyou change it to your own build and runimages Then you use pack to rebuild thebuilder It's pretty straightforward Butthat means that I get to keep control ofthe base images that are used in myorganization Similarly I want controlover the versions of tools Again myenduser developers can experiment withwhatever tools that they want But in theproduction pipeline the organizationneeds to have control over the toolsthat we're using Need to know exactlywhat JDKs we we support in production Ineed to know exactly what Pythonruntimes we support in production Theymay be different We might not supportPython 313 I think that's the latestversion right now In production we mightonly support Python 312 And that's yourdecision as the platform engineeringteam And it might take you 6 months toadopt 313 for various operationalreasons And again build packs supportthat by again providing a builder toyour build system that would pull in theversion of the Heroku or the PTO buildpacks that support the latest toolingthat you want to support Uh you don'thave to go with the latest greateststuff You can hang back six months orchoose the policy that you want toimplementyourself Similarly I want control overthe language families that we support inproduction Um you know I like Rust butmaybe we for policy reason decide thatwe can't uh support Rust in productionright now So I would create a builderthat has my base image has the versionsof the tool chains that I want And maybeI would only allow on my builder supportfor .NET andNodeJS And the reason for that wasbecause those will be the two languagestacks that we choose to support inproduction Your organization will choosedifferent language stacks and we havethe flexibility in the framework for youto be able to choose those languagestacks and not finally but close tofinally and I've forgotten how manyslides I've got left and I've lost myphone Uh is here I've got five minutesNo way Time's flying Um what we've gotis control over the build environmentAgain I mentioned that it's common thesedays particularly in uh regulatedregulated businesses for us to uh mirrorour um dependencies internally intointernal mirrors I want a mechanism toensure that our production images alwaysonly build using those internal mirrorsand that end user developers as wellmeaning as they might be cannot refer touh other pip locations for example Andwe do this by um creatingnon-overridable environment variables onthe base image which is in my controlbecause I'm the uh platform operator inthis case So I might expose a pip indexURL that developers must use which willpoint at our internal mirror And againthere's an example in our our samplesdirectory folder as of this morning Umwhich is good So we can for otherenvironment variables that you mightwant to drop in here as a go proxy oryour npm config registry or your umMaven registry that you use internallySo what we're doing is we're buildingproduction images and we've got theflexibility to let developers dowhatever the hell whatever they want ontheir local machines But in theproduction build pipelines I controlthis and they can't overwrite pullingthings from from public registries forexample So we know what's on our imagesI've got about four minutes left Custombuild packs I'm going to skip becauseit's slightly more advanced But um theanatomy of an output image This is theoutput image that Juan generated earlieron What you can see is that the layersare small They're very well defined Thelayers are only as big as they need tobe There's a layer called node whichcontains only the NodeJS engine and is165 megs Um this has the fact that webuild these very very small layers has alot of advantages in terms of fastrebuilds It also has a bunch ofadvantages in terms of applicationsecurity For example first three layersthere are unlabeled They're from theAbuntu base image We can rebase onto anew Auntu base image without rebuildingthe image It is a registry onlyoperation Meaning that if there's a CVfound in the base image which oftenhappens uh I can upload a new base imageto our registry and I can effectivelyrewrite the manifests so that we rebaseonto that new base image Similarly ifthere's a problem with the git with thethe targets layer which is my um gobinary but there's no problems with theother layers I only re need to rebuildthat layer I don't need to rebuild theNode.js We have this fast caching stuffpretty advanced caching strategies whichdoes lead to fast rebuilds I got to slipthe slide because I've already saidwhatever I want to do there And in termsof the secure stuff if you refer back tothat past slide uh Juan lied to usearlier He didn't go into all thedetails And one of the things you cansee here is that there's an SBOMsoftware bill of materials that stuckonto all the images So all the imageshave software bills and materials whichallows us to have these small andfocused layers Because we've got Sbombswe can tell exactly when CVES arisewhich images the CVS actually affect Andit's no longer guesswork And if you'vegot the operational ability you can alsotell where the images with those CVS aredeployed on what pods in which clustersNow it's pretty advanced but it is itdoes allow us to easily remediate CVissues when they come across fastrebuilds and just redeploy theapplication quickly And of course allthis stuff KPAC ships with co-signsupport by default and pack integratesquite neatly with the Inoto toolingstuff like witness to generateprovenence certificates for for thethings that you're building And againthese are important particularly inhighly regulated environments And I'mgoing to conclude before she throws meoff the stageUm Bill Pax project is in quite a nicehealthy state at the moment We feel likewe're in control of where we're going inour own destiny which is good We've hadsmall updates to the spec over the pastyear No major uh spec breaking changesSo we're happy with that The build packsimplementations the stuff from Pquettothe stuff from Heroku stuff from Googlealthough I didn't put it up here seem tobe in a very stable and healthy stateYou know the Heroku stuff is onlygetting more features the PTO folks aresorting out their maturity model whichis brilliant Um and I hope what we'vedemonstrated as well that you can usebuild packs and the features in buildpacks to give your organization a lot ofcontrol over the workflows bycentralizing it into a DevOps orplatform engineering team But it doesn'ttake away the ability of your engineersto experiment with a maybe a devonlybuilder that that you can provide withthe latest greatest tooling on it Youcan find us on the internets at allthese places and like all CNCF opensource projects we have a weekly meetingthat you can join uh and you can controlhow the project goes Um thank you verymuch I know it's been a late one uh andwe're free to take questions now fromanyone[Applause][Applause]OkayOkay No questions So thank you guys Yeahthe uh the the pavilion booth thing ison right now I think which is why a lotof people are rushing out the door andI'm rushing down there as well becauseum we've got the build packs booth downthere So please come along and ask usquestions if if if you want to Thank youvery much2025-04-15 21:59:57.438857 ++��4�=#��AEb9AweCazi8good evening everybody Uh first of allthank you very much for staying at theend of the day It's been a long day foryou I know that and we appreciate a lotSo you can stay to the talkSo it's a pleasure for us to be here totalk a little bit about cloud nativebuild pack project especially in how youcould use it in your organization uh toquick and secure image builds to bedeployed on any Kubernetes environmentor OCI runtime environment Uh my name isJuan Buptamante I'm a platformmaintainer at the build pack project Ialso worked for the company named DBaccessed and with me is the amazingAiden Delaney which also a learningmaintainer at the project and workingfrom BloombergUm a quick agenda for today is uh we wewill be discussing a little bit uh likea 10 minutes about uh what buildpack ishow it works I will try to go through afew demos so you can see it in actionThen uh we will try to explain you alittle bit from the three main personaswe identify as the main users like enduser and applications end users or buildpack authors and platform operators Aanwill talk you about a little bitbenefits and stuff we have for them andwe will conclude with some fast rebuildsand secure image stuff Okay So let's getstarted Okay So in essence uh the cloudnative build pack project um allows youto transform your application sourcecode into a production ready OCI imagethat can be deployed on any OCI runtimeenvironment uh like Kubernetes or maybeyour Podman or Docker instance locallywhatever uh and we offer uh a referenceimplementation of Our the specificationwe also uh maintain uh theseimplementations are the paxi or the kaxiyou can use it and uh what we do is wecan take almost any programming languagelike go java python node rust and usingthis cli tool like kpac uh like pack orkpak we will produce an OCI image with avery defined and organized set of layersIt will start with the run image layerokay containing the operating systemthat you need for your application to beexecuted followed by the runtime layerof of I don't know your uh programminglanguage For example if you are runninga Java application maybe the JDK will bein that layer Um then some dependenciesit will contains all the things that youneed to run your applications andfinally the application layers we willget uh that will contain everything weget from uh building your source code Umokay all this without needed a dockerfile Okay we don't need the Docker fileto build this application imaged Andthat's very important becauseapplications that handles probablyhundreds or thousands of workloads withdifferent teams with differentdevelopment teams running differentapplications on any language A Dockerfile could be a nightmare for theplatform operators Okay people that it'sbe"*on path andwe're going to talk about what thatgateway API support for ingressengineext features are um I don't know Iwe have a lot of features we've talkedabout a lot of those features all thetime so we're going to talk about whatthat looks like from a gateway APImigration uh what the current status isof Endgate and um how you all can umcontribute to itso let's start with this little friendhere um actually the name is not comingfrom us neither from whis um but mediagave it the name ingress nightmare whatis ingress nightmarelast week we had some last week wepublished some CVEs so the mostimportant one is the one with the 98it's a critical one and it's basicallyunauaut unauthenticated remote codeexecution uh where does this come fromthere's u like three lines in theengineext project that were responsiblefor checking if a engine x configurationthat the controller is generating isactually valid and working but this isalready happening during the inressresource validation so in a admissioncontroller and normally and especiallyin our setup you can access this webhook without any authentication and sowhatever you put in there and if youespecially do not go via the API serveryou can form your own admission requestsand put in whatever you want and um yeahthis basically and in combination withthe following three CVEes led to someconfiguration injection whichpotentially can load um code and moduleswhy is that the fact so there is a uhflag-asht for engine x binary and wedidn't expect this to actually load codebut just to test the configuration theform is the case and with that um thenice guys of whis were able to show ushow they could get binaries uploaded toan enginex ingress engineext controllerpot and also execute this code and thelast CVE is also a bit in thecombination with the others you can umget access to all the secrets thatyou're able to reference in anannotation because um with that CVE itwas able to actually just put thesecrets download it from the API serverand put it somewhere in the file systemand because this is a web server you canalso put it somewhere where the webserver is actually able to show it toyou and then you have access to thewhole API server again so all of thesein combination are definitely criticalso and uh CVE by CVE it might not lookimportant but we can definitelyrecommend you updating your deploymentsto either 1121 or 1115 because both ofthem include fixes um there's also oneimportant change um that might you mightneed to know about because we not onlyfixed the points where you could injectcode but we also disabled this engine Xconfiguration test during the ingressvalidation this means that while yourcontroller is already running youprobably won't face any disruption butum your engine X the Inkress engine Xcontroller might complain as soon asthere is an invalid configurationbecause this cannot be reloaded and thisbecomes uh really a critical issue assoon as new ports are going to be addedor restarted or whatnot because thenthey won't won't will no longer come upum we were aware of this but at thispoint also decided okay security is moreimportant for now and hopefullysomewhere in the future find apossibility to reenable this check butfor now and in this patched version it'sdisabled by default and also there's noway around this last but not least um onour issue tracker we got asked how isthis about change route is this maybechanging something can we reenable thischeck in change routt deployments shortanswer no because you can Google thatthere are plenty of threats about itchange route is not meant as a securitydevice and even worse if you manage toget out of the change route then younormally have way more access becauserunning a pot with change route justrequires root access in the end and thisjust makes it a lot worsesome other status update around theAngress intern project right after andduring our last talk in Salt Lake Cityuh we were also facing some seriousissues around Google Cloud Build we areusing Google cloud build to build andthen in the later step also promote ourimages and uh especially+ with crosscrossplatform compilation there weresome issues we couldn't find the rootcause and sadly surprisingly whatever umthey just got solved overnight um ifsomeone's of you is working for Googleespecially cloud build I'd like to havelike 40 48 hours of my lifetime backbecause I wasted at least these ontroubleshooting the issue um next aroundDecember we got the open restdi which isbasically the engine x distributionwe're using updated this is currently onmain and will probably be released inthe next minor release and last but notleast because the Google containerregistry got deprecated or at least wefor Kubernetes project should not nolonger use it uh migrated to theartifact artifact registry and againwe're facingsome yeah funny issues with Google cloudcloud build especially if you're tryingto build an image and it takes more thanan hour you definitely get some timeoutsbecause it builds it builds builds ohand now I want to push but yourauthentication timed out yousorry that's just another issue thatreally cost me a lot of time and uh I'mjust happy it's working now and um lastweek right before we published the CVESI got this fixed and Isomehow wasn't able to read calendarbecause I fixed it on Monday and Ithought yeah okay it's another two daysnow i have time for the CVE fixes yepand Tabita first row here was like "Areyou ready to publish the CVE fixes it'sMonday." Yes we agreed on Monday ohokay this was not expected sorry oh comeon but we handle unexpected things asgracefully as we can yes yes we didand before we hand over to James abouting status um here's a bit informationabout maintenance mode so as of now we'dlike to still provide a as stable aspossible solution and therefore we willdefinitely continue updating AlpineGolang Kubernetes third partydependencies we will also continue tofix bugs and of course CVEes so you cansafely expect us to still release patchreleases on a monthly basisum a future version 113 will probably bethe last minor release which means thatafter that we will also at least fromour side not push forward any featuresand um any new features so if youprovide if you propose new features tous this will be a casebyase basis and umit needs to be decided there is nogeneral okay just merge merge mergeanymore um apart from that we still willsupport new Kubernetes releases in thefuture because as of now what we do isjust spinning up a kind cluster in aspecific Kubernetes version and then werun our whole E2E test suit against thisso it's comparable easy to supportfuture Kubernetes version but there willno be would not not be any furtherdevelopment with our whole engineextproject feature- wise and uh I think oneof the last things we are doing isactually providing a migration path toingate and with that I think I'll handover to James yeah I was going to say Ijust I want to continue and add just youknow I I did have someone yesterday comeup to me and say "Hey I do have a PRopen um I know it's not been looked atbut you know what's going to happen toit?" So one of the things that we'retrying to do I know there are 78 pollrequests open right now i know thatbecause I read them maybe not finish allof them but if you do have a pullrequest open please let us know in theSlack you know come to the communitymeeting and help us get that into that113 milestone because we don't want tocontinue to have to do um minor patchreleases so minor releases so um if youhave one open please let us know we haveearmarked some already for113 and with that um I did sayapproximate i continue want to sayapproximate because we had the 9.8 CVEthat took up a lot of Marco's time toget that work through we have a limitedamount of time as open sourcecontributors this is not our day jobwell I mean it's not my day job some youget some time for it as a serviceprovider but yeah no anyway no um butthis has the gateway API um timeline onit as well but we want to again we westated this goal in Salt Lake City umthat's why it's a lot of this stuff isapproximate we would like to by Atlantahave a V1 that people can test um thatcan work through you kno,w setting upsimple HTTP routes manage a gatewayclass things like that um and then I putin stable in here and it's not not herein directly in a timeline but what isstable stable is it's passing theconformance tests it's um manageswhatever version that's available so onesix from this timeline um we're workingon the 13 that I think that's got someRC's out so probably one two by Atlantabe able to run through thoseum because as we've talked about we'vestarted the the timer on the ingressengineext archive so we say maintenancemode but there is an actual in theKubernetes community maintenance modemeans the project's being archived it'sno longer going to be updated there'sgoing to be no new releases no one'sworking on it so our maintenance modemeans we are putting out minimalreleases it will be archived so that'sthe plan once we have a stable releasethat folks can migrate to and we have amigration path we will archive theproject and the goal is 2027i forgot I put that on there i continuewant to let people know like dependingon the level of everyone's involvementhere will depend on how fast we can dothis as part of that um I've beenworking with one of the gateway APImaintainers and looking at what ingressengine X supports from a featuresfunctionality and versus what issupported in gateway API um 35% issupported today so if you if we were toimplement one two folks would be missingout on a lot offunctionality um supported soon sothere's probably a gap someone's workingon it it's an experimental um 55% willbe supported soon so maybe by 2026 we'llhave that 55% but we do have an issuewith understanding what is going on withthe no plans that 45% that's what's leftum a lot of those things will not besupported so we have things there are Idon't want to say will not be supportedthere are just there are currently noplans so there's no gap there's been noconversation in the community so Ithought you know mod security was one ofthem but talking with Rob this weeksafely make it to gateway API talkingwith Rob this week there actually is agap out there for um WFT umimplementation in gateway API so this isthe call to action for most peopleoh I'm jumping over my slides in my headum when we say support there are 118annotations in um ingress engine X andthis is just an example of one of themso going through all of this we'vegrouped them we've got about 40 groupsof annotations but you can see there'sno plans for setting off allway setcookie um off method is going to youknow in progress so there's lots ofthings that just aren't supported wehave a whole spreadsheet for all of thisand only we've only looked at theannotations we've not looked at um theconfigure the config map option so ifyou want to set like TCP and UDP routewhich is an experimental andgateway that's a configuration option sothere's even more functionality that wejust don't know about and what I wassaying for the call to action is I mayberemoved it i'm talking through my SLtalking um I have two presentations soI'm talking very closely to both ofthese thingsum but we we have an ask for you for asurvey i'll bring that up later but umtalking about Ingate and its currentstatus um if you were at our talk inSalt Lake City this looks very familiarbecause it it is the same slide um we dohave a new repo in the Kubernetes SIGwe're doing core gateway umimplementation so anything that's goingto be in one two one three um right nowwe're you know because of the vastamount of work that needs to be donewe're not going to be able to extend theAPI nothing implementation specificextension points things like thatanything in experimentalum there will be ingress support but wesay ingress API support it will be um avery clean very exact implementation ofthe ingress API we're not going toexpose Lua we're not going to expose theengine X config there's lots of thingsthat we aren't going to do that umunfortunately I say that are sins of thepast that have caused a lot of technicaldebt have caused CVEs and caused issuesas maintainers So if you want advancedfunctionality that you had in ingressengine X you need -to work with thegateway API community to get itimplemented in gateway API that's that'sthe gist of thatone um we do have a very we have adocumentation site that has nodocumentation um we are actually usingthe um discussions on the board so we'vehave about three right now just workingthrough trying to understand right thereis a lot that needs to be done and weneed to understand why those decisionsare going to be made so we can have youknow a historical record because there alot of things like I don't remember whyI don't know why mod security wasimplemented why someone asked for youknow it's in the git commits and it's inthe git history but it doesn't explainwhy that decision was made so we'retrying to work through a lot of thosediscussions um so we'll have them inslack we'll have them community meetingsand just trying to keep a record of likewhy we've made the decisions that we'vemade um and as you can see we have avery it I started it starts thecontroller manager um so again a lot ofwork to be done and actually yeah that'smy next slide um so one of the issuesthat we have with and one of thevulnerabilities that was exploited isthat the controller and the engine Xproxy are running in the same podenginex doesn't need as much permissionsas ingress does so for that we've beentrying and we tried um with the controlplane splitting those out it was ahorrendous mess um I think that's reallywhat broke Ricardo and made him you knowleave the maintainership but um it isvery difficult with the technical debtthat we have now they're veryintertwined in our ingress controllerand engine X um so we're going to one ofthe reasons why we want to start theproject fresh is that we will start withthat um separated that does introduce anew problem we have to understand how dowe push a configuration to the engine Xcontroller how do we manage that fleetwhat's that API look like how do we dothat communication so that's one of thethings that's actually a discussionpoint that's up there right now how dowe do the translation layer right how dowe get that those gateway objects intoan engineext config we need to figurethat out how do we manage gateway classHTTP route gRPC um those we just youknow hopefully with the use of usingcontroller runtime will help usbootstrap us a lot faster than having todo it um ourselves so we're working onthat and of course with the introductionof gateway API there is conformancetesting that says that we do run likeversion one two that we have to reportto the gate gateway API maintainers sosomebody will have to write theconformance testing set up all of thatum there's a lot of bootstrap code thatneeds to be written i think I spent anentire weekend just getting the uh umgetting everything up and runningbuilding the container the go stuff allof the all of the nitty-gritty that'sneeded that you know it's not fun butit's needed to happen that'll be on theconformance testing and then I saytesting testing testing because we'redoing this all in kind um we will not bedoing this in AWS um GCP other areas sowe would need folks to help us out withthat testing doing that deploymentgiving us that feedbackonce we have a release that folks canactually you know run something onyeah I can do that sure so um Jamesalready told it multiple times now um itnot only depends on us it also dependson the community and uh if you um arequestioning if you are asking yourselfnow how can you contribute we alreadyhave set up a new Kubernetes slackchannel there is ingade dev mostly fordevelopers and development questions andtopics and there also is same as foringress engineext users a ingate userschannel and there you can come ask yourquestions around support about how toimplement your use case using ingate andwhatnot there are contributordocumentations um about how you canstart contributing to the project youcan find them in our GitHub repositoryand every other Friday 9:00 a.m easterntime we have a community meeting this isalready happening since a few weeks nowso exactly 40 days um it's mostly justme talking for 40 minutes so please comeand ask questions so I don't have totalk for an hour on a recording bymyselfand yeah so uh last but not least wewould like to ask you for some feedbackabout what are the most critical inressengine features that would need agateway MPI API equivalent before youmigrate uh this is mostly about gatewayAPI so you need to know the differencethat Ingate is implementing gateway APIbut with Ingate we would like to adhereto gateway API and therefore newfeatures shall also go throughdiscussions with the whole gateway APIteam and not only ingate and uh thiswill be the process for us in the futureand with that we hopefully can find away to implement features in a way thatit's not going to be an security issuein like two years again no we don't wantthatso um what day is that what day is todayum Thursday at 4 so that's tomorrowright we will continue this conversationum with some of the gateway APImaintainers um if you want you can comehang out with us ask your questions umand continue the conversation aroundwhat features and functionality areimportant to you what you need to see tohelp make a successful migration andalso I will be talking in quiteliterally a half hour with Rob Scott oneof the gateway API maintainers and whathis plans are to help us with thatmigration um so yeah we're talking a lotabout it soand that's it thanks for attending ourtalk if you have any questions there isa mic hereif it's about theCVE just upgradewowsure coolfine ah no um I have one question is italready decided which versioning schemeingate will followa version scheme yeah it looks likesemantic versioning but semanticversioning is kind of religiousspecifically with enginex yeah so thishas upset a lot of people and I amdefinitely going to put documentation onwhat our scheme is going to look like itlooks like um semantic versioning but itis not so we got into the habit of youknow we made one major release and Iactually was trying to think on Mondaywhy we made that major release i thinkit was an engine X upgrade i I can'treally remember but we're not going tomake a major release every time we makea configuration change so we willdocument that so one of the things thatwe did is that we in I think 111 weintroduced annotation validations and itwas false and in 112 we switched it totrue trying to get to that idea of likesecure by defaultso people either didn't read thedocumentation didn't read the releasenotes it upset folks but we didunderstand that it is not Simver becauseit's a change it's not backwardscompatible to me it was because if youread the documentation you knew and youdidn't blindly update you would switchthat back to false so we have theconversations a lot in a lot of ourmeetings um we have it in Slack we putit in the release notes we try to be asvocal as possible about these changesbut yeah umwe some folks would disagree that wedon't follow sim so we will try to makethat very clear in the release processwhat that actually means what versionchanges means because we did do we fixCVEs in patch releasesso yeah just to summarize then theingate will follow a similar versioningscheme as the engine yeah but it will bedocumented and we'll make sure thatfolks are aware of it because werealized when I was reading an issue I Iwent back and I realized it's like oh wedon't really document what the versionscheme is it just looks like simver sopeople assume it's simver and yeah wehave broken um we have definitely brokena lot of clusters like folks that useflex and argo CD when we do patchreleases yeah yeah we we know okay thankyouawesome well if you are all interestedin doing that um please come talk to usthere is lots of things that you don'tit doesn't need to necessarily be codeum there's user experience there's thetesting so if you're act have access tothings you know you know Oracle cloud oryou know open shift clusters there'sthere are going to be things that we'rejust not going to be able to testagainst so that's always available froma contributor's perspective and uh ifyou're just interested in listeningthat's fine too all right thankseveryone2025-04-15 21:59:58.150819 ��1�?#��AanqWhSnN7sAhey everybody thanks for coming to ourmaintainer track talk here in Londonlet's have a look at the state ofBackstage in2025 i'm Avantika i'm an engineeringmanager at Spotify and here on stage aremy colleagues who are also backstagemaintainers Ben Frederick Patrick andVincenzo you'll hear more from from themas we carry on uh so we've got a prettymassive agenda today but we'll bestarting by celebrating what is a reallyhistoric moment for Backstage and thenwe'll look into some project areasupdates framework updates and then havea look at running backstage at Spotifyscale okay projectupdates first happy birthday backstagethe project we all love has turned 5years oldso just five years ago the folks on thisstage got together to bring our internaldeveloper portal backstage to the opensource and here we are all here todayyou know 5 years later talking about allthe great things that we've accomplishedtogether as acommunity this is what we've achieved in5 years we are up to roughly 3,400adopters that's 12.5% up from what wesaw last in uh Salt Lake City if youjoined us there uh we're at about 230open- source plugins a lot of them arenow in the community plugins repo andsome others are kind of still out therein the wild but hopefully moving to thatrepo and we're nearly at 30,000 stars soif anyone in this room has not had achance to go and start the repo yetplease do it now because we really wantto hit that 30,000 number before thistalk ends before we continue let's takea moment to celebrate all of you themaintainers the community and everyonewho's made this project wildlysuccessful/��e�>#��AzTLbnstVjHchello everybody and welcome to our talkhow to gate away with ingress uh 40 daysin gate uh you might wonder why wecrossed the one actually um back then inSalt Lake City Cube Conna last year wecame up with the idea yeah let's do thattalk will be somewhere in the project atthat time it turned out nope plans arenot working as they do and so uh we justhad our first community meeting aboutingate 40 days ago this is why I allowedmyself to slightly change the title abit about me uh my name My name is MarcoIbad i'm a site reliability engineer atGiant Swarm um already more than 10years in open source working withKubernetes since 2016 and proudly amaintainer of Ingress EngineX sinceNovember2023 besides that I'm interested inclimbing and mountain biking and withthat handing over to James helloeveryone my name is James Strong i'm asolutions architect at Isurvalent nowwith Cisco um I've been a maintainer ofthe Kubernetes ingress engineext projectfor a long enough time now that I don'tremember how long it'sbeen uh I am also the author of theOrali book networking Kubernetes a cluruinstructor um on basically the samething that was the basis of the book andum I you know I can't really travel withthe axe i would love to um but I don'tthink he would like that or you know youknow the United States government but Iam a Gimly cosplay enthusiastso uh what are we going going to talkabout today first of all something aboutthe state of status of Ingress engineextthere have been some little CVEsrecently yeah I don't know um some othertopics uh that we were faced with andthen before we hand over to James againabout the ingate status I'd like to talkabout how we get out of ingress engine Xinto a maintenance mode and what we aregoing to provide thereuh we're going to talk a little bitabout how we're going to do thattransition with uh I I'm going tocontinue to say the word approximatebecause like uh the title of our talk wethought we'd be 140 days in to Ingate umso it's going to be an approximatetimeline for that migrati)0 let's keep this going yeahand get those number numbers up likeeven stronger the next time we come backtothis so let's begin by looking at someof the highlights from the last coupleof months first up is the scaffolderproject area which consists of Bogdanfrom Bull.com and other core maintainersuh there's been a push to roll outcheckpoints for scaffold actions in thelast few releases what checkpoints uh dofor scaffold actions is make themident sorry about that you can safelyretry actions without the risk ofconflicts we're not going to dive intotechnical details here but Ben andPatrick did give a talk about it in thelast CubeCon at Salt Lake City so tunein to that talk and you can learn moreabout it we've also introducedautocomplete support for repo URL pickerso now when you're filling out thescaffolder template form you're going tosee suggestions of existing repositoriesand organizations in the drop-own listinstead of instead of having to manuallytype themout onto the community plugins projectarea the community plugins project isgrowing pretty rapidly we have reached97 workspaces in the monor repo that's21% up since the last CubeCon thegeneral feedback we have been gettingfrom plug-in authors is that peoplereally love the community plugins monorrepo especially the way it's set up morecompanies are moving their plugins intothe community plugins monor repo they'refinding it easier to maintain theirplugins without having to maintain aseparate CI pipeline or a releasepipeline they're getting better feedbackfrom other plug-in authors and it'shelping them build more stable pluginsand backstage adopters tend to trustthese plugins a lot more because theylive in that repository right makingthem more willing to use these pluginsover other plugins out there a hugethank you to all the plug-in maintainersand the community plugins project areamaintainers Beth and Kashish from RedHat and Andre from Spotify for makingthis repo so verysuccessful we also have some great newupdates in the open API project areathank to Ara thanks to Aramis from uhDoor Dash uh open API documentation thatgets generated from an open API schemafor each of the backend plugins is nowembedded in the micro site when youbrowse plug-in docs on backstage.iodocumentation site now you'll find a waymore interactive experience kind ofsimilar to swagger we've also gotbuilt-in support for schema testing injust now previously we used a toolcalled Optic to do schema validation butnow it has been implemented in-house tomake the experience a lot smoother thescaffolder plug-in will soon have andsupport open API schema there's still abit of work pending on that one but itwill be shipped soon now to Patrick whowill take you through some exciting newproject areas in backstage all rightthankyou so now over to something completelynew uh last year we hinted at thecreation of a new design system and nowwe have a project area for exactly thatum maintainers are Charles with a littlebit of support for myself and the areais responsible for our new design systemthat we callcanon uh the long-term goal is to fullyreplace uh material UI and the backstagecore components bringing everythingtogether into a single component libraryapart from reducing fragmentation a bigdriver for this work is for to have alibrary that is more uh opinionated andeasier to theme uh and also being tunedto more information dense UIs that saidwe're really not interested in buildingand maintaining a full component libraryfrom scratch um instead Canon is builton top of an unstyled component librarycalled base UI uh this let lets us focuson the styling uh and um curating andcomposing components rather thanimplementing everything from scratch nowthis is of course very recent and Canonitself is in early alpha uh there areonly a couple of components available sofar so don't go try to build pluginswith it just yet uh but you can checkout the docs atcanon.backstage.io to get an idea ofwhere we're headed and uh let us knowwhat youthink uh moving on to the documentationarea uh which is a also a recentnewcomer one important piec1e of work hasbeen to restructure the top level uhnavigation of the docs uh which reallyhelped clean up clutter to find thingsfaster uh now of course uh we also haveum the only show the documentation fromthe most recent release by default uhrather than from the development branchuh so this got rid of the confusionwhere you were seeing documentation forfeatures that hadn't been released yetuh a big task that's coming up uh is thecreation of golden paths so a shout outto Aramis for starting up work in thisarea already um but it's also a placewhere the core maintainers um and andour team plan to put quite a lot ofeffort uh as well uh builder experiencein general is something that's going tobe a larger priority for us uh now thatwe're wrapping up work in other areas uhthe goal is to create great endto-endguides uh to build using both the new uhfront end and backend systems speakingof those uh let's switch over to Ben forsome framework updates yes all right wecan start with this microphone but thenthere's going to be a demo so it's goingto be kind of hard to do two things atonce hold this and also steer the uhscreen okay um yeah let's dive into someum uh framework updates uh that are newover the last few months since lastCubeCon so first up is the uh backendsystem just a show of hands in the roomuh who uses backstagetoday nice quite a lot so who said ofthe new backend system new front endsystem hands okay good that's good somepeople um so you might reme rememberfrom our last talk in Salt Lake City umwe encouraged everyone to startmigrating their plugins uh and backendinstallations to the new backend systemnow that recommendation still standslike we haven't changed anything thereif you haven't done it yet please go anddo it now's the time uh many plugins inthe main repository now no longersupport the old backend system uh andwe'll be expect the remaining few to becovered in the next few releases uh ifyou're looking for ways to contributealso this is a great opportunity thecommunity plugins repo uh has someopportunities if you go and contributemigrating to the new backend system uhand then onto the new frontend system sobefore we dive into the new front endsystem let's take a little step back andlook at how we got here so we firstintroduced a new front system about ayear and a half ago uh with an alpharelease in alpha alpha release aboutyeah 2024 uh and at last cubecon weannounced something called blueprints uhwhich uh basically are feature designedto help you or simplify creation ofextensions uh within the app uh we alsointroduced new APIs to provide a well-definfined way to override theseextensions and since then we've beenencouraging plug-in authors uh tointegrate support for the new front endsystem alongside the old system just sothat we can kind of get some earlyfeedback and just to make sure we're notmissing anything uh and make sure weyeah not missing any patterns orfunctionality so let's take a look atsome of the new features that have beenadded since last CubeCon so first up ismultiple attachment points now I need tostart with a little recap of like howthe front end system works andessentially it's basically a tree um ofall different nodes connected togetherwhich are the extensions and they haveinputs and outputs and then we wire allthese together and then this is thefront-end app that you get uh yeah inthe browser now one challenge we raninto uh was handling extensions thatneed to be used in multiple uh placeswithin that tree um a great example ofthis is techocs add-ons uh so theseextend the functionality in the techdocsplug-in um and these need to be attachedin two places so one on the entity pageand one on the tech docs reader page sopreviously to make this to work we hadto create two different extensions maybeeven two different blueprints just toattach like the same extension uh intotwo different parts of the tree which isjust unnecessary extra work so to solvethis we've introduced support formultiple attachment points uh whichallows the same uh extension to beattached into two different places inthe treeuh enti2ty pages and the front end systemthey had some challenges um specificallygaining like control over the layout waspretty difficult so you often had tobuild your own layout just to change itand make improvements um and in ourprevious talk in Salt Lake City we alsoum were announced that we were workingon migrating our internal version so ourone that we run at Spotify to the newfrontend system and Patrick's going tocover a little bit more on that shortlyuh but as part of this process we notonly wanted to fix some of the issueswith the entity pages but we alsodecided to open source some of theinternal improvements that we've made umchanges that we think align pretty wellwith the new front system so first offis tab groups so at Spotify we've gotover 100 plugins u many of themcontribute tabs to the entity pages andthey can kind of lead up to like a bighorizontal scrolling nightmare kind ofwith a lot of different plugins there uhso we came up with the idea of tabgroups so you can group these groupgroup these tabs uh under a commonpattern and you got a little drop downand you go select it making navigationso much easier and finally is entitycard types so we introduced differenttypes of cards to help you build betterentity pages uh and one especiallyuseful addition is sticky cards so uhyou can place cards on the right handside sidebar out and these are perfectfor like metadata quick actionsimportant links and they're going tostay visible as you scroll through thelonger form content on theleft and lastly we've introduced a newcapability which is middleware uhextension factories so this allows youto globally modify extensions as they'rebeing instantiated so this opens up fora powerful range of powerfulopportunities so for example you caninject debugging analytics across allthe extensions without modifying eachone individually uh another powerpowerful use case is like AB testingwhere you can swap out extensions maybeeven just remove them entirely uh justbased on conditions chosen by theintegrator uh this feature also givesdevelopers a lot more flexibility inmanaging extensions at scale making iteasier to experiment monitor monitor andcustomize the app as a whole uh and withthat I'm going to jump into a demo nowso showing some of these features i'mreally hoping this works but we'll giveit a go as is tradition you can hold itthank you my assistant lovely uh allright let's give this a go uh I'm goingto jump over here and find the windowyes here we go ignore the that bitokay let's jump to here so I want tojump to the new entity pages i'm goingto make this a little bit bigger therewe go um so first off to show I can'talso look that way uh I'll look here umso first off is the sticky card on theright so if I just scroll through thewindow here you'll see that this uhcontent gets stuck as I scroll throughnice uh last bit I want to show you realquick is just the grouping of the tab soyou see we've got Kubernetes and techdocs here if I jump over to our configyou'll see we've got this commented herei'm just going to uncomment this likethat save that and then this updates andnow we have a tab group here for techdocs and APIs which is nice uh one lastthing I want to show you is we kind oftried to build a little bit more betterdev tool experience for the extensionsso we used the middleware and providedthis nice little view for uh extensionsso you can kind of dig in a little bithere to the entity graph one forinstance i can see some configuration uhyeah just a nice little experiment Iguess inspector nice uh let's jump backtothis and that worked did it work yes itworked that's good and this is good yeahnice there you go all right um that's itfor the demo uh now let I want to talk alittle bit about how uh we startedmigrating our internal project atSpotify to the new frontend systemso as part of the new system we providea top- down migration strategy where youswitch out the root of the app first uhthen with help of conversion utilitiesyou can lift over all your existingplugins into the new system from thatpoint you then migrate each plug-in oneby o3ne until the app is fullymigrated here's what that looks like incode uh you pass the existing structureto the commercial utility and out comesfeatures for the new system that you caninstall in the app now we've shown thisbefore uh but there are two importantadditions since then uh first is the newconvert legacy app options uh which is aconverter that takes options for for theold create app function and uh turns itinto modules so this is things like thethemes and the sign-in page it's justanother little helper to make it alittle bit easier to take the first stepand now the more important change uh isthat you can now include the entity pagestructure in the conversion it used tobe that all entity page uh all entitypage content was forklifted over as asingle extension uh which prevented thisgradual migration that we want to enableuh this new utility instead picks apartuh the entity pages and converts themall into separate extensions that wayyou're no longer forced to migrate uhall entity uh all plugins on the entitypages at once uh and you can insteadinstead uh truly do this gradualmigrationhopefully you can seethat yeah okay with all that in place uhthat that's what we're now uh running inour internal app at Spotify uh what yousee here is the list taken from uh thevisualizer plug-in that shows all thedifferent extensions in the new systemthe app itself uh looks the same to ourusers but what's really important isthat we're now able to use the newfront-end system for any new andexisting plugins even though there'splenty of work left to complete themigration if you want to be on thebleeding edge you can try this out aswell in your own backstage instance uhwe're always looking for feedback andeven if you might not ship it toproduction you can try using theconversion utilities to see what the applooks like in the new system now withthat important milestone reached uhwe're finally ready to apply finalpolish and work on a stable release onthe new front end system now let me handover to Vincenzo for the last frameworkupdatesthank you Patrick cool last frameworkupdate is catalog model so there is abig misconception we have heard duringlast year which is that they believethat the backstage catalog model is notextensible let me clarify this this isnot true you can extend the backstagecatalog model and there are greatexamples from the community uh forinstance Kristoff one of the plug-inmaintainer uh managed to ingest theinventory of his house into softwarecatalognow if you can ingest your coffeemachine into software catalog there isno doubt that software catalog isextensible enough the real question hereis how easy is it to to extend softwarecatalog so we look at that and we sawroom for improvements um and that's whytoday we're happy to announce that thereis work in progress in this area andsoon you will be able to bring therewill be an easier way for you to bringyour custom entity kinds um tweak theexisting entity kinds and add relationsbetween theentities that's it for so stay tunedthat's it for framework updates nowlet's share some updates we have made toour own internal instance of backstagethat we use as Spotify so first updatethat we have is about the front enddiscovery so this is how we deploybackstage as Spotify so each backendplug-in is deployed on its own backstageapp in isolation and as me as many ofyou already know if you want to usesplit backends you need to provide yourown custom implementation of discoveryservice so that plugins can find eachother as in this example but now whatabout the front end because the frontend also need to have access to allthese backend plugins so how do you dothat you have two options option one youhardcode the URL of which backendplug-in into the front end or option twoyou use a re reverse proxy adding somesome some rules to expose all thebackend plugins into to the front end sothis is this works works great if youhave a small set of plugins but onceyour your number of plugins startgrowing it might add some friction sowhat did you do instead we used we callit gateway instance so what did y4ou dowe took the first one of these instanceslet's say the one at the top weimplement a new plug-in called gatewayand then we connected the front enddirectly to this instance now wheneverthe front end sends a request the backthe gateway plug-in extract the plug-inID from the request uh use discoveryservice to resolve the URL of thecorrect backend plug-in and proxies therequest the good advantage of this isthat if today we uh deploy a new backendplug-in that plug-in will be exposed tothe front end without any changes toyour infrastructure and I if in casesome of you already use split back endwe are happy to announce that we areopen sources the open sourcing this getplugin so it will we aim to include itin the next backstage release now secondupdates we want to give you is how wemade upgrading backstage simpler byadopting the backstage yarn plug-in sothis is something we released last yearthis plug-in is it's kind of magic it'slike it resolve the the dev intensity ofbackstage according to the version thatyou set in your back JSON file so forexample on the right you have an exampleof a packaging some file and as you cansee all the backstage dependencies thereis no version there is just thisbackstage correct thing and that'sbecause the yarn plugin will take careabout resolving the correct version foryou and this simplify things a lotbecause whenever you bump backstage thepackages won't change so if you have abig instance with a lot of plugins andeach plugin has a different code ownersthose condoners won't be pinked so thisis a great improvement and we have somedata here here we have all the the lineof changes you need to make in order toupgrade backstage to the next versionand if you spot it in the graph onversion 32 there was a drop and thisthat's exactly when we released the backjam plugin so now if you zoom in intoone of the release let's say you are onversion 36 and you want to bump to 37the latest this is what we have to doyou bump TypeScript you introduce Canonthe new UI library and you bump yourback JSON file that's it so if you arenot using the back plugin we highlyrecommend you doing so and now let'sover to another important area of uhwhich is catalogyeah sorryyep uh specifically I'd like to talk abit about how we use the softwarecatalog at scale atSpotify our catalog is fairly large uhwe have quickly you know startedapproaching half a million entities uh alarge number of data source providers uhfrom different teams around the companyand yet it reacts in sub-second times tocode changes in a globally distributedcatalog where the primary database iskind of cruising along uh at a tinyfraction of capacitybut it was a bit of a journey to getthere so uh let's rewind a bit um likemost adopters do we started with an outof the box installation of uh thecatalog um this is very convenient whengetting started and it works fine up toyou know certain point but in thissingle deployment all work nodes uh doall types of work indicate with thecolors there there's the read and writeparts of the API there's internalmachinery for ingestion and processingand more and the requirements of thoserespect respective parts are actuallyquite different um over time more datasources were added pushing the number ofentities ever higher and API tracktraffic to the catalog was increasingyou know uh so more and more callersstarted to consume it so let's uh have alook at what we did to scale up beyondthebasics one of the first things we didwas to split the catalog apart um intotwo deployments so one deployment thathandles only read traffic and the otherone deployment handles uh write trafficand ingestion and that's a pretty smallchange in itself but it actually givessignificant significant benefits so nowwe can autoscale these deploymentsseparately um and give them differentamounts of resources which can lead tocost savings and they also becamedecoupled in terms of load uh so burstylike load spikes from ingestion uh thatare heavy can no longer affect uhlatency and responsiveness of the readAPI and lastly we can also use lockdownuh database settings5 on the readdeployment um which is good from asecurity perspectivea drawback is that we need to figure outtraffic splitting for for example writesshould not end up on the read nodesbecause they're not actually able toserve them uh this required a tiny bitof bespoke po code to solve but don'talone we hope to open source a way tolet adopters more easily uh do the sameout of the box you can but it needs alittle bit of gluecode we're now in a great place forsplitting both database and service bygeographical region as well the writeAPI and ingestion are only in the homeregion uh right next to the readrprimary database and the read API ismoved over to a dedicated read onlysecondary database and having streamingreplication happening from the primaryand to this we add more regions asneeded we have a bunch of regions asSpotify so now we get even better perregion scaling that follows the sun uhcallers get fast responses uh from anearby installation globally and we freeup database resources so the primarydatabase can focus on what it needs todo so to serve its needs um and thenchanges stream over to the secondaryreplicas replicas who only need to servelocalreads as for drawbacks well we're allsuffering under the limitations of theannoyingly slow speed of light right uhand a small amount of replication lag uhbecause ofthat when scaling up it also becomesmore and more important to be able todevelop and experiment safely withouthurting end users this can be trickyactually sometimes you find that youhave to tweak database replicationsettings for example and make sure thatthey work well under load 5 p.m um orone of your teams may want to deploy amassive new data source and you need tomake sure that it integrates nicely youknow without negatively affectingperformance or you want to exercisedisaster recovery that type of thing soto help with this we set up an identicalpre-pro production cluster where we cansafely move fast and break things it'snot just a minimal installation it's afull-fledged copy with all of the samedata and metrics so it gives us thefreedom to try things out at any timeincluding things with complexinteractions in the last 6 months we'vealso spent a lot of time digging deepinto the open telemetry and and um logsof the software catalog to find ways ofmaking it even faster under heavy loaduh I can't go into all of thenitty-gritty details here of course butsome examples are streaming of largeresponses to ensure a smooth load thatdoes not block the event loop you wouldbe surprised how JSON serialization canmess with your latency uh identifyingdistinct hotspots that could be sped upand improving the database performancequery of of some queries um making surethat even if entity providers issuemassive you know data dumps into thecatalog it responds to them gracefullywithout you know without a hitch um andadding more metrics to help with boththis and future improvements of courseall this uh was changes that we made tothe open source code so it benefitseveryone as youupgrade and finally we leaned fully intoweb hook triggered events from GitHub uhto drive catalog updates this means thatwhen a user merges a change to a filethe catalog is immediately notified andreprocesses it processes it after doingthis we were able to turn down therecurring processing loop uh to almostnothing so you can see in these graphshow uh um the load on the ingestionmachines and the primary database fellto almost background noise levels whichis super nice yeah we look forward tocontinuing improving the softwarecatalog both in terms of features andperformance and uh so watch this spaceand uh that's it for now back to Patrickall right thank you um to wrap up justwant to talk a little bit about the roadmap items in some other areas uh firstquick update on splitting the CLI intomodules uh we've made a lot of theprogress uh thank you Aramis uh but workis still ongoing another thing uh thatwe want to look at is refining ourconfiguration system we're seeing a lotof issues in the plug-in ecosystem uhwhere configuration schema is not insync with the implementation in theplugins and we'll be looking for a wayto just prevent those kind of issuesfrom happening in the first place lastlywe're of course keeping our ears on theground and have started looking into uhbringing integration via model contextprotocol into backstage uh this is aspace where we want to rely on existingsolutions as much as possible and onlyadd the necessary glue to tie thingstogether expect more updates in that uhin this area in the comingmonths that's it thank youdo we have time for questionsyes okay any questionsif so feel free to step up to the micthank you Francesca i can break I canbreak the ice congratulations first ofall huge steps ahead i'm very curiousabout the MCP so I mean MCP is is abuzzword and it's very exciting forevery developer how do you how do yousee the integration because MCP isusually coming from the AI space how doyou see the integration here i thinkthere's some um natural work to be doneto tie things together but uh we'llwe'll see exactly what we figure out todo um we'll talk about it more in thefutureindeed cubeoh yes yeahso so if I understand the questioncorrectly it's if we plan to add anyAPIs to the scaffolder plugin uh so thatyou can call it programmaticallyessentiallyyeah uhuh yeah I I mean it depends what youmean by programmatically cuz I guessright now you have the workflow itselfand you can ex execute like the templatedefinition by an API um there is a CLIcoming also I'm I'm hoping this is kindof I'm on the right track here answeringyour question uh we have a CLI toexecute those templates umprogrammatically or from yeah ACLI Iguess um does that answer the questionyeah good okay good go on you can alsoYeah and also in general we um theyalready have APIs right these back endsin general it's just how they're exposedright so I mean the front end alreadydoes these calls in the first place it'smore how accessible are they and we'relooking at making you know open APIdefinitions properly for all of thebackends over time uh it just takes sometime but yeah we'll get theretricky bit is user authentication aswell for the templates so hopefully withthe CLI we can address that tooyepokaymy name is Sasha i actually used to workin Spotify so I know backstage from bothsides from Spotify and outside now Iwork in Expedia so there is one toolthat probably you know internally youguys also use which is C4 plug-in formodeling which is amazing so when Imoved to Expedia we actually I realizedthat you don't offer it to externalcustomers so I'm wondering why is thatbecause it's amazing tool just wonderingif there's any plan um to expose it toexternal customersi'm just wondering ifbecause it's in your blogs like Spotifyblog published about it how amazing thetool is so it's I know this guy downhere might have ananswering internal technology so it'snot just easy to ship that's the boringpart i agree with it it breaks my heartokay so built with internal technologytricky to ship it please do that asecond question can I quickly ask thereis I've seen also a pattern of engineerssometimes when they create somethingfrom from backstage um they don't wantto rename things because they can'tquite track how many places changed theID and what was provisioned so theyprefer just to to kill the projectinstead of trying to rename it so Iwonder if you try to do something withusability perhaps as well if you knowwhat I meanyeah I I understand what you mean um noI I don't have a straight answer rightnow for a better fix we're looking intomaking things more reactive like we saidwith you know web hooks and so on butthe naming of things those are reallyyou know external references so if Aclaims that I depend on B then A needsto change that claim right that's justan unfortunate reality of it um we'dlove to look into better tooling forthis um and if any ideas are verywelcome you know in the community or inissues and so on and we're happy to lookat it but uh I don't have a good answerright now for that particular one thankyou and thank you we're at time everyoneyeah so thank you for coming thank youall2025-04-15 21:59:58.9459067aking uh making sureit's running and then there is uwatchdog npd that is thing looking at uhship and like see that there is no leaksand anything and there is a bunch ofplugins for different resources uh needfor container I'm out of uh uh my depthshow to make it nautical theme because Idon't know much about ships uh so youhave plugins that uh allocate resourcesfor specific containers and then thereare other components that wasn't drawnhere in fact in the past we didn't evendraw plugins we only did uh uh kubletand npd and like contain runtime nowplugins making a playing a very goodrole and we will talk a lot aboutplugins in this talk anyway so this is abeautiful ship and then uh out of thosebeautiful ships with many manycomponents once you put all thecomponents in place you need to makesure that this ship is actually runningthat node is healthy uh that you haveenough resources for managing yourcontainers and then containers forcontainers themselves you workload likeuh user can decide how much they want torun it reliably and how much they wantto run it uh efficiently so there isalways trade-off and you can make thistrade-off you can make kublet how muchof trade-off you want to make uh betweenefficiently uh running and utilizingresources over provisioning and how muchyou want to run it reliably so it willnever crash that kind of a uh priizationthat you can make and make can tellkublet to make and kublet will manage itout and like make sure that it's allrunning reliably uh again as I said it'sall running reliably only if Kubletallocated some uh resources for itselfand it has enough uh um capacity toactually run the ship so with all thatsaid we will do a deep dive in resourcemanagement today uh it will be as I saidslight deep dive and then even deeperdeep dive so what resources Kubletmanages today we have a small list herei mean yeah it's not very extensive umwe do some standard resources like CPUmemory uh out of CPU we also look aboutlook at numa nodes uh memory is alsodifferent types of memory uh we do somespeeds uh limiting and uh uh we also dodevices devices are big anyway it's nota huge list but if you think about allthe things we need to do with thoseresources um you you will be I mean youyou understand why we're doing it and bythe end oflike first deep dive I want you tounderstand how important it is to makedecisions on every step of this way solet's say advertising resources how doyou model your resources how do you makesure that scheduleuler autoscaler knowsabout your resources then depend on thisdecision uh you'll have a next decisionhow do you tell which port has whichresources then how to apply some quotashow to manage those resources acrossorganization how to allocate and admitport on a node and then um how toallocate devices and like actually makethem available on a node when port isalready scheduled and then how todeallocate them and how to deallocatethem fast if in certain cases when youneed to uh evict everything reallyquickly and then all this question aboutoverprovisioning and managing uhresources efficiently and monitor themand how to evict uh like some resourcethat are not perform like some pose thatnot performing very well and consumingtoo much all these questions needs to beanswered and actually every day ofmaintainer in kub kublet in kubernetesand sign node is answering many manyquestions about all the resources and infact in some resources in some uhquestions we going back and forth whatwe want to achieve how we want toachieve it and what kind of uhcompromises we want to make so let's sayum advertisingresources I mean even for CPU and memorywhat kubernnees started with is uh howlike just proportion of CPU it didn't uhaccount for CPU being different like youhave one VM that is uh uh has one typeof CPU another have different type ofCPU you still saying like I just need ahalf and this half may maybe like 10times faster on another machine becauselike CPU is just more powerful butKubernetes didn't care enough to make itavailable for you and advertise itproper way so because we just believeit's it's okay8 it's enough modelingthings that we can make and enough foryou to tell what you what you need andthen um good example of compromise hereon advertising resources is what we didwith uh uh classic DRA versus uhstructured DRA so if you don't know thisterms classic DRA structured DRA you cansee some older talks from Kevin and fromother people so classic DRA we startedwith device allocation and we said thatwe will have a plugin and you willmagically know about resources and ifyou ask about it it will tell yousomething about it that you need to knowand it didn't work very well becausecluster at scaler has no idea what thesedevices are and then we um just uh killthis uh whole direction of classic DA wesaid like structure data is a way wewanted we simplified devices intoresource slices and that this is what wewant to tell Kubernetes about devicesand this is a very primitive and simplemodel of devices but it will work goodenough for majority of usecases then allocating port to resourcesum another story uh for allocating postto resources is uh our long-standing capof uh memory swap support so for memoryswap it's countable resource you cantheoretically expose it and advertise itas a countable resource and then try tounderstand do you want to share swapmemory do you want to uh exclusiveexclusively allocated to specific portsso in the first iteration of memory swapsupport we decided we don't want to evenadvertise the swap memory we just wantit to be magically available to someports and you don't even know whatthat's available um and we get all theway to alpha with that for memory swapbut now we're questioning ourself is itgood enough is it uh what people willwant and if you enable that support formemory swap will it be uh putting us ina situation when we cannot evaluateand can it um work to forward and uhstart counting swap memory and startover provisioning and like what who knowwhat scenarios want to support on thatanyway and uh another good uh example isuh uh for as changing ports to resourceslet's say you have some uh QS resourcesanother long-standing uh uh cap that wehave and quality of service resourcestelling you how good of a quality youhave like let's say you have networkinguh switch and you want to say like Iwant super fast network and then howmany ports of super fast network accessyou can have on a node do you want tocount them and then do you want to applylimits to them and it goes to the nextstage like how do you apply limits howdo you count how you quarter those anddo you need to spread them across allthe devices evenly or you need to putall of them like first uh node that fitsthis uh QS class all these questionsneeds to be answered and uh this is whatwe ask as maintainers uh what we want todo for support for specific uh classesof resources and uh types of resourcesnext uh when you have an admission on aport admission it relatively easy topicbut uh if you consider um device plug-inand DRA like the big difference indevice plug-in and DRA that we get towhen we designed one and another fordevice plug-in we said that all devicesneeds to be pre-allocated pre-sharedpre-sliced and uh they pre-advertised sowhen port comes to the node devices arealready available you need you just needto associate device to the node or tothe port however however with DRA it maytake a while to allocate the device andsometimes you get into weird situationof timeouts so with DRA we said thatthere is no port admission you cannotfail admission on the port if you umhave a device associated with your portwhat you you will end up with if deviceis not available you will end up withcrash loop back off that will keeptrying to create your port create yourcontainers and you'll loop throughallocation of the devices and if itfails it will try again if it fails ittryagain then um allocation resources as Isaid uh u it's a interesting topic anduh uh this allocation isuh DA is a very good example like howhow do you allocate them how quickly youallocate them and what kind ofparameters you can u pro provision tothisuh provide to allocate proper deviceslet's say uh9 with the device plug-ineverything is pre-allocated with a uheverything pre-provisioned with a setproperties with DRA you able to passsome extra parameters how you want toallocate this device and modeling thatwas a quite a challenge because uh uh tomodel that we needed to have resourceslices uh resource claims and like mapthem all together and kub will haveextra properties that it will need topass to allocation of devices andfinally A couple more things uh supportoverprovisioning is hot topic we don'tsupport it on devices yet uh we supportit a lot on CPUs and uh memory and itgives gives us pain um and we for everydevice for every resource that we wantto overprovision for we need to bereally careful we need to havemonitoring of usage we need to knowwhat's happening uh and for memory swapfor instance we don't only need to knowhow much memory swap you use but alsohow often you swap do you swap likeeverylike do you swap just once in thebeginning or you swap all the time andyou are affecting the like IO operationsso monitoring is is not like very simpletopic and then uh once you know about uhonce you monitor your resource you needto understand how you evict them that'swhy we don't don't overprovision deviceswe don't do this of a device because wedon't know how to monitor them and howto evict them so this is a veryinteresting topics and is all thequestions that we constantly asking forevery single resource be it resourcethat we already support CPU and memoryor it's some new resource and uhFrancesca will talk a little bit aboutuh standardresourceshello thankyou okay so uh lot of questions we haveto answer and account for like Sergeymentioned so let's let's look a bitabout the answer we currently have andif and how they are sufficient and howwe could move forward to address thecurrent open points and we are not goingto talk about the array for a few slidesbut we will talk again about the arraysoon enough so don't worry okay uh firstof all how do we express resourcerequirement currently well it's let'sconsider the extremely simplest exampleso one pod one container you ask forresources and so you set a request andthis is the minimum the system willgrant you and the translation is verystraightforward and that part this croupsetting basically was in the cubletsince basically forever but we seek thatuh if we start adding things like limitsagain limits per say are fine the podswill not use more the container will usenot more than that but we already begunto have an implicit meaning on anoverloading of terms so for example theQS is computed out of the limits againthis is the extremely simplest exampleso if the the you have request recalllimits for all the resources isrequested by a pod the pod ends up to beguaranteed quality of service which hasa lot of good properties like forexample for shadowing and evictionguarantees but this is again anoverloaded of terms so you you don't youcannot express yet I want my pod to bequality of service guaranteed it has tobe inferred so it is more context tobring you along while you define yourresearchrequests if we need something extra anduh basic basically uh croup enforcementto resource request we need to uh takein account the so-called resourcemanager for example the very first andsimplest probably addition was theexclusive allocation of resourcesinstead of saying I want eight coreworthy of ch of CPU time we can say heyI want exactly eight cores and they wantexclusive access to them and for that weneed the CPU manager and for other uhspecial would say allocation we moreresource managers these are called alsocalled herdor managers built-in managersbut over time and this is a theme theresource management requirements kepthave a steady flow of request this isnot a solve the problems even if we takethe array out of the picture and todemonstrate that we recently well in thelast 10 releases or so Kubernetes weadded the policy options which are wayoptions to fine-tune those extraallocation and the fact that we have forTP manager alone six policy option meansthis is something people still ask forand ca:reabout let's uh have a deep dive into oneof those option this is if I'm notmistaken the latest way that which isabout LLC awareness so in a overlysimplifying modern CPUs don'tnecessarilyhave uniform charact physicalcharacteristics some of them may beactually built like a cluster ofresources replicated on the larger umsilicon well physical piece which is lotin the motherboard and to grant bestresource and best uh uh performance weneed to uh allocate if we can CPUs uhaccording to those boundaries thoseinternal boundaries of the CPUsotherwise what will happen that when wecross those boundaries we can hit a peruh per we can take a performanceprice so with this new option the CPUmanager takes into account thoseinternal boundaries when it does thealignment and instead of doing thesimplest thing it tries to allocaterespecting those boundaries so this isimportant because we have now anotheralignment boundary what's an alignmentboundary is an and it's a structure totake into account when allocateresources to make uh uh best to grantbest performancesstill another um another recent additionwas the conversation about in placeuh update of resource of uh podresources the VPA which is a veryinteresting and complex topic and one ofthe goals is to preserve the exclusiveallocation so when we scale up a pod inplace don't just take it uh so many newCPUs take into take CPUs such as thealignment guarantees are preserved andthis is again pose issues it's an hardproblem to solve because it basically uhforc us to review a design uh uh aninitial design we took with manager withthe source allocation which is okay nowduring the pod lifetime we need tochange this allocation so we need totake into account the fragmentationavoid it fragmentation resources and howto avoid it not uh not not a trivialtask we I I mentioned previously thatimplicit characteristics what does itmean it means that there are constraintsor desire or things that a workload willlike that are not immediately obviousfrom the podspec okay we can grant forexample exclusive allocation but youcannot just say hey I want exclusive CPUyou need context to learn that forexample this uh resource request I I'mshowing up means different thingsdepending on which node it lands whichis okay we we grant the the what theworkload actually requires but you needto take account the node configurationand means different things depending ondifferent nodes so once again is this anexpressive an this is probably anexpressiveness problem because we willwe will we could benefit from a clearrepresentation of those requirements ifokay we can do that but giving theworkload the option to explicitlymention if needs that or prefers thatallow us to make better decisionsbecause if a workload could toleratelack of those guarantees which areotherwise always try to be enforced wecan make more informed decision andactually for example enable to us toreserve resources for the workloadswhich actually require themso there are some emerging themeshopefully from the examples I try to tomake and in general over theconversation over the months and signaland first of all those are among themthe one I have I think there are worthtalking about which is f first andforemost the how do we allow theworkload uh owners to express in a moreexplicit way their requirement insteadof having to take more context extractfor example the cublet configuration tolearn about them and which hardwaremodel is cublet considering because forexample because nowadays it is usingbasically the C advisor hardware modelso the data structure they representhardware there are the ones from Cadvisor which is a very simplistic modelwe are very close to the breaking pointmaybe someone could say we are past thebreaking point but it is what it is andwe need to rethink about it if we wantto move forward and unlock more POSpossibilities and the fact that we havea steady um flow of requests to reviewthe resource allocation model alsobrings the topic do we should we make itmore modular than it is because nowadayswe need to change the cublet to enablethose changes the;re is a a constantdemand or chatter about make let's makethem pluggable let's make them modularbecause this way people can experimentwith their own uh uh allocationrequirements and do their thingy ontheir own clusterwithout without changing things thatdoesn't need to be changed and therethere was a cap back in time aboutenabling those plugins this effort is isnot really proceeding but this is againuh a strong indication that such desireexists and still is is there in thebackground which are the technologieswhich we can take into account and buildupon to uh satisfy those needs i won'tjust mention the two main ones which isthe node resources interface which isbasically a plug-in architecture for theruntime interface and of course Ipromised DRA which is now f fullyfocused on device but there are alreadyconversation well we started to haveconversation about how to expand it toum to use for core resources namely CPUandmemory and like I hinted there is noclear direction about all thosedesired ideas or requirements yet wehave uh so many open questions and wehave so many challenges to face andagain I'm just highlighting some of themwhich are emergent from the chatter fromthe conversation we have in signaledwhich is about open question is firstand foremost how much should we delegateoutside the cublet if we delegate thisquestion depending on the how do weanswer to this question we we we havemore questions for example if wedelegate fully the the the theallocation and the ownership of theresources like croups manipulation forexample how we do how do we do bootstraphow do we ensure the relability again ifwe delegate or if we do partialdelegation because we are move moremoving parts how do we ensure a constantUIX because Kubernetes is already veryrich ecosystem with many odd don'tonsand then we are adding more and speakingabout the challenges um how do we keepup with the other requirement becausehardware is getting more complex andagain some we could probably say that ithas outpaced us in some area about theotherrepresentation we need we have uh aconstant stream of requests to adapt butwe also have this constant conversationabout our redesigning if we keep weshould keep iterating because with thecurrent architecture because there areactual demands but this makes thecurrent cubelet a moving target what weshould say hey we we do in the nextdesign versus what we do in the currentarchitecture if we do in the currentarchitecture are the limitations asconstraining our solution space so thethe solution which to implement is notgood enough again open questionuh open questions and we we we we haveto have to have those conversation andwe need we need to iterate over therewhile we do everything else uh we keepiterating over the over the big featureslike VPN array and all the things we arecurrently doing which Peter will explainto usall right thank you Franchesco so we'vejust done uh depth first andso that was great we learned a lot andnow we're going to run through a wholebunch of stuff that Sign Node has doneand so it's going to be very muchbreadth and it's going to be kind of amarathon and um I'm probably going totalk fast because I can't help it sowhat has Sign Node been up to lately uhit turns out a whole lot um in the 133cycle if you look at this long list Idon't know if you can see the textbecause there's so many things we madeprogress on 24 caps which I have notseen happen anywhere before so this ispotentially a SIG record if not evenmaybe project record i don't know couldbe um which is very exciting and we feelum you know very proud of all the workthat we've been doing and it's beenwe've had a lot of help along the wayfor that i've sort of sorted thesethat's not five things that's six ishould have updated that um intodifferent buckets um generally uh and soI'm going to kind of run through uh someof the different buckets that have beenuh you know can sort them in but this isnot even an exhaustive um list of themso um starting off one of the big thingsthat we're very proud of is uh we'vemoved forward with um we've finally uhgone to be<ta with in place pod resize sowhat we're scrolling through here is theentire issue uh of in place pod resizeand all of the work that has gone intothat it's actually not even the wholething because yes youplease it's amazing actually in thisvideo we hide 500 comments so I'm noteven going to show all of the thingsthat have happened on this issue but umit's been a really long time coming andum it's taken a lot of work to get thereand we're very proud of that um sofinally uh you know we've moved forwardand um you know looking forward to thenext uh changes that'll happen movingtowards GA the next thing that we'vedone um in 133 is we have uh we havecome on now there we go ah we've g uhsidec cars um which is also veryexciting so this has been um little lesslong in the making but still it's been agood effort um and it took an entirework group and um multiple caps and youknow a lot of energy to uh move forwardon sidec cars um so we're very happy tobe able to extend the pod life cycle tobe able to have you know this new typeof container which is quite a trickything to doum next up we've got um some smallerDRRA updates we talked a little bitabout the DRA updates i mean small inthe sense that like you know we're notuh Sign Node is not the entity that'sdriving a lot of the updates now likethings are going to be moving intotheuler and a little bit into networkingas well but we have gone beta withstructured parameters which is veryexciting so that'll um we'll be able tobe using that um in a lot um moreenvironments now and then al some otherum edits that are also really importantand we're going to continue movingforward on it and continue investing inthe DRRAspace um here we have a list of thingsthat are loosely uh related they're allyou know node things but they're kind ofnot related but they are all extensionsto the podspec we have you know uh canwe do this yeah so host users true thisis for the username space feature umwhich has gone uh on by default betaalong with the proc mount type um that'sactually a typo it should be procmountum sorry uh so these two are on bydefault now um which is very exciting soyou can have access to uh usernamespaces in pods um next up we have thecontainer stop signal um which is a wayto encode the way that you want to stopthe container um outside of thecontainer image which is currently theway or with the runtime default um wehave these uh extensions to the pod lifecycle through pre-top sleep action umwhich is both um the 0C option and justjust generally the option both of thosewent to beta we have OCI volume mountswhich went to uh beta off by defaultbeta um and allow us to mount in an OCIum image and even potentially anartifact if the runtime supports it intoa pod and then we've got a supplementalgroup policy which allows you to um saywhat uh how you want to configure thegroups inside of a container to have itbe a little bit more strict and thus alittle bit more securenext up we've got some cubitconfiguration option changes um ensuresecret pulled images um which was drivenlargely by sigoth by the end there umstand is here thank you for that um it'sfor but it is uh you know a lot of codein the cublet to make it so that whenyou pull a uh you have an image which isum pull policy if not present you stillgo through the authentication for thatimage so that if you're in amulti-tenant environment you make surethat every pod that tries to use animage is actually authorized to do sobut without wasting the pull um extratimes we've got some extensions to crashloop back off so now you can sort ofbetter tune crash loop back off andstart it off at a different time so youcan you know sort sort of uh better tuneto the behavior of um the containers onthe node and then we have um projectedservice account tokens for imagecredentials which is also which is a wayto um allow aa image uh plugin uh image credentialplugin to use service account tokensinstead of um just you hard codingthem and that can allow you to customizebased on name spaces and stuff like thatum finally just a couple ofmiscellaneous things that I was excite=dabout but couldn't really find a bucketor there's the PSI metrics which we'llbe able to be uh reporting in alpha sonow we can potentially take uh use thosethat information and take action on uminformation reported by the kernel abouthow long certain pods and certain croupsare waiting for certain resources likeIO CPU and memory and then we've gotsome CPU manager policies franchescomentioned the um split L3 the uncorecache piece of it and then we've alsogot these other two things so bunch ofwork this was not even everything um andso we're really excited we've been doinga lot and uh you may be wondering howcan I help with all of this excitingwork and I'm glad that you asked andhere's a non-exhausted list of the waythat you could help if you want to joinand help out in the SIG this is roughlyin order of maybe some of uh from top tobottom of things that like you know wereally need we've introduced a new uhrole in the SIG which not officially butprobably going to do that soon um capwrangler which is someone who helps outin the cap process you're not actuallyauthoring cups but you're just likehelping wrangle them along to make surethat the authors stay along withdeadlines which are constantly happeningum so this is a really helpful um thingthat we found and definitely contributedto our um record number that we hit in133 we also have a CI subgroup whichmeets um weekly and is uh you know goesthrough uh any issues that are happeningin CI and you know triages issues andthen also goes through open bugs um andhelps assign them so if you want to getstarted on fixing bugs which is anotherthing you can do joining the CI subgroupis a good way to get introduced to thatthen we've got just general PR reviewand feedback we always we have so muchstuff going on we can love um feedbackum documentation help always can usethis um trying out new features thatwe're pushing in the SIG is a great wayespecially if you're an end user and letus know what you think of them attendingthe SIG node meetings um we always talkabout a bunch of fun stuff and um andthen finally if you really want youcould be running features though thebarrier for that you know we have a lotof things going on and the approvershave a lot to look at so uh we can'tmake any guarantees that we can getthings in but we love when people um areenergetic about the new work and that iseverything thank you so much for joiningum please we have the feedback form hereand you have to be nice to us because welook so nice smiling here so thankyou i think we have time for questionsdo we have time for questionsyeah come to the mic pleaseum hey um so I saw the slide about theCPU resource policy and I wanted to knowif you ever u made a benchmark on how itreally impact the the containers becausethere is a significant overhead of doingit and understanding what we're doing sowhat is really the benefits hereso yeah um every feature is driven by acap pro possibly so in that cap uh inorder to graduate there is a a benchmarkrequired okay what's the benefit thischange is granting for the very exampleI provided the the benchmarks are in theballroom from 20 to 30% with certainworkloads for that specific feature onon the selected CPU not any random CPUbut so yeah but in general I want tostress that yes when a new feature isproposed except especially for aperformance-oriented feature people hasto demonstrate the benefits and thecondition which on which those benefitsthey manifest themselves thank you thankyoua question without a microphone let'sassume that I'm using a vertical portscaler that means that thesedays the VPA won't restart the containerum the question was about in placevertical pod autoscaling and it waswhether the VPA will be able to use thisum that is intended to happen I thinkyeah yeah so we made a change in um 133comparing to previous versions so inprevious versions what you can expressis I want to resize and you don't knowwhether couplets will be able will beable to resize without restart so youcan uh disrupt um resize can disrupt theuh workload so now you can uh like wechange the semantic of API so now if youresizeing and restart is required itwill rejected so there is options forthat so that's why like we intentionallywanted to make a VPA work and we hadlike cluster scaler uh consultants andlike we um we decided to change thesemantic last minute and like I mean wealmost get to bait it with previoussemantic in 132 but uh we got thisstrong feedback that it's not what umVPA needs that's why we changed it yeahthank you so muchi have just two questions about arethere any updates about confidentialcontainers and rootless notesnoyeah there there was there is one keptthat is for push uh going to help outwith confidential containers that wekeep sort of um not having the cycles tolook at but we it is on the list ofthings that we might be looking at inthe future um rootless is also in asimilar state um yeah so we haven't Iactually don't know if there's been anypush for uh more work in that space butum you mentioned a secret verificationof images yeah this um the imageverification piece I described uh reallyquickly doesn't necessarily help oh jeezum didn't necessarily help withum root list but does help like issecure a cluster um a little bit betterin a multi-tenant environment but um thefocus of the sig is really more rootfulcublet um because that's just you knowwe kind of expect the cublet to have alot of power um and to be able to do allthis fun fancy stuffso you have if you have specific needsand like um can increase urgency of thisrequest by coming to meeting and liketelling us uh because there's bunch ofpriorities and uh we're doing our bestuh navigating priority and disruptivechanges comparing to reliability of thegrouphey again uh um so in about the podresize um does it take intoconsideration the system reserved andthe cube reserved i mean it can causesome sort of uh node pressure so how itbehaves in this situationit does account for allocatable sobefore resize will happen we will admitthis resize and admission of this resizewill account for all those all the otherports running on this node all the coupreserved of the system reserved so itshouldn't affect uh uh so it also mightblock the resize if there is not enoughplacethank youokay let's do last question and uh ohlast two questions it's finehello thanks for this call and uh myquestion was aboutmore storage things uh for example whenwe want to mount a volume like a fusedriver or something like that we alwaysneed to mount to have capabilities adminto mount volume is there some evolutionsto fix that to prevent a container beinggapsis admin so basically admin ofeverything to be able to mount a driverfusei would not expect a container to needwell so capsis admin on that the cublethas or that the container process has onthe container mounting the fuse driver Iwould expect so in in a typical casewith the volume I would expect the Cubato be the one responsible for actuallydoing the M or the OCI runtime but theCuba asks the OCI runtime transitivelythrough the Z runtime um for a for a umC for when a pod wants to do mountinginside of itself um you actually that'ssomething username spaces could help outwith because you can uh we can relax thevalidation on capsis admin into acontainer um because if a pod is in ausername space it's not a fully umunprivileged thing to do but I think wehave it in the baseline um pod securityadmission policy where if you have ausername space pod then it can have thecapability like more capabilities umbecause it's username space so like thatmay help in your situation but I we'renever going to get to a place where wehave containers um just having capsisadmin and I don't think the kernel isgoing to have a case where it's notgoing to give like not going to make itspecifically capsis admin or some othercapability so there's not much we can dothere okayand will name spaces help to runcontainers inside of containers indeedthat is in fact a driving force that I'mtrying and proc uh procmount type alsowill help with that um but yes both ofthose specifically I'm trying to getthat working for so yeahgreat thank you everyone2025-04-15 21:59:59.902647 ��#�@#��}A7sr1eHJBXKsyou are in the right place if you cameto listen aboutsign it's a introduction deep dive i'mSergey Kangel i work for Googlehello everyone i'm Franchesco Romani iwork for Red Hat hello my name is PeterHunt and I also work for Red Hatokay so in this talk we start with uhdeep with short intro we only have 30minutes then we'll go into deep divethen we go into deeper deep dive andthen we'll lighten it up and uh finishwith all we're doing recently and thenwhat we did um so yeah um please expectit going lower and lower into detailsand then it's pumping up and uh youyou'll know about all the awesome thingswe're doing so s node if you think aboutkubernetes everybody in this room knowsthat kubernetes is 50% API server 50%sign node and everythingelse I know if you go to sig nodemeeting uh they will say it's 80%networking and 20% everything else butwe are in sign node meeting so uh bearwith me uh so sig node uh you see thebeautiful ship carrying containers um Imean yeah wenautical theme here right so we carryingcontainers there is a runtime that helpspushing it forward m6@uh and and the upcoming DRAdriver would also help in know for italso help them to know what the CDIdevice is because they are the consumerof this CDI spec fileso this is how roughly sample spec filelooks like so uh is the cursor work yeahso uh this is how you identify device onthe host right uh you sayvendor.com/device and equal to my deviceis the my first device here and thiscontainer edits is essentially a way totell uh cryo or CL runtime that you knowI if if this device has to be used uh inthe container inside the containerplease set this environment variablethis device node should be uh madeavailable i want these mount paths uhfrom the host to inside the container tobe ready and some of the hooks like Ijust gave for brevity i just uhmentioned the one hook here but thereare other five hooks like createcontainer start container etc that youcan you can useto inside this uh uh container edit nowif you notice uh there is anothercontainer edits at the bottom here so alot of times your devices let's say youhave uh GPUs like say you have four orfive GPUs and they all need to be uh setup in a similar way so you don't want torepeat this this uh edits every time soit will be my device 1 2 3 and everytime you just keep repeating the samesame uh uh stock right instead of thatwhat you do here is you you you definethem in the container edits and thiswill be more clear I have an example ofit how Nvidia does it uh in the CDaspect but but the gist of it isthat all the configuration that isrequired to configure your device typesthey go in this container edits and theconfiguration that is required for thisspecific device uh in this device typego here right this is just this this uhneatly avoids a duplication u so howdoes CDI work in cryo so CDI usually youhave a CDI spec placeinside a specific directory that cryomonitors by default it cryo monitors etccdi etc wire and CDI U who places thisCDI spec inside it could be your devicedriver it could be a demonstrate thatjust say uh can list the devices andgenerate the CDI spec so cryoessentially is monitoring thesedevices what happens after that is fullyqualified name like like I justmentioned device.com my devices uhdevice.com/device is equal to my deviceis passed to cryouh via Qblade uh using a CRF like calledCDI config and then cryo locates thiswhen the when the cryo gets the CDIdevice cryo essentially goes into thesedirectories like etc wire and CDI and itit reads all the files and tries to findthe device using its fully qualified uhfully qualified name and then whathappens once it reads it you you saw inearlier the container edits right hereit will it will start applying this thisum u artifacts here on the OCI runtimespec so your container is start willessentially get modified uh let me goover this diagram here so when you'redealing with say a classical deviceplug-in right you you need to implementan allocate method on device plug-in sowhat you do is uh in that allocatemethod you will essentially decide whichdevice for for example if the user askyou in a part resources saying I wantresources limits and Nvidia GPU andresource quantity one now as a deviceplug-in what you need to do is you needto go ahead and read this uh devicespecs that is generated a CDI spec thatis generated and you're going to selectsay a GPU one here from the from thelist of available device specs in the inthe CDI spec then this device plug-inwill essentially return the CDI namesone or more depends on the request uh tocublet cublet will forward the CDI uhdevice names uh to cryo via CRI and thencryo will as I It locates those thoseCDI files CDS spec files and it willread them and whatever the theconstituer edits that we saw it willapply them on OCI spec that will bepassed to a a runtime like C run or runC which actually runs the container andjust for a brevity I give example oflike you know create container is OCIhook i specified if I would havespecified OCI hook it will get executedat this point but there are there areother hooks that also you can you can umusehere like this is the this is example Igave so and how Adoes it look with thenew DR driver so if you see in old styleDRA plug-in classical DS plug-in youhave resources limit u this way ofspecifying the resources that containerneeds and in DRA you you're going toswitch to using resource claim so you'regoing to say resource claim GPU singleGPU and again the same thing it has tobe translated to a fully qualified namethat cryo can use in your device spec soagain the cublate will when the pod isgetting created cublate is going to callnode prepare resources on your D plug-inDRA plug-in uh is going to return theCDI name uh names CDI device names andthe the flow is essentially identicalafter that where CDI device name isforwarded to cryo cryo modifies OCI specand you can the runtime uh whichactually is responsible for running thecontainer will essentially run thecontainer using the modified config.jsonthat is your OCI spec uh so integrationwith CDI of cryo how does it look likeso crow essentially as I mentioned itleverages CDI specification to readdevice configuration like what do I needto set this device up do I need to mounta file do do I need to run an XCI hooketc um default spec directories in CDIare these are default CDI specdirectories in cryo are CDI etc CDI andwire run CDI but this is configurableyou can you can edit the cryo config tochange these defaults uh it greatlysimplifies device management becauseimagine if you are using a rathercomplex device uh without CDI youthere's no standardizationuh of what operations I need to performso that this device is accessible andusable inside the container uh what CDIdoes is it abstracts that out from fromthe uh a Kubernetes user if you will uhand and places the the burden ofcreating right CD device on themanufacturer right know if you cancreate a CDS spec by the way but amanufacturer can always come up with thebetter CDS spec because they know whatit takes to run the device and then fromyour point of view you're just using aCDS spec whether it's a AMD GPU or is aNvidia GPU doesn't matter you're justreferring to a standard CDI spec So uhand then from the cryo point of viewthere is no restart required you can addremove device specs uh on the fly andcryo will just detect that oh thisdevice is no longer available to me umpractical use cases of CDI that you seeas I mentioned GPU acceleration you canyou can use GPU acceleration uh usingCDI CDI specs uh custom hardwaresupports like you know if you have aspecialized network cards that need tobe initialized in specific way insidethe container before need to be used youcan use CDI FPGA is a good example uhcomplex devices you can construct a flowwhere you can say I want G I want a GPUbut I want with specific network so whatyou can do is your your device driver oror device plug-in or your D driver canactually select the two CDI spec filesand return them to Cubate and Cubatewill pass it along all the way to makesure that so so you can you can havethis setup of uh complicated deviceconfigurations uh sometimes you also canuse dummy devices we we see this oftensometimes people use this where uh a CDIspec gets created but there's no reallybacking device per se you know it's it'sbeen used to uh hold as as an anchor tofor example you can you can have someUID associated with the container pod uhand then you can just hold it there aslong a container running for yourinternal uh tracking purposes if youwill uh I have some real world examplehere how so Nvidia has a DI driver andhow they do it um let me Let's run thedemo here souh okay so I'm on the host and I'm goinginto this directory where the CDI specsare there so this is the spec createdwhen you install a DI driver this is aspec file gets created uh based on whatGPUs and the GPUs are there on thesystem and I have I happen to have fourA100 Nvidia GPUs so this will be createdby the device driver and if you see ifyou remember I said you know you havecontainer edits so it has device node uhenvironment variables these are thehooks you need to execute and in thehooks now the it happens that thisparticular uh device requires createcontainer there are specifBic mounts thatare that need to be added in thecontainer and this is the secondcontainer added per device where thesedevice nodes also need to be added inthe container uh config so this is howthe if fully qualified uh name lookslike uh the at the bottom kinduh hold on yeah kind and the name so theit will be kubern ks.gpu.in.comu device equal toGPU3 so then we go to other terminal andwe will we will run a sample workload uhwhich uses the resource claim uh fromthe DR so we we go and execute a samplesample execute command just to just tomake sure that we are running on realGPU it will just print out uh the GPU uhthat you have on that system uh then itwill have resource claim uh thenresource claim is is referred here inthe resource claim template this is a uhthis is a sample resource claim templateand we going to go ahead and and runthis hereokay let's say part got created and thenwe let's say they are running uh we goto the another terminal I have open onthe other on the same machine and herewe see remember we had just one CD specnow we use two resource spec uh uhresource claim so the dr driver has goneahead and created these two additionalCDI specs in the same directory and ifyou see there is I remember I wastalking about there's no real devicebacking it up here um but they are usedto track the claim of the DR so yeah wesee it here And then now we go just tosee the logs just to make sure you knowwe we requested uh we requested uh GPUusing DRA which internally use CDI andwe really got it so so that command hereup you see it returns the name of theGPU uh we'll go ahead and delete itright now and what we expect to see issince once you delete the pod the the DRwill remove the corresponding resourceclaim uh so we expect that thecorresponding CDR device also to beremoved and it is removed uh so this ishow you you would do it uh this is justexample but if you if you want to havesomething similar done similar so whatyou'll do now you're going to have ademon let's say and you're going toimplement a device Dr pl plug-in uhinterface and uh as I mentioned here youwill have to yeah implement this nodeprepared resources which willessentially uh select the DR uh selectthe uh device that you want to use withthis particular claim and you'll you'lljust run with the flow uh with that uhyeah then if you want more resources ifyou want to understand more about cryoyou can go to this uh uh GitHubrepository cryo cryo we also have happento have cryo as a default in open shiftuh about CDI there's a very good talk byKristoff and David uh on GPU sharing andCDI device plugins uh that will thatwill help you understand even more howthis CDI came to be and if you want tounderstand more about the actual deviceuh CDI interface itself uh they havetheir own GitHub page and I willrecommend that you should uh read thespec if you are if you are interested inthat uh I think with that I am done ithink you have a questions you can usethe mic here there's michere uh hey so is there an option tospecify in the specs um a fraction ofGPUs um instead of uh one full GPU for asingle container uh you mean to say incase of Nvidia can you specify uh let'ssay a mix slice instead of whole GPU isthat what you're referring to yeah okaybecause the the the example you showedis uh showing how many GPUs you want butis there a way to use instead of onejust fraction yeah so what what wouldhappen in case is that you would use theNvidia say I I'll stick with Nvidiabecause you have a practical example ofthat so you will use uh GPU operator andyou will uh you will say mix MIG enableon that uh you might select say mixstrategy to single or mixed no uh and itwill create the slices on the host uhthen your DR driver will essentiallyenumerate all the slices and shouldcreate the CDI spec corresponding CDIspecs in that directory and when themoment they are created in the directorycryo will essentially just will read itand they will be available yeah allright thanks uh uh do we have any otherquestionplease uh my question is if you have toadd these devices in the config filedoes it mean that only administratorscan add devices or can users also adddevices themselvesso I will frame it slightly differentlyso uh CDI doesn't magically createdevices for you right it's a way to uhexpress the device configuration in amore uh structured way right so the allthe rules that already exist to use thedevices they still exist for for usingdevices via CDI so if administrator wasresponsible for provisioning devicesearlier they it will same it will be thesame in case of CDI what helps here isthat you are not so imagine if you don'thave CDI imagine you you you havemultiple vendors who have differentrequirements in terms of what needs tobe done to uh get this deviceinitialized now you have to make surethat you actually follow these stepscorrectly uh you need to make sure thatyou know the OCI hooks get executedproperly uh and there is nostandardization so there's one one onetype of uh hardware might require thisOCI hooks other type of hardware mightrequire something else on so what thisdevice spec does is it it standardizesth those things and hides it under coverand you are only dealing with the thefully qualified uh domain name fullyqualified device name does that answeryour question um yeah I mean more likethe the CD path that's on the hostsystem right or where uh I mean clusterswhere you have administrators and otherusers that do not have full access andthey might not have access to the ATCDpath to add these configurationsyeah correct so those devices need to beu actual devices need to be administerattached to the node by the systemadministrator and uh uh the devicedriver itself should be installed bysomeone with the right privileges whocan essentially because this devicedrivers run with higher privileges to sothat I can access hardware so that thatneeds to be run in that uh yeah thatwith those privileges yeah all right yesuh do we have any morequestions uh all right uh oh you haveone go ahead okay i have a question sonow uh you write the fs in the file pathright so how uh why don't you use thecommand in directory instead of the uhfile pass u being bender fs or somethingbecause it need to uh I wonder it mightbe leading user uh security vulnerableuh so you I'm not sure if you executelet's say uh for example you're talkingabout OCI hook right why can't I justexecute OCI hook yeah definitely this ishow we do it without CDI right this ishow we do it without CDI But then youare responsible for making sure that youare executing the right hook this specwhen I showed you the demo it was notwritten by me it was the by the Nvidia'sdivided driver so since they make thehardware they know what what drivers arethere on the host and which path to useso it's not exposing anything more thanwhat you were doing it's just that whatyou were doing manually earlier uhmaking sure it works without anystandardization this just automates thatin a way that now there is aspecification that a vendor shouldfollow and uh they should make sure thatit works with that specificationokay and another question so who writeuh this CDI spec file and when sowhenever you install the driver driveris responsible for writing the CDI specfiles like yes it is it is you can writeon your own so as I mentioned like sayyou if you know uh for example it'sthat's actually good question souh there there is a I hope let me justsee if I have the to that example CDIhere yeah so if you see the device thisis the default uh CDI spec that Nvidiagenerates right but it might happen thatlet's say you want to change this youyou don't want to have this void youwant to probably put a migu ID there oryou want to put all you know whateverother value it takes you can do that youjust have to disable the aspect of uhNvidia device driver um to uh not dothat if they support this configurationyou can do it or let's say you decide towrite your own DI driver um definitelyyou can write that either way And uh youdon't allow that driver to write theCDSP you just don't install it youinstall yours mhm okay thank youuh all right i think that was all thankseveryone for coming yeah bye2025-04-15 22:00:00.733105 ��_�A#��uA7-JtDLNT0c8good afternoon everyone uh thanks allfor coming uh I am Harsh Patil and Iwill be talking about uh enhancing thedevice support in Kubernetes usingsomething called CDI uh and how itinteracts with cryo uh CR runtimeuh so roughly the agenda we would goover what is cryo overview of cryooverview of CDI fundamentals like what'sa CDI how it is defined uh how CDI isintegrated in cryo and some practicalapplications and use cases of uhintegration of CDI with the cryo we canwe can have some time for Q&A also atthe end uh so why cryo it's a cryo ifyou see it's it'sCRI runtime but what sets Cryo apart isthat itis completely oriented towards uhKubernetes in a sense it does everythingin its existence it does it for beingwith Kubernetes so including the releasecycles the feature development umthere's nothing else for cryo to doapart from just being inside aKubernetes cluster so that's kind ofkeeps it uh helps us essentially uh dooptimizations uh for example when you dowhen a generic plague tries to listparts in cubate cryohazard optimizationto uh return faster without hitting theuh without traversing the croup fs uhusing cache uh we also have experimentalfeatures in developing cryo we useannotations for that and cryo is alsodesigned from the security point of viewwhere you know because itonly targetsKubernetes the attack surface is verylimited there's nothing else cryo forthere is nothing else we know no otherway to use cryo other than using it viakubernetes like uh so that is where thatis where we believe that minimizing thisattack surface helps makes cryo a moresecure CR runtime uh we're also veryquick to adapt to new security forexample uh we have something calledsafecom artifact support so essentiallyyou can have your safe car configurationnicely kept somewhere in registry andwe'll pull it and apply the seccomprofile uh so uh moving on this isgenerally how a CR runtime will looklike the gRPC is a CRI interface andwhere you have image service you haveruntime service and different componentsthat cryo interacts with in order to urun the containers uh going forward andthese are the these are the some of thesome of the or adapters which whichessentially help us stay motivated tomake cryo better and better with everyrelease u now going to the CDI aspect ofit so what is a CDI uh it's essentiallythe if you expand it it's it translatesto container device interface and uhit's a specification uh for CRI runtimessuch as cryo uh to support third partydevices uh it is greatly inspired fromhow the container networking interfaceis designed and we will dive deeper intothat but before that who shouldgenerally know about CDI like so forexample For example if you are amanufacturer of an accelerator like sayGPU or if you have custom hardware likeFPGAs or for that matter any device anydevice that you want to be used inside acontainer in Kubernetes uh and whichrequires complex setup for example youmight need mounting uh a file systemfrom host maybe it has drivers that arerequired for your for your device towork maybe you need to perform some someOCI hooks uh to prepare the device maybeinitialize the device uh you might needmore than one device node now simplehaving /de my device is not enough maybeyou need /dev/my device 1 2 3 and thatis what makes your device work inside acontainer uh you might want to havespecialized uh runtime specialenvironmental variables uh I gave theexample here of what Nvidia doesessentially they set this Nvidia visibledevices and you can set the uh UID ofthe of the GPU or the the mix licenseand uh you can access adevice so apart from the devicemanufacturer you know people who aredevelopers of say classical deviceplug-in ?Eorkat Microsoft as a engineering lead yepand I am Taylor like I said I am also aCNCF WASM working group co-chair that'swhy we're giving this talk today we'renot giving it on behalf of our companieswe're mostly giving it on behalf of theworking group um and I'm longtime WASOMcommunity member uh I've been involvedin uh WASM stuff especially WomKubernetes since like 2019 and alongtime contributor in that space alongwith the various standard specificationsday jobs i'm an engineering director atCosmonic which is a startup in the webassembly space so with that this uglyslide is purposely ugly because we'renot going to spend a lot of time on itthis is the single slide of WASM historyso we we level set we want to level sethere a little bit around what was how itcame to be and where kind of we are inthe evolution everything here today ismeant to be the unvarnished truth aboutwhat like where Web Assembly is at wherewhere it's being used those kind ofthings as we talk about it and itstarted in 2013 i know it says 2014 onthe ASM.js but that was the latestavailable draft and I didn't want to gospelunking for the original draft so umthat's the very beginning of webassembly and then it it started asASM.js and then became um web assemblyand was integrated into V8 then itbecame a draft standard and then itbecame a core um W3C standard so theseare all standardsbased things that arein the W3C and it's very important thatwe understand that so first off let'stalk about why we have W was in thecloud we've given these kind of talksbefore so if you've attended these we'resorry if this is a repeat but this isvery important for the level settingthing on the left side you'll see thereasons why web assembly was taken totaken so seriously and then used in thebrowser so much was because it's an openW3 standard W3C standard like Imentioned it's sandboxed by defaultbecause you're running untrusted code umyou are small and fast is because likethat time to first initial load mattersa lot in the web space your polyglotbecause you might need differentperformance characteristics a lot oftimes people writing things in C C++Rust those kind of things to be able todo performant web applications thinkthings like Autodesk or or any of theirin any of their suite or like Adobe inany of their suite pretty high int highcompute things that are needed there andperformant um it's also portable rightlike that's by definition and it has towork on Linux and Windows and Mac and ithas to work in all the browsers now ifyou think about that that's why I havethese contrasted right next to eachother is those are the exact kind ofthings we want in the cloud we want openstandards that's why we have the CNCFthat's why this conference exists andthen also it's sandbox by default howmany times have we said "Oh containersare so secure to be bitten by a CVE."Now that's nobody's fault like bugshappen but that's just part of thetechnology you don't have that as muchinside of the web assembly space um itis very small and very fast whichmatters a lot for like functions as aservice with like your cold starts andother things and it's also polyglot umwhich is very helpful in in an ecosystemlike this yes we're very heavy into gobut there's also a lot of people who doPython and Node and C and whatever sothe other thing too here is that it'sportable and that's one of the thingsthat's most powerful about this at itscore level is that portability we'vealways deceived ourselves when we sayDocker is so portable it's not you'remaking a separate build for every singletype of image and operating system youhave so yes it's there's a lot of workthere it does it's better than what wehad before but you don't have the sameportability that you do with WAM solet's talk a little bit more about WASMitself and why is it important yeahthank you taylor do you remember thatstory that you once heard about writeonce run anywhere yeah I I remember thatstory i was really excited about thatstory uh in co in in university i got Igot super hyped about that um well Ithink Wom is our closest thing to it andI think it actFually reaches a lot ofthose things that .NET and Java aspiredto be uh we have the ability to takethis binary file format run it justabout anywhere and produce it with justabout any language that you want therein many of the languages out there istooling to produce web assembly binariesuh this is the stack we want we can runin browsers we can run it on servers wecan run it on edge we can take itanywhere we can build it from almostanything well this brings us to the nextthing what does core wazm really looklike core WASM highle types and stuff nocore wazm is a bunch of uh you know i32suh we're talking about pointers andlengths uh which gives us really greatflexibility we can express strings as abeginning of a pointer all the way toits length end and that's a really greatway to do that however uh do you want todo that with all of your highle typesprobably not um you're going to have towrite uh custom APIs to take thosehighle types those you know their theirformat in memory and be able to describethem to web assembly uh now what thisactually ends up meaning is we have towrite custom APIs custom applicationbinary interfaces um and that is a verydifficult tedious and errorpronejob so we have tools for this all rightanybody use gRPC Protobuff heck yeahyeah we all do right you may use it andnot even know it and it's fantastic wellwe need a solution likeprotoiles.protoiles give you thatdescription of what this highlevel typeneeds to look like on the wire okaythat's fantastic awesome we have thesame problem in web assembly we need alittle bit more expression than sayprotoiles so we have this file calledwit and wit gives us this high-levellanguage to be able to describe theinterface and then from the wit file wehave enough information to generate outuh AIS in each language that you wouldlike to use so do you want to do it fivebillion times do you want to write a uhAI for every single gRPC endpoint youhave i hope the answer is no if it's notno you're a sick person um no we don'twant to we want to generate that outyeah and this component model is veryvery important because like David wassaying we can't drive that home enoughif you're using plain WASM especially inthe cloud you're using custom APIs whichmeans you are locked into whoever oryourself if you created it using thatspecific library david and I work onvery different things in the webassembly space where you're going to seesome of the projects we work on in asecond and yet we're still able to usethe same tooling and the same a lot ofthe same components and things that wecreate with each other's stuff becausewe're using this common AI layer whichis the component model and another thingthat this lets you do is it lets youplug in what you need and when you needit so everything is expressed in theterms of imports and exports we alwaysdisplay these those of us who've been inthe web assembly space has these littleLego like nubs that connect to thatconnect together but you can have acomponent that is written in Rust forexample that is exporting something thatis used by a component written in Go andthose languages don't have to know thatthey were written in that language um sothat's why where the name componentcomes from as well you're composingthese things together and you cancompose and pull apart and put morethings together you can compose on topof a composition which lets you do somecool things around APIs and um basicallyforward patching things as you releasenew features and versions and so that'swhere we get into well what's in thiscomponent that we were talking about sothis is the same thing we were showingearlier and you'll see that we have thatcore the i3 we always jokingly call itnumbers and trench coats that's what theweb web assembly is underneath the hoodand so what's a component well it's justthis little layer and wrapper aroundit's a very very thin veneer you havethe core WASM module that is what likethe component model isn't encoding itdifferent it's using the same thingsunderneath the hood you have memoriesand all the things that are in there andthen you have thisG thin veneer thatexpresses your imports and exports so isthe the first question is people like isthere overhead there is web overheadwith web assembly anytime you do it umbut this is there's very little hereit's basically a lifting and lowering ofthe types into those raw pointers andthen back into concrete types for thelanguage that it's in that's it that'sall that's here now another thing likewe said we're trying to be frank andstraightforward about what the thestatus of thing is of the everything isin in the wom ecosystem and that is thatthe component model is very difficult toimplement but you do it once and thenit's all that it is is this thin veneeranytime you have a runtime you implementthat once in the runtime and everybodybenefits from it no matter whichlanguage that is not the case with thecustom APIs so when you look at thesecomponents remember there is some stuffaround it but it's very lightweight it'snot going to be your bottleneck inalmost any instance um obviously as youstart to get super high performancewe're not going to talk about thatbecause then every like bit and movematters but for the general purpose usecase it's not going to affect youokay so Wazzi and the component modelare still relatively nent and evolvingso um in the current release in the 020uh release uh we are going through andevolving that slowly with non-breakingchanges on regular timebased releases uhit is progressing uh quite well actuallyuh very soon here we will have a releasefor version0.3 version 0.3 is going to introducesome cool new features one of them beingnative async so we will be able to haveasync invocations of your components umand the uh the model will run nativelyin your languages this is fantastic uhbeyond that there will be a lot of uhvery nice ergonomic uh improvements forthe interfaces um all of this work isculminating and leading to a stable 1.0zero release hopefully as soon aspossible so that we can stabilize thisand build really amazing applicationsand you'll see on the bottom here wehave these worlds that are listed worldsare just collections of interfaces thatyou can implement or export or whateveryour your choice is and these worlds arethings like Wasloud which David and Iwork on you saw actually a snippet ofone of those which is the key valueinterface earlier um you have thingslike Wasi TLS and so on these helpbootstrap you into the ecosystem and youcan build on top of these or define yourown interfaces and once again when youdefine your own interfaces you're notcreating a custom API you're just usingand leveraging the same API that'salready there so with that is it goingtoload a funny video no oh maybe maybequick no oh well fine give it a playgive it a play anyway we're going tohave the Wom showdown anyway so whatwe're going to actually do is have umwe're going to go through basically theecosystem right now and talk about someof the biggest projects that are outthere um it's not it's by no meansexhaustive but we wanted to give you afeel of the breadth of projects here sothis is going to be very lightningtalkesque for a second as we go througheach of these things so you can knowabout the different projects first up isthe project I'm a maintainer of it'scalled WASMCloud it's an incubatingproject in the CNCF and it is basicallyexactly what it it does what it says onthe tin um right here it is isorchestration layer for WASM and it canbridge across multiple networktopologies um and also you talked aboutthis composition that composition can bedone by anything fulfilling anythingelse wherever you'd like to depending onhow you wire it up so it's very muchsomething where you can wire up your ownplatform you basically can program theplatform entirely using using WASMall right next up uh so this is Spin umand again I'm a little biased here uh Iam a spin contributor and a contributorto uh spin uh framework which is a CNCFproject spin allows you to uh build uhreally cool microser applications withweb assembly components developerexperience is pretty nice it's a reallygreat way to get started in web assemblyif you aren'tH really deep into it youcan go kick the tires and have a lot offun real quickthere's also cool companies out therelike Render Lit um those are somefriends we have in the in the WASMecosystem where they're using Wii webGPU to build graphics renderingpipelines you don't even knownecessarily that you're using WASMunderneath the hood but it's powered bythe component model and all these thingswe've been talking about oh this was areally cool one so um I'm a big fan ofZed and when I saw them come out withtheir web assembly component uhextensions for Zed I thought my goodnessthis is exactly what we want to seepeople building we want to see peoplebuilding these extensions and so uh werecently had them on to the communitymeeting for the Web Assembly workinggroup they described their pains theydescribed the successes that they had ifyou didn't see this video or if youweren't part of the meeting please checkit out it was really cool insightfulthere's also things out there like theCubeuler WASM extension um and that isactually ongoing work and we have somereally cool work being done here in thisspace in general you're able to plug inthings into the WASMuler or thing thingslike policy will see that down the linethere's actually a talk by raise yourhand um those of you over here who Iknow are doing the talk uh tomorrowright um talks tomorrow yeah the talkstomorrow where they're going to betalking about what they did with thecomponent model the the challengesthey've had with it it's going to bereally cool so this is something that'salso going on out in the ecosystem within this in this space okay another onethat I'm very biased on uh hyperllightis a CNCFbased project uh that was uhborn within Microsoft in my group inMicrosoft and some of the engineers thatare working on it are right here in theaudience um hyperllight Wom is reallyneat because what it does is it providesyou a hypervisor boundary around yourweb assembly components that you can youcan execute and use a defense in-depthstrategy with extremely low latency andhigh performance um give it a try it'sreally awesome if you are extra paranoidlike we are running public clouds yeahand this is one of the ones that I'malso very interested in doing my andplaying around with myself i've beenfollowing it along for a while andthat's really cool we also have thingsin other big public clouds this is oneof them this is uh built on top ofbasically proxy WASM um which is acustom API it's not the component modelbut this is like inside of of Googlecloud you have like service extensionswhere you can do proxy wom to plug infilters and other things into your intoyour pipelines it's pretty awesome toolum very flexible as well oh neat allright so uh inspector gadget soinspector gadget uh allows you to youknow instrument collect informationusing uh ebpf now uh last year aroundthis time uh we were talking with thosefolks and a little bit ofcrosspollination happened uh we uh gotthem excited about web assembly we gotthem excited about components and stufflike that and so they added web assemblysupport to inspector gadget you canwrite your gadgets that plug into theirebpf pipelines to go and collect thedata that you want and customize it it'sreally neat you also see this popping upin things like engine X so engineextunit has web assembly support also theyare have some component model supportand are adding more so you can basicallybuild extensions to engine X in highperformance server situations as well sojust once again you start to see howmany different kinds of products thisthese things land in ah so come toKuborton so uh do you want to write yourpolicies in whatever language of choicethat you prefer uh I would say that'sprobably a yes uh this is a good way todo it cube warden exposes uh the abilityfor you to use web assembly to writeyour policies in your Kubernetesclustersyou also have things like low-levelruntime so W was edge is another CNCFproject that's very much geared towardsas it's in the name the edge kind ofedge devices it has a lot of built-inthings for AI and other and other coolgoodies Iit's very batteries included umthey have partial component modelsupport and are implementing moreconstantly so um they're yet anotherimplementation of all this that lets youhave those cool extensionfeatures all right this is a lot of themthis is very different than our lastpages um so what these are are databaseswhat if you could take your compute andmove it directly next to your data uhthat would be kind of fun it would beinteresting it would be useful uh youdon't have the latency of going back andforth from that data data store now eachof these databases supply web assemblysupport in one way or another whetherit's UDFs or store procedures or someother way uh you can extend thatdatabase functionality with your owncompute written in the language of yourchoice and this is why we are so excitedwe are so excited because there are aton of projects out there that are usingweb assembly in production today you cando it too there are some rough edges butthere's a lot of great work out thereyeah and if you can't tell the GIF islike a live view of David right now onstage with it but really this is this isexciting there the web assemblylandscape is fairly huge there is anactual dedicated like wom section on theCNCF landscape if you want to see howmany things it's in just inside of theCNCF so let's talk a little bit aboutthe future um the future is where wherethis gets exciting um really let's talkabout a few of those things here we'llwe'll go into a little bit more aboutsome of them in just a second but reallythe future we see is it's a plug-inmodel to end all plug-in models rightnow you have this confusing messdepending on what you're using it couldbe a standard in standard out interfaceit could be just calling a raw binary itcould be gRPC there's a whole bunch ofdifferent things that people use toextend with plugins with a plug-in modellike this you can start extendingtheuler in Kubernetes you can domutating web hooks this way you canthere's all sorts of the these kind ofthings you can start plugging intosystems that already exist with WOM andI'll show an example why that'simportant in a second but it also givesyou new ways to build and runapplications as someone who works in aWom cloud that's very much what we'retargeting with WOM cloud when you startthinking of things as interfaces andthings to be consumed to satisfy thoseinterfaces it starts to unlock verydifferent ways of programming andbuilding applications and I think it'sgonna really make the developerexperience and the platform buildingexperience a lot better but we also getthings like integrating with existingtools here in the future david'sactually I think been doing a bunch withbuild packs right indeed uh so at twoyou can come and check out uh the workwith uh we've done with build packs tocreate uh a much better experience uhfor your development of web assembly soif you look at like how you go tocompose uh components together and buildthem it's quite a complicated recipe uhbut we can transform that we can use thetools that we have available to us inthe CNCF the projects that exist todayto not only extend those projects but tofurther the other projects that we havearound us so one of those is cloudnative build packs and I'm very excitedto show you all how it becomes onelinerto actually build your web assemblycomponents yeah and I think you're doingthat today right in a talk indeed 2 p.myeah so we're we're really excited tosee those kind of improvements in thefuture because that is once again interms of frankness web assembly is alittle bit finicky with how youespecially when you start composingthings or putting them together and insome of the languages it results in avery chunky boy binary or um it it canbe a little bit janky just to be just tobe honest but what you can do with it isonce you've set that up you're able tobuild these really cool things andpeople like David are really trying tomake that even better so let's do alittle example of like the plug-in thingbecause I think this really helpsillustrate it how it can be usefulbeyond plugins but it's something reJallyrelatable so let's use cube control okaycube control has plugins right they'repretty much arbitrary binaries right youhave my plug-in binary right so whathappens that plugin is just a randombinary you have downloaded from theinternet now yes we're all not dumb yetwe all still curl pseudo bash so um likethingshappened well what if that plugin doesan OSR read file of your Etsy passworddatabase oh no like bad right that's thewhole point like someone's just poppedyour whole machine with a single binarythat's in there now let's compare thatto what happens with the component modelin web assembly you have your plug-incomponent and you see the world theworld comes in so every component isself-describing it know you know exactlywhat it's importing and exactly whatit's exporting and if you look at it andgo "Oh crap they're using the filesystem i don't want to use the filesystem going to deny it." And thenyou're fine so the whole point is thatyou're able to see exactly what's thereand you can provide something and thisis a simple example but you could sayI'm going to provide a virtual filesystem or I'm going to pres provideaccess to the file system that only hasaccess to a specific directory path andso on and so forth you're able to reallytweak the security model here but alsoextend all sorts of capabilities imaginethis where you have something that sayslike import like Kubernetes colon clientor Kubernetes colon whatever interfacesyou want to provide could be in thereand given to a plugin and then you notevery single plugin has to like pull inthe Kubernetes CL like the like clientgo or any of those kind ofthings gosh wouldn't it be nice to beable to pull in client go yesi don't know uh most people probablyhave not tried to do that if you try togo pull it into client go right now andbuild that with tiny go you're gonnahave a bad day uh it will not workbecause we are missing some bits uh thatneed to be implemented in tiny go umthere are a lot of things out there thatwould benefit from having really awesomego support if you think about the CNCFas a whole so many of our pro projectsare written in Go and the old sayinggoes see go is not go right so if youhave to go and build your project say uhto use wom time uh the runtime that wemost often use uh wom time is a rustproject we then have to do you knowlinking to csh shared libraries uh thatis probably going to stop somebody coldin their tracks if they're go developeruh I don't like to use go personally iprefer to use pure go and we do have anice option for you know running webassembly modules in pure go while zerois pretty solid it's pretty good um weneed to have that option for componentswe need to be able to light this up inpure go this is this is what we'rereaching out to you all for i want toget people excited we want to get peopleexcited about the idea of what the CNCFecosystem could be like if we had pureGo support for components if we had pureGo support for web assembly componentsthe world is a much brighter place iguarantee you if you think about thatyeah this is really what like Dave wassaying this is one of the biggestinvitation we have if this interests youthis is what we need help with like Isaid it's complicated but you do it onceand then everybody in the ecosystembenefits from it and that's verypowerful like this is the way like thatlike this is this is how we do it thisis how we keep moving forward with thecomponent model especially in such agoheavy ecosystem like theCNCF so with that just a couple thingsof what can you do beyond that that'sthe big one there's a couple otherthings you can also try out the projectswe've recommended and any of them seetry out the wasom side see what happenssee what you can figure out um with whatyou can do um give us all the feedbackall the rough edges all of those thingswe want that also you can come see us atthe tag runtime booth tonight both Davidand I will be there we're going to havevarious spin cube's going to be therefor a project and cloud's going to bethere for a little bit and and the WASMworking group is going to be there alittle bit tonight all during the boothcrawl in the late afternoon today cometalk to us learn more and then you canalsoum take a look at any of these QR codesto just see efforts that we've worked onthey're they're things that you can comesome of them you can still join in onothers they just serve as an example ofthings that we've done in in the pastsuch as integrating with OCI um so withthat we'll go ahead and open things forquestionsi think the mic up here is for questionsright yes okay so go ahead and come upif you're going to ask some questions soeveryone can hear it on later recordingsmight be a dumb question but um thehypervisor based security layerhyperllight i thought the whole point ofwas to provide a sandbox level ofisolation so why do you need ahypervisor as well that is a reallyfantastic question so the sandbox of Womis in fact uh provably secure that's youknow that's what we rely upon now WebAssembly itself doesn't necessarily likeWM itself doesn't have any idea how toaccess a file system it doesn't have anyidea how to make an HTTP request thosethat functionality is actually providedfrom the host below the web assemblysandbox so um you can imagine in wyhttpd interface for say outgoing uh httprequests uh that re that function getsinvoked the host handles the requestthen makes a request out to that HTTPserver brings back the response passesit back into the guest the guest beingthe web assembly sandbox uh it thenoperates within the guest now all ofthat stuff in the host is your trustycompute base so do you want to do thatuh there there can be bugs in thatsection and bugs in that section couldlead to CDEs for example if you'reaccessing a file system relative pathinguh have we not seen millions of relativepath issues uh across the years uh ofhow those are handled so so the idea islike just as a container relies on ashared kernel if you're relying on thislike shared services layer which has amuch larger attack surface is that theYes no sharing of the kernel yeah it'sit's very much for those when you're onthat shared ecosystem like withfunctions so for example in contrast tosomething like WAMC cloud we don'texpose file system access or evenenvironment variable access by defaultso that's taken care of because theother things enforce those but when youhave to give file system access or likeraw socket level access you needsomething in in a very high trustedcomput like think something like Azurefunctions or that kind of thing you'regoing to need huge amounts of isolationit's the same reason stuff like um whyam I forgetting it all of a sudden uhfirecracker uh exists even forcontainers you could say well containerscontain it right well sometimes you needa little bit of extra isolation and it'svery useful for that with containers iget it though because no one's saying asand a container is a sandbox uh so yeahI guess you could see this hyperllightthing as microVM for WA is that Ohthat's a good way of stating it uh youcan also use hyperllight to run native Ccode and all sorts of other things tooand just with the I thought the wholepoint of Wazi was that you didn't havedirect file system access so that's alsouh you can see if you have direct filesystem access how it's provided isentirely up to the host like I said youcan virtualize it which is one of thebig security benefits so to be clearlike you can virtualize anything ifsomeone has a socket connection youcould be limiting what that socket'sdoing or it could not even be a realnetwork connection for a file system itcould be in memory right okay that kindof thing great thanks so much yeah thankyou excellent question any otherquestions i think we have time for maybeone or two more if anyone has anyokay awesome thank you all so muchyou've been an excellent audience andwe'll hope to see you around2025-04-15 22:00:01.568992 ;;��2�C#��AnclJn1KEjisgood morning everyone how's everyonedoinggood good good happy to see all of youhere uh and we've been having somereally nice weather in London as youknow we've been all enjoying it sunnyweather and that also you know reflectson some of the really cool stuff thatwe've been making progress on in OpenTelemetry uh and I'd really like tothank you all for joining in today uh aswell as for your support for the projectand for your contributions for those ofyou who are who are contributing tohotel could I see a show ofhands all right so we have a few of youand I hope that you know more of youwill be joining in to contribute to theproject so uh with that said we're goingto be covering uh and our esteemed uh GCmembers my fellow GC members will be uhtalking about some of the projectupdates on the features that we havebeen building out on the project uh aswell as our community and also theroadmap to graduation so with that saidI'd like to very do a very quickintroduction i'm Alolita Sharma i leadobservability at Apple and I also havebeen a long-term open telemetrycontributor and a maintainer uh and I'dlike to introduce Daniel Gomez Blancofrom New Relic uh Austin Parker who isthe lookalike for Severron Newman whocouldn't join us today and uh fromHoneycomb uh Tras Stalaker u also on theGC from Microsoft as well as PabloBayans from data dog who is one of ourcore collector contributors so with thatsaid I'll turn it over just run througha very quick community update and thenwe can dive into the features and otheruh uh roadmap updates but uh please holdyour questions for the end because againit's just uh we'll do a quick runthrough so that we can actually allow alot more Q&A time so you know it'll giveyou some time to get prepared with yourquestions all right so as you can see umand I hope you can all see the slide uhthis is kind of the trajectory of youknow open telemetry's contributions overthe last five years um you can see overyou know five years ago we were aboutless shy of a thousand uh you knowcontributions uh and also uh you know offolks were starting to contribute veryactively but today you can see you knowour our a kind of shot up to 7,000contributions a week and more than 500active contributors working on theproject at any given point in timethat's pretty phenomenal you know andand as many of you know uh we are thesecond largest project now afterKubernetes in the uh CNCF and also ofcourse you know as any ecosystemdevelops the observability ecosystemwhich we'll be talking about later uhwe'll kind of highlight some of theseareas uh also another uh few metrics youknow that you would be interested from acommunity uh standpoint again we have avery very nice welcoming and a healthycommunity uh and very proud of it uh wehave over a thousand plus organizationswho have contributed to open telemetryover time that's pretty phenomenal um wehave 255 plusorganizations at any given point in theweek contributing actively which is verycool because you know 25% of the uh yL��M�B#��QAKK0FKiQ7nisas you can see here this is probably oneof my favorite talk titles in the lastcouple years is was am I right or was amI wrong um I'm Tara Thomas this is Davidso we'll go ahead and introduceourselves real quick hey I'm DavidJustice uh a co-chair for the Womworking group uh maintainer of Run Wisyand Spin Cube uh two CNCF uh basedprojects for web assembly uh I'm theauthor of Go for Dev Ops and acontributor and champion for multiple wyspecifications uh also my day job I wDMouknow long-term organizations actuallycontinue to contribute and you know worktogether as maintainers as well ascontributors uh on the project i'velisted out uh you know some of the topcontributors if you will in terms oforganizations in that colorful band ofuh uh folks and uh as I said 550 plusdevelopers at this snapshot in time areyou know very active contributors uhmaking contributions as PRs or issues oryou know triaging um many different waysthey're involved in the project weekover week we have we are a very largeproject so if you've seen our repos onuh GitHub we have about 80 plus reposwhich means that we have a lot of workgoing on on the project and uh it'sreally exciting to see a lot of the workthat integrates not only and makes ourcollector better which is the core heartof the uh open telemetry project butalso our SDKs and libraries uh and otherintegrations we have more uh you knowboth a very healthy mix of vendors andenduser organizations working togetheron the project which has been justamazing uh and globally uh prettyamazing 105 countries at count right atthis point in time but 100 pluscountries that you know havecontributors coming in and contributingto the project so with that said I willturn it over to Daniel who is going todive into end user adoption and someother cool thingsthere's some seats up here by the way ifpeople want yeah there are some seats atthe front if you want to if you'relooking for a seat uh so on the end userfront um we we continue to see opentelemetry everywhere open telemetry is akey part of u any observability strategynow the important thing that we've seenthe change that we've seen is that maybea year or two ago uh some of the talksthat we were seeing were about toorganizations that were starting to useopen telemetry or that were thinkingabout using open telemetry and now we'reseeing use case studies and talks hereat CubeCon that are from largecaleadoption of open telemetry successfullyand end users talking to us about theirvalue that they're getting from opentelemetry from the point of view oftooling consolidationfrom the point of view of effectiveobservability and the changes thatthey're seeing within their optim theirorganizations from adopting opentelemetry so some of the uh as well someof um case studies in CNCF some of umindependent blog posts and podcasts ofum yeah organizations that have adoptedopen telemetry at scale already so thatis really great to see and if you're anend user uh tell us your story i thinkit's always really good to see that typeof uh how how are people doing opentelemetry atscale closer to home we have also seenopen telemetry being consolidated as thestandard to instrument cloud nativetooling the CNCF projects that you seelisted here are using open telemetry umtooling are using open telemetry APIsnatively and providing their users witha way to export and to process thattelemetry in a in a backend neutral wayso uh this is I'm not sure this is acomprehensive list there may be otherprojects out there or the CNCF projectsthat are using open telemetry so ifyou're one of them and you want to belisted in our list of ecosystemintegrations uh come talk to us or raisean issue in the uh in the website repoand then we can get your project listedas well and tell users how they can comeand use open telemetry with your CNCFprojectanother trend that we have seen emergingas well and that very much aligns withthe with open telemetry's vision is howwe're changing how we're instrumentingour open source libraries umtraditionally the way thatinstrumentation tends to work is uhwe're using uh instrumentation librariesthat are maintained by open telemetry uhcontributors and uh they use techniqueslike monkey patching or bite codeinjection to then instrument open sourcelibraries at that instrumentation thattelemetry and those use the samecrosscutting APIs now that are stablefor metrics traces and logs and use thesemantic conventions to have thatunified view that that single stream ofcorrelated data with your along withyour application code and then then solike that being abstrNacted from the fromthe SDK implementation then allows thethe end user the user to initialize theSDK with their own config and thenprocess aggregates and export thattelemetry wherever they want to exportit but we believe that telemetry shouldbe baked in and baked into open sourcelibraries so uh we are seeing more ofopen source libraries out there um thatare using open telemetry natively sowhen we talk about OTL nativeinstrumentation is instrumentation thatis added to open source libraries usingthe same taking direct dependencies onthe same API and semantic conventionsand then working together withapplication code with your custominstrumentation to provide again thatsame context that same correlationcontext to give you effectiveobservability now if you're an owner ofa library and then you want to provideyour users with a vendor neutral way ofapproaching telemetry instrumentationthen you can do u yeah you can justinstrument your library with the API andthen um allow your end users to then dowhatever they want without telemetry atthe end of the day just aggregate itexport it everywhere and then from thepoint of view of the most importantly Ithink is for a from the maintainer of anopen source library from thatperspective is having the the ownershipof that telemetry it is your library youcan tell your users how they observeyour library how they observe yourapplication right so it's not up to anexternal maintainer to come andinstrumented you can take ownership youcan take you can make make your usersobserve your code as you want them toobserveit and then we're seeing that happeningalready in the open source uh in theopen source in open source ecosystemswe're seeing Deno for example now as aruntime as a JavaScript runtimeproviding open telemetry out of a boxwe're seeing Quarkus as a Java frameworkthat again provides open telemetryintegration and other open sourcelibraries like the Azure SDK the clientthe open um the elastic search clientmass transit um and service bus or umthrottle um or umuh mass throttle but um so movingforward touh the project more like what we'redoing inside the project I hand over toAustin to talk Uh some very excitingupdates thankyou wait did I get applause cool u wehave some more seats over here too ifpeople standing want to come sit down noall right thought I'd offerso let's talk about what's going on inthe big old world of hotel today umyou've probably gotten sick of hearingthis by now but hey graduation's stillin progress we have fulfilled all ourprerequisites for it um there's a lot ofpaperwork um I've been told that itshould be happening pretty soon uh oncesomeone from the TOC picks it up but wewe have quite a few people you know umset up as adopter interviews we've donesecurity audits we've done third partysecurity auditsum graduation is an important signal topeople outside of this room to thegreater cloudnative community andespecially to a lot of enterprises thatalign their internal adoption of opensource with the cloudnative uh maturityroadmap graduation is kind of us sayinglike hey yeah this is something you canrely on to be here for a while as partof that we've also been working withLinux foundation education to uh do moretraining and certification around OTELso there's an open telemetry certifiedassociate um program which is an examit's in beta still I think and we'll becoming out of that pretty soon they'redoing some final alignment of stuffthere um but you'll be able to say heyyeah I'm certified at open telemetry ina positive way uh we also have a freegetting started with open telemetrycourse that you can take uh throughLinux Foundation education which is nicebeyond that we have just a lot going onyou know as one of the good things aboutbeing having so many contributors andhaving so many people in the communityis that we're able to really you knowwork on a lot of things at once this isa sample of some of our blogs since thebeginning of the year right where we'reum big shout out to our JavaScriptcontributors maintainers that are in theroom for uh SDK 2.0 which solves a lotof challenges thaOt JavaScript usersreported to us over the yearsum we're doing a lot of work with Gocompile time and runtime instrumentationand telemetry via ebpf we've beenworking to improve the lambda experiencefor using open telemetry a lot of that'sdriven by um a couple of contributorsthat are just very passionate about itso thank you if you're here thank youfor all yourwork beyond that we have a few biggerthings we're taking swings at this yearum you've probably heard us talk aboutprofiling we announced this a coupleyears ago at CubeCon that we were goingto adopt it as a you know fourth majorsignal and they've been OTLP since um1.3 we've been working with a uh thepeople at Elastic donated a systemwideprofiler to the project we accepted thatand it is now sort of the core of thisEVPF profiler um initiative that is inprogress that's being actively developedi believe there's quite a few peoplethat are working on that here at CubeConso come find them at the observatory ifyou want to talk to them the collectorhas a new um profile support that youcan enable via feature gate and we'recontinuing to work on semanticconventions and specification around itand hope to have um basic gates rightnow it's in alpha but hopefully we'll bemoving intobeta thisyearish now for a real fun oneeveryone's favorite topic logsso as some people might be awareoriginally um our plan at open telemetrywas not to provide a userfacing loggingAPI we wanted to there's every languagehas let me step back a second everysingle person on the planet has verystrong opinion about opinions about howlog should be done right and we didn'treally want to kind of step in that andcome in and say oh here's the new betterway right so our original plan was tooffer just a bridge API between existinglogging facads and the open telemetrySDK so that you could just take yourexisting logging API point it into theSDK and get um trace or hotel context tolog correlation however as we'veum but continued to work on things uhsuch as semantic conventions especiallyaround genai and also just generalcommunity feedback we maybe did not hitthe mark with that original decisionabout what the logging um facade shouldbe in hotel and so we are currently inthe process of defining a userfacingstructured logging API you might alsohear this referred to as events[Music]um what this will do it's currently inalpha um we're not getting rid of thelogging bridge you'll still be able toconnect existing log APIs into the SDKjust like you can today in all languagesbut there will be a new userfacinglogging API which does mean that you'llbe able to actually have um logs tracesmetrics profiles everything done througha first-party hotel API and when thiswork is completed it unlocks a lot ofnew things this kind of finally unblocksus around client side and rum telemetrybrowser performance telemetry uminstantaneous events structured eventsfor genai uh a lot of applications inthe CI/CDworld so if you'd like to talk to morepeople about that this week this wouldbe a great time to you know get feedbackon it and all this work is being done inthe specificationso with that I want to bring up TR totalk a little bit more about semanticconventions and other other fun thingsthat arehappening thanksAustin so semantic conventions definesthe shape of your telemetry um forexample standardizing metric names anddimensions for HTTP metrics um these arecritical part of the open telemetryecosystem uh since they define theconsistency and usefulness of yourtelemetry um semantic conventions inopen telemetry is a very broad effort uhacross we've got nine sigs uh dedicatedto different types of semanticconventions uh covering 74 domains andalmost a thousand uh semantic conventionattributes at this pointum declaring specific semanticconventions as stable is one of our toppriorities right now since it will allowthe whole ecosystem to depend on thesesemantic conventions from librariesimplementing native instrumentation tothe dashboards and alerts that you useum closing in on stabilization um wehave uh database client uh semanticconventions and the code Pannotationattributes are both in release candidateand planning to go stable uh by the endof April um and shortly behind that isthe system metrics group which isworking towards uh quickly towardsstability as wellum upcoming stabilization efforts beyondthat include feature flags Kubernetesmetrics messaging andRPC um and I did want to call out uh acouple of uh other uh highlights uh theCI/CD semantic conventions if you're atall interested in this there's a talkright after this uh right across thehall um check that out um the semanticconventions for generative AI and theassociatedinstrumentations is a very active spacea lot of uh a lot of folks interested inthat and a lot of work to do if you areinterested in contributing in thatarea and we are redoubling our effortsaround browser and mobile semanticinventions now as Austin mentioned we'veuh unblocked some of the uhlong-standing blockers uh of thatwork uh I also wanted to take a minuteto share some ongoing work uh in thesemantic conventions to integratesemantic conventions into yourdevelopment process um since not all ofyour telemetry is going to fit theexisting semantic conventions you'regoing to have your own custom telemetryyour own uh and this will allow you tohave your own define your own customsemanticconventions um you'll be able to defineyour own telemetry schema um generatetype safe SDK to emit that telemetry umautomatically validate that telemetryfrom your tests automatically handleschema changes as you make changes toyour telemetry um and ultimately this uhintegrating this into your developmentprocess and uh using the associatedtooling that we've been working on umwill protect you against breakingdashboards and alerts which is somethingthat everybody hopefully can appreciateum if you are interested in this this isuh work in progress um at varyingdegrees of maturity um there was a greattalk yesterday if you missed it uh checkout the recording once it's postedonline and with that I will turn it overtoPablo thank you D uh so on the collectoras you know you can uh create andpublish your own components that workwith the collector framework uh but weoffer a curated selection of those underthe open project umbrella since lastyear in Paris we published uh 43 newcomponents um it's impossible to talkabout all of them but some highlightsare you can now uh use with the builderconfig providers to for example get yourconfig from an S3 bucket uh we allow youto use telemetry schemas that relate towhat Trask was just talking about uhwith uh the evolution of semanticconventions and we also offertransformations that help you usePrometheus uh with the delta 2 communityprocessor uh we've also been working onthe road map we set for collector 1.0 umwe are trying to get a more minimaldistribution to be marked as 1.0 and forthat we are starting with the foundationof the libraries uh that make up thecollector and we marked 16 of them as1.0 in the last year and finally uhwe've been trying to grow the communitywithin the collector SIG uh with fournewers eight new approvers and three newmaintainers across all the um collectorrepositories um shout out to all ofthese people thank you for the work youdo um we're going to continue working onuh investing on this component uh you'veseen all before the the profiling workwe're also doing in the collector uh butwhen it comes to the collector 1.0project I wanted to highlight uh we feellike we're now ready to start marking uhour first pipeline components as 1.x uhstarting with the OTLP receiver uh sothat's what we're going to work on nextthere and then we are finishing somerevamp of uh botchin moving it to theexporter to address some shortcommentsof the batch processor and of theinternal telemetry so that you can uhobserve the uh internals of thepipelines that you define in thecollector so yep you can join us to helpstart marking components as1.x um so now I'm going to talk about uhfeedback uhwe are working as part of the end userSIG to empower maintainers to u be ableto create surveys and get insights aboutwhat our end users need uh we run sixsurveQys last year uh with more than 500responses in total i'll talk about acouple of them in the next slides uh andwe also uh try to recognize ourcontributors um last year in Salt LakeCity we did our first ever opencommunity awards and we will continuewith this effort recognizing the impactof contributors in our community um inthe nextcubecon uh one of the surveys we've runis the developer experience survey uhwhich uh got over 200 responses uh frompeople using open telemetry with morethan 15 programminglanguages and this help us identify gapsin uh some aspects of our documentationexamples debugging experience that thedeveloper experience SIG is going tohelp uh address across all of opentelemetryand on the other hand we've also done asurvey for contributors um includingmaintainers uh we got responses uh frompeople that have contributed to uh everysingle SIG that we have on on opentelemetry and about 40% of allmaintainers and uh I mean overall um westill need to to analyze the data fromfrom these two surveys we will publishuh the results on the blog post but uhthe overall feeling is that contributorscan solve problems and maintainers uhoverall have autonomy uh for theirday-to-day decisions but we want to lookinto improving the transparency andreadability of the decision- makingwithin the project and also reducing thedependency on um synchronous meetings umthis is pretty important given theevolution that we've seen ofcontribution across regions as you cansee AMIA contributions have grown a lotover the years and we've also seen aslight uptick on contributions frompeople in Asia-Pacific and we wanteverybody to be able to participate sowe want to look into more async ways ofworking and with that I'll pass it backto Alolita for closing all right thankyou Pabloall right so I think we have about fiveminutes left so I won't uh take too muchof your time and again use the fiveminutes to ask questions get getorganized um and I also would like tojust uh say that we will all be at theopen telemetry observatory uh in theexpo hall so you know uh please come andmeet us there if you don't get a chanceto ask your question now um I'd alsolike to kind of call out we are actuallyvery excited uh to be doing our opentelemetry contrib uh later today so incase you're working on any uh areas youknow whether that's the collector or thego SDK java javascript uh and otherlibraries do come and uh join in uh youknow we are our maintainers are from theproject will be there so again they canmentor they can answer questions theycan actually show you and go into thecode uh so please please join in it'sfrom uh 4:15 to 5:30uh later today and um it's again on thesame level um in suite 1 so please don'tforget uh to join in and with that Iwill actually open it up for questionswe have a few minutes please please cometo the mic uh ask away i'd also like toask our maintainers to come up to thestage if any of you are here jerassiTyler um I saw some more folks uh fromthe TC please come on uparmen Jacob again come onup okay and and again if you have anyspecific questions please come up to themic might be easierfirst questionhey guys my question is for Pabloactually uh one of the one of the itemsyou had there on the collectorslide regarding uh exporter queuing andthings like that you're revamping it canyou give a bit more detail about thatbecause that's something I've tweakedconfigured quite a bit um so beinterested to hear more uh sure so Imean the gist of it is that we aremoving the botching on your pipelinesfrom a dedicated processor uh on thepatching processor to each of theexporters uh that has some advantagesfor propagating information back in thepipeline things like errors um which area bit more challenging with uh componentlike the butin processor which is asyncand it also gives you a bit moreflexibility to do batching in a moresophisticated way if you're exporting ina format that is not OTLP uh because youcould say batch uh for a particularmegabyte size uh that you want to totarget um the work is still ongoing umthere are some companies that areexperimenting with this on production uhso uh we still need uh to polish thedetails but um it seems like a promisingdirection and it we're going to to keepon working on it awesome sounds greatthanksuh hello i have a question about the uhhotel arrow um kind of side project ibasically I'm wondering if this is goingto converge towards the standard way ofdoing hotel data transfer or is thismore of an optimization for a specialcaseum there's a project proposal that justwent up in community today so I wouldrecommend you go read that um I thinkit's something that we're going to lookreally hard at towards an evolution butwe also don't want to cause any kind ofsplit or fork in how people use hotelso you know TBD but I would go look atthat project proposalthanks one more yeah hi thank you forthe talk um my question is we in theorganization used um um the K KSattributes processor and tail samplingtogether and uh in the first time itworks but when we realized that the thethroughput of spans through thecollector was much much better when wedid not use both of these things westill currently use the attributesprocessor because it's very valuable butwe get rid of tail sampling and we storeeverythingi think it might be useful to recheckthat because we uh still are in thatstage since I think one and a half yearsdo you think that there are lots ofimprovements in terms of performance inthose areasso I can answer that um so if you have afile an issue or some other sort of umextra information please send it to meoh yeah okay uh this is I I'm sorryyou're doing them in the same collectoror are you doing two separate collectorsyou we have a a cascade of two stages ofcollectors for the load balancing stuffand then the tail sampling decisionafterwardsyeah and and you had a problem withperformance with that or you when youwere combining them into well I think inthe end it was simply that we saw welltail sampling might be a useful thing inthe long run but currently we can storeeverything so we got rid of that and werealize that as a much more higherthroughput when we do like low testingof uh the collector with doing lots ofthings on the systems okay um so I'm notable to give you a specific answer aboutyour problem uh but I will say as partof the marking components as 1.0 noinitiative uh we are going to be lookingharder into performance throughputreliability of the components andestablishing requirements uh so that yepuh this kind of thing is at leastmeasured and you have an idea of whatload each component would support andokay thank you very muchyepso we are looking to performance fortail sampling I think James uh made amade two questions before so he madesome improvements to the tail samplingprocessoruh we are working on few things for tailsampling like caching decisions andworking on improvements on loadbalancing exporter as well so we areaware of performance issues but thething is the load balancing exporter isdoing something and the tail samplingprocessor is doing something so it iscosting cycles right so um if you're ifyou can ingest all of that data do itit's way better thanI was going to say like it's reallycheap to it's really cheap to put thatstuff in like S3 right like if you ifyou really have performanceconsiderations then you should bepushing that off until much later inyour decision pipeline and just likestore everything postprocess yeah likepo post postprocess it rightdon't please don'tsample i'm sorry like I know I know it'scommon wisdom and I know that you you'regoing to be like a vendor but no likeseriously like your devs will love willlove you if you just say like oh what ifwe actually just don't sample and thenwe don't have to worry about all of thisthis stuff joey B itdon't don't sample and tell them Austinsent you this is not an official This isnot an official open telemetry statementthis is this is Austin Parker tellingyou don't sample it it will work outbetter for you in the long runall right thank you everyone and youknow where to find us after this so comeby and say hello at the open telemetryobservatory thanks2025-04-15 22:00:02.316861Sabout whatwhen we talking about I think in K uhthis group conf we have lots ofdiscussion about batch system right soat least in my point that batch systemis trying to help us to manage theworkload right including the workloadlife cycle fing and several things thethe other important thing related tothis topic is that our major task is tomatch the design performance of hardwareright for the hardware we including thecomputing results right GPU DPO weincrementing several things and theother thing is the networking part Ithink this is the today's topic so whatwe have done about this part fornetworking resource we have infiniteband we have rocky and bund wise healthyand also the latency is also importantfor large language modeling and thethird one I think is also storage thingsright we have cache we have local diskand we also have you know distridistributed a storage future for someprocessor data also several importantthings right this is a common things ofbadge system every badge system shouldhandle related stuff for this brandthe another thing is that based on thebatch system define what uh we want tobuild for batch system we have a volcanohere right for volcano this is a how tosay banner or the description of volcanoparts of volcano project so volcano is acf incubating project and it is thefirst batch scheduling system in CNCFit's start from cool batch since uh uh2017 this is the long time ago we buildlots of features and based onthe badge system what we want to buildnow the first one is the computingresource we build lots of feature toenhance GPU and CPU resource usage nowthe first one we we list lots of featurehere such as the dominator resource fairfair share and some common fair sharefor lots of thing we have job level wehave Q level and also name service levelfair sharing to balance the workloadbetween the different things and currentuh and today's topic is about networkingresource right before this topic webuild task topological where schedulingthis is major for the task topologicsuch as the PS worker mode in test andflow part we have performance and whenyou check the history of volcano you canget the performance report about thispart and today topic is the networkingtopological scheduling so we did someenhancement for uh in uh in volcano andwe also have some pipeline for that iwill give a quick introduction and the Pwill give a demo for the what we havedone uh in last year and the third oneis I think is in pipeline for in ourpipeline the right for for example wehave dataware scheduling for example wehave all information about thescheduling so we will try to do thesomething like data preloading to savethe effort or save the improve theperformance of the work the wholeworkloadRight uh the next one is volcano whyvolcano is you know we introduce a lotof feature we have we introduce a lot offeature every every year every month sowhy we can do that in volcano we havetwo level kind of uh plug-in machinerynow the first one is we have theuh in the middle line right this one isaction so if you want to build someaction personally right such as inq suchas the primion disculer you can do theaction yourself right this is also theplug-in point for example for the uh forthe common part for the otherscheduleuler they can you can just use apredefined entry point this is not youcannot modify this one if you want tochange the upstream this take a lot timefor yourself but in volcano you can dothe modified allocation steps yourselfright the third one the third one thelast one is the plug-in this is a commonpart we build lots of uh plug-in herefor example we keep the backwardcompatibility in the uh in the upstreamand we import all the things from theKubernetes default and we also build ourown differentation uh scheduling partsuch as list in the previous uh slidesfor example DRF and preemption severalthings on the in the cache side we didseveral enhancement to catch the datafrom the pi server because inuler wehave lost of data change uh statuschange in about the about the job aboutthe code so we want we try to avoid youknow try to minimTize the uh uh the APIinteraction with API server because youknow during our performance test wefound that some bot is from API serverwe lots of schedule put you know send alots of requests to API server part andon the right side we have a you know acan of code right this is the examplethe the uh the green box is that we justafter you finish the interface right youjust need to import this u API uh importthis package into the man workload andthen you can enable your customizerplugin yourself so you can build youryou can build your scheduleuler yourselfthis is a simple no change to theupstream everything is open so this iswhy we followed everything from thevolcano to do thatright uh yeah this is the volcano partright we have the basic framework wehave cache we have build several thingsnow when we talk with P about uh uh youknow this case about large uh uh RMtraining so we found something that nowthe f uh some challenges that now thefirst one is you know GPU GPU schedulingpart right we need for the uh for thelive language modeling we training weneed lots of uh GPU and this featurealready support and the second one is wealso need to networking enhancement thesecond one is bond wise low latency andum uh related network things for examplein the future We also want to uhintroduce a healthy part um when wefound this thing and back to volcano wefound that you know we already list thefeature here we found that the computingresources are already there gunscheduling and several basic uh severalexisting feature already supported thenext one is we try to you know what wecan do is try to uh support networkingpart you know green um highlight inright so we are going to have anetworkingpart okay for the for the network uhnetwork uh networking aware schedulingso currently we have in a data center wehave several networking part right forin a rank we usually use the MV link todo the uh interaction and when we crossthe rank we usually have several layerof switch and then we usually use uminfinite band or rocky to to share thedata share the data across the run so weare going to have several things forthis one but uh you know in the upstreamand most of the cases that we defineeverything as a kind of the static rightwith just use a label or just you knowput uh configuration fail that this isthe the simple one rightuh in in in uh in volcano we try todefine a s right to define all thethings on the left side we can see uh onthe left side is I think the networkscanning is the the idea is thestraightforward at least for the staticpart for the topological part because wewant try to you know minimize the partsof data trans transaction between thedifferent node this is this is a simpleone we try to avoid the data you knowdata path try to avoid didn't you knowwe didn't uh went through the switch ifif this is not not necessary or notrequired so we define the interface onthe left side we defined the CRD we callit hyper node right in hyper node wehave layer we uh sorry we had a tear forexample tier one tier two this is adifferent layer of uh of switch and wealso have a different the uh uhdifferent selector we have we have exactexact match so we select a single nodefor this And for job it is simple rightin volcano job we we can use this toplogic to make it you know uh soft uhsoft match or required match for thosethingsright uh I think currently we define theCRD we support the basic uh interfacefor uh for the topological scheduling sowe also try to you know unify the API ofuh M link IB and and Rocky because wewant to uni uh manage all the interfacefor all the networking for this partnerfor example partition fabric severalthings and then we try to do the autodiscovery about the service right we uhinvida we just released some feature DPFrecently it's including the uh networkacceleration feature and also the snapis the storage things and we also do thediscovery this is in the pipeline we aregoing to show and the on the theuh on the bottom we always we show the Cenhancement i think in the future we aregoing to do everything about the uh forexample the healthy Uwe are going to showsomething that the the network is readyor not or when it's ready how many timesis it's a kind of failed so this if thisare lots of time lots of condition isyou know not ready that mean this nodeis not stable so we also consider allthe things from our side um yeah I thinkthis is a major feature from the volcanopart i think this is a simpleintroduction of p will give a a demo forthis partthank you uh d uh gloss forum introducing me um good afternooneveryone uh I know you have a long dayhere and this is not uh dinner time andI promise I will not take too long anduh this will this section will endbefore your Uber ease arrive okay allright uh so yeah um the here um we'reI'm we're a cloud uh builder that webuilt GPU cloud and then um we want tohave um uh network topology aware in ourscheduleuler and and that's why we havebeen um watching the evolvement the uhevolution of the um network topologyawareness inuler and we see that uh atearly stage in 2019 there is alreadystandard topology uh labels introducedinto Kubernetes uh including the regionand zone and then you can also customizeyour own labels as well and then in 2020there's something called pod topologyspread feature that's what released andthen uh recently I think in January ofthis year um volcano topology awarescheduling now it's a feature that's inpreview and then we cannot wait we justjump on it and try it out so that's Whyuh we're here um let'ssee does itwork um why do we choose uh Mochoulerfirst of all because our GPU cloud isbuilt for AI training so GAN schedulingis a must-h have feature um and and atlike two years ago there are not muchschedule who supports this uh feature sothat's a key uh feature that we wantthat's why we choose moano and then ofcourse because that's GPU cloud so wehave to support GPU CPU and memory andeverything so um that's why this uhheterogeneous device scheduling is alsoimportant for us um uh the test topologysupport is another thing that we like uhwe really like because for AI trainingjobs uh often we want to run like apre-run to make sure all the GPU areworking before we actually launch theactual um AI training job we don't wantto have any like um slow GPU orproblematic GPU that's in the um in theworker nodes that before we launch thejob so we want to have the uh test setup and then after after that the GPU uhrunning we may have some cleanup job aswell so the task topology setup uhsupport is really helping here umfinally the uh the flexible resourcesharing and preeemption and reclaimthose are like when we share the clusterbetween different groups or departmentand um teams that's very helpful featureas wellum uh this is our uh planned buildup forour uh uh data center i don't want tospend too much time over here this is aphysical view uh and that's toocomplicated so I'll show you a simplysimpler view which is a logical view sohere um when we talk about the logicalum uh uh uh network topology with frombottom to to up we have the the b at thebottom we have the um tier one which isthe rack u now we're simil we'rebasically uh that's we plan to do GB200but now because GB300 is released so weplan to uh actually install the GP 300MVL72 as our uh tier one um rack so soeach of the um the oval here uh at thebottom those are a rack of uh GB300MVL72 which contains uh 18 nodes andeach nodes has four GPU and two um CPUover there and um moving upthe if you see the the the white uh ovalthat's each of them there are two ofthem uh on the second on in the middleuh each of them represent a tier two uhhyper node from and then on the on topwhich tier three hyper node so yeah sothat's basically the logicalrepresentation of our our networkarchitecture all right sohere it's not moving come on okay um umthe software version are listed here soif you want to reproduce our product umso because our data center is still uhum is still building we don't have aphysical I don't have a physical uhaccess to that yet so now what my whatI'm demoing today will be uh simulatedversion so that's why you can see what Iuse kind and quark those are justsimulate aVll those nodes in hereum okay so now let's go to the actualdemo um first of all I want to show youwhere am I in okay first of all I wantto show you that we uh I simulated thiswork uh this this network of um how manynodes we have uh100 18 nodes and you can see uh Ibasically marked uh I named them in thispattern so that you can see they're fromwhich region which zone and then whichDC and row and rack so at the bottom youcan see uh we uh we have uh we have inthis simulation we only have like oneregion one zone and one DC but we dohave two rows and each row have threeracks and each rack have uh 18 nodethat's basically simulate uh MVL uh MVL72 rack andthen let's moveto to the pods and see what pods arerunning thereright so now there's no pause runningthat's perfect um so um in order to showyou like uh the topology awarescheduling so um I will run some launchsome job on there um soum I will run some job there where is mycommand here first let me set up my cubeconfig and over there Iwill okay here this is the command I'mgoing to run to run actually three jobsover there um each of them is simulateuh LM training job um so we'll have Ltraining one two and three uh each ofthem will have will require uh 18 18nodes basically uh uh 72 GPUs and thenum so basically that that require onerack so what I want to show you is thatwhen you see those jobs get getsscheduled uh each of the job should uhall reside in one rack instead ofinstead of spreading over multiple racksso let me try run them all togetherum ohwait oh that's delete sorry my bad ineed toapply all right so now our pods arecoming up and you can see now they'rerunning and they're scheduled andrunning and if you look um if you lookat all the training one jobs[Music]uhtraining I cannot type here all right soall right so if you look at all thistraining one jobs you can see if youlook at their um can you see my mouseyeah if you look at their um racks theyshould be on on the same row same rackso in this case is row one uh rack 02and that's decided by the kind ofschedule not me sorry why I'm typingthis right so um if you look at thisshould be consistent all everybodyshould be on okay if you go to the otherone let's say trion two they should bego to all sit on the same um rack aswell which in this case is row two racktwo this is live so I don't know whatthat is so I have to look at it allright andthen if you look at the third onetrainingum trainingthree and in this case it happened to bescheduled on the row two of rack 03 isthatright yeah that's right all right sothey should be consistent if they're notthen that's that's a problem right sothat's how we can make sure that all thepods they can communicate with really ahigh bandwidth andum um yeah using all the bandwidththat's available so this this is ourfirst demo that's uh basically uh forthe uh rack level um now let me deleteall of those and then I will try twomoreum all right let's do let's in insteadlet's do apply for the four to fivewhich each of those uh those are largerjobs um which each of themum each of them requires 54 nodesinstead of uh instead of the um 18 sothat requires actually three racks ifyou remember in our logical view thethree racks um there are two rows eachrow contains three racks so we actuallyexpect that each job will um be uhcontained within one row instead ofspreading over tworows what is thatsee that what is that do you want toenable no Idon't all right so now uh they seem tobe running and then I have let's seetrainingfour okay and then if look at them theyshould this one happen to be running onrow one so you can see uh 54 of them allrunning in row one right and then if yousay225 and then you can see and again it's54 of them they should be on the row twoso basically that demonstrates that ourtopology aware scheduling is actuallyworking all right um let's go back todemo so now um this is done so I willshow you um as a summary that we firstwe showed our cluster topology and thenwe submitted our job and then uh we cansee that scheduling uh work as expectedin our real um live demo here um how dowe enable network topology wherescheduling uh we have to first of coursewe have to install the correct versionof kano um uh right now the previewfeature is in v 1.01 01 it's preview andthen we have to define all the hypernodes uh so that uh because those arethe CRDs that introduced in this in thisversion so we have to uh right now wehave to manually create those withoutbefore network topology uh networkdiscovery is available and then ofcourse we have to ask our jobs to usethe correct network topologyum so when you install a volcano makesure to have this correct correctversion over here uh using basicallythis this tag uh v1.01 01 networktopology preview um and this is exampleof our CRD um basically this is auh this is a rack level CRD which theyyou have to say oh I have this is tierone CRD and the type uh is each of themembers is a type uh is a node and thenode has to m match this regularexpression um of this that's um that'sjust how we organize in this demo uhyours don't have to be the same as mineand then uh this is a a portion of thesample job yaml and the interesting partis that you have this uh networktopology in this spec uh basically thisuh v uh volcano job specuh yamo file and then uh this mode meansuh there are two options for the modehard or soft so in the hard modebasically if you uh if your uh jobcannot be scheduled with uh requirementthen it will just be pending instead ofgoing forward if you have a software uhsoft mode then it would say and you sayoh my job want to be confined in um intier one and if tier one does not haveenough GPU or other resources it willactually will try to escalate and go totier two to look for more allow more uhresources to be used so in this hardmode then there's just no such optionyeah if you say tier one I have to lookuh uh all the resources have to beinside the same tier one and and most ofthe time that's what you want uh just touh to make sure that the the latency andbandwidth are consistent uh between allthe all the pods and then oh the yeahand the only this highest tier allowedthis is only usable for uh when you'reuh using hard mode if you're using softmode then this doesn't really matter youtry from the bottom tier and then go allthe way to to the top tierum there are still challenges andlimitations as of right now becauseright now the uh network topologydiscovery uh is not there yet so we umuh to as a scheduleuler uh volcano doesnot include inband network topologydiscovery um so we uh will need to thethe cloud providers to actually supportthe uh network discovery so that so thatthe hyper node hierarchy can be createdum for the customer without um customeruh because customer does not even havethe knowledge of the underlying networktopology so so it had actually it relieson um uh cloud providersupport um of course the accuracy of thetopology information that's when you uhdon't have that support and you manuallycreated this hyper nodes and uh then thetimeline is is is also crucial thedynamic uh cluster environment is thatbecause we know that GPU nodes fail andwhen nodes fail we do remove nodes fromthe clusters and we add new nodes in inthat case uh if not if that happen toooften then basically you have achallenge of keeping your topology uh uhaccurate all the time and of coursefinally the this feature is still inearly preview and it we hope it willmature soon and then we can use that inproduction um the future directions ohuh I'm am I out oftime yeah okay uh just very briefly umour future plan is that we'll um uh ourcompany will add our own support forthis feature uh from inside our companyand then uh the we're also hope thevolcano can can evolve very fast so thatwe can mature and then we can use thatprodu in production and then finally theschedule performance tuning andvalidation is also needed when we wantto schedule a large number of uh jobs orpods in in a short period of time um andhere are our contact information um thisis how you can findus and thank you so your support iappreciate that you stay all the timeall the way to at this time with usthank you so much2025-04-15 22:00:03.063277 { ]{��A�G#��9AsLFmnCyZ89Mthank you very much for being here uh Iam Paulo Paderno i am a softwareengineer working on Redat on themessaging and data streaming team somostly working on Apachi Kafka and evenon stream that this is what we are goingto talk about which is a CNCF incubatingproject um and it's about running KafkaKubernetes i'm one of the coremaintainers and with me there is Tina hieveryone uh my name is Kantasa but Iusually go by Tina so I'm a softwareengineer at Red Hat as well and I alsocontribute to StreamZ and Kafka aswell all right um so agenda for thissession we will start with introducingum Kafka and StreamZy so don't worry ifyou don't know too much about them andwe'll also talk about some of the newfeatures that came to streams uh such ascraft tier storage and auto rebalancingand we will also talk about some of theexciting upcomingfeatures so what is Apache Kafka it's aleading distributed event streamingplatformum it scales horizontally and has a it'shighly available in fault tolerant andit has wide variety of use cases it cancapture data in real time um fromvarious sources uh data sources likedatabase event driven applications orother cloud services and store thatdurably to be uh consumed by othersystems to uh manipulate or um react tothem in realtime so it was originally created byLinkedIn and open sourced under Apacheuh software foundation it's licensedunder Apache license20 so what is StreamZ um so it's also umopen source project it's CNC includedproject as Paulo already mentioned it'salso licensed under Apache license20 um so Kafka sounds great in terms ofwhat it can offer but it isoperationally very complex um that's whyuh running CFKA on Kubernetes becomevery popular it was probably the mostpopular way to runc��!�F#��yA_3fpZA-DqDUuh my name is Jonah Cowell i'm one ofthe maintainers of the Jerger projectand I'm here to walk you through alittle bit about Joerger is how many ofyou don't know what Joergeris all right that's pretty good so Iguess maybe I Well there's a couplepeople i'm gonna probably go over somethings that are familiar to many of youand then I'm going to drill into some ofthe really exciting stuff that wereleased just before the last CubeCon inthe US which is Jerger version two it'sa very big multi-year effort then I'mgoing to talk about the reason why wedid it the benefit to all of you and uhand then I'm going to talk about some ofthe other things that we're working onand the roa^��9�E#��)AM56SHzETAmMcloud events and X registry This is whatwe're going to talk about today Uh so myname is Man Lotle I'm a product ownerfor cloud native platforms at HDI GlobalThat's an insurance company in northernGermany And I want to talk about cloudevents registry So maintainer trackalways a little tricky one because youhave a few people that have not heard ofthese projects ever before and want toget to know them as well as someveterans that know exactly what's goingon and basicallyX��V�D#��cAJXcQcofGzrAthis is Claus and this is Pong so wewant to give introduction about our workuh to optimize optimize the trainingperformance for uh large language modeluh increment we did lots of work forthis in the in the last year so we wantto give introduction about this part uhabout uh about ourself who we are hereuh this is Claus from Nvidia i am thefounder uh co-founder of Volcanocurrently I joined the Avida for aboutthree years focus on the network partyou know uh for this this one yeahI'm Pong i'm from a technical startup uhwe're still in sales mode so that's whyI'm not showing my company name on thereokay great yeah so we are talking aboutuh um L one uh performance enhancementbut before that uh I wantto some kind of discussion RY need the likenitty-gritty details of what changed SoI will try to make both of you happy Umwe will have an intro and explanation oncloud events as well as XX registry Xregistry a more deeper explanation sincethis is the newer project fairly unknownprobably and I will try to make thismore understandable with a demoSo we're going to use a to-do listexample and then I'm going to use thelast five minutes to explain why an enduser company might be interested Andfinally I will share the project roadmap with you which a few people havebeen asking last year already and wehave one to give in this talk Startingwith cloud events I will assume most ofyou have heard of cloud events alreadyIt started in 2018 is graduated sincelast year shortly before CubeCon I thinkand it defines common metadata forevents So there are many diff differentmessaging and eventing protocols outthere and even more event brokers andthey have quite a few things in commonbut often differ in detail and it's alittle bit like if you were to send aletter to a friend but depending onwhich postal service you use you have towrite different things on the envelopeSo this is where cloud events comes inIt is defining these commonmetadata and it defines protocol andbroker independent headers So what weare not do is we are not telling you howto write your letter This is about whatyou write on the envelope So it's notabout a common event format but aboutcommon eventdata And we have nine SDKs to do thatSix protocol bindings and this is whatit actually looks like So I just toldyou we're not telling you how to writeyour letter This is the envelope This isan AMQP protocol binding We have a fewmore uh six in total We have Kafka aswell We have HTTPMQTT And you can see all the common metadata up there And this is your letterThis is the one that you would bewriting Now the nice thing when usingthis binary mode is that you do not haveto do a breaking change So you don'thave to break your schema contact withyour existing consumers You can startusing cloud events without actuallybreaking that If you want don't want todo that or if you cannot do that you canalso use the structured modes That's thesecond option we we allow where youdon't use the protocol headers butinstead you actually do use the payloadand the very top level uses theseattributes and puts them on puts them inthere and then your letter your payloadlands in the datasection So what I just showed you hasexisted for a few years now and is verystable Um I mean it's a graduatedproject That's kind of what you expectfrom it And what is still changing andwhere is there still a lot of evolvementis the extensions for this project andone would be CESQL cloud eventsstructured query language and that is1.0 zero know So what you can now do isyou can query for the event metadata andbasically filter down on your events andK native is already using that They havea filter expression in place So you canput the SQL CSQL expression in there andthen only the events that where thisexpression evaluates to true will end upin your service or end up at yourconsumer We also have four new extensionthat have been released in the last yearThe first one being BAM that stands forbusiness activity monitoring Um I had tolook that up as well and it monitors theprogress of a business process anddefines headers needed for that We alsohave data classification that might beinteresting for everyone from Europewhere we have the data classificationone that talks about the confidentialityof the data as well as the dataregulation that could be GDPR and thedata category which could be sensitiveor notsensitive And the last one or not thelast one one before uh but interestingfor all continents is deprecation Uh itanswers four questions Is your eventtype deprecated uh since when is itdeprecated how long will it still besupported and what can I do to migratefrom this event type to the one thatcomes after that and now the last oneopcua unknown to a few as well I guessuh that is an industrial protocol and itthe extension itself pro provides amapping from these protocoZl to cloudevents headers and adds a few as wellSo that is what happened in the cloudevents spec We are now moving over to Xregistry and when cloud events managedthe interoperability of the metadata forevents uh it still leaves many questionsunanswered and for that we're going backto this metaphor of letters andenvelopes because what would be nice toknow in our event- driven ecosystem iswhat event types what letters whatenvelopes actually exist and since Ijust talked about the metadata that canbe on there It would be nice to knowokay what metadata what event type haswhat kind of metadata uses which kind ofextensions Also it's interesting whatthe actual letter inside looks like Soknowing what the payload is is somethingyou you would like to know And then thelast thing that's where I put the niceDHL post box in there Um where are thepost boxes so where can I put my letterswhere can I send them where can Ireceive them from basically indistributed systems where are theendpoints that I have deployed to sendand receive messagesso these questions came up in the CNCFserverless working group that hascreated cloud events first and thenmoved over to um answer these questionsand we came up with something that wecalled cloud events discovery and it hasthese three three entities here So firstis the message definitions Those are ourenvelopes the event types Um they have adefinition and they can be versionedbecause events change over time And theyare related to the schemas that is ouractual letter the the payload of anevent And we have schema groups We havethe schemas itself and the versions ofthose schemas because they as well doevolve And the last thing we definedwould be endpoints where you couldregister your post box boxes basicallySo we created that and we were kind ofhappy because this is like the mostimportant thing you need and thenrealized well they all follow the samepattern and we do not want three specsbut we would like one core spec thattakes care of all this stuff and wewould like to have one interoperabilityinteroperable APIand for registries in general that wouldbe a nice thing It would be nice if wewould have one core spec that could beextended to these different entities andthat's when we decided to not call itcloud events discovery but call it Xregistry So X registry itself um is Xregistry standing for extensibleregistry Um XR registry itself and itscore specification does not haveanything to do with event- drivenarchitecture eventing or anything It isjust an abstract concept to managemetadata about resources Any metadataabout any resources and it has this veryeasy model of groups that have resourcesand those resources can be versioned AndI will show you later in the demo whatthat could be because this is kind ofabstract Uh so the endpoints messagesand schemas that I just showed you arejust implementations of that core specSo if I just slide that model from theslide before in you can see that theschema groups are on the groups levelthe schemas are on the resource leveland the schema versions are on theversionlevel So what we have actually createdor what we aim to create is a vendoragnostic registry specification that canbe adopted by any other vendor as wellwithout breaking their business model Sothe main value proposition of thisproject is this core specificationbecause it provides theinteroperability and lets you definebasically any resource managing any reum any registry managing anyresource The endpoints messages andschemas are then just implementations tothat spec What we also have is a serverimplementation a referenceimplementation We will take a look atthat as well It's written by Duck Davisand is part of the XR registry group nowas well And we will have a CLI as wellthat's written by Clemens Vastas It'snot in the group yet in the in theGitHub organization but will be soon Sodepending on your persona you mightvalue this core specification You mightvalue these endpoints messages theschemas implementations of that corespecification to get started Um or youmight want to have the CLI or theserver Going from this[ like abstractlayer down to a very concrete example Uhwe have a to-do list service by anexamplecorporation And on that to-do list wehave two tasks already You want toattend CubeCon 2025 you are here That isa success You want to attend the XRregistry talk We have done that as wellNow we want to add a new task and thatis drink beer at Cube Roll I learnedit's called cube crawl We'll name itcube crawl H you can do that in a fewminutes time But just creating that taskcreated an event in the background aswell So we have a cloud event It's thesame cloud event I showed you beforewhere you have your metadata up top theenvelope and you have your letter uh inthe bottom where you can see the labelis string be and it's not completed yetNow let's imagine let's let's jump twohours in the future we will kick offthat task And now you can see I can jumpback and forth We have a new event It'stask completed And it says completedFinally this is my very very complexapplication And now we want to take alook at the even more complexarchitecture We have a to-do listservice on the right on the left Uh thenan event broker where these eventsevents are emitted to and a consumerthat could consume that And what we haveup top is a registry that isimplementing the X registryspecification So this could be DXreference implementation or any otherreg registry that decides to implementthat specification and comply with itNow the very first thing the to-doistservice would do is it would createthese um entities So the envelope uh thethe the schema the message and theendpoint and then it can just providethese cloud events send it to theendpoint in this case it's nuts doesn'treally matter will be nuts in my demo aswell so that's why I chose that and thenany consumer could go to the X registryor a catalog that is built up on uponthat and get thisinformation and then consume the cloudevents or whatever events basedbasically from theendpoint I do want to show that to younow So I will end the presentation shortand startwith VS Code where we have a registryYou can see that right is it big enoughy okay uh where we have these three mainthings So what we see here is just aJSON file and in its very simplest forman X registry can be a JSON file So wehave endpoints up here and I will uhunfold that and you can see it has thename example.com nuts reversed withreverse DNS notation and in this caseit's a producer Uh the envelope is cloudevents 1.0 because you could registerother envelopes here and it will use theprotocol of nuts Yeah for your protocolYou then have protocol options that youcan define In this case it would be aURI for nuts to connect to and you canmake some statements about the subjectuh the subject structure for for nutsYou could use HTTP here and things wouldlook a little different but this is howyou would define an endpoint a nutsendpoint We then have message groupswhere the first thing we have is ourgroup And in this case I grouped it bytheservice.ample.com has a to-do list Andinside this group you have many messagesFor example task.created And if I foldthat you will seetask.completed Here you will have astatement about the envelope again andthe data schema format as well as thedata schema URI So this is we aretalking about the envelope This isreferencing the schema So the letter inthis case would be JSON schema and youcan see there's a reference to schemagroups which we will look at in a secondNow the most most important thing isthat most of the information actuallylives inside the version because this isall stuff that gets versioned and canchange over time So we will have theenvelope and the envelope metadata Againthis is where it gets kind ofinteresting because you can narrow downon the cloud events uh attributes andsay for example that this kind of eventhas a spec version of 1 1.0 for cloudevents And you could even set the typeThis is just like a concrete value Thiswill not change So you know what type toexpect as well as the value for thesource to expect You then have the datacontent type data schema format and dataschema yuri again in this case poi\ntingnot only to the major version but thevery exact version Let's now look atthat and close this and open schemagroups where you have the same structureagain because you do have a group againit's called com.example.todo to-do listbasically the same and now you haveschemas that can be versioned again andin this you have the schema itself so Isaid earlier it's JSON schema so underhere you just have a JSON schema objectthat makes um a statement about thatthere is an ID a title and a completedfield is what an Rex registry can looklike in its simplest form now what Iwant to do now is I will copy this curlrequest I have uphere and we'll paste that in here and wewill put this into the REST API versionof this exact registry So what we arelooking at now is the UI that Duckcreated for his server implementation Iwill plug that right here XRregistry/server You can also look at XRregistry slspecAnd for that server we now have all theinformation that was in my JSON fileinto the registry already Now you cansee we have endpoints we have messagegroups we have schema groups And wecould take a look at the endpoints Andyou can seecom.example.nuts And you will see a fewadditional attributes Uh you now have aself URL you have an X ID And X ID isbasically the last part of the URL of ofthe ofthe REST API you will have synced likean epoch created at modified ad and youwill have all these URLs that allow youto jump to different different placesLet's go back here and go into themessage groups where you can then alsosee if looking at messages you can seethe created one And what you see now ifyou compare this tothe file version is that you have manymore attributes in here than there werein the file This is because the defaultversion is per default integrated intothis very message So by default youalways get the most up-to-date versionwhen you query the resource What you cando though is you can take a look at allthe versions there are I was lazyThere's only one version So uh in thiscase this is the default version Ifthere were two let's say I have1.0.1 then this one would not be defaultanymore and the1.0.1 would be part of the resourceitself So let me check if I forgot toexplain somethingOkay what I want now want to do is Iwant to show you the actual model thatpowers this registry I told you earlierthat you can use any you can describeany metadata So there's actually a modelkind of a meta model that says how thisregistry is supposed to look and it saysyou need endpoints you need messagegroup you need schema groups You canlook in here and see okay a group canhave resources and those are schemas andthose can be versioned I told youearlier Now to prove that you can doanything with it uh we have a few otherregistries in here And when I look atthese models you can see now we havegroups in this APIs guru that Duck hasbuilt where you have API providers andinside API providers you have APIs andthose will have different attributes Youcan look at these as well And we haveone for doc store as well where you whenyou look at this model and you look atthe groups and the documents you can seewell you look at the groups and thegroups are called uh there's a groupcalled documents and it has resourcesinside them that is called formats andthose have different attributes as wellSo you can create your own model hereand basically build I said it threetimes already uh any any resource any umanage any metadata about anyresource Now what you can also do is youcan take that stuff from the registryand put it back into a file To to dothat let's go to the root first and wewill do something that is kind of basicfor REST APIs We will inline everythingwe have and then we will use this fancyuh unfold button And you can see what wehave now kind of looks like the um JSONfile that I showed you in the verybeginning The difference being that younow have stuff like self x ID epoch inthere that is added by the server bydefault And what you can then do you canadd the doc view to actually make this adocument And you will see that a few ofthose actually get removed again So inthis case i]f we look at not endpointsbut message groups you can see that youhave less information on the messagelevel because that would just beredundant and not update nicely And wehave even have a shortand for that thatis called ex export where you can justtake that file and um and use it locallyagain So jumping back to that demo thisis something that is kind of importantto us which is the symmetry betweenthese um representations So you canstart out with a single file and then goto resty API and go back if you want toNow there's a third representation thatwe also enable you to use which is astatic file server where you could justmimic the structure of the rest API umby splitting your registry up in severalsingle files put it put them in anystatic file server you like and thenthey will be rep will be usable as areadonly registry of course by by yourcustomers There might be use case ifyou're like a platform engineering teamand your people get a GitHub repositorywhere can they they can just definetheir schemas and you take these schemashave a pipeline and compose that up to astatic file server What you can now alsodo is you can use the cloud events dataschema field to reference to the Xregistry if you used that in the in thepast Uh those go nicely together Butwhat I want to focus for the lastminutes is the end user view So why isthis interesting for an end user companywhy is this interesting for you in caseyou are one so looking at HDI um we haveabout 300 developers working in agilesoftware development teams and all ofthem have to build maintain and manageinterfaces So what we decided to to dois we decided to create a platform forthat And as a platform team we areproviding them with infrastructuretooling guide rails and enablement Andthis platform we call it very creativelyglobal integration platform and it hasthree components to it which is an APImanagement a message bus and an eventbroker So if we are looking at that wehave our customer he's on the left andthese customers these developers areusually developing business services andbusiness services usually deal withbusiness objects and business events andthey then set them free via interfacesThose are the nice plugs I put in hereAnd if we are looking at the globalintegration platform having the APImanagement the message bus and the eventbroker um we offer them the theyactually live there This is where theinterfaces are living and we offer themdifferent ways to describe theseinterfaces If we're looking at open APIthis is how we do it in API managementSo you're describing your interface butat the same time by describing yourinterface you're also describing thebusiness object that you're pushing overthere For the message bus we areoffering them to build a JSON schema towrite a JSON schema that describes themessage that has some information aboutthe message in it but at the same timethere's a lot of information abouteither the business object or thebusiness event that goes over it And forthe event broker there actually is aregistry in there that can do every andJSON schema But again we are repeatingschemas So the if a business service isusing all of these products theprobability that is he is repeatingschemas is quite high And even worsethere might be the probability thatthese business events and businessobjects actually look quitly distinctfrom each other So they might have adifferent version cycle differentversioning different version numbers andthat just makes it a little morecumbersome to use So what we would liketo do is we would like to have onesingle place in the center This issomething we want to change We wouldlike to put all these definitions in oneplace and connect them with each withwith each other And this is what Xregistry can actually do We can put allthe JSON schema in there all the AVO inthere and then to just define theinterfaces using open API or using asyncAPI in these separate products becausethis is what actually is distinct whenusing these products and referenceeverything that is um has to do with abusiness object or has to do with abusiness event to Xregistry So basically what we do want todo is we want want to store the metadataindependent of the integration productand we want to avoid the repetition ofschemas and we want to harmonize theschemas across integration integrationproducts but even further we might wantto do that uh looking at data productsfor in our BI platform that use that useuh similar schemas as well and we alsowant to connect this metadata they nowlive completely separate from each otherand this is something we can change byusing this extensibleconcept So looking at the outlook um theroad map what we have done already lastin a in April uh 2023 almost exactly twoyears ago is when we created the specripple that I showed you this is whereit moved out of the cloud eventsorganization into its own organizationWe set up this initial repository We hadsome governance and migrated all thatstuff And what we did last month is wefiled for a sandbox application for CNCFthat has the core spec in it It has theendpoints messages and schema spec in itas well as the referenceimplementation What we did justyesterday is we had a release candidateone that you can all look at Uh this hasall the relevant course back in it It isready for impro uh for implementationtesting ready for reviewing and what wewant to do in two months is have RC2 andthen another two months to give everyonethe time to review is the version ofrelease1.0 So taking away from this talk Ithink it's very important for cloudevents check out CESQL check out thefour new extension that are out thereespecially uh deprecation I think wouldbe interesting for everyone and for Xregistry uh you can review this RC1right now the QR code is on the right umyou could use the core spec to buildyour own registry or you could use themessages schema and endpointimplementation specs to uh describe yourevent- driven architecture and what youcan also do is you can run the serverlocally the impro uh the referenceimplementation to uh play around with aregistry like that inaction So thank you very much[Applause]There is a question over there umfairly question about events Yes Um whatwould you say is the primary use caseum for cloud events uh and secondly ifyou stacked ranked a list of use casesin your head where would tracing comestack ranked uh so if you had an orderedlist of use cases for cloud events wherein that list would tracing fall tracingis an extension in cloud events alreadySo what uh what is already available forI don't know how many years is basicallyit's I think it's called trace state andtrace parent Those are the twoextensions So you can just put yourtrace uh in there and have it umpropagated from one application toanother Um primary use case is probablyharmonizing that metadata uh what we seeat HDI is that people use either thesame header with different meaning or uhthey use a different header but theyhave the same meaning So this is reallywhat we try to get rid of to have thislike common understanding of what is anevent type what do I put in thereum thank you very much for the work onthis project My question is how do I getmy open API spec or async API spec intothe X registry server as of today umwhat you can do already is you can putin for the schemas you can put anyschema in there So you can put averageJS as schema protobuff you could do openAPI async API as well So I can write itplain to the server and then ittranslate it into the structureI didn't understand sorry if I have anopen API yl or async API yl file How doI get thatwe do not care what document you put inthere as a schema So you can so what wehave is the meta data for that schemawhich is in the form I showed you andthen you can just append any kind offile you want that could be an open APIammo OkayDid I answer the question kindof you can kind oflater All right If that's it you canrate this talk if you want to or you cango and takeick off the last item of theto-do list[Applause]2025-04-15 22:00:03.930097_d map and kind of where theproject's goingso with that uh my name is Jonah my dayjob is that I run product and design ata company called Passler we doinfrastructure monitoring you may haveheard ofPRTG that's our product and then my notday job I work on Open Search i'm partof the technical steering committee andI'm also a maintainer of Jerger sothat's my volunteer job and when I'm notworking that's where I'd rather beunderwater i spend a lot of time divingi live in South Florida and dive allover the world so that's what I likedoing that's my passion uh these arepictures from the last year but I'malways taking new pictures and stuff soum so let's talk a little bit about whywe all need and love distributed tracingi'm going to get into a couple ofdifferent demos and I'm going to showyou what Jäger does in the basic demoand then we're going to dig into what itdoes with monitoring capabilities soderiving metrics fromtraces and then uh we'll dig into JergerV2 the architecture some of the changesand kind of why we did it uh so and thenwe'll have time for questions i don'treally have a clock but I'll keep an eyeso uh distributed tracing is reallyimportant because we all buildmicroservices architectures we have lotsof different teams and everyone ispointing at each other when there's aproblem so the issue that we try tosolve with tracingis you could say whose fault is it butwhere did it break why did it break andmost importantly who can fix it so thatour customer is back up and running andtracing gives you that unique ability toanswer those questions and to get reallydeep into what's happening in your codeso that's why it's important we can alsodo other cool things like builddependency maps i know our dependencymaps a little ugly but you'll know inthe road map that's going to changewe're doing a lot of work right now ondependency maps and visualizationand then of course we can do root causeanalysis so tell me about that customertransaction what was the issue and thenthe other benefit is that we can derivemetrics and monitor SLAs's understandavailability error rates things likethat uh so the basic tools that we useare instrumentation how we get the dataout of your component softwareapplication how we collect that andstore it um and then how we visualize itand allow you to analyze it so those arekind of the main uh basic things oftracing and datacollection so for those of you thatdon't know considering you all knowJagger this is probably a littleredundant uh a trace is the endto-endtransactionum and that can have multiple spanswhich are basically components that arebeing called within it inside each spanthere are tags and those tags allow youto either understand instrumentationthat injected the tags or to you for youto add your own tags i was just talkingto someone earlier at the Jagger booththat said we have a bunch of differentteams that are generating traces how doI tell them apart and I said "Well youcould just have the team put a customtag in there that says the team name andthen you could filter and do all kindsof fun things with the traces that arerelated to the team so tags are reallypowerful and they allow you to do a lotof extensions on the data model that'spart oftracing um and then you can kind ofvisualize these you know this is a lotof work in ASI but um the right handside is more of our familiar timelinewhere time goes across and then we seethe spans cascading down but anotherview is more the topology view where youcan kind of see the tree and how thingsmove um and then there's lots ofmetadata attached to each of the spansso these are kind of the basics let'sjump to a live demo my favorite thing atleast it's local uh so we have anapplication you can install on your owncalled Hot Rod and Hot Rod is supposedto emulate Uber and Uber created Jergerso this is like a mini version of Uberwhere we can basically dispatch cars soI'm calling this you know a call tothese different places and all of theseare generating a bunch of differenttraces that are hitting a bunch ofmicroservices and then based on thisresult I can look a`t the front end sothis is basically the top of myapplication over the last hour and thenI can pick one of these so the this iskind of the timeline of all the traces ican do all kinds of filtering by time bythe duration um and this tag thing isreally useful because I could just sayonly show me errors only show me tagsfrom that particular you know componentin your application so there's lots ofways to use thesearch um then when you pull up a tracethe default view is that timeline viewwhere here over time we can see thecascading kind of spans that areoccurringthe interesting things to note uhrelatively new feature is what we callcritical path so this black line meansthat is the part of the of the entiretrace where there's contention going onso if someone says to you make thisfaster if you speed up things that arenot on the black line it's not going tobe any faster so if I were to go in andtry to optimize this you know particularcall or component it's not going to makea difference i need to focus on thecritical path because those are kind ofthe longest parts of the trace it's areally cool feature not a lot of othertools actually do that um and so you cankind of see the black lines followinghere so there's a bunch of differentthings you can do here i can see thatthis database query is very slow i candrill into it and once again now I seethe tags this is a MySQL database calland I can actually see that it's a Goapplication it's using open telemetryall of the other data and then moreimportantly from this databasetransaction I can actually see some logsthat are being gathered by theinstrumentation and this is showing thatthere was a lock the lock releasedtransaction continued so this kind ofshows you how you would do debugging inthis case with a database problem butthe other thing I can see here is thatthere's a whole bunch of different callsto Reddus over and over and over and theworst thing is when you see thisstaircase pattern that means thatthey're not using any threading so it'swaiting for one to finish next one nextone if I added threading here I couldprobably stack those all up against eachother and speed this transaction up soif I did that that black line it wouldall move over and it would be faster sothis helps you kind of optimizeperformance diagnose problemsthere's lots of other features in Jergerinstead of viewing it um in a tracetimeline I can a relatively new featureis I can get a timeline view so there'slots of different visualizations forthis trace um so the flame graph ispretty cool but you can also get a graphview there's lots of other kind of uhareas that you can look at it um you canalso see the raw data the JSON basicallyand the reason why this is useful isbecause if you're debugging on yourmachine you can actually go and upload aJSON file right here you can even getthis directly from OTEL upload it intoJerger and it'll actually visualize theraw JSON for a particular trace so ifone of your colleagues maybe doesn'thave access to your Jager you could sendthem this JSON and they could look atall the trace data just like you are inyour instance so some people use thiskind of like offline in that way wheremaybe it's running in a customerenvironment and you can get that JSONyou can actually visualize it andanalyze it so it's kind of a coolfeatureum and there's lots of other kind ofviews and things um that you can getfrom Jerger but those are just a coupleof the basic use cases there's lots ofother features here uh that are prettyinteresting and useful i'm going to jumpback actually before I do that I'm goingto switch demos because the other onehas to generate some data while we arewaiting this one's going to run in thebackground and give us a bunch of cooldata that we can use for the nextdemo so I'm not going to go through allof the things that I just went through itook screenshots in case things don'twork the one thing we didn't look at isthe topology view so this will actuallyshow you how many transactions are goingbetween the different components thisdata gets aggregated unfortunately youhave to run a Sparka job to generate thisit's somewhat annoying but that's kindof part of the product today we'reworking on a way to do this uhdynamically using open search in thefuture um you'll see that in the roadmap there's a couple people working onit um so that will be better but you canstill do it today and visualize it wekind of went through the rest of thesefeatures already um I'm going to switchgears a little bit and talk aboutmonitoring so instead of just tracingand diagnostics let's talk aboutmonitoring something we contributed toum open telemetry upstream a few yearsago the team in Jerger builtspecifically to help us unlockmonitoring use cases so aside from adebugging tool and a diagnostic tool howdo I get operational visibility a lot ofpeople join the Jerger channel on theCNCF Slack and they say "I'minstrumenting my service mesh i'm tryingto get tracing data and you know all I'mbasically getting are metrics becausethe traces break because there's no umthere's no correlation between thetraces on the service mesh so what wedecided to do was just derive uh metricsdirectly from the traces so that youcould do more operational monitoringit's a pretty cool feature um so nowJoerger uh can either remote write or bescraped so that you can get the metricsfrom your application specifically theseare the red metrics that's what we callthem uh request rate error rate andduration so now you actually will getall of that data automatically fromeverything that's instrumented in opentelemetry and I'll explain how thisworks and you can use this whether youuse Jerger or not it works with anyPrometheus compatiblebackend so kind of the architecture whenyou look at the open telemetryuh data that's flowing out the traces wetake those into Jerger and then what wedo is we basically use this span metricsprocessor which emits uh it's actuallythe spanmetrics connector which emitsthat so it's a component in opentelemetry that does this automaticallysends it to your Prometheus or you canscrape it um and it's useful in otherdashboarding tools whether you're usingPercy's or Graphana or Joerger i'll showyou how it looks inside Jerger in amoment but it's a really cool featurebecause you get these metricsautomatically you don't have to doanything uh it's automatically derivedso the way that it works in hotel or inJerger version two which I'll talk aboutin a moment this is kind of an exampleyou can see in the trace pipeline webasically are calling span metrics as anexporter so you're kind of taking thetraces sending them to a metricspipeline which is then deriving thosemetrics and sending them out so it's anelegant way to get that automaticallyfrom you know your existing opentelemetry or Jerger uh collectoressentially and then I'm going to showyou how this looks in the UI but this isa screenshot um let me jump over to ademo we'll take alook so I can do the same type of thingwhere I can view the traces but we'remore interested in this monitoring tabwhich I started a few minutes ago so youcan see here that the uh metrics havestarted flowing in i have a Prometheusinstance running on my laptop along withJoerger um and the data is just beingpopulated directly intouh into the Prometheus instance jergeris querying the front end is queryingthe Prometheus instance and then I canview traces that are associated withthat since this application is reallysimple in yours you'll actually have alist of different applications you coulddo filtering and you can visualize thosein different waysum but yeah it gives you a pretty niceuh set of capabilities that let you showuh kind of operationally how it's doingso definitely a usefulfeature jump back tothis so a big project that we've been weworked on for a couple of years that waslaunched officially in November isJerger version two so let me talk aboutthis a little when Joerger when opentelemetry started there was no othercollector out there we had Joerger SDKsthat were used to instrument using opentracing that's the predecessor and thenwe had a Joerger collector when theystarted open telemetry they took theJoerger collector and usedb that conceptto build open telemetry over time we gotrid of our instrumentation we've movedeverything to OTEL and now the secondpart of this is to actually move theJagger components on top of opentelemetry so we essentially replatformthe entire backend and the processing onOTEL and then we ported over like all ofour APIs and other things that were partof Jerger and uh and made it completelybackward compatible so if you're usingJoerger version one and you installJoerger version two the data repositorythe data format everything will work thesame but it also gives us uh nativesupport for open telemetry uh protocolas wellthe other nice thing is we have onebinary and now instead of a command lineinterface that Jerger version one hadfor config where you could pass allthese crazy parameters in now you have aconfig file that's YAML just like hotelso all of us love debugging YAML now youcan debug more YAML so uh that's that'swhat we did here so the the basic ideais that we'veessentially built on top of opentelemetry the components of Jerger andit's a single binary that can act as allof these different rules so it makes itreally flexible and I'll show you a fewexamples we also were one of the firstprojects to use the new extensioncapability of open telemetry so theJerger UI actually runs as an opentelemetry extension which is kind ofcool so it lets you do all kinds ofinteresting things inside the hotelcollectorarchitecture so I'm not going to boreyou with the entire config and thosethat have used open telemetry canrecognize most of this we obviously haveour extensions that allow us to dothings that uh interact with thecollector dynamically so whether that'suh remote sampling or running ourqueries um and then we obviously havethe uh you know the trace processing andeverything involved there uh this is ourstorage API that we implemented insidethe collector um and then you kind ofspecify your backend so Jerger todaystill supports elastic search opensearch and Cassandra i'll talk aboutsome new data sources that we're workingon right now it's part of the road mapbut we probably have 30 otherunofficially supported backends butthose are the ones that we run in ourCI/CD we make sure they work we supportthem if you submit a bug we'll fix itwe'll look at it we'll test it soum there's two kind of basicarchitectures for deploying Jerger thisis the simple architecture so you haveyour applications instrumented you sendit to Jerger in a collector role andthen we write that data to the databaseof your choice open search elasticsearch Cassandra uh and then the UIqueries that database and gives you allthe visualization that I was justshowing you so this is kind of the basiclike if you're in your dev environmentor something small this will work greatas you scale out and start to get thingsmore complicated you may have tointroduce Kafka to deal with bufferingif you have a huge spike in yourapplication or a lot of traces getgenerated at once this will help make ituh you know help uh cue the data slowlywrite it to that back end and notoverwhelm it so it's a pretty commonpattern you can we build Kafka into theJerger binary so you could use it forboth putting data into a Kafka Q as wellas reading it which you can see on theright it can read off the queue and thenwrite to the data store so that's kindof a more scaled out view of how youwould implementJerger the other thing is we also someof the SDKs in open telemetry have acool feature called remote samplingwhich is only a Jerger feature thatactually allows you to dynamicallychange samplinguh headbased sampling on the fly uhwhich is a really cool feature sobasically the SDK checks with thecollector it gets the samplingconfiguration and then it candynamically change the applicationsampling without you restarting it ortouching it so it's a really coolfeature unique to Jerger uh Uber used itextensively that's why it was built butit's only supported in a few languagescuz we have to get all the upstreamhotel language uh SDKs to implement itso um but it's still a a cool featurefor sureum so let's talk about roadmap for acouple minutes uh I'm kind of going torewind over the last year so since Ispoke in Paris uh some of the thingsthat we did we launched Jerger version 2i talked about that already um the Kafkasupport is a big ask we had somechallenges and deficiencies in Jergerversion one thankfully with opentelemetry we have way more capabilitiesin Kafka so it was nice to leverage thecode of the community versus us havingto maintain it so it makes it so that wecan move much faster because we're lockstep with open telemetry uh versus doingour own thing we've also officiallyadded support for elastic search version8 and newer which we did not have beforeit was a common request there was someworkarounds but now it's officiallysupportedum and then I'llmention that we officially have a Helmchart and a Kubernetes operator both ofwhich have come out in the last fewmonths for version two um so thanks tothe mentorship programs that we run wehave students and other folks that comeand work on the project the CNCFsponsors them and pays them for theirtime and so we use the mentorshipprograms from Linux Foundation andGoogle Summer of Code to help work onthe project so the mentorship programspecifically helped us with the Helm theKubernetes a bunch of the Jerger versiontwo work and we continue to use thesementorship programs because they'rereally helpful we all have day jobs wedo what we can but bigger projects wecan guide a student down the right pathand they learn and they contribute andso it's a great way for us to get uhmore people working on the project um sothese are kind of some of the benefitsthat we get i'll mention that again in asecond when we talk about the road maptoo um so we also have huge amounts ofbug fixes if you look at our releasenotes we're always building new thingsfixing things taking all of yourfeedback making the projectbetter uh we also have had more uh UIcapabilities so I mentioned the criticalpath visualization a really cool featureother improvements there's more workgoing on in the UI that I'll touch on inthe road map um but we've added a wholebunch of other capabilities in the UI toto improve it thanks for everyone'scontributions there so the first thingthat is I would say 80% done or more isClickhouse support natively so that'llbe official backend for the project thatmeans we'll test it it'll be part of ourCI system um and if there's any issueswe'll be able to handle it so that'scoming along it should be done very soonwe have a lot of going back to thementorship we have a mentee right nowthat's working on a bunch of cool new UIthings modernizing the visualizationlibraries making things much moreconsistent because if you've used Jergeryou'll kind of notice that there's threedifferent visualization systems becauseit's evolved over time uh since theproject is probably about 8 years old Iwould guess uh so we're reallyrefactoring that redoing it on somethingmodern so the visualizations will bemuch better um and you'll start to getmore and more data overlay is kind ofthe next thing that we're working on soinstead of seeing those metricsseparately now you'll see a graph withmetrics on top you probably get thatwith some of your vendors but now you'llget it in open source so that's a goodthing so we're doing a bunch ofdifferent UI things thanks again to thementorship programs that help us withthat um but we're always open forsuggestions uh so definitely make yourfeaturerequests and with that um I'm going toopen it up to Q&A we'll have a mic goingaround so just raise your hand and askaway we also have a booth in thepavilions i'll be there in theafternoons which is a weird conceptbecause it's not actually afternoon butI'm there all evening tonight and thenpart of the afternoon tomorrow andFriday so pop by the Joerger booth uh inthe project pavilion too and if anyonehasquestions raise your hand and we'll senda micoveranyone none okay this will be the firstCubeCon with no questions but uh feelfree to come and talk to me after and Ihope you all have a great CubeCon andthanks again for your time2025-04-15 22:00:04.759429dKafka how however even that presents uhits own challenges because Kubernetesdoesn't have that Kafka specificnecessary Kafka specific knowledge um tokeep that high maintain um highly uhsorry availability and performanceso streams manages Apache Cafka andKubernetes as I said uh it's based onKubernetes operator pattern and itprovides various operators for runningand managing cafka components and it andmassively reduces uh operationaloverhead so streams really allows you torun kubernet uh cafka in a kubernetesnative way it uses the CRDs to extend uhKubernetes API to define uh Kafkaresources and integrates that Kafkaknowledge into the operator and that isuh really important when it comes tolike doing upgrades and rollingupdates and also manages Kafka resourcesthrough the operator pattern as welllike Kafka topics Kafka users andconnectors right So stream uh automatesthe installation of CFKA as well as itsuh CFKA other CFKA components um such asKafka connect which lets you um connectyour cluster uh to various externalsystems and also Maker which allows youto replicate your cluster for disasterrecovery and also HTTP bridge which isprovided by StreamZ um for connecting toyour cluster over HTTP and there's twopeople as well I'll come to that in abit uh so stream uh handles not just dayone operation also day two operations uhsuch as upgrades certificate managementscaling of clusters and configuration ofclusters and also uses another opensource project called cruise control tobalance data in yourcluster it also lets you monitor yourcluster easily it integrates with uhvarious different uh monitoring systemsso um there's also new uh project thatwas added recently called metricreporter which allows you to exposeKafka metrics directly uh to Prometheuswithout usingJMX also offers various uhauthenticationmechanisms um and it integrates reallywell with other CNCF uh projects likeopen telemetry prometheus andketaCool okay so we'll move on to uh therecent features of stream so we'll startwithcraft so craft it removes zookeeperdependency of metadata management wasused to store the metadata of clustersbefore and it's being replaced byKafka's own implementation based on uhRAF protocol and this simplifies thedeployment and management of theclusters because you have a singlesystem to manage and uhoperate and also improves thescalability because Kafka is very veryscalable and Zookeeper tends to have alimit when it comes to scaling and itmakes it much more efficient andperformant because you don't have tosync the metadata data between zoozookeeper andkafka so this is how uh zookeeper basedclusters look like so you have brokersum cafka broker nodes and then you haveyour zookeeper cluster and one of thosebroker nodes designated as a controllerand then used to talk to zookeeper tocoordinate all the metadataupdates and this is now how craftcluster looks like you just have um onlyCFKA nodes uh some of it is u serving ascontrollers and some of it is uh servingas brokersso those controller nodes they form a aquorum um based on the raft protocol andit's basically distributed quorum wherea metadata topic is stored locally ineach controllernode and one of the controller node isthe quorum leader and it's called activecontroller and then the other controllerum so that active controller is handlingall the partition assignments andupdates to the metadata topicand then all the other controllers arejust following uh and replicating thatmetadata topic updates and it's justtrying to um keep up to dateso brokers also store this metadatatopiclocally and then they talk to I'm sorrythey talk to the active controller tocoordinate the metadata um to get themetadataupdates and um couple of slides beforewith the zookeeper cluster um you sawthat one of the broker used to bedesignated as the controller so none ofthe workers controller needs to uh thebroker nodes need to do that anymorethey just handle the client topic dataand then it's the controller activecontroller handling all the partitionassignments and coordination of thatmetadata so with craft also offersvarious diefferent ways uh you can runthe cluster so the usual setup is uhhaving designated worker nodes andhaving designated controller nodes andthis is great for large clusters andthis is the recommendation uhrecommended recommended setup forproductionum and also allows much betterscalability but you can also run it incombined mode meaning that some or allof your controllers uh sorry some or allof your nodes um serving as bothcontroller andbrokers and obviously this um saves yousome resource because you can run muchless um count of nodes but you need tobe aware that if a node is serving asboth controller and broker it might needmore uh memory and CPU resourceyou can even run single node cluster ifyou need a cluster that if you need tospin up cluster very quickly for justsome testing purpose this works great sothe the the last three setup is uhreally just for development environmentsso of course you can um migrate yourexisting zookeeper based clusters tocraft and that migration process isquite uh complex and manual um butstreams supports it and allows you to doit in a semi-automatic way it's drivenby users um throughannotations it couldn't be f fullyautomated because of the significantdifference in the architecture andconfiguration by however by um givingback some of that control back to theuser um allows you to have the roll backoption so the migration is done inphases and then up until the last phaseyou can uh roll back tozookeeper right so craft was announcedin 2019 uh so it's more than 5 years agoand then it was first supported instream 2029 sorry 029 as an experimentalsupport and and it was enabled bydefault in stream Z040 so meaning allthe new cluster that were provisionedafter this version they will be runninguh in craft mode bydefault and then just two weeks ago umKafka 4 was released removing zookeepercompletely so we're very happy to seefinally this uh craft feature is comingto completion and um now we workingtowards uh stream046 so the current version is two uh 045um it supports Kako 380 and 38390 and it's the last version withZookeeper support and we plan to provideextended uh support for this versionprobably around a yearand so this is the last chance you canmigrate to existing um Zookeeper basedcluster tocraft because um with z uh stream 460uh which is the next release um It willonly support craft mode it will supportcafka 390 and40 and we also using this opportunity toremove some of the deprecated componentslike mirror makerone okay so if you'd like to learn moreabout craft here is the link and withthat I think I'll pass to my colleagueyeah thank you Tina so um craft was oneof the the big features that you know uwas added to to Kafka so now we aresupporting that u since long time in instream uh and uh yeah with Kafka 4 andstreamit 046 that we expect to havewithin maybe this month um you will haveum craft um as the the the first choiceof the the only choice for running yourKafka cluster so it's not just aboutcraft as the main features that you havetoday within streamy but even the tierstorage even tier storage is somethingthat was added in Kafka itself so withtier storage you can think of offloadingsome messages so having some maybe oldmessages um got from your brokers to beoffload on some other storage like forexample could be a cloud storage likeAmazon S3 or things like that Azure blobstorage and so onso um why you should use the the tierstorage uh first of all so there areseveral reasons right uh first of allfor cost uh efficiency so it means thatum you have on your brokers of coursedisks that maybe are more expensivebecause you want performance you can runwith SSD and things like that and uh ofcourse in this case um if you are goinginstead to offload all the messages somessages that are part of the history ofyour uh uh Kafka cluster that you wantto to to have there anyway but they arenot uh read so often by the consumersyou can just offload these messages insome more you know cheaper storage thatcould be something in the cloud andhaving the more upto-date uh data stillon on the on the brokers so you can havelfess dis space used on the brokers andoffloading something on a cheaperstorage and so on so the first reason isthe cost which brings of coursescalability as well uh because you havethe computation mostly running on thebrokers and storing there the messagesthat you are access frequently and maybethe older ones on the on the cheaperstorage but it's also really useful forfaster recovery and rebalancing aboutfaster recovery you can think then whensomething bad happens for example on abroker and you are restarting the brokerthe broker has kind of to recover thedata uh reconstructing the log of coursemore data you have on the broker morethe log are bigger insights you havemore segments in the logs then it willtake more time for the broker toreconstructing everything so in thiscase of course if you are offloadingolder messages somewhere else you can uhbe faster when recovering from thebrokers and even on rebalancingrebalancing uh as Tina mentioned beforethere is the usage of crisis control forexample where rebalancing means uhmoving partitions across your clusteracross all the brokers within yourcluster in order to have a more you knowbalanced load uh across all of them butof course if you are moving uhpartitions which are you know littleinsides uh you can be faster onrebalancing so again you have the olderdata in uh other storage in the cloudstorage for example then you can justrebalancing the your cluster with withless less partitions and less data andlast but not least uh simplified clusteroperations it means that for example youdon't have to deal with the diskexpansions because you are getting moredata more data so you need to expand thedisk on your brokers you are justwriting the old data on the cloudstorage and uh and then you yeah youdon't need to run this kind of operationor for example deleting old segmentsbecause they are getting bigger andbigger and bigger so this is uh theseare the one main reason that you shoulduse tier storage within Kafka uh ofcourse with streamy you have akubernetative approach as usual sowithin our cafka custom resource whichis the main custom resource that usetoday for deploying a uh kafka clusterwith the stream there is a new fieldwhich is the tier storage field whereyou can specify the implementation ofthe remote storage manager that you aregoing to use so in Kafka uh you can evenwrite your own storage manager you haveto implement some interfaces you can umyou know build this jar and baking yourimage uh the Kafka image starting withthe from the streamy one to have thisplug-in inside and then you can applythe configuration here for your uhstorage manager and it will be used uhby stream easy to configure everythingand then by cafka of course foroffloading and using the tier storagefeature so moving the data uh in the inthe corresponding storage that is usedby this the the remnant storage managerthe support that we have today is mostlyabout you know custom implementationthis is why you saw on the previousslide the type of the tire storage iscustom so it means that you can bringyour own implementation and um there isno kind of strict API for the severalproviders that you can use uh for thestorage uh we are exploring stream isthis a open source plug-in which issupporting several storage as you cansee here amazon S3 GCS and Azure blobstorage but of course you can write yourown if you need something different andthen uh yeah using the tier storagefield within the Kafka custom resourceto set all theconfiguration but the plan is uh as youcan see here when um so tier storage isfinally G since Kafka 39 it's now umproduction ready it was not so it was inearly access in 3.6now that it's GA production ready uh weare thinking as a future step to have amore kind of strongly typed API where wehave a better support for the severalstorage that are available uh yeah thatwe have out there instead of having thiskind of customsupport the next feature is about autorebalancing on cluster scaling so asmentioned before there is a cris controlintegration with streamy can use criscontrol in order to run your rebalancgingin the cluster but um the the interfacethat you have today is by using aspecific custom resource it's calledkafka rebalance so instead of talkingwith the http rest api exposed by cruisecontrol in order to get the proposal toask to cruise control to run therebalancing still again kubernetive wayyou are writing your kafka rebalancecustom resource where you can specifythe goals that you want to achieve withthe rebalancing but it's manual rightAnd if for example you want to dosomething like scaling of your clusterso you are scaling up you are addingmore brokers these brokers are notgetting partitions from the existingtopic they are getting partitions andreplicas only for the newly createdtopic what you have to do after scalingup your cluster you have to run arebalancing so creating manually yourcafka rebalance and increase controlwill move partitions the existing onefrom the brokers to the new ones thesame if you want to scale down if youwant to scale down you cannot just scaledown if the brokers that you want toremove are hosting partitions becauseyou could have problem like partitionsoffline and under the m the minus insyncreplicas and so on so you don't have thedata uh accessible this way so you firstneed to make these brokers empty somoving the partitions off and you can dothat by running a rebalancing and thenwhen the brokers are empty you can scaledown of course it sounds today as a kindof two manual steps process right sofirst scale up and then moving partitionor first moving partition and then scaledown with the auto balancing that youcan configure within the Kafka castleresource you can just deal with scalingso you just increase the number ofreplicas of your brokers or just scaledown and so decrease the number ofreplicas and then the operator will takecare of running the scale up and thenrunning the rebalancing for you or onthe other way around first running therebalancing and then running the scaledown of course there is a kind oftemplating here because you are notwriting your cuff carry balance but youwant still specify the goals that youwant to achieve in rebalancing so youcan specify a cuff carry balance as atemplate and specifying the goals butinformation like the so-called mode thatcruise control has to use so add brokersor remove brokers or what are thebrokers ids that are going to be addedor removed it's something that theoperator can set for you because theoperator knows that you are adding thesebrokers or removing these ones so it'sjust a matter of scaling up and scalingdown the rebalancing will happenautomatically for loop so these are thethree main things main things that wehave today in the latest version ofstream craft tier storage and autorebalancing or scaling what's next theseare the five main things that you areexploring today so the first one isabout improving the certificatemanagement so within streamy today uheverything is secured by default sothere is TLS connection between all thecomponents uh across the cluster uh thestream image operator by default createa uh CA root certificate which is goingto use to sign the server certificatesfor all your components or the user canbring uh their own certificate to be putinside the secret uh in order to to becompliant with what streams is expectedright um there is no integration withthe kind of uh certificate managementsystem like s manager so there was aproposal uh it was approved uh and theimplementation is going on by one otherour core maintainer Kate Sunlay and uhyeah we are adding this betterintegration with search manager so youcan specify to use within your customresource that you want to use a searchmanager providing several informationset manager will going to create the theentity certificates for you and then theoperator will take care of bringing thecertificate and use them for yourcluster so the user can kind of set upthey own PKI um in order to handle thethe certificates the way they want theimplementation will be uh pluggable insense that we would like to have in thefuture support for other certificatemanagement system but the first ohne willbe search manager so we are focusing onthat the next thing is Kafka clusterself feeding so again cris control Ialready mentioned Kafka rebalance forrunning manual rebalancing or the autorebalancing which works just for scalingup or down but cris control providesanother feature the anomaly detectionit's able to detect u some anomaliesthat you have in the cluster like brokerfailures disk failures or go anomaly sogoal violation topic anomaly and it'salso able not just sending notificationsthat you can define like I don't knowusing slack or some other or your evencustom implementation for the notifierbut it's also able to um fix the issuefor you and of course in this case so wewould like to integrate thisself-healing feature that cris controlprovides of course in this case thereare some challenges right the user isnot um controlling the process becauseit's cris control starting at some pointto fix the cluster there is actuallynothing that you can do but we wouldlike at least providing a way for theuser to know what's happening right soan idea could be just sending somekubernetes events so that you canmonitoring and say okay cris controlstarted this kind of fix because thisanomaly was uh happening uh thisproposal here a stream proposal is astream under is still under discussionso if you are interested to this futureI would suggest you to jump intodiscussion providing feedback and seewhat what are the yeah the things thatworks for you and do you think that willbe better for for the stream projectanother big step will be moving to to v1APIs and finally releasing stream1 sofor a long time streams uh stream was atzero do something like and it doesn'tmean that stream is not production readywe now have a lot of companies and usersin the community using streamy sinceyears in production it was just ourdecision to have stream one zero outonly when zookeeper was going to beremoved and it took a lot so a long oldtime right so uh now that we have Kafka4 a stream 046 coming it will be thetime to start thinking about finallystreaming Z1 um also together with ofcourse um moving to the v1 API becausetoday we have a v1 beta 2 and of coursethe process will not that simple so Imean uh we are going to remove somealready deprecated fields maybe we arereting something based on the lessonsthat we learned in terms of the API thatwe designed since the beginning we willhave several streamy release uh where weare going to support both v1 and v1 beta2 uh API and then at some point uh we'llrelease stream one with just the APIv1 the next feature will be gateway APIsupport so today with streamy if youwant to expose your cafka clusteroutside of the kubernetes cluster youcan use ingress but you know ingressdepends also on the controllerimplementation for ingress like and soon uh you can use node node ports or youcan use also routes if you are runningon open shift for example uh but youknow there is this gateway API thisframework that is pretty mature now andum um you so the the the goal of thegateway API is kind of replacing ingressand we would like to have a built-inintegration of gateway API within streamso today we you can already use thegateway API there is also a blog post onthe streamy uh website uh by a communityuser and um and it was mostly aboutsetting up everything manually So youhave to set up the the several customresources you need for the gateway setup the gateway the TLS route and so onuh what we want to have is more built-inintegration in stream so you can specifyin stream in the Kafka custom resourceyou have the listeners for exposingright your cluster uh you won't likejust to set up that you want the gatewayAPI and stream will set up everythingfor you so it's something that we haveto start to thinkabout uh the last one is about stretchcluster so stretch cluster is aboutrunning a cafka cluster stretched acrossseveral kubernetes cluster uh there wassome requests from several communityusers and actually there is a proposalwhich is written by some users from thecommunity still under discussion uh ofcourse there are a lot of challengeshere uh because Kafka is very sensitiveto latency so we don't see this workingon uh you know kubernetes clusterrunning in se in different continentsfor example because the latency will betoo high but maybe in metropolitan areanetwork and so on and it simplifies forexample the the migration of the graphcluster between um kubernetes clustersso again a lot of challenges thediscussion is going on there is theproposal if you think that it's aninteresting thing yeah jump intoit regarding in the future is not justabout features so as I mentioned we havethis list of new features coming andunder discussion there is the streamiconuh streamicon is a virtual conference wealready had the first edition last yearuh I can say that it was a success frommy point of view we had kind of 300people joining more or less uh it'svirtual it's free um the agenda will beout soon uh maybe in the next week anduh you will see sessions around umstreamy core internals so how thingsworks internally in stream or forexample use cases or scenarios aboutusers or companies using stream uh inproduction uh or for example how streamcan work um with other CNCF projects sowhat are the kind of integrations thatyou have withstreaming yeah so this is a link whereyou can find all the information forjoin the project uh joining the projectas any opensource project could be aboutyou know just jumping into the slackchannel and asking for questions if youare in troubles with using streamysomehow uh so finding help from thecommunity or for example um raising bugsuh on GitHub and if you want you caneven contribute to fix a bug it will begreat or uh implementing featuresproposing features so we have thisprocess like in Kafka the Kafkaimprovement proposal we have the streamimprovement proposal so if there issomething that has a big impact on theproject you can start a proposal and sostart a discussion about I would like tohave this what do you think or thecommunity would think about that or forexample even just fixing thedocumentation so you are going throughthe documentation for the several stepsfor configuring stuff and you see thatsomething it's not clear or doesn't workthe way that it should work um yeahthese are all the reference and um itwould be great to have some of you usingmaybe stream easy engaging the communityso that's all thank youwe have one minute I think if there areany questions otherwise we will bearound and you can justuh there is a mic coming sois there an uh an alternative for mirrormaker because you mentioned that it wasuh going to be dropped sorry analternative for mirror maker mirrormaker ah for for mirror maker no mirrormaker 2mirror maker oh uh so we deprocatedmirror maker one mirror maker 2 so youcan still Yeah use that it's an oldversion of mirror maker with deprecatingthings any chance that you go can goover the differences between mirrormaker 1 and two if likeuh any chance that you can go over thedifferences real quick between mirrormaker one and two difference thebreaking changes between one and twoyeah exactly mirror maker so mirrormaker 2 is totally different it's basedon Kafkaconnect again mirror maker 2 is totallydifferent from mirror maker 1 it's usingKafka connect underneath i see so it's atotally different thing right yeah itwas rewritten to make it more efficientand performant while mirror maker onewas still based on Kafka clients andunderneath so it's running this Kafkaconnect which is a another Kafkaum framework within the Kafka ecosystemabout moving data uh across severalsystems so for example from a databaseto I don't know some other database youcan go through Kafka like a log so forexample ifI'm deploying a vivo maker 2 I'll I'llalso see in the same name space cafconnect uh you will see mirror maker 2but underneath it will run Kafka connectthank you so much so yes you don't seeyou cannot interact with mirror maker 2like it was Kafka connect you got tofind the migration as soon as possibleyeah yeah yeah and that's the the toolfor running this yeah for runningmigration thank you so muchokay thank you out2025-04-15 22:00:05.480463jeGoogle uh you can say that because youwork at Google i work at Google i'mallowed to to take the blame but thecloud providers had a lot of cloudprovider specific code and you knowanyone who's looked through what happenswhen you want an IP address for yournode or you're trying to work out thenode isn't responding anymore is the VMunder it still in existence is quitefamiliar with how much actual cloudprovider code is needed to make all ofthis work and every time we added a newcloud provider that complexity went upand so the general premisewas let's stop let's just stop no morecloud provider code in KK the problemthen is now you've got two classes ofpeople operating you've got someonewho's in tree and gets to do things andthen you had a lot of secondary cloudproviders who weren't in tree and it wasfairly unfair to them and So we startedand we've been working very hard and asof I think it was about a year ago wedeleted over a million lines of code allthe cloud provider code is gone it's outof KK i mean red diffs are good diffsrightyeah yeah and I mean if you look at likethis uh little timeline we've puttogether here you can see the process ofhow long it took us to do this goingback to what you know Walter was talkingabout you know this cloud provider codehas existed as part of Kubernetes sincethe 1.0era and around the 1 you know 13 time iswhen these decisions were made to startsaying okay we need to start gettingthis stuff out of here and you can seeit took us until you know1.31 to actually finally get there andit it doesn't mean that nobody putanything out of tree before that theycertainly did it's just there was stillsome in tree code that hadn't beencompletely removed and so every timethere were CVEes all right we fixed theoutofree one that we're currentlyworking on and also let's port somepatches into the inree like that wasobviously unsustainable yeah and youknow so like a big part of the SIG cloudprovider from the beginning was thisextraction and migration effort butsomething else we wanted to highlight aswell is there used to be an SSH tunnelbuilt into Kubernetes as well and theSIG has created or or helped I guesshost the API server network proxy um andI don't know if you want to mention likewhere we're at with that yeah I mean sowhether you're aware of it or not it'snot a required piece but the APImachinery code when the Kubernetes APIserver wants to talk to your cluster itneeds to be able to route traffic anddepending on how your cloud providerworks there are differing ways for thatrouting tohappen excuse me and so I know Googlebut I believe even IBM and uh Microsofthave some use of this system known asthe API server network proxy that takesthe traffic from the cube API server androutes it cleanly through and drops itinto the actual data plane of thecluster and so the referenceimplementation for that is actuallyowned by SIG cloud provider and allowedus to remove the very unfortunate SSHtunnel code that pre preceded it whichis probably a good thing i mean we don'tneed more back doors to anything rightand so like so we've like you know takenall this code out we've got everythingmigrated we've got the API servernetwork proxy like we're all done nowright we can we can just call it quitsand go home right declare victory righttalk over no not I mean not really rightlike there's there's a lot of placesthat we still have work to do um whetherit be the API server network proxy thatby the way if that sounds interesting toyou is a codebase that we would welcomeum participation in and I think probablyum looking at the migrationretrospective gives you also an idea ofthe kind of stuff that the SIG does yeahand you know we I think one thing I wantto go back and like put a PR into themigration retrospective to cover the 1million lines of code because I think wedidn't quite understand how much wasremoved from the source base when we didall this and I'd love to go back andhighlight some of that but there areother things we'd like to think aboutfor the futureyeah and I think this starts to get tothe question of what is the missionk ofour SIG if our SIG originally startedfocused on this extraction migration andnow we're like mission accomplishedwe're done but we're not actually donebecause as we'll go into some detailtalking about uh we're we have um bothareas of intersection with the rest ofKubernetes like for example API servernetwork proxy it's sig cloud providerbut it's also sig API machinery um andthere are exciting new open sourceprojects um you know uh maybe I'll letWalter tell us a little bit more aboutCrowwell before we get to Crow which Idefinitely do want to talk about one ofthe things to remember about thatmillion lines of code is it wasn't allproduction code there was test code inthere so many tests and there are testswhich require cloud provider code to beable to run if there's no cloud providercode those tests can't be run at leastnot off of just what you built in KK sowe still have some homework to do andthat is how do we make some fundamentaltests um I'm both Bridget and El Mo aretired of my example but does cublet workif you restart the networkseems like something that should workbut every cloud provider implements nodedifferently which means that themechanism you need to restart the nodeis different by cloud provider or sorryto restart network is different by cloudprovider so that is a cloud providerspecific test you can only run that testif you know which cloud provider you'rerunning against yeah so but back to theidea of there's a lot of things otherthan the testing which is the rest ofwhat this talk is going to focus on butthere's a lot of things that we see aswe need to investigate what should thisSIG be doing absolutely and so Crow isThank you crow is this great exampleright uh the idea behind Crow is if Iwant to generate aCRD as a platform engineer so kind ofshifting things a bit I may want a verynice simple API to be able to generatesomething like I don't know my workloadright and that workload may need adatabase and a web server and some ROSand you know load balancers and storagedevicesand I want to give this to or an an MLworkload and I want to give this to anapp engineer or an ML engineer and saylook the only thing you actually careabout is the name of your workload somesort of size t-shirt sizing estimate andyou know the image and I the platformengineer can take those three or fourparameters in a CRD that I've given youand automatically expand those into halfa dozen or a dozen other systems and infact on top of that that database that Ijust mentioned even though the rest ofthe workload is Kubernetes native thatthat database maybe is actuallysomething that's provided by Microsoftor it's provided by Amazon and I have anexisting controller that if I generatethe right CRD using something like AICKor ASO already knows how to go andcreate that database for me and so theidea behind Crow and tools like it is tomake that a lot easier and it especiallywhen you start looking at KCC ACK andASO those are Kubernetes native ways tocontrol your cloud provider but they'recomplicated they're difficult to use andhow do I make those integral to myworkload and so this is sort of apossible way we could go forward it's afirst draft of how do wemake cloud provider code the same kindof workload as anything else that I'mdoing it's handled in a Kubernetesnative fashion just like anything elseand done in a way that is simpler to usethan what has come before it and so thisall sounds good but how do we make sureit goes well yeah and I I before I stepon here though I just want to say likethis has caused us to think about theSIG charter because if you look at theSIG charter the way it's written today alot of it talks about the extraction andmigration and we never considered theseother use cases that are cloud providerspecific but yes as Bridget is callingout today though we would like to talkabout testing and you know why do wewant to talk about testing because liketesting is the coolest thing you can dowith software right it's so awesome yeahit's like it's super fun and everythingwell okay i mean jokes aside like likeWalter said we lost some teslts duringthe extraction and migration processright and now we have new cloudproviders who are part of Kubernetesthat weren't necessarily part of theoriginal inree code and they havedifferent features and different ways toexpress these things and so we wouldlike to look at testing as part of thenext phase of what SIG cloud provider isgetting into and so to look at why thisis important we're gonna throw up ournice data here and I think Walter isgoing to tell us a little bit about howwe've been getting better over time yeahand I mean I think this is a veryinteresting chart and thank you forJordan Liot for generating it for us butthis is looking at how many regressionswere discovered at a particular monthafter a particular release goes outright and so we look at 119 and at monthyou know month four we're at 16regressions which is pretty terribleit's why it's red we have had other rereleases that were a lot better like the127 release at month four it looks likewe were at about eight releases or eighteightregressions so I mean this is this is anoverall this isn't cloud providerspecific it's not looking at anyparticular SIG but if it definitelysuggeststhat there have been times where we'vegotten either gotten lucky or gotten thetesting right and there have been timeswhere we haven't gotten lucky or wedidn't get the testing right and and sowe want to work out how to be bothluckierand getting the done the testing jobdone correctly and one of the things isyou know we we're not trying to replaceSIG testing but we do want to work outif SIG testing is getting us the righttests and getting the test done in KKwhen you don't need a cloud provider howdoes SIG cloud provider help SIGtesting when you do need the cloudprovider and so we're we're activelylooking at how do we solve that problemhow do we get the tests run when youabsolutely require a cloud provider forthe tests to work yeah and and what wewant to see is what this chart isshowing us which is that if you look atit reading across the uh I guess roadsit would be we can see that postreleasewe're starting to get better and youknow we we postulate that this is basedon the idea that we're adding more teststo Kubernetes we're making it betterovertime so seriously we think this isprettyimportant bridget why don't you tell usabout why this is important to our usersyeah I I think it'sbecause if we add new features they needtesting for there to be end userconfidence and um operator confidence inthe fact that we're actually deliveringwhat we say we do and it's also reallyimportant and there was in the keynotethis morning I thought it was reallyinteresting this idea of you know umEuropean cloud sovereignty and likemaking sure to fund and um allow forspecific geo placement of exactly whereyou want your workloads to be and thatsort of thing uh if we don't have goodtests and configurable modular testsit's not possible for a new cloudprovider or um even an existing cloudproviders geospecific area to testaccurately like we need instead of thatpast where we had the specific KK thetest in KK and then some alsorans to be a first class citizen ofgetting their tests run you know as wellas all the other ones and I Think let'ssee probably another way to think aboutthis and I think this is important tooand maybe um from a Red Hat point ofview this would be interesting to hearyour perspective Mike is that noteveryone is using Kubernetes packaged upfor them exactly by one specificprovider like a lot of times people havedifferent distros or even ways that theyuse it non-commercial releases etc yeahand and this gets into a concept that Ithink you know Walter kind of introducedme to and I'd kind of like to turnthings around and say like when we startto talk about Kubernetes and we start totalk about all the different cloudproviders and how you put these thingstogether there's a natural questionwhich is you know why can't we just doall of this testing in KK you know whylike okay we extracted the cloudproviders but why can't we do all thetesting there and one of them Imentioned before is you know networkingimf I want to shut down this the thenetwork and start it up again i mean onewe have the fundamental question is itLinux is it Microsoft Windows but nooffense to Bridget let let's just payattention to the Linux for a second yeahAzure is a very Linux focused this istrue is there one way in Linux to shutdown a network and in fact you discoverquickly that you know service networkstop service network restart doesn'twork on all Linux distros right and so alot of these assumptions that you mightthink are really easy will work on somecloud providers but not all and it getseven worse uh I mean I know there's abig thread going on right now in otherparts of CubeCon on what are we going todo about um load balancing andnetworking right and if that there is noone true way of doing load balancing orwhat load balancing means and so whenyou want those load balance tests uh wedon't have the new load balancersnecessarily turned on by default you'vegot to pick your solution and you knowdoes the do the various things that needload balancing work with load balancingcorrectly you want to be able to testhey does this service that I need workcorrectly with the load balancer fromthis cloud provider and this is just arepeating pattern we see it with elasticstorage we see it with load balancing wesee it with IP addresses we see it withthe way that you even things like okayAPI server network proxy that we we ownthere are some cloud providers use itsome don't it's out of tree it has corekey hooks in the cube API server andmultiple ways that it can be configureddoes it do HTTP connect does it usegRPC each of these are very you knowvery cloud provider specific and arebased on things that don't exist insideof KK and it sounds like Walter isdescribing a giant complex um you knowexponential complexity text matrix thatsounds suddenly almost unapproachablelike how are we going to solve thisright right so we have an idea and we'veformed it into a three-step plan of kindof how we'd like to proceed and makethis better in the future so one of thethings we'd like to do is encourage moreproviders to participate in Kubernetesum you know we we see more providerscoming to Kubernetes every day and wewant to make an on-ramp that makes iteasier to get into the cloud providersthe cloud controllers and into thistesting world and then secondly in orderto do this we need to refactor the testsuite that's there for a lot of thereasons that Walter was just talkingabout you know previously these were allscripts that lived in the KK repo and soto turn off the network you know deviceon an instance running on GCP that's adifferent script than it is for Azureand so now we have understand how do wemake this modular for the future so thatas someone like IBM cloud is joining usthat they can do the same type of thingand repeat the same type of patterns andwe can use common test to do that andthen the third part to this and this isalways the part that you know we're badwith is kind of the end part of themaintenance right we need to supportthose communities once they come andstart adding things to Kubernetes so asnew cloud providers come to Kubernetesas they create their cloud controllersas they get involved in the testing weneed to make sure we're there to supportthem and continue to make sure that theycan have a good experience in thecommunity so now we're going to dive ina little bit deeper and we're going tostart by talking about what does it meanto get a provider involved in this andso Walter I'll hand it over to you soone of our basic things is we provide arepo right and there are a lot of themthere's SIG cloud provider AWS SIG cloudprovider Azure SIG cloud provider GCPSIG SIG cloud provider Huawei rightthere are a lot of these repos and theidea here is it is a place that we canone put a lot of the CCM spec there'sthe CCM cloud controller manager that'swhere the controllers you need for yourfundamental piece can live and you canbuild your own CCM binary that is formanaging your cloud provider code butthat's just the beginning of the storywe need uh you know anything you needfor nreg for uh the the artifact registrydo you need special tokens for yourartifact registry uh do you have yourown CSI drivers do you have somethingspecial like are you pulling in the APIserver network proxy do you have yourown manager which is not the CCM but isin fact yet another controller manageror your own load balancer not looking atanyone in particular your own loadbalancer controller that is separatefrom the CCM right somewhere there needsto be a place where all of the bits thatare needed to get your cloud providernot a GKE not an EKS not an AKS but justa cloud provider distro that we can testas open-source that Kubern Kubernetesworks on your cloud provider and so ifyou come to us the S the Sig cloudprovider we can provide you that repo wecan provide you guidance on how to putit together and how we can then makethat you know so that it builds itcreates a cluster and that that clustercan then be tested by the system thatBridget and Elmo have been talking aboutand and if that sounds like a lot take alook at the we have an actual you knowQR code there that goes and the slidesare also going to be published um thatgo to the specific example of IBM cloudwalking through exactly everythingWalter just described right because inorder to do that we need the cloudproviders to become involved in the CNCFand to be able to give us or not give usbut be able to share with us thehardware so that we can run these testsright and I I want to give a big shoutout to the IBM cloud folks because theyjust finished this process and the uhthere's an issue here on this the Iguess the right hand side QR code thatshows how you bring a new cloud providerinto the test infrastructure so ifthat's something that sounds interestingto you please check out thosethings but let's get a little bit deeperinto like refactoring the test suite andwhat that means um you know we talkedabout bringing up and down networkinterfaces we talked about like nodelife cycling and those kind of thingsthis is different on every cloud and inorder to make a future where people cancome and kind of have a self-serviceexperience where they can work on theirrepository and then commit those changesand share them with all the testinfrastructure we need to be able torefactor the test suite so that we candistribute this functionality so that wecan have a common test that says when Idrop an interface and bring it back upwhat happens does that work and the cleach cloud provider can then say yes Ican do that or no I can't do that andthen they can actually give thefunctionality to do it that way so thatwe can start to make these more modularover time and what and what I want to dois dive a little bit deeper into thesetestsAnd I want to kind of like ask Walterhere so you know we talked about nodetesting we talked about droppinginterfaces and whatnot like what aresome of the other problems around thatwhat are some of the other details thatwe need to get into to do this type oftesting i mean it's a great question andthe subtlety here is sometimes we don'tknow until we hit it so I mean throwingthrowing my my parent company under thebus we have a rule that says images thatyou want to run have to be inGCR.io and we have tests that have beenwritten i can think of a particular onefrom Red Hat where the image that youwant to run lives outside of GCRIOwell you know our our uh our admissioncontroller on the cube on the cube APIserver just went "Yeah no you're notallowed to run this image sorry notgoing to happen." Right and it soundslike a simple thing but that meanssomeone had to someone I mean me had togo and find the source code that DavidEids used to write that test and buildthat image and I had to go and read thesource code make sure it was good workout how to compile it compile it andregister it toGCR.io everything else about the test isthe same it's just that image path rightbut it can be as simple as somethinglike that that causes an image to workon one cloud provider and not on anotherand yeah I think I wanted to highlightthat Walter and you know um Elmo hereare giving us a really good oexample ofthe crossorganizationuh collaboration that's possible in thisSIG and you know in CNCF in general likeI would say um we've run into a numberof bugs with the way that Azure cloudprovider for example worked that werebrought to light by Joel you know at RedHat um on um on Michael's team and It'sfascinating to see that when you havethat perspective you know the we'redoing something and it just works wellif somebody from a different you knoweven configuration of network orwhatever tries turns out maybe itdoesn't work so I love thatcross-pollination of testing and ideasright and so like we're talking a lotabout networking we've mentioned thisnetwork interface case a bunch of timesbecause it's it's kind of an easy one tothink about but another one to thinkabout in the similar light is kind ofthe node life cycling test right therethere is a controller in Kubernetes thattalks to the uh underlying cloud andsays hey does this instance still existand if it doesn't we should delete thenode object and make sure that you knowthe API server is updated and everythingand this starts to get into what we nowneed to do to make these tests better isthat we've had this process where we'veextracted everything and in order to notyou know slow down Kubernetes releasesor to impede them we've had to makechoices where we said well this testonly runs on one platform it's not avery good test the way it is let's pullit out so we can keep the process movingnow we need to go back and startauditing these things and this is kindof what we're talking about here when weget to the the deep dive section rightthere are different things that we needto do for every piece of this and thenode part is one one part of it butthere's also service tests we can talkabout too we uh we're pro possibly atiny bit short of time so I don't knowif we'll go into a ton of detail aboutservice tests but I'll just say that ifwe're we're thinking about a service andwe're thinking about service health it'slike and load balancers and loadbalancers and have have any of you allexperienced you know the oh our ourerror just returns 200 okay and it'slike no that's not a good error yeah soin the interest of time I'm going toskip over uh this these slides here butI recommend that if you're reallyinterested in understanding thefunctionality of the different parts ofthe cloudcontroller we advertise in theKubernetes documentation that cloudcontrollers do a certain thing and weneed to make sure that we're testingthose and that they do them or advertisethat they don't do them so that usersknow what they're getting into when theyget into these things and so it'sactually testable right and you knowit's also important to note that likeeach provider might have specific teststhat they want to run that are unique totheir provider that don't exist on otherproviders and we need to make sure thatthere is an opening for them to do thattype of testing and have it all connectto Prow and all the other all the othergood things and the other thing I wouldmention I mean just just a thoughtexercise if this testing happens withrelease that's every four months if it'sbroken after you haven't run it in fourmonths how hard is it to find the issueand if the answer is very hard and Ithink it is then the question is howoften do you need to run the tests sothat you have confidence that it's a loteasier to find the change that brokeyour cloud provideryeah and one more one more thing tohighlight here and just to kind of posethis as a question for everyone here tothink about and and Welter mentioned itearlier and as we talk about differentclouds different deployments ofKubernetes there are different operatingsystems running and how much of thesedetails do we have to start become awareof in the meta data around Kubernetesso I think the last part of ourthree-part plan is to support thecommunity and this kind of goes into howwe're going to document the tests and doall those other things and I think we'rerunning pretty short on time so I'mgoing to like accelerate to the what'snext part because that's really you knowpthe exciting part here and this isexciting because this is where are yougoing to jump in and get involved andmaybe you're spinning up a new cloudprovider and that's great but maybe youjust want to get started contributing asa contributor and you want somethinginteresting and great to start workingon and we have a number of interestingproblems that we put our heads togetherand thought okay I mean let's let'sstart withuh what does it mean philosophically tobring up a cluster right what does thateven mean and this is where Walter and Iare going to have like a cops andcluster API kind of wrestling match orsomething right these are questions wewant to solve right it is it'sabsolutely one we want to solve and youknow we don't need to sol we're nottrying to do a matrix test right but weare trying to make sure that arepresentative GCP cluster is tested arepresentative AWS or Azure cluster istested a representative Huawei clusteris tested yeah and I think that theother place that's really interestingthere is we don't know what we don'tknow but I bet a bunch of you folks youknow watching this live watching this onthe recording can find some gaps in thetesting and figure out where are wemaking incorrect assumptions and missinga test at this point we know some aremissing yeah and you know if you'rewatching here or listening at home orwatching later Yeah bring us your goodideas we have a an enhancement draft andI'll switch to the page with all thecool uh QR codes on it we have anenhancement draft that's currently upthat we're trying to figure out thesequestions about the tests and we want tomake an enhancement that goes intoKubernetes in parallel with how we'rebuilding proofs of concept to show howthese things work and so I think that'sreally where we're going to need helpnext is building these things helping uswith the enhancement and kind of lookingat the future of a SIG cloud provider uhthe the other thing I would like tomention and this isn't necessarilyobvious so I want to make it clear SIGcloud provider GCP is not owned byGoogle sig cloud provider Azure is notowned by Microsoft sig cloud providerAWS is not owned by Amazon they areowned by the CNCF they are owned by theby Kubernetes they are owned by thiscommunity right they are in KubernetesSIGs like literally the Kubernetes SIGsorganization absolutely and that meansthat if you you're a customer there andyou're like hey I want thistested that doesn't mean oh I'm not acloud provider I can't beparticipate come to our meetings we willhelp you get involved in maintaining therepo or the dro that is important to youright this is not a oh only cloudproviders are allowed to show up tothese meetings that's not true yeah andI'm I'm already wearing my hopelessoptimist cap so I will keep it on forthe moment but I want to see a futurewhere cloud providers and non-cloudproviders and community members can cometo this SIG cloud provider activity theycan see a rich set of testing they canunderstand bugs that are happening on aplatform that might be their favoriteplatform and they can say you know whatI can make a contribution and the peoplewho are maintaining these pieces of codecan understand that my contribution isto be trusted because I've I've writtenthe test i've demonstrated how thefailure is happening and now I'mcreating a pull request to show how showhow we can fix that problem and so Iwant to see a future where more peoplecan get involved in in these cloudproviders not just someone from AWS orfrom Google or from Azure or even justfrom Red Hat i would love to seecommunity members who are using thesethings finding the bugs reporting issuesmaybe showing us in the tests how it'sbroken and then proving that it can befixed with a pull request so that's mydream for the future i I love this dreamand I agree with all of it and I alsothink that we should probably anticipatethat uh where can folks come um Ibelieve it's tomorrow at lunch to theSIG meet and greet that's a great pointthere is a SIG meet and greet tomorrowand we will have a table there and youmay see one or all three of us hangingout there and uh come talk to us and askquestions and get involved in thecommunity and we also of course have ourregular SIG meetings it's all on the uhSIG information yeah and if you feel soinclined the bottom QR code is forfeedback for this session we wouldappreciate anything you'd love to uhgive us comments aboutare we all set on time then i think Ithink we're good you got questionsyeah does anyone have questions yeahwell be happy to answer questionshi uh thanks for the talk um I'm reallycurious about those repositories thatyou mentioned um and I was wondering I'mI'm still not quite sure I I completelyget the picture but the idea is tobasically have certain tests which areprovider specific right um have youthought about how this will scale withprobably many more cloud providers thanyou have right now yeah absolutely solike what uh myself and a colleague havebeen investigating about building aproof of concept around this is thatideally what we want is tests that livein the KK repository right common teststhat all cloud providers can share forexample the network interface drop testright but what we want to do is build aninterface that allows each cloudprovider to implement a specific youknow concrete version of that from theircloud provider repository so the code isnot back in KK but it uses thoseinterfaces and then when the test isrunning you know the right now if youlook at the tests in our endtoend suitethey have things that say like you knowskip unless platform is and then it youknow has lists and platforms we want toinvert that we want to have an interfacewhere the test can ask theimplementation do you implement droppingthe network interface and bringing itback up and then the test implementersin the cloud provider repository caneither implement that functionality ornot implement that functionality and Iwill add to this this is this is aworking um mechanism today so not on thetest side but in the actual controllersthemselves so if you look at somethinglike the cloud life the the the no thenode life cycle controller it can take ait lives in KK it has an interface tothe cloud provider and so it doesnothing in KK but when you compile thatpiece of KK into the CCM in in the cloudprovider repo it can take a cloudproviderimplementation that and that interfacehas a question like does the VM stillexist right and so the idea is to takethat same sort of mechanism and extendit to not just be for the the runtimebut actually also for the test time yeahand you can see a great example of thisi was going to give a shout out to SIGstorage but I don't see Yan in hereanymore maybe Yan if you're here raiseyour hand i don't see him anyways Iwould look at the tests in KK in SIGstorage because they've done a beautifuljob of this with the CSI testing theyhave common tests in the KK repo and allthat each each individual CSI driver hasto implement is a configuration file nowtheir their setup's a little bitdifferent than ours but I think it'sreally elegant the solution they came upwith and to the scaling yeah we justhave one set of tests and you combine itwith the CSI storage repo for thatdriver and you can make it work theother thing I'll mention on sorryBridget the other thing I'll mention onthis scaling is there's a couple ofdimensions of scaling right so one ishow many repos do we need i don't knowit's a great question um we have dozensI think at this point but I'm not surehow many more we're likely to need theother part is how much testing we needand so one thing I will say just flatout is we're probably going to have adifferent kind of test run models andit's going to be up to a cloud providerto volunteer re uh compute time to beable to run the tests at a givenfrequency for them and this is where Istart talking what I mentioned earlierwhich is how often would you like thosetests to be run how difficult do youfeel you're willing to accept it beingwhen to debug why a test failed allright I think we're at time so we'llleave it at that uh thank you all somuch for your questions thank you to myco-speakersthanks2025-04-15 22:00:06.260038 ��H#��5AWeWQqQM6kjMso welcome everyone to the SIG cloudprovider deep dive where we're going totalk about testing uh cloud controllermanagers and a little bit of an introhere to start with uh I'm Michael McCuinuh I'm a software engineer at Red Hatwhere I work on cloud providers and alsocluster autoscaling and whatnot and uhBridget yeah I'm Bridget Crumbot and Iam a product manager at Microsoft acloud pro SIG plug cloud providerco-chair um focused on lots of upstreamopen-source CNCF stuff very exciting uhand yeah Walter hey I'm Walter i'm an EMin denial i I still pretend that I'm aSUI i would like to keep pretending thatI'm a Sui but unfortunately sometimes Ido actually have to act as an EM buttake it away Walter all right so historytime um SIG cloud provider has startedas a working group uh Tim Hawin came andwith a few other folks said"Hey our codebase at Kubernetes isgetting very complicated." And this isbefore it got nearly as complicated asit is today and a lot of thatcomplication has to do with Google andwell others but largely at the timis want to we want to make surethat um the organization is now alsoable to deal with the huge influx ofprojects but also maintaining andsustaining you know the the health ofthe projects as we go along okay goingto flip to cloud native storage becausethis is the the tag storage talk um sowhy should we talk about this uh themost obvious thing is you know we alwaysthink about cloud native as statelessbut there is no such thing as astateless architecture at the end of theday every application is needing tostore data somewhere you know and andour uh thesis is that we want to be ableto use all of the advantages ofKubernetes and the cloud nativeenvironments to automate and havefailovers and scale etc with for all ofour cloudnative workloads too because wewant them to be declarative we want theautomation we want the performance wewant the failover capability and notonly do we have um broad ecosystem andCSI support and also cozy support now umfor integrations with you know block andobject store but we also have a reallyrich um and mature environment foroperators that automate complex databasetopologies with automated replicationand failover automating message cuedelivery and and many morethings um there are a a number of verylarge graduated CNCF storage projects umincluding Rook which uh automates uh SEenvironments for both block and objectstorage and file systems vitas which isa scaleout MySQL um database operatorharbor being a uh container repo SCD ofcourse you're all familiar with withyour Kubernetes clusters TIKv which is avery um scalable uh uh key value storeand CubFS which just graduated a coupleof weeks months ago um and CubaSprovides a uh a really great um sharedfile system which is especially usefulfor machine learning workloads and AInowadays and of course we have uh anumber of incubating projects and Iwould also encourage you to have a lookat the uh at the links below to to seethe the the huge plethora of of uh otherincubating projects and sandbox projectswhich form part of theCNCF um when we talk about the projectsand we sort of mentioned slightly therethe graduated and incubated projects umI just want to very briefly touch onwhat each of those stages refers to so alot of projects join the CNCF forsandbox projects and those sandboxprojectsuh are there to help um build thecommunity build their IP build some ofthe governance etc and and of courseit's also there to help experimentincubation projects have uh a very highbar in terms of the due diligence andincubate and in order to get to anincubation um you have to go through notjust a due diligence procedure but wealso need to speak with adopters andmake sure that this uh that the projectis being used live in production so umbeing used successfully in production isis is sort of one of the one of thebiggest um uh uh I guess grades that youget with incubation and then furtherbeyond that graduation means we haveenhancements around uh security andsecurity audits for example which theCCF pays for and we also get um uh wealso focus on governance so for examplemaking sure that we have uh contributorsand maintainers from multipleorganizations to to ensure the ongoinghealth of theproject um if you're interested and ifyou're a project that's looking to applysome of the the domain technical reviewsor the DTRs provide you know a number ofthe questions that we going through whenwe look through the the the duediligence and kind of forms a templateas to as to what the projectsuh as what the projects go to whenmovinglevels um I'm going to speak verybriefly about the CNCF storage whitepaper now this is one of the um projectsor initiatives that the tag has workedon for a number of years uh in the whitepaper we cover um the attributes of thestorage system and some of the layers ofthe storage system and and how youaccess different areas of the storagesystem and bear in mind that storage inthis case is not just you know block orfile system but it's also object storageand key value stores and databasesum and we are also just about to releaseum a version three of the white paperwhich has somte updates and also includesother areas like um messaging andstreaming i'm going to I'm going to justpop this up on the slide but one of thethings that we talk about is the storageattributes and the various areas thatyou want to look at the main thing I'mgoing to emphasize here is that this isa way of trying to figure out what yourapplication needs and therefore how youmap that to the different type ofstorage solutions which are availablebecause different applications havedifferent requirements and one storagesolution doesn't work for all um andsimilarly it's also really good whenyou're looking at cloud native storagesystems to understand the various layersthat form the storage system becausethere are various layers ofvirtualization um and you often have forexample say a file system that might bebuilt on object storage and thereforeshares some of those attributes with theobject storage and and there are manyexamples of different storage systemsbased on multiplelayers and with that I'll hand over toJing who's going to talk about a bit thedata workloads in KubernetesthanksAlex according to the survey by data onKubernetes community more and more sterworkloads are moving to Kubernetes thisworkloads moving to Kubernetes to takeadvantage of Kubernetes self-healingabilities uh agile deploymentscalability and portabilitythe DK report shows the evolution ofdata workload types uh over the yearsbetween 2021 and2024databases is uh are maintaining thenumber one position as the most commonlydeployed workload across all three yearsit shows Kubernetes as a reliableplatform for running uh critical dataworkloads analytics moves up to thenumber two position in2024 streaming messaging is now atnumber three positionthe report also shows many organizationssee Kubernetes as a uh foundation fortheir AI machine learninginfrastructure we collaborated with thedata on Kubernetes community on a whitepaper to describe the patterns ofrunning data on Kubernetesthe paper was complete and published wedescribed the attributes of a screensystem and how they affect running dataonKubernetes we compared the running datainside versus outside of Kubernetes whatare the common patterns and featuresbeingused the paper is focusing on databasesbut many of the things we discuss in thepaper apply to other type of workloadsas wellstorage system has attributes asdescribed in the storage landscape bypaper for cloud native database the typeof a backing store used the number ofreplicas and so on all have an impact onthe attributes such as availability anddurability we added two new attributeshere observability and elasticityin a cloud native environment you havemicroservices running in distributedfashion when a failure occurs it is hardto detect and figure out which componentis causing the problem so it is evenmore important to have observabilitysystem so that you can detect theproblem early and uh uh stop the failurefrom happeningelasticity means that you can scale upand down quickly it is a ondemandinfrastructure so that you can releaseresources that you no longer need italso refers to storage tiering where youcan move your data across differentstorage tiers depending on how often youneed to access the dataregarding disaster recovery Rafael willcover thatlater we have options to run data insideversus outside ofKubernetes deploying and operatingdatabases manually without properautomation is not recommended so thatmeans we have mainly two options you canuse managed data database services uhthat are provided by most cloudproviders you can also run data insideKubernetes running data insideKubernetes is usually facilitated by anoperator the operator uses Kubernetesself declarative APIs uh it reconcilesthe actual state against the desiredstate with the help of an operator youcan automate day2 operations such asbackup and restore migration upgrade andsoon with operator you can also leveragethirdparty tools such as uh Prometheusand Graphfana for monitoring searchmanager for certificate management andsoonhere's example of a Kubernetes operatorthat deploys and manages a databasuecluster the database cluster is usuallydefined by a customer resource as shownhere uh so in the CR the user describewhat type of a cluster he or she wantsthe operator reconciles the actual stateof the cluster against the desired statedefined in the CRSspec this uh database cluster is using aworkload API staple set it has threereplicas each replica has a pod thatuses a persistent volume the persistentvolume is provisioned by the CSI drivercsi defines a set of common interfacesso that a story vendor can read thedriver and have the storage beingconsumed by containers running inKubernetes and other containerorchestrationsystems according to the DKsurvey an organization typically usesmore than oneoperator in the operator hub there aremore than 300operators there are more than 50database operators including etc andvitas those are the two graduated CNCFprojects there are also 10 postquar SQLoperators including cloud native pichithat is a sandbox CNCFproject although operators are widelyused while running data on kubernetesthere are a lot of challenges includinglack of standardso the do community is working on aoperator feature matrix trying to comeup with standardized and vendor neutralfeature matrix so that it will be easierfor a user to choose anoperator when running data on Kubernetesthere are many common patterns andfeatures beingused we already talked about theoperator and CSI we also mentioned theworkload API staple set topology awarescheduling is another commonly usedfeature you can apply node labels wherethe key is the topology key thescheduleuler can use that informationuh to spread the parts across differentdata uh different failure domains sothat with the topology aware dynamicprovisioning your persistent volumes canalso be scheduled to the failure domainwhere your pods are scheduledto the pod disruption budget is anotherfeature that is commonly used it's a wayfor the workload to tell the controlplane how how many number of instancesare needed for the workload to workproperly when they're doing a plannedmaintenancein Kubernetes uh there is a defaultsecure networking model by default thepods are not uh accessible externallyyou will have to explicitly enableexternalaccess so that makes sure yourapplications are protected againstunauthorizedaccess observability is very importantuh in when running data in Kubernetes asmentioned earlier there are tools forlogging metrics and tracing that canhelp developers monitor and uhtroubleshootproblems there are also many tools tohelp you run uh data in Kubernetes in asecureway after publishing the DK paper ondatabase patterns we started to work onanother paper that is focusing onrunning data analytics and AI machineworkloads inKubernetes this paper is still work inprogress now let me hand it over to Alexwho will talk about the performance bypaperokay I'm going to touch on this um veryvery briefly because performance is thesort of thing we could talk for hours onum we've put a performance benchmarkingwhite paper here to explain um some ofthe dos and don'ts and how to do applesfor apples comparisons between differentuh cloud native storage systems and wewere looking specifically at databasesand at volumes um as we were goingthrough this the the the one constantwas that um there are definitely morepitfalls and more ways of doing thisgetting this wrong umand what we what we you know what wewhat you need to focus on is um whetheryou're trying you know understandingwhich area of the storage system and thespecific layer in the storage systemyou're you're actually testing so forexample you know the the the clientrequirements or the performancerequirements for to handle a largenumber of operations might look verydifferent from doing a throughputbasedbenchmark um and of course all thedifferent layers and the topology andthe services that you're using like youknow data protection and reduction andencryption um affect the overallperformance um but often latency is isis going to be the killer that that thataffects every layer because that kind ofgrows with every layer in the stack umand onev thing I will uh emphasize morethan most is uh caching because in cloudnative storage systems caching oftenhappens at multiple layers in the stackum and I've lost track of the number oftimes when you know you see publishedbenchmarks that kind of um highlightperformance that is way beyond what thestorage can even theoretically provideand and and that's because you knowyou're you're you're effectivelybenchmarking the cache as opposed to thestorage system so always be criticalalways you know never believe all of theresults and and really understand it umreally understand the the the basis ofthe benchmark and of course the mostimportant takeaway is um vendor provideduh results are often not useful formaking comparisons it's really hard tocompare those published results uh toyour system and so it's always importantto run your own tests on your ownenvironments with your own applicationsbecause ultimately that's the only thingthat matters um have a look at the havea look at the link there's a lot ofuseful information there and a lot of umuh dos and don'ts which which willhopefully help with the with theenvironment and with that I'll pass onto Raphaelokay so disaster recovery one of myfavorite topics uh we just uh recentlypublished version two of the white paperon disaster recovery the main change inthis version is that we um identifiedthe main archetypes for doing disasterrecovery which are those displayed herein the slide and we gave equal andbalanced space in the document to all ofthese in the previous version we wereoverweighting the the one on the rightwhich is still our favorite everythingelse being equal you should do that oneum because it's more cloud native sojust going over this um and this wastimely becauseum with new workloads types coming on onKubernetes like virtual machines we arehaving now a lot of disaster recoveryconversation and these workloads reallycan only use the chew on the left sowith virtual machine if you want to do adisaster recovery strategy you can dobackup and restores or volumereplications you cannot really use theotherones so going from left to right in thein the white paper we discussed backuprestore there's no need to uh to explainit we we discussed various approaches tovolume replication notice that these twouh types rely on storageuh capabilities right and then as youmove to the right uh you havetransaction replication so this is theworkload doing the doing the the work ofreplicating this the data right uh it'sstill an active passive approach andthen you finally we get to active activewith again the workload doing uh thework of uh replicating the data thestorage doesn't have to do anything inthis case and with active active you canum essentially distribute your workloadacross multiple data centers multiplefailure domain that can begeographically distributed and if youlose one of the failure domains so ifyou lo lose one of the data center theworkload still keeps working you don'thave to do anything in terms of disasterrecovery process it's the event is moresimilar to an active um NHA uhsituation umoops next slidealex what doyou Okay um in in the white paper we goa little bit deeper in the comparisonyou can see here that we talk about RTORPO who owns the um who owns the theprocess um it's it's worth um againhighlighting that the first two arebased on storage capabilities and thedisaster recovery ends up being owned byuh process the disaster recovery processends up being owned by theinfrastructure team while the the two onthe right are are more reliant onnetworking capabilities okay and uhbecause you need this east westconnectivity to make sure that yourworkloads can talk to other instancesand uh because they rely on theworkloads on the middleware you needit's it's more they're typically ownedmore often by the developer team okay sothe disaster recovery becomes uh it'sit's owned by the developer team so ifyou're building a platform where youwant to make one or or more of theseapproaches available to your developersfor disaster recovery then the questionto ask that you should ask yourself iswhich capabilities do I need to add tomy platform to enable this theseapproaches right and u becausekubernetes or in general uh computeplatforms don't come with all of thecapabilities that we need so Um thereare um I I I collected here thecapabilities that that we need so we wehave uh backup and restore volumereplication global load balancing isalways needed because you need to switchthe traffic right you need is uhcommunication and then you need you needmiddleware that can do primary secondaryor that can do fully distributed activeactive middleware um there are I havehere examples uh of uh CNCF and non CNCFprojects that can enable thosecapabilities it's this slide isincomplete uh there are always newoptions but the point is you can you canupgrade um your Kubernetes cluster toprovide thesecapabilities and then in the white paperwe go a little bit in depth on theuh the fourth one the white the one thatwas on on the right the active activeum to analyze how it is possible rightbecause it's a little bit new uh ornewer andum we we look at the concept of replicaand partitions and the cap theorem whichare it is at at the foundation of whyit's possible to uh to have activeactive um strictly consistent work uhworkloads we look at some of thesoftware that is out there that can dothis kind of that you can use for thiskind of deployments there is much morenow uh we we should probably update thewhite paper with with a new analysis butyou see that each u software eachproduct has an approach for the replicauh synchronization or consensus protocoland for the sharding consensus protocolso when you have both of them you canactually distribute and scale uh yourworkloads and then we have a a simplereference architecture on how you woulddeploy these kind of workloads in uh ifyou have a Kubernetes uh computeplatform so you need in in this case forexample you need a way to synchronizethe workload from with an east west ucommunication path between the clustersand you know that kubernetes clustershave an internal SDN right softwaredefined network so they they're notnecessarily routable we need acapability there to to make that workand then you can have a global loadbalancer in front of theclustersum well and um you can find more in thewhite paper so feel free to download ituh we just we just released versionnumber two back to Alex great um so Iwill I will just end this with um acommunity call out uh we would reallylove to have more of you uh join uh andconsider our role in one of the tags orto work on some of the initiatives thateither help the CNCF projects or withthe CNCF projects um and we're lookingfor more contributors to continue andhelp build out our community so look outfor announcements on the TOC channel inthe CNC of Slack or the TOC mailing listum and of course myself and the othermembers of the TOC are available to uhto have a chat and answer any questionsyou have um and with that I will openthe floor to questions or are we tightontime we have a couple of minutes of timeso any questionsdon't be shythank you i I ask one simple basicquestion you started off and because wehad a lot of conversation in my projectyou said that there is nothing called astateless application mhm is that isthat a matter of figure of fact or justa how you see it no so solook every every application in somesense is going to store state somewherewhether it's you know a database or evenjust a simple file um and I thinkwhatever the situation is you can chooseto have your um storage systems beexternal to your compute environment andto your Kubernetes cluster or you canchoose to have those internal to yourKubernetes environment and yourKubernetes structure right so whetheryou're storing a simple file on a filesystem um you can have that file systemwithin your cluster or external samegoes for databases object stores andeverything else and there are pros andcons for that and obviously we're heretalking about the cloud native aspectsofstorage all right I think we're donethanks everyone for coming and lookforward to seeing you soon[Applause]2025-04-15 22:00:07.095493 ��E�I#��AA0GNjonLfCQAhello everyone and welcome to thisafternoon's session on cloud nativestorage and the storage tag so this is amaintainer session um my name is AlexKirkoff i'm a chief architect at Akamiand newly elected to the CNCF TOC whichis the technical oversight committeehello my name is Shinyang i work forVMware by Borang i'm one of the co-chairof tech storageand Rafael Pasoli I work for Redat as aconsultant so lots of customers integinteraction also co-chairfantasticand we wanted to touh use this session to talk a little bitabout the tags but more importantly howthe tags are being restructured becausewe're um we're looking at at uh a rebootof the tags and the CNCF projects andinitiativesum as as part of this CubeCon and andwe'll have newuh a new structure coming out soon we'realso going to talk a little bit aboutwhy cloud native storage is importantbut then we're going to focus on um someof the uh data and Kubernetes whitepaper and cloud native disasterrecovery um so what is tag anyway so thetags are technical advisory groups theywork with the TOC the technicaloversight committee to help with umproject evaluations and to um uh provideguidance to to the TOC on the subjectmatter expertise as well as work onvarious initiatives uh like some of thewhite papers which we're going to talkabout now I mentioned we're about to gothrough a tag reboot um so we'reconsolidating the eight tags to five umand we're going to also be uh puttingtogether a new structure around uhinitiatives and sub projects um to tohelp uh to help reinvigorate uh theprocess and also set us up for um thescaling that we need to do for the next10 years um at a very high level we havethe TOC as as a top as a top level bodyuh the the the tag groups are um aregoing to be organized uh around thesefive new areas that will focus on youknow a specific domain um we're going tohave sub projects which might be whichwhich are going to be targeted at umsome longrunning things like for examplesecurity assessments but also projectreviewsum and initiatives which will be eitheraligned to a tag or the TOC that will bespecific initiatives to either helpprojects or help the way the CNCFoperates and of course we can't do anyof this without the community that goesaround it so we'll be also defining um anumber of uh community groups that willhelp you know come up with ideas andinnovation that that that will you knowhopefully spurn on a number ofinitiatives and projects um there's alot more detail um and and please lookat the PDF on the shed with for thelinks um and feel free to contribute andadd any feedback to the to the GitHubissue that's ongoing um so why all ofthis scaling requirement the reason isthat the number of projects hasdramatically grown since the tags werefirst conceived of in in um Barcelona in2019 so you know we've we've kind ofgone from about 40 projects to 215projects in that time frame and now wewe're continuing to see huge amounts ofgrowth so just in the last year another33 projects have been added to theCNCF and wery been complaining about that foryears uh which drove me about a coupleof years ago to raise the pain of umknow not just importance of CI/CDobservability but also to standardize onthis and uh to be honest no better placethan open telemetryopentelemetry that's that observabilityframework thing right yeah exactlythat's that's today the uh essentiallythe de facto standard for uh forinstrumenting for emitting forcollecting telemetry data from yourapplications everyone uses that today inprodu monitoring production environmentsso why not do exactly that also for CCDpipelines so that led to this uh uh OTPopen telemetry enhancement proposal thatuh came up about two years ago to doexactly thatwell I'm looking at this quite handyscreenshot you have here but I'm seeinguh that it was closed not merged so whathappened to thatokay you got me there but uh good end tothe story this uh OTP this opentelemetry enhancement proposal ended upwith the another proposal to form aspecial interest group a SIG dedicatedto exactly that to standardizing onCI/CD observability that actually youand I started togetherso so that happened about uh over a yearago now very cool and recently uh weactually wrote a blog post on the CNCFfoundation you can check it out therethrough the QR code um but it's centeredaround our standardization efforts uhspecifically around some of the commonsets of attributes that been we've beenworking to deliver in the industryso uh essentially we have the SIG that'snice but what exactly are we trying tostandardize on that's SIMCOM for shortor semantic inventions semantic whatso the formal definition of semanticinventions is a common set of semanticattributes which provide meaning to datawhen collecting producing and consumingit okay so put in plain words plainEnglish that's the lingua frana or theuh common language for describing thetelemetry that your system emits umessentially um how uh your telemetry isrepresented you see here in thescreenshot the attribute it defines thename the type and so on how yourtelemetry is represented in thedifferent signals logs metrics tracesspan data uh and what's the relationshipalso between the signals right yeahexactly so you think about like a commonmicroser architecture probablycommunicating over the network semanticconventions is going to define some ofthe fields that you may attach to thosecommunication so you can observe them ina consistent way across the boardanother example might be tracing withinGitHub or git lab pipelines where youcould analyze them based off of a set ofattributes consistently no matter whichvendor they came from wait you you saidjust a traces from GitLab and GitHubpipelines on these like GitHub GitLabpipelines and GitHub actions orworkflows or something like that yeahthat's what the vendors call them butthey're basically the same fundamentalconcept right so when we standardize onthose concepts we're actually able tobenefit of the from the consistency inhow we observe them no matter whatvendor they come from um now there are acouple minute differences right betweenthe two maybe some vendor specificthings and that's what vendor extensionsare for but fundamentally we candescribe them in the same way as a CI/CDpipeline or a CI/CD pipeline task supercool okay so I understand it definitelymakes sense to have a common language touh describe these things but is thatjust theoretical or do we actually havetraces coming out of these systems weactually have traces so the screenshotyou're seeing here is from a real traceuh ironically we're using open telemetryto observe open telemetry workflowsthat's cool so uh there was that casethat we talked about before right ishowed the Slack channel chatter withthe auto maintainers what ended up withthat one yeah so we actually were ableto use the observability data that wehad um to be able to figure out a littlebit more what's going on and whether ornot the behavior they were observing wasthe case now normally what would youhave to do you'd have to go talk to allthe engineers or go to every singlerepository and then go to all thedifferent zruns and then look at runsfrom weeks ago right you take quite abit of time to go figure out ifsomething like that was actuallyoccurring uh but because we're observingthem you know in a matter of seconds wecan figure out that yeah it's actuallywhat'soccurring uh for here is a actuallyscreenshot from a sign dashboard thatshowed a big uptick in the P50 durationof the pipelines that were running inthe project during that period of timeyeah that that actually very clear tosee visually the uh the anomaly thereand something messed up but by the wayyou said you're working with signals idon't work with signals do I have to useyour back analytics tool to uh to dothat no that's the beauty of opentelemetry it's vendor agnostic right andyou have the same fundamental standardsno matter where you go so you justchoose your favorite OTLP compliant backend in fact this screenshot is fromHoneycomb same set of data you can seethe same uh spike in durations duringthe same period of time super cool idon't need to stick to one i havestandard any back end supports thatimplement so uh what's next do you wantto see it in action yeah finally pleasedo all rightdemo time give him some applause that'sgoing to bechallenging let's sacrifice somethingfor the demo gods to make it happenhopefully everyone can see that can yousee thatallgood i'm going to refresh it just tomake sure I'm connected to the internetthere we go all right so here we havethe uh if we we look at the last 14 daysthis is a honeycomb dashboard uh we haveyou know the P50 duration for minutesfor the pipelines that have been runningover the period of time we've also gotthe P75 and now I've only beencollecting data for you know two weeksso that's why we went to the 14-day markand if I recall correctly that scenariothat that incident happened uh somewherebetween the 21st and the 23rd um so wecan actually see uh there are someoutliers here right there's somenormally long running jobs but duringthis period of time we can see a spikein the duration so let's just go aheadand zoom in and take a look at what weget uh first off uh if we notice here wehave these things grouped by therepository name uh so that's a semanticattribute that we're able to to use hereand we've got five differentrepositories that we're collecting tracetelemetry from um if we look at thedifferent uh line graphs here we can seethat the Weaver one wasn't reallyeffective and it was probably becausethere were no pipelines running at thattime so we can just ignore that piece ofdata but the contrib repository is theone that you know definitely experiencedsome latency so let's just filter tothatgroup and let's look at some of thetraces during that period oftime now the open telemetry collectorcontribs that run every single time youmake a commit or open a pull requestit's quite a lot and they're actuallypretty performant um usually they getdone if I recall correctly just fromhaving contributed they're usually donein like 8 minutes um but here duringthis period of time we see a build andtest pipeline to be pretty long so let'slook at thattrace oh an hour and 12 minutes that's along time for the build and testworkflows for sure just you know if Iwas an engineer in that team I' I'd youknow be concerned that's a long timethere's got to be something up Well uhif we look at the go vulnerability checkjob within there uh this one is usuallya um matrix build and you can det uh seethis by just by the fact that you've gotgo vulnerability check and then like aparenthesis with the name and you've gotmultiple multiples of these um so theseare parali par parallelized jobs orconcurrent jobs that uh are running inthis repository and if we look at themwe can see as we scroll down that thisgap here starts to increase overtime gets larger and larger and largeruh now there is an improvement to bemade here for sure right we couldactually just say that's the Q time andit could show a little Q span but rightnow it's just you know white empty spacethat uh is showing that there's a alatency a increase in duration of thesepipelines that are not being ca{pturedduring the actual run of the pipeline sojust by looking at this we can you knowobviously see that there certainly wasuh that behavior that people felt theywere observing was indeed actuallyoccurring during that period of time andbecause we have some of these semanticattributes we can hop right into uh thespecific repository copy out the linkand go directly to the task run of thejob just because it's you know semanticattribute we have have available nowthat's just one set of data and we canwe see this in honeycomb but we can alsosee this in uh Sigma so the same spikeas we showedearlier is right here in the last 6weeks right um you can zoom in a littlebit and we can see right here duringthis period of time huge spike induration now would you like to seeanother set of telemetry we have yeahyou know we as maintainers we want to gowhat the heck goes on in ourrepositories give us some more all rightso this is a set of metrics called theVCS which stands for version controlsystem metrics uh who feels likesometimes it takes a long time to get achange into open telemetryno one come on come on whoever uh putsomething issue PR and got a bit to waita bit to get that approved show handscome on it takes It takes a little timeright but how long does it actually takewell uh again we can filter down into arepository we can look at the contribrepository and we can see that some theaverage change time to approval is about10 weeks now that of course has a lot ofoutliers in it right uh but the averagetime to merge is only 6 days once youget that approval so they they actuallydo pretty pretty good um and again thoseare averages so you know there arecertainly outliers that are capturedthere uh but what about semanticinventions semantic adventures isanother repository we can lookat um I think Martin Thuaits likecommented or made a statement in arecent talk that it's like if you get 30engineers together talking about how toname something it's going to take awhile uh right and we can see it doestake a bit of time there's lots ofdialogue and so forth but these would bethe repository metrics uh for thesemantic inventions and they have quitea few open changes but they're doing asthey're going as fast as they can and uhit's it's nice to be able to have thesethings because once we can visualizethem we can try to improve them if ifthe need fitscocon let's go back to the uh slides anduh run through some more souh as you saw the with in the demo wecan use standard semantic conventionsand when it's once it's emitted you canchoose the tool the back end of choiceto run through the things and now uhlet's talkabout why is that can you move the focushere so we're covering what what exactlyare we covering today with the CI/CDsemantic conventions well we've got theCICD pipeline attributes that was one ofthe first things we did uh so that's howyou kind of like denote or describe yourCI/CD pipelinesso uh we have the uh CI/CD pipelineswhat's next we've also got deploymentattributes uh sorry to whoever useddeployment.environment name i know webroke that one but the other ones uhwere certainly useful and and helpfuland this is very good for likecalculating Dora metrics right what is adeployment how do you describe adeployment and then how can youcalculate that from there so uhdeployment attributes are another thingweadded uh we've already talked about theVCS attributes we added that um andwe've also added some test attributes bythe way this is all experimental rightwe've got to increase the stability soum but the test attributes talk aboutlike test suites cases and soforth and then we've also got how youdescribe some metrics so we've got theVCS metrics some CI/CD metrics and howyou're using those attributes there uhbut my favorite thing that I think we'veadded for the semantic inventions is theartifact attributes specifically theattestations in those attributes becausethat very closely aligns with the thingsthat are going on in the supply chainlevels for uh software artifacts or thesalsa suite so it's kind of like abridge between security andobservability ye|ah for sure but it isimportant to say we've talked about alot about SAM conv but the CI/CD SIG isnot just about SAM con uh one of thebiggest thing that we've been working onis specification for environment uhvariable context propagation propagatingcontext and baggage over environmentvariables now we all know passingcontext over the network which weusually do for microservices over thenetwork HTTP gRPC and there arewelldefined specs for these however whatdo you do in case you have processesthat can't communicate over the networkright you have a process that spawnsanother subprocess or what not thingsthat happen very frequently in CI/CDpipelines so this is definitely was achallenge that we've been uh challengedwith for many years the the the we foundissues open as far back as 5 years agoback in 2020 uh and after a couple ofattempts uh at writing an OTIP an opentelemetry enhancement proposal uh wefinally got it approved late last year2024 so first of all way to go on theOTIP joining and the big news for all ofyou that we have now the first iterationof the specification open for review soif you're into it we're looking forwardwe're keen to hearing from you about thethis first pass and how we can make itbetter so that's on that yeah and thisis going to be huge for CI/CD pipelinesright that native instrumentation isgoing to be made possible i think aboutcases like Open Tofu and Terraform um infact this is a screenshot from tracesthat are coming from the Open Tofucontroller runners uh passing contextbetween environment variables i was a ademo that one of my colleagues did atDevOps days in Montreal last year veryearly prototype but you know Terrantalready now supports traces um they'reone of the precursors to thespecification and so I suspect now thatwe add this specification in place moreand more vendors are going to come toplay and we're going to have a lot ofgood support give them the big news comeon also uh I know you said it was openfor review but big news that act thatspec got merged yesterdayafternoon we should have done the mergeon stage but we're not as adventurous asthat soyeah so that's the merge from yesterdayso go ahead and try it out it's nowavailable for everyone and uh yeahthere's so much more that the SIG is upto so uh we work on different uh signalstelemetry signals metrics traces eventsessentially uh logs uh also you heardabout the GitHub receiver we also have aGitLab receiver at work and looking intoother uh receivers and uh and uh for forother platforms like Argo CD workflowsand Jenkins we have lots of discussionsgoing on regarding receivers andprototypes and reference implementationsfor the for the specification we're alsolooking into and expanding the semanticconventions to cover more domains forexample the uh software outage incidentsum what else more types ofinstrumentation for example we talkedabout the environment variable contextpropagation so there's work on SDKreference implementation for that uhspec for that uh we are looking uh atgetting the long-term observability ofthe hotel project across the rest of therepositories you saw some of the hotelobservability we want more of that andthe goal is actually not to stop justwith hotel the purpose is to have thatas a reference implementation for otherprojects to be able to adopt it so ifyou are maintainers of other projectscontributors or something like that docheck it out a good referenceimplementation all of that with the endgoal of enabling observability acrosstheentire software development life cycleso with that do join us uh you have theQR code here and the link for the blogpost that Adriel mentioned before thatcovers everything from the charter ofthe SIG to what we've we've been doingso far to what's the road map and whatwe want to work on next so do check itout and also how to get in touch withwith the project some of the h linksreferences are also here like the Slackchannel we have a very active Slackchannel the uh uh hotel CI/CD on theCNCF Slack we have weekly SIG calls onon Thursday that are open for everyoneto attend do join us uh we have theGitHub repo of course so do scan the QRcode and do get in touch with us we'dlove to have more folks involved and uha word about us beyond being the SIGleads i'm Dutton Horvitz i'm a seniordeveloper advocate at the uh AWSopen-source team i'm also the chiefevangelist for the open search projectunder the Linux Foundation again anobservability open source project so ifyou have any questions about open searchalso happy to answer after the after thetalk h I'm also a CNCF ambassador and Ialso run the open observability talkspodcast that's me and you can find meeverywhere as you can see at Horovitz uhAdriel I'm Adriel Perkins principalengineer at Leatria we're consultingcompany out in the United States um andopen telemetry CI/CD SIG alongside youyeah and last but not least you can findus today actually in an hour and a halfin400 hours uh at the open telemetryobservatory we'll have office hoursbetween 2 and 2:30 uh we'll be therewe'll be happy to uh show you more ofthe demos we'll happy to hear yourfeedback your comments and how we cancollaborate together so if we don't havetime now for for more Q&A and chatsyou'll find us at the Otel Observatoryit's in the S400 booth in the expo hallin the show solution showcase find usthere all the other projects as wellhave time slots so check out OpenTelemetry Observatory we're Adriel andDotan thank you very much forlistening uh we have time for Q&A so ifanyone wants happy to pass the mic andyou can ask us question also herei'll give you a sticker of opentelemetry if you ask a good questionhi uh thank you for your talk um thequestion is where does the traces comefrom is it the native support fromGitHub and GitLab or do you write somecustom code to collect it yeah that's agreat question thank you so there's aopen telemetry component in thecollector contrib repository that takesthe events that come out of the GitHubweb hooks and the same thing works forfor GitLab and it will convert thosethings into traces by taking you knowthe start time the end time and creatingdeterministicids just adding to that so opentelemetry has a component calledreceiver so you ingest it in thecollector and then it's being done ourultimate goal by the way and that's whywe we are calling for all themaintainers here we would like to have anative one so that the my dream is forGitHub to emit natively the telemetry inthis specification or GitLab or Argoworkflows or whatnot that's the end goalbut for now what's in our capacity sincewe're not maintainers of this project iswe take the telemetry we adapt it to thedata model of the semantic conventionsand then we take it down the down fromthat point downstream it's already uhstandardized hi my name is Johannes I'mfrom DJ first great talk greatinitiatives that you are driving myquestion is now what's happening withthe CD events are they now obsolete orare they still relevant in the marketout therethey're definitely not obsolete and uhwhat we've seen obviously the thesemantic ventions are expanding i thinkdeployments for example are veryrelevant for CD so I'm not sure exactlywhich part of the of CD you're referringto we have already constructs there andwe're expanding adriel do you want toadd to that yeah the the originalproposal for the first set of attributesheavily was influenced by CD events wehave some folks that we were talked toum names are escaping me right now butwe would love to continue likecollaboration and figuring out wherethings fit inaccordingly and we'd love to have DinoTrace folks involved in this because weknow that we bring a lot of experiencethere was actually a demo on the talk inthe last CubeCon CubeCon North Americawith Andy from Dino Trace together withwhat's the name of the other company uhI forgot that with a Fitbit that showeda demo of doing GitLab monitoring with aFitbit so I forgot the names in the nameof this sorry we're blacked out becausewe're on stage it always happens but anyother questions come on guys on thisside of the hall not to neglect thatsideanyone calling once calling twice thankyou so much for joining us2025-04-15 22:00:07.966252 �� L#��wACydz0hadVuQall right welcome everybody uh I'mJustin Capos i'm a professor at NYU i'malso one of the creators of IntoTo andhi everyone my name is Alan i'm asoftware engineer at a small company andI'm a maintainer of Into so we're goingto be talking to you a bit about Inttoand trying to give you a little bit ofbackground and tell you a bit about itin case you don't know what it is um sofirst of all uh we need to talk a littlebit about software supply chain andsoftware supply chain��G�K#��EAlEXm6k2wpG4welcome to this uh open feature sessionUh this is a maintenance track So we'regoing to talk about all the latest newsaround open feature Uh present myself SoI'm Thomas I'm head of engineering in acompany called Lonqua and I'm also partof the technical committee of openfeature and I'm here with Alexandra andLucas Hello everyone Welcome to oursession also from my end I'm Alex andI'm working at open feature Oh notworking at open feature I'm working atDino Trace and uh I am a softwareengineer and analyYeah I'm I'm Lucas I'm uh working atCodecentric I'm working as a ITconsultant and software engineer and I'mpart of the TC of open featureSo today we're going to talk about umfirst a bit introduction of what openfeature is for those who don't know uhand we're going to deep dive in some ofthe latest things that we've built foropen feature So we're going to startwith the code generation with the openfeatures CLI the tracking API that allowto track uh usage of your flags and dataaround it and the open telemetry uhsemanticconvention and we going to also give yousome other updates about the latest newson what happened in open future and wewe will give you space to ask forquestion in the end So if you have anyplease prepare them now so we can we candiscuss about it So first uh I'm curiouslike who has already heard about openfeature in theroom Some people that's cool who's usingit less people as expected And who iscontributing Same Yeah good That's goodto see that we have some~��A�J#��9AIvIgsHS5MDkyo Adriel what's up man why do you looksodepressed well I got another uh Slackmessage apparently there are some thingsgoing on with our pipelines and thensomeone asked me to figure out what'sgoing on uh I just wish there was areally a better way to figure this stuffout you know like apply basic SRfundamentals but to our pipelinesdon't worry about that it's It's notjust you if it's any encouragement youknow uh I've been suffering for thatfrom years you know CI/CD observabilityuh actually curious who here hassuffered from uh lack of observabilityand CI/CDpipelines okay so as you can see you'renot the only one we're in the same boattogether and and actually not just inthe workplace you know even us asmaintainers suffer that day in and dayout and uh this is actually an examplefrom just over a week ago we were uh atthe hotel maintainers Slack channel andyou know uh complaining about uh whatgoes on here the the CI on this PR istaking way too long to find the runnersfor the running the tests soundsfamiliar sounds familiar so that's alsoin open source suffering the same thingand I'vex contributorshere Um so what is open feature So firstof all is an open specification thatprovides a vendor agnosticcommunitydriven API for feature flaggingthat works with any management tool orin-house solution Uh that's a productthat is part of the CNCF Uh and we areright now in a incubating stage of atCNCF But first we may ask what arefeature flags So feeder flags is a wayfor for changing uh behavior of yourapplication at runtime without deployingany code and being able to have likemultiple uh version of the same featureSo you can uh switch users from one tothe other if you want to And as as Isaid it's like based on user context andnot like application world applicationcontext So most of the time how it worksit's a big if for for the easiest partand uh you ask a flag management systemthat is your vendor your in our solutionthat provide the configuration of yourflag what is the variant you want to usefor this specific user and you can umdefine where what what you provide as anexperience for thisuser And to solve that and to workaround that like open feature has builtum as we said a vendor agnostic approachto it and this is how you can work withuh with open feature is that we providesome SDKs So SDKs are always the samefor every vendors and you um you caninteract with your feature flag the sameway without knowing which vendor you areusing for for each vendor we can we cancreate some kind of middleware that wecall a provider that is the thetranslation layer between what is thethe open feature SDK doing and your flagmanagement system So you operate thesame way every time but you have astandard way to operate uh in yourapplication This is the landscape of uhlanguages and technologies we aresupporting right now As you can see wehave plenty of them We we continuegrowing this uh I think the latest oneis the versal flags uh SDK that is outlast week something like that Um whichis super nice because as you can see weare not like working only on uhsupporting languages but we want tonarrow down some frameworks and beingsure that we give the best experiencepossible for every languages and everyevery frameworks you can use we areexpanding So if you want if you have atechnology that you have in mind and youwant to support it feel free to to toreach out and we can probably dothat And as you can see also we see thatuh the process percentage of growth forfor downloads are are expanding So wecan see that the adoption is like waymore um advanced than it was in the pastand we keep having more and moresolution compatible and more and morepeople using open feature[Music]So now we're going to deep dive in someof the topics that I've mentioned beforeto understand uh what's new in openfeature and we're going to start withthe open feature CLI which is pretty newuh and around the code generation So whyare we doing this um uh open feature CLIand code generation is because we wantto change the way people are interactingwith feature flags because uh we want tosolve some of the painful part of usingfeature flags that are that are this oneSo when you use um a feature flag in aclassic way So this is the open featureum definition of using a feature flagYou have to know a few things The firstone you have to know the type of yourflag because you want to use thefunction that is uh linked to your flagtype So here it's a boolean value Youhave to know the name of the flag So youhave to type it as a string You have toknow a default value and you set theevaluation context that is specific toeach user um that you evaluate againstuh the flag This is super great It worksbut it cannot be errorprone like if youuse a wrong function If you have aboolean flag and you use a a stringfunction you going to have an error Ifyou mistype the name of the flag you canhave an error too So the goal of the CLIis to simplify a bit that and to provideum a type safe uh interface that you caninteract with and generate the code asyour flag becomes a variable rather thana function you call and you can evaluateagainst uh this variable to to do yourevaluation So the big advantage of doi�ngthat is that you cannot mistype yourflag name anymore because you're you'reusing a variable So probably at compiletime it will not work if you use a wrongname Uh type is already embedded in thevariable So you can use a variabledirectly like that And default value iskind of hidden from you because it worksdirectly behind the wood It will checkthe v the default value when you use aflag So this is a win You can you canenjoy it Uh how it works behind the woodis like we ask uh we need a flagmanifest A flag manifest is somethingthat your FTO flag solution can providethat is a simple JSON file listing allthe flags that are available and withthat uh you like here's an example withone flag I hope you have more than oneflag in your system but still but as youcan see you have the name of the flagthe type the default value and adescription description is optional butit's still good for for your codedocumentation so you can know whatyou're working with and based on that wecan generate the code with the CLI andyou can just import a code in yourcodebase and you have access to yourfeature flag as specific package or orwhatever you want to how you want to useit but you can use your flagdirectly the current support of thisfunctionality for now I think we haveimplemented in go and react but the goalis to expand it so we we are startinglike to work on it it's still aexperimental phase for for this projectbut it's just to showcase how far we cango with using open feature because thiswill work with any vendors It's notsomething linked to a specific vendorsThe current support um two solutionright now two vendor are supporting uhthe flag manifest generation I thinkdeath cycle and go fit flag is the onlyone for now and we have one limitationbut we have planned to work on it islike it doesn't work for now on objectflags but we want to improve it andbeing sure that uh with object flags wecan have a similar experience that youcan use directly in your in your featureflagsolution and that's about it for the CLAnow we're going to move to tracking withAlexander thank you Thomas um and beforediving right into the newest addition toour spec I want start with a question Sowith feature flagging and open featureum we're talking a lot about how tosafely ship our uh features to ourcustomers Um but to take it one stepfurther um I want to ask a question Iwant to I want all us one of I want toask us all the question how do we evenknow we built the right feature in thefirst placeSo imagine um we're all having thisonline shop like this uh example tonershop here We're selling all kinds ofdifferent uh switches and um we want togive uh our customers the possibility tohave free shipping over orders um overuh $50 orders So um right now ourcustomers don't really know that um butum we thought it would be good um to letthem know on the homepage already So weimplemented this shiny new banner umwhich which tells our customers thatthey can have free shipping on ordersover $50 Um so far so good But uh upuntil now um we only guessing that thiswill increase uh or encourage ourcustomers to um buy more of our productsand maybe have even have orders uh over$50 then Um so how do we really know forsure that um this banner will help us uhto increase ourorders and the answer so how do we againhow do we know we built the rightfeature and the answer is as simple asthat with experimentation As scientistsall around the world are doingexperiments to prove their theoriesbased on real data Um we also want to doan experiment to drive our businessdecision bas business business decisionsbased on real data And um the simplestform is basically AB testing Um we wegive some of our users we give them thebanner and some of our users will neversee the banner Um and then in the end wewill compare um with which amounts thecustomers uh checked out their card withAnd then we know if if the banner reallyum is the audio working correctly It's abit weirdOkay Um and then in the end we can we wereally know for sure if our bannerhelped increase uh the amounts that thepeople checked outaccordingly And um e�xperimentation isnot a really is not a new thing Thereare a lot of uh teams all around theworld already implementing or bakingexperimentation into the software intotheir product development life cycles todrive their business decisions based ondata And an example is uh Spotify for isSpotify is an example for that And thisis a slide I stole from Nicholas who isworking at Spotify and is also amaintainer of open feature And umSpotify does a lot of uh umexperimentation and they have like 600million users monthly and from thosethey derive 1.4 trillion data pointsdata points daily And with all of thisamount of data they do thousands ofexperiments yearly to drive theirbusinessdecisions And um so since a lot of teamsand and companies are already doingexperimentation in the realm of featureflagging we knew as open featurestandardizing feature flagging we knewthat we need to um supportexperimentation as well And this is whathappened with our latest addition to thespec which is the new tracking API Andwith this we experiment uh we we supportexperimentation And this closes the gapbetween uh business objectives andfeature flags And from this small codeexample you can see how simple it is totrack an application metric like if theuser clicked checkout as in this exampleSo we have a open feature client and wecall the track method and um then we cangive our tracking event a name and inour case we would also need to know withwhich amount that the the uh peopleclicked checkout and so we add uh wehave the possibility to add this astracking event details We can also addthe currency code and any other kind ofinformation that we want totrack And the tracking API is uh nowsupported by uh is supported by any SDKthat supports version 08 of the spec Andthis is now uh Python Cotlin Swift andRuby And um you technically won't uhneed any vendor to support this So youcan build your experimentation platformall on open source toolsUm but um so yeah so you can usebasically any any observability platformof your choice and and the rest of theexperimentation platform um on opensource tools Yeah and but the but um forexample dev cycle and launch duckly areuh supporting italready and since this is our latestaddition to the spec um the currentstate is stillexperimental So how can we implementexperimentation now with open featureAnd we need two components of openfeature It's hooks and our trackingevents So hooks will allow us to sendflag evaluation data to ourobservability platform So this isbasically then what comes back from theflag management system What flag how theflags are evaluated We can um track thisuh with hooks And then we also need thetracking events Um so the the newaddition that I'm talking about and thishelps us to send uh application metricsto our observability platform and thisreally helps us to to tie both uh flagevaluations and metrics together in ourobservability platform of our choice andthen really uh analyze the data anddrive our businessdecisions And let's have a look at anexample here Um I was using the opentelemetry protocol to send uh the logsto the to um to dino trace in my exampleBut as again you can use anyobservability platform that youwant Um so how does this look like Sothis is basically an example of all ofof some logs um of my toggle shop and umyou can see here um the flag key So allthe different flags are listed and ourbanner is uh our banner flag is the umthe one that has the key offer freeshipping and and then we have we can seewith which variant it was evaluated andthen we have a very important um partthis is the context ID and this uh andand and this data point basically umgives us the or represents one user Sothe user logs in it's usually um like asession ID You can imagine it like thatAnd and then um this is a reallyimportant part because we need this alsoin our metric logs So in the trackingevents to um correlate the the right uhlogs to each other So the flagevaluations and thetracking So now let's also have a lookat how the tracking events look like inin theend And we'll we'll look at them againhere And you can see �um we have uh thevalues that our tracking event um umyeah u recorded And then uh again theuser ids and the context ids the usersand um here you can see all differentkinds of users And actually I forgot tomention So here in the here you can seemaybe maybe one of you in the the firstrow first rows they can see the maybeone user has like seven e in the end andum we have this user ed here again inthe first row So we will uh basicallyjoin both these uh types of blocks andum and evaluate or um analyze if ourfeature really was the right featurethat we built And um so I did this herein this graph and with my fictional datathat I uh created the users with um wecan pretty say that we built the rightfeature with our banner Um since you cansee the the green ones are the when thefeature was evaluated as off So thebanner was not there and the yellow onesis uh when the feature was there Um andthe the users checked out with and onaverage with higheramounts So yeah in the end we we builtthe right feature and with this I wantto hand over to LucasThanksSo uh what Alex just described isbasically trying to get insights and tounderstand what's happening and when wedo this or when we ask those thesequestions especially here as we are atthe CNCF event um what comes to ourminds is open telemetry often times Sothis is what I want to use to talk a bitabout the open telemetry semanticconventions that we together with theopen telemetry team uh worked on And nowI want to just show you where isstanding what it really is and uh yeahwhat you can expect fromthat So first the question maybe is likewhat is semantic conventions So herewe're just talking about giving ameaning to data So in the end it meanswe're like categorizing data We'regiving data labels that are common andunderstandable and that everyone canjust rely on the meaning of those Sowhen I get a datim that has this labelthis semantic attribute that we definedI can exactly know what it means becauseit's defined So this helps us to buildtools uh for thesetools for these uh for uh looking atthis data and also it helps us whenlooking at the data already andunderstandingit Okay Um why do we need that forfeature flags So what feature effects dois they change what our code does Theychange for example uh something in theback end code and it might introducesome additional latency It mightintroduce errors but it also mightintroduce business wise changes likewhat Alex described that ourum checkout like our average checkoutamount just increases So all of thesethings can be changed by a feature flagcan be influenced by a feature flag andwe want to be able to understandthat Okay So when we say we assignlabels and attributes uh to data and tocommon data types I just brought youwhat our current status of the um opentelemetry s Um so what we can see hereis that there are typical things likefeature flag key which is the name ofthe feature flag errors if they occurthe variant but also the context uh thatwe send there So if we are wondering whysomething happened we have all theinformation to be able to understandwhat happened why it happened and tocorrelate all these things And if you'refamiliar with open feature you might seethat some of these things sound like thewording we use at open feature This isbecause we brought in our wording ouridea of how things could be named Workedtogether with the team and tried to finda common set of wording Some parts fromus some part we were more generic thanwe used them before to bring that inthis con uh semanticconventionsOkay where are we at with this now Sothat what you just saw is what we thinkis the version that we can stabilizeThere are two uh three two or three openissues that are mostly relying on umdata types that we already have in uhopen telemetry and we're wondering ifthey are that generic and we can usethem or if we have to use something uhspecific for feature flagging So we willhave to resolve these first and then weneed some feedback We need feedback fromthe industry to see does it work What wethink do have we covered all the casesthat we need Does it work And� so if youuh want to try it out and if you want tosee you can check out the toy shop uh wecan check the or you can check out uhthe open telemetry traces there and thencan see if that works foryou Okay Um we have another set ofsmaller things that we just want tobriefly mention So the first one is weare probably going to be uh to get afeature flagging course approved by theCNCF So what we want to do there is wewant to give a brief introduction or wewant to give a introduction to featureflagging into the mechanics and to thetheory of it um independently from openfeature understanding water flags whatare the the upsides the downsides whatdo you have to look at and then we wantto give examples on how can open featurehelp you in these cases how can it helpyou solve thoseum exactly uh also we want to go intosome some deeper things like contextfulevaluation and So patterns that we seeand also give like an overview about uhthe whole ecosystem of feature flaggingBut more is to come there Uh it's just aheads up Something will probably come upuh at somepoint And that's another thing If you uhare a vendor for feature flag systems orif you for example uh work on a featureflex system then you might want to joinour feature flag uh vendor council whichis just a group of entities that we gettogether to discuss topics uh help workon the specification explicitly So wewant to get together explicitly in thatcase so that we have the chance for uminvolving involving vendors um moreproactively from our side into workingon the projects uh specification onissues understand does that work what wecurrentlyhave and uh yeah for for you probablysomething like like networking and alsobeing able to influence that and beingrecognized as a contributor to openfeature uh uh could be somethinginteresting foryou So if you want to go in there youcan you can see um on our blog we wehave uh described there how you can getinto it It's just a simple message youcan drop or ping us on GitHub and thenum yeah you can be part of that if youwantUm the last thing is last year we madeour first uh version of or first draftversion of offrep which is the openfeature remote evaluation protocol whichis our which is our protocol forremotely evaluating feature flags So wetook all the ideas we have in the SDKsput parts of it into this firstexperimentation experimental version ofthis protocol and then weum we or it was implemented by flagdflip dev cycle and go feature flagexperimentally So if you want to try itout if you want to want to try if offerworks for you you can look at thesetools try it out And the main benefitfor you as a provider as a featureflagging vendor or tool author would bethat you don't have to build all theseadapters providers that Thomas just saidyou can just take everything we haveThen you are compatible to open featureat the network boundary and all the codethat is written for open feature willwork for youSo what's next Um first we want to shoutout to all those amazing people thathelp to build open feature I think asyou can see we have a pretty hugecommunity and and we are super welcomingSo if you want to join us feel freebecause this is a super group of peopleand you you can find some people to towork on on this project But if you justwant to use open feature you can startby going to openfeature.dev that is ourwebsite where you can find uh all theintegration with SDKs how to plug aprovider and so on So so that's reallythe way to go Um you can also join theorganization on GitHub and I think thebest way to start is probably to jointhe open fitter Slack channel or to cometo a community meeting that we haveevery otherweek And uh this is just the first talkof open feature this week but we havetwo others uh one on Thursday around umprogressive delivery change inKubernetes with canary deploy andfeature flags and also another one morespecific on type safe feature flaggingin open feature So uh I encourage you toto go on this on this session and wealso have a booth in project pav umevery afternoon So if you want to askany question uh feel free that's the wayto go You will have people that knowexactly what we are talking about in ina in abox And that was it for us Uh so if youhave any question it doesn't have to berelated to today's presentation like anyquestion on open feature feel free toask them and we are happy to answer them[Applause]questionAnyone have a question Yeah you may goto the mic I thinkUh hello Thanks guys Great talk Uh Ihave a question about the numbers thatyou showed uh in the beginning theadoption rate Very interesting becauseuh if I'm not mistaken the Python SDK isgrowing moreYeah Yeah I can throw itIt's far away but I can if Yeah YeahThere it is So thePython SDK downloads are growing morethan JavaScript web It is prettyinteresting Do we have an idea why Um Ithink the adoption of JavaScript was uhhigher before So the growth is not ashigh Python has like uh catch up laterBut uh yeah we are not tracking it likedaily to understand exactly why but Ithink we have more and more customersusing uh every technology as everyopensource project it's hard to know whois using what so we don't know exactlybut I think we have more people usingPython nowthat's a good point yeah you can see inthe list we have only a few of SDKsbecause like for example Go have notracking measure so you cannot track howgo is is progressing or this kind ofthings So that's why we we've put onlysome of them here Okay Thank youAny other questionYeah if you want to go to the mic it'seven better especially for the recordingHi everyone Uh thank you for thepresentation It was awesome Uh so I amuh working at Lua Merla It's a homeimprovement retail business and we havestores throughout the Europe Uh westarted adopting open feature last yearUh we didn't have any mechanisms forfeature toggles So uh the first thing uhI decided with my team was if we aregoing to do it we will do it uh um aCNCF way and go with open feature Ourmajor challenge right now is decidingwhat to adoptuh as an open source provider We uhchoose uh flipped uh because it has anice UI and also enables GitHubs Andsince then we discovered that many ofour teams were already uh implementingsome kind of feature toggling by uhmanaging configuration with vaultsecrets uh that got injected intokubernetes and they uh we use them inthe application logic So now uh we aredabbling with the idea of writing ourown custom uh open feature uh provideruh to be able to use open feature andstill uh support our legacy systems byusing vault as a provider Uh I I'm uhfeeling that this might be a pitfall SoI'm asking you for your opinion on thematterMaybe you could elaborate on what do wemean is could be the pitfall there Souh so what we wanted is to keep thelegacy applications still using u vaultto have the flag values in there and wehaving uh some kind of transformationlayer between that and our customprovider implementation that uh goesalong with the open featurespecification Okay I I think this iskind of a common use case that we seethat people have homegrown solution wantto move to a vendor and so so we haveone initiative that we call themulti-provider that is um a way to havemulti-provider in a same um in a sameconfiguration and this is really we wedesigned it and you can ask that is thathas designed it but really in a way tomigrate from one system to another So ifif you were using vault and people stillwant to use it but you want to migratethem to flip at some point you can usethe multi-provider system that will helpyou toseamlessly move them to one to anothersystem without to break everything thatthey already have with open if they wereusing open feature you can put a layerof vault and that's totally fine tobuild your own provider for that uh itjust like avoid making it a long-termsolution because vault is not designedfor that that's the main thing I want toI want to say and flipped or other oneexist that are pretty good that you canuse behind uh open feature Well thankyou very muchI think we are we are out of time Sothanks a lot for for joining the sessionUh we're still around if you havequestions So yeah Thanks a lot Bye[Applause]2025-04-15 22:00:08.711708� attacks so there'slots of ways to define a software supplychain i'm not going to read what's onthe slide to you but it's basically umthe process by which you make yoursoftware the things that happen whetherthese are machines that are going andcompiling things whether they're humandevelopers creating patches and stufflike this whether it's your lawyerlooking over uh a license to decide ifthey can take open source software allof these types of things come togetherto become your software supply chain andhere's just a really simple example herewhere um I have a version control systemhere uh there's some testing that's doneover it like a llinter is run directlyon my source code and then uh assumingthat the llinter is okay with it then Imight proceed to actually go and buildthe software and then package it andsend it out and real software supplychains will often have many more stepshere like I might do fuzzing i mightactually do some unit testing and stufflike this on it i might be pulling independencies from other uh locations andstuff like this but this is just areally simple example that fits nice ona slide and I hope we can all kind ofconceptuallyunderstand um okay so we talked aboutwhat is a software supply chain nowlet's talk about software supply chainattack and this is a situation where a aparty goes in and uh causes somethingmalicious to happen um and in fact insome cases it doesn't even require anexternal party in some cases it can besomething accidental that occurs but youend up with the process not proceedingand not behaving in the way that youwould expect or desire it uh to tohappen and there's tons and tons ofincidents for this so for instance if wejust look at version control systemshere there have been a bunch of reallyum high-profile incidents where in thecase of one of them that I'll justhappen to bring up where allegedly theNSA broke into Juniper and put abackdoor in a lot of their products thatlet the allegedly NSA go and backdoor awhole bunch of Juniper VPN connectionsuh that were being made by any of theirrouters over an extended period of timeum that's not an isolated incident onceagain we could fill this whole talk withtalking about lots of things that havehappened and there's a wonderfulsoftware supply chain catalog that uhexists under the CNCF which lists ahundred or so software supply chainactually I think more than 100 softwaresupply chain incidents and describeswhere they occur and stuff i'm justgoing to talk about some high-profileones you might have heard of um in abuild system there have also been lotsof incidents and compromises that havehappened over and over and over againwhere attackers have gotten into thisinfrastructure and done malicious thingsone of the most notable was whathappened with Solar Winds where onceagain allegedly Russian hackers thistime broke in and put in back doors uhthat would have let them uh get into allparts uh all sorts of Fortune 500companies and parts of the US governmentand other high-profile networks likethis um when it comes to sort of thepackaging or distribution standpointthere's also been lots and lots ofdifferent incidents of this um oneincident was an incident where um a theXcode library for Apple which wasn'tbeing natively distributed from fastlocal mirrors inside of China wheresomeone had put a malicious version upof the Xcode library that they said wasjust a faithful mirror of it in Chinaand a whole bunch of Chinese developersthen downloaded this maliciouslybackdoor version of Xcode Ghost andended up including this in software thatthey were creating and then distributingto others so uh this is quite anefarious attack and finally last butnot least um I am going to be very niceto the CrowdStrike folks and not talkabout them here and I'll use anotherincident just to say that testing isalso something that a lot of companiescan get wrong there was an incident umgosh almost 10 years ago where Microsoftaccidentally pushed out an update onlyto a few mirrors and those mirrorshappen to serve countries likeKazakhstan that have repressive regimesthat spy on thei�r citizens and do thingslike this so a lot of people weresuspicious that Microsoft was putting aspecial backdoor version of itsoperating system out uh just so theycould target dissident in certain areasjust to show that that was not the casemicrosoft accidentally did the samething for other mirrors a couple weeksafter this and accidentally pushed apushed a bad release out there but thisis once again not an isolated incidentapple and other organizations many ofthem have failed to correctly run theirtesting and have accidentally pushed outbeta versions or things like that outinto production as I'm sure at leastsome of you in the room may haveaccidentally done from time to time aswell um so and I I'll also say thatattacks on software supply chain ingeneral have been growing enormouslyover time um theyum it has had this sort of exponentialgrowth where people have been more andmore tuned in and doing more and moreattacks i uh first started to work onsoftware supply chain security backaround2002 or so and in that time like almostno one was paying attention um when Ifirst started to release and talk aboutproblems then I actually normally hadpeople at sort of the what would now bekind of the nation state actor hackersreaching out to me to learn moreinformation and for a long time this wasthe domain where mostly nation stateactors were doing really clever attacksto target infrastructure to you knowtarget you know so that South Korea'sbanks and infrastructure would lose youknow threearters of a billion US dollarsin a cyber attack that allegedly camefrom North Korea or you know you'd havepower plants in Ukraine go offlineallegedly from Russian hackers or thingslike this but it's really become muchmore mainstream unfortunately now and isan issue that really everyone has toworry about because attackers areincreasingly um going after things likethe common package repositories we alluse and uh targeting more and morecompanies so um this brings me to asharp detour in what I'm going to say tosay what about compliance right so I I'mgonna I promise you it'll make sense whyI'm talking about compliance in in asecond here and um there's a lot thatnow everyone has to do with respect tocompliance if you're in the US or otherplaces you have to do quite a bit umwith sbombs due to uh an executive orderthat came out a while ago there's theEuropean Cyber Resilience Act which uhhas gone and further mandated certaincontrols and protections um and even inthe UK the UK largely model has modeledwhat they're doing off of these existingregulations and the guidance that's comeout of other organizations and otherplaces like the secure softwaredevelopment framework out of NIST andother uh publications that they've beeninvolved in and so what's kind ofinteresting here is that a lot of thiscompliance work as we're about to hearmore about is um you know actually goingand causing us to put better controls inlots of places and one of the main areasthis happens is actually related tointoistations in the software supplychain so with that I will hand it ohsorry so compliance is an enabler andit's I really just said exactly thisthat um if we all you know I I if wetake people that are used to doingthings like making sure somethingconforms to have certain properties andwe can add security into the processthey're doing then we can actually makesecurity better at the same time whileletting them still do compliance so totalk more about that uh Alan why don'tyou take it away yeah uh thank you justJustin um so I'm gonna start by talkingabout how do you achieve this complianceuh with transparency right so first ofall um let's look at a supply anexamples of a supply chain right theycan be very complicated right you havemultiple steps within it um they can belocated within a VPC uh you can becalling another cloud up a cloud umprovider and within each uh instance youthen have multiple steps within it rightso how do you then easily tell agovernment official that you arecomplying with the um with a regulationthat you you need to adhere to so withthat um I'm going to introduce �to youlike hey you know maybe we can do thisvia transparency right so what do youmean by transparency basically uh usingtwo sets of tools uh one for informationcollection uh the other one forbasically information discovery um oneof them is basically to collect all theevidence that in the supply chain sobasically hey you know build data um Ibelieve you have like package managersand sbombs and then and then once youhave all this data how do you make senseof it and how do you collect it rightyou don't want to just dump it up intolike a big um in binary storage you justwant to you want to be able to accessthe things that you care about uh whenyou need them um and with that um I willthen introduce to you like basically howdo you then make sense of all of thedata you have right since each programwe know that creates it creates its ownstandard creates its own um frameworkand the like uh we introduced in totetomas basically a way to make everyone talkthe same language right so basically wewant to make sure that say when you'redownloading something from GitHub itgives you the same attestation orsomething that looks the same as whenyou're running a CI job or when you'redoing C salsa build prevenence or whenyou're packaging something with DBN umand in order to do that uh Into hassomething called in to attestationframework which basically if you were tolook at this uh can be a lot of text butthink about it is like you have a majorenvelope that basically contains the uhattestation itself um and then withinthe attestation you have um somethingcalled theuh uh you have two things right you havethethe the subjects and as well as the uhas a predicate so the subjects arebasically the things you're testing toso if you're testing to say a DBN fileright you're test that will be thesubject and then you then also have thepredicate which is the evidence now Ilike to focus a bit more on the uhpredicate side of thingswhere and if you look at this uhpredicate it is a spx uh predicate soit's attached to um an sbomb now esbsare not the only thing we we canpredicate on you then also have linkswhich is basically an input output umand then you also have salsa provenencewhich is basically you have buildprovenence and then you have all theother salsa tracks that are that couldbe of interest and you also can add testto releases uh runtime traces testresults vulnerabilities and etc um andthen if you're uh curious about to learnmore about how testations work uhthere's a cure code uh that points tothe uh in total attestations uhrepository now basically one of thethings we're trying to say is you canachieve uh compliance with transparencyusingintoations now how do you get started umright now we're just going to highlighta few of the tools that you can useright now to basically start get getstarted one of them is into witness itis a production ready uh implementationof into u that you can readily justdownload and then start and then startgenerating intostations um it wasdeveloped by uh folks at testify sec andit was donated to the intto project lastyear um and then there's also arepository uh if you want to look at itum and then how witness works isbasically you have a context that you'rerunning witness on so that that could beyour GitHub action or that could also belike um like GitLab CI and then youbasically give it input files and youhave the output files and it'll capturethat into a context uh and then it willthen pull a key from or like auh key provider right that could be uhGoogle uh or like a cloud provider KBMand then use that to sign auh the sessationyeah so so really what that's doing thenis really you take whatever step younormally do as part of your softwaresupply chain and you just capture thethings you're doing it on and ratherthan have this be something thatsomebody puts a check mark in a you knowon a on a a notebook or inside of aGoogle spreadsheet or something likethat then now you get this likecryptographically attested thing that'ssigned and you can do your complianceand everything automaticallyso you don't have to do any of thesemanual s�teps anymore right yes okaygreat thanks um and then once yougenerate it you then have to store itsomewhere um and the next project I'mgoing to show you is um Archive Vistawhich is basically another project thatwas developed by the folks at Testify uhit basically helps you store andmaintain the relationships between umbetween the subjects so basically italso has a you can start dumping allyour attestations into it and then ifyou need to say extract an attestationthat were that was generated by say aspecific person or a specific machineyou can get all theations or if youexpect you or if you need an attestationof a specific type you can then requestit from there now with these two toolsyou can get a pretty good um working uhrelationship like a good working um viewof like what's happening in yoursoftware supply chain right uh you haveall the tools you're using you have allof the uh cloud providers that you'reutilizing and all the information thatyou're generating um but how do you makesense of it um and then now from therewe can we have something calleduh uh how you can now visualize andunderstand your supply chain usingsomething called guac which stands forgrapher understanding um artifactcompositions um and it was basically youcan fit it not only at the stations butalso sbombs um vulnerable informationand much more um and then you it willhelp you map out the relationshipbetween your artifacts so that you canvisualize and create for information umother things you can do with it isbasically you can establish relationshipbetween the components um unveil gapsuse a pie chain so you can basicallytake a look at it and then find whereyou have any issues or where like adependency is not set up properly andthen you can also identify uh threadsand then fix it um and then oneadditional plus um is that you can alsovisualize it uh this is an example oflike uh an artifact that depends on thathas its dependency laid out using an uhthe DBN and as well as um uh sbomb rightso so guac's a really useful tool butit's not essential if you don't want touse it you don't have to but it can helpyou just get more out of yourinformation because it's easier tovisualizeyeah awesome then um and of course likethis is basically a really small examplebecause uh software supply chains canget very complicated and you can get areally big graph um now now that youhave all of that information right howdo you make surethat your supply chain uh conforms to aspecific uh standard or make sure thatyou have some internal QA that you wantto adhere to how do you make sure thathow can you leverage all that u evidenceto enforce that um so one of the thingsuh in total also provide helps couldhelp with is basically validating thedevelopment process so you can answerquestions like hey you know is my CIbeing run on those on the trusteddevices being generated in the endis is like is my image getting scannedin the end or is it getting tested um oreven towards the end it's like are thereenough people looking at this artifactright so you can basically say hey um Ineed two people to sign off on thisspecific um step and in order to do thatum we'll we'll uh you can use into um toenforce certain policies um we'llrevisit the uh initial supply chain wesaw earlier so pretty straightforward umand let's just uh focus in on the uhspecific steps so you currently havefour steps you the source control uh youhave the test the build and then thepackaging now each step is run bysomeone that you should trust um and nowat each step uh some steps will haveoutputs and these will we will callproducts um so basically if you'resource control when you're downloadingthe the source repo you want to you'rebasically generating um the code andthen and then when you're packaging orbuilding something you're generating aspecific artifact after that um and oneach step you then also have a materialwhich is basically the input to eachstep um and from that input um you canthen start figuring out like okay I wantto make sure that the the product I'mbuilding is also using the uh sourcecode that I downloaded and y�ou can thenstart drawing out all of theserelationships between each step and thenuse those to then draw out and thenwrite down like okay what's the shape ofmy supply chain um and this we'll call alayout right an intern to layout and howdo you make sure that this layout isindeed what you want it to be or whatwhatever the project owner expects it'llbe signed by the projectowner so this is basically the bigpicture idea of like what an in totallayout does this is how you enforce likeuh the certain policies and basically ifyou have the total layout with theattestations from your steps you canthen verify that a specific um that yoursupply chain was not um was adheredadhere to the specificlayout now uh basically going again overin total layouts basically main thingthat you can do is like uh definerelationships between certain steps umand then the other things you can do islike basically define who is a trustedfunctionary for each step um now one ofthe issues is that is that it would uhlayouts were designed before came intoplay so before decisions came in we hadsomething called links whichwere powerful but were more limited inthe amount of data you can uh containwithin it you first define the inputs ofthe val of the step the outputs of thestep and the specific command that wasstep uh that was u ran uh with newer atthe stations you have more metadata forexample like with a build also buildprovenence you can then contain um whatspecific um version uh the specificmachine it was run on and even I believeuh like even the current state of themachine so that'sbasically one of the things that we wantto take advantage of and be and andthrough that uh the community figure outlike hey we have to do something newabout uh we have to make sure that uhour policy enforcement catches up to ourattestations and we have the in totalpolicy group basically created and oneof these are some of the things westarted working on so uh one of our uhuh core contributor Aditia basically hadtwo in total enhancements one of them isto basically uh that basically helpbridge the gap uh between like andenable uh in total layouts to basicallyleverage at the stations and we have aprototype there called in intoattestation verifier and there areefforts towards uh uh moving uh all thefeatures of a decision verifier intowitness um the the CLI we mentionedearliernow the other things some of the othergoals of the internal policies workinggroup right is not only to just create astop gap to make sure that we can use uhat the stations in layouts but alsofigure out like hey um can we dosomething better right uh we want toimprove the use cases improve theusability improve um like uh the way youcan define certain policies um and alsois there a way we can do thisdifferently right so uh one of thethings we're looking at is like maybe wecan talk to other policy engines likemacaroon and then figure out if we candesign something that's like policyengine agnostic that way if someone'slike hey you know I have this policy Iwant to give it to you uh I want to makesure that your policy engine can also uenforce such a policy that way um policycan also be shared between individualsand teamsum but yeah that's it for uh policiesand then we're going to get on tocommunity updates um so this isbasically integrations and adoption Ifyou've seen this light last year I thinkthere's a few more logos up there thatmight not notice and there's a fewthings that are new since uh last yearso we have uh GitHub introduced uhgenerating of build provenence in GitHubactions um Homebrew is now supporting umI believe I I'm not familiar withHomebrew but I think they're usinggenerating at the stations when youbottle a specific package that you canalso look at uh you also have Pippisupports uh package out a outdoors umuploading at the stations and alsothere's a white paper that uh autoeskand testify sec released uh that wentover how they achieved uh fed ramp uhwhich is uh level compliance in the inthe United States using in test stationsby and by leveraging uh witness andarchist though um the other thing uhupdat�e is that it has finally graduatedum all thanks to the work from Justinand uh the people everyone in thecommunity um but yeah so if you'reinterested to learn more about Inttofeel free to join us i have added the QRcode with the slides so if you want toreference to it uh you can look at itum and yeah I mean one thing I'll alsolike to add if you or your company hasalso implemented to and you have runinto certain issues uh feel free to talkto us and we can f we can thenunderstand better like how you we can eeither improve the framework or maybecome up with a way where everyone canshare like how they implemented toto andhopefully can grow yeah we're a bigwelcoming community and we'd love tohave uh any help feedback uh thoughts soum yeah and thank you all for listeningto us feel free to ask any questions[Applause]so our pro our project publishes the intoto JSON stuff but it feels like it'san extra file that needs to get carriedaround and validated and so forth haveyou looked at all at ways to make thatdiscoverable if I just have an artifactbecause carrying around two or three orfour things gets a lot harder thancarrying around one yeah yeah i meanthat uhso if you're coming from a US-basedorganization which I'm not saying youare and most of you are not then there'salready a mandate to do sbombs andthere's also a lot of push within the EUand elsewhere to mandate sbombs andJapan and other countries are alsostandardizing these formats one of theefforts we're doing is also to in anormalized standardized way integrateinto attestations as a field that'slocatable from your ESBOM and so um thisis one way in which this potentiallycould be addressed in the I don't knowfairly short term like months to maybe ayear standpoint if everything goes wellum there are other ways and otheroptions and stuff like this thatorganizations use some of them if theydistribute things for instance via atough repository then the tough metadatawill contain actual linkages to theinoto metadata that's the other verycommon way to do that and that's datadog and lots of other big companies havedone that way as well so there's a menuof Yes there are different ways and ifyou think of another way that we shouldbe promoting then tell us about itcan you share a little bit more abouthow you think about sig store and skitand their relationship to intoum so uhIntoestationsare one of maybe the most popular thingthat's stored inside of SIG stores recorit's it's just like a major drivingforce i think wehave oh I'm going to get this slightlywrong but it's hundreds of thousands inthe last like few months of added umintoattestations and so um we're very happyto be a part of that and really happythat a lot of the organizations thatmake things want to sign it and use sigstore and other things so um one of thethe uh creators of into is a PhD studentof mine Santiago Torres Aaras and he'salso one of the creators of SIG store soum the projects are were always at leastthought of as having some relationshipand work very welltogether i know we went through a lot ifyou have a question you feels like adumb question do not hesitate to ask wewent through a lot of things reallyquickly and so on so happy to take anyquestions or we can always take itoffline if anybody's shy someone is notshy here we go all rightis there a link or will there be sometimes in the future with the salsaproject and the level of not compliancebut uh the salsa levels the sensorelements you mean in terms of when yourun uh salsa salsa oh salsa oh yeahsorry so um so do you want to talk aboutthat uh no or do you want me to yeah gotit all right so um salsa actually uhfirst came into the CNCF or into theLinux Foundation as a sub project ofIntoTo and actually um the format if yougo to the salsa website and you look atlike a salsa atestation it's an intoiststationation it's an into attestationthat points that like in the policy partpoints to the salsa portion of it sosalsa is like a veryum like intto is doing a very easy partof what salsa is trying to do but It'sdoing a part that gives it like kind ofa layer of compatibility kind of like IPon the internet like provides this layerof transport everywhere in toto is thislittle signing layer that's in a lot ofthese technologies you see out therelike almost everything software supplychain if you go into it a bit you findanintoistation wrapper there because a lotof the tooling uses it and then salsahas all the opinions about what shouldit mean for something to be like sourcetrack you know this level compliant orto have this build level compliance it'slike a very it has its own opinions onthe boundaries you have to meet and intoto it takes that data along for theride so we we love the salsa communitywe work with them and they're terrificfolks thank you sureyeah i mean one other thing I would liketo add to that is like if you look at umthe homebrew uh repository and all ofthe attestations they generate um mostof them are salsa at a station but theyare encapsulated inside an intestationright so you can think about salsa aslike the specific um contents and thenin to the protocol that it's sentthrough right so and the way it's sharedsosure go aheadi didn't saw in the layout format um andthe the graphical uh you the graph youyou showedum there were no if I'm not mistaken theinformation around the it was after umonthe machines and servers that are makingthe builds and if it should becompromisedthat would that's a great question okayso um your in total layout contains uhso it contains keys for the parties thatare supposed to do the roles and thosecould be things like um a key in an HSMon a build server or it could be somelike spiffy orert manager provisionedidentity on some system or it could bean ephemeral key from sig store orwhatever it can be anything you want umso it uh theum so um I got distracted by himdropping his Kleenexes as well i don'tknow he's he's gone now though um butbut literally it uh it doesn't matterwhat the like who the functionary is wecan say you have that functionary now ifit's imagine something like a compilerokay you're going to take in a sourceprogram maybe that's in C and you'regoing to produce a binary out of thatthat'sexecutable in total can't know that thecompiler compiled the thing and did theright thing when it compiled it right itcan know that it read this file and thatit produced this output and the compilerhad this hash to it and everything elsebut if your compiler has a bug in it orif you were supposed to do code reviewand you just are like I don't want tolook at this i'm just going to say looksgood to me and hit you know not thatanyone would ever do that but if ifsomeone ever happened to do that in thehistory of the universe like we're notgoing to say oh you didn't have twopeople check this because you did sortof right we can't know that informationbut we can make sure it went throughyour actual policy and your actualserver is the one that ran the compilerand actually the the right version ofthe compiler and the compiler producedthis output and so on and so it providesyou a lot of protection but it it can'tbe foolproof against everything you haveto also in some cases worry about theindividual steps like are do ourdevelopers pay attention when they docodereviews are we running a compiler that'syou know was written by someone in youknow North Korea and something like thatlike you know you you have to care aboutthose things a bit load a kernel modulefrom North Korea yeah do you load akernel module don't load a kernel modulefrom North Korea pro tip you heard ithere first um we're out of time yeahwe're out of time I think if anyone hasany other questions we're happy to takethem thank you all so much for comingthank you2025-04-15 22:00:09.662133�ff to make it more discoverablefor the people who come there right Youwant to describe it You want to informpeople You want to help them get to whatthey want but you really if you've gotsomething you're putting in ArtifactHubyou probably want to showcase it as bestas possible So how can we go ahead anddo that Well the first thing isArtifactor Hub actually does some autodetection out of the box for as much asit can It'll try to detect things likethe name description version and someother details if that metadata alreadyexists It's trying to be intelligent andas intelligent as it can but only somuch of that information is readilyavailable to be figured out So we needto go a little deeper Now what kind ofthings might you want to do to go deeperon Um you might want to have some morecontrol over the name right The way it'sdisplayed the way it's found throughsearch You might want to providecategories or keywords Is this asecurity tool Is this a backup tool Isit something like that What othermetadata can you do to help ArtifactHubindex and help people discover thingsYou might want to recommend otherartifacts right If you've got two orthree things that work well togethermaybe you want to recommend thosetogether So when you're looking at oneit just pops up Hey this other thing isthere right Uh you might want to sharethe company that actually produced thisthing If you're behind a business or anorganization or even uh you know anonprofit something like that you mightwant to say "Hey we're the ones whocreated it Let's put a name on it." Uhyou might want to set some installinstructions Some of the installinstructions very easy to figure outHelm's easy Helm install There's certaincommands you can do Other artifacts it'snot as straightforward and you want tocontrol some of those And of coursethere's even more you want to do Solet's walk through how do we solve someof these problems and provide morecontext Um there's there's a couple ofplaces that we're going to get to towhere you can set some of thisinformation Some of the artifacts umHelm you know Tecton some of these havetheir own configuration files alreadyright Like a chart.yml file things likethat And they have annotations And so inthe annotations you can just specifythis additional information But notevery kind of artifact has thatinformation Um and so in that casethere's actually an artifact hub packagefile where you can specify a whole bunchof information And up on the screenyou're going to see a bunch of thingshere And a number of these things can gointo annotations as well for those thatalready have that you know in theirconfiguration files So you're going tosee a little bit of a depending on theartifact where do you put thisinformation So let's walk through someof this stuff so that way you can seewhat we've got Uh let's talk about thename right If you get something likeHelm you can set the name right I cansay it's PostgressQL but sometimes you have a second namethat goes along with it like PostgressThis is a very easy example but manyother things have it And you canactually specify what that alternativename is here And that means whensomebody goes to search for it when theydo something else it shows up that wayNow there's certain rules You can seethe note at the bottom here right Onehas to be a substring of the other Sosomebody can't have you know a fooar issomething and try and grab cube as theirnamespace It doesn't work You actuallyhave to have these things be substringsof each other Little checks and balancesBut there is the ability to have analternative name there that makes sensethat fitsin Um and you can specify it insomething that has its own manifest likeI showed a moment ago or in the artifacthub package You can specify thisinformation Now here's an interestingone Change logs right Software all overthe place has change logs or at leastchanges are tracked Well one of thethings that you can have is you want tosometimes share that change log And sohere's an example of a changes categoryNow this shows you the artifact hubpackage yaml what it looks like Andthere's d�ifferent kinds here right Addedchanged things like that You can specifydetails provide links for it And this iswhat it looks like if you're usingsomething that doesn't have its ownmanifest file but you can provide thatinformation And if you're usingsomething like Helm that has a manifestfile here's how you specify it as anannotation Same kind of structureroughly the same and you've got thatinformation there Now this informationis really useful because if I come overto ArtifactHub right If I go look in theright sidebar there and somebodyactually fills this information up thislights up And here's the ArtifactHub oneAnd if I go click on it you can see thechange log right here In fact I can golook at previous versions and see whatchanged in each one That metadata canpop out for people So when they come toevaluate they can see what's changingwhat's going on there's a really niceplace to see it to visualize it to youknow easily go back through it becausesometimes going back through change logsand and commit messages isn't always theeasiest This provides a nice userexperience for it Now one of theinteresting things that I found aboutthis is that ArtifactHub isn't the onlyplace that uses some of thismetadata So there's another project outthere called Update CLI Has anybodyheard of Update CLI in the room So thisis an open source project that issimilar to something like dependabot orrenovate bot but it does updates ofthings that they don't do right and sothis project integrates with things likehelm and charts and so it's now readingthe change log annotations from helmcharts to be able to share that when itdoes its update work likethose and so other projects are startingto look at some of this metadata to sayhow can we also use it to help with ourother actions and so this is anotherproject that's able to use thatmetadata Okay so let's talk about custominstall instructions here Helm it's aneasy one Some of the projects it's veryeasy to tell you what the installcommand is but this is a project hereThis is qborton And qborton is a policyengine It's one of the several that arein the CNCF And their policies happen tobe in Web Assembly And because they'rein Web Assembly you can do all theprogramming capability you can with anylanguage that can compile to WebAssembly And here's the installinstructions the custom installinstructions from one of their artifactsto say how do you install it right Andhere you can see this is markdown thatspecifies how to do it There's codesnippets in um and they specify consoleas the type and it's very easy Here'show you go install this policyNow in the UI here if you go over tothat policy you've discovered it youwant to know how to install it you'vegot the install instructions and theylook nice The code can be copied rightThe copy buttons are right there becauseit knows what to do And this gives thatability to say don't just install it ornot have install instructions forsomething complicated It lets you styleand tell people and signal to themhere's what you can do to just make itsimpleYou know one of the things that I likeis this idea of when somebody's going togo try something out you want them tohave fun in five minutes right Feel likethey've accomplished something withinfive minutes When you have nice installinstructions like this it lets them dothat and see it in a way that theauthors knewworked So let's talk about the providerright This can get into simple marketinglanguage right You want to say whoprovided something Signal my company didit That is both for advertisingSometimes it's for trust and things likethat And you can signal it in the YAMLAnd again it shows up in the sidebar tosay who the provider is And you'll seein each of these I there are threedifferent examples So you'll already seein the ArtifactHub ecosystem we've gotmultiple organizations who are signalinghey we're the providers of this It'snice for various reasonsUm let's talk about recommendations Thisis an interesting one here So you've gotone chart right or one artifact and youwant to go recommend other artifactsthat go along with i�t Maybe you've gotan application you want to installthat's an operator and you want to sayyou know what if you're going to usethis operator you might consider thesepolicies and they may be even differentpolicy for different policy enginesdepending on what your end user is usingBecause in the Kubernetes space we knowthere's a ton of flexibility of what youcan piece together with all these Legoblocks And some of them you only needone out of a set because there's thingsthat occupy the same space And so hereyou've got something that isrecommending manager And you can see thetwo different style of annotationsdepending on you know or YAML dependingon what project you're doing And thenwhat happens is this new section pops upwhen that's there and it tells you heythere's a recommendation pops up rightat the top and says hey there's arecommendation And this is differentfrom what shows up lower down on thesidebar for related things becausethat's what using your metadata andeverything else ArtifactHub tries topiece together to recommend This is yourexplicit recommendations Those go upfrontSo let's talk about uh sometimes fun andsometimes confusing thing signingartifacts right um you see thissometimes in the security environmentsthings like that sometimes it's a painwhen signing doesn't work so let's talkabout signing for a minute I askedGemini you know you know what issoftware providence and signing getsinto that and you can see that it talksabout you know knowing where somethingcomes from how it was built how it wasput together um and in this case we'revery interested and how it wasn't builtright And that's really where when youtalk about providence signing can helpyou with to know how it was built Is itwhat I wanted Is it what I needed Andthis is an important thing in thesoftware space with all kinds of malwaregoing around with um people trying totake something and say "Hey I'm going tojust alter this a little bit and injectsomething in that does somethingmalicious." You know you might getsomething from a corporate vendor youthink it's from them and then it'sreally not And that's where signing canhelp you because signing is really aboutauthenticity andintegrity right Is this actually thething I got from who I got it fromBecause if somebody in the middlechanges it right They take it out theyput it into your private registry foryour company is it still that same thingthat got moved or did it get altered DoI still trust that it came from thevendor Did it still come from the rightplace And knowing this is authenticallythe thing and its integrity hasn't beenmessed with can be very important forsecurity And that's where signing comesin Now what does this have to do withArtifactHub If you go to ArtifactHubyou're actually going to see one ofthese many things there has to do withis itsigned And if you go to this here's aHelm chart and this is from GitLab Theysign their charts and it'll tell youthat it's signed there And you can seeokay GitLab the organization is signingit That means later on if I grab thisthing and I can go check it I've got away to validate it Now this is becauseHelm has provenence files It is justlike your chart It's got the same nameexcept it ends in POV Helm repositoriesknow what to do with this And OCIregistries it's easy to put them inthere Helm knows how to work with thisAnd the only kind of interesting part ofthis is this was put together years agoSo it uses this fun thing called PGP forsigning Uh it was before cosign or anyof the other stuff Um but it does usePGP And so Artifact Hub in your chartYou can specify an annotation SoArtifactHub knows what's my fingerprintwhere's the key so it knows what to dofor validation This is also a nice wayto signal to any organization that'sgoing to pull it down where do you getthe key to validate itlater But you'll find that Helm chartsaren't the only thing that can be signedThere's a bunch of artifacts out therethat can be signed And here's an exampleThis is again Q Bordon And Q Bordon issigned with cosign It was created morerecently And so they leveraged cosignfor their si�gning and verification Andit'll tell you it's signed with cosignAnd Artifact Hub can detect that anddisplay it And so it shows you how to goahead you know if something's signed andhow you can go validate it And in thiscase not everything can be signed ofcourse right So here's something thatcan't be signed And if it can't besigned it's going to tell you it's notthat it's not here This type of artifactdoesn't have a known signature methodthat's being passed around for signingandverification So let's talk about somelinting and some learning right You'vegot all of this metadata out there andyou want to go ahead and see does my youknow what is there that I can add to myartifact to my thing that I'm working onum or maybe you want to verify thateverything is being put in there that Ithink is So ArtifactHub actually has aCLI um that can go along that's a gooddeveloper tool that you can use Uh theCLI you can find it in the docs You canfind the detail which is artifact.iodocsuh is where you're going to find thedocumentation at And if you scroll downon the left side you'll find there's aCLI Now the CLI uh has some simple uhinstallation If you're on Mac you caninstall it with Brew If you're onWindows you can install it with Scoop Ifyou're on Linux you can download thebinary and put it in the right place Wedon't have a Linux install for packagemanagers right now Um most developmenttends to be in Mac and Windows andthat's where the easy install is And soif you bring this up you're going to seethat the commands for it uh the primarycommand that's in it right now islinting And that's where it can look atit lint the configuration providefeedback So let's go ahead and we'll useHelm to just create a dummy use homecreate to create a chart and just out ofthe box see what's in there And whatit's going to do is it's going to showyou all of these things that it does ordoes notfind Right And this may be a littlesmall for you to read because it's along list but it actually goes through abunch of the things that I talked aboutand things that I didn't And it tellsyou what's there and what isn't And ifyou want details on you know what's notthere and where you can learn moreartifub.iodcsio/doccks will actually give you thedetails on this and the kinds ofinformation you can put in So if you'redoing a chart or you're doing a keyboardand policy you can go to that one therenavigate to it and then find out exactlythe metadata that you can put in thestructure and how it all works And thiswill validate that it's in the rightstructure It'll validate thateverything's there and works So it tellsyou it's not there and it validates thatit can read it and work with it And thishere is a Helm chart but this is smalland I'm sure you all can't read it verywell but this is for uh policy and it'slooking at the different data structurefor the different thing and fills it inSo it's policy aware right Or it's it'stype aware Now you have to specify thekind because it can't always tell whatit is from the kind Um especially sinceyou know you get artifacthub-package.yaml How does it knowexactly which artifacts it's scanningthere um you have to specify with a kindbut it does give you the ability to dothis And here I'm just showing examplesof a single artifact If you actuallyhave a directory with a bunch ofartifacts in it and a bunch of these ora subdirectory structure it can lookover all of them and provide you areadout on all of them So it can look ata whole chart repository and provide youthe feedback for all of themAll right so let's switch gears from howdo you specify your information you knowto make it easier for artifact to how doyou use some of those other things inartifact hub that are there And one ofthem is actually taking information inartifact hub and embedding it elsewhereto share around Right If I havesomething that I found in artifact hubthat I want to share I don't want to goupdate that other website all the timeevery time a new artifact hub getspushed out or an artifact on artifacthub gets pushed out you know I want itto automatically update And so we have�ways of taking some of that informationthat you're going to find on artifacthub and embedding it elsewhere on theinternet And so I'll give you twoexamples The first is you can take anartifact itself You can go to it and thethree dots in the top right you canactually choose an embed widget Now thatembed widget will take something aboutartifact hub and let you embed it Sohere you can specify some configurationThen you get the HTML You can embed thatinto a website say in a sidebar orsomething like that And this will stayup to date when new artifacts get pushednew versions some changes in languagethings like that This will be updatedany place it's embedded So it gives youthat ability to embed details about anartifact in other places to advertise itmarket it things of thatnature But this isn't the only place youcan do it Another thing you can do issearch queries So I can take a searchquery right And here I just did a basicsearch query and you got the dots upthere You can embed the results and it'sgot the same kind of features where youcan go ahead and choose certain thingsIs it light mode dark mode uh somecharacteristics about it and you cantake a search results and embed thatsomewhere else on a site You can dothings you know you you can filter itdown to say just my organization justthis type of thing There's all kinds ofcharacteristics from search you can doand you can take that embed it and makeit viewable elsewhere and of course itwill reproduce that query so as itchanges the content will be updated aswell and in the places it's embeddedSo let's talk about web hooks becausesometimes you want to do integrationsand sometimes it's not even your ownartifacts you want to integrate withright There's all these artifactsThere's thousands and thousands ofartifacts up there Um many revisions onthose Sometimes you want to do stuffwith them if you're using thoseartifacts right And web hooks are agreat way to pass events around theinternet So Artifact Hub does supportwebhooks And in this case you'll have to besigned in And then you go to yourcontrol panel And then in the controlpanel you go to settings Prettystraightforward And then there's a webhook section And that web hook sectionyou're going to get into web hooks Nowif you go here uh in this case it wasblank And then you click on add And whatyou get here is the ability to startfilling in the details about web hooksright You have something that's going toreceive it You want artifact hub to sendsomething out So you've got things likea name a description Um you've got asecret because one of the things you'regoing to want is when your receiverreceives a web hook you don't want justanything to do it You want a sharedsecret on both sides So you can you knowsecurity reject garbage that someinternet bot may be scraping and justsending out there You want a way toreject it Know what's real know what'snot You can look at certificates sitesdomains So many things can be fakedHere's a place to have a shared secretAnd so you can have that shared secretand you specify it Now you can also thensay is this for new releases or justsecurity alerts and then you select yourartifacts you want and you select theartifacts You fill in everything elseand then you can even get into yourpayload You've got a default payloadwhich is going to be uh cloud eventsbased or you can do a custom payload Andwith that custom payload can be any typeIt uses Go templating and you canspecify the templating structure youwant for that payload So you've got aweb hook shared secret payload that youcan pass out and now when there'supdates to something for security orsomething else you can now haveautomation on the other end thatreceives that and acts onit And this is all powered byArtifactHub receiving all thesedistributed things It's doing stuffbased on those events Might as well letyou all do stuff based on those eventsas wellSo subscriptions let's talk aboutsubscriptions for a second We talkedabout web hooks and pushing things outand automation and APIs That's greatSometimes we just need a simpler way ofdoing things right Um� subscriptions Wesubscribe to email things all the timeSometimes too many things right Then wego back into our email and go "What wasthis thing I'm going to go unsubscribefrom it." I do far too much of that Butsubscriptions are something else thatArtifactHub supports if you wantsomething that's lessAPIdriven And you can see if you go toan artifact up here and here we'relooking at the SIG store Helm plugin Sofor those of you who laughed at PGP umbeing the way Helm does signing or uhgiggled about that uh there's actuallySIG store signing and verification as aHelm plugin as well for those who wantit because Helm did things before SIGstore existed and so it's become aplugin to the ecosystem to do thosethings And here you can go ahead andsubscribe to it You can subscribe tosecurity alerts or new releases And assoon as you click that ball if you'relogged in it signs you up and you'll getemails to that whenever there's a newupdate for whatever category you pickedhere you'll get it in your email You'llknow you'll be notified Um Iparticularly like at least the securityones because I don't always do every newpatch release except for especially forthings that come out all the time Um butI definitely want to pay attention toall the security releases for everythingthat I use because you want to know whenthere's a securityfix And of course in settings next toweb hooks there's an area to manage yoursubscriptions where you can either addsomething or you can go ahead and toggleMaybe you had something where you'regetting new releases and you're like"This is too much I want to change it Ican go toggle it or I can remove it It'sit's very straightforward in settingsShould all be familiar to folksSo let's talk about the API You know Italked a little bit about web hooks andpushing things out and we talk a lotabout displaying this information Mostof what you see on ArtifactHub thewebsite you can actually get to throughtheAPI Um and if you go to the doc siteyou'll actually find you can also get tothe API site This should look familiarfor people who work with APIs a lot Youcan scroll through and actually seethere's an API for just about everythingyou want to do you can get to theartifacts you know metadata all of thosethings Now one of the things you'regoing to have to do if you want to workwith the API is do API keys and you'regoing to have to authorize it And thereason for that is obviously we don'twant things out there that we don't knowabout completely abusing the API beatingup um and you know we have to controlfor that If somebody's hitting the APIfar too much you can abuse that andthere are people out there who will dothat So we do do things like keys andauthentication and that's where you needto authorize it and you can actuallyplay with it once you have theauthorization keys right in the websiteUm but again in your settings you'llhave API keys uh alongside subscriptionsand web hooks and it's prettystraightforward You just add a name andyou get a key Now this key is invalid SoI put it up there I deleted it It's nolong not not good But there's an exampleHere's the information you get for yourAPI tooling to call it and you get a keyand that key will be valid for as longas it's needed but you can come backlater and you can delete it if there's aproblem You can rotate it those kinds ofthings And then of course you can usethat to authenticate to the API You cando it through the website to see whatthat's like to test itUm and and use it in your ownapplications And so lots of tools outthere actually use the API to dointeresting forms of integration Um buta lot of it isn'tpublic So let's talk about aninteresting thing here or at least Ithink it's interesting for anybody who'sgot a complicated organization So inArtifact Hub a lot of times you mightthink one person puts something up butif you work for a company or anotherCNCF project you might have complicatedorganizational structure right You seethis if you use GitHub you've got ownersyou've got maintainers you got peoplewith read access with write access Whenyou add somebody in ArtifactHub to anorganization everybody's essentially anowner But a lot of times you don't wantto do that You want to give peoplecertain specific permissions And out ofthe box you're probably going to seewell everybody's got one level It turnsout you can do more than that There'sactually um a way to handle differentlevels of authorization so people canjust do certain things on ArtifactHubbut it's more complicated and moreflexible than you're going to see inmost other systems So if you come in andyou go to an organization so you go intosettings there's the context uh or thecontrol panel context and that's goingto let you do for yourself or anyorganizations you're a part of You go toan organization you're a part of you'rea member on And you go into and you cansee I switched to the Helm organizationhere Oh got things a little out of orderhere Uh you're actually going to seethere's authorization I jumped to thedocs here And one of the things you'regoing to see in the docs here is you cansee policies writtenin RIGO So that way you can use the samekinds of policies as OPA to say whatpeople can do So if I go back to thatauthorization page and settings for aproject or for an organization not aperson you'll see authorization and youcan come in here and turn on policiesfine grain access control And with thatyou can choose an out-of-the-box policyor you can create your own and you cancontrol exactly how you want things tobe on your organization And so withdifferent organizations wanting a littlebit different structure and not wantingsuch fine grain locked in this gives youflexibility but it also gives you somequick easy things out of the box too ifyou want some more complicatedstructure And with thatQ&A does anybody have anyquestions Ah yes Can you come up to themicrophone Thank youum the linting Can you have aconfiguration file to say you must havethese fields because that's what we wantto provide or like and fail if it ifyour packages don't have them I'm sorrywhat was that in uh the the lint theartifact hub lint It went through andgive you results But can you do like apass fail check on say I want certainfields to always be enabled and if youdon't provide that like it's required byartifact hub but I might have adifferent I don't think it does rightnow but that sounds like a greatcontribution So artifact is open So whatI'll say is um I don't think that's afeature right now but I like the ideaand when I was actually putting thistogether it crossed my mind and I wouldsay there there's two routes to it oneif somebody wants to file an issue forit um Sergio I'm I imagine will jump onthat They tend to jump on these thingspretty well So if you file an issue witha request and detail out what you'relooking for that's one way becauseissues files are great ways to getthings moving and recorded for otherpeople And this is open source so ifsomebody wants to jump in and try tocontribute it we are more than happy tohelp you with that as well So I like theidea though Are there any otherquestionsI wanted to say that uh I think thechoice of using cloud events to reportthat that web hook payload is looksawesome Do you have the schema publishedas well of what the payload contents isUh if you actually scroll down furtherin that when you go into the UI I thinkit's in there Okaygreat Are there any other questionsYeah So you've been talking about orauthorization and authorization requiresa user account Um does Artifact Hubsupport open IDNo it does not Um there are a handful ofproviders You can see you can log inwith a few of them uh up front when yougo to create an account or log in Sothere are a couple of outside sourcesthat it does but it's not flexible to doany old one right now um it does majorones like GitHub things like that Sowe'll support that Um for people whowant to run it on their own that soundslike a wonderful idea And so if somebodywants to come contribute that or to filean issue on it we are more than happy totalk aboutthat Are there any otherquestions Thank you all for coming to mytalk Have a wonderful rest of your timehere2025-04-15 22:00:10.187407 ��M�M#��QAHEhnch8Wpj8hello Thank you for coming to my talk onleveraging the little known uh featuresof Artifac Uh does everybody in herealready know what ArtifactHub is AnybodyOkay So I'm going to cover just some ofthis real briefly Um first I'll saythere are a lot of slides that I'm goingto go through today If you want theslides you can go up to the schedulingsystem and get a copy because I'm goingto breeze past some of these I use themkind of as a backdrop for talking and Ido move rather quickly So ArtifactHub isa CNCF incubating project now Uh you canget the code up on GitHub just like allof the other CNCF projects Um what youcan run it anywhere You can run it onyour own self-hosted but people areprobably most familiar with thecentralized version right Where you goahead and you get all these distributedartifacts from all over the internetThis is one place to search and findthem Um and of course ArtifactHub alittle inception here is actually up onArtifactHub You can find it you can runit you can install it Uh you can use ityourself So if you want to have your owninstance for say your ownorganizations's uh artifacts to makethose discoverable internally you canabsolutely do that You could take it toyour customers You've got customers whowant that And it is something you canrun yourself and find the details rightup on Artifacub That little inceptionhelps us with thatSo since the last time I talked aboutartifact hub we do have one new thingthat is a new type of artifact uhlast uh fall at CubeCon Um boot C joinedthe CNCF as a CNCF project And so nowbootable containers are an type ofartifact that you can discover and workwith in here alongside all of theseother ones that are up there And sothat's another new artifact that's comein The pace has slowed down a little bitbecause there's only so many projectsthat have artifacts in the CNCF in thisspace And uh so that's the newone And here I'm going to go a littledeeper If you do have questions aboutsome of the the other stuff in herethat's more higher level and basic therewas a talk that I gave at CubeCon NorthAmerica last fall Uh you can get thevideo you can get the slides and all ofthat material over here I'm going to goa little deeper into some of the othernuances but a lot of that on the surfacestuff is actually already available onthe internet video form You can find iton our blog or at thislink So hi I'm Matt Fina Um I work atSUSA primarily on Rancher I've beenworking on Artifact Hub since before itwas created Uh there was originally theHelmhub and some other hubs and theycame came together and morphed and Iworked on Helmhub I'm a Helm maintainerI've been working on that for a longtime So I've kind of carried this alongsince it originally came together and Idon't do a lot of the regular work everyday Uh you've got these two peopleSergio and Cynthia and they do most ofthe work They are the people you'llprobably run into if you're uh you knowfiling an issue on GitHub or you want tocontribute code or they're who you'regoing to see in the contributingstatistics They do a wonderful job andso every time I talk um they they ask meto come up here They don't want to comeup here They do a wonderful job and Igot to call them out because they arefantastic in everything that theydevelopSo uh the first thing that I want totouch on here is after you know you'vegot your basic stuff in there how do wetell artifact hub more details aboutyour stu��e are two ways wecan either do a uh native data dump suchas my SQL dump or we can back it upusing the controller coordinatedapproach while a volume snapshot istaken to ensure application consistencywe first need to quas the applicationand then we take a snapshot after thatwe need to unquest the applicationthe backup for both snapshot metadata uhfor the kubern this the backup for boththe kubernetes metadata and the umvolume data needs to be exported to thebackup repository the backup repositoryis a repo or location that you can useto store data and metadatauh we see a few uh green color uh pieceshere and we have the application uh CRCRD that is owned by SIG apps uh we havewarning snapshot that is uh owned by SIstorage those are existing features inKubernetes we also have a few uh otherfeatures here uh we have volume modeconversion that is targeting GA in 1u uh30 release and uh this basicallypreventsunauthorized volume mode conversionbetween the file system and block modewhen you create a PVC from a volumesnapshot we have uh consistent groupsnapshot moved to beta in 1.32uh this uh allows you to create a crashconsistent snapshot of multiple volumesat the same point in time to ensureright orderconsistency we have cozy containerobject storage interface uh that triesto bring object storage as the firstclass citizen inKubernetes that can be used to managethe backup repository that feature isstill alpha now the team is trying tobring it to van alpha2 we also have change block trackingfeature that is targeting alpha in 1.33release and this provides a way toretrieve the metadata of change blocksbetween two snapshots to enableefficient backupsthis figure shows the restore workflowwith missing and uh existing Kubernetesuh buildingblocks so to restore application inKubernetes first we need to import thebackup from the backuprepository then we need to restore theKubernetes metadata and we need torestore PVC and PVif the volume was backed up natively weneed to restore from the native datadump otherwise we need to rehydrate thePVC from the volume snapshot or a volumebackup we also have the volume populatefeature here that is targeting GA in1.33 release this feature allows you tocreate a PVC from an external datasource such as a backup uh that is not avolume snapshot or another PVC so thisfeature is very useful during therestoretime so now let me hand it over to Davefor talk about the whitepaper thanks Shing so we did a whitepaper previously on um the need for dataprotection in Kubernetes and now thatwe're wrapping up things like changeblock tracking it was like what wouldwhat could we do next and so what we'redoing is starting a new white paper andthis one um has two thrusts um we callit overall best practices for Kubernetesapplications for data protection that'skind of a mouthful there um it has twothrusts one is to lay out um how aKubernetes administrator or user can setup their application for data protectionand then also to kind of call out whatchanges need to be made in existingKubernetes applications and processes inorder to support data protectionso right now we've got a lot ofdifferent mechanisms for data protectionand that's really good and uh we've gotdifferent strategies that we can useincluding githops a backup restore orreplication but then actually protectingan application you need to understandyour resiliency needs you need to selectthe right strategy to meet thoseresiliency needs and then you mayactually have to modify your applicationto handle the strategy that you'vechosen so the white paper is underconstruction i'm going to give you apreview of some of the areas we'reworking on and uh you know we're lookingfor people to contribute review etc solet's look on in so when we startlooking at a data protection strategy wehave to look at different factors andfigure out what it is we actually needbecause everything has a cost so there'sa couple of uh industry standard termsfor data protection that's uh the uh RTOrecovery time objective and the RPOrecovery point objective so RTO is howlong is it going to take you to get yo�ursystem back up and running after adisaster so for example you may have torestore from a backup tape or you knowS3 or something and so however long thattakes to bring the application back upis your RTO rpo is how much data youcould conceivably lose so for example ifyou're doing a backup once a day youcould conceivably lose up to 24 hoursworth of data and so you have to pickyou know I obviously in in an idealworld both would be zero but there'scosts involved so you have to look atwhat your actual needs are and thenweigh that against the cost of ofimplementing thestrategy and then there's also theimpact of data protection so for exampleum if I'll get into like the consistencystuff in a bit but if you're taking a uhcrash consistent backup which is what wecommonly do with snapshots it's a verylow impact on the application but you'reonly crash consistent versus if you needto quietest the entire applicationduring the backup process theapplication is maybe down during thattime so you have to look at those thingsum replication has its own costs and itsown impact and then you have to look atthe actual uh cost of storage networkbandwidth umetc so what are some strategies that wecan use for protecting an applicationnow a lot of people like to say hey myapplication's stateless and so I canjust restore from GitOps and restart theapplication if your application reallyis stateless then it's a it's a validstrategy so what's your RTO in this casegoing to be how long is it going to taketo come back up it's how long it takesto install the application and for it tobecome it to get back online um youdon't have an RPO because you don't haveany data so that's good uh the impact onthe application it's pretty low becausethis is how you would normally installthe application anyway so you probablydon't have to change the applicationanyway to work with this and your costis low now if you actually have datawhich you know many applications do andhopefully everybody here is is herebecause they actually do have data thenthere's other strategies we can use souh backup restore is a pretty common umway of doing things your RTO it's goingto be however long it takes to restoreso that may involve um restoring fromsnapshots which could be pretty quick itmay involve uh rebuilding volumes andreloading the data which could takelonger so I'm going to say it's mediumum then your RPO how much data would youlose in the event of needing to recoverthat's going to depend on how oftenyou're doing backups you do it once aday it's going to be you know you'regoing to lose up to a day's worth ofdata you're doing it every 5 minuteswell you might lose up to five minutesbut again you're going to have to weighthe costs of you know the storage uh howoften you're taking backups etc in orderto pick what's right for you and so youreally need to take a look at yourapplication and go what's what's theactual impact if I lose a day's worth ofdata is that okay many cases it's notbut sometimes it is um and then our costagain I'm going to say this is like inthe in the mid-range because it's itdepends on how much data youhave now one um one option forreplication so replication is where youactually have the the uh the data beingstored someplace else continuously soone one option is having replicationwith a cold copy so there's no computerunning on in in your um remoteavailability zone so in that case umyour RTO is going to be fairly lowbecause you just need to bring theapplication up the data is already thereit doesn't need to be restored you'reremote replicating it into um persistentvolumes on the other side um your RPOit'll depend so you can have uhasynchronous copies where things are umyou know eventual consistency there yourRPO will be low maybe a few seconds to afew minutes depending on how much datayou're writing if you go for synchronouscopy where every write happenssimultaneously in all of youravailability zones your RT your RPO cango to zero but you may have more of animpact on the application because thingshave to actually be written in multiplecase in multiple places before �you canget back before the application can moveon your cost is going to be higherbecause you've got a lot more you've gotlive storage instead of say objectstorage you've got um network bandwidthand people always underestimate how muchbandwidth storage involves um it is alot and um so so your your cost is goingto be higher there then we could havethings like you're fully replicatedthere's m multiple data centers they'reall up and running simultaneously andyou're replicating across here that'skind of our you know our holy grail forweb applications or you know cloudnativeapplications is to really be distributedacross multiple datacenters and there your your RTO time islow um your RPO is low but your costsare high and you may actually have to dowork in the application to make it afully distributedapplication so when we look at thesestrategies you know how where are we atfor these things so GitOps we we can doGitOps uh backup and restore we've addeda lot of things into Kubernetes tosupport backup and restore well andthere's multiple uh commercial and opensource applications for handling backupand restore and it's it's fairly wellknown how it works and then replicationum this is um I think you wind up kindof rolling your own in many cases so youmay work with something like a fullyreplicated database and as long as youhave like one database it's fairly easyyou of course still have to managethings like failovers and so forth butum there's really not like just anoff-the-shelf solution for everythingthat just fixes itthe other thing you have to look at isif you do go down the replication pathwhich has a lot of advantages is that isit a 100%um do you no longer need backup andrestore i would argue no and that'sbecause you are remote replicating whichis good but you know for example say yougot a bug in your app that startswriting garbage data into the databasewell the database will very happilyreplicate that over or if you're usingremote volume replication it'll veryhappily replicate that someplace elseand then you've got a corrupted databasein all your locations uh ransomware iseven worse because that's datacorruption with teeth and so you reallywant to be able to have some immutablecopies that you can go back to and uhreplication again it's good but you haveto look at your costs uh storagenetworking and comput costs can be highso you have to you really have tounderstand if it's what you need andthen your synchronous replication thatmay have impact on on your application'sperformance because it's taking the timeto make sure the rights happening inmultiple regions before it actuallymoves forward so it's really importantto understand what your requirements areon things like RTO and RPO before youselect a strategy and then um pick theright one that you know for to matchyourcosts so we talked a little aboutconsistency earlier and this is a commonthing common theme in backup and restoreand uh Shing mentioned earlier aboutcrash consistency and volume groups soin crash consistency a crash consistentbackup is one where when we restore thesystem all of the data is basically atthe point it would be if you had justpulled the power so all of your you yougot all of your right ordering correcteverything that was supposed to bewritten to disk is on disk and um yourapplication should be resilient tocrashes so there's some form of a repairoperation that probably has to happenbut with a lot of modern systems that'spretty quick you know you've got loggingfile systems you've got uh databasesthat do logging and so it's a relativelyum fastrecovery and the other thing about crashconsistency and one of the reasons weoften uh do this is that you can do thiswith volume snapshots so we can snapshota volume at a particular point in timeeverything that was written to thevolume is what's going to be there andsnapshots are generally fast um you knowwe're talking a few milliseconds in alot of cases so the impact on yourapplication is low so the IO was pausedfor a few milliseconds and I've actuallyrun into uh cases where customers havebeen oh well even a few mi�lliseconds isnot good um so those those areinteresting but in in in general it'syou know it's in that range is theactual impact on the application um oneof the issues that we do run into ismulti volumes and getting crashconsistency across multiple volumes andthat's something the volume groupconsistency snapshot will help withthat's going to need hardware support orstorage system support to actually snapmultiple volumes togetherum another strategy for consistency isto be fully consistent this is where youquiest the application possibly even theoperating system and this will get youeverything to disk everything is correctwhen you it's like doing a power downand then a cold boot now the problem ofcourse is that if you quiet theapplication the application isn't doinganything or isn't available while you'redoing a your backup soum you know it depends on your on yourneedsnow application consistent i don't knowShing and I were arguing if anybodyactually understands what this is but inmy in my terms here this is where theapplication is actually part of thebackup so for example say you're using aPostgress database and you do a PG dumpas opposed to snapping the disks now PGdump actually runs at the Postgresslevel it's basically a transaction andit dumps out all of your SQL tables andthere's some nice things about thisbecause one it's in a format thatPostgress understands um two um ifthere's any replication you're notgetting uh duplicated data and in thecase of like PG dump it's actually umit'll you get a performance cost that'ssmall because you're running this bigtransaction to get everything out butthe database is actually available andyou can continue to write to it whileit's dumping the data out because itjust starts a transaction so this is canbe part of your strategy is figuring outhow to use this and then there'sinconsistent backups which are generallya very bad idea because that means thatwhen you you know you take your backupit looks great but when you go torestore it um something says wait Ican't recover this my I I expectedthings to be written in this order andthey were not and it dies so in generalinconsistent backups are a bad idea umthere's a few cases like say all you'redoing is just backing up log data itmight be okay but in general it's a badidea now moving on to actually um impactof uh changing applications so one ofthe things we've been seeing isoperators and operators are great theydo some neat really neat things they letyou build a composable applicationand you can do things like build an appthat says "Hey I want a database thathas this many gigs or terabytes ofstorage it should go this fast etc etc."And the operator does the work ofstanding that up um doing maintenance onit and so forth that's great but thenwhen we come down to like for examplebackup and restore one of the some ofthe issues we run into um are thingslike um the operator may be doing thingswhile you're in the middle of yourbackup because when we do a Kubernetesbackup we want to get all the resourcesout as well as all the data on volumesor in applications so that may give youan inconsistent result so if we canquest the operators that would be handyum another thing that's tricky inKubernetes is Kubernetes is this veryflat space of resources so determiningthe relationship between things is hardso we've got you know traditionally wehad fixed relationships we had thingslike stateful sets with pods and PVs andwe understood this relationship it waskind of fixed but now with operators weget a custom resource and the customresource may have multiple stateful setsattached to it and those may havemultiple PVs and understanding that sayfor example we're using the operator'sbackup and that means we don't need toback up the PVs or the stateful setsthat becomes a little uh harder soideally operators would start taggingeverything properly and having somestandards for how we tag those onrestore it gets even more interestingbecause uh you wind up in raceconditions with the operator so forexample um say you have a Postgress Ilove Postgress you have a Postgressoperator and you write on restore youwrite the CR that says database saysPostgress database well the operator ifit's up goes hey let's make a newdatabase and it creates some PVs itcreates some stateful sets and then thebackup application is like no no Iwanted to write those PVs and you windup in this fight so again this is wherewe really need to have this order ofrestore figured out and being able tounderstand the relationship between theobjects so that we restore them in theright order because if the operator runsfirst you're going to wind up with anempty database rather than your restoreddata so um again um being able to questthe operator would be helpful uh andwhat we really need to start workingtowards are somestandard interfaces where as a backupapplication it can say hey I'm lookingat these things i'm going to you know gointo restore mode or whatever and um putthings to sleep for a while so that'sthat's about where we're at on the whitepaper now I'm going to hand it back toShing on other stuffthanksDave so uh we also have other initivesin the wing group we started to havediscussions on replication uh how toreplicate volumes how to replicateKubernetes resources that are part ofthe application this is still at a veryearlystage now let me talk about how to getinvolvedso we have this u community page you canfind information about this workinggroup we have bi-weekly meetings onWednesdays so come join our meeting andlearn what we are working on and see ifthere's anything you are interested incontributing we also have a mailing listand slapchannel here are some other CubeConsessions we have a session about changeblock tracking coming up at five o'clockright after the session so if you'reinterested to learn more please jointhat session that's all we have thankyou for coming are there any questions[Applause]one question[Music]well actually so wrong slide back onemoreum you could try to do that it you knowso basically your comment hereI I'm I'm I'm going to try um sobasically the comment was uh for GitOpsone strategy would be to use GitOps torestore the application and have thedata restored independently or first andthen then bring it back up you could trythat um it's so it's a little tricky uhso for exampleum you're going to need to pick all ofyour PVs for example so if a new onecomes up you've got to make sure thatone's getting replicated so what'sactually orchestrating the whole thingfor you and basically you get you've gotbackup and restore it's just you'reusing GitOps to put your um yourKubernetes resources back um you alsorun into issues with uh applicationsthat may be a little more dynamic so ifthey are reading and writingresources that mightrestoring something that mightbealready from something else well you tryto get a consistency across the backupso you're restoring so you're backing upthe Kubernetes resources along with thedata so ideally it's not too dynamic youget all the resources generally if youdo a back Kubernetes backup at thispoint so it should be together so Iwouldn't say I would put that flavor ofGitOps you know where you're actuallydoing backup of the of the data you'reactually you've moved on to backup andrestore it's just it's not necessarily aum uh a backup application that's doingit so you still got in terms of your RTOyou still got to get all of your databack in yeah but yeah you can certainlytry it it's it's challenging right imean I've been looking at this when westarted looking at this at VMware a fewyears back God five or six years backmaybe more um everybody said "Oh yeahyou just back up all of the virtualdiscs." I was like "Okay but wait aminute each of those virtual discs likein VMware they all have like a a a UUID." It's like okay where do thoseactually go how do you stitch this wholething back together so that's a thing tolookat yeah stitching it all back togethercan be challengingoh lost shing um any otherquestions all right thank you everybodythanks so much for coming and uh we'reall going to run to the next uh the nextone on the change block tracking thanks2025-04-15 22:00:11.130930 DD��N#��uAjoOTwCatd9ghello everyone welcome to our data proprotection working group session my nameis Shinyang i work at VML by BRCON i'malso a co-chair of Kubernetes 6 storageand the data protection wing groupworking with Davehi I'm Dave Smith i work at VH and Ihand it back to youyeah so here's today's agenda uh we'regoing to discuss what we have done forthe working group who are involved whatis the motivation for establishing theworking group and we will discuss someof the projects and the white paper thatwe are working on and finally how to getinvolved here are some key updates wewrote a white paper on the dataprotection workflow in Kubernetes awhile ago we also rec recently updatedthe annual report for2024 here are some links to previouspresentations atCubeCon uh here we listed companies whoare supporting this data protectionworkinggroup in Kubernetes the day oneoperations for stay for workloads arewell supported we have persistentvolumes and persistent volume claims forthe volume operations and we haveworkload APIs such as deployment andstifle set for declarative management ofyour staferworkloads more and more stafer workloadsare moving to Kubernetesthis workloads moving to Kubernetes totake advantage of Kubernetesself-healing abilities agile deploymentthe built-in scalability and portportability however D2 operations forstafer workloads such as data protectionare still limitedthe git ops workflow has limitation insupporting state forworkload secrets config maps and datastored in persistent volumes are notstored in the git so we need to have abetter way to support data protection inkubernetes that's why we started thisdata protection working group uh thisworking group is sponsored by both sikstorage and sik appsthis figure shows the backup workflowwith missing and existing buildingblocks inKubernetes the blue color shows theprocess the green color shows existingKubernetes components yellow means it iswork in progress orange means it is amissing Kubernetes componentto back up an application in Kuberneteswe need to back up two pieces of data weneed to back up the Kubernetes metadataand we also need to back up the datastored in persistentvolumes to back up data in thepersistent volumes ther��u haven't yet uh youshould probably check out cluster meshtoo and a couple quick updates fromPsyllium 1.17 on the networking andservice mesh side uh Selium now supportsquality of service uh for guaranteedburstable or best effort for networktraffic it has now support for themulticluster services API for globalservices in cluster mesh another hottopic i would definitely check out thetalk from Selium Con yesterday um and itnow supports gateway API 1.2 so uh a lotof updates around networking on thesecurity side uh as I said networkpolicy is one of the top features inPsyllium and the performance of that hasreally increased in the latest releaseum allowing you to also prioritizecritical network policies and do networkpolicyvalidation and then on the daytooperations and scale I've talked to lotsof end users around the uh cloudnativeecosystem i know people that are runninguh seliums in clusters with thousands ofnodes as you hear late later from Googleum or also across thousands of clusterstoo so there's many met new metricsaround BGP network connections to helpit easier to manage psyllium at day twobeing able to rate limiter thesemonitoring events and including betterscale testing too so that's a quickupdate of what the project is what we'vebeen doing in the past six months sincethe last CubeCon and with that I'd liketo hand it over to Amir to talk abouthow they're using Psyllium at DB Shankerto powercontainers hello my name is Amir Khanplatform engineer atDBishanker and um I'm responsible fordevelopment and maintenance and ofreliable and secure cloud platformsolutions and uh short about the shankerit is one of the world's leadinglogistics service providers and um itoffers mainly four transportationservices land transport which we arepart of it uh ocean fright air fried anduh contract logistic And uh we do notonly deploy containers we also shipthem uh we run self-managed Kubernetesclusters on AWS and we manage the wholeuh life cycle of Kubernetes clustersnodes and underlyinginfrastructures we we treat the node ascattle we replace and patch the nodes ondaily or weekly basis we run Kafkainside our clusters and uh if networkingunderline infrastructure has someperformance issue or some latency itimpacts Kafka throughput definitelyespecially because we mirroring thetopics between clusters and thisreplicating needs really good networkunderlying infrastructure then wedecided two years ago migrate toCelium and we evaluated multiplesolution at that time the solution thathas less complexities fortroubleshooting and uh better visibilityand networking and reliable solution forus and fit somehow to ourinfrastructures and we did thismigration in not in one run uh withmulti-step uh migration with it withnear zero downtime and if you areinterested at the end of the slidesh isthe link to that how we did uh thismigration we enhanced uh Hubble metricsby using the context options and addingnew labels into Hubble metrics to haveand populate these labels withKubernetes metadata like uh service nameor pod name or identity or somethinglike that and based on that we created acustom dashboard and it does help usreally good to understand the networkingand uh Hubble CLI yeah by interactingwith uh Hubble uh relay API gives usreally good uh cluster level uhnetworking visibility uh and service toservice communications and with reallygood uh number of um filtering you cando lots actually and of course seliumCLI as well by interacting with seliumum agent API uh gives you really goodnetwork visibility on each node or EPF astate of at that note or everything uhmade us uh analyzes and networkinterruptions easier that means you wedon't need anymore install a sniffingplugin for cubectl or SSH to node and dosome cap execute some capturing and touh check ournetworking we have granular policyenforcement and uh enhanced visibilityinto layer 7 traffic um after we enablethe selium network policy that meansboth of them in one place that is reallygood for us and uh we don't need anyapplication level firewall uh for andselium network policy do everything forus um some mo�nths ago we identified somelimitation for our uh sidecarbasedservice mesh from one side uh serviceservice uh mesh vendor announced thatthey want to change their roll out andrelease strategy uh which didn't fit toour requirements from other side uh wecouldn't have full visibility into layer7 traffic as the tr um the trafficleaves uh the pod name space that wasalready encrypted by sidecar and if theinvo proxy wants to handle the layer 7uh hubble could not somehow um havevisibility to http data and we haveinvestigate at that time a lot and wechecked selium selium had has reallygood security feature to replace itcompletely what we have we had beforeand uh it could simplify our operationalcomplexities um for example we enabledthe transparent encryption for holdencryption from pod to pod node to nodeonly what with enabling one flag in ourhelm chart and everything happensautomatically managed you don't need torotate the key manually and everythinghappens under the hood and then weachieved a better application andresilience and uh better observabilityand uh we simplified our infrastructureand reducing overhead by leveragingusingeBPF and uh for service load balancingwe used before Q proxy which based onthe IP tables uh we replace it with eBPFand to have better performance and avoidthe inefficiencies of IP tablesespecially for larger scale ofKubernetes clusters if you have a lot ofthousand services and use IP table thenthis latency is noticeable we enableebpf house routing for uh optimizepacket processing and bypass the hostnetwork stack uh uh after the packetleaves the um leaves the pod name spacethe ebpf program intercepts this uhpacket and uh it's bypass uh the hostnot network stack and routing andreached at the end to u physical devicedirectly physical interface directly anduh from eress and ingress traffic youhave every where epppf to uh for uh moveforward your applica data to eachinterfaces and you have faster uhnamespace switching at the end andum um in conjunction with EBPF hostrouting we want to further optimize ournetworking infrastructure and replacethe VT device with netkit pair uh forachieving host level throughput um forour comp container name space and umreducing the latency by using the layerthree routing instead of layer two anduh look up and such a thing andum for what some of our application uhthey should communicateuh across the clusters currently we havetraditional API gateway or something wewe are looking for the sol we lookingfor the solution for that but we checkthat cluster service mesh it's goodchoice for us and using VPF again forhaving the secure pot-to-potcommunication across the multipleclusters that was that is our future uhplan if you are interested about how wedid migration to um selium you can uhsee the link here and recently we did Cselium CNCF case study as you you cancheck as well what uh happened after theselium migration and what changed for usyeah thank you very muchthankshi my name is Bowie and I'll be talkingabout selium scalability improvementsthis is primarily work from my colleagueuh George who unfortunately couldn'tmake it here today so first somebackground on GK's data plane so what isGK's data plane so some of you mighthave heard data plane v2 but really it'san implementation of Kubernetesnetworking with selium at its core sowhy do we use psyllium so Seliumprovides the core Kubernetes networkingfeatures in GKE from podpod connectivityto security and observability and wefind it's highly extensible in aflexible data plane due to the use ofebpf where do we see selium evolving soone of the key areas uh where we seeselium evolving is in the area of scaleso we're seeing that our customers wantto run bigger clusters more pods moreservices networks policies you knowquite a few years ago we saw a clusterwith like a thousand nodes we saw thatwas like wow this is such an achievementbut then the number just keeps growingso concretely getting to this scale willtouch on a number of keys psylliumfeatures so such as scalabledistribution of pod metadata usingpsyllium endpoint slices enhancing the�scalability of network policies andidentities with things like operatormanaged identities scaling Hubble flowmonitoring and handling services with alarge number ofbackends so the biggest cluster that wehave deployed to date um on GKE has65,000 nodes i'm pretty sure noteveryone has that use case but it'sinteresting uh data point to have sothis means selium agent on 65,000 nodesall programming the native networkingdata plane uh contacting the KubernetesAPI server so how did we do this we hadto enable a subset of features not wefound that not all celium features todayare able to scale to that and of coursewe had to make the API server super highscale so one particular challenge thatwe had to meet in this case was that wesaw a very high pod turn rate of up to500 uh pods sort of life cycle changesper second and this is sort of veryinteresting because like how do youhandle that uh in the control plane socurrently um we achieve this byrestricting it to a subset of seliumfeatures so we use pselium IPAM podtopod connectivity kubernetes service andhubble with basic observability ideallywe would like to make this high-scalefeature as feature complete as possibleyou know to the extent that'spossible so kind of looking atscalability dimensions that we hit thecritical ones were of course the size ofthe cluster in terms of number of nodesand what we noticed in this case is onceyou get up there in terms of numbersit's really the pod churn rate that uhyou are limited by and the keybottleneck here is actually theKubernetes control plane all the controlplane operations with respect to lifecycle of nodes and pod churn contributeto kind of restrict the kind ofscalability you can get um and sort ofgetting this control plane and selium'scontrol plane to this scale will be veryinteresting designchallenge so a couple of proposals Ithink um for those of you who sort ofare in the community to watch for is uhthere's this notion um that is beingdiscussed in the community aboutconfiguration profiles so we realizedthat 65k nodes is not necessarilyeveryone's cup of tea but there are afew users who have this as a criticaluse case what selium configurationprofiles tries to capture is that okaylet's capture this high-scale networkingprofile let's figure out you know whatfeatures make sense and like buildtowards that make sure that's testedmake sure these features work reallywell together i'm pretty sure there'ssome other profiles for example someon-prem profile with like you knowcomplex networking like that would beanother use case where we make sure thatall those psyllium features that youneed for that profile are really beingdeveloped for you can think of it kindof like a psyllium persona in some senseum one of the first uh profiles that'sbeing contributed is this high-scalebasic networking profile and I think oneof the things is you know this bigaudience is like go look at that doesthat make sense for you do you can youthink of other profiles that you kind ofyou fit into that would be nice to knowand finally um there's some work uhthese are mostly proposals at this timeon how to do network policies at megacluster scale which is a veryinterestingchallenge and with[Applause]that all right hi I'm Anna I'm asoftware engineer at is valent and Iwork uh mainly on tetragonon which is aseparate project under selium umbrellathat extends uh selium observability andsecurity capabilities uh with genericebpf based policies so here I am to giveupdates about the tetragonon project butfirst I would like to give updates aboutanother project under uh seliumumbrella go bpf go is go library forloading BPF programs it is used byselium CNI by tetragonon by many otherprojects um in the uh cloud nativeecosystem and there is one uh newfeature in this library windows supportso ebpf as as you probably know is Linuxkernel technology uh and um it's Linuxspecific but ebpf for Windows does existtoo uh it is not production ready yet umto use eBPF for Windows you need todisable uh Windows security featureswhich you probably don't want to do inproduction but we are working withMicro�soft to uh make EBPF for Windows uhG and production ready uh EBPF golibrary supports has initial support forWindows so we can start buildingcloudnative tools for uh using ebpf forwindows uh using go and uh why am Italking about it i'm talking about itbecause tetragonon window support isalso coming so uh we are planning torelease uh tetragonon windows initialtetragonon window support in the June uh1.5 release this initial uh support willinclude uh process create and processexit tracing so the core feature oftetragonon um events for processexecution and uh yeah process create andprocessexit um from other featuresum ebpf CPU overhead um so this is quiteexciting um one of the uh perks of usingebpf is that we can provide securityvisibility and enforcement with minimaloverhead and we really like talkingabout it how tetragonon providessecurity with minimal overhead but ifpeople when people ask us what's exactlyis this overhead we are often likeum we need to benchmark it uh we dobenchmarksum as in which we compare system withouttetragonon we install tetragononinstalled policies measure CPU uh memoryoverhead uh by calculating thedifference between um system withouttagon and system with tetragonon so wedo such benchmarks we publish resultsyou can find results um in the umtetragonon blog um but these arebenchmarks run in one particularenvironment which may or may not matchyour production environment uh andbehavior of of tetragonon in um yourspecific case um so we need bettermeasurement of the overhead and weintroduced it in the recent tetragononreleased um so now Tetragonon hasbuiltin um measurement of both CPU andmemory overhead down to individual BPFprograms individual BPF maps so theTetra CLI the like CLI for um debuggingmanaging Tetragonon um exposes thisdebug command where you can uh seeexactly what BPF programs are um loadedby by Tetragonon and how many times theywere executed how much CPU they took theprograms are structured in um filesystem in a hierarchy so um you can alsounderstand exactly how they map to uhtetragonon policies you have installedum this information is also exposed uhvia Prometheus metrics so we can easilygraph um the CPU information in graphanafor example or uh alert if it's too highand similarly for memory uh we alsomeasure EPF memory overhead which umit's mainly um the main contributor tomemory usage is BPF maps uh thatsometimes can grow quite large so uhtetra debug command also can list allthe VPF maps loaded by tetragonon usedby tetragonon how much um memory theyuse um and we also expose promeusemetrics um that link this information touh the tetragonon policies loadedum from other new features um there area few new features um that will bereleased in the um that were werereleased recently in the latest releasum one of them is more advanced policylanguage so Tetagon has a very genericand low-level policy language thatallows you to hook to literally anypoint in um the Linux kernel and extractum any arguments uh processed by the umkernel function and this language is goteven more advanced so that now you canextract um fields from inside a kernelstructure for example from inside a filestructure um that is processed by um akernel file operation another newfeature is a configurable policy mode soTetragonon has two modes umobservability enforcement uh by defaultum you um you get events you observewhat is going on uh but you can alsoswitch a policy to an enforcement modeum so typically what you would do youwould uh apply a observability policyseeum if there is anything suspicious umand after you audit the events um thenyou can switch it to the enforcementmode uh this is also available via TetraCLI if you don't want to manage policiesvia Kubernetes API or you are runningoutside ofKubernetes and uh the last of uh featurehighlights is um cell filters commonexpression language um is a languagethat allows you to write arbitraryexpressions that are uh filtering JSONuh tetragonon JSON events um and you canwrite such filters uh where you areexporting tetragonon events to a logfile or when you are querying them viatetra cli so this allows you toum write very advanced uh rules that forexample detect uh particular CV exploitum or here on the slide I have anexample rule that detects uh an attackersearching for AWS credentialpotentially all right so this is it fromnew feature highlightsum now I would encourage you to uh tryTetragonon if you haven't already uh andif you are using Tetragonon then join umus in the community meeting we have amonthly community meeting uh on thesecond Monday of every month um in at 6p.m europe time um we have very uhdiverse community we the communitymembers range from students who want tolearn something about DBPF to umemployees of large corporations thathave very advanced and very specializedsecurity requirements and everybody inbetween so if you are interested intetragonon please join community meetingum and uh stay in touch with us uh onslack so in selium in the BBPF slackthere is tetragonon channel where all ofthe tetragon developers hang out and uhthis is it from me with that I will handover to Bill[Applause]again thanks Anna um so now I'm going togive you the last updates from thepsyllium communityso the first one uh Linux Foundation ifall this has excited you Linuxfoundation now has a psyllium certifiedassociate exam this is to show yourknowledge to the community and to futureemployers or your employer about yourknowledge of psyllium it's also part ofthe newlyannounced uh golden cubstronautso this is a great way to jump into theecosystem and show what you know aboutthe project selium also released its uhannual report uh we called it the yearof Kubernetes networking talking abouthow Psyllium has really taken over thenetworking stack from layer 2 to layer 7and really providing you a completesolution for whatever your needs are incloudnative networking it's a greatsummary of everything that happened inthe last year in the project and I'drecommend that you check it out uh interms of what's happened here so far atthe week uh my week and for a bunch ofthe others selium developers actuallystarted on Monday with the psylliumdeveloper summit this is the third timewe've had that we bring all thedevelopers around the world into oneroom to discuss the future of theproject um we have them before eachCubeCon so if you're interested ingetting more involved in the project ortalking about a feature this is a greatway to do it we'll have it inNA2 uh Selium is now kind of trying toincrease the scalability of not only theproject but also the community too soSelium now has SIGs or special interestgroups uh the first one is actually SIGscalability too um so if you'reinterested in that one uh come to thisone or if you're interested in adifferent part of the project feel freeto come by and suggest a different SIGthat you think the project should haveselma is an open source project underthe CNCF we're always looking for newcontributors new uh contributions uhthere's many ways to get involved in theproject uh where you can start in thepsyllium community repo to learn abouthow you do it for both code and non-codecontributionsum and one very easy way to get involvedand to help out the community is to do acase study with the CNCF uh we havethree new ones from DB Shanker as wetalked about from Queen Cloud and fromCIS 11 but if your company is usingpsyllium in production and I saw halfthe hands in this room go up uh pleasecome talk to me we'd love to have astory about how you're using psylliumand benefiting from it so you can tellyour story to all the other people andhelp other people learn about thebenefits ofpsyllium and then the last thing to echoonce again we're always looking for morecontributors uh we have developer weekuh developer meetings every week uh onWednesday uh in the afternoon Europeantime or in the morning in the US there'salso a monthly APAC friendly meeting andTetreon also has a weekly meeting tooplease stop by and tell us about howyou're using the project or what you'dlike to see next so with that uh thankyou for coming and I think we have maybeone minute for one question[Applause]2025-04-15 22:00:11.886200 &&��=�O#��1AkYT7KV_Cijsokay this is the Silia maintainers tracksession uh another standing room onlyaudience this is great to see i thinkthis is about my 20th CubeCon um I'mprobably about my 10th as a Seliummaintainer and it's always great to seeall the excitement and interest inPsyllium so how many people here areusing Psyllium rightnow okay pretty impressive that's greatto see and all the people with yourhands down I think by the end of thissession you'll understand why everyonearound you has their hands up so let'sjump into it so if you in case youaren't familiar with Psyllium uhPsyllium began as a CNI a networkingplugin um for Kubernetes and it wasbuilt around EVPF as the underlyingtechnology powering all of it since thenit's expanded to encompass cloudnativenetworking observability through Hubbleand security with Tetragonon and thecore thing about Psyllium is that all ofthis is powered by the kernel technologyebpebpf on the networking side it'sexpanded from just simple flat layer 3networking as a CNI to including thingslike cluster mess super popular rightnow with everybody running tens hundredsor even thousands of clusters egressgateway cube proxy replacement BGPgateway API all these fun things fromlayer 2 all the way to layer7 on the observability and security sidepeople love Hubble it provides a servicemap that allows you to view and debugall of your network traffic and it'sprobably one of people's top favoritefeature about uh Selium once they diveinto it and then on the security sidethe other top reasons I see peopleadopting psyllium is around transparentencryption of network traffic forcompliance and security reasons and alsofor network policy um people are usinguh network policy to secure theKubernetes clusters because we knowsecurity is becoming a really hot topicright now in cloud native so with thatintroduction out of the way um I'd liketo give a little shout out to Seliumit's now become the most starredEVPFbased project uh on GitHub so Ithink that's a really exciting milestonefor the community in the past 10years uh beyond that uh the CNCF at lastCubeCon uh introduced the Psylliumproject journey report it dives into alot of the things that are happening umaround Selium including the growth ofcontributing people and companies um andis really amazing to see Selium as thethird fastest growing project in thewhole CNCF ecosystem out of over 200projects on top of that people areactually putting it in production yousaw the hands raised here but in themost recent CNCF tech user radar onmulticluster management uh from endusers in the CNCF ecosystem Selium wasrated as the most mature and the mostuseful solution for multiclustermanagement i know of what I've seen sofar at CubeCon this has become a reallyhot topic so if yo��therein there's polaris there JS policythere's cloud custodian I probablymissed six different projects again wedon't have time to cover all of thosetoday so instead I'm going to hand itoff to Joe who's going to talk aboutI'm going to talk about policy withinKubernetes thank you Andy Umso within Kubernetes um policy has beenkind of available since about the8release when we introduced um admissionweb hooks and this was kind of thefoundational extension point that youcould use um the way it works is thatyou can intercept all write requestscoming into the control plane and thatgives you the ability to um control umwhat is happening in your cluster um butyou do that in a very general way rightlike you build this new binary whichbecomes part of a control plane and it'sa really a critical component of thatcontrol plane because it's part of therequest serving flow so if it becomesunavailable your control plane becomesunavailable and so you as a extensionauthor you have to take that reallyseriously you have to think about wellhow am I going to develop this thing howam I going to maintain it what's theupgrade plan is it going to scale to theneeds of all the clusters it's installedin um all the things monitoringeverything that comes along with withhaving a criticalcomponent and what we saw in theecosystem since 18 which is a long timeago is um there's been a lot of problemswith admission web hooks in clustersI've seen that in GKE we've seen it inthe ecosystem at large and when we werestarted to look at this problem and asklike how can we help the ecosystem dobetter here how can we make the web hookproblem less of a problem for thecommunity one of the things that weobserved VED was that the vast majorityof web hooks were doing really simplestuff now it was kind of this 8020 splitwhere often times you would havesomebody that wanted to build a CRD anda controller and they didn't really wantto build a web hook but they end uphaveing to make one because they neededto enforce a bunch of validation rulesat the frontdoor and so from that came thisobservation of like well is web hooksreally the right solution to that kindof problem i mean logically what they'redoing is they're putting this little bitof custom logic in that does some checkand if the check's false they don't wantto let the request into the API serverbut if it's really simple logic why notjust put it straight into the KubernetesAPI server in the firstplace and so we looked into how to dothis we looked at Wom we looked at abunch of different embeddibleprogramming languages and the one thatwe found that we really liked was calledum the common expression language um andthere's kind of three properties weliked fromum language the first was that it wasdesigned to be embedded into things likeYAML um it's designed to be these kindof single line expressions the secondthing we really liked about it is it wasa relatively unsurprising C style syntaxso if you've worked in modernprogramming languages you're probablygoing to be able to figure out what thiscode does and you're probably going tobe able to write it without having tolike be taught how to do itum and then the last property which isreally important is that Cell has arelatively low overhead for execution umso you're talking like 5 to 10x ofnative code um but we still get runtimesafety so we still get to um bound therunning time and the memory utilizationof the cell programs that are run um sohere is a example of cell being used inKubernetes this was the first feature weadded cell for what's called CRDvalidation rules and the way that thisone works is that you're allowed to addcell expressions into your CRD schema toperform more complexvalidation um this is a cross fieldvalidation where you're checking twofields you can also do updatevalidations like immutability checks umthis has been really successful so largeprojects like API gateway have been ableto use this um to completely migrateaway from a web hook and use only umonly code in their CRD to enforce alltheir validationrules all right so that's kind ofinteresting but I have�n't really talkedabout policy specifically and so thatgets us to our more recent features umthe first one is called validatingadmission policy and this is a directsubstitute for a admission web hook solike you can have a validating web hookconfiguration you can have a validatingadmission policy so somewhere in herethere's got to be some cell right um soif you look to the left um the the mainresource under the spec validationsexpression that's where you'll see thecellexpression this is an entirelyself-contained policy you don't need towrite any separate binary to load thisinto your cluster so once you've loadedthe policy which is usually authored bysome kind of policy or extension authorthen your cluster administrator which isa separate role typically will bind thatinto their cluster which is the middleresource and then optionally you canalso can configure the policy with aseparate resource if you need to um soin this case you're setting the maxreplica limit to three um and you'rebinding that to particular resources inthe clusternow like we haveum usually what you're doing is you'relooking at the fields in the resourcebut you can additionally performauthorization checks within theseexpressions so this is showing that umsome of the things that you can use incell so a great example is imagine thatyouhave a label that you only want certainprivilege users in your organization toset or change um so you could set up apolicy that does that first you wouldcheck to see if the label's beingchanged and if it is then you wouldcheck if that user is allowed to makethat change according to some customverb you can define and then in yourarbback system you just give peoplepermission to that verb if you want themto make thatchange um in addition to validating youcan do mutation mutating policy this iscurrently in alpha um the way it worksis the same as validating admissionpolicy except instead of returning a isthis request allowed response you'rereturning a patch which mutates theobject so this example shows um asidecar container being injected um youcould do this this way or you could doit with a JSONpatch um okay with that I will hand itover to Ritaall right uh and another option uh uhthat has been pretty popular in theKubernetes world is the OPA gatekeeperproject um so the project's actually Iwill start with OPA uh open policy agentit's it was actually a really popularrule engine that was donated to CNCF uhin I think 10 years ago if you were inthe keynote you probably saw one of thelogos for CNCF project so it actuallyhit the 10-year mark uh and back in 2018bunch of us actually worked togetherwith the open maintainers to bring allthat uh enterprise rule engine goodnessinto Kubernetes ecosystem uh andgatekeeper uh you might ask what is itright um it's a dynamic flexibleemission and mutation web hook also ithas a CLI and it's a controller that youcan run in your cluster to help enforcepolicies and the idea behind gatekeeperis you write the policy once as code andconfigurations and you can run itanywhere um the key differentiation forgatekeeper is you can write it in mmultiple languages it started as regobecause that was the default language inOPA um and then recently we added cellsupport because again we want to makesure uh you can write the language ofyour choice uh and today might be reggowho knows what it will be tomorrow umand also with that it also works withmultiple engines uh again OPA obviouslyand then Kubernetes VAP as Joe mentionedum and last but not least ismulti-target and the idea is the samecode that you use for Kubernetesemission you can use it for Terraformyou can use it for other type of umtarget so Gatekeeper is a target forKubernetes emissionum and another uh big difference forgatekeeper is separation of concerns uhwhen we worked with organizations a lotof times the people who write thepolicies are often different from thepeople who are actually deploying thepolicies um and therefore we createdthis um solution that basically allowsuser different persona to managedifferent type of resources for exampleif you're rolli�ng out the policies youmight care about the specific parametersthat you put to actually do theenforcement and so that resource iscalled a constraint and it's decoupledfrom the logic where the the people whowrite the policies will actually writethe rules and again this ensures thatthe policies can be configured and notinstead of actually focusing on writingthe code uh and then also uh gatekeeperbecause it started a long time ago wehave a really a good um community thatactually contributes a lot of thepolicies that you would actually you canuse right away um and it's a gatekeeperpolicy library uh in fact when um wewere deprecating PSP in the early daysum gatekeeper library was used to uhhelp usersmigrate um it was uh OPA was graduatedin 2021 and there's a lot ofcontributors from many companies likeMicrosoft Google Righ Dyra VMware andmany many more and I'm sorry if I forgotto mention you um so yeah so you mightask like well um VAP is great and map isgreat because it's more reliable um sowhat are the differentiation when shouldI use gatekeeper right or any other webhook for that matter um so I'm not goingto mention the the ones that they do thesame things um so for example for auditlet's say you want to audit existingresources and you want to um stream thatdata somewhere else in your uhenterprise like uh management system sayfor tracing or um complian reasons thisis where gatekeeper can do that um andit basically provides a history of umthe violating resources that arecurrently in the cluster and also shiftleft right we know that developers wouldmuch rather um enforce these policies intheir code and this is where we have aCLI called gator you can actually run itin your CI/CD pipeline and it willactually can be used to test yourpolicies as well as the resources thatyou plan to uh deploy to your to your uhproduction clusterum and then there's also metrics rightit's already comes with um Prometheusmetrics and then there's another conceptof context awareness for things that arealready in the cluster so think of thescenario where you want to detectuniqueness of your ingress resourcesthis is where you can again create agatekeeper policy that allows you to dothat um and then external data sourceslet's say you want to talk to anexternal data source in your policy youcan also do that um and then last butnot least pub um let's say you want touh subscribe to violations and you wantto uh stream that somewhere else that'sthat's another uh feature thatgatekeeper does wellum so in terms of um how does it workwith VAP um uh as Joe mentioned um thethe concept behind Gatekeeper is we donot want to compete with VAP we love VAPokay um and in fact we want to be thefront end to unify the experience forthe for the user meaning we want you towrite your solid expressions in oneplace and Gatekeeper will actuallymanage both um VAP for the common usecases and nonVAP for the more advancedscenarios where you you have to use aweb hook and because um the there wehave uh multiple language support youcan write in the language of your choiceuh and and it will basically generatethe VAP resources for you so that theinry kubernetesum uh emission controller will actuallydo the enforcement meanwhile for nonvapum advanced scenarios that wouldactually be handled by the gatekeeperwebhook um and again this is to uhespecially for big teams um you can usethis solution to share your policydefinitions across both VAP and uh moreadvanced features like audit andreferential checks and externalprovidersso here's an example of what um agatekeeper uh constraint template lookslike and as you can see you can actuallyprovide multiple engines uh one forKubernetes validating policy and that'sthe cell expressions that you see andthen another one is Rego again you canwrite it in the language of your choiceum and then for mutation again becauseit's a web hook there's a a bit moreoverhead and and therefore for uh basicmutation features we highly recommendthat you use the entry map feature inKubernetes but for the more advancedscenarios like calling out to anexternal data provider t�his is whereGatekeeper shines why is that becausegatekeeper from day one was designed forglobal item potency meaning when youhave multiple mutation happening at thesame time or happening in differentorders different reinvocation path thethe results are always the same so it'smore predictable um and it allowsexternal data providerum uh calls to be made in parallel thatis really really important becauseanytime you add any uh request out toanywhere during an admission time you'readding time into your web into yourKubernetes emission request time so thisbasically reduces the number of callsthat you have to make to an externalservice and therefore making it morestable andscalable and you would create theresources with the gatekeeperCRDs and with that I'll pass itThank youRita so Kiburno started about five yearsago so you know prior to cell andvalidating admission and mutatingadmission policies but after OPA andgatekeeper so when we looked at thelandscape we took a different approachto policy so we believe policy as codeis not just about validation orenforcing security but it's more thanthat it's about automation as well it'sabout reducing the overload of you knowadditional controllers so Kivero fromthe very beginning has taken an approachthat mutation generation cleanup youknow and also image validation shouldall be part of your policy toolbox andthis really is important if you'rebuilding platforms on top of Kuberneteswant to provide self-service todevelopers you're going to need all ofthese features either through customcontrollers or you can standardize on apolicy code solution like kerno so thatfull life cycle management again is isvery important across all of theconfigurations and when we started thefocus was just for Kubernetes so theinitial question was how do we make itsimple as well as manageable forKubernetes admins to manage policy typesand extensions without having to learnadditional languages or additionalcomplexity so that was the startingpoint of Kivero but overall after as weprogressed and as the use cases grewkivero is also now able to applypolicies on any JSON any YAML you knowpayload whether it's in your pipelinesor in your API tier or pretty muchanywhere in yourstack so why change and what should weyou know do in terms of the evolution ofKerno as Kubernetes also continues toevolve so first off as these new typeslike Rita mentioned as validatingadmission policy mutating admissionpolicy have been added to Kubernetes itcompletely makes sense to delegatewhatever you can to the API server rightso Civero also is able to you know takepolicies written in cell expressions itautomatically generates validatingadmission policies and mutatingadmission policies so they can be run inline wherever possible and forextensions for other features which arekerno only those will continue to run inthe web hook you also get full reportingfull other life cycle managementfeatures that kivero already supports umso in addition to that of course it'slike standard when kivero started againwe were using James path as one of thelanguages that was supported uh but wenow also have full support for cell asin order to support some of thesefeatures so with 1.14 which is thelatest release of Kivero we'reintroducing five new policy types tocontinue this evolution and you knowkeep that you know be as close toKubernetes as possible but with theright features and other extensions thatare required for full policy as code sotoday I'm going to just focus on acouple of these the validating policyand image validating policy but thereare other talks and sessions and happyto cover more at Kimo booth or at otheryou know discussions um so just divinginto validating policy right andvalidating policy is Kerno's flavor ofor the extension on validating admissionpolicy so the way we have cleanlyabstracted and extended validatingadmission policy is by adding a few newyou know tags within the declarationitself so there's things like you knowthe evaluation mode whether you want toevaluate a Kubernetes resource or JSONyou would declare that there's thingslike we�b hook configuration which onlymakes sense of course if you're runningin a web hook so those type of thingslike timeouts etc you can declare thereand then autogeneration so caberno fromthe beginning has supportedautogeneration of policies for podcontrollers now we also do validatingand mutating admission policies sothat's controlled by the autogenerationso if you look at this declaration itlooks very familiar to any Kubernetesadmin who has already spent timelearning validating admission policiesit's just a few extensions on top ofvalidating admission policy but alsocomes with a different you know cellenvironment so if you do want to useadvanced features uh in Kivero such thatit uh like currently for example ifyou're fetching data from imageregistries so many cases like there'suse cases you might want to call out toan OCI registry you might want to checkfor the manifest configuration and thenbased on that make some policy decisionsthings like that you can do with theKivero cell environment and some of thiswe're working with the Kubernetes youknow upstream maintainers like the imageparsing you know cell uh functions wehave written in Kivero to promote themto upstream over timeso just comparing and this is only acomparison of validating admissionpolicy to validating policy it's not youknow we're not going to be able to covergenerating mutating and the other typesbut just with this the key points hereare you know if you need things like umuh like if you're doing background scansbecause policies are not just applied toresources when they are created but youmight have existing resources you mightchange policy types so backgroundscanning becomes really important uh foryou know for uh managing your policiesif you need fine grain exceptions likekivero has a feature to manageexceptions per image in a container soat a very granular basis you can controlwhat's excluded from a certain policyand manage that separately from thepolicy declaration itself and if youneed you know reporting which u kiverosupports um the policy working groupreport format which is a standard umwhich is also adopted by other projectsso if you need reporting withinenvironments uh within your namespacesor within your clusters those sort ofthings you would use Kivero for but itcan also support and by the way itsupports all these features also for thebuilt-in validating admission policy aswell as mutating admissionpolicy another quick example is theimage validating policy so this is whereKivero works with both notary as well ascosign and it can check for imagesignatures as well as attestations sohere again just keeping that same samestyle starting with a validating policyso a lot of this looks very familiar youhave your validations which are justcell expressions but you can checkthings like whether an OCI image hasbeen scanned by your you know yourscanner of your choice forvulnerabilities whether there's an SBOMcreated what vulnerabilities existthings like that you can check and alsomake sure that that image was signed bythe right authorities so all of this issupported in the validating admissionpolicy and the feature I really likeabout this is if you want to now takethis and apply this to any othercontainer container image whether it'syou know like let's say some othermanaged Kubernetes service or even aserverless container image it's a simpletwoline change to go from a Kubernetesresource to a JSON payload and to beable to apply that policy so in summarykivero is you know what we have ourmission is to simplify policy managementfor kubernetes but also make thatavailable to non- kubernetesenvironments which platform engineerswill need and we want to you to makesure that policies are built in a way sodevelopers operators as well as securityteams can all collaborate on the sameshared resources and use standardKubernetes tooling standard life cyclemanagement wherever possible to keep thelearning curve as easy um or as low aspossible and be able to kind of thenuniversally use these policies whetherit's in pipelines admission controllersin your clusters or even in APIs againstcloudservices with that Andy is going tosummarize and wrapup all right so that was a lot a lot ofgreat info thanks y'all that wasfantastic um so just to wrap it all upand give a few pieces of guidance goingforward um Kubernetes core is going tocontinue to support more policies codeit's going to be there uh what is thereis currently very stable and we're goingto see more of it um wherever possibleintry what we just talked about what Joejust talked about is going to be fastermore reliable and a great baseline butyou're probably going to need somethingto expand beyond that um you shoulddefinitely learn cell uh if you haven'tstarted and you're looking to do policycell isn't going anywhere uh I think allthree of these folks just talked aboutimplementing cell in different ways soit's not going anywhere definitely learnit um you know and just to outline kindof the different places where we haveintry options and then we can expand tooutofree options uh we have ON umstructured ON is in beta right now thatuses cell as well uh for Oz we havestructured Oz which is currently GAwhich also uses cell uh arbback++ is indesign in sig oth uh see sig oth formore information about that uh I believethat will also be using cell uh and thenwe can extend that with these otherprojects uh with other web hooks foradvanced scenarios uh for validation alot of what we talked about today wehave vap we have map uh and then we canextend that with additional optionsoutside of the cluster uh and then formutation map is in alpha currently umand then we again have extensions likewe talked about today um all of that iscurrent as of 1.33 that may you knowthese will all start to graduate throughthe different stages but as of 1.33 thisis this is accurate uh and then the lastpiece of advice everybody wants to usemutation foreverything try to keep it to a minimumit's expensive and it can be risky so uhfocus on those controllers where youcan and I think we might have a coupleminutes for questionsfour minutes four minutes great job allrightdoes anybody have any questionshello you all uh I have a question myname is Andre Bendorp i work with H&M uhI would like uh guidance since I havethe opportunity to ask the the mastershereum how can I just in in a platformengineer scenario where we have a wellcontrolled CINC CD how can I justify theoverhead of admissions web hooks andvalidation web hooks um in a performancescenario forcost like instead of using CIND wherethe developers don't have the actualaccess to the infrastructureconfiguration[Music]um then everything is blackbox for themuh why use uh mutate web hooks vate webhooks instead of configure on the CI umso if if you can do something simply andquickly you probably should in aperformance scenario if if you need topunch out to a web hook because youcan't express what you need to expressum through the built-ins then I wouldencourage you to um look web hooks nowinclude something called a matchcondition um it which is self-written incell um and you can very tightlyconstrain when your web hook is calledso if you do think you're going to needa web hook but only in very specificsituations use that to very tightlyconstrain the calls so that you onlycall the web hook when you actually needit that's probably about the best youcan dodoc I was wondering um with admissionwith mutating web hooks how do thosetend to interact with say deploymentplatforms that are looking for diffs andtrying to actively eliminate them with anew deployment do you just get stuck ina loop yes a good question um andthere's a CNCF blog post on this topicso both Argo CD Flux which are you knowgithops projects um they also nowsupport serverside apply which allowsthem to coexist with mutating web hooksso it really comes down to and like Andywas saying it's not one or the otherthere are use cases for both so you haveto pick you know when you apply whichtool for the right use caseall right I think we're right on time sothank you everybody and we'll be aroundfor the rest of the week if you have anyother questions take care2025-04-15 22:00:12.513788 II��P#��cAw1wh9dc6m34all right hello everybody welcome to apractical guide to Kubernetes policy ascode i'm gonna skip the agenda you'llsee it it's coming i'm Andy Sudtermani'm the CTO of Fairwinds i'm also a uhco-chair of the policy working group i'mjoined here by Joe Betts a staffengineer at Google who is the SIG APImachinery lead rita Zang principalengineer at Microsoft who is the chairof SIG O and the founder and CEO ofNurmada Jim Buguia who is a co-chair ofthe policy working group with me and amaintainer ofCaverno we're here to talk about policyand policy is code so first of all whyshould we care about policy this is avery simple graph from one of the whitepapers that the policy working group hasput out but basically what we're tryingto say here is that policy underpinsjust about everything that you mightcare about in your Kubernetes clusterwe're going to talk a little bit moreabout that and then dive into somespecific examples of differentpolicies um recently uh Jimmy Rayanother member of the policy workinggroup wrote a book called policy is codewe say he wrote the book these days umand he said it is the use of codeartifacts to manage and apply rules andconditions it's a pretty simpledefinition i think we can all wrap ourheads around that but if you look acrossthe Kubernetes codebase you'll see thatthere's policy in many many differentparts of the codebase arbback is policywe have validating and uh admissionpolicies we have ON and OZ which aretypes of policies we have network policywe have cublet configurations we havequotas limit ranges resource quotas wehave all these different things that arepolicy um we don't have time in 30minutes to cover all of those today sowe're just going to cover a smallportion of those uh in the next fewminutes but why should we do policy ascode why shouldn't we run individualprograms to enforce policy um and reallyit's just all the reasons that we careabout doing things as code it's verysimilar to infrastructure as code it'smore maintainable it's more efficient wecan uh all inspect it together we can docode review on it we can see how uhpolicy uh is working within our systemuh and Kubernetes provides a lot ofdifferent patterns for this uh we havepolicy enforcement points already builtinto the code everywhere so uh with thatI am going to oh apologies one more uhquick disclaimer again we're not goingto cover all the different projectsavailable there's lots of entry optionsthere's gatekeeper there's caberno��ng in futureuh also with this framework we uh itallowed us to have some automationaround these different stability levelsuh provided as a mechanism to centralizeall the instrumentation related codeunder component base a centralizedplace so like I said uh every metric inKubernetes now will have one of thesestability levels alpha being uh theleast stable uh has no uh stabilityguarantees and uh a metric goes on frombecoming an alpha metric towards astable metric which has the moststability guarantees if you want to knowmore about the stability levels thefirst link is where to go and we alsoestablished a elaborate process fordeprecating a metric so that you are notrandomly breaking uh metrics users andfor that uh you could visit the secondlink here uh next I just want to callout a handful uh of some useful metricsthat would be useful while you're tryingto debug issues in your Kubernetesclusters uh the first one is featureenablement metric this is a beta metricwhich will expose uh all feature gatesthat are enabled at a given moment intime uh in a component uh this is givingyou also the information about whichstage that particular feature gate is atuh whether it's alpha beta orGA uh we also have component health SLISthese uh are uh exposed in a metric SLISendpoint as you can see here uh theythis endpoint is exposed by allcommunities control plane components andhas these two metrics uh these reportessentially the uh results of thelivveness and readiness checks that aredone for each component uh these givethese are useful if you want to computeavailability stats for the Kubernetescontrol plane components and becausethese are telling you about theavailability for uh these components youcould also use these metrics to uhmeasure the success rates for yourKubernetes upgrades especially acrossminor version boundaries because thingsare most likely to break there um thesemetrics are super low cardality metricsand thus can be uh scraped at a higherfrequency allowing you to have a moregranular view of the availability forthe control planecomponents we also have metrics aboutour metrics so we have the total numberof registered metrics for each componentbroken down by what stability level uhthey are each at if that metric wasdeprecated we also record that uh in thedeprecated version label we have othermetrics like total number of disabledmetrics number of hidden metrics foreachcomponent and for all of these metricswe have a nice uh autogenerateddocumentation that's available in theofficial Kubernetes docs repo uh websiteum and this is a nice place for you tovisit if you're trying to debug issuesin your cluster and you want to quicklyknow what all uh metrics are availablein the particular component you'retrying to debug this will give youinformation about what metric each whatmeasurement each uh metric is recordingand the schema for each of thosemetrics that was all about uh metricsvery recently we have introduced a newway uh to debug Kubernetes uh this is aseparate set of HTTP endpoints uh thatevery Kubernetes component uh exposesand we are calling these endpoints Zpagesuh these were introduced in 132 where weintroduced two Z pages uh status andflag Z uh this was uh exposed by cubeAPI server at the time statazy gives youdetails about the component binary uhthings like when it was started how longit has been up what go version it wasbased on what it is what is its binaryversion emulation version and can ofcourse be uh evolved better to includemore details as it grows flag Z givesyou details about all the command linearguments that were used along withtheir values while starting the binaryand I'm happy to announce that both ofthese Z pages Statis and Flags are nowavailable in all control pin componentsstarting 133 so they're not just limitedto API server you can find them forcublet umuler and the rest of them justensure that you're enabling thecomponent status Z and component flag Zfeature gates while starting yourbinaries one note about the usabilityfor these uh endpoints is that they onlysupport plain text format and are not astable AP�I yet these are only meant forhuman readability so uh they're notrecommended to be parsed by machines uhtheir response schema is subject tochange in future um but we are planningto make it more stable as the featuregraduates to beta so stay tuned for thatand we do have more ideas aboutintroducing more Z pages for componentsand if you have ideas about signals thatcan constitute a new Zpage then pleasedo join ourmeetings uh that was about Zpages andI'll pass it over to Damianyeah so logs have been around foreveressentially in Kubernetes and we use acustom logger in order to produce themwhich is called K log and it is a forkof the go the Google logger which isspecifically modified to supportKubernetes use cases it has integrationwhere we can easily convert Kubernetesobjects to their string representationfor example pods and nodes and it makesit easier for developers to produce logline as a whole uh it also conform tothe logo interface which allowed us toextend the functionalities of uh thenormal Golang logger uh the normalGoogle logger um but if you've beenaround in the Kubernetes ecosystem forquite a while you might have seen thatonly the text format used to besupported and there was no realstructure in the log lines that it wasalways just a string based uh log thatdidn't have any pattern that you couldbe bas yourself on so that's what we aretrying to invest in in the past coupleof years which is to add a structure tothis log uh to make it easier to queryuh to make them easier to analyze andfor essentially to make it more tooptimize the debugging uh time that youspend on Kubernetes um and to do that uhwe could have used for example uh thenew uh slog logger that was introducedby go one and or one half year ago uhbut we introduced structured logging waybefore that so we decided to stick withK log and added support for structuredlogging in K log and this essentiallytoday makes it easier for you uh toingest uh the logs that are produced byyour Kubernetes cluster into your uhlogging platform and also we support anoptional JSON format that allows foreasier use of logging query languagesuch as log um so this was a hugeimprovement uh and this is an example ofwhat exists today you can have thepattern at the top and below is theexample in the text format which is whatyou would usually use in your Kubernetescluster and here's the example with JSONin like it's way cleaner and easier toanalyze compared to before what you usedto do with GP and reg X now you can usemore advancedtools but that wasn't enough like wenoticed that it was hard to integratethe new uh structured logging acrossRoss the communities codebase and weneeded to simplify the way the data arepassed down uh in the log in the codetree so we decided to introducecontextual logging which essentiallyattach a context to uh the differentcall sites and then we can set forexample the pod name attach it in thecontext and we know that all log line ubelow like down the co the code treewould set the pod name and it will beconsistent across all the log that areproduced by our componentsum and if you want to contribute tologging like there's still a lot ofimprovement uh to be made uh the featureis not GA yet um so you can contactPatrick Collie or join the the workinggroup structure logging that meets uhbi-weekly uh and it's definitely a goodway to start your contributionjourney um now I'll talk about tracingwhich is the newest addition I guess tothe signals that we support inKubernetes if you are not too familiarwith tracing we essentially dodistributed tracing tracing inKubernetes which is uh and it covers thegap where systems are becoming more andmore difficult to understand for endusers there are multiple microserservices involved and it's hard tounderstand which one impacts yourworkload in the case of Kubernetes whenyou when your clients send a request tothe API server latency can occur at theAPI server level it can occur from thestorage so HCD and it's hard to figureout where it's coming from but withtracing it allows you to get like a treelike overview of what's happening to a�request and below on that slide you cansee for example for NPS request youwould get a a span for the time it'sspent into the API server and the timeit's spent in that CD in thestorage uh so we introduced tracing intwo places in the code uh in the in twocomponents in the API server and cubletuh it was it became beta in 127 uh wewanted and David Ashpool mentioned thatGA was targeted for 133 uh but there wassome delay and we are now planning for134 um this is an example of how itlooked like today for the API server cansee uh what time how much time is spentfor a request in the API server in thedifferent areas in the API serverbecause there's some serializationthere's some compute that is happeningand also the time incd so now you canclearly see what's happening to yourrequest the same goes for cublet if youever try to optimize or debug thecreation of a pod you could see thatthere's a lot of interaction betweencublet and your container runtime and alot of things are happening for exampleyou pull an image you create the sandboxand so on so these are the steps thatare recording in the traces and give youa better overview and makes it easieralso to understand how these all thingsworksum but we've noticed that integratingtracing to our signals such as metricsand logging wasn't trivial and wasn'tlike instinctiveuh because the current debugging passand the whole debugging pass that allKubernetes user went through is usuallythat they would look at alerts ormetrics and then based on their findingin the alert and what is reported theywould search through the logs viadifferent patterns and tracing wouldjust be like a side tool that would beused in some niche cases but it wasn'tpart of like the debugging experience orlike the general debuggingexperience and what we want to do is tomake it to include it to the generaldebating pipeline so a end user couldlook at an alert and look at metricsbased on this metrics um they would beattached exemplar which are essentiallya sample of a trace for one particulardata point in your metric so for exampleyou can see some dots on the graph andthat's the exampler depending on your uhmetrics on how your metrics look likeand then from this exampler you can findthe exact trace that is responsible forlet's say 50 millisecond latency foryour request uh and then most of thetracing UI these days allows you to jumpfrom a trace to a log as long as thereis correlation between both so you needto share the context between your tracesand your logs and then you can jump andget even more detail about what happenedto that particularrequest um but this is still in progressuh we've introduced examplar to some APIserver metrics in 132 but it's not yetapplied to all metrics in Kubernetes andthere is some uh additional work that weneed to do in our library to make themeasier to integrate for our for thedevelopers of Kubernetes in general umand the span context in structuredlogging David Ashpool mentioned alsolast CubeCon um that we targeted 133 butwe've noticed that there is someperformance overhead that might need tobe investigated so this is still inprogress and now I will talk about thedifferent sub project that we havethanks Amian um so going back to the subprojects um I'll go over not all of thembut some of them that have had recentdevelopments that are absolutely worthmentioning um I'll start with usagemetrics collector so for the while um weobserved this gap in cublet um going tocatiser where we weren't able to exactlyhave high fidelity metrics and by that Imean metrics with a gran granularity ofless than 15 seconds so we came up withthis collector which actually allows youto achieve per second highfidelitymetrics and in doing that it allows youto have much more granular realtimedashboards and um more the activeautoscaling because you can use thesemetrics and plug them into um you knowyour autoscaling solution it could bemetrics or it could be prometers adapterit could be kada um on top of that umUMC does not export the metrics itoperates on and because of that you saveon your metrics backend storage as wellas y�ou also save on compute because theaggregation is performed at collectiontime and because the aggregation isperformed at collection time you don'tneed to be dependent on promql there's astraightforward configuration that youcan configure and have your metricsstraight away um I do want to give avery well-deserved shout out to ElanaHashman the previous co-chair um becauseshe added Croup's version two support inUMC recently and because of that we areable to leverage features in moderndistributions as well as container thantimes and one example of that is now inUMC we have much better much moregranular and insightful memory feedbackas well as swap controls um it's all onthe website um on the um uh GitHubrepository um the next uh sub project Iwanted to mention is uh probably thebread and butter uh of SIG uh it's KSMcube metrics uh the most popular um subproject that we have and it's thestandard um metrics generator forgenerating metrics for your um native umKubernetes resources you can see thelist on your right uh which are whichare basically the supported resources asof now if you want to add more you'reobviously to send um a PR um and itbasically achieves this by monitoringAPI server and updates these metrics inrealtimehowever we did realize it was just amatter of time before you know usersasked for metrics for custom resourceswhy would we leave those out do we needto write a collector for our own customresources so for that we came up withCRSM which is um custom resource statemetrics and you can see theconfiguration on your left here and themetric on the right so you basicallyhave collectors as configuration youdon't have to write your own collectoryou can just configure put it in theYAML file and deploy that um again therewere limitations to this so um what weobserved down the line is um you knowhow it after a particular time um thatit evolved um when we tried to add a newbehavior or the new field to theconfiguration it kind of conflicted withthe existing ones because the way thiswas written earlier was in a zerodependency just you know down to andhashmaps uh manner so um that kind ofprevented us from scaling and deliveringum you know uh delivering this to um therequirements basically delivering on therequirements that the customers had inmind um and more importantly anythingthat affectedCRS would also potentially end upaffecting KSM so basically because thesewere coupled together if CRS goes downKSM goes down and that is bad becauseCRS metrics don't have stabilityguarantees but KSM metrics do so we needto um basically fulfill those SLOs's umand that's not the case when there's abug in CS so um and of course there wasum the configuration itself was veryinvolved after a point of time becauseit got involved um but evolved again andagain and all these behaviors were justum being added to it um so enterresource state metrics uh the cap is forthat actually we are very close to alphagraduation um and how RSM fixes this isbasically it first of alldecouples CS fromKSM um I should mention this explicitlythat you can also you can obviously haveKSM running in your cluster as well asRSM and have them both operate at thesame time to establish this completemetrics solution for resource schematait could be custom resources it could benative resources whatever you needmetrics from um from the whenever youneed it from the schamata of these youcan always have both of these deployedand you should be sortedum also um the RSM configuration thatI'll just show you in a bit is asuperset of CRS configuration soeverything that you have in your CSconfiguration as it is right now itcould all be ported because we haveconformance tests as of now and theseare 100% um with a 100% coverageum so the place uh I would say the thething the um the facet where RSM reallyshines is it introduces this concept ofextensible dissolvers so you couldessentially you don't need to learn aabstract DSL now you can have your ownso whatever's hottest in the communityright now it could be Cell right it'sCell right now and there's already asupport for cell resolver so you see itemphasized on your left there and thisso these are basically two metricfamilies that are being generated rightnow and the first one uses cell as itsresolver the other one uses no resolverso it defaults to the unstructured APIthat's maintained by cigarum for you know more simpler casesbecause that does not support addis orhashmaps as of now uh but you can seethat the label values and the values arenow I mean you can write them in theresolver of your preference if there'sanother expression language that comesinto picture you can always extend thatand add that in RSM and you should begood to gohowever um even with that um the problemis that expression languages are turingincomplete so you cannot always havewhat you want and there would be alwaysthis need to be unblocked right um or itcould be that another um resolversupports it another language supports itbut you don't have the time to have thatmerged downstream um have that mergedupstream and then just wait foreverything to happen and then kind ofuse that use that so for that you canimplement the collectors interface anduse this as a library and you should beable to define all your metrics inGolang itself and that's the tuningcomplete way of doing that so you willnever be blocked um and that's basicallywhat we wanted to get out of this uhthat was something that was not possiblein CS and if this goes down this has nostability guarantees but your KSMmetrics would still be in place so yeahthat's RSM and now I would like to givethe mic to Yonai thanks um if you uh atthis point you still want to stillinterested in the sik instrumentation uhI will tell you a story about how Ibecome a contributor to kubernetesstarting from the instrumentation sikum so how you can get started so how Istarted is that at first you need toread through this documentation aboutfrom the kubernetes uh it hascomprehensive information about how youcan get started and then you should befamiliar with the role in the communityfrom members to reviewer approval and tothe sub project uh and at this point youcan check the issues uh in under thecommunist repo and also check the issuesfrom for these um sub projects uh fromthe instrumentation s as well uh and atthis point you can also attend the sikmeetings if you have any ideas todiscuss and also participate in thereviews and issues and docs uh and youcan also contact directly to the subproject owners uh to share the ideawithin discuss and how you want toimproveit and for myself I found the issuebasically it was something uh has issuehappened in a metric cardalityenforcement um so I discuss it in a sikmeeting and start with a simple PR tohave a stock gapfix and then I dig deeper to do therefactoring to improve the me mechanismand also cover it by the test andintegration test and twin test and alsoupdate thedocs um so and then I think at this timeit should be sufficient for me to applyfor the membership so I applied it andgot a sponsor from other uh me uhreviewers uh so and then I become amember after that my journey improvinghis community still continue so uh ifyou also want to explore you can followthe link here he has all the informationaboutuh you can also explore the other six aswell uh but for my recommendation Ithink it's for instrumentation sake it'sbetter you start from here because uhinstrumentation s is has a relatively uheasier bar and not have a very deep uhlearning curve relativelyu and so how you will where to find usuh we have two bi-weekly uh regular syncmeetings um both happen on US Pacifictime 9:30 a.m um one is regular meetingand the other is strategy meetings wherewe'll triage for the issues and PRs andyou can also find us on our slackchannel and we also have our mailinglist and I think our chairs and ties uhpranchu Richard and Damian David arealso welcome your directly contaminantto share yourideas uh we also added um one more slideif you want to become a member for underpresented groups you can also uh checkout uh following the QR codehere uh I think that's it so if you haveany questions you can asknow okay2025-04-15 22:00:13.422981 `�`��H�R#��GAG8U141NkrDIhi everyone Uh welcome to this uhCubeCon talk This is going to be aworking group update So uh we hope thisis going to be rather a short talk andjust an introduction for those thatdoesn't know the working group exist andhopefully we will get question for youmostly on like if you require newfeatures or you if you have any painpoints that you would like the workinggroup to work on So uh for starters uhwho are we Hi everyone Uh thank you forbeing here My name is Yan I'm a seniorprincipal software engineer at Red HatUh working on our hybrid cloud AIplatform I'm one of the co-chairs forwork��Q#��kARwcC44BWDvAhey everyone welcome to the introductionand deep dive for SEGinstrumentation my name is Pranchu i'vebeen working for four years um with RedHat and I've been involved with the SIGfor about 3 years now um hi everyone I'mRich i'm a software engineer at Googleand I've been with the SIG for about twoyears and that's ithey everyone I'm Damian uh I'm workingfor Red Bat and I've been a code TL forS instrumentation for three years now Ithinkhi I'm Yong Re um I'm working alsoworking for Google i just recentlybecame a new member forKubernetes um so on the agenda today uhbasically SIG instrumentationfundamental atomics that we'll cover aswell as um we have a dedicated sessionfrom one of our members towards the endof the presentation regarding how tocontribute for newer contributors um sowhat do we do um the sik charterbasically outlines the best practicesfor component owners in Kubernetes thatencourages them to instrument their umcomponents using c um component base uhwhich we maintain and a big part of thatare signals so these consist of metricstraces logs and events um there's alsouse cases user stories um over the yearsthat we received for which the solutionsdidn't make sense for it to be justhoused within the Kubernetes tree itselfand because of that it led to theincubation of multiple sub projects overtime uh which we'll talk about um in abit but these sub projects serve asvaluableum facets for the community to improvetheir overall customer um clusterexperience sorry um how do we do it uhwe um have these bi-weekly triage callswhere we look at um segmentationassigned pull request and issues and tryto address them as soon as possible wealso look at kepts every now and thenand how they're doing in the enhancementrepository um because besides being justmajor you know feature-driven patchesthese are very important to the SIGbecause they outline the SIG's stance onlong-standing issues that may or may notbe opinionatedthanks Branchu uh I'll be talking aboutmetrics in Kubernetes so Kubernetesmetrics follow the Prometheus metricformat uh this is the architecture forPrometheus so all Kubernetes componentsare instrumented using a Prometheusclient libraries uh all of thesecomponents expose their respectivemetrics in a HTTP/metrics endpoint andany monitoring system that understandsthe Prometheus metric format can scrapethese endpoints and store them store thedata in a time series database havemonitoring on top of in the form ofdashboards and alertsthis is one lesson that we try toenforce reinforce every year is that asbenign as it may seem but one should notjust simply rename metrics because thisis a breaking change why this is abreaking change is because when yourename a metric you're deleting the oldmetric and you're replacing that with abrand new one so all of your existingmonitoring the alerts the dashboardsthat were using the old metric theywould now stop working this is arealization that came to us afternumerous breakages that had happened uhfor Kubernetes metrics in the past whichgave rise to the Kubernetes metricsframework as a part of this effort uh wewrapped the Prometheus client librariesimplementation with our own uhimplementations for metrics and we addedan annotation for the metrics uh todenote the stability level for them thestability level is a way for us toexpress the likelihood of that metricchangi��ing group serving and also a projectlead for Argo and Coupflow and alsomaintain a couple of projects Uh one ofthe most active project we are workingon is a llama stack project So feel freeto take a look at the project if you areinterested and I also uh authored acouple of books in case you areinterested in reading as well Yeahthank you Joan and uh myself uh EduardoArango also a working group chair forthe working group Serbin and I've beenworking in distributed uh systems usingcontainers for a long time now and um Iwork mostly on the low-level containerruntime bits and now I'm working onprojects related to CDI and the arraywhich are also of big importance forworking group serving which we will talkin aRight So what is working group serbingSo uh actually just a minute ago I wasthinking that we forgot one slide and isuh is one year now So working groupserbing was born in uh CubeCon Europelast year Paris and it basically wasborn out of the necessity where everyonewanted to uh collaborate on Serbianprojects and Serbian needs forKubernetes there are some gaps inKubernetes and during one of theunconference sessions at the at thattime contributor summit now maintainersummit the name changings that we'regoing on uh we decided we needed twoworking groups so last year the workinggroup device management and workinggroup serving uh was born and now it's afull one year of the working groupserving so we forgot the uh one yearcelebration is slide yeah but yeah happybirthday to working group servingSo the working group observing isbasically uh it has three main goals Uhone goal is to enhance Kubernetescontrollers meaning we are proposingAPIs we are proposing uh changes to mainKubernetes controllers and alsolow-level components as we are going totalk about things like DRA So we canmake it easier for everyone to run uhsering workloads on Kubernetes Thesecond thing that the working groupfocus on and it's a a project that Joanis going to deep dive later is uhinvestigate or research uh orchestrationand scalability and why the investigatework Uh we are working on a project thatis called the inference perf or isbasically a benchmark tool So we canreally understand what is happening whenwe run uh these large language modelsbecause uh last year when we started theworking group serin we had a couple ofvery long meetings discussing whatshould we monitor when knowing how toautoscale a cluster that is running uhlarge language models and really wecouldn't agree on one topic and then weknew okay if we cannot agree on a topicwe really need to deep dive and dobenchmarks to know what are the keycomponents that we need to monitor in acluster so we can doautoscaling and also for this optimizeresource sharing for serving workloadsUh here is where uh we are working onthe low-level container runtimes uh DRAand very exposing ways for pots to shareuh resources like two pot sharing a GPUor pot uh communicating across multiplehosts for things like MPI communicationSo these are the three main goals forthe working group servingYeah And working group serving is led bymultiple different companies Uh we havefour co-chairs from Google Cloud Red HatNvidia and Bite Dance And uh besides uswe have more than 330 people uh on SlackSo you're welcome to join us If you everhave any ideas to contributing to uh theimprovement to serving workloads feelfree to join us and send us feedbackYeahAnd this slide uh has a list of um uhtalks that we had uh from the communitymembers last year Um the the very topone is the Kubernetes podcast Eduardoand I did uh a while ago and the therest of them are basically from KubeConlast year and there were a lot of talksrelated to some of the initiatives fromthese working groups and talking aboutdifferent projects uh in more details uhfrom this working group So feel free tocheck them out as wellAnd oh it's okay as as Joan was sayinguh the leadership has four uh workinggroup chairs but really since we havehundreds of people joining the calls weended up deciding for work streams So wehave four main work streams uh becauseit was we were running into h�ours longsmeeting every week and to this day weare still running weekly meetings wherewe run the whole hour discussing aboutthese topics So we we ended up going forwork streams updates Uh the 2024 reportuh the for working group report it's outSo if if you want to check it out thereis like a a whole abstract of what wedid last year all the initiatives thatwe're working on the projects that weare pushing uh the repos that have beencreated by uh the contributors to theworking group So uh just going to take amoment for everyone to take a picturethere and basically if you read thereport you don't need to be in this talkSo you're welcome Uh but yeah this talkis basically a a live presentation ofwhat the report was and it's nicebecause the report was review and we gota lot of contributions from everyonethat participates in the working groupnot just the chairs but we got uhparticipation from everyone on like howto structure the report So it's a it's avery active working group Yeah I alsowant to mention like it's a collectiveeffort from everyone in the workinggroup including different sub projectleads and work uh work stream leads Sothank you everyone for helping out withthe reportSo the first work stream is calledautoscaling and here this slide can besplit in two So the first part is uhwhat I was pointing on benchmarking Sofor autoscaling we are uh deep diving onresearching which metrics we shouldreally monitor when autoscaling So uhJoan is going to deep dive into what isthe benchmarking tool that we areworking on at the the work group And thesecond is uh a very interesting topicthat really takes me back years evenalmost before Kubernetes and is thechallenges with OCI images So for thosewho know uh some models are very bigimages right like we're talking abouteven hundreds of gigabytes that we haveto pull to run a model So right now atthe working group we are discussingagain what years ago was being called uhvolume containers and is basicallycontainers that are not uh a runningsystem but are mostly uh for data andnow we have a image volume source ke toaddress this right like we want to usecontainers as a way to move data and wewant to just treat the these models asdata because they are getting very verybig So these are two of the topics thatthe autoscaling workstream is workingon Uh the second workstream is themulti-host and multi-node uh workstreamwhere as I said we uh focus on how tobetter run distributed workloads uhthings like MPI or things like we areaddressing with the leader working setand BLM and KSERF Uh so three of the keymilestones that we reached last year wasuh not the full like it's not the fullrelease of leader working set uh 0.3 or0.5 was done by the working groupserving but it's more like weparticipated with feature requests andreviewing some PRs and proposing thingsto the leader working set uh team or orthe contributors to the leader workingset saying like hey please can youaccommodate this and that for us it willmake the life of the serbing communitybetter So it's that's what I'm trying tosay here also uh working group serbingand the BLM uh community join it fortesting uh and now we are also workingfor disagregating orchestration so wecan do like multi-pnd testing right andfor ker we have Jan that is one of theleaders of we have been also proposingfeatures there so we can uh merge as acommunity and not just have likedisagregated components but KERF is alsowatching the discussions questions thathappen at the working group Yeah So Kerfsupports multihost serving right now butwe are also like trying to improve itgoing forward like we are proposing somenew set of CRDs to better supportmultiode serving in term in case userswant to customize the behaviorUh the third work stream it's uh thedynamic resource allocation So it'sweird to be called like that So it'sbasically a work stream to monitordevelopments on the new feature that theexciting feature of Kubernetes that thatis DRA and here we basically are justsending letters to the working group uhdevice management as you can see itthere uh and we have uh John uh Bel�lamicand u Sergey joining the working groupserbing meetings so we can have a aweekly communication on what do we needfrom the serbing workloads so they canbetter accommodate the RA as the RAright now is under change right Likeit's still a beta feature So it is theright moment to push features before itgoes ZA because once it's ZA it's goingto be very very hard to makemodifications to the RA So our focusduring 2024 and during 2025 is pushingas much features and requests as we caninto the DRRA uh the working groupdevice management So once the RA goes uhGA hopefully in 1.34 in December thisyear uh it will accommodate as much offeatures that we need from the workinggroup uh servingcommunity and uh something that we arealso working on very heavily and I hopeto participate more this year is in uhdevice failure handling and resilientworkload management So uh we still inKubernetes uh mostly from the hardwarepart we don't communicate very well tothe scheduleuler when a node is nothealthy So uh this uh work stream fromthe working group is trying to push uhnew features and new APIs so we canbetter report to the Kubernetes when aGPU is unhealthy or misbehaving So itcan better reroute a workload and we cansave some time and not have a workloadon a GPU that is not healthy and justtrying to run and and hang in there Andthe last work stream is orchestration onthis I'm just basically going to go veryfast because uh Jan is going to deepdive on it Uh one initiative uh comingout of the working group is the gatewayAPI inference extension uh for short geeuh and it already has uh a first releaseso a 0.1 release and uh it is alreadyshowing improvements So it's it showsthat this working group uh and everyonethat has been contributing to it arealready providing value to the communityby joining and as as uh Jan mentioned ismultiple companies that are contributingto this So in just one year we managedto create a a a product that is helpingeveryone and making things uh moreperformantYeah Next I'll talk about are some ofthe initiatives that already mentionedOne of them is the inference proofproject This is a collaboration betweenum among different companies includingIBM Red Hat Google Cloud and Nvidia Andwe anticipate more uh people uh tocontribute and welcome you to contributeas well It's a project that we uh that'spurposed to um be a standardized toolfor benchmarking uh as a library uh youcan run it uh on anywhere like onKubernetes or independently and it'saiming to solve different uh use casesfor benchmarking such as autoscaling uhLaura use cases with the u gateway APIinference extension um etc and itprovides uh the current status is um itprovides a Python library forbenchmarking workloads and it supportsthe VRM model server and supportmultiple distributions uh with specifiedQPS and uh also supports like um CR GPTdata set to resemble real worldconversational workloads and if you haveany additional use cases any uh otherdata set that you can think of feel freeto pose it uh propose it in GitHub issueas well and we uh I think the projectalso supports report generation with allthe metrics are valuable uh and uh aimto be extensible to add support fordifferent model servers not just for VMbut multiple other ones as well from thecommunity and data sets and differentloadgenerators and on the right hand side isour architecture diagram uh that I'm notgoing to talk in more details uh butthere's a dedicated um weekly meetingweekly contributors meeting forinference perf in case you want to joinuh the designdiscussions and the road map forinference perf um u we're planning toadd support for multiple model serversright like triton tgi and u also likesupport integration with differentorchestration projects including some ofthe new ones from the vm community uhlike the production stack and a bricksand also the uh integration with gatewayAPI inference extension and uh um wealso want to add support for differentuse cases and data set Um as I mentionedearlier like feel free to proposepropose anything that's missing Um weare also aiming to support multiodel usecas�es and uh different trafficdistributions and so onAnd the second initiative and subproject is the gateway API inferenceextension u said gee in short uh it'salways a challenge to come up with a newname for a project uh but that's ourbest effort there uh there's a link tothe pro uh repository and so thisproject basically are trying to improvelike resource sharing across multipleuse cases on a shared foundation modelUh like for example if you are usingLaura adapters this is going to besomething that you might want to try Andit improves the tail latency andthroughput of RM completion requestsagainst um uh Kubernetes hosted modelservers um using a custom schedulingalgorithm that you can extend um thereis a whole design doc for that uh incase you want to look that's I thinkthat's available in the projectrepository as well Um it provides a setof uh declar declarative uh APIs for toroute client model names to differentuse cases uh different Laura specificadapter use cases uh so that they canshare resources uh more efficiently andthere are also end to endobservabilities built in uh withdifferent service objective attainmentSome of these objectives are customdefined Um that's also part of thedesign doc as well Andum yeah and this also this project alsoensures operational guards betweendifferent client model names allowinglike platform teams to safely servedifferent uh workloads on the pool ofshared resources running on the samefoundation model And there's a detailedstatus report uh that's available inthis link Uh we'll be attaching theslides uh in our talk as well in caseyou want to look into thedetails And that's uh this is a adiagram that uh this project umbasically wants to integrate with manydifferent projects in the ecosystem Soeverything in the uh blue boxes arealready implemented Uh and othercomponents in these diagrams are ratherexternal components and uh they are notimplemented yet but we are aiming to doso with together with the community Umthere's a link to the road map aswell and uh the next initiative fromthis serving working group is theserving catalog So basically thisproject aims to provide uh workingexamples for different model serversdifferent models different deploymentpatterns uh for example single ormultihost inference and as well asdifferent primitives or orchestrationframeworks For example if you are justusing Kubernetes u deployment you canfind examples there Um if you areinterested in leader worker set thereare also examples here and case ofinference inference surveys are still inprogress So we are working with the caseof community on that one and the thisproject aims to help uh community ofusers u to explore differentconfigurations uh recommendedconfigurations uh not that they are notlike best practices or best performingbut rather our best effort to recommendto the community for running inferenceworkloads So there you can finddifferent patterns that you can tweakand uh and change and as a uh good itcould be served as a good referenceimplementation Um and theimplementations are meant to be like uhcloud uh uh provider specific So youthere you can also find differentconfigurationsuh for different hardware acceleratorsand different cloud providers uh in theexamples as wellUm so and a status update you uh the thesingle host inference using justKubernetes deployment is available uhfor VRM and jet stream serving uh modelservers and the multihost inferenceusing leader worker set for VRM is alsoavailable and there are also componentsand examples for HPA stops uh for tokenlatencyNext we'll talk about the community YeahSo uh this week we already had uh twoevents where topics around the workinggroup serving were mentioned So you canlook for the recordings afterwards isthe cloud native uh cubernetes AI daywhich was yesterday and also the Qflowsummit So uh it's hard to recommendthings Yeah there are also talks Yeahthey're going to be recorded on YouTubeSo and moving forward So we have 11talks that are somehow related torunning LLMs on Kubernetes this week Sothe list was too long to put it in asingle slide and it w�ould be too smallSo it's like uh please look for uh LLMrelated talks uh in the schedule Andtomorrow we also have the working groupdevice management update uh where wewill be uh showing the progress of DRAand hopefully uh dreaming to have DRA GAby DecemberYeah and uh we we haven't really checkedin details but there are many talksrelated to some of the sub projects andinitiatives that we talked about uh fromthis working group So if you search bythe project name you can probably findsome there There were a couple of themlike from the AI day yesterday So youmight want to check that out as wellAnd this I think this is the last slideand we just want to welcome everyonehere to uh or who's watching therecording to participate in our workinggroup activities So as we mentioned wehave different work streams but wesomehow combine them to a weekly umweekly community meeting but there arealso like separate individual umseparate contributors meetings fordifferent sub projects and initiativesand we may expand um um the the the themeetings when needed So as we expand tomore projects there are likely moreseparate meetings available and uh weare on the Kubernetes Slack uh with thenameWG-erving and feel free to reach out tothe community and reach out to any oneof us the co-chairs and sub projectsleads as well if you have any questionsYeah as uh John mentioned the mainmeeting happens every Wednesday at uh atnoon Well noon Eastern time Noon Easterntime but we have multiple meetingsacross the week to talk about specificuh topics of the working group So yeahit's if you want to go to the main oneit's on Wednesday but uh it's all overthe week My calendar is just beepingabout working group Yeah we may changethe frequency going forward like as asas we mentioned like I think we we wantto listen actively from the feedbackfrom the communities So if there's anyimprovement and feedback uh don'thesitate to reach out and but easterntime since this is this is a groupEurope eastern time means eastern UStime So just want to avoid confusionthereAnd thank you Uh if there is anyquestions uh there are there's onemicrophone thereYeah Any questionsHey can you hear me Yeah Uh I'm Yen fromNvidia Less known Yen than Yan Tong overthere different So my question is yeah II didn't follow up and uh yeah thediscussions just wondering likesomething like a KV caching and for theperformanceoptimization and uh even for toleranceis that the topic or any plan or ongoingwork about that yeah key value cachingand uh you know what I'm talking aboutright Yeah I think that's more like afocus topic for the VM and model servercommunity I know VRM has community hasbeen working with uh the M cache projectUh we have one of the VRM maintainershere Michael Gan Um he's over there Soin case you have more specific questionstalk to him Yeah So I'm asking isparticularly to my best knowledge thereare no universally acceptable andstandard right for the KV cache or KVcaching even model caching right as aworking group serving I was wonderingright moving forward right are we goingto have some kind of a standard for yeahKubernetes and to support the theserving workload right what's the way orrecommendation to optimize theperformance use the advanced and cachingand or other yeah mechanism course topicuh the caching part it's uh whattriggered the whole um conversationabout uh image volume source kept and isuh we want to find a standardized way ofmoving around data and if we standardizethat we can start talking about how dowe standardize caching for everyone Soit's uh it's related to thisconversation Yeah that's a good questionUsually whenever it comes to some kindof standard it's very hard to definesomething as a small community So wereally hope uh more vendors and more endusers can can participate in ourcommunity discussions so can so that wecan uh perhaps in in uh in the long runwe can drive uh like lead someinitiative towards some kind of standardon this Okay thank you Way to goFirst of all thanks for the talk It wasawesome Do I think it's kind of for youuh when you are talking about how we arethe group discussion about how to whatto watch to scale L&M Uh actually I wastrying with some some benchmarks also Soit's kind of similar of what it's doinguh and mostly of the VLM and other onesthey use like the GPU uh cacheutilization of the KV cache uh butsometimes it kind of doesn't have like areal performance or completion time orend to end time it does like token wiseit would thinks like it's good but uhend to end wise it doesn't work well Doyou have any like other metrics that youare watching that you could share YeahUh one example isuh prompt long like how long is yourprompt sometimes affects the performanceof your process right so that's why uhwe kick it out the the initiative forfor the inference perf because it's likewe really really need to deep dive intothe behavior of each language modelbecause we are still at the early stagesand we don't truly understand whatreally impacts the performance So duringthe meetings discussions have beenaround like even the the the length ofthe prompt can affect things So we needto monitor things from the the promptitself all the way down to memory andGPU utilization Right So it's like everylayer impacts a model in a different waySo it really goes into the model So weare trying to like uh create categoriesand kind of like it's it's slice anddice So we can then uh take that to theautoscaling workstream and then havesome standards on like okay model type Abehaves like this model type B behaveslike this but we are still like we arestill working on the tool to do thebenchmark So we are on early stages YeahI'm I'm sorry but I will just have onemore question and I think this is morefor you uh I have been working using theVLM benchmark and also the AI bricksbenchmark They have kind of similar tothe inference PF that you're using Uh doyou think it's viable to her run mybenchmarks using the inference PF orthey are quite similarUh I think we are just working withtrying to partner with differentcommunities there Right So I think thegateway API inference extension uhrequires some additional metrics andinference perf also wants someadditional metrics as well from the VRMmodel server So that's something we areworking closely with the VRM communityto make sure like we can have theadditional metrics available that canbetter represent the workload uhespecially when running large languagemodels where the standardized standardtraditional metrics are not like thatuseful anymore So yeah that's somethingwe are actively working on So I'd reallysuggest like joining our communitycourse to raise your specificrequirements and use cases uh becausewe'd love to hear from more users YeahYeah Thank you very much Yeah no problemSo when you look at the um developmentson the model server side uh in terms ofwhat they're doing in terms of likeautoscaling or dynamic resourceallocation how does that relate to likeyou know what's being done at thecluster orchestrator level versus withsomething like Dynamo right like thingsthat are uh similarly been done at themodel server layer and early thoughts onhow they couldcompose if that's been sort of a area ofthought uh the Dynamo word uh yeah it'suh at the working group we try to bebender neutral and and we we are tryingto accommodate and make curities betterfor everyone right that's why we ask forcontributions so yes at uh the workinggroup meetings we have presented fromNvidia things like the Nemo operator andu hopefully soon now that Dynamo is opensource we will be presenting it thereand what we from Nvidia do during thatmeeting is say like hey we We are doingit like this Something that is hurthurting us or like it's a painoint isthis and we will just like point it outto the working group and from there wewill take it to a kept or we will takeit to to something that we can take tomake your better right But it's uhthat's kind of like the process Okay Forit Yeah Got it Thank youHow are we in timeDo we have time Do we have time Is no NoI think that's enough All right ThanksNo I'll just talk to you We can takequestions offlineThank Thank you everyone2025-04-15 22:00:13.968982�and we do thatin a provider by use for example usingcubeadm we want the user to also be ableto configure things there like he maybewants to enable a specific flag for thecube API server which is part somewherethere in cube adm's configurationum a cubid ADM binary itself is normallysupposed to work for a single cubernetesversion so if you take a look forexample before v1.29 29 Kubernetes QBAMused the vivon beta 3 version of theircluster configuration starting with theversion after with B130 they startedusing a newer API version because theyintroduced some changes in cluster APIthough we want to or a user wants tocreate clusters in different versionswhile running them in different versionsthere could be classes running 1.29 29while still others are uh v29 and othersare already at v30 so we can't justembed the v1 beta 3 or v1 beta 4 apiversion of cubadium and expose that oneto the users because what if they wantto create a cluster with the otherversion um so what could happen there isum or or what's the mistake hereuh we just can't do this because itwould be breaking a breaking change ifCubadium itself evolves introduceschanges we bump our go dependency andhave to use the newer V1 beta 4 versionum but would not be able anymore tocreate the new the older versions thenso what we do instead is we copy adaptand evolve we have our own copy of thatconfiguration our own strct we use andexpose to our users and in the back endwe convert to the version we need whencreating the cluster because the userdefined we want to 29 or V30 and by thatwe can support a long range ofKubernetes versions likeV27 to V33 while still using Cubad's V1beta 3 or V1 beta 4 in the back endanother example of embedding externalAPIs wouldbe in cluster API we have this clusterobject and of course we need somecontrol plane to run in our cluster orfor our cluster for that we refer acontrol plane object in our clusterobject when modeling that we decided touse an object from the core v1 packagecore v1 seems to be stable it should notevolve that much so we shouldn't haveissues when um yeah bumping to a new gomodule version ofKubernetes but what's the mistakehere we just started using core v1object reference but yeah and it workedwe're fine we still have it like thistoday but there are for example three umfields in there which we don't needusers see these fields in the API theymight expect something to happen if theyset these fields but they're actuallyunused just removing thesefields is not that easy because it maybe a a breaking change if a user misusedthat field before um and we try to getrid of that when going to V1 beta 2 umbecause we don't can't do it in in ourv1 beta 1 API so what should we havedone instead we we should have seen okayobject reference is a cool example wereuse that but we create our own copyand only use that or have the fields inour API which we really need and skipadding the other ones and in this way ifwe later on maybe need a UID field wejust could add it and make use of it butonly when we needit a third example which is also prettysimilar but does not depend on externalAPIs is in our machines we have a specand a machine itself is supposed to beone unique item we also have the conceptof machine templates to create a fleetof machines so at the first hand wethought yeah it's totally good to maybejust use the same spec and same strctbecause a machine template we want todefine everything in there and createhundreds of machines of it but let'sconsider the use case of addingsomething like an IP address because wewant to do IP address management withour umcontrollers what's wrong here is we didnot consider that IP address adding itto the machine spec it could also leakinto the machine template and from themachine template side it just doesn'tmake sense because every machine at theend should have a different IP and notthe same so what should we have donehere instead is the same copy adapt andevolve in a separate strct type weshould think about do we need to evolvethese different objects separately andum because converting between one andthe othe�r strct is pretty pretty simplebut just swapping out go types may endup in completely different open APIspecs depending on what we do or if wedo not do itcarefully okay so after seeing threeexample of bad things that can happen ifyou embed that API you might askyourself if is it is always bad to reuseexisting code type and the answer is noin some cases perfectly fine in somecases you have to for instance in yourCRD on your go in the go type for yourCRD you have to embed the type meta andobject meta in order to make them uh uhproper compliant Kubernetes object thereare also other type that were explicitlydesigned for reuse like condition metav1 condition uh in kubernetes or liketype uh or like time which provide uhnice and then this serialization oftimestamps in some cases it is also goodto reuse your own types but you have tocarefullyconsider what are the implication inorder to avoid the problem thatChristian justdescribed taking a step back for aminute let's let's look again at how wedevelop uh the PI so this processunfortunatelymight lead or might trick maintainersbecause as a maintainer we focus a loton writing go types and we just think ohuh controller gen will take care ofeverything and generate uh uh everythingcor but but this could be uh misleadingbecause at theend what impact the user is the open APIspecso pro tip or lesson learned here isthat whenever you change your go typerun the generator and look at the openAPI spec so it's not too too difficultto to read and you really can figure itout what impact will your changeuh create to theusers as a lot last tip uh of this firstsession of the talk let's talk about uha minute aboutrules so when we talk about APIguarantee or API breaking change or APIdeprecationrule they apply on the open API specokay this is what impact user this iswhere guarantee apply if I have one betaone APIuh when I'm going to deprecate it itwill take one year or threereleasehowever from the other from the otherside of the equation some other ruleapply or some other policy this is a gobinary this is a binary this is go willcombine into a binary so same ver applywhat are the implication of these tworules apply to the same artifact isthat if you are in a better corner andyou know your audience which is superimportant you may you might find verypragmatic way to solve to solve an issuelet's make an example you want to renamea go type you could do it if you forinstance if your API package is usedonly by someone internally of yourcompany is a breaking change but you canmanage it okay and this way might helpto solve some of the issue thatChristiandescribed so we talked a lot about gocode what about comments it can't be toohard there should no should be thereshould be no issues there right let'smake an example for generating CRDs weoften use this magic markers which arefor example used by controller gen toinfluence the open API spec what couldbe what could we do wrong here is wecould simply miss setting the rightmarkerslike the max length validation is veryimportant for example for cell cell isall around us and validating admissionpolicies only work if you use the properif you add the proper validations whichresult to the right um fields to be setin our open APIspec another example is setting fieldsyou consider optional actually asoptional which also might influence umhow it results in the open API spec sowhat we could do about this is there's asolution for this there's a projectcalled KL which is short for KubernetesAPI llinter which checks for commonpractices and enforces best practices inyour code it's a it's a super goodllinter which integrates nicely withwith Golink CRIN um and at this point Iwant to give a shout out to Cho whostarted this project and is driving itand to everyone else um who is contricontributing to it the project itself isplanned to get a home in Sikk APImachinery as repository um and yeahplease it's a good thing to start usingit and um it makes you help think aboutsome decisions maybe not every lint ruleapplies for your API in that field maybebut then you can do explicit excep�tionof that and um note that in your API orin your go docu as editionso one more example where KL could helpyou in future is the following we havean cluster object which for example hasan variables um list or array in it wehave a second controller also trying toadd a field to this list or to thisarray per default this list is treatedas atomic so both sides the controlleror the GitOps tool trying to sync thesource to the target object would alwaystry to just set the instance type andthe zone fields here while our othertool like controller which wants to justset a con cost center um field variableadding to that field just fights withour GitHubs tool and then we end up ininfinite reconciliations which costsperformance costs resources and it'sjust not good the solution here issetting the list list type and list mapkeyum markers so open API understands thisarray not as array but as as list mapwhich then helps to have co-ownership ofentries inside this list that's thellinter which is planned to beimplemented in kala it's not there yetthere's a couple of more um and openmisses for it um soyeah okay so at thispoint you you might think that you nowmaster writing go types for forKubernetesAPI you know about types you know aboutcomment and magicmarkers unfortunately this is not enoughso there is one last category of uhmistake that we can do as a developerwhen writing uh an API and this is themost common and the most tricky to getright and it is about the mistake infieldnamesso let's thinkabout an API anAPI is a representation of what yourproject is doing every project issolving a unique problem if you thinkabout cluster API this is a a shortproject statement for clusterAPI if you think about the correspondingcluster object you can imagine it as adifferentrepresentation of the same statement itis the it is what you present to youruser is what tell user what your yourproject is doing but what is the biggestdifference between the two form ofproject statement is that on the leftyou have context you have full testsentences that explain what your projectis doing on the right instead you haveonly a fewkeywordand the the key point here is that inAPI design every one of these keywordmatters so it is really important tochoose the right keyword and what arethe common mistakehere there are a lot so if you notchoose accurately the thekeywords the your API might be confusingfor youruser if you use s synonymous orabbreviation for the same word word intwo different CRD of your project youruser may may be confused aswell if you use genericterm lightly without thinking carefullyat it your user might be confused whatdoes this meanokay and making user confused is not theonly issue is the fact the real problemis that when you choose the wrongkeyword then changing a keyword renaminga field in your API is a breakingchange so what is the solutionhere luckily there are a few tips thatcan help you and one is really simpleand and and effective so you can try todo this exercise you can try to readyour APIuh and transform it and build a sentencethat start with I want to I want tobecause it is a declarative API so ifyou do this the exercise here is that Iwant a cluster with a topology that hasa control plane with three replica it isa simple sentence it makes sense it iseasy to read user can't understand itthis is a good sign a a a good evidencethat you are creating aa a good API if instead your sentencedoesn't read well it is not uh properEnglish or uhuh the word are not good etc becausethere is a a first red alarm that youare not building a nice API or orsomething that you will regret in thefuturethe secondtip require a little bit of more of workbut it is really really effective in thelong run so if you think about it everyproject has a glossaryy has a set ofterm that are specific of thisprojectunfortunately not every project writedown is isglossaryy okay so what are the benefitsof having a glossary there are many ofthem first of all a glossary will helpusers in reading your API will help userwhen they look at yourdocumentation will be a good companionfor for your user but it will be evenmore important for you maintainers andfor the maintainers that will follow youbecause if there is a an ep a glossaryywhich define term which give a contwhich allows everyone to understand thekeywords in your API this will help whenyou will add something on top we try toextend the your API you you will createa new CRD in your API so the benefit ofhaving a CRD are very worth the effortand and on top of it you can alsoconsider that it's something that youcan build over time so it is notsomething that you have to do up frontas a hugeeffort let's have an example what we inthat case for example did wrong we havethe concept of infra infrastructureproviders which have for example an AWSmachine for AWS we added a field in ourcontract to say okay AWS machine shouldnotify that in its status with a readyfield that it's ready but what doesready mean is it is the node ready is itup and running is it provisioned if youhad come from the glossery or from adefinition of what we want to have outof that field it would have been eachinfrastructure machine must report whena machine's initialization is completedand infrastructure is fully provisionedthat's what we actually present with theold status ready field it's somethingwe're going to try to fix with the V1beta 2 API 2 in our contract there umand the mistake here is simple justlightly using that wrong wording andwhich then results in a inaccurate andnot explicit API which could be easilymisunderstood another example on thatcamel casing let's think about having anAPI which does the same as cube cuddledrain we start with having a flag orcube cattle has a flag for grace periodso we start adding a field called nodenode drain grace period at the nextversion we think ah grace period is notenough we also need a a field for thetimeout flag so we add no drain timeoutwe consist consistently extending ourAPI we did before but it reads a bitweird what we should have done insteadis think about embedding andextensibility if we would have added anote drain field which contains thegrace period we could easily expand andand evolve with adding a timeout fieldor others later so consider usingnesting to beextensible there are other exampleswhere cal for example helps you likethink about not using a boolean butmaybe use a enum instead which becauseyou maybe want to have differentmeanings later onand last pro tip for today is getinspiration from the best api design isnot a oneman show and consider gettingfeedback where you can like peer reviewsask users take a look at otherKubernetes projects or Kubernetes APItypes itself but critically think of ifthe same applies to your API and if youshould do it this way or just adapt somesome parts of it you could even ask anAI but consider thinking about itagain okay so before wrapping up let'squickly talk about what we did not coverin todaypresentationso we did not have a deep dive on CRDmarkers there are many of them some ofthem are simple and intuitive like mixminuh most of of the one uh uh about uhvalidation but there are really many ofthem so there is a good documentation uhthe link is there but again the trickis apply a mer generate userd look atwhat they looks like in the open APIspec and test it in the server is issimple i'm veryexcited what else we did not talk aboutwe did not talk about creating a newAPI and deprecating or removing the oldAPI versionwell this is way more complicated thanit should be because uh unfortunatelythere are problem that you have takecare of like storage uh uh storageversion migration and also you have todeal with uh a few annoying API serverissue that that we are also trying tofix in parallel uh maybe this will bethe topic for the next talk uh we alsodid not talk about some other trickabout API development like how to makeyour API to work well with the cap andrestore tool like um uh Valero forinstance and uh wrapping it wrapping itup so API design is ajourney enjoy it share yourlearning and have fun and with thisthank you for attending we will get yourquestion if you have to2025-04-15 22:00:14.738116 ��S#��gA7IA-Vw1K7eghi everyone um welcome to our talk andthanks for coming um yeah we're talkingabout Kubernetes here designing for thelong haul where we want to try to sharetips and tricks and um stuff we learnedin our journey with cluster API anddesigning APIs before we start we wantto introduce ourselves i'm Christian i'ma maintainer in cluster API and asoftware engineer at Broadcom helloeveryone thank you for being here i'mFitza Pandini i'm also a cluster APImaintainer and I work in the same teamof Christian in inBroadcomso the long haul so let's start with apool how many of you are maintaining aproject software contributed to aproject software no matter if upstreamor downstream open source or not let'sraise handplease okay some good developer vibes inthe roomso you all probably know that everysoftware project keepevolving as soon as users start usingyour project you start getting feedbackyou start getting a request forimprovement and so also your API or yourCRD have to improvebut your API your CRD is slightlydifferent than your binary you need a astrategy to make sure that your CRDevolve nicely in the longrun in the Kubernetes ecosystem let mesay the most common strategy to makeyour API to evolve is based by twocomplimentary ideas the first one isthat you want to pick to keepevolving your current API version if youthink about it this is what they aredoing in Kubernetes kubernetes is V1since a couple of year now and it keepsevolving but sometime what you can dowith a single API version is limited soyou cannot do breaking change sosometime you also want to create new APIversion but the key point here is thatyou want to create API version onlyafter a careful and deliberate decisionthis is the key so todaytalk is about avoiding making mistakesthat will force you to create unplannedversionokay before talking about mistakes let'shave a a quick refresher on how wedevelop APIs or CRD in Kubernetes so theprocess startedby writing a go types you write a gotypes then what you do you run a uh agenerator from controller gen and thisgenerator create a CRD where the mostimportant part in of the CRD is the openAPI spec that reflect yourtypes finally you apply this CRD to theAPI server and the user can interactwith your API by writing YAML so if youlook at the at thisprocess where do you think mistakes willhappenso on the right side there are the usersuser by default are always right so theycannot make the mistakes in the middlewe have the generator generator justwork well so what is left only got typesbut what could possibly go wrong whenyou keep a good developer and you askhim to develop a KubernetesAPIwell so with all the respect for myfellow devel uh maintainers and andcontributors a lot of things could couldgo wrong so let's start looking whatmistake we we could happen by goingthrough some of the lesson learned inthe la in the last few years yeah sofirst we're going to dig into some goexamples where we could do mistakes i'dlike to show first three antiatterns weidentified so the first one is embeddingexternal APIs um in our case cluster APItries to create clusters �� KCP maintainer sobeen with KCP for the last four fiveyears hello uh I'm Robert i'm workingforClyo a company that's primarily focusedon storage defi uh software definedstorage and Kubernetes uh but recentlywe started investing time into KCP aswell and I'vestarted on the project not too long agoabout four five monthsuh but still learning we are even todaylearning to together and uh let's seehello everyone I am Mark Mudrinich I ama senior software engineer at cubmaticand as of a recent recently a KCPcontributor very happy to be with youall here today and to learn a bit moreKCP aswell hi everyone uh I'm Navarun i havebeen uh contributing to Kubernetes forthe last uh six years um I maintain afew areas in the project um currently umI'm also a chair of sik contrib andapart from that I contribute uh to KCPand uh try to build products around itand that's all hey everyone I'm Vasha iwork as a software engineer at Red Hatuh I'm in and around the Kubernetesecosystem with operators and things andI was a part of the KCP team when it wasfirst introduced in design so yep that'sit for mecoolso high level plan for today is likewe're going to set up environments andfew other utilities explore the conceptslive because mainly this workshop isbecause we got a feedback from theprevious talks that on slice it looksfine it's good it's needed we don't knowhow to use it we don't understand howit's wired up together so this is whywe're here basically to heads to get ourhands dirty and we're going to lift andshift existing posgress operator CNCFposgress operator to convert it todatabase as a service it's not aproduction code it's more like ashowcase how to do that if you as aplatform team building something thatand we introduce a bit of new kid on ablock multicluster controller runtime Iwill give a reference next to the othertalks which is about that but you willget a sneak peek here too and again askquestions at any time how it's going tobe done so we have a 75 minutes if yourun through the scripts you can bash outeverything in five minutes get somethingrunning and leave that doesn't meanyou're going to learn something so feelfree to stop at these pauses explore trythe concepts and understand what you'redoing hence all the time we have butagain that depends on everybody's paceand a contentwarning it's a deep technical content weare here to participate we are not hereto sell it you might think we're like wehave very diverse companies set it's nota market pitch so it's all abouthardcore tech stuff like in the old daysso braceyourself so before we jump into thetutorial material itself what is KCP asa reminder you might seen it already ina keynotetoday it's a Kubernetes like controlplanes kcp is not Kubernetes it's itloves Kubernetes it uses Kubernetesstyle APIs but it's not Kubernetes it'sa took a few terms from other pages it'sopen-source horizontally scalablecontrol plane so every cloud providerhas its own control planes either it'sAzure Google or AWS this is a opensource version of control plane whereyou can build your own APIs and lifecyclethem so that's a high level and youmight get more grasp what it is once youstart doing this stuff and uh for thosewho are familiar with KCP and I neededto ask that in the beginning can youraise your hands who knows KCP or usedKCPAbefore so quite a few but not muchso in KCP we have this concept ofworkspace workspace can be put into thehierarchical structure like foldersevery workspace is like a virtualKubernetescluster so at the end of the workshoptutorial you would have to havesomething like this on your own laptopsand we have part one two and three wherewe build doing consumersproducers providers basically and yeahlet's see how it goes a bit more highlevel we're going to be setting up thesecomponents which we call sync agent andwe talk about that as we go controllerruntime and just this is just to showwhich parts comes in which part of theworkshop and I think with this if thereis maybe let's start clicking keyboardsand see where it takes us and againquestions any time raise your handswe're around here quit�e a few of usso let'sSee we all want tosh the tutorial spread aroundawesome so Nabarun takes over part oneand I will be just hanging out around toask your questions and help you if youneed toawesome so How many of you are here witha laptop or some device where you canrun abrowser okay quite a lot so we would atleast need a browser for you to runGitHub code spaces and now I'll show youhow do you create a code space and thenset up the prerequisites and theninstall KCP and run KCP so that we canmove on to the nextsteps awesome so um the repoum one minute before that have you alluh scanned the QR code and opened thedocumentation can I have a quick raiseofhands awesome and how many areleft or still doingit okay quite a few let let's give itlike one or two minutesso all of the workshop steps arestructured into a documentation repowhere um in the contract repoessentially where you can go and seeeverything step by step we will in turntake um you through all the sections andthen go over them so it's important foryou to open this uh website on yourdevicesokay um I'll move on but uh in case ifyou want help with uh the QR code um weare all roaming around so do let us knowraise your hand if you want any help andwe'll reach out toyou cool so in order to get to GitHubcodes all you can do is on your umbrowser type in github.com github.devwhich is important not github.comgithub.dev/kcpeddev/contrib dev/kcpdev/contrib which will essentially open a VScode like interface but on your browserand you see we have the repo clonehere so let's go over um how do wecreate now some terminals on code spacesif you click here and typethe closing angular um closing trianglebrackets and type interminal either you can uh do liketoggle terminal or focus on terminalview either will depending on what thestate of uh the code space is in uh youjust focus there and then you can clickon continue working in GitHub codespacesum for the purpose of this workshopwhatever you get as part of thegithub.com free account should besufficient we will create like an eightuh four core 16 GB RAM machine um forwhoever is on the github.com free tierthey will get two options of two coresand four cores uh both are sufficientbut you can uh try using the higher coreone and u you essentially as part ofgithub github's free plan get 124 hoursshould be good enough fortoday so here we are so uh we have aterminal with us so let's try to go overthe steps one byoneCool so as part of the prerequisites umso I showed you how to use GitHub codespaces but if you are running locallyyou need to clone the repo first so onceyou clone the repo uh who are on localterminal can can I have quick show ofhands because the local terminal waywould be a little bit slower since weare on conference internet um if you'reon code spaces you are essentiallyrunning everything on cloud should be alittle faster so for those who are lousing things locally you can do getclone um basically you can go to thispage the 00 prerequisitesum on the website that you opened the QRcode um from um clone the repo um gointo the uhcontrib workshop directory and fromthere we can restart for those of youwho are on code spaces you don't need toget clone since the code space alreadyhas the repository foryou okay giving one more minute so thatpeople can at least start cloning therepo whoever is doing it locallyokay so let me go through the GitHubcode space way of doing things so I'messentially in thecontrib i go into the one quick check isthe font size visible to youallbigger little better people on the backcan they see the screen now awesomethank youso essentially I go into the uh workshopdirectory so most of most of the timeswhen we open a new terminal we'll askyou to open a few terminals over thecourse of the workshop um you the thebasic thing is like you cd like changedirectory into the workshop so that uhyou can proceed from there so now let usset up our environments um we've made ita little easy for you to um installeverything we have built some scripts sowhat you can do right now isum run this script 00-prerequisites/01-install.s�h again theseare all on the website uh whatever I amsaying you don't need to um u justlisten and copy then you can go into thewebsite do copy paste from there andthen run it on yourenvironment so let'sum run this so we download KCP wedownload this thing called API syncagentwe download uh what do we download anexample uh multicluster runtime providerthat we created for you to test in thisworkshop we download kind so that we canrun a cluster where the actual workloadsare running we install cubecuddleobviously this environment has nothingso we need to make sure uh we caninteract with the control plane and weinstall crew uh to manage some of ourcubeplugins so how many of you um are stillhere i know we're going a little fastbut just quickcheck oh awesome um I'll probably give aminute for everyone to just run thescript on theirenvironment it's like you can go aheadif you want and be a bit ahead so if youhave questions for the future parts justask questions toohow many of you are through the firstone now like a quick raise of hands okayquite a fewpeople so I know internet might be alittle issue with this step and uh thethird one that we will do other thanthat u it should be fine with whatinternet we havehere okay so let's go on to the nextone so after doing each step we havecreated a script which will check yourenvironment whether you are ready to goto the next exercise or not so wheneveryou are in any of the exercise what youcan do is for example I'm in 000 I do Irun this99- h high-5.sh script and check whetherI am good or not and it seems we're goodright now so if you run that script andit shows all green you're good to go tothe next one so let's move on to thenext one right nownow we come to KCP so KC in order to notconfuse too much uh like KCP withKubernetes and how Kubernetes works umhow you can understand KCP is it's justa single binary process can be run inany form or fashion it can be deployedas a Helm chart it can be deployedthrough our uh KCP operator which is oursuggested way for you to run if you'rerunning it in production but all for allpurposes in this workshop we are goingto run KCP as a binary to keep it reallysimple and running it in on your machineitself or yourenvironment so let's go onum you need to make sure you have a fewenvironmentvariables oopsand how do you start KCP it's verysimple uh once you install KCP you justdo KCP start and it starts the controlplane for you let's see what it doeswhen I do thatso if you know how Kubernetes API serverstarts it's very similarto what happens when you startKubernetes API server plus a fewadditional controllers to make sure ucertain things are running well in thecontrol plane so in a way it'sKubernetes API server plus controllermanager merged together and strippingout a lot of things so when you do KCPstart it runs the whole KCP controlplane plus the controllers which makessure any processes inside KCP run wellso just wait for some time and in themeanwhile I'll create a new terminal soin code space you can just go over tothe top right and click on the plus iconand you get a newsession over there i need to againexport a few environment variables andbefore that very important I need to gointo the workshop directoryand then plain symbol cube cuddle let'sdo and I like to ls uh cube cuddle to kso that it's easy for me so you and justdo K version and you see we have a umserver version here which is theKubernetes version that is backing KCPplus the cub the KCP version itself ushown as server version in the cubecuddle uh versioninfo having said that if you are able toreach here that's a great start we havelike overcome all the internet hurdlesfor now and again To check whether youhave created everything correctly or notyou can run the hi-5 script but beforethat before we move on to the next onewe will deal with a lot of cube configsone would be the cube config to thecontrol plane itself the cube config tothe service cluster that we will haverunning the kind cluster that we willhave and a few others so we'll justquickly create adirectory so that we have all the cubeconfigs� stored therelet's run the hi-fiscript kcp is reachable so how many ofyou are stillhere okay and how many are facing anyissues so that we can helpyou 3% okay i'll I'll becoming let's wait for a couple ofminutes by now we are at the secondstage of the workshopso here we are uh we have started ourKCP server and I know you are all herein the workshop because KCP promised twothings one is multicluster capabilitiesand the second one is the whole promiseof scalability so we are going to getour hands dirty and go into themulticluster part of KCP and understandhow things work from zero so so let mego back to the tutorialsomehowand we'll explore the concept ofworkspaces so before I get into creatingworkspaces let me give a very briefoverview about what workspaces are soworkspaces are KCP's unit of tenency andthey in other words create a very simplestripped down API server with veryminimal APIs which are provided so for auser that is the endpoint through whichyou will talk to your cluster so allworkspaces are backed by logicalclusters and for any clients whetherit's a controller runtime client or acubectl client or anything which needsto talk to a particular workspace thereis spec there is a specified URL and away to reach the URL and we'll go intodetail on how the URL is structured andhow a client can talk to a workspace andexplore more about workspaces uh in abit so before we get into creatingworkspaces let's do a little bit ofprerequisite stuff which is basicallysetting up the workshop route and uhcrew uh root so crew is u cubectlplug-in manager and we have created afew plugins for KCP uh KCP uh cubectl sowe'll be setting up those and then whenthe KCP was started previously if youcould seein a KCP folderyou would have a cube config in herewhich is the admin cube config so thisisthis is the cube config that we would beusing for now to talk to our KCP clusterand create workspaces and things likethat so that's all what this bit isabout where we are setting the end uhbarsjustrun 0No don't go thereokay so let's just quickly checkif we have our cube config setit will install it in a local directorynot override your installations so weuse localyep and we go andcheck our API resources we installeverything to local pin after that therewe go and here we can see we have aworkspace APIavailable so the next thing is going andcreating workspaces so that we can talkto a workspace and understand more aboutitso the next step which we'll do is uhadd our KCPDE dev C cli through the crewuh and enable our uh cubectl plugins solet's do the next step which isbasically copying all these stuffdownloading the necessary plug-inand setting up to be ableto talk with our workspacelet'ssee so one thing to make sure here issometimes uh your cubectl createworkspace may not work so this is kindof a bug in uh crew plug-in so make sureto copy your cubectl create workspacepath to uh your crew uh crew rootand let's see nowokay still doesn't work let's try onceagainyep there we go so uh are we till herebecause there was a glitch in here whenyou do a cubectl create workspace helpare you able to get this kind of uhoutput where it tells you that thisparticular command helps you create aworkspace are we all goodokay now let's move on to the next partwhere we start creating workspaces tounderstand more about it now let'screate workspace onethis uh make sure to type out workspacebecause WS sometimes doesn'twork and let's create another workspacetwo to make things complicated and seehow things workout and let's go and see how theworkspace structure lookslike so this is our root workspace whichis the parent in other terms it's verysimilar to our home directory and thenwe have workspace one and two now if weget into workspaceone you can see there are no CRDs inhere so this is a separate view of yourown cluster so in other terms workspaceis going to be your own cluster whichyou can talk to and which you caninteract to so let's also see the APIresources which are available here andthese are the APIs which belong to thisparticular workspace and workspace areisolated which m�eans the objects whichyou create in workspace are not visibleon other workspaces which are providedto you the other thing is workspaceshelp you create a hierarchal structurewhich means let me create anotherworkspacethree and let me enter theworkspaceso let's see work what workspace is thisthis is three workspace and let me alsocreate anotherworkspace and enter intothis so right now we are in theworkspace name potato it's under root 13potato so let me go backand go back to root and see how thestructure looks likethere wego so this is the hierarchal structurewhich I was talking about that issupported by KCP where you have a parentand your parent can have children orgrandchildren so let's dig into this alittle bit more by creating specific APIresources and see how things work outand how uh one workspace is isolatedfrom another but are we good till nowand do we get an understanding of theworkspace like anyquestions are you all able to createworkspaces okaythanks let's move on to the next partwhere we show you how you how thisisolation achieved in a workspace sowhat this part does is I get intoworkspace number one i create a configmap and I see that how this config mapis visible or not from outside theworkspace so let me dothis let me go back to rootfrom root every time I go back to root Imake sure to do a trees just tounderstand where I am and how thingswork out let me go back to workspacenumber one i'm in root one and let me goahead and create a configmap so we have the config map herewhich you have created in a workspaceit's very simil it's the same as how itwould be when you create it against anormal Kubernetes cluster there'snothing different everything is the sameyou have the object in there now let mego to workspace2 i'll go back to rootfirst again do a tree tojust give you an idea of where we areand now I'll go back to workspace[Music]2 this is a separatecluster and let me get the config mapsinhere boom i don't see the config mapwhich I created in workspace number oneso this is the level of isolation whicha workspace provides so it's a separateAPI server it's your own cluster you cantalk to it and the workspace owner cancreate whatever objects that arerelevant to them in that particularworkspace so just for the fun of it letme also create a config map in here andsee how things work outand let's also create the config mapwith the same name because if you wouldremember in a normal cube cluster if youcreate any object with same names thoseare actually uh that's something whichis not possible the server rejectssaying that you're creating an objectwith the same name but just to show youthe whole isolation part of it let me goahead and create another config map withthe exact same name as we createdpreviously there wego no errors we are all there and that'sit so this was about two differentworkspaces we created config maps thesame exact config maps in two differentworkspaces we saw that that was possibleand we also saw that whatever objects wecreated in workspace one were notvisible in workspace 2 so are we goodwith that understanding and shall weproceed to the nextstep are we all good okay cool so thenext thing is the obvious one where sayI have a workspace I have created aparticular API say a posgress uhdatabase and I want someone in myorganization to use the API which I havecreated so how do we go about sharingthe APIs which we have created now thisis where in KCP we bring in the conceptof API exports and API bindings so we'llgo into that a little bit more so pleasedo uh follow the next steps let's goback to using our rootworkspaceand let's create a workspace namedproviderinside the workspace provider I'm alsogoing to create a separate workspacecalled cowboy cowboy cowboy is the APIwhich I'm going to create as a providerand share it to multipleconsumers so let's go create anotherworkspace calledcowboys and just for betterunderstanding let me again go to theroot and show you how this looks likeso we have root we have a providersworkspace and we have a cowboy workspacewhere I'm going to create a cowboy uhAPI now let me �go and go into theworkspace and see what we can do so I'mgoing into root then I'm going intoproviders and then I'm going to cowboythere we gojust to make sure since we are going toswitch workspaces around whenever youswitch a workspace just make sure tocheck which workspaces you are in inright nowso I am in the cowboy workspace the nextthing which I would like to do is tocreate a object a custom object customAPI which is known as cowboard now to dothat we use the concept of API resourceschemasand let's see what API resource schemasare so if you see here the API resourcekey structure looks very similar to howa CRD would look like so we have all thedetails in here it is a namespace scopedit can also be cluster scoped and uh theother aspect of an API resource schemais it is completely immutable and that'sthe specialtity of KCP which helps in anAPI evolution so once this particularAPI resource schema is created at thisinstant in time you cannot change thespec which is present in the resourceschema so let's go and createthis API resource schemawent and see what this is aboutokay describe API resourceresourceschema todaydotcowboysokay this is fun mistyping in front ofso many peoplethere we go so we have a cowboy uh APIit is the same as we saw in our YAML itis going to be name space code XYZand if we see theresources we have the cowboy resourcesomewhere inhere okay before going into the APIresources let me show you how do Iexpose this particular cowboy API toother workspaces so this is where webring in the concept of API export sothe next thing which we would do isexport this particular cowboy API toother workspaces bycreating an API exportso what this API export has is it has aspec which specifies the exact resourcekey which I'm going to export in otherterms which I'm going to let the otherworkspaces to use and the next importantthing is we have permission claims whichmeans as a provider these are thepermissions which a consumer are theother workspace which is going to bindto my API is going to provide so in verysimple terms the analogy which makes meclear in my head is you go to a GooglePlay Store or an Apple play store andsay you want to download a particularapp say you're downloading WhatsApp andthe next popup which comes in is do youdo I have permissions to access thestorage do I have permission to accessthe photos and stuff like that it is theexact same thing so what the provider isdoing here is to be able to use thecowboy API you have to give me thepermission to access your config maps sonow we'll go and create the API exportyou will not see it that's right okayokay i thinksome will comeuping you take a schemayou can have like multiple schemaokay so in the API export I'll just goover the important parts of the whole umdescription we have a identity that isfor authorization we'll go into detaillater but the next thing which we haveis the permission claims which says thatthis API needs permission to access allyour config maps and the third importantthing is we have a specific individualURL to be able to access all theresources which are bound to thisparticular API which means a providercan use this particular URL and accessall the config maps are all the cowboyswhich are created in the individualworkspaces which are bound to this APIso we'll go to its description later butare we good till here in terms ofcreating a APIexportokay let's move on to the next one whichis basically going changing roles andbecoming a consumer so as a consumer Iwould like to use the cowboy API now I'mgoing to create a consumer workspace andI'm also going to create a subworkspaceknown as wildwest and just to show you in the APIresources I don'thave any cowboy APIs available and nowlet's go and bind to the cowboy uh APIwe'll use the KCP bind commandhere where arewe yeah there wego and I'll explain the bind command ina bitwhat is com what this command is sayingis as a consumer I would like to bind tothe API export which is available in theworkspace root providers cowboys itsname is cowboys and I would like toaccept the permission claim for thecon�fig map which is beingasked now once the API binding iscreatedWe seethatokay let me do thiswe'll see that the cowboy API is nowavailable and now we can create a objectthrough the cowboy API which isavailable to us so let's go and do thesame and let'screate a cowboyresource there we go so we saw that wehave a consumer workspace we saw that wecreated an API binding we uh bound our uworkspace to a particular API which isthe cowboy from a provider workspace andthen we have created an object namedbukkaru bill from that particular APIwhich was available are we all good tillhereokay let me also let's also quicklycreate another consumer just to show thewhole concept of having anotherworkspace having another API binding andthen creating an API binding in thatthird workspaceso what I'm doing is I'm creatinganother workspace which isworld which is wild north entering intoit and I'm creating an API export but uhI'm binding to an API export by creatingan API binding and then I have alsocreated acowboy resource in here so the nextthingwhich we would like to explore is let'sgo let's quickly look at the tree wehave a provider cowboy we have aconsumer we have two subworkspaces inhere and we were able to create a cowboyAPI from the provider uh bind to theprovider and create it in the consumerworkspace one thing which you can alsoquickly look at by going into theconsumer workspace is if you do a K getCR cubectl get CRD you would not be ableto see the CRD which means that theobject is not exactly available in yourHCD but you can create a resource whichwe just did the cowboy resource and youcan use it as per your wants and that'swhat KCB does in the back endthe third thing which I would like toshow is if you would remember we alsohad a workspace URLlet's get into the[Music]provider andcowboysand let's see this URL and we were ableI had mentioned before that we canaccess all the cowboys are all uh theconsumer objects which are created inthe consumer workspaces and bound to theprovider through this particular virtualworkspace URL so let's go and dothat letme see the API resources that are boundto this particular provider API and wesee that we have config maps we havecowboys and then we have API binding wecan even go a stepfurtherandokay we see that from the providerworkspace we are able to see that therewere two cowboy objects which werecreated in consumer workspace whilenorth and wild west so this was a verybrief overview about the whole workspaceconcept and how the APIs are isolatedand how we can create an API export bindto an API export without having theobject in HCD are we all good till heredid this all makesense i hope sookay thing which you can do isalso run this script to make sure thateverything is good andgoing and then we can move on to thethirdexercise so I'm conscious we running abitbehind theschedule and it was always idea that wehave more content than time so how manypeople already finished this part andready to move todatabase okay we have a quite a lot soif you're not yet there I would reallylike us to get through the database partbecause I think that's a nail of thistutorial so we're going to go ahead nowif you are still not yet there pleaseraise a hand we're going to help you andscripts are there tobootstrap andif if you need help we basically you canfind us after workshop in a kcp-devkubernetesslack but for now let's move furtheryeah I will take now for the next partit is the part 03 dynamic providers sowhat do we want to do here we willexplore some quite interesting conceptsand try to put all these workspaces andstuff that we talked about in practiceand we are going to explain somesoftware as a service scenario so let'simagine this let's imagine we have onecluster or in this case it's going to bea Kubernetes cluster that is going touse a service owner we are going to runthe posgress databases there and we aregoing to somehow give those databases tothe consumers and then we will have onthe KCP side to one service uh provideruh which is going to we will see howconnect with the service owner so thatwe connect the KCP uh o�n one side andthe service owner on the other side thatis a Kubernetes cluster and on the KCPside we are also going to have one moreworkspace that's going to be used byservice users so user who is going torequest a database for theirapplication so before we get started wehave a couple of things here that weneed torun so this is like setting up theenvironment let me see if we have just amoment here it isso we should already be set here just amoment yeah but I'm going to run it justto be on the safeside and now the first thing that we aregoing to do is to create the kindcluster and you can just copy thiscommand to create that cluster it mighttake a few minutes but should be fairlyquick in the GitHubworkspaces and let me quickly talk whatare we going to do so what are we goingto do here is that once the kind clusteris ready uh we are going to deploy thecloud native PG or posgress SQL operatorthat we are going to use for deployingthe databases uh let me see if we havesome progress on the kind side so itshould be up in a fewseconds and here it is so I can do nowcubectlversion and okay this is not thatone we will need to do it likethis okay now we can see the cluster ifwe do something like getnodes we can see thenodeokay so what are the next thing we aregoing to do so we going to create oneAPI export that's basically going toprovide the posgress databases to otherusers in our KCP environment it's goingto be an empty API export it's not goingto refer to the API resource schema wecan for a very quickly moment see it andit's as you can see it's pretty muchempty aside from the name and we aregoing to create itbut before we do that we are going toprepare one cube config file that'sgoing to be used by something that'scalled API sync agent and we are goingto see later on why are we doing thatbut fornow let's dothat i made a mistake there let mejust copy that againokay now we should begood and we are first going to switch tothe provider root providersworkspace and then we are going to enterwith thatworkspace actually so we are switchingto the real providers and in thatworkspace so I think it's something likethis okay now we are seeing this but ifI switch okay now we don't have muchtime but in root providers we have thedatabase so if I would very quickly dothis you will see that now we have thedatabase here now let me switch toit okay okay and now we are going toapply that relativelyempty API exportand now we can see that it'sready okaynow we have already seen it now we aregoing to create one more space that'sgoing to be used by the consumer that wehave mentioned just a quick check uhhave you managed to get to this part yetis quick handsup okay that's lookingfine now I'm switching to the cons toroot consumersworkspace at the momentuh sorry we have a couple ofwild workspaces there but we're going tocreate one for theposgress sorry thatwas the wrongone thisone and now I'm going to create this forposgress pg and I'm going tobind to the API export that we createdso we are using the cubectl bind commandkcp bind command that we have seen weare providing the reference to APIexport that we created in the providersposgress providers database workspaceand we are accepting some permissionclaims for secrets andnamespaces we will see how thatworks okay and now we need to somehowconnect this all together as you haveseen the API export is empty and we needto somehow get the theuhthe sorry about this the serviceprovider and later on the service ownerto get the resource that you want toexpose and which in our case is adatabase and for that we are going touse something that's called API syncagent it is a bit opinionated controllertechnically you can write your owncontroller that's going to do all ofthis and that's going to be adhered toyour requirements but we provide thisone as a starting point that you canlater on build for example you could usesomething like the multicluster runtimeto build a mult multicluster awarecontroller but we are going to use thisone as it provides a very good startingpoint we have installed it in the stepzero and for it we� created a cube configearlier on and now the thing why we didthat is that in the cube config insomething that's called the currentcontext we define what it is by that waydefined what's the current workspace tobe used and when we provide it to theAPI sync agent it's correctly going toknow with what workspaces it needs towork andfinally we are going to first switch tothe provider cluster so that the kindcluster that we have created once againand there we are going to deploy the APIsync agentActually we are not going to deploy theAPI sync agent itself but it's CRDcalled published resources we will lateron run itmanually and now I am going to create afew published resources that we aregoing to see in amoment okay so what is the publishedresources thing that we created it is aCRD that's provided with the API syncagent and it's primarily used to connectthe API export that we have in theprovider workspace with the providercluster that we have in our that isbasically our kind cluster so we arehere specifying the inspect the resourcethat we want to provide and this is thecluster in posgress SQL CNPG io group inthe version we want so this is theresource that we have in the kindcluster in the providercluster then we specify some rules likehow is the resources going to getsynchronized and we specify the relatedresources so what is this going to doaside from the thing that the API syncagent will connect with the API exportand provide the cluster resource in ourKCP workspaces other thing is it that itis going to psych the objects that wehave that we create as a clusters in theKCP in the uh our user workspaces downto the provider clusters where theposgress instances will be actuallyrun and it is also going to synchronizethe related resources so for thisposgress we need a secrets that containsthe credentials and is going to be takenfrom the user workspace again to theprovider cluster so it is pretty muchproviding binding from KCP to theKubernetes cluster which is used as aprovider for the res for the whatever weare providing and it is also supposed toconnect the AP so we get API that is inthe provider cluster we get it in theworks in the KCPworkspaces okay so this is all alreadycreated api sync agent this time we aregoing to run in our terminalso API sync agent we provide the namespace we provide the API exportreference and then we provide the APIseek agent cube config that wecreated uh let me just check somethingokay so here you willneed just a very moment pleasewait let me just check this cubeconfig so main point of this is whilehe's checking we have a provider clusterwhich owned by database team databaseteam only database team has access to itand we use KCP as amultiplexer meaning everybody gets aworkspace you interact only with yourworkspace and there is this connectormultiplex sync agent in between toenable this stuff like not confuse thatKCP and Kubernetes KCP acts as a APIgateway where it extends the existingmulti-tenency basically ally in amultiplexing wayokay uh just a quick thing so if you runinto this problem you can run thiscommand down there so with the cubeconfig that we created for API syncagent you can provide it like this withenvironment variable with cubectl setthe active workspace to root providersdatabase so this command here I willkeep it for like 10 seconds you canmaybe take a picture or write it downsomewhereokay was it okay quick handsupokay so what do we havenext we have the API sync agentrunning and now we need to start anotherterminaland we are going to prepare thisterminal by running what we have here inthebox and now we are going to switch tothe consumer workspace one that wecreated for posress SQL and we are goingtocreate first thedatabase and just asecond well let me seelet me make sure I am in the correctworkspaceokay that's definitely not supposed tohappen so it should be the API bindingssee what's happeningthat's what you get when doing stufflivethis should correctlet me switchthe Yeah it'spolluted so now you're going to have tosee a bit of live debuggingso every API export has a uniqueidentity so if your APIexport like you start something toconsuming it somebody modifies identityi think in this case we just basicallyoverrode theidentity of the exportthat'snot at all yeahthen we can just run the APIagain yeahso now did that stuffyeah you need to run the API yeah youneed to restart it okay so sorry aboutthiswe don't forget to deploy the posgressuh SQL operator like we did right nowbut uh mistake happens don'tthey and let's see if it is going towork right now oopsso we are just checking yeah this looksokay and then we can continuefrom from creating thedatabase and looks like it has beencreated we first created the cluster andnow we are going to create adatabase and we are going to get acluster okay it is reporting back thestatus so we are waiting it for to setup the clusternow if you would go back to the Let mesee if I can do that in themeanwhile maybe it's going to getup cluster is at the healthy statethat's looking really goodand now let's try to see it in theservice owner how it lookslikeso we are now making some assumptionhere like that we can access all thecluster including the service ownercluster we can see here that we have theuh name space for our posgress instanceand we are seeing itrunning so this is looking really goodand like we can even get the cluster inthis name space or let's make it easyget in theall name spaces and then in the end weare going to switch back again to theKCPsetup we are going to go to theconsumer we are going to get thedatabases theclusters and for the end I think it wassecrets and we can see that everythinghas been created that properlysynchronized between clusters and forthe very end like what we can do we willgo to the provider cluster ideally inthis case the provider should give yousomething like ingress or something likethat to access thecluster and let mesee export QCT get what'sa let me see if weget yeah we have the name space oh yeahthe namespace name thank Thank youokay and we very quick create the otherterminal okay let me do this way thiscan be copiedeasily because ofthe Yeah we'll probably cut it short butI can very quickly try to do somethinglikethis and there we go okay uh we arerunning out of time so I will quicklywrite the Robert yes sounfortunately taken a bit of time and weare at the end of our sessionbut there is still quite some contentleft that we've missed uh there ismulticluster runtime we had an exercisethat you can do yourself that uh fromthe same docs page that you have visiteduh just follow the uh the steps the dogsum and you can you can try ityourself uh those are if you areinterested those two at the bottomtomorrow there are there is a anothertalk by MJ and Nabarun uh resource modelbeyond Kubernetes workloads and formulticluster runtime there is uh at15 hours uh a talk about multiclusterruntime so if you are interested uh giveit a goum do you have some closing words so asalways we overshot the content so sorryfor that we understand that it's verycomplicated topic to deal with so feelfree to run through thecontent again if you want on your freetime you can find us on a slack kcp devand kubernetes slack i would really wantto say thanks to Robert who didn't had achance to do the keyboard clicking on uhhere because his part was the mostcomplicated and the last one so foreverybody who does the MCP part have itin mind it's basically uh the lastcomponent controller runtime there andif you have any questions about all thisstuff today in the afternoon we have aproject booth in a project pavilion lookfor 5A or 5B I think one of those wherewe'll be hanging out and if you have anyquestions bring your laptops samecontent we can happily we're going toanswer you and really thank you foreverybody who stick to the end andcomplete it as much as we can it's a bighelpand if you have any feedback again Slackor just approach us cool and for this Ithink we even run out of time forquestions so we're going to stick aroundhere for a few minutes if you havesomething and thank you for all theco-presenters and presenters who've beenon the stage[Applause]2025-04-15 22:00:15.611654 � K��R�U#��[ANCkHrvqFMl8uh welcome everyone to our session aboutreliable uh Kubernetes resourcesubmission and bookkeeping um so let'sintroduce oursel first i'm Yaoin i'm Kenum so we work in the workfloworchestration team in cloudnativecompute services along with uh othersimilar Kubernetes based platformsinside Bloomberg um yeah we have a fewdedicated team Kubernetes based andwe're interested in in this wholeecosystem we maintain a highly availablecontainer archeration that platform forour internal engineers in Bloomberg uhthe platform itself execute users run tocompletion workload for uh general usecases um so typical use case includeslike machine learning pipelines uh CI/CDpipelines machine learn uh machinemaintenance routines and uh financialanalysis or any general data processingum so for these use cases uh we uh haveuh both some functional andnon-functi��T#��?AFb_3dWJdY9Ihello everybody please grab aseat as you might notice in the agendathat's a workshop tutorialso if you want really hands-onexperience get your laptopsout so we're going to try to keep itvery as much informal as we can whereasquite a few of us so once we end up withour part on the stage we're gonna startmingling around so if you have aquestion or something is broken ask ahand i'm very cautious there is 500 ofyou and five of us this means everybodyof us gets a hundred so please help yourtable friends co-workers colleagues ifyou canso that's and we have few other KCPfriends and family so if you have aquestion during the like silent time andwe catching up there's a microphone inthe center just grab it ask it so let'snot wait for the Q&A at the end there'sno microphone actually that's one centerthere iscool so we're good to start so welcometo exploring multi-tenant KubernetesAPIs and controllers with KCP the linkwill be always in the footnote of theslides this is where we be hanging outbut before that a bit ofintro so prerequisites we will requirefour shell terminals Linux MacBooks westrongly recommend using GitHub codespaces we're going to show how to set itup hence everybody has the sameenvironment and we can debug faster butif you feel comfortable use whateverterminal you like and for the startersgit is basically what we need and we'resetting up everythingelse a bit of intros i'm MJ i'm staffengineer at Castai and��onal requirements thefunctional part may be easier um there'sobservability scheduling or eventingapproval process needs to be integratedinto the whole platform to make surethings are properly revealed beforetaking effect um but the non-functionalcan be less obvious but that is probablythe most difficult part in the wholesystem um yeah so in Bloomberg we valuethe data center resiliency seriouslyso what does that mean that means uh ifone data center goes down our platformshould be seeming still working uh toour users because um everything shouldbe highly available um by what'savailable on the other side of the uhsystem um another healthy data centerso before diving into technical diagramswe can first understand what our user isdoing with our platform uh to the basicthey need to run their workflows uh ifyou are not familiar with Argo workflowsArgo workflow is something you candefine your steps like in deck uh theydepend they can be depends on each otheror span from each other uh executecontainers in particular sequence forexample uh I can generate a report firstand then persist somewhere and thenfinally it notifies me when everythingis finished um for that to happen weneed a few other resources available onthe cluster is executed on uh such asconfig map and secrets and for workflowspecifically um there's argo workflowtemplate and cluster workflow templatefor the workflow itself to referto so let me put a abstraction layer onthese things so for workflow we just putit as runnables so it's of a similarconcept of other things like kubernetesnative jobs or any other um customizedjobs and the other things because theyare expected to be available on theapplicable clusters so we call themdeployables they should be consistentthey should be there whenever neededah okay so uh that's the assets we helpmanage um but how do we manage that umfrom a very high level we this is thediagram of um a very s simplified formfirst we offer a workflow API uh inother slides we may refer it to um asuser API um this API first handles themutation requests from our users tochange things on thecluster and then the API also exposeread endpoint to allow users to retrievetheir assets but things can be a lotmore complicated than the single diagrambecause we have a farm of clusters tomanage also diving into those clustersthere are at least two data centers umfor each of the sets that we allocate uhwe provide for our users so if one datacenter goes down say the Oscar side ummaybe caught on fire like the airport umor we put one side down for maintenancethen the other side should be able tofunction as usualso uh and then we can look at uh whatcan exactly goes wrong for each of theuh resource types for runnables uh whenour API submits to a cluster it needs toconsider which cluster is more availablefor or more suitable for this workloaduh also it may need to handle umslightly more complicated logic likeretries if the error seemstransient um there can be more featurerequest that requires more sophisticatedlogic for deployables well theconsistency is the top concern uh howcan we guarantee the consistency at thechange time and going forward forexample if the cluster was put down orgets rebuilt um something like that howcan we grab the original expectedresource from somewhereelse and then on the other side of thisloop uh the read request the difficultyseems simil similar for runnables anddeployables first of all if all the readuh say the list it consumes so much CPUresources if all those requests actuallyland on the Kubernetes API survey itselfthen that's too much load it may evendowngrade the performance of the clusterin terms of orchestrating the workflowitself um also we need extra layer forapproval or auditing purposes for uhauditing user actions and the jobexecution within this cluster sogenerally we need system resiliency uhon top of this individual clustersum there may be some requirement uhregulatory requirements in your mindthat's related to this do uh related tothis setup um there's something existingand something coming up we won't listall of them righth�ere so uh now it's time to get into thesolution spaceum so in terms of to so in order to uhexecute those sophisticated logics weneed to separate the API and thesubmitter that actually submits thingsso first the API needs to store thingsin the uh audit database as user actionsso it serves audit and uh approvalhandling purpose and also this runnablesare actually sent to a message streamfor actual processing the submitterwhich is hidden away from our users willhandle those complicated logic likeretry um deter by determining if theerror is transient or not also likefeature support say verify if theworkflow is still valid for submit aftersometime for deployables things can besimilar or slightly different um thisshows a slightly different view the APIwill only needs to put the deployableobjects into the audit and the source oftruth based on the form they suit it andthen there's a synch service reads fromthe source of truth database andactually land things on thecluster um this setup may seems like apull model but uh in but that's purelogical so in practice you can actuallyimplement a push mode just like what wedemonstrate in the lastslides so put them together uh these twokinds of resources can be handled uhdifferently uh but they are uh hiddenaway fromusers so um that's the submitting parti'll now hand over to my co-speaker forthe other half of the loopthanks y so after the resources land onthe cluster um there are multiple postdeployment status tracking use casesthat we would like to support so let'sfirst imagine that we want to build auser interface um that displays um theexecution results as well as theuserdefined object of all the inclusterresources um as we mentioned before dataresiliency is our top priority concerntherefore we expect the UI to behave andreturn the accurate results even thoughthe cluster is donemoreover same as all other userf facinginterfaces we want our user uh I mean wewant our UI to have low latency and highperformanceso as you could imagine getting theobject and execution results ofincluster resources requires a lot ofinteractions with the coupe API serverand then that will increase the latencyand our UI performance could bedrastically impacted if our designrelies purely on the interactions withthe incluster coupe APIso another key use cases that we want tosupport is to preserving the historicaltransactions of incluster Kubernetesresources imagine that a user wants tosee all the historical updates andversions of a config map in order tobetter understand what's get changed atwhat time in order to debug adeployment moreover preserving thehistorical transactions for resourcescould help us with auditing purposes byanswer the key questions like who madewhat changes to what resources at whattime so let's first talk aboutpersisting historicaltransactions as y'all already mentionedin couple of slides before we alreadypersist and preserve the transactionsinto a highly available database so forexample if a user wants to make updatesinto config map then all the updatesattempt will be stored in that databaseand now all we need to do is just toexpose a userf facing API for the userto retrieve the historical transactionswith those givenresources so in order to support theuserf facing interface use case we wantour design to be resilient againresilient against data failures datacenter failuresso therefore um we introduced a solutionthat uses highly available data storageto persist the latest object andexecution results in the highlyavailable datastorage so to achieve that in ourworkload clusters we build and deploy amessage producer that watches the updatecreate and deletes events for certaintypes of Kubernetes resources andpublish those resource information intoa message streamand on the other hand we have a consumerservice that updates the data storagewhat we call inventory in this casebased on those type of eventsso to give you an example let's say auser creates a config map in a clusterand then the resource status watcherwill publish a event with type create aswell as the config map spec into themessage stream and then the consumerwill receive this message and create anew record in the inventory database andsimilarly if a user deletes a config mapthe type delete um sorry the event oftype delete will be published into themessage queue and then the consumer willdelete the record from thedatabase on the other hand in case ofcluster downtime user can still usingthis interface I mean the userf facingAPI to interact with the database toretrieve the resource status without anydirect interactions with Kubernetes APIserver in cluster so this ensures thedata resiliency of this our designso while this design sounds prettystraightforward however um we couldeasily encounter some concurrency issueso in the next slide we'll go over oneof the corner casesscenarios when things could potentiallygo wrongso now imagine a situation um where auser tries to delete a config map bycalling the user API so the API theninteract with the source of truthdatabase and deletes the resource fromthedatabase our syncer will then detectsuch change and dete I mean deletes theresource from the clusterso in any sunny day cases when thingsworks as expected um producer willcapture the deletion event and consumerwill delete resources from the inventorytable however uh imagine what ifproducer somehow gets out of memory at atime or is being rebooted bysomeone so in that case producer failedto capture the deletion event and eventwon't be published into the messagestream because the deletion event isactually missing um the consumer on theother hand will not be able to deletethe record from the inventory tableso this leads to what we call zombierecord in the inventory table and itcould cause problem when user try to askabout theresources for for instance if a user askabout hey if my config map still existsin the cluster the user API will try toget the answer by interacting with theinventory table since there's a zombierecord the user will get an answersaying the config map still existhowever the actual resources was alreadydeleted from thecluster so to solve this problem umbesides the create update and deleteevents uh the producer also publishedaverage snapshot of all the inclusterresourcesyou could imagine the snapshot containsa list of Kubernetes object UID as wellas the clusterinformation the consumer on the otherhand will take this information andcompare with what exists in theinventory tableso if a record exists in the inventorytable but not in the operate snapshotthen we have identified the zombierecord the consumer will delete the resthe record from inventorytable now if a user tries to use the APIto ask about the resource again he willget the correct answer this timeso this method ensures the inventorytable is consistent with what actuallyhappened in our Kubernetescluster so to summarize at submissiontime we face different challenges fordeployables and runnables we haveconsistency and disaster recoverychallenges for deployables um in orderto mitigate those challenges we build asyncing service to sync deployables fromsource of truth across all theclusters to ensure the reliabilities ofrunnable submission we build a submitterservice that submits vulnerables todestination cluster from message streamand then such submitter would retry ontransient errors and it will haveadditional features like deadlineverification and once the resource landon the cluster for post deploymentstatus tracking we want to build userinterface and API with low latency andhighresiliency we also want to preservehistorical transactions for auditingpurposes and traceabilityso to accommodate the design goals andour use case we implemented inclusterproducer to produce the historicaltransactions and resources status into amessage queue uh sorry I mean messagestream and have consumer service thatpreserve those information in a highlyavailable data storage and then on theother hand we have userf facing API andthe UI to help the user retrieve thoseinformationso that brings us to the end of ourpresentation today uh thank you all forjoining and we are now open to questions2025-04-15 22:00:16.301321�oucan see if you have as good eyesight asI have um that allows you to actuallysit in your control plan and see eachindividual clusters uh what propertiesthey have again this this tied up to theabout API so we're going to add moreproperties well definfinedcommunitydriven properties into theabout and then bring up to this clusterprofile API we have i just put in thesame link there um the next one is afteryou uh decided that uh you have you Imean seen all these clusters and thenyou decide that you want to place someuh jobs place some uh workloads intosome clusters how do you do that rightyou you don't want to always kuddle intoeach individual clusters and do a kobiaapply so we have a a API called work APIbasically allows you to from the controlplane side it's kind of again think ofthat as Kubernetes right on the controlplane side you can place uh resourcesonto clusters just like when on theKubernetes side on the master node youcan place that into a kublet so it'sexactly the same idea and uh uh Augustwas going to and we are both going todemonstrate how that works in real lifeand the last one is when you place allyour applications into individualclusters that is thatNot really right you still need them toeither talk to each other or can beaccessible from uh from your users sothat's where this multicluster serviceAPI comes into being this is allows youto actually allow different clusterdifferent services with the same namethere's an uh namespace sameness we'rethere's going to be a sig multiclustertalk tomorrow we'll get into moredetails there so you need to have thisnamespace sameness and then for everyservices say every service a they canactually actually talk to each other orallow another serviceor external users to talk to yourservice even if they are sitting indifferent clusters right that's what youneed you don't want them to individuallyspecify I want to talk to cluster A'sservice and cluster B service andcluster B goes around goes away then itdoesn't work so you want to give thisidea of all these services are the samethat's where the sameness comes from andnow we are back to August about uhdemonstrate some uh real coolright so I'm going to talk to you alittle bit about an implementation ofsome of these APIs and some enhancementsto them and that's through the opencluster management project or OCM youcan visit us down in the booth all thatkind of stuff what is open clustermanagement very high level obviously wecould go much deeper but it's a hub andspoke model deployment right so wefollow the hub Kubluit approach um theynamed them clusterlets which I thoughtwas cute something that runs on themanaged clusters and it gives us a weakautonomy so we get resiliency so ifthere's a network severance or somethinglike that we are able to uh remainrunning without losing what's happeningon your managed clusters ourimplementation of work we call manifestwork uh that's as Ryan said how wedeliver resources to the managedclusters now we have added some thingsto the offering there uh our version ofplacement well we call it placement butthat's our way to dynamically selectclusters based on some resources i'llshow you it very briefly and Ryan willdive into it a little bit more uh theidea is that we can select clusters in auh certain way if we want more CPU ordifferent characteristics uh we can useplacement to land on them and then atOCM we include something called add-onswhich are a modular and very easy way toplug things in so if you have somethingin a project and you say "I want tobring that along i want to I want tomake it part of OCM i want to use itthere." You can go through the add-onframework now we're going to go throughevery piece of this one so sit backshould takes about 40 minutes no I'mjust kidding haha um basically I wantedto demonstrate how we map some of the uhSIG APIs against what we call thembecause sometimes that part can be a bitconfusing i was confused by that so whenyou look at things like the about API wehave a cluster claim when you look atthe multicluster services API we dothings with Submariner and we might make�it so it's easier to manage all youryour services these are justimplementations of what's happening inthe SIG we don't necessarily expect themto become dominant or anything like thatit's just how we use them and allows youto do different things with them so wehave the work and the cluster profileAPIs so that's fine uh it's something tolook at it'll be with the deck so I wantto now quickly run you through arecorded demo I have that I'll talkthrough of deploying some work and aplacement and let's see how wegookay so what I have is two h one hub andtwo managed clusters and as you can seeI have manifest work i ask the hub aboutit it knows the managed cluster ofcourse doesn't know anything about thatyeah is that big enough it doesn'tmatter i can't change it um so we'll goahead and do this against the hub theCRD is there and just to prove that itexists I like to look at these things umthere it is so the CRD is there we'reready to go okay so we run agents and onthe manage cluster so we have aregistration agent and a work agent whoare able to then grab the informationwhen they need it now here's what I'mgoing to deploy this is a simple pieceof work so this is my manifest work andI'm going to deploy it to a namespacecalled cluster one which will match thecluster that I want to deploy to as Ryanwas talking about it's a service accountit's a deployment of Inenics it's aspecific version there's no magic hereit's just a very normal deployment soI'm going to apply that against the hubuh the piece of work against the hub andthen I want to go ahead and take a lookat what's happening there so I'll usecluster ADM which is an OCM tool youdon't have to but I like it because Ilike how it can display things and howit can interact with the uh environmentand I play with OCM so here is thepieces of work that have been deployedagainst cluster one and so we'll dive alittle bit more on the on what'sactually happened right there's no podsin the cluster one name space on the hubit's obvious but I just want to be clearthat this isn't happening on the hubthis is being sent out to the managedcluster so I'll switch to the managecluster and let's go ahead and look andfind these things in cluster one okaythey're not there right so why are theynot there i said it had to be in thatnamespace that's for managing thecluster when you look closely I wasactually deploying to the defaultnamespace so I can put it anywhere Iwant it just have to have the namespacesameness across the cluster naming so ifwe look at the pods in running oncluster one I can see myenicsdeployment so we can see that the thework has actually sent that out to theagent and it started that upand then if I check on cluster two Ishould not seethem and there they're not happeningthere so it's done exactly what I askedyou know no no smoking mirrors um solet's take a look at the work on the hubthere's the hub in the cluster onenamespace which matches the name of mymanaged cluster and there's the piece ofwork that the manifest work uhdeployed so let's be sure this reallyexists i like my proof of life so here'sthe the deployment running that's myIngenicsdeployment and then furthermore Ideployed a service account and againthat's just to show that this gotthrough and I also deployed um the I cantake a look at the work and what'sreally cool is when you use this methodto deploy you get all this informationfrom the work so I can now do automationbased on these things i can learn aboutthem i can grab stuff back in the specwhere I find out the status of somethingi can alert on it i can um see when itchanged when it does things so I'mgetting all this for free by using thework API makes my deployments across amanaged cluster much more resilienteasier to workwithnow this monitors the agent monitors thework through a hash so if I do loseconnectivity or I make a change but I'mnot talking it'll see a change in thehash it'll know to grab the new work andupdate things for me so we've thought ofeverything right so what we'll do islet's actually go ahead and look insideuh one of the pods because what I wantto do is� then make a change to the worki'm going to change the version ofInenix and so you can see that actuallyget deployed across the the cluster thatwe've used so I used an old version ofInenix just because uh that was thereand it was easy so I'm going to use anewer version to um see that it canactually go ahead and deploy those podsand make something new come out on it soI've changed it to1274 and then I simply just have toredeploy thework and once I've done that it'll sortit out for me so that I can go ahead andlook at the different clusters and I cansee that the work has been deployed umit's restarting the containers it'sdoing everything I asked which is justupdate the version and in no time mywork's been pushed out to wherever Iwant so I've managed everything from thehub right i've gotten all that value outof of these work APIs simplest versionyou could come up with you could showand I did my 16-year-old she found itboring okay she's not into that but wenow can see that it willwork and so let me just prove to youthat the pod is actually been changed imean that's kind of obvious but again Ilike to really see it that's probably mymarketing background so there you go wenow have the running uh container sowhat I really want to do now is justquickly jump into an example of verysimple example of our placement so I'mgoing to clean up the work that we havehere because I only deployed to onecluster and I'm going to show you how Ican deploy it to uh multiple clustersthroughplacement okay this last bithere so back on thehub and what I'm going to do is usingcluster ADM I'm going to take a look atthe clusters now remember I have twoclusters and what I do in those is Ihave two I have cluster sets ocm sets upa default like a global one but I have adefault cluster set I've built that is away of grouping clusters together insome way that makes sense it doesn'treally matter what they are but that'swhat we do i can bind the namespace thatI want to use to that cluster set whichwill allow the work to take actions inthatnamespace and then I just have to createa placement so a placement just simplysays and I am going to pause this for asec a placement simply says put theseclusters based on something mine is sosimple it just says put it anywhere andI say find two clusters that meet thesecriteria i only have two so it's goingto find themboth yeah so that'll deploy it to Iwould expect to see the work on bothclusters based on what I'm doing so I Icreate the placement and now all I haveto do is create the work to use theplacement and the placement will takecare of placing the work so the namesare you know talk to the Sig about thenames but I love them because I don'tforget so the placement exists I it'llhandle all decisions it'll work fineit's the same piece of work as you sawbefore except I don't have to explicitlycall it as a manifest work it's justnormal YAML so the deployment is set togo ready service account the wholething and then I use cluster ADM becauseI like to to create that piece of workin the exact same way I did before but Itell it what placement to use so Iexpect the placement to read the rulesryan will go into it a bit more deeplyand I will then be able to place thatwork across my multiple clusters soyou've seen it do it it's now createdthat my first work work on the bothclusters and then I can prove that byshowing the work running on thehub so again I didn't have to actuallyexplicitly call that kind it simplybuilt the work for me and there it is soI have work running on both thosenamespaces on the hub meaning that bothmy clusters with those names will havethe work running on it in the defaultnamespace as we talked aboutbefore and there it is running oncluster one and there it is in a secondrunning on cluster two and thatdemonstrates at the very simplest levelof how we've implemented the work APIand added placement to make that workand now I'm going to show you how to dothe same thing with uh Kuble fleet fromRyanthank you very much um yeah so uh yeahthis is uh thank you so we have again wehave this new project called Kubi fleetth�at is recently contributed to CNCF soa little bit about Kubby fleet um it'sagain very similar a lot of concepts arevery similar with OCM so we provide youa single pane of glass for the fleetmeanings to do uh multiclusterapplication management just exactly likeof this uh demonstrated you have one hubcluster you do of thing and then it willyou don't have to go to your each eachindividual cluster imagine you have like20 50 clusters that will save a lot ofyour uh headache and based on that weprovide a lot of scheduling capabilitiesbasically we copy pasted a whole bunchof Kubernetes concepts affinities umtopology spread preferred requiredduring scheduling preferred duringscheduling all things like that and wealso introduced some uh differentflavors of policies uh placementpolicies just as uh the demo is we canyou can pick some how many clusters youhave or pick all of them like a demonset and also we add a property based uhscheduling that is why we uh tied intothis about API cluster profile API iswhen you have the properties normallyone of the main reason we haveproperties on that is for the admin tosee which where you put it so we haveproperty based scheduling and we finallywe have recently we built this buildinga continuous deployment strategy becausethe uh yeah we we actually have a demouh tomorrow uh if you are moreinterested but today I'm not going todemo de demonstrate that um and then thesame thing uh exactly like what OCM uhshowed we implemented all these APIsmost close to as close as we can to theuh sigmoic cluster standard and uh let'sjump into a demo uh anyone want Want tosee a live demo or recorded demoyeahokay that's thespirit let's see okay so here I've setup a fleet i don't know how is it Is itIs it big enough big enough i can makeit even bigger go bigger if you can yeahokay that probably that's the biggest Icanget yeah that's live so that you knowthat's all or errors you made will bethere um the problem with live a big oneis you probably cannot see the rest ofit so you can see that it has availableCPUs i don't know if I can I don't knowhow to yeah but you can see we have onetwo three four nine clusters joined tothis fleet right and you have there areproperties properties at the right sidethere's available CPU available memory ithink I cando whiteoops I cannot spell so then you willalso see available allocatable memoriesthings like that those are theproperties and uh clusterpro we create a cluster I cannot spellprofile so we create a cluster profilefor each cluster um that's but thecluster profile currently doesn't haveall these properties yet so we cannotreally do schedulings based on that sowe showed you the cluster profile thenext thing is the cluster um propertiesright about API so let mesee so the main uh placement equivalentAPI in the K fleet is called uh let mesee where's my this guy so thishopefully again if it's hard to see butuh you hopefully you can get some ideawe have this u topology uh topologyspread affinities but here is the wherethe properties comes in so we have aproperty sortter so I showed youpreviously you have uh available CPUs sowe can say in this one we would say thatuh because sorting order descending thenyou are going to we are basically tryingto find the cluster with the most CPUswith a weight 50% 50 of them and also acost u probably this should be ascendingbut there's another property so we cando all these if all these with Azure areuh Azure based properties and with thisum the kubernetes ones are the publicproperties and then we also haverequired during scheduling you know allthese things we want to be the memorieshas to be over 13 gig but I just come upwith number um because we have uhclusters that don't have that manymemories So this is how we useproperties um when the cluster profileAPI the link I send there when we have acommon uh communitydriven properties weare going to adapt to that so this ishow we dothe property and now uh let's take alook atuh take a look at how itdoes become EUso this is similar to what our placementisso again those are the let's take a lookat the stat�us so what it says it says uhfound all cluster needed found three sowe asked for three it give us three andlet's see what it has it has uh so it'scalled bug batch three uh bug batch fourright so let's just take a look at thishow do we So we have a we grouped awhole bunch of stuff oh sorry maybe Ididn't make it that clear here isuh resource selector so what did weselect we selected a namespace calledmulticluster app here right so basicallythe idea is I have application sittingin a namespace i want to decide the thefleets admin want to decide where thisapplication should run right it pick allkinds of u properties or criteria say Iwant to run there so now we actuallywant to place that uh application to thecluster that's where the the work APIcomes into being i'll show you somethingvery similar to what OCM does so nowlet's take a look at the namespace rightso the magic is uh we put it inthe corresponding name space there andwe have awork object there let's see there yeahso that object is there and uh I thinkafter the previous demo you're prettyfamiliar with it now hopefully uh it'svery similar towhat the oh maybe I didn'tdo so you can see it's the same thingmanifest name so what we place we placea namespace we place the service uh andwe place the deployment and then thelast thing we place the service exportso now goes to the last API we're goingto talk about service export so beforethat just like uh Oscars want to do Iwant to prove life like it's we actuallyplace it there right so uh let's see Ithink it wasthree see if it's there right so it'sthere let's just take a quick peekwhat'sinside yeah so it has uh deployment Andjust to do the same thing let's see ifwe pick something that is not therelet's see i don't know i don't thinkthis one isthere does it haveanything nothing right you can you canfind something just pick another randomstuff doesn't exist right so we onlyplace to where it should be so now comesback to what the service export does sowith a service expert so we place it inI thinkin three four and I don't rememberwhat's the next one just let's say threeand four right yeah so at least threeand four are there um so now you have anapplication at least sitting actuallythree clusters I just don't rememberthree clusters and then now you you havea customer right customer say that'syour store storefront right thecustomers store app want to want to buysomething from your from your store sohow do they do that they are not goingto talk to the C cluster specificallythey're going to have an endpoint sothat's where we put a uh DNS justbasically it's a it's a load balancer infront of it's a DNS based load balancerand that one is called let me see if Istill have the command it's long don'twantto this oneno ah this one yeah so we have ourtraffic manager there and this is theDNS name and uhum okay let me just so it has a DNS DNSnameendpoint and then I forget thename so I will show you how we use theservice export right so this is calledendpoints allright so it's anyone familiar with skateuh gate gateway APIs it's similargateway API you have a gateway API andthen you have endpoints so this is ourendpoints and then the way you use thatis u pointing toyour service import when when you haveservice export on your clusters weautomatically create a service import uhsomewhere else so that you can accessthat so we have that service import andI show you where the service inputis i think it's in the same class yeahso we have this service importyeah so it says how many clusters it hasthree clusters three four and anotheranother one basically my name there uhit has ports so that's that's how youget this uh in uh storefront and if youlook atthe this oneright so it has a DNSname ohthis one it has a DNS name and now youwant to dig digthat see if you can yeah now you seethis uh actually in here let me see if Ican get another anotherone didn't change for curl it a fewtimesand we can curl actually force it to useitwhere did it go dig it dig it againsame but that that's the part of lifebut uh just believe me it's actuallygoing to three different pla�ces uhthat's about the demo hopefully uh ifyou have any questions uh you can grabme again the the setting is still thereyou can play with it if you like sofinally um there's one shout out is uhour project is so new we don't even havea logo so uh if you are interested ifyou feel like your uh taste is good uhplease vote for the logo may the best uhlogowin okay that's it thank you very muchwe have a Q&Anailed it that's good timing anyquestionsuh yeah is there any Is there a micsomewherei mean I could just come up if it'seasierwhere is the question fromhe's there he's going to the michi uh so great talk thank you um I dohave a question um it seems to be rightnow oriented at uh spreading your blastradius if cluster goes downgeographically um but say you have justtwo clusters side by side and you wantto guard yourself against doing anoopsie and taking a cluster down um itmeans that you're if you are a developerand you want to present your developerswith a unified view say a nice Argo CDdashboard or something like that um itis still going to be representeddifferently because those tools as faras I know just like uh ISTTO or Kali wearen't really that multicluster awareand definitely not aware in thefederated sense of awareness is thereany integration with other sigs withother projects where they're working onthis or is that something that's alittle bit more further out into thefuture you think um yeah definitely sothis is just a uh tiny po portion of allour projects we have a a lot morefeature there i believe OCM also have alot of features like that so actually Iwill have a demo about uh how does thisKubi fleet work with Ago CD exactly howwe work with that's the beauty of ancommon API when you have a common APIthe other integr other projects canintegrate with an API instead ofintegrating with a specific project andthat's very much what I was trying youknow trying to call out is if we cankeep to this then those empty boxes yousaw on that slide they are anything intheory so that we can make sure we don'tgo down that road and that that will bethe way to do it yeah because it's aspec not an implementation yeah yeahthat makes sense yeah and because it isessentially a single control planeanything that could talk to anindividual API server would technicallyjust be escapable to talk to a hub APIserver essentially because it's justCRDs yeah i mean is there a minimumnumber of clusters i mean in some way itshould be one because you always want togrow so if I'm always managing in a in avery standardized way then I'm in a goodspace yeah i actually have a secondaryquestion which is probably going toannoy everyone uh although there's noline great um so say you have a work andthat's deployed and your spoke clustersare executing this work happily if younow go and manually modify that will acontroller or reconciliation loop resetthat back to the desired state goodquestion um and I feel like you you havebeen there um so yeah so at least forKubby fleet we have uh it's called driftdetection we have takeover uh driftdetection diff reporting all thesegoodies in the work API but we haven'tupstream that yet that's extra work weneed to get everybody uh agreed on theAPI and we can upstream all these uhgood features there right cool thetakeover part sounds interesting yeahi'll guess we'll talk about it later alot of times um before they can adoptOCM or Kobe Fleet they were doing thisold way right kubby Cado into everybodyand then when you when you take care ofover you want to make sure that you'renot accidentally override something thatis specific for that cluster incidentwill happen so that's definitelysomething um I know the real admins areworry about that's why we have thosefeatures right cool the third tinyquestion how does the authentication inthe demo is it just service accountsfrom cl from the hub talking to thespoke or is there other way around it'sthe we both have this pool model so it'sthe agent member agent actually isauthenticated onto the hub cluster sothe hub using different ways you can doall kinds of if you're a cloud provideryou can us�e oidc and get this uh umfederated identity things like that oryou can use old-fashioned um secretsyeah similar as the cluster secrets inArgo essentially yeah cool thank youvery much you're welcomehey thank you for your talk i have athink probably a very small question inoticed that the work API specs don'tspecify any resources so can you tell usmore a bit about like the scheduling andplacement like of workload how do youhandle like placement decision do youkind of look into the manifest and everytime evaluate the resources required toplace the workload what do we do therei'll leave with you the engineer so youknow u so the idea is u work is uhreally um there's no intelligence atbuilt into the work API the work API isa workhorse it it does whatever it istold the intelligence is in theplacement like we both demonstrated thatthe placement say you that's where wepick where to put the stuff when wedecided where to put we we just envelopethem into a work API and that workobject will get blindly copied into thiscluster that it needs to be so theintelligence is mostly on the placementside all right thank you so much you'rewelcomehello hello thank you for the talk um sois work intended to be a wholecollection of resources that's deployedto a cluster or can you have singlepiece of work that is split acrossclusters as well yeah because we allknow there's a 1.5 megabyte limit rightso if you want to place a lot of stuffyou you have to you have to break it anduh for the work API is just designed tohold individual ones if you you can putthem say you have 10 pieces you can putthem into one if they can hold or youcan put it into 10 different work workeither way works although I probablydon't want to put put it too fine grainthen as a human you have to manage itright you make it over complicated ortoo disastrous if it goes wrong you wantto make sure you you have thatflexibility thank you very much you'rewelcomehello uh thank you for the talk uh myquestion relates uh to what you saidaboutplacementsum it's it looked like placementspecification is statically defined soyou're saying hey this cluster has gotthis much CPU or this much memory ifyou're in a scenario where you're usingsomething like Carpenter which is goingto dynamically scale up based on theworkload that you're trying to put in ithow is how are you going to choose theright cluster to go toso one thing to correct it's not staticso I can demo again if you run someworkload into one cluster you will seethe available CPUs uh changed or reducedso it's not uh static there is actuallya lot of discussion in the sik aboutwhether we want to keep those propertiesinto cluster profile API because to usthose are very dynamic uh properties andwe have initially we decided that uh forthe about API we do not want to put inany uh like Stephanie is there you canadd more we do not want to put toodynamic uh properties into that so therecould be we don't from the sik level wedon't have a solution for that from thekoopy fleet side we basically uh collectour agent sitting in the member clusteror workload collecting order metrics andwe update that with a certain cadence sothat and I know OCM and other projectshave their own way to solve that but onthe sik level we don't have a solutionyet so if welcome to our sik meetingsand propose any uh solutions to us uhthat' be great and again we will have atalk tomorrow afternoon about the sikAPIs so that could be another goodquestionthere thank you very much uh I mightjust go one more question as it wasn't acute um where are you at withintegration with um gateway API at themoment because obviously youdemonstrated there something that wasn'treally aligned with how gateway APIworks so where are the two talking a lotI would say we talk I'm not sure if wetalk a lot uh last time we talked um thebasic idea is exactly what Idemonstrated but there are nitty-grittydetails that like How do you actuallyshift traffics between two same servicesum there are some um again nitty-grittydetails we still need to iron out butthe idea is the same um and we shouldtalk to network more or Yeah definitelythank thank you for this uh reminderyeah thank youuh hi there uh thanks for the talk uh uhcan we use the placement to uh make surethat different works are distributedacross a fleet of clusters so that thereis uh um the distribution of the workdifferent works are um more balancedbecause a lot of times we have multipleclusters but each of those clusters aredifferent in size uh can we actually usethis to maintain um balance the size ofthe clusters across uh across differentclustersthe the answer is definitely yes umagain this is not part of the u sik aAPI in our individual uh projects wedefinitely have those ways depends onhow you want to rebalance right thereare two ways to balance one is staticbalance so when you do the placement youyou find out all the clusters butclusters get out of balance after awhile so you can also do dynamic balancebut all all these are not building toour current uh siguins kleet can handlethat or we have different ways to handlethat and we have uh plans if you againyou want to know more we can talk um uhtalk individually down there but uh umat a sick level we don't have a younotice that we actually don't have aplacement API yet uh we probably like towork towards that if uh if possible butplacement is a very complicated uhscenario just think imagine howkubernetes how many knobs you can turnin kubernetes when you place uh podright so so we still we're pretty farfrom getting to that standard uh yet butI think hopefully that's the goal uh forthis s multicluster again we need moreinvolvement because we do not want to uhbasically decided for the communityright we this is pretty much thecommunity center and placement acrossdifferent environments right it's goingto really depend where you are andwhat's best and so some of it needs tobe left really for the implementation umdepending on where you want it so the atthe product levels I know we're dealingwith it in certain ways because our typeof product needs to be on metal orsomewhere else So I'm curious to seewhere that goes as far as placement it'slike the last the fifth API there thatisn't there the fifth API that kind ofsupposed to rule them all and but it ittakes some time to get there yeah i hadanother question uh I see that OCM hasuh uh the support for IRS for AWS uh nowI saw that in the latest release y um torun a hub on on EKS on EKS yes um I isisthat was was that like a is is Q fleetand OCM kind of differ in that like howhow these things are implemented when itcomes to authentication and is there astandard around that why was it missingin OCM beforewhy was it missing in OCM boy um well weneeded someone to do the work uh weneeded the AWS expertise to work with AMthat kind of stuff um there needs to bethe resources to do the work and areason to run it there and that getsdriven from different levels upstreamdownstream that kind of thing that's atricky question to kind of answer openlylike that so yeah it it's nothingblocked it just resources and and thethe team that worked on it just wantedto tackle it and was a and had thatknowledge of AWS to do it um we'rethrilled to see it running on therebecause obviously it runs nicely on OpenShift but you know the more we moreenvironments we can see it on the easierit is for for customers to to manage inthose different environments i don't ifyou have anything to add butunfortunately I cannot speak for OCM nobut I mean on your side so Kub Fleet asa as a open source project works on anyKubernetes so and uh it work on EKS GKEAK AKS whatever yeah so it there's no umthere's no um vendor it's basicallyvendor neutral uh it does have aprovider model just like a quanter sowelcome uh I don't know if any case GKfolks are here welcome you to provideimplement a provider there so that itcan seamlessly work on everything haveall the features if you don't have aprovider some of the features like theproperty like the cost property is theprovider that is only Azure membermember agents can provide that but uh ifyou running it on EKS GK you won't beable to get that thank you welcome2025-04-15 22:00:17.128045 ��|�V#��/AI9GV4N23dvEwelcome to our talk this is how the SIGmulticluster API specifications are usedfor real world multicluster managementmy name is August Simonelli i work withthe open cluster management project andI work for Red Hat and this is Ryan andI'm Ryan i am a maintainer for the newlycontributed CNCF project called KubbyFleet and I work for Microsoft Azurecool so let's talk a little bit aboutwhat we're going to speak about today sowe've done our intro so that's one thingdown um we want to kind of go over theSIG multicluster standards i'm sure manyin the room know them but it's alwaysgood to recap them and to sort ofbaseline where we are with those thingsthen we're going to demo somemulticluster because that's the funstuff uh we've got a mix of a a recordeddemo that's me and a live demo from Ryanthat's him sweating over there and thenwe can do some Q&A so the first thing Iwant to call out and again I know thereare members of SIG multicluster here butthe idea behind this is that we want todefine APIs and not implementationswe're not trying to say this is exactlyhow you do it we want to definestandards we want to define ways thatyou can work within uh a multiclusterspace and be able to add your stuff toit easily so to do that there are fourbasic APIs that Ryan will cover uh theabout API the multicluster services APIthe work API and the cluster profile APIand you're going to see those demoedtoday as well and why do we do thisbecause by adopting standards it makesit easier for the projects you work forand and add to contribute and to be partof this so whether it's something likeCappy or Couple Fleet or whatever it isyou're working with we make it by usingthe standard we make it easier for youto fill those empty boxes so you cancome along and say hey my project wantsto be part of multicluster and this isthe way that we can do that and the ideaagain is that once we have thesestandards the agreed upon communitystandards it's not about bringing yourstandard in and dropping it on and it'sabout finding something that works foreveryone so that is sort of the basicson how we do this i'll hand over to Ryannow to take you through a quick overviewthank you Alex um yeah so let's firsttalk about the the oldest one simplestone is called about API so what it isreally about it's about uh define someproperties well-known properties for acluster so probably if uh if uh I assumemost of you are Kubernetes practitionersyou know that there's actually no singleID for a cluster there's there just notthere uh if you want to identify acluster it doesn't exist in theKubernetes world so here is the what theabout API is going to do because whenyou have multiple clusters now you kindof have to identify each other so youcan distinguish each other so the one ofthe actually I we only probably have twothe one properties we have defined iscalled a cluster name basically so youcan see here it's the name is clusterone you can name it it see looks simplebut that actually is something thatkubernetes doesn't have and we areactually in the process of if you uh weare going to upload our our deck so thatclick into the link you will see we areactually having a cap discussing moreproperties we putting another fivedifferent properties that that we thinkthat will be useful so uh please let usknow get involved uh vote on thatcomment on that um so that's anotherthat's kind of tied up up to our uhsecond API it's called the clusterprofile API so the about API is reallysitting inside the individual cluster toidentify what this various properties ofthis cluster but in most of themulticluster world we need a single paneof glass for the the system and main thefleet and main to actually see what aremy clusters what do they do that's wherethis cluster profile API comes this isbasically allow you to let me see if youcan see clearly yeah okay hopefully y��h we use Postgress radius and allthat stuff um we run millions ofautomations every day and we run on topof GCPso why did we pickJRPC the team who started building thiscompany was originally working togetherat a pre at another company in thatcompany we were like "Yeah REST APISwagger Open API files and stuffbut we didn't like the code generationtools that were available at the timefor REST specifically in in the Goecosystem they weren't that greatso what ended up happening is that ourengineers didn't like the generated codeand they would go and implement clientsand serversthemselves which meant you had like fivedifferent implementations for handlersand for clients and it was a pain tomaintain it so our primary goal withgRPCwas this looks like a framework that isbacked by a large company alarge ecosystem opensource communitywe believe that the code generationtools available forthis tool will be great and this willreduce the workload that our engineershave to deal with and not have to thinktoo much about how to implement thislayer forAPIs both on the server and on theclient sideum gRPC is also a binary protocol it'ssupposed to be faster because ittransmits less data over the wire andanother really cool feature about it isthat it's backward compatible bydesign if you try hard enough you canbreak it but in general for most usecases it works great and we rely heavilyonit so what is gRPC for those who are newtoit grpc stands for Google remote protoprocedure call it's a highperformance RPC framework it's languageagnostic which means you have support togenerate clients and servers almost forevery programming language sure they areall kind of modeled afterJava style of interface and they are notexactly ideomeatic Go or TypeScript butthey do an excellent job it's backed byprotocol buffers uh which is a binaryprotocol for encoding structures it hasan IDL that lets you define strcts withsome primitive types and it's also aCNCFproject so as a developer working withgRPC is quite simple you define yourservices messages which are like inputand output for different methods thatyour service implements you use some CLItool to to generate the from this protolanguage the clients and the servers youthen implement your servers logic andyou use the client to communicate withtheserver[Music]um typically this how it works uh onecool thing about gRPC is that it usesHTTP2uh for communication which means thatit's able to multiplex multiple streamsmultiple requests over a single HTTP2connection and this is really cool forreducing latency anduh be way more efficient for uh dataintensive or uh high request rateapplications however the problem withHTTP2 is that it has some challenges onKubernetes with load balancing which wewill address in a bitso we said binary protocol over HTTP forperformance backward compatibility bydesign which was a really importantaspect for ourproduct a strongly typed API contractlanguage basically all APIs are definedin the same way across differentmicroservices you have excellent clientand server server code generation toolsuh with wide support for almost allprogramming languages you you care aboutnice Go support TypeScript support andit has built-in extensibility by designwhich is really coolfor annotatingyour proto l proto messages with someextrauh attributes for achieving stuff likeuh role based access control which Iwill show in abit so how do we make APIs work attorque we try to stick to these rules wewe want to lint everything everythingthat we can that a developer thateverything that we canlint if there is a tool for it we willadd it to our CI uh pipeline uh we don'twant to we want to maintain the samestandards for everything and make sureall our services look more or less thesame to have our engineers focus on thebusiness logic and not on thinking toomuch about uh naming or stuff like thateverything should be a standarduh backward compatibility is alsoenforced using a llinter which checksthat you didn't delete a message or youdidn't do any backwardcompatible breaking changes this meansremoving an RPC methoduh changi�ng a type of afield deleting fields is generallydiscouragedone thing we learned is that when youstart defining APIs you really want touh to share messages between differentum to share common messages betweendifferent implementations and differentservicesuh from for example you may have a userobject that you want to reuse acrossdifferent services and in our experienceit didn't work that well and it createlike tight coupling and spaghettibetween different APIsuh which is why we generally try toduplicatemessages or same messages and keep themunder the service domain that theybelong touh there is also a really uh handy toolfor those who are just starting out andwant to adopt gRPC it's called theGoogle API design proposal it's awebsite with tons of information aboutthe way they design APIs at Googlegranted you are not Google and probablywill never be but there are some goodlearning and good um information therethat will help you uh make the rightchoices when designingAPIs we certainly read everything but weadopted some certain parts of it whichwork in our case and I highly recommendlooking there because it's a really niceguideof how you should think about API designand implementation in your companyum in order to streamline thedevelopment experience for our engineerswe created a Docker image that containedthe plugins and the shared configurationamonguh that we use for generating our codeand our clients andservers excuseme and our developers use it on theirmachinethis means that the developers don'thave to install the proto compiler orany of the pluginsthere so how should youorganize protos in yourproject at we like multi-reos and thisis what works well for us we keep theAPIs in the service repository in itsowndirectory since we are a primary Gocompany we generate the generatedclients and the servers are alsocommitted to the service uh GitHubrepository which means if anotherservice that would like to communicatewithme they can consume the clients usingthe go model manager so they do just goget and they can use myclientshere you can see a typical directorystructure for our micros service you'llhave an API directory which hosts alltheprotoiles uh each service will have adirectory with a version that signifiesthe current API versioneach of those will have the protoilesinside in case uh some repos should beused in a front- end project we use theGitHub um artifact package registry tohost thoseso what benefits do we get fromgRPC if I had to summarize it with justone sentence it's unified APIstandards developers can move betweendifferent teams and they are alreadyfamiliar with the standards that we havethere is way less code to review becauseno one is implementing their own clientsorservers because we trust the codegenerated by thetools we have built a lot of middlewarewhich is kind of the way it works inrest API as well that does some reallycool things and provides a framework forour engineers so we use a middleware forauthentication authorization we use abunch of middleware for observabilitywhich connects really well intogRPC in addition there is a very bigecosystem of middleware that opensourceengineers like yourselves built for gRPCand it's super convenient to use usuallywhen you need something someone hasalready built a middleware or some toolfor itanother really cool thing that we likeis theextensions so below above you can see atypical VIP coder gRPCservice the standard syntax for gRPC uhallows you to define a method which iswipe code a request a response sobasically input and output for this RPCone cool thing about gRPC is that youcan extend it with this options uhattribute for example here Idemonstrated the way we do role basedaccess control in torque so each APIwill have an attribute which defines thescopes that are required to be presenton the session object thatuh when a request reaches the servicewill be verifiedthis is a really convenient way thatworks really well for us for managingeverything related to APIs in the sameplace um for example other uh extensionsthat we have are flagbaseduh feature management embedded �into gRPCfor example this means that if I don'thave a feature flag for VIPcoding I will not be able to code thisRPC and this is enforced in a in ourmiddleware uh and it's superhandy so what are the lessons that welearned and for the past five years andI think most of the friction when usingyour PC is in the beginning you have todecide what directory structure you'regoing to use how are you going to dodependencymanagement the way gRPC and protos areused in Google is vi monor repo and inour case since we wanted to use multiplerepos we have to figure out dependencymanagement byourselves sure there is a great protocompiler CLI tool which is available forall but the developer experience for itis not great you usually end up buildingyour own make file that runs it or uhyou have to build a docker image whichis 10 gigabytes of binaries that everyengineer has to pull to their machineum with front end it's also not superconvenient because let's face it front-end engineers like REST APIs they likeusing the network to tab in Chrome todebug the requests and uh they are notreally happy about binary formatsanother thing that we uh also faced isthat at some point you will want toprovide some APIs for yourcustomers sure you may like newtechnologies but your customers areunlikely to be happy with some gRPC APIsthat have to use and you will end upgetting some rest API in your codebaseanywayanother challenge is with Kubernetesload balancing so Kubernetes isn't greatwith HTTP2 load balancing out of the boxhttp2 uses long lived connections and uhsince there is no disconnection it willnot be ableto spread the loaduh between different pods once theconnection is established and you willhave to either implement some loadbalancing logic in your gRPC clientsusing headless services in Kubernetes oruse some kind of a service mesh or aproxy which does this for youspecifically at talk we use linkerd sowe get this out of the box and it worksworks really well for usmessage reused reusing as developers wedon't like writing the same code twicewe live by the dryprinciples unfortunatelywith gRPC it's not that great you shouldthink of your uh gRPC files inside aservice domain which means that if Ihave a user service and an order serviceI will not be reusing the user structurein the order service i will prefer toduplicateit dependency management with gRPC isnot that great and from five years ofexperience I highly recommend justduplicating everything instead ofreusing the samemessages within a single micros serviceit is fine to use and it's encouraged touse the same messages but not acrossboundaries of theservice grrpc web I mentioned it a bunchof times and it's really challenging andit is a friction a constant frictionpoint with front- endengineers another issue with it is thatit doesn't support caching so the wayit's implemented it's B 64 payload thatis transmitted using postrequests this basically breaks down allmodern common caching mechanisms thatare available in browsers CDNs and stuffso um it's not very great for thatin addition debugging it requiresinstalling a special browser extensionso if you have a customer complainingabout some issue you can't just ask themto open well you can but it will not bevery useful asking them to open thenetwork tab and send you the the HARfile usually they will have to installthe extension of course there are waysaround it smart ways around it butgenerally thisis a big pain point and gRPC requiressome kind of a proxy uh that translatesfrom this base 64 gpc web format intoregular gRPC so it's another middlewareyou have to install sometime some of theopen sourcecode may contain some security issues ormay create some logic implementationissues on your side because we don'tknow how it's fully implemented and it'ssomething that wefaced and again public APIs are notgreat with gRPC there is a plug-incalled gRPC gatewayit is much based on Google's annotationfor their internal APIs but it's anopen-sourceeffort it allows receiving REST APIs anddefining and annotating your gRPCservices with some RESTannotations and it does the job quitewellhowever uh your technical writers areunlikely to be familiar with uh thesespecial annotations for gRPC and youwill have to educate them on using itproperly in retrospect if we had to dopublic API again we would probably useopen API and define the swaggersourselves basically have a human definethis uh open API file umto help technical writers and be able toformat the uh the file the open API filein the way we want and not have somestrict uh closed box that is difficultto maintain specifically we have like apost-processing script that runs on theswagger that is generated from uh gRPCgateway and have it format and changesome things that we don't really likeand it creates some friction andmaintability issuesbut things are getting much betterthanks to a company that is calledBuff they took it's actually a bunch ofpeople who worked at Google andlearned about the pains of using gRPCwithout the internal Google tools andthey decided to build a company out ofit buff is a CLI tool which is acomplete gRPC protobuff ecosystemit supports remote plugins it has asingle configuration file that definesuh the way you would like the toconfigure theplugins it has linting and uh backwardcompatibility checks built in and infact we are using it extensivelyinternally and one thing that it doesreally great is dependency managementremember how I said that it's reallydifficult to reuse proto messages acrossdifferent services well Buff have aproduct for that that's called buffserviceregistry it is solving exactly thatpoint how do I reuse different messagesacross services and it's superhandy since Buff came to the marketabout two years ago we basically droppedour custom Docker image that had all theplugins and started using itspecifically basically it turned thishuge CLI command into a buff generatecommand and the YAML file thatconfiguresitanother very useful addition that Buffguys brought is something called connectRPC basically to to summarize it it'slike a new textbased format which isbased on uh JSON to allow gRPCcommunication you may say "Ah anotherprotocol that I that no one knows." WellI guess it is kind of that but the goodthing about their protocol and the toolsthat they built for it is that it hasfull backward compatibility and itsupport gRPC web it support gRPC and itsupports their own protocol in the sametool and in the same generated code thebiggest addition and the biggest valuethat comes from it is that theyimplemented JavaScript libraries fromscratch which keeps the front-endengineers happy now the the generatedclients are way easier to work with theyare ideomeatic TypeScript andJavaScript and it's been a hugedeveloper experience improvementso was it all worth it yes absolutely iwould use gRPC in my next project againfor the same reasons we picked it i likehow the generated code i like how it'sstandardized and it takes out secondgguessing or anydiscussions from thedevelopers standardization isgreat to keep everyone focusedall the requirements we set forourselves weremet and with the new tooling by Buff itmakes everything even easier to adaptand I highly recommend you experimentwithit thank you very much now it's time forquestions[Applause]sorryhi um right yeah you you you mentionedfront end hard with GPC web and andstuff i think Twitch have thistechnology or a gpc layer called twerpwhich is supposed to translate uh on thefly instead of instead of uh if yourhead's JSON it returns everything JSONrather than gpoing it up uh and sendingit have you ever have have you lookedinto that or considered any of thatstuff um to Sorry I didn't hear whattechnology you were referring to twerpfrom Twitch yeah yeah Twerp is cool ithink I actually looked into it and Idon't think like it's still beingmaintained and developed which is why wedecided to stick with something withbigger community adoption but it's agreat option i don't think it alsosupports streaming but I may be wrongnow anyoneelse okay thank you very much guys enjoythe rest of your conference and feelfree to reach out and talk if you haveany questions2025-04-15 22:00:17.745458 � ��D�W#��?Aq44WBAGzKhkso excited so many people are interestedin gRPC and to hear about my journey ofusing gRPC for the past five maybe sixyearsum so a quick question who here usesgRPC in production raise your hands niceso we have quite a few people who arenot using gRPC and this talk will beexcellent for you and obviously forthose who alreadydoum so a quick introduction my name isConstantine i'm the chief architect atTorque a cyber security no codeautomation startupi love working at cyber securitystartups and for the past 12 15 yearsI've been a member of several of them ihave a guilty pleasure i love optimizingCI build times who here loves optimizingCI build times raise your hand yeah it'sa rabbit hole you go down into and youyou can spend days on it shaving thosesecondsjust so you know what to talk is aboutand our what basically set the stage forusinggRPC few words about what we do and ourtech stack so think of us as Zapier forsecurityautomation ex sorry for securityanalysts people who have to buildautomation but they don't like writingcode they like graphical user interfacebuilding automations in the UIuh we are a team of about 50 engineersnow uh we domicroservices we use go we use gRPC foreverything which is a bit unusualbecause in the front end world front endengineers don't really like using gRPCand it's quite not ordinary for them todo it u��key uh and gives us uh somereally nice runtime characteristicsspecifically for running really reallylow latency starts so this is reallynice if you were running like a functionas a service kind of thing um and thenwe're going to wrap it with a witinterface make it nice and then run itwithin a wy compatible WMruntime we're gonna make that reallyawesome layered cake that Mary's showingand she looks so happy doing it i thinkwe're gonna be happy at the endtoo all right we're also gonna do itwith Go we're gonna marry Go andJavaScript together it's going to belike uh our our our trifle you ever seemake trifles it's like layer layer layerlayer well it throws a lot of differentflavors in there we're going to throw ingo and JavaScript this is how we wouldnormally go about this this is like thesimple case you know you go generate gobuild and out pops application voila sMagnafinow you don't get the same experiencewith Web Assembly now what do you getthis is a really complicatedrecipe ohboy i don't even know what half theseingredients are let alone how to combinethem uh we start off with WKG in factthat's actually pronouncedwackage that is for fetching your witdefinitions from an exterior sourceactually an OCI registry u and then mygo has turned into a tiny go which uhtiny go is not quite go just like cgo isnot goum node node kind of stayed the same npmkind of stayed the same but we'vestarted to add more tools to it like jcoj is uh your JavaScript uh componenttooling uh it's got a bunch of differenttools in there based on uh somestandardized Rust tools and we'll use itfor a handful of things like actuallymaking a component um and thenafterwards we have this thing calledwhack it is not whack it's actuallypretty cool um it fuses componentstogether based on uh interfacestructures from the wit and then fusesthem together to create a finalcomponent of them puttogether so what are youthinking oh my goodness and and it wouldprobably turn out that sloppy too it didfor the first five times for me so uhmaybe you all do better hopefully afterthis talk you'll do better uh so let'ssee how are we going to fix thisterrence tell me how we can fix this soI'm going to take us back to 2011 um andduring this time uh we kind of rewindtime uh this is actually a time periodwhen uh I was working at Herok at thetime uh I think that makes me kind ofold there and so we were actually takingso Heroku was a Ruby only platform we'reon this process of making this polygotplatform at the heart of thattransformation was this concept uhcalled Bill Packs and what buildp packslet us do is it allows us to extract allthe language specific bits um from thebuild process and let Heroku be thislanguage agnostic platform right and soa build pack would automatically detectwhat kind of application you had was ita Java Ruby node application um figureout what tools that you need to run uhas well as kind of optimize yourapplication for peak performance so youas the customer of this thing could kindof just focus on building yourapplication sounds great right and aspart of that um we actually open sourcebuild packs um during that time periodand um you know there was adoption fromvarious companies and people in thespace uh one of the kind of prominentplayers was pivotal um as they'reworking on cloud foundry and uh theyactually kind of uh were you know usedtook our build packs and then as uh asfork they actually like made a separateAPI had a totally separate ecosystem andso we were actually on these likediverging ecosystems and in 2018 we gottogether and created cloudnative buildpacks so we wanted to do kind of twoessential things we wanted to bring ourecosystems together because it kind ofsucked that they were separate and youcouldn't have build packs that werenecessarily compatible with both um andthere was this whole like Dockercontainer thing that was happening atthat time and so uh we also wanted toinstead of have a bunch of proprietaryoutput and artifacts actually create umthings that basically worked in thatecosystem played around with it um andtoday we're a CNCF incubation �projectand we actually just submitted ourgraduation application uh earlier thisyear so hoping that goes through butwe'll see how long that takes and so youknow build packs we're now cloud nativebuild packs kind of what's different soit's still uh source codeentric focus onapplication and what you get out isinstead of this proprietary artifactyou're going to get an OCI image youknow things that you can use with Dockerthings that you can put in NoCI registryright and uh what you get is this wellstructured application that has umlayers that map um logically to yourapplication itself and the build packsthemselves are these composable units ofthat encapsulate a specific technologyright and when I talk about technology Imean stuff like a Java Node.js GS uh umyou know like those kind of things orRuby Python um things that you'd want tokind of standardize uh how things shouldbe built right and um the other kind ofnice benefit is that they can becomposed together um and used so they'reexecuted on a a particular build packplatform but you can have a bunch ofbuild packs that are executed andworkingtogether and so as the end result ofthat if you're an application developerright um one of the things that youdon't have to do with by using buildpacks is have to artisally cra handcraft like that docker file where you'recopying and pasting lines and trying tomake this production thing work and fitall the things you want right you canleverage these build packs and they'llput all the things that you need to haveit run and have this image that you canuse uh for your application um with thebuild packs themselves uh they allow youto standardize how those applicationsare built so inside of a company uh youcan have Java built the same way acrossyour entire company so you don't havethese individual docker files for everysingle application in your company rightthat are mostly doing the same thingsbut you get these kind of snowflakesthroughout your company and thisessentially and I was mentioning thatbuild packs composable right and so whatthat allows platform operators to do isyou can actually decrease number of baseimages that you need to run right so umyou don't have it's very common right tohave like my base image like a real nineright it's like my real nine plus thisspecific version of java and then if Iwant the next version java that's atotally different base image so thatoperation teams like running those twothings and god forbid if I have a nodenojs image I have to run and then thecombination of those things right thecombinator tors become very high becauseuh you have to maintain all these thingsthrough application developers can usethem right and so with build packs youcan essentially have a single rail lineat base image and then rely on buildpacks to essentially do that compositionfor you um getting you essentially allthat organizational efficiency that youwant for that standardization and theeasiest way to get started is to use ourlocal CLI platform called pack uh youinstall it through homebrew uh it's partof you know the standard we have Linuxpackaging as well um through debs archetc uh and you just get that installedand you run pack build and the name ofyourimage and so kind of digging in on whathappens with a pack build uh so lookingat pack it's going to take two majorinputs um besides what you want to callyour image uh you're going to have theapplication source code and we have thisconcept called the builder image and sothe builder image is composed of a fewthings you're going to have those buildpacks that you want to vendor in andpotentially run and you're going to havewhat we call the life cycle so that'smaintained by the cloud build packproject and it's essentially the speccompliant thing that essentially runs abuild pack through well the life cycleof what a build pack should do and thenit's going to sit on top of a buildimage and so this is going to be kind ofthat build environment that you're goingto run your build in and um there's alsoa reference to the run image because onthe kind of output of this side we'regoing to put� all the layers that areconstructed as part of that build packon top of that run image and in ourexample um that we're going to gothrough in the demo today we're using abuntu build image but our run image canbe a dist image right it doesn't have tobe the same thing but it has to be athing that you can actually build andhave actually execute in thatenvironment right and so let's walkthrough what if a build through packwould look like right and so uh in thisexample I'm just going to take astandard Node.js application uh we'regoing to use Yarn in this case uh kindof at the top so the first thing thatkicks off every single build pack buildis the detect phase so we need to figureout what build packs actually run rightso you can include in your builder allthese build packs but not all of themhave to run and in this case the firstkind of grouping is that passesdetection uh is going to be one thatactually gets executed in this casethat's going to be we see that there's apackage JSON there's yarn.lock you'regoing to want to run kind of that NodeJSbuild pack and then a yarn build pack tokind of uh do that setup andinstall and uh they get to create abuild plan and so this allows buildpacks uh you know talking aboutcomposition to actually communicate andtalk to each other right and so they canhave uh information around like what isneeded uh and things like that and thoseget used as part of the detect uh tocreate create this build pack plan uhI'm going to skip through kind of therestore and analyze because that's notreally used as part of the first buildwe're going to cut through to buildwhich is kind of the bulk and meat andpotatoes of the build process right sothis is where the build pack's going toactually like install your node runtimerun your yarn install kind of set up allthe things to actually uh get a runningapplication and they're going to get putinto specific directories uh one perlayer um and then all that gets packagedup in the export phase uh into an OCIimage uh as well as any of the metadataspecific to that layer uh so we candecide if we actually need to uploadthat layer on a subsequent build andfinally uh layers can be marked if theyshould be cached or not so we can usethem uh in the next build to decide uhyou know I don't have to install my allmy dependencies again right um I canreuse layers um and I can even modifythem uh between builds so this is kindof what if we're to blow this out uh bythe phases on the left or my left Iguess your right and uh kind of the OCIimage and the cache image right so theseare the things we're going to beuploading to the registry and you know Iwas mentioning how the layers are mappedlogically to your application and youcan see in that OCI image right like wehave a layer specifically to nodemodules and it's not just like somerandom command that we ran in a dockerfile like the build pack author gets toreally set like what goes in eachparticular layer and if we go through asecond build uh we're going to do detectum not going to talk much about itbecause it's going to look the same asthe first build uh so now we're going toactually get a restore so we're going togo through and restore all the cachelayers kind of that bottom image frombefore on what we need right so likeeverything we can use on the secondbuild if it's available we're going torestore it on disk so we can use it onthis next build um and the next partswe're going to run analyze so we need toknow metadata about the specific layersto decide do I need to make any changesdo I need to actually upload it rightlike should I bust this cache um whatcan I do with that and so in thisexample we're going to be updatingupdating the Node.js runtime and so inthis particular case I need to knowinformation about what the version ofNode is uh that was there before andwhat I'm doing with the new one rightand so in this phase we're just pullingthat configuration from that previousbuild and then we're going to do thebuild we're going to go look at thatthing install that new version of theNode.js runtime uh it's a patch versionso I don't need to up�date uh the nodemodules uh layer at all because it's APIcompatible and so eventually we canexport that after the build and we onlyactually have to upload the specificlayers that have changed so all thethings that haven't changed like thatnew models directory your applicationdirectory uh pieces that don't like wecan keep those same export them updatethat uh all the metadata per the layersand here we go right like we just upthose update those specific things Anduh finally we cache all that stufftogether and uh for that next build sowe can be equally as efficient and so inthis case we're just uploading that onenode engine layer that we uploaded inthe OCI image uh as uh to be there forcaching and now David's going to walk usthrough how we can apply kind of allthat stuff into Wom thanks so muchTerrence and uh I got to say uh Heroku'salways been an inspiration for me as faras uh dev dev experience goes and Ithink we have such a huge opportunity inthe Wom space to offer a great devexperience and we can start with thetools that we have available to us inthe CNCF we have some great toolsavailable and we can help each other tobe more successfulso Mary is super excited about thatoneliner right there because we took allof that knowledge that Terrence justdropped on us and we applied it to webassembly we went through and we builtout those build packs there wom toolsgoes and installs all those gnarly toolsthat you saw in that recipe uh WMT timeengine brings us that WMT time runtimethat we need to run that uh web assemblycomponent uh then we piggybacked on alittle bit of uh Heroku build pack loveuh bringing in some Noode.js uh we alsotried that with Go we had a little bitof trouble so we uh backed that one outand just wrote a Go build pack which isnot any problem with uh the the Herokubuild pack more just the nature of ourproject then we have the wack analyzerand the wack analyzer uh goes throughand takes a look at your project tounderstand if it's a monor repo filledwith multiple uh components uh the womnode.js uh understands how to uh dealwith a monor repo or monor repo with anode project a little bit further downthe tree um and then same thing withgolang almost the same reason and thenwe have the whack composer the wackcomposer is kind of interesting so itgoes through the monor repo figures outlike what projects are where and theninforms later during the build phaseduring uh during the build phase goesthrough and says all right cool I'mgoing to go find all the JS projectsbuild those collect the components fromthem do that same with go collect thecomponents from them put them in a cachelayer and then we have the wom finalizerand it pull it's kind of the rug thatpulls all the room together right so uhthe wom finalizer gives us uh a a cleanimage at the again because all weactually need we don't need any sourcecode normally what would happen in thelast step is that source code stays inthe working directory we just take theweb assembly components or maybe evenjust that one component that we'vecomposed together and that is what wekeep in the final build so we end upwith a container image that contains WMtime and your optimized final runtimecomponent and that leads us to Marybeing very happy uh what does ourcomponent structure look like so uh thisis actually relatively simple right butI wanted to illustrate the ability touse uh web assembly components to mimicwhat you would normally do across anetwork boundary so normally in your webassembly uh your your micros serviceinfrastructure you're going to have anetwork boundary between your servicesright you're going to have your gRPCendpoints you're going to have to youknow write the code for those make surethey're insulated and they have to beable to talk together they have to beable to address each other uh they haveto be a load balance you got a lot ofdifficulties there so what if we couldtake microservices smash them togetherso that's actually what we're doing hereso we have a JavaScript front end thatexpo exposes a few HTTP endpoints andthen we have a Golang backend servicethat talks to� uh OpenAI and gives ussome chat completion and then you knowdoes the really really stupid simplething of adding numberstogether all right so here are witworlds um you'll see example.service andexample.server example.server is ourHTTP front end um the world exports aincoming HTTP handler but it importsadder chat and the domain types that aredescribed um on the other hand we haveour backend service example service andit includes uh WY CLI which is astandardized interface for running CLIapplications exports a CLI runtime uh orCLI run interface which is a run commandfor the CLI uh those ignore those twobecause they were actually really justused for debugging uh to allow me totake that component and rather thanactually compose it together with theJavaScript run it independently as a CLIapplication so I could test itindependently from the compositionproduct uh finally you'll see theinclude of the domain service that meansI'm going to pull in all those things inthat uh whit below and then whateverthat whip below defines for exampleworld service which exports adder andchat that means I'm going to then exportthose things i kind of inherit those umand then I'm also going to import anoutgoing HTTP handler because I'm goingto interact with the open API library oropen API uh interface HTTPinterface all right so here is the demoand uh this is actually an example ofthese build packs running on this monoreposooh goodness gracious sorryoh oh gosh okayhere we gouh sorry technical difficulties folksall right I think we're good now okayyou'll see here's pack build wom composethe builder that we've built and thepath in the monor repo you'll get a linkto the repo at the end you all can goand play with the code um and you'll seethat we kick this off and as Terrencewas saying before it's going through theanalyze and detectingphases um I actually sped it up just alittle bit um and this is actually thesecond build so it's it's runrestore you can see it's building uh thenode running the WAC analyzer eventuallygetting to the Go buildso while this was building we ran in raninto some technical difficulties that Iam going to be delighted to share andthen also I'm going to expect a littlebit out of you all uh because we'regoing to have some work to do as acommunity and I'd love to get peopleexcited about doing that oh my goodnesswe got an image okay fantastic now we'regoing to run that we got our open APIkey in theenvironment and uh if I could typefaster and better it would be great butyou didn't speed that part up no sadlynot um but it will be quick I promise uhso we're just going to say hello we havethe hello endpoint silly simple um let'sadd a couple numbers togetheruh you know honestly this these are likethe examples that you see in the webassembly community we need such betterexamples than just like add two numberstogether and you know uh give meFibonacci uh so here we're actuallygoing to uh use the open API endpoint uhfor chat completion and as you can seethat's the prompt it actually goes outto OpenAI makes a call translates thatback through Go into JavaScript and youget the capital of England is London umand there you can see the actualcodebase um again that's out on GitHubyou're welcome to go take a look at ituh we have lots of work to do still onitbut we've come pretty far it's actuallyreallyexciting ohno get back there we gowait justforwardyes Mary Berry is excitedtoo that was a little bit less That wasa little antilimatic compared to Wellyou know slow typing that's how we gotthereall right so what do we learn about thisuh we learned that you know buildpacks even though they're old and madeby you know wonderful people like thisthey are extremely valuable they areviable and they can be applied to newideas new technologies and they'refantasticum what we did learn that wasn't sogreatuh monor repos and structure forbuilding web assembly uh componentsbased in uh you know multiple componentsin one repository possibly of differentlanguages we just don't have enoughmetadata to declaratively build these weneed to be able to inform uh some higherlevel builder all the data that we needto build that thing so for example thethe whack compose takes in names of theuh worlds that you want to build with uhthat that metadata just doesn't existyou have to put it in by hand there's noway to infer it based on the structureof the repo however we could think aboutopinionated ways to structure repos tomake this easy we don't have to have thehard time um we output an OCIimage well that includes the webassembly runtime there are projects inthe CNCF like runwazi uh which is a uhway to embed provide a runtime classthat containerd knows about inkubernetes that can run components so itembeds wom time or another web assemblyruntime so that you don't have to embedthe web assembly runtime into the imageitself now those components that getbaked those don't come out as WebAssembly images or uh OCI images thosecome out as OCI artifacts uh they aren'tactually they're runnable artifactsthey're just not OCI images they theydon't have all the stuff in therethey're missing the runtime that'sactually really good because thatruntime is about 50 megs that webassembly component included both the Goruntime and the JavaScript runtime andwas 14 megs now imagine if you built itwith like Rust or something that doesn'thave to include the runtime you'retalking about two megs two megs for yourOCI artifact think about how how littlenetwork bandwidth you have to usepulling those down you know that it'sincredible it's it's a huge differenceum monor repos and build packs now theexisting build packs that that I wasusing and and there may be others thatthat do this better uh but mono reposwere a little bit tough so maybe Ineeded uh node well node has to see apackage.json and a package.lock in thethe directory that it's run in and if itdoesn't see that then it bails and itsays "Hey that's that's not a JavaScriptproject." Well in fact I needed yourtooling but it's somewhere further downthe chain uh there might be things wecan do to help improve that experienceuh maybe it's just different build packsmaybe it's uh being able to rerun buildpacks on slightly different ways um andthen we have uh library support so wehave the opportunity to uh the OpenAIfolks they were so excited about hearingabout using their library in WOM theywanted to go and implement that so theyput out a branch that worked for theirlibrary um building in Tiny Go which itdidn't work in originally i had to writeall the code to go out and talk to HTTPokay that's fair well what if we wrappeda component interface around thatlibrary it could be used by any languageanywhere that could run web assemblythat means you don't have to writelibraries in 10 different languages youwrite it once you use it anywhere againthat promise that weheard okay so the future we have thatopportunity to expand from justJavaScript and Go and do it for all thelanguages and have a build packexperience for a oneliner that would beabsolutely awesome and I think it wouldhelp a lot of people oci artifacts wehave an opportunity to take these uhcomponents put them in an OCI artifactsuh enhanced build packs to have thatexporter and that's a just didn't havethe time for that yeah it's coming yeahum enhancing the model repo story forboth Wom and Built Backs and then uhlet's imagine what it would be like ifwe started to build that library supportit starts withone and when you see that one and youuse that one it calls you to make asecond it will start like wildfire itwill go around it will make folks somuch moreproductive and you don't have to writethe library 10 times folks or x numberof times whatever number of times yourcustomers use those languages it can bebetter and we can do it together as acommunity that's the point of why we'rehere that's all we had thanks everyoneuh you can check out the repo uh righthere that we've been working in for thelast month or two and uh yeah I don'tknow if we have time for questions butuh we'll be around sorry we were flyingthrough those as fast as we could uhthank you all for being such a greataudience you all are awesome2025-04-15 22:00:18.483915 55��*�X#��ADu0mPGFd7Fchello everyonehow are we doingall right thank you for coming tobuilding web assembly or like it's 2011we're so happy to have you all here uhmy name is David Justice and I am joinedhere hi I'm Terrence uh so I am a uhCNCF uh WAM working group co-chair uhand also a uh employee at Microsoft uhTerrence who are you uh I work at Herokui'm an architect over there and I got toco-create build packs two times sothat's been fun that's awesome i'm sohappy to be on stage with youyeah so uh kind of going through theagenda real quick in case folks get lostwe're going to do a quick intro on Womthe component model uh kind of a quickrefresher take a tour through buildpacks for those who aren't familiar andhow that helps our problems here andthen bring them together and then kindof look where we're going to head in thefuture let's kick it off David awesomethank you so much Terrence all right howmany in here have built a web assemblymoduleyeah how many in here have built a webassembly componentawesome how many people have fused twocomponents togetheroh my goodness there's more than Iexpected that's actually pretty amazingokay so we're going to do a little bitof that today i'm going to give a littlebackground first so this might be alittle bit of a refresher for folks thatknow it but hey let's jump into it okaycool so what is Web Assembly webAssembly is a application binary formatthat provides us portability sandboxingand efficiency and performance it's alsoreally nice because it kind of allows usto do it from many different languagesso uh if we had Go JavaScript.NETnet Java even we can take all those andhit that intermediary binary lang uhformat and then be able to execute it onthe server on the edge uh like anywhereuh in your browser uh this is reallycool this is kind of the promise of uhrun once uh uh write once run anywherebut it's really low levelso it's so low level that we need tocreate some kind of a wrapper around itbecause that low-level interface isreally just integers so we got pointerswe got pointers at the beginning of likea string and the length and that'sthat's pretty much how you represent allhigh level types so we introduced thisidea of this component model so you wantto think of the component model kind oflike uh your proto files who out herelikes to write the binary serializationfor their proto endpoint or their gRPCendpoints anybodycome on isn't that fun it's not is itnot actually dangerous and terriblytedious to write a binary uhserialization format yeah it's horribleum and to do that over and over againfor every single endpoint it'd be awfulso uh the idea is uh this this componentmodel provides us this outer wrapperaround core W modules that handles thisfor us it's kind of like our protowrapper uh let's go next one this isactually what it looks like so this isactually like the most readable versionof a protoile I think I've ever seen itactually looks like a language thatyou're describing interface in so thisis called wit this is the web assemblyinterface types and this is theequivalent of proto except for webassembly so we can describe highlestructures we can describe uh records wecan describe resources we can describefunctions we can describe worlds and infact it's actually worlds like worlds isa key term uh so we'll get dig into thata little bit but this is how we describeuh how we put two components togetherand what interface they expect from eachother it's really kind offun all right we would not be right inBritain without Mary Berry so if anybodyout there is a big fan of the GreatBritish Baking Show I know I am um we'regonna go make a layer cake today and ifyou look at it Mary Berry makes thisstuff look super simple and our layercake is kind of look like the thing onthe right hand side uh yeah uh your leftmyright um so you can see in the center ofit we have our JavaScript language uhthe code that we've written inJavaScript and below it we have thislike little block called Starling Monkeystarling monkey is a derivation of uh uhspider mon��write andmaintain a docker file because writing adocker file is easy writing a Dockerfile which is production grade is notthat easy so let's give it a shot i havea Java application as an example but asyou can see it works with differenttypes of languages uh you can use the pCLI directly that comes with the projectbut lots of different uh frameworks alsohave build packs built in uh for examplein Java uh we're going to see examplesin Springwood and Quarkus they both havebuild pack support out of the box sowhat I can dois directly from uh my application I cancontainerize it in this case I'm runningthis boot build image task and usingbuildox now I obtain a production gradecontainer image that I can just deployto production but first I want to run itlocally so I have my image I have podmanh running on my local environment so Ican run my container image let's do thati cansay podman run and give it a port 8080and then the container image that I justbuilt which is calledbuildpax version1000 and now it's running we can eventestit i can send a HTTP request to the rootendpoint london baby all right it'sworking what's next we want toeventually deploy it to Kubernetes nowone way of doing it is provided byPodman desktop directly because withPodman I can uh spin up a cluster veryeasily if I go here in the settingsbesides running Podman I can actually uhcreate a cluster using mini cube or kindin this case I use kind to create acluster it's very easy you can do itdirectly from the UI uh you choose aname for your cluster it even uhconfigures an ingress controller out ofthe box that means that whateverapplication you're running in thecluster is also exposed to your localenvironment so you don't have to dealwith port forwarding and understand allthe uh low-level details of howKubernetes networking works so you get areally nice out of the box experience asan application developer and once I havethat I could go ahead and define myKubernetes manifest and all of that butonce I have a container uh Podman has avery nice feature so I have my containerimage down here what I can do is say uhdeploy to Kubernetes so directly fromhere I get all the manifestautogenerated for me let's say demobuild packs i can provide differentparameters but all the defaults are finewith me and then I can just say deployand now I have my application deployedas a pod in Kubernetes i didn't have towrite any YAML file and I get also anice Kubernetes dashboard directly fromhere where I can monitor all theresources and we should probably findour demo build packod in here and it'srunning now this is really great it'squite convenient but if we have to dothis every day as part of our dailydevelopment workflow uh things might geta bit uh too slow we really want toachieve a fast feedback loop right sothere must be some better way of doingthat yeah so we saw how you can deploy apod to a local Kubernetes cluster but ofcourse um deploying a pod is notnecessarily how you're going to go toproduction right you usually are goingto need to deploy uh our good old YAMLright at the very minim minimum you'regoing to need to deploy uh to create adeployment so you can see here kind ofan example of a deployment and a serviceso if you've already worked withKubernetes you you're probably familiarwith these concepts but this is kind ofthe very minimum that you need uh fordeployment to Kubernetes and then withthat it's uh just writing some YAML soit's easy right i mean all our all ourdevelopers they love their YAML uh notreally and then but with the deploymentin the service you say "Okay we're goingto go to Kubernetes but it's not readyfor production." And that's the key fordevelopers to to keep in mind it's notjust about deploying an application in acontainer to Kubernetes there's morethat we need to think about so here's anexample of an application that wasdeployed in a container and uh it's uhyou know it it starts up and it has adependency on Kafka cluster hasdependency on a database and you can seehere that it actually takes 2.6 6seconds in this case to establish theconnection with the datab�ase and withthe Kafka cluster so in those 2.6seconds we're actually having a downtimebecause Kubernetes sees that thecontainer has started up and it saysokay cool we'll send some traffic to theto the container but of course theapplication itself is not ready yet soof course what we need to think about ishow to tell Kubernetes that we actuallyare ready to receive traffic so in theKubernetes world you have these probesyou have readiness probes you havelivveness uh probes and startup probesto tell Kubernetes that the applicationhas started up that it's ready toreceive traffic and that it's also stillalive right so even after it's startedup that if something goes wrong uh maybethe pod loses the connection with thedatabase or something that well at thatpoint we're not really alive anymore sothis is something that developers needto keep in mind because we need toprovide this kind of information to ourplatform engineer to our platform teamto say these are the endpoints that youshould call from Kubernetes to knowwhether my application is healthy toreceive traffic so of course again alittle bit of YAML then uh we probablywant to integrate some sort ofenvironment variables maybe connectionstrings or database usernames andpasswords so that's something that youtypically pass in via things like configmaps and secrets um whether you want tojust use secrets or a vault or somethingthat's kind of uh you know irrelevantyou do need to at least be aware as adeveloper where are those values goingto come from and how should I integratethose into my application as well andthen very kind of uh contentious issuewith a lot of developers that I speak tois that they need to be aware of the resresource usage of their applications sothey need to know kind of let's say atleast more or less how how much memorywe're going to use how much CPU we'regoing to use because that allows theplatform team to determine like how muchspace do we need to reserve for theseapplications in our Kubernetes clusterso you can provide requests to make surethat hey I need at least this amount ofCPU and memory and then potentiallylimits depends on who you ask uh whetheryou should implement limits but limitsbasically tell the Kubernetes cluster ifthe application goes beyond this uhusage of memory then kill the pod startit back up or if it uses more CPU thenstart throttling it right so becausethere are other applications running onthis cluster we need to make sure thatit's fair for everyone so again moreYAMLUm so developers they're not Kubernetesexperts they're not going to knowexactly how to write all this YAML sohow can we help them so in the Javaworld there's uh there's some solutionsin the general developer developer uhspace as well um I'm going to talkspecifically about uh a new Java stackthat's uh that's Quarkus where you canactually as a developer uh define andI'm sorry if it's a little bit small inthe back but you can define in yourapplication properties something thatJava developers are pretty familiar withuh what kind of uh connections you'regoing to use what kind of secrets youwant to implement uh what kind of uhcontainer image you want to build whatkind of uh resource usage and whatevermore so you can define that in yourapplication properties and then Quarkuscan actually translate that into aKubernetes YAML so let's look at this ina in a little demo uh that I recorded sothis is a Java application and I'm goingto add a dependency Kubernetes and aKubernetes config extension to myQuarkus application and you'll see thatin the target folder you a newKubernetes folder is created with aKubernetes YAML file and you can see ithas a service account it has a rolebinding it has uh to to allow access tosecrets it has our deployment and it hasour service so it generates all thisYAML kind of for the developers so theydon't need to do all that and then uhyou can see here in the applicationproperties that's where we actually havethose uh configurations uh defined so wetranslate that to cap uh to Kubernetesuh a YAML file that we can then deployso here you can see that for example �wealso configured you know our memoryusage and our CPUs and uh we have thatthen uh the health endpoints we can alsoadd those so you add another extensionfor health and it automaticallyconfigures health endpoints for you itknows that you have a databasedependency a Kafka dependency so it'sgoing to add all those health points andmake sure that uh you have a livvenesscheck for your database for your Kafkaand so on and then going back to yourKubernetes YAML you can now that thoselivveness and end uh those livveness andreadiness probes and everything are alsoadded back to that Kubernetes YAML fileso it automatically kind of goes backand forth and make sure it's all addedthere so this is not your grandma's Javaright this is pretty cool stuff um andit also starts up much faster it's muchmore smaller so this is really kind ofcloudnative development here you can seeyou can use Quirkus to do the image pushand also build so it's going to buildthe application build a container imagepush it to a registry and then now wecan use Quark is deployed to actuallyapply that Kubernetes YAML to a clusterand so in a moment we're going to seethat it's uh generating some stuff nowit's deploying all those different uhcomponents so service accounts uh RObindings and whatever more and then it'sdeployed and we can check that uh reallythere's the resource limits configuredthe health endpoints and everything soyou know in terms of developerexperience this is what we're lookingfor right this is Kubernetes nativedevelopment for Java developers um sonow what about serverless because thisis an interesting concept for developersno more servers right at least not forthe developers to necessarily worryabout so the promise of serverless isthat you as a developer don't need toworry about you know like how it's goingto be deployed uh and how moreimportantly it's also going to bescaling so with serverless you kind ofget autoscaling out of the box so in theCNCF space on Kubernetes there's aproject K native that does exactly thiskind of uh promise right so it allowsyou to use serverless on Kubernetes so Irecorded another little demo and thistime because we talk a lot about uh Javamaybe we should branch out so in thiscase uh I'll create uh a go applicationso here you can see with Kubernetes youcan do K and funk create a name for thefun for the function and then uhlanguage go and that creates a functionyou can use K native with containers aswell but in this case we're carrying afunction then function build and that'sactually going to build uh our um oursource code that it's scaffolded into acontainer image um and then uh push itto or eventually push it to a registryas well so here we can see that thefunction was built you can see thesource code that was scaffolded so youas a as a Go developer can now kind ofuh change the function to however youwant so if you're a Go developer thisshould look somewhat uh familiar to youso add your uh add your custom code hereand then uh you know you can build itand then deploy it so K and funk deployis going to actually deploy uh push youruh source code in a container image pushit to a registry and then deploy that asa serverless function so in this caseyou can see I had to put my uhcredentials to push it to a container uhregistry and you can see now it'sdeployed on uh on Kubernetes in thiscase open shift and our application isup and running so we can invoke it soyou can do that with K native as well soyou can do K native funk uh invoke toactually invoke uh the function and youcan see it gets uh we get a niceresponse or we can just use uh the uhendpoints that was uh exposed as well soK native also takes care of the ingressso we don't need to worry about that andyou can see it responded and we have uhin in the open shift case it alsocreated a route and a pod which is nowrunning um now if we wait a minute wewe're going to see that because we'renot calling the function anymore it'sgoing to notice that no more traffic isgoing to the pods we're going to scaleit and now there's no more pods runningwe're going to wait for traffic to come�in and so now I'm going to call thefunction again and then uh we're goingto see that it wakes back up scales up apod or multiple pods in case there'smore traffic and so we have serverlesskind of out of the box with just a fewcommands um and this is all running inKubernetes with uh CNCF projects yeahand it's really nice because they're allpoly tools so under the hood KenyFunctions uses build packs to buildimages that works across differentlanguages kennedy functions itselfsupport spinning up these out of the boxfunctions based on different languagesthat's really great and seamlessexperience but so far we've been workingwith a mostly with a single applicationand when we start integrating youmentioned Kafka earlier or a databasethings get more complicated because wecan get it to work having a singleapplication containerized and deployedto Kubernetes that's fine but when weare in our development environment howcan we consume all these different uhapplication dependencies it could be adatabase it could be uh an identitymanagement service we have differentoptions of course one option is goingthis route where we have a Kubernetescluster either locally or in the cloudand we hook up our developmentenvironment on it but do we really needKubernetes for localdevelopment now it might seem weirdsince we're talking about Kubernetesnative development without Kubernetesbut let's take a step back so acrossdifferent languages uh maybe you'reusing Java you're using Python you'reusing Node.js JS we actually been havingvery good developer experiences for awhile for example uh in Node.js I cansay yarn dev and uh my application isboth compiled is up and running andevery time I make a change is reloadedautomatically the same happens with Javaor with Python so maybe we can reusesome of that known developmentexperience instead of inventingsomething completely different but westill have the problem of how do weprovide the these dependencies how aboutdatabases we have different tools alsothat we can use in the cloud nativeecosystem one of them is test containerstest containers uh lets you run in aprogrammatic way lightweight containersas part of the application workflow andwe'll see in a moment a demo uh we'llalso have a look at Micros micros isalso a CNCF project and whenever youhave to integrate and test againstdifferent APIs it could be HTTP APIs itcould be a synchronous APIs messaging uhit's really useful so let's try now toestablish a development experiencewithout Kubernetes for the localenvironment let's start introducing testcontainers to the mix as I said testcontainers is a polylo tool so maybe inJava if you want to integrate withrabbit MQ there's a way to uh declarethat dependency or maybe you're buildinga Python application and you want tointegrate with keylo keylo is a CNCFproject for identity and accessmanagement you can do it using testcontainers how about go same thing maybeyou want to integrate with a PostgreSQLdatabase you can define itprogrammatically this is a nicealternative to a compos file and asadditional bonus you can use this notonly for testing but also fordevelopment so let's uh have a look athow it could work i have uh anapplicationnow that a Javaapplication let me zoom ini run it as I would usually do in thecontext this is a spring bootapplication so I'm used to run it withboot run and I'm not changing that i'llkeep running the same command from thedifferent folder right there but nowwhat's happening under the hood that myapplication is starting but not just myapplication because this application isactually using test containers toprovision an OLAM service now Olama is away that you can use to run largelanguage models locally so there's lotsof talks about AI at this conference sowhat I'm doing here is actuallyintegrating my Java application with theAI models and I can show you if I go inthe container section here we can seethat there's an O lama container that'suh provisioned by test containers sowhen I start the application I don'tjust get my application up and running ialso get lama so if I now go and testtheapplic�ation let's uh ask a question itcould beum what's the capital of England becauseI'm really creative with questions uhwhat's that uh oh yeah it's likethis and now I get an answer this answeris parsed by a model that is running inlama and I can use this approach withany kind of dependency both at thedevelopment time and the at testing timethis is really great because it means Idon't need a cluster i don't uh becausehaving a cluster locally means also thatyou're responsible for it so it might bereally interesting here it says uhLondon was established in 1962 that's uhokay interesting okay this is great sosomething is really wrong here right andwhen we use large language models thingsget even more wrong so I want to arguehere because we talk a lot aboutobservability whenever we go inproduction i would like to say thatobservability is not just for productionusing something like open telemetryshould belong to a developmentenvironment as well now for largelanguage models it's extremely importantto have some kind of observability on alocal environment but in general normalfeature development I really would liketo visualize whatever I'm coding so whydo I have to wait until it's inproduction to get a nice dashboard ormaybe a nice trace so I can see therequest flowing through the system iwant open telemetry full open telemetrysystem running on a local environmenthow can I do that of course I can have aKubernetes platform running locally butwe just talked about test containers soI can actually use task containers alsofor establishing an observabilityplatform and that's what I'm doing hereyeah and so for developers this isreally nice right because they don'tneed to deal with you know how to set upthis entire stack on a Kubernetes andmanage all these different components soit's really nice that you can just it'samazing I have defined here in Javathese two dependency one for lama andone for open telemetry and I got lama asyou saw but let's investigate why we geta probably weird answer from the modelso what is running under the hood i havegraphana up and running you can see it'srunning locally through test containersi didn't have to configure anything thisis zero configuration zero code it'sjust out of the box when I run theapplication so we can probably go andcheck uh the trace now for theapplication and let's zoomin so I sent a request to the chatendpoint that eventually called a verytiny model this is a model from huggingface it's called small llm2 or whateveris pronounced so it's really small soit's probably very prone tohallucination based on the task it'simportant to uh try out and evaluatedifferent models based on the task andif we don't have this kind ofobservability it's really hard to dothat right here we can see exactly whatis going on under the hood and improveour application and again if you have adatabase you have Kafka you can followall the traces locally without any extraconfiguration allright how about other APIs thoughbecause we've been talking aboutthirdparty services but we also haveother APIs we want to integrate withyeah absolutely so especially when we'rerunning on uh Kubernetes we're we'llmost likely have distributed systemswhere they need to interact right so asdevelopers again does that mean that weneed to figure out uh an entireKubernetes cluster and set up all thedifferent components to try to test ourapplication um that's that's uh that's alot of work right so in the CNCF spaceis a new uh well a relatively newproject uh which is called Micros and itallows you to uh work with APIs uh mockthem and then run them potentially alsoas uh as test containers in your localdevelopment flow but you can alsointegrate this project in your uhdeployment uh flow in your CI/CD flow tomock some uh some interesting data totest uh service individually and then uhautomate your entire cycle based onthose the same sets of data that youwere testing on your local machine nowthis is a really interesting project italso has an operator that you can deployon Kubernetes so it kind of spans theentire kind of uh life cycle fromdevelopment to actually you know almostto production because of course onproduction you you don't want to bemocking your data but it helps you uhkind of uh simplify this whole uh thiswhole life cycle this supports not justAPIs it also supports uh for exampleKafka and messaging and all that so it'suh in different protocols so it's an aninteresting project um you can of courseuse the CLI to automate everythingthere's uh different APIs or you can usethe the nice uh user interface as wellso definitely a project that you want tocheck out yeah and that's a greatcombination so we have a developmentworkflow now where we can work onmicroservices in isolation thanks totest containers and micro but how aboutproduction so the moment we reachproduction we want to bind ourapplications with the actual servicesand there's a nice API from theKubernetes project itself it's calledservice binding API and that is reallygreat in order to separate the concernsbecause as an application developer Ideclare that my application requires forexample a posgrql database and an Lamainference service for large languagemodels but then the platform itself isresponsible for fulfilling that requestso we have this very nice separation ofconcerns now as a developer I mightwrite these few lines of YAML code inorder to declare my intent but we wouldalso like to achieve a YAML approachright so if you have uh some kind of uhum developer portal maybe in yourcompany on top of uh your platform uhyou might even more simplifying this uhuh procedure for example here I have abackstage instance uh I can uh choose acertain uh golden path here for exampleI want to build an AI application and Ihave the possibility to choose differentuh large language model providers maybeI want to choose Mistral AI that's agreat provider because all their modelsare uh open sourced Apache 2 licensed soI can both use the cloud service butalso run them locally in O Lama uh fordevelopment and testing purposes uh or Ican choose Lama directly i can choosethe database and now what happens when Iuh bootstrap this project is that forthe local environment I get all the testcontainers and micro micro configurationbut then when it's ultimately deployedto kubernetes then I get all the servicebindings for the actual services so onceagain as a developer I don't have to uhdeal with config map secrets and maybeordering via ticket get to the platformteam like that I won't provision a newllama service right exactly rightokay so we have these two main uh umways of going right we have the uh localdevelopment workflow based on Kuberneteswe saw also an alternative withoutKubernetes but ultimately it's all aboutdeveloperexperience yeah so uh to kind ofconclude where we where we're at withKubernetes native development what weneed is fast feedback loops to make surethat everything is working the way thatwe're supposed to that we can get thatsame kind of experience that we're doingon our local development that thatpretty much matches with what uh isgoing to happen on the Kubernetescluster that we can try to reduce thecognitive load of you know this wholekind of uh landscape of Kubernetes andall the tools around that that fordevelopers who like to focus on you knowtheir actual development task that theydon't get overwhelmed with everythingaround that and then uh finally uh wewant sorry we're both hitting the thingat the same and of course we wantdeveloper joy right we want to make surethat developers are happy uh focusing onwhat makes sense for the organizationbecause if developers are happy they'regoing to create really cool solutionsthat are robust and that are going tobring uh you know uh good value toeverybody um so I think that's it rightyeah thank you very much for joining usand if you like uh our talk please sharethe feedback if you have any um inputsfor us we would greatly appreciate tohear from you you can find the sourcecode also i'm having anotherpresentation with my friend Mauricioalso about developer experience so feelfree to join there and thank you verymuch for joining yeah thank you2025-04-15 22:00:19.236032 ��Y#��AA07RnkzSc6Jgi think we can get started so uh welcometo this session we're going to talkabout uh Kubernetes Kubernetes native uhdevelopment and um I don't know if youyou're familiar with this term um it'skind of I mean it's still softwaredevelopment right but our traditionalway of development has typically ortraditionally been like yeah I'm goingto develop my code and it works great onmy local machine package it up maybethrow it over the wall and somebody'sgoing to deploy it on these uh kind ofbig application servers especially ifyou're like us and you're Javadevelopers and uh you know use as manyresources as you want it doesn't matteryou know we have these big serverskubernetes native development is kind ofdifferent right you need to think aboutuh the way that your applications aregoing to live on Kubernetes in adifferent way because on these clusterswhat you want is applications that startup fast because they can be kind ofrescheduled or maybe scale up orwhatever excuse me and um you want tomake sure that the resource usage isgoing to be lower right because thesmaller your application is the moreyou're going to be able to make the mostof these clusters uh in terms of densityand you know out of your money as wellright so that's kind of what we need tothink about as application developersand that's what we're going to try tofocus on a little bit in this session umyou know some tools that we can use inthe CNCF landscape and a little bit outof that in the open source space to helpapplication developers work withKubernetes without being Kubernetesexperts so uh my name is Kevin Duboisi'm a developer advocate at uh at RedHat um I've been uh writing software forthe last 20 years um so I went from thatmore traditional world to thecloudnative world it's been uh it's beenan interesting journey and with me hereis uh is Thomas yeah hi everyone i'mThomas Vital i work at systematicsoftware company passionate aboutanything cloud native and Java relatedwrote a book about it uh a couple ofyears ago and now I'm working with myfriend Mauricio Salatino on a new bookcalled Developer Experience onKubernetes we're really excited we justannounced it yesterday uh so the firstfew chapters are available out there butlet's get started so Kubernetes nativedevelopment at least as a minimum wewant to work with containers somehow sothe first ingredient is we need acontainer runtime where to run ourcontainers on our local uh machine soPodman desktop is the tool we're goingto use today it's a great tool it's opensource it's been donated by Redhead as aCNCF sandbox project so the donationprocess is ongoing um it's really greatwe can run and manage containers uh indifferent ways as we'll see uh duringthe presentation uh we have anapplication we have a container runtimewhat do we do next we have to package itas a container we might go directly andwrite a docker file but in the cloudnative ecosystem we have a great toolit's called cloud native build packs itworks in this way you don't have towrite any docker file you take thesource code and doesn't matter what typeof source code you can use your favoriteprogramming language and frameworkbecause there's very wide support acrossdifferent languages and then using buildpacks for example the pac cli that comeswith the project you get the productionready image and this image will beproduction ready it will be designed andbuilt in a way that is uh performantthat is secure so as a developer I get areally nice experience because I focuson my application source code and when Ihave to build a container I just run onecommand i don't have to ��a fullstack developer and I care about beingproductive i want to be able to buildinnovative applications quickly and Idon't really care necessarily what theunderlying infrastructure is that I'musing i don't want to be rewritingcommon patterns i don't want to beimplementing complex resiliency logic idon't want to be rewriting statefulprocesses and workflows and actors whenthose things already exist right i wantto be enabled to be able to write codethat actually provides value to thebusiness in which I am in um so yeah andthen we have yeah and I'm a platformengineer and I have a mandate with myteam to build a platform an internaldeveloper platform for my company rightthe goal is to collaborate with cloudengineer network engineer securityengineer with observability team and thegoal is to make sure we have standard weare standardizing template we arestandardizing governance security and weare going to have more automation thegoal for this is better collaboration aswell as enabling the developer to focuson their productivity they want to shipvalue to the business to the end userhow we could help them and avoid what wecall the quitive load how we couldreduce the friction for them to adoptthe platform to use the platform alsowe're going to be the first people tosay developer productivity 300 times soget ready for it that's what we're allhere for right um so yeah and yeah um asplatform engineer um again we have uh wewere looking for and we are looking forsome framework and the CNCF platformworking group have uh has um uh releasedcouple of white paper and framework thatyou could use and the first one on theleft is about what about the buildingblocks of my plat of my platform i'mbuilding an internal developer platformbut there is already existing toolssecurity observability how they connectto each other how the team and theprocesses are interacting with eachother so with that you will also focuson now could we deliver capabilities ontop of that that's what you could callseeing your platform as a product aswell and based on that you could also uhlook at uh some friction some pain pointsome rooms for improvement and that'swhere the maturity model is veryinteresting because you could see whereyou fit and on what you want to focusand today we want to focus on theinterface interfaces is between thedeveloper and the platform right andDapper and score um are very a good fitfor that and we will demonstrate thattodayso the question then becomes and can youall still hear me okay in the back okayperfect i just wanted to make sure soDapper and Score both have ways in whichthey enable developers and platformteams to work together through thisconcept of standard interfaces so if wefocus on the developer uh perspectiveDapper essentially provides a set ofAPIs and SDKs that developers can use toconsume the underlying capabilities ofthe platform in the same way developerscan author score specifications toessentially um claim and or tell the theplatform what it needs what dependenciesits application has without having tounderstand how that's actually going tobe implemented on the back end so whenwe bring the platform capabilities intothe picture there are also interfacesthat these uh projects provide that arefocused more on standardization at theplatform level so how do how do I as aplatform engineer actually expose thecapabilities that are then consumed byum the developerf facing interfaces andthat's where you get into things likeDapper components the control planethings like that and the scoreimplementations um and the these mightbe abstract concepts if you don't knowwhat Dapper and Score are but luckilythat's what you're here for today iswe're going to dive into these umprimitives and explain how they all fittogether so who's excited we're about toget into some code we love that yay okaygreat so um I just wanted to give aquick Dapper overview if you're new toDapper how many of you have heard of theDapper project before okay wow love thatthat's great uh so Dapper is actually agraduated CNCF project which is superexciting we graduated right aro�und thetime of CubeCon uh over in Utah lastyear and ultimately what Dapper providesand how it runs is it's a sidecar modelso it's traditionally deployed onto aKubernetes cluster injected intoapplication pods and it essentiallyexposes a set of standard APIs thatdevelopers can use to buildmicroservices applications withouthaving to uh concern themselves withrewriting common patterns and uh andarchitecture styles so this can beeverything from service invocation withMTLS similar to a service mesh um itprovides things like abstracted statemanagement you can do publish subscribemechanisms you can build uh large scalestateful workflows using the the Dappersidecar so that's just what I wanted tocall out right and and all those APIsare essentially that uh developerinterface and then the underlyingplatform teams can actually configurethose APIs to reach out to variousinfrastructure services and so that'sreally uh more on that platformcapabilitylayer so if let's say we'restandardizing on Dapper let's just talkfrom a developer perspective so I'mauthoring my applications locally asmost developers do and here we can seeon on the slide a large image ofbasically just like a basic uh Reddusimplementation right i'm using uh theReddus client i'm making a call storinga key value pair and then ultimately I'mgoing to to read that right so we cansee here I'm taking a dependency onReddus maybe not that big of a deal ifit's uh you know you're writing onesimple project but if you think aboutthe number of dependencies that your dedevelopers continue to take on um onexplicit infrastructure services andthis could also be things like librariesfor resiliency and other things right asyou're distributing your applicationsmore and more concerns uh you have tobuild your application to be faulttolerant so all of this bleeds into theapplication code which once again for asmall application might be okay but thattight coupling is not always veryvaluable um when you're trying tostandardize so let's swap that out sonow we're taking a look at what it lookslike to use the Dapper SDK to accomplisha similar objective so here we can saysee there's no Reddus um there's nospecific infrastructure logic we justsee we're importing the Dapper clientand then we're ultimately saving statenow the unique part of this which Mattwill explain here in just a second isthat you can see I'm using the client tosave state and I'm passing in thisconcept of a state store name um whatdoes that state store actually mean i asas a developer might not really careright i want to store a key value pair idon't really care necessarily wherethat's being stored as long as I knowit's secure and that's really not my jobto have to worry about hopefully rightif we're focused on enabling developersto offload some of the security andinfrastructure concerns to a lower levelright not the reality of today alwaysbut it's something that we're trying toaccomplish right so you want to explainthis a little bit more and that's whereyou could see the very first point whereDapper uh with the notion of Dappercomponent is helping us we want thisstreamline seamless integration betweenhey I'm talking to a state store is itradius radius locally in Azure is it aposgress database so here as platformengineer I have the ability to authorsome recipe some component and thedeveloper won't change anything on theircode they just reach a state storewhatever the technical implementation isawesome so these components are loadedat runtime by the Dapper sidecards thatare running and they can be scoped theycan be governed Um so there's a lot ofadditional capabilities that I'd love totalk to you about after the session sonow that the uh you know I've written mycode I have my components available thisis where I really get into actuallyrunning the code locally right some kindof interloop development and this isgoing to look different for everydeveloper um if you're a print lineperson a debugger person we can talkabout that later this is something we'vebeen talking about this week on my teamuh but ultimately I'm going to do aDapper i�nit which is essentially goingto install uh Dapper onto my localmachine and it's going to make use ofDocker to actually give me a few controlplane services which are used in orderto facilitate the inter loop so forexample I have an actor placement thatwill run as a container i have aulerwhich is going to help me managestateful workflows when I'm doing localdevelopment and then I also get a couplethings out of the box i get Zipkin fortracing so that I can immediately seeall of the open telemetry traces beingsent across my local services and I alsoget a default Reddus instance which Ican use for a default componentimplementation so I start publishingmessages it can use a default uh dockercontainer on my machine and thenultimately once I've initialized mymachine I'm going to do a Dapper runwhat this Dapper run is going to do isessentially it's going to run all of myapplications locally using theapplication commands and the appropriateports and it's also going to launch asidecar process which acts as the Dapperintermediary layer right so thosesidecars aren't actually runningcontainerized on my local machinethey're just a sidecarprocess awesome so now we're going toget into an actual example of someDapper code i'll be um quick and briefbut hopefully it will be a goodexplanation for you i hope the screen'snot too blurry um but basically it'sgoing to be an order process workflownothing super crazy um basically we'regoing to submit an order that order isgoing to kick off a Dapper workflow umit's going to do a few things it's it'sgoing to reserve inventory using a uh aDapper state store API it's going toessentially determine if the order isover $1,000 and if so it's going to waitfor approval um which is just anotherfunction of the the Dapper workflowimplementation if I give approval or ifit's under $1,000 we're going to processpayment we're going to process shippingand then ultimately depending on theoutcome we're going to complete theworkflow sound good sweetokay I'm gonna make sure this is zoomedin pretty well perfectlet me close these outcan everyone see my screen okay how'sthe How's the text size perfect goodokay so I am not um I'm not bold enoughto do a to do a Dapper initializationright here because I don't trust theWi-Fi so I have already done Dapper initon my machine uh so I just want toclarify this is exactly what I saidwe're going to do a Dapper init um rightnow it's not allowing me to do thatbecause it already exists right so if Ido a Dapper uninstall it will remove allthose control plane components it willbasically remove the runtime from mymachine and then I do a Dapper init andit gives me everything we mentioned sowe can do a quick docker ps and let meclose that out if we do a docker ps andwe basically just script for the thedapper containers what you'll see isexactly what I just mentioned right wesee that reddus instance we see theplacement service we see thescheduleuler just wanted you to knowthat they are there and then ultimatelyfrom there I'm going to do a dapper runso let me confirm yeah let's go aheadand close that out clear thisout so I'm going to do a dapper run andthis dapper run is using the yamlmanifest that's in my repository and I'mvery zoomed zoomed in because I wanty'all to be able to see but I'll justzoom out for one second this is thatDapper Yaml file I was mentioning sothis is going and launching thatworkflow that workflow is coordinatingmultiple calls across other servicesit's calling to a payment service it'scalling to the shipping service andultimately now everything is up andrunning we can see in this output theblue output is my application runtimeessentially logging and then the whiteis all of the Dapper sidecar logsletting me know that the app ID is upand running and the sidecar is ready togoso before I get into a quick code tour Ido want to show you just like a runninga running workflow so the first thingI'm going to do is I'm going to get thecurrent inventory this is using theDapper API uh for states you can seeright now we have no inventory availableand I just want to confirm that to �youin for you in Reddus right so I told youI'm using a local Reddus component it'shooked up on my local machine it'srunning in Docker so I haven't storedany data what I'm going to do I'm goingto make an API call to my application torestock inventory under the scenes thisis using the Dapper API to go save thatkey value pair and then if I refresh myscreen here we should see that we nowhave our inventory items right so like Isaid code doesn't have any Reddus in itwe're going to see that in just a secondbut we now have inventory that we canessentially use as part of this orderprocess so now what I'm going to do isI'm going to submit a simple order whatthis is going to do it's going to call acontroller method that controller methodis essentially going to start a workflowusing the Dapper workflow SDK whichyou'll see in just one moment and I justwanted to kind of show you this so wecould actually get a demonstration ofthe workflow running end to end so letme pull backup so you see here I have a UI in thisservice i have a notification servicethat's subscribing to every activitywithin the workflow so whenever aworkflow processes payment when aworkflow starts so you can see here Ikicked off the workflow and I subscribedand received all of these messages usinguh the Dapper PubSub API so now I wantus to walk a little bit into the codeitself so everything's up and runningstill good to go so let me just switchover here and I'm going to do a quickcode tourso the first thing I want us to do istake a look at the workflow code solet's close all this out create lessnoise for you and then we're going tozoom in so you can see on line 25 herethis is where I'm importing the uhDapper clients so I have my Dapperclient to make pub sub calls to makeservice invocation calls to save stateand then I also import the Dapperworkflow SDK what this is doing by theway when the sidecar comes up is it'sestablishing a gRPC stream from theapplication to the sidecar and there's aconstant gRPC stream that's sending thework items to and from that uh thatsidecar which is hosting the runtime sothe the workflow runtime so after wehave um imported Dapper which is greatwe also have to register our workflow soI'm basically going to start theworkflow runtime i'm going to registermy workflow and all of the activities itshould have access to and then I'm goingto start the workflow runtime and that'swhat's really managing that gRPC streamand making sure that the activities thatneed to be executing as part of thisworkflow areoccurring so after I've done that let'stake a look at the actual workflowdefinition can you all see the screenokay is it kind of zoomed in enoughwhere you can read it perfect uh so wesee we have a process uh process orderworkflow i'm not going to go throughevery step we talked about itconceptually the big thing is thatworkflows don't actually do anycomputation or processing right theyactually out offload that to activitiesso the main idea of the workflow is thatit creates the structure in which thingsare executed and then ultimately theactivities are actually doing theprocessing making external requests soon and so forth so let's just take alook at um two things i want you to lookat two things so the first one is thenotify activity that's what's publishingthe messages you saw on the UI and thenwe have the inventory activity down herewhere we're basically going in andsaying let'sright here line 75 you can see we'rereserving inventory so we're going tolook into those two activities and thenwe will pass it over to Matt to get intouh score goodness so let's take a quicklook at these activities so the firstactivity we're going to look at is thenotify activity i want you to considerhas anybody implemented any kind ofmessage broker or messagestreaming perfect great fun complicatedinvolved yes so if you can see here yesthis is a basic demo but you can see Ican use a Dapper client to publish anevent dapper supports over I believe 2025 different pub sub brokers that canall be communicated with and configuredusing components um behind the scenesbut the code st�ays the same so I justwant to publish a message and I want touse any given number of infrastructureproviders dapper is going to handle thatfor me on the back end all I have to dois publish a message and once again tellit what the name of the component isthat I want to use um after we take alook at this I want to actually show youhere how this is working right so thisworks because we set up this componentso I'm saying I want you to go talk toReddus you don't have to put it directlyin the code i'm using the component hereand then we also have a subscription andwhat you'll notice is in Dapper there'smultiple ways to do subscriptionsdeclarative streaming um uh code firstbut you can see it's referencing thatpub sub component so I'm basicallysaying I want to subscribe my app to thepub sub component called pub and thenlast but not least we can see where thatmessage is getting delivered to the UIso we can see hey I'm receiving the uhthe messages that are being publishedbecause I'm subscribing and I'm going todisplay those to to the enduser um another quick look at Dapperusage is uh basically in this case we'remaking an invocation request so we'reactually going out and invoking asecondary service another micros serviceusing Dapper you get MTLS out of the boxwith this you can set access controllist um so it's really a powerful uh APIso you can see here we're invoking amethod called inventory on the uh orexcuse me on the app called inventory onthe inventory reserve endpoint and thenlast thing I wanted to do is just showyou what that endpoint actually lookslike and you can see here we're usingthe state API to get state um or to savestate so once again very very much astandard API surface when you're usingDapper across your applications soperfect now I'm going to switch backover to these slides here i think we'redoing fairly good on time so let me playfrom current slide so this is great likewe've seen Dapper can be very powerfulit has really um powerful programmingmodels however there are some challengeslike I did Dapper run on my localmachine but how do I get this into uhlike a containerized version right howdo I go deploy this into staging orproduction how do I actually take thisinto Kubernetes right i'm not just goingto go deploy my Dapper run file i needKubernetes manifests and so this becomesa problem that typically developers arenow having to face right running localKubernetes clusters or trying to figureout how to hobble together dependenciesin Docker Compose for testing purposesum and so that's really where uh whereI'm struggling from a developmentperspective is how do I now go frominner loop to the next to the next leveland I think we have an idea in my in myteam so as platform engineer um uh whatwe have is we have now the developerempowered to deliver value to um to thebusiness right now how we graduate howwe promote how we deploy via CI/CDpipeline maybe in developmentenvironment staging production so wewant now to um to help the developer todeliver value concretely at the at thebusiness right and also we want to avoidthe hey it works at my machine how itwill why why it's not working inproduction for example or in dev orstaging so we want to collaborate closerwith uh this dapper component and thisintegration um and again we want to dosome uh recipes to automate more so herethe goal is to say yeah you want to goin production our platformis hosted on Kubernetes so you will needto have a docker file maybe uh then youwill have the um um the CI/CD pipelinemaybe you want to do some docker commandto run the container to test thecontainer to deploy it then you willhave the maybe a compost file in orderto integrate some component and databaseand pubsub uh and the differentcomponent talking to each other and thenyou may want to alter some Kubernetesmanifest how they could provideinformation to in order to deploy thatin a Kubernetes uh cluster so hereenters score score has two concept thefirst one is the score specificationthat's where the developer will focus ondescribing the intent to deploy theirworkload or their workloads righ�t andthey will describe hey I need adependency i need a state store pleaseright and I will show you an examplevery quickly the second aspect of scoreis the concrete implementation from thisvery um um workload specification filewhich is agnostic to the platformagnostic to the environment developmentstaging or prod now how I concretelyimplement how I deploy this request umso here come the score implementationsuh some of them could be docker composeum as a target runtime kubernetes fly.iohuman and others and I have also theopportunity as platform engineer toauthorrecipes for posgress uh SQL databaseradius name it is it local ondocker is it local in kubernetes is itin Azure in GCP i will be able to authorthis kind of recipes and having this uhnotion of standardization on the rightbut also the abstraction on the leftso if I'm a developer and like I said Ihave this challenge i want to be able toget my workload somewhere where it hasthe dependencies it needs but I need tograduate from my local developmentmachine i don't really want to have toworry about Docker Compose i don'treally want to have to worry aboutDocker files i don't necessarily want tolearn Kubernetes so instead what I cando is essentially author this score filelooks probably pretty familiar it lookslike a Dapper YAML it looks a little bitlike Kubernetes YL but it's verysuccinct it's very clear in terms ofwhat the developer needs and you can seehere I'm basically asking for a resourceright i want a Dapper state store idon't really care how I get it i justwant one that's what my app needs andhere again as platform engineer I willsupport the developer with animplementation a concrete implementationand here we agreed that hey let's do thefirst implementation and use the firstimplementation with Docker Compose sohere it's a recipe to to inject thesidecar container the placementcontainer all of what Dapper needs inorder to successfully run theirapplication as developer so that's theimplementation part and I mentioned theresource provisioner how concretely Iwill provision radius as a container howI will inject the component associatedto that so here I won't go too much indetail but that's how you will authoryour recipes as platform engineer andthen the workflow is score composing itin this case it could be scorekubernetes there is anotherimplementation and you init your currentstate current local uh machine and youdo score compose generate on the scorefile and guess what with the targetedimplementation score compose it willgenerate the compost file so now I couldrun the third command here dockercompose hub and now I have all mycontainer built and run successfullylocally and I could test my uhapplication so let's do the demo hereyeah I can talk to the slide wellso ultimately what we're going to shownow is going to be a little bit of whatI did but we're now graduatingenvironments so as we talked about youcan swap Dapper components which isreally nice um what Matt's going to showus is how essentially without having toauthor any Docker Compose we can takethat same application we can take thosescore files and essentially run them inDocker Compose with two Reddus instancesfor PubSub and state and then we'reultimately going to use those exact sameum score specifications from a userperspective to run that workflow uh thatworkload and workflow on Kubernetes butthen we're going to swap out thecomponents so a user is going to getPostgress um for example in in thestaging environment instead of getting aReddus local instance but the developercode and the developer specificationdoesn't change at all so Matt I'll handit over to you wonderful so what I wantto show you here is uh very much the umthe score file the first course scorefile that you want maybe is with thenotification workload you remember thefive microservices that Kendall uhshowed you that's a score file i will beable to describe the intent of uh andthe metadata the information for myworkload the port it will run on butlook at this i'm also asking forresources and the resources we could seehere is for example I want to subscribeto a pubsub right and now what I want toshow you is another example with theinventory same I'm describing the intentof deploying my workload inventory onanother port and here I will have thisuh dapper state store and I ask for astate store right so that's on adeveloper perspective now what I want toshow you is kind of the workflow veryquickly with score compose so againplease generate the compost file basedon all the score file right and with theimplementation provided by the platformengineer the developer will be able togenerate that seamlessly it's abstractedbut guess what now I have now a verycomplex compos file so here with thiscore implementation core compose I'm notasking my developer at scale to write300 of line of docker compose and knowthe technical detail about radiusdatabase state store component and allthe relationship between the workload sohere again it's generated abstracted andthen we could do docker compose up and Ihave all the container running and now Icould check that I have two radiusdatabase one for the state store and onefor the pupsub that's the defaultsetup now what I want to show you veryquickly is I could open the notificationpage I'm running on docker and now I'mdeployed via score file and score righthere what I want is generating trafficso I'm locally all the docker arerunning and I'm generating traffic tothe notification and talking to tworadius database right so that's thefirst uh step right again as a developerI don't know the technical detail i'mjust deploying my um scarf file thesecond part of the demo is let's move onin the CI/CD pipeline right i want topush to development staging andproduction it will go throughout a CI/CDpipeline but here for the demo let'sillustrate this so here again pleasecould you generate the Kubernetesmanifest for me right as a developer Idon't want to do that so the platformengineer were able to provide the recipeuh using this core Kubernetesimplementation and guess what again it'sgenerating a Kubernetes manifest filewith a lot of Kubernetes resources andhere I have 600 of line again developershouldn't take care of that and dealwith the technical detail is it secuream I injecting the right labelannotation so spacing tabs all thethings don't have to worry about itexactly so now I will be able to pushand deploy this in mycluster and now I have posgress andrabbit MQ so what we did also as anexercise here is I want to graduateradius is good for the inner loop fastbut now we want to use posgress we wantto use rabbitq in production maybe withazure with any cloud right and I puttogether as platform engineers therecipes behind the scene the developerthey don't change the code they don'tchange the docker file they don't changethe score file and now guess what I havea newapplication deployed right and I'mgenerating ingtraffic sorry aboutthat and you could see I'm now onKubernetes at the bottom using Postgressfor the state store Rabbit MQ for thepubsub and now it's still working so wehave this seamless um deployment processum in place now I think one thing that'simportant I know we're at time but onething that's important to consider rightour goal isn't to make it wheredevelopers don't have to use their mindsright like that's not the goal the goalisn't to create an abstraction that's soblackbox that people are resistant to itright because that's a reality as wellso you have to empower developers tounderstand the tooling to be able toeducate themselves on how to break theglass but also to not have to beinundated with worrying about thisday-to-day so hopefully that gave you agood picture of how Dapper and Scoretogether can create a pretty compellingexperience we would love to connect withyou the rest of the conference there isa Dapper open source booth i believeSCORE also has an open source sourcebooth so please come to see us uh pleasecome see us at the diagram booth as welland I believe we have a um tech booth aswell perfect yeah so we would love toconnect if you have any questions we'llstick around yeah thank you for yourtime thanks2025-04-15 22:00:20.050389 %%��6�Z#��#A-fGztPUuD8kwelcome to CubeCon super excited if thisis your first talk of the day of theconference um we're happy to kick it offfor you um my my name is Kendall Rhodeni am a product manager at Diagridpreviously was at Microsoft and has havebeen working on the open source Dabberproject for about four years now hi I'mMeno i'm Clative Ambassador oh could youhear me now yeah hi everyone i'm MatBenoa i'm Clative Ambassador scoremaintainer and customer success engineerat humanity very glad and excited to behere with you today awesome so todayyou've come to hear about mixing theperfect cocktail a little too early toactually give you one unfortunately butwe'll make sure that you have a goodtime today and learn a lot about Dapperand Score so just want to set thecontext a little bit about the currentstate of the world as I know it and asprobably many of you know it as wellright platform and infrastructureconcerns have become progressively moreapparent to developers and bleed intothe interloop development process rightadditional dependencies new librariesnew concerns that come when you break anapplication across the network and whenyou introduce largecale development youhave to start to find ways to helpabstract away some of the complexitythat's happening at the runtime andinfrastructure level from yourdevelopers so this is sort of justsetting the scene of of where we aretoday so depending on what messaginginfrastructure you're using whatdatabases you're using where you'redeploying your applications typicallythat's going to end up bleeding intoyour developers inner loop and slowingdown their productivityso for today's uh for today's talk umMatt and I are Can I call you can I callyou that is that okay yeah yeah uh we'regoing to be we're going to be uh playingtwo different roles um and these are tworoles that work very much together inthe in the cloud native space and onmost teams uh I'll be playing the roleof basically an application developerand ultimately uh let's say I'm ��roller which is actuallywritten in Rust um and that's new anewer component which is why it's it'swritten in a different language we madethis decision a few years ago to startwriting all of our new control planecomponents in Rust because we felt thatthe the ecosystem there was matureenough and we were ready to start doingthat um and that serves policyinformation down to the proxies soinformation like is this requestauthorized can this service talk to thatother service um and how should routingbe done are there uh routing overridesthat we need to apply to this trafficand so that information comes down fromthe policy controller to the proxy nowboth of those controllers thedestination controller and the policycontroller get all the information theyneed from the Kubernetes API so they'reestablishing HTTP watches on theKubernetes API and listening for updatesabout pods about service accounts aboutservices about endpoints and all thatinformation is coming into them they'resynthesizing that and turning it intosomething the proxy can use andstreaming it down to the proxy that'syour super high level introduction tothe linkerdarchitecture um and in in addition tokind of just serving that informationthese controllers do something a littlebit more which is they manage some CRDsand so uh linkerd uh integrates with thegateway API the gateway API is a set ofCRDs that includes um gateways andgateway classes and a bunch of othersthat we don't use but we do use HTTProutes gRPC routes TLS routes TCP routesuh and those are all CRDs that come fromthe gateway API uh that we interact withuh there's also a set of linky policyCRDs that we use to describe uh policythat link should act on like whetherthings are whether requests should beaccepted or denied and in whatconditions so we have things likeservers authorization policies and HTTProutes uh under thepolicy.io group which are kind oflinkard's uh CRDs and you may noticethere's a little bit of duplication hereuh there's an HTTP route CRD from thegateway API there's also an HTTP routeCRD from the linkerd group uh and thereason for that is because at the timethat we were implementing this thegateway API CRDs didn't yet support thetimeout field and we really wanted toimplement timeouts in this way so weforked the HTTP route CRD into thepolicy.io group in order to get supportfor for timeouts and uh so now wesupport both we those those two CRDs arevirtually identical uh virtuallyidentical the linky one has the timeoutfield actually the gateway API one nowhas the timeout field too it didn't atthe time now it does um but we now havethis kind of duplicate uh CRD this forkand we're in the process of kind ofphasing that out we want to get backonto the standard upstream uh gatewayAPI CRDs for everyone but we want to dothat in a backwards compatible way sothat's a a multi-phased umdeprecation so it's the linkd policycontroller that's the one that waswritten in Rust that kind of handles allof the CRD management and the threethings that it it does with respect tothese CRDs is one it validates them soit has a uh validation web hook wherewhen any resource of these types iscreated it's going to look at them andmake sure that they're valid accordingto uh a certain set of rules and it'llsay yes this is allowed or no that's notum it obviously watches these resourcesbecause these all inform the policy thatlinkd needs to act on so server serversand authorization policies and HTTProutes it watches all these resourcesand uses that information to synthesizethe data that it sends down to the proxyand it also updates the status on eachof these resources so each of theseresources has a status subfield and theuh policy controller will figure out heyis this resource accepted because it'sit's all good and all of its things thatit references exist and it's valid sowe'll accept it or uh if it referencesthings that don't exist or it's invalidin some way we will uh update the statusfield to say this has not been acceptedit's you know has an invalid back end orsomething like thatokay so I think that's all thebackground we need� we can now move on tothe fun part which is the bugs uh andthis first bug has to do withCRDs so here is a graph of policycontroller memory usage uh that's thatorange line that's going up and up andup and up which is not what you wantyour memory to look like uh this clearlylooks like some kind of memory leak uhpolicy controller I think at the endthere is using 780 megabytes which istoo many megabytes um so yeah this lookslike a policy controller memory leak ofsome kind it gets oo killed at at somepoint by Kubernetes something is clearlywrong hereum and we also when we saw reports ofthis bug again this is not a bug that weinitially found ourselves almost all ofthese bugs are things that are reportedto us by users who are using linkerd inconfigurations that are different thanwhat we experience as linkd developersyou know they're running at largerscales they're running in uh differentkind of cloud environments um and so alot of the debugging we have to do onlinkerty is very like word of mouthtelephone you know you run this commandtell us what you see um or these thingscould only be reproduced after linkerywas running for a week when the moon wasin this phase or whatever right theseare things are very difficult toreproduce so in this case they they cameback to with a report saying that hey wesaw this memory leak we also saw thatwhenever we created an HTTP route thestatus was updated to accepted but ittook 40 minutes and and that's prettyweird because why would that take 40minutes that that should be updatedinstantly either accepted or not umsomething very strange is going on uhand we also saw a bunch of uh spam errormessages in the logs so we saw failed topatch HTTP route no available capacityand so these are kind of all the clueswe had to try to dig into like what isgoing on here um it seems like we'retrying to patch these HTTP routes butsomehow the Kubernetes API is not ableto keep up with that those patchesaren't going through why would that bethe case this is pretty weirdum and so we kind of dug into well whatwhat are the steps here you know in whatcases do we patch these resources andwhy might that fail and so the generallogic here is uh well number one thepolicy controller is going to establishwatches on all of these differentresourcetypes uh and then number two becauseit's established watches it's going toreceive updates anytime one of thoseresources is created or is updatedum when it does receive an update it'sgoing to look at that resource and it'sgoing to compute well what should thestatus on this resource be you know areall its backends valid do they exist andwhat should the status be should it beaccepted or should it be notaccepted and then once it's determinedwhat that status should be it cancompare it to what the status is and ifthere's a difference then it'll issue anupdate to the Kubernetes API saying "Heywe need to update the status here's whatit shouldbe." This makes senseum and so maybe it's that that the laststep that updating the status that'swhere we're patching the resource maybethat's where this problem is occurringso let's dig a little deeper into howthatworks so on the left is uh a littlesnippet from the CRD the custom resourcedefinition uh specifically inside thestatus subfield of thehttpouts.policy.linkery.io CRD um andthen so inside the status there's a listof parents so we say what the status isfor each of the parents that that HTTProute is attached to and for each ofthose parents we have listed out itsgroup its kind its name its name spaceand its sectionname and then on the right hand side wehave the uh way that this is representedin the policy controller which iswritten in Rust uses Rust kind ofbindings to to interact with this API inKubernetes um and so there's a strctthat we have that represents theseparent references and the strct has abunch of fields one for the group onefor the kind namespace name section nameand port and if you kind of blur youreyes and try to spot the difference uhthere's a field on the right thatdoesn't exist there on the left sothere's this kind of mismatch between�what's defined in the custom resourcedefinition and what's in the strct uhhow it's representedand so what happens as a result of thismismatch is that the Kubernetes API hasthis object that has no port in that uhin that status because that's not afield it knows about and so whenever oneof those resources is updated or createduh it'll send an update to the policycontroller saying "Hey you're watchingfor updates on this resource here here'san update here's a status." Uh itdoesn't have a port because that's not athing uh the policy controller will say"Hey okay great thanks for the updatei'm going to compute what the status onthis should be and because of the waythe code is written it's going to have aport in that status field and it's goingto compare those two and say "Heythere's a difference here this one has aport this one doesn't so I'm going toissue an update and I'm going to writethat to the Kubernetes API." KubernetesAPI says "Great thanks for that updatei'm going to take this i don't know whatto do with this port field becausethat's not part of the CRD so we're justgoing to throw that away." Um and soit'll store it uh in Kubernetes withoutthe port that resource just got updatedso that update will go back to thepolicy controller without the port andyou have an infinite loop where we'rejust going to spam the the KubernetesAPI over and over and over again withthese uh updates that getignored uh and so that causes thisballooning of memory to happen thatcauses the policy controller toeventually be oo killed the whole thingis so overloaded that any of theseupdates take 40 minutes in order to gothrough before there's capacity it's adisasterso fixing this was very easy once weonce we determined what was going wrongum all we had to do was was correct theCRD to add that missing field um and thereason that that field was even missingin the first place is because the timeat which we forked the uh gateway APICRD that that field did not exist butwhen we generated the Rust bindings at alater time uh based on the gateway APICRDs that field did exist so there wasthis mismatch in time between when weforked the CRD and when we generated thebindings and that caused this mismatchhere uh which was very nasty uh but oncewe fixed it everything worked magicallyagain and and life wasgood so the lesson here I think to learnis uh number one be careful with CRDs umjust because you write a resource toKubernetes doesn't mean that it getspersisted that way uh you can even writea resource to Kubernetes and then readthat resource immediately back and thesemight not be the same thing so the CRDis kind of the ultimate authority onwhich fields exist and and will getpersisted to Kubernetes so it's it'ssomething that has bitten me more thanonce um and I think the other lesson islike don't maintain CR forks uh if wecould get out of this world uh morequickly I think we would and and we'rein the process so uh hopefully very soonwe'll just be using the gateway API CRDsand and not have to maintain uh theseforked ones that can kind of get out ofsyncokay so that's that's bug number onelet's let's move on to bug number two uhthis is probably of the three myfavorite because this is just a veryinteresting confluence of a bunch ofdifferent systems i think that's one ofthe really interesting things aboutlinkerd and why some of these bugs areso interesting is because linkerdy as aservice mesh sits right at the center ofuh network protocols and the kubernetesAPI and CRDs and your application andall these different things kind ofinteracting in sometimes unexpected andunpredictable waysso uh this all starts with a bug reportthat says linkerd is routing to staleaddresses and anytime I hear this myheart sinks because these are so hard todebug um so this is a class of bug wherelinkerd is routing to some address thateither doesn't exist anymore or is inuse for something else in other wordsit's somehow missed an update aboutservice discovery um and that canmanifest as connection refused becauseLinkerty is trying to connect to some IPthat's no longer in use nothing'slisteni�ng there um or even worse itcould be tried to connect to somethingincorrect because that uh original podit tried to connect to has gone down andthen I that IP address has been reusedum so these are these are really nastyuh so usually this means that linkerdthe proxy has missed an update of somekind it's working on stale data uh butthere's a lot of different things thatcould cause that like why is it actingon stale data did it somehow drop theupdate does the proxy have a bug or didthe destination controller never sendthat update to the proxy in the firstplace so maybe the destinationcontroller has a bug or maybe thedestination controller never got thatupdate from the Kubernetes API so maybethe Kubernetes API is in a bad statethere's just a lot of moving pieces andit's hard to know kind of where to lookfor theproblem and it's even harder becausethese types of errors are very um statebased uh across multiple differentsystems right there's the proxy there'sthe destination controller there's theKubernetes API and you know you need tomake sure that all of these are in thecorrect state but what the correct stateis changes over time as you do rolloutsas pods come up and down um and so kindof to correlate what the correct stateof each of these systems should be at agiven time and make sure you'recollecting the right data in order todebug that is very very difficult um andthese errors of course are verytransient usually once you restartthings they often just go away and thenyou're like well what happened i don'tknow um and it's very hard to look forthe absence of something like we'remissing an update so each individualsystem taken in isolation will seem likeit's doing the correct behavior um sohow do you look for something that'smissing how do you know what's missingvery trickyum so if we dig into kind of how theseupdates flow from the Kubernetes API allthe way down to the link proxy a verysimplified uh view of it is is this sothese are things that happen in thedestination controller that's the onethat was written in Go um and that's theone that serves addresses IP addressesdestinations down to the link proxy umand so whenever kind of a relevantresource like a uh endpoints resource orservice resource changes in theKubernetes API well we get an update uhin the destination controller because wehave those watches established uh and weuse client go informers to to establishthose watches um so that will triggersome kind of on update call back in thedestination controller and so once weget that update we'll figure out kind ofwhat the right state uh for all the theproxies should be for that and thenwe'll go through a loop and we'lliterate over all of the proxies thathave subscribed to that service and foreach one of them we'll send an update onthat gRPC stream so if you have you know10 different proxies that have allsubscribed to a certain service thatservice updates then we'll loop throughthose 10 proxies and send an update toeach of themand the uh code that is sending thosegRPC update up updates uh lookssomething like this uh this isstream.end is the relevant API here sothis is sending on a streaming gRPCresponse so we have some kind ofaddition which is the update and we'redoing stream.s send that additiondown um and what you should notice hereis that stream.send only takes theupdate as a parameter it doesn't take acontext object there's no mechanism forcancellations there are no kind ofchannels involved here in any obviousway there's no obvious way to abort thiscall it's a totally blocking call whenyou call stream.send that call willblock until it returns and it will notreturn until it has done the thing untilit has sent themessage so there's no kind ofasynchronous uh behavior baked inand it's sending those messages over agRPC stream uh so gRPC is transportedover HTTP2 and HTTP2 has this concept offlow control windows and the idea behindflow control windows is that when you'retalking over a HTTP2 connection thesender can only send so many bytes untilit receives something called a windowupdate and the window update is thereceivers'�s way of saying "Hey I'vereceived those bytes i've processed themand I'm ready for you to send me more."Um and without that you know we couldget into a state where the sender issending data faster than the receivercan process it and uh you know it getsbacked up it's cues overflow you know uhyou drop data so in order to avoid thatsituation we have this back pressuremechanism where the receiver can sendokay or the receiver can say okay I'mready you can send me more nowum and if the receiver doesn't sendthose window updates then the senderwill wait and and not send any moredata so for example this is a sequencediagram where the sender says "Okayhere's 64 kilobytes of data." Thereceiver says "Okay I've read those i'veprocessed them i'm ready for more here's64 more okay I'm ready here's 64 more."And if the receiver doesn't send anymore window updates at that point thesender has data to send but the senderwill just wait and say "Okay thereceiver is not ready for this i'm justgoing to hold on to it until theyare." So if we go back to uh thispicture and we imagine that scenariowhere the the client or the receiverwhich is the link proxy in this case isnot sending window updates for somereason well then that call to send willeventually fill up you know you callsend repeatedly as updates happeneventually that's going to fill up thatconnection window that flow controlwindow and if the uh receiver is notsending window updates then eventuallythat will fill up and it will blockit'll say "I'm not going to send anymore data until you're ready for it." Uhand that call tostream uh and that's really bad in thiscase uh because if streams send blocksforever then everything kind of grindsto a halt in the in the destinationcontroller because now we're not able toproceed past that point we're not ableto continue looping through all of theuh proxies that have sub subscribed tothese updates we're just going to bestuckdeadlocked um and so could this actuallyhappen this is like kind of theoreticalup to this point why would a proxy likenot send window updates um and theanswer is I don't know but there's avariety of reasons why it potentiallycould um there could be a bug in theproxy in the proxy's network librariesum you know something weird could begoing on there that's causing it to notsend those updates um this can happenbecause of CPU starvation so if theproxy is not getting CPU cycles thenit's not able to send window updates andit can kind of get stuck in this stateum there could be other networkweirdness like we've seen things wherecontract loses track of a connection andthe connection gets into a weird stateand updates go missing um or that thingon the other side of that HTTP2connection that gRPC connection mightnot even be a linker proxy it might besome other client that's just callingthe destination controller and then notsending window updates and causing thewhole thing to deadlockso this is really bad as long asstream.send send is blocked none of theother listeners are getting any updatesevery single uh proxy that is talking tothat destination controller uh is goingto be starved forinformation and actually it's even worsethan that because that for loop that'slooping over all of those um proxies allthose listeners comes from the client gocall back the informer call back and soas long as the informer callback isblocked no other callbacks will fire sothat destination controller willactually stop getting updates from theKubernetes API entirely or it'll stopacting on them because the the callbackisblocked so this leads to this statewhere one misbehaving client whetherthat's a proxy or some other client uhcan deadlock the entire system just bynot reading data or just by not sendingwindow updatespretty prettynasty um yeah it'll just be kind ofstuck in this state uh potentiallyforever so and even fixing this was nottotally straightforward we were able todo it by kind of splitting things outand putting a queue in between so now uhwe've moved on to an architecture wherewhenever we receive these updates uh andwe kind of loop through all of thesubscribers who are listening forupdates uh on those services instead ofjust sending those directly to the gRPCstream we'll now incue those into achannel um and we'll do that in a waythat we can kind of guarantee that thatwill never block and then meanwhile inanother go routine we have a Q processorwhich is processing those updates off ofthat Q and then sending those down onthe gRPC stream and so by separatingthose we kind of isolate the theblocking behavior now we have a Q per uhper listener and if that listener stopslistening if we stop getting windowupdates there it's only going to blockitself it's not going to block any otherum it's not going to block the thecontroller call back and it's not goingto block the updates to any otherlisteners so it it can only sabotageitself um and if that queue does becomefull then we can just simply terminatethat stream we can say "Hey you got toofar behind you're not reading updatesfor some reason we're just going toterminate you and and you can reconnector or whatever."So I I think the lessons we learnedthere well number one flow control isreally cool uh the way that HTTP2 workswith flow control is very powerful uhand it was kind of in some sense workingas intended it was exerting this backpressure it's just that we also had thisconsequence of the architecture thatthat back pressure was getting exertedon everyone instead of just onto thatone stream uh the other lesson is justto be very very careful with blockingcalls um and to know which calls areblocking and which are not this was abehavior that surprised us because wedidn't expect just sending onto a gRPCstream could potentially block uh andmaybe block forever um so something towatch out for um and going handinhand inthat hand inhand with that uh is just tobe extra careful to never block insideof an informer callback because that'sespecially nasty anytime the informercallback thread is blocked you're goingto not get any call backs at all fromfrom clientgo all right we have one one more bugbut I don't think we have time to coverit here but these slides are um onSketch so you you know please pleasecheck them out this last one is a a veryinteresting memory leak i'lljust jump through here we kind of canlook at some allocations um and um I'llI'll I'll skip past all of this but I Idefinitely recommend you check it outbecause it's it's very interesting umbut the key lessons here uh are to notjust look at allocations which is whatwe were doing initially but but alsolook at deallocations those are kind ofin some sense when you're looking atmemory leaks more important um and andbe careful you what you put in map keysso watch outum anyway uh so those are those are thethe bugs that I want to talk about um Ihope you enjoyed those uh if you haveany questions um this is all work thathas been done in the open source onlinkerd so those poll requests are justup on the internet for anyone to look atum you can also come find me and talk tome about these or or other link bugsthat you find um I'm going to be hangingout at the buoyant booth in the projectspavilion for for the rest of the week orin the buoyant booth the linky booth orthe buoyant booth uh so come find me andtalk to me and uh if you're interestedin learning more about linkerty there isa service mesh academy which is amonthly live hands-on training that isvery good uh I highly recommend you cango tobuoyant.iosma to sign up um or there'salso a certification that is self-pacedand uh kind of a linkery one on 101 uhso also a good way to learn more aboutlinkerty thank[Applause]you and I think we maybe have just ashort time for questionsyeah I think there's a microphoneyeah so uh very good talk thank you uhis there a reason why you left out aproxy injector because it's bug free orI don't know because I only had 30minutes to talk about bugs oh okay butbut certainly the proxy injector hasbeen the source of of of bugs too sothere's there's no shortage okay thanksall right if there's no other questionsthen just uh come find me anytime thisweek and happy to chat more[Applause]2025-04-15 22:00:20.799613 FF��[#��iAKcjh0-hXwWwhi everyone welcome thank you for comingto my talk uh this is the museum ofweird bugs uh this is uh our favoritesfrom eight years of service meshdebugging on lingardy or you could havealternately titled this each of thesetook a year off my life um there's somenasty some nasty bugs in here so I hopeyouenjoy um so my name is Alex Leong i am amaintainer on the link project i've beenworking on linkerd for eight years sincethe project began um and so I've seenand created a lot of nasty bugs in thattime even though I've heard it'sofficial link policy that link shouldhave no bugs you know here we are um soif you're not familiar with whatlinkertd is linkertd is a service meshuh service mesh was a really really bigbuzzword a few years ago and I think nowhas kind of descended into the the realmof being just boring technology which isgreat um I love that um it has been inuse for for over eight years um and it'sbeen in use by a lot of different peoplein a lot of different environments at alot of different scales um and so that'smade uh kind of a ripe environment for alot of different bugs to be exercisedand to occur and to happen in reallyweird ways that are very difficult toreproduce um and that also makes thingsvery challengingum so I wanted to just start by giving alittle bit of background on linkerd andthe highle architecture very high leveluh I don't want to get too much into itbut I want to make sure we have enoughshared context and we're on the samepage in order to be able to talk aboutthese bugs and how they kind of relateto the link architecture um and thenI've kind of selected a a tasting of ofthree bugs for us to go through uh eachof these is very very delicious um Ithink we probably will only have time toget through two of them we'll see how itgoes uh they they each deserve theirfair share of time uh but we'll seeuh so just to give some highlevel uharchitecture of of how linkerdy works umon the right hand side is what we callthe data plane and so that's where yourapplication in Kubernetes is runningeach of those yellow boxes on the rightthere is a pod and inside each pod isyour applications container and thenadditionally there is uh the linkd proxywhich also runs as a sidecar containerin those pods and that sidecar containeris written in rust that's the linkermicroproxy uh that runs alongside andhandles all of your network traffic andin order to do so whenever you try tosend a request to some service linkdneeds information about how to routethat request what kind of policy toapply to it and so on and so on and inorder to get that information it reachesout to the linkd control plane which isrunning in a dedicated name space and itdoes that with a gRPC streaming API soit says hey I want to talk to thisservice please tell me all about it andfurthermore keep me updated as thatinformation changes so if new endpointscome up and get added to that service orif the policy changes on that servicestream those responses down to me sothat I can stay up to date and treatthat traffic properlyuh so inside the control plane which isthat middle yellow box there are mainlytwo different processes that run thereuh there's the destination controllerwhich is that first blue box uh and thatis written in go and that provides umdestination information to the proxy sothings like endpoint addresses and uhTLS identities and stuff like that umand that information is all streameddown to the proxy over that gRPCAPI uh the second box there is thepolicy cont��e're here what we can do isbasically um like not allow that rightlike because together we are in themajority um so allies uh play a crucialrole and if you look at history uh realchange has happened when allies stood upstepped up and um advocated uh for umaccessibility by using their privilegeand influence um and um that is rightnow particularly important it's alwaysimportant but with the current climateit's particularly important so my hopeis that uh you all leave this panelempowered with actionableum tips to start advocating for a moreuh accessible and inclusive uhcloudnative community with that let'sget started i'll hand it over to ourpanelists so uh to introduce themselvesanastasia do you want to get started yesthank you Katherine um just to let youknow I've used the sign here Katherineuh this is Katherine's signname uh my name is AnastasiaGupska i'm an S sur DevOps engineer iwork for BT BritishTelecom i'm also the first CNCFambassador who is deaf very proud to behereuh hello everyone i'm Shand i work as alead software engineer at Jen i am theco-chair of the CNCF deaf and the heartof hearing working group and also amember of the T contributor strategy iam a person with hearing impairment andI rely on captions and lip reading andif you see me looking at my phone I'mnot checking my official email i'mmarried following the captions thank youhey everyone I'm Rob Cotch hello thankyou for being here i hope you saw thekeynote this morninguh so anyway uh I work for slalom as adata engineer principal data engineer uhso I am also a um with the deaf andheart of hearing working group uhco-chair and I'm also an AWS data heroso got lots of hats to play around withhere i'll pass it on toMilad hi my name isMilad and I was born deaf and uh mymother was sick during her pregnancywith me and I was born deaf as a resulti work for EPAM i've been there about 10years i am a software engineer and alsoum I am a founder of DeFeek a YouTubechannel where I produce videos regardingdeafness and technology and improvingthe work conditions for people who aredeaf and heart of hearingokay great uh so first question so whatdoes allyship mean to you and how has itshaped your experience in the communityanastasiathank you great question uh so allyshipfor me um an ally is a person who doessomething very small that becomes a bigimpact for somebody like me so forexample uh because of the situation inmy home country I had to move over tothe UK um just to let you know by theway that my first language is Ukrainiansign language um but here in the UK weuse British Sign Language so I have twoBSL British Sign Languages sign languageinterpreters working with me and alsoyou'll see on stage ASL or American SignLanguage being used so we also haveAmerican Sign Language interpreters herein the building so really complexlanguagesituation uh so yeah bit bit of achallenge um I also do use American SignLanguage myself so I'm trying not tolook at the ASL whilst I'm trying to useBSL because it's really hard to to useboth languages um uh they don't mix verywell so I'm concentrating very hard onmy two interpreters today uh so an allyfor me um when I moved over over to theUK um I'd previously worked in SouthKorea uh in a very different market andum I needed to upskill here in the UK tobe able to access the the job market ididn't know British Sign Language um mypreference was to use verbal writtencommunication um I didn't often haveaccess to interpreters i was trying todo some online workshops uh watchingvideos online and they didn't havesubtitles or captions so I'd asked thecreator if he could uh add some captionsto his videos and he said "Yepabsolutely no problem." He himself addedcaptions to his training videos andthat's how I was able to access theworld of Kubernetes and because thathappened I'm here at CubeCon i am a CNCFambassador um and I I presentedyesterday at CubeCon at Argo Con so I'mreally really proud to have that happenand that's what allyship means tome and I would like to add to what umAnastasia just said allyship for me whenI envision what allyship means it's so�impactful because your voices amplifyours and help spread the word aboutaccessibility and um our needs so oftenit's so easy for you as hearing peopleto just ignore the needs of someonewho's who for example is deaf allyshipincludes everyone in access so it can beincredibly impactful and helpful soAllyship in cloud native particularlyhas been huge for my promotionopportunities in my career and my worldhas changed for the good and permanentlyas a result so understanding how we areincluded via accessibility that wassomething I had no idea prior to my myexperience here so it has changed mycareer it has changed my trajectory andallyship is responsible for that thegoals of even having something as simpleas an interview or access to aconference like this i would justencourage you all to use your voicesbecause it can be hugely impactful forsomeone's career and their life so weneed small actions big actions it reallydoesn't matter but even small actionscan make huge impact and big change forpeople in our community great and um canyou share a time where allyship played auh made a tangible um difference in yourlife um SEP or in your career actuallySEPuh after finishing myengineering no one was willing to hire aperson who cannot listen over the phonesince I prayed I still cannot listenover the phone and so I would get askeduh how would you how much your faceclient how much your face client costand everything but as I senior far awayfrom my city he chose to believe in meand so I started my career as a softwaredeveloper thousands of miles away frommy home in a very inclusive company butit was a city where neither language norclimate nor culture suited me and yet Ipersisted and in that wonderfullyinclusive company I rose to become thetop contributor in that company andwhatever I'm today after decade also Istill owe it to my manager in my firstcompany who molded me where I learnedthe art of softwareengineering for anyone who works as asoftware engineer the most challengingthing is to go to a client's place aclient's data center and deployeverything from scratch now when I wentto this client he was a grand old manwith a Santa Claus like beard so thebeard was so thick I couldn't see thelips there was no question of readingthem and he was very embarrassed thatwhatever he was speaking I couldn'tfollow but my manager texted me he said"You just focus on your work don't worryabout anything else." I was to completethe project in 45 days i was able tofinish it in 40 days and the clientbecame a wonderful friend even extendinga job offer which I politely decline sowhen you deliver value when you deliverexcellence you go beyond your disabilitybut I had wonderful Alex to enable itthe CEO who just to have faith in me mymanager who empowered me so I wouldn'tbe here were it not for themyes I'd like to add to that thank youSandepso when I um started to be involved inthe deaf and heart of hearing workinggroup um I had met a lot of peoplethrough the cloud native community I wasworking with Kubernetesum and I wanted to learn more about Argoso I was really really lucky to meet myArgo mentor Kostus who's just sat at thefront in the audience there umKubernetes Argo rollouts there were alot of really complicated conceptsinvolved um and I wasn't alwaysunderstanding it very well and Costa sokindly would make videos for me addsubtitles to them and explain reallycomplicated technological concepts so Ihad a really good understanding ofArgo um and then I was able to uh workon an Argo project uh talk at aconference um and I can all also go intomy workplace and justify the need forArgo rollouts um make sure that theyunderstand the business case forit and that really really helped me umuh with for example automaticdeployments uh the automation ofdeployments in our organizationum and it means that you know I can talkto people about the efficiency of Argorollouts and how it can help their ownorganization so just you know my mentorreally helped me um because heunderstood that just because I was deafit didn't mean I was inefficient or Iwasn't able to work at a high leve�l or Icouldn't understand things he reallysupported me to get my name known in myorganization um and uh it means thatpeople come to me and ask me aboutKubernetes you know um I'm able to talkquite in depth about security things umuh and and if it wasn't for Kostas'videos and the subtitles he added tothem um I wouldn't have got to the placewhere I am in my career so his small actof allyship has had a huge impact on mycareeryeah and he's sitting just there Kostaslike just like this is just amazingthank you so much yeah so and actuallyyou inspired a program that I'm going togive like a call to action right afteruh like once we're done so uh justseeing the two of you is just amazingand I'm hoping that we can create moreof these relationships um but yeah moreto that later uh so um despite many DIinitiatives um there's like diversity intech is still lacking what are thebiggest challenges that underrepresentedgroupsum still face um in the tech communityrobyes so DEI right now is considered a badword especially in the US um and fromthat perspective it really doesn'tmatter uh really DEI can changeopportunities for us and opportunitiesfor everyone to get involved and we canget involved mutually it's really we caninteract with each other and thediversity and inclusion and equalityhelps the organization and benefits youknow like a two-way street really uh Iwould I would say thatyour products your community within yourcompanies within your culture once wehave the allyship in place and it'sbuilt into a company's culture it makesit a lot easier to provide support tounderrepresented groups and makingdecisions along the way and propromoting products to theunderrepresented groups as well so yourproducts uh and services will be muchbetter overall with that allyship like Isaid a two-way street and getting thatum to become ye like second naturereally and I'd like to add to that umprior to um experiencing this myself myhands are actually freezing right nowbecause of the air conditioning so I'mtrying to warm them up so I can actuallysign effectively but here we goum all right so um to add to what RobjustsaidDEI it's still something that is achallenge we face um in in many respectsand when you have a role model forexample like Rob has been a tremendousrole model to me and helping me increasemy access and my ability to be includedum in communities that I didn't have anentree into before so um for example umwe've got some wonderful leaders hereand um in it's it's expanded the roadit's made the road wider for us to getinto more places we weren't able to getinto before so having role models umhelp us get in and mentor us teach ushow to interview for example or um anyof those professional roles that wedidn't have access to before it smoothsthe way for us to be able to get in 100%agree okay great and how can alliessupport accessibility in both thecommunity and within their organizationor the industry um Sundepso my journey with the death and theheart of hearing working group it justhappened by chance in the keynote inChicago I happened to sit next toKatherine and she she had just helped tofind this group and she said come be apart of it and I've been with the groupsince the last one and a half years notmissing a singlemeeting so I actually wanted to expandmy contributions beyond the death andthe heart of your working group so lastyear in February I attended the Dcontributor strategy meeting andKatherine is a part is a question of theD CS group so I attended that meeting ashe wasn't there and I did not knowanyone in the meeting but the host hejust asked me one line are the captionsworking for you so that one line thatone line made me feel so much belong andincluded and that is how I started myjourney in open source and then I wentto the sick country group and from thereI went to the com sub project of thesick country group and I wrote aspotlight for the deaf and the heart ofhearing working group which is up on thecoven later later I got in touch with amentor and it's pretty young so that isthe beauty of open source is that a veryyoung guy can also �mentor the seniorpark so he asked me that if you stillwant to contribute more to open sourceuh m he asked me to get in touch withKeline and to Keline I became a comshadow for the contributor summit so Iwas the last shadow of the Kubernetescontributor summit because now the KCShad been rebranded to the maintenancesummit so as a case shadow I got anworking of how how the contributorsummit runs i got to know about thesteering committee about the technicaloversight committee i understood thewhole CNC of landscape and it was like alot of hard work lots of meetings but Ialso got after six months of hard work Iwas felicitated with but the job wouldcarry out the award in Salt Lake City inNovember so for someone who started thisopen source journey in April last yearand within six months to get the awardthis award actually didn't belong to mebut rather to the numerous alies who whoacted as empower who enabled my journeywho helped me in every space in mycontributions so all has had a lastingimpact a small change goes a longway thank youyeah I mean you you basically see thatjust like that little line was was someaningful which kind of means thatallyship is not like doesn't have to bethese huge things like just littlelittle steps make uh um a huge have ahuge impact um would you like to addsomethingyes I would like to add um how allyshiphas um been such a support um using yourvoices to amplify ours even in smallways can be a huge impact as we havesaid it can really impact us and createchange forus because I've noticed as we haveprogressed in getting the word out oradding captions sometimes um captionsaren'tenoughand to create full accessibility foreveryone sometimesrequires additional accommodations butoften times it's not thought untilafterwards to say "Hey did you needadditionalaccommodations?" And then I've missedthe meeting so asking me before themeeting on what sort of accommodationsthat are needed so that I can providethat input is much more helpful thandoing it after the fact so allyshipmeans supporting some some ideas ofaccessibility prior to meetings and thatcan really create a lot of change aswell great um so if there was one thingyou wish uh all um allies um oh thatpeop if you there was one thing you wishpeople would know about um allyship whatwould that be sundepuh that accessibility is not reallyrocket science it does not need like alot of intense resources all it needs isjust a little bit of empathy and ofawareness and lots of willingness withallship you don't become an ali you justyou just internalize it you don't haveto have the word but you just become onewithout even knowing you have become oneso like you and your partner you go to agift and the entrance is not wheelchairaccessible and you just call out thathey why is the entrance not wheelchairaccessible neither you nor your partnerare using a wheelchair and yet you callthis out because you sort of internalizeit this is something that I'm sharingfor my personal example so the is justnot that limited that I think about thedeath population i think in a way aboutthe other underrepresented groups aswell about the bipok community about thevisually challenged community so I tryto prepare my presentations where I useminimal images so that the blind peoplecan actually follow my presentation soit's just just a little bit ofwillingness and not too much of recencealso what Sandeep was just saying Iwould like to add to that because in inthis working group um we've had a fewpeople who are hearing recently join usand that is amazing it is such awonderful act of allyship to join us andit's quite a simple thing to join us andmaybe it's just out of curiosity um andduring different events we're able tospeak to each other and increase eachother's awareness about simple thingslike um accessibility like we've beentalking about today but having thatsupport from others is tremendous uhwhen you join us even for an event orour monthly meeting just to watch andlearn it says so much and it it spreadsawareness in a way that might not bedone otherwise it's very helpfuli'd also like to add t�hat I wish thatallies would you know to answer thequestion that's being posed here feelfree to really ask us any kind ofquestion what you think uh you know askus what we need what we need to besuccessful uh relationships and thingslike that what do you need what whatwould you like how can we have an openconversation don't worry about beingrude or offensive to us just absolutelynot just enjoy it enjoy thecommunication let's laugh together let'stalk and probably learn a little bit ofsign in the process absolutely right yesand that can open a lot of enjoyment foreveryone that can be wonderfulexperience yeah maybe they'll learn somebad words even huh indeed indeedbut go ahead Rob sorrybut my point is is thatuh the more that we work together andinteract with each other and value eachother's input as well as just sharingknowledge and learning something newevery day right uh a lot of those thingsthat people don't know about maybe whatI do in my day-to-day life somethingfunny or something stupid who knowsright but like thinking about uh thatmight not be something that others werethinking about so this at the conferencehere we're all here to learn somethingnew tech mostly right but we're alsohere to learn about each other and thecommunity and it's about us for us rightthat's absolutely true and as you becomeallies the first thing is involved isjust learning how to communicate andenjoying us and enjoying each other'scompany and it'll help you to understandmore easily about um what we need andhow we can be helpedwellsaid oh well signed I guess absolutelywell yeah so uh many people want tobecome allies but don't know where toget started and uh I hope many peoplehere uh uh um ask themselves thatquestion so what are the first stepsthat people could take uh to get startedyeah I'll take that so um Rob and Miladhave both said um always ask us what ourpreference is um but also you can startwith something very small just asking isa small thing for example um if you meeta deaf person outside um at CubeCon youknow just let us know uh or if you sorryexcuse me if you meet a deaf person outin your organization where you'reworking um and they maybe haven'tattended CubeCon they don't know aboutour working group let them know about usuh introduce the deaf and heart ofhearing working group um deaf peopleneed to know that we're here um and alsoif you organize a local event um or ameeting a meetup uh think about addingcaptions to your presentationsum think about uh calling out lack ofaccessibility make sure that there'sspace for us platform and spotlight deafpeople all the time because often we'llgo to an an event or we'll want to go toan event we'll contact the organizersand say is there going to beinterpretation there is there going tobe captions and often we don't even geta response um and you know just callpeople out know that that organizationsand events can do a lot better than thanhow things have been in the past thatcan really really help us and that isrealallyship if you feel you're not sureabout how to connect with deaf peoplehow to talk to us don't feel awkward youknow come and talk to us ask questionsask the best way to communicate with usdeaf people are very very friendly andvery open to talking with people thatthey've not met before um we're happy totalk to you about what we need so justto reiterate starting with small thingssharing our posts spreading awareness ofour deaf and heart of hearing workinggroup uh make sure you repost the thingsthat we're sharing about our work andour lives that helps to platform andspotlight deaf people and that's reallyreally important um and that means thatevents are much more inclusivesometimes if you're at an event and ameeting's very very loud there's lots ofpeople talking over each other and younotice that somebody's sat there notcontributing why not just hold themeeting and say "Is there anything you'dlike to contribute?" You know for thatperson not only for deaf people but forlots of people who struggle to interruptlarge busy crowded spaces it's reallyimportant to make space for people andmake s�ure they're includedto add to what Anastas Anastasia saidthe resources online uh for the deaf andheart of hearing working group that wehave with the CNCcncf.io website there is a plethora ofinformation there so just to let youknow how you can you know maybe you'rehosting a small local meetup there's uhguidelines that we have listed therethat you can follow you know maybebringing a TV screen and hooking that upwith a caption using AI or whatever asan example ai again yeah totallyor we could um you know if you're havinga midsize conference there's u you knowwhat does the typical accommodation looklike for that and a large scaleconference like something like thisright so we have all kinds ofrecommendations for accommodations andyou know if you're afraid to ask we'vegot the resources there too so I meanreally of course uh like Anastasia saidjust ask us uh you know I'm I'm deafhe's deaf you're deaf he's deaf and weall have different range of uh aspectrum of accommodations that werequire right so uh just differentthings um all deaf people you know havedifferent um preferences andrequirements some are you know heart ofhearing some can speak some not so yeahjust ask us out directly it's totallyfine and so if I have to just say oneline I would say ask and don't assumevery good perfect um yeah and just likeum talking about events and calling outwhen you don't when it's not accessiblebecause a lot of times people say likewell there are no deaf people here so wedon't need captions well guess why thereare no deaf people because they thereare no captions right like suddenly andthat'slike and the Linux Foundation were greatlike as soon as we because events werenot accessible right and then we weprovided some best practices and so onand and they did it in lightning speedwhich was amazing and now deaf peopleare here so like the reason they're notthere is just because they're notaccessible and that kind of like it'sit's terrible because it's really likethis this loop people not assuming andso um yeah things should be accessibleby default and people will like willcome um um but yeah this is um yeah theend of the discussion i have a few uh umannouncements to make and then we have afew minutes for u Q&A um so this is thething cost this is your fault uh so weare starting a uh mentorship program umso if you want to be a mentor of anunderrepresented group um yeah scan scanthe QR code uh you learn a little bitabout that um we're still like the firststep is like getting uh mentors and thenuh we'll try to match them uh so pleasedo I'm going to wait you going to be amentor i could tryhe needs a mentor okay and um then uhtomorrow we're going to have twoactivities we're going to have an opendiscussion uh so if you enjoyed thisconversation uh it's going to be more ofthis but but interactive so you're goingto be part of that uh so uh there is aDEI community hub uh and that's at 4 p.mand then a sign language crash course wedid that last time and it was packed itwas like people loved it it was so somuch fun uh so uh that's at 5:00 pm soit's one after the other also in the DEIcommunity hub and then we do have a uhkiosk in the project pavilion in theexpo hall it's we have the AM shift soin the afternoon you're not going tofind it but swing by and say hi uh anduh thank you uh and uh we'll would openit for uh Q&Athanks one one thing one thing before webefore we go to Q&A if we could go backjust a little bit that last slide Iwould like to add tothat about the mentor the mentor slidethat's the oneokay the concept is the same as umhaving a friend for instance asking mehey do you want to be involved in amentorship situation or do you want tomentor deafchildren i had no experience being amentor to deaf kids but I really wantedto be involved i wanted to become amentor and I wanted to learn at thattime so I was able to take what I hadlearned from my mentor and apply it tothat situation and work with kids whowere born deaf and they could have thatconnection with a deafadult andit if they have any difficulties thatare being overlooked in theireducational setting forexample they could learn on how toadvocate for themselves and from a deafadult and to have allyship work for themand it's the same concept here they wereable to learn that young because we wereinvolved in a mentorship program withthem so if you have something that youfeel you could teach please become amentor because it will impact futuregenerations of of people in tech frommarginalized communities and that's justmy opinion that I'd like to add did youhave something else to say well and justlike uh you will see it in the forum butlike you can pick whatever you want likeif it's like just like a one-time thingteaming up to give a talk if you want todo career advice um if you want to likehelp people contribute to open sourcelike you can pick different things andit's Yeah so it's it's not like the timecommitment depends on both of you so wehave something for everyone uh but ifYeah so if there are any questions thereis a mic over thereum anyoneokay hi can you hear me okay wellcongratulations everyone we can't hearyou i can't hear youi can see youcan see me okaywell I have two questions that the firstis how maybe can interact with this bestpractice of accessibility for the clothnative communities because I I wasasking myself that for example in thecloud native chapters how can we add theresources that maybe are alreadyavailable or even external to try tointroduce asa as a inside of the slides as a patternby default support and also uh my seconduh questions is about how can we expandthis about localizations because also isall another people that is deaf andhearing in other continents and how canwe expand it and help itand yeah even I have a just a brief likefor example we have the KCDs theseevents that we could add like a defaultpattern to have this h signed thatintroduce this kind of best practice toto more accessible in all our eventsyeah thanksyeah awesome question thank you uh Iknow that you know understandingaccommodations there's a price tag thatcomes with it of course so um noteverybody can afford it especially in umdeveloping countriesso that being said there are ways andapproaches around it we have instructionsheets on the website uh the DA uh thedeaf and heart of hearing working groupuh that you can uh page down and kind ofwork your way through um there's youknow things like like I mentioned beforethe caption on the TV screen um step bystep you know there's one um contributorwho built an app you know for that sothat's that can be utilized uh as wellas the community we you know we'llunderstand there's some situations thatthere's no sign language interpretersavailable so you know we just ask foryour patience and you know uh we canmaybe interact with the captions thatway and get what we need there sometimeswe can use our phones to be able to textback and forth or you know the oldschool way with a pen and paper uhthere's ways to interact with we can goback and forth and ways to work arounduh so just be patient i would ask what'syour experienceyeah well we were talking I wasinterviewing Sandeep recently and we hada bit of a communication um breakdown[Music]and we were talking about usingdifferent microphones and um being ableto interview one another smoothly and ina small group that's something you couldprovide very simply to purchase amicrophone that doesn't really cost muchor maybe they already have it um it's asmall expense that you could use withthe caption app like Sandeep uses umeverything doesn't have to be a massiveinvestment it can be a very small thingthat's very impactful as we've saidpreviously so that's just my advice isum you can make it very just buysomething very small uses a tool calledV BY so that by already has an inbuiltcaptioning that you can utilize for yourvirtual meetups okay okay in theinterest of time I think we need to windup we would love many more questions youcan hit us up in Slack afterwards yeahagain come to our kiosk if you havefollow-up questions because we ran outof time thank you so muchthank you thank you thank you thank youthank you so much love you all2025-04-15 22:00:21.508879 � ��9�^#��)AweOIwt1lbvIthank you all for joining It's excellentto see so many smart people in here Uhmy name is Thomas and with me I bring mycolleague Paul Uh and today we'll talkyou uh to tell you this talk of uh thebricks that make us and what that meansUh before we get too deep in I just wantto ask how many of you have had anyexperience withKubernetes All right Thanks That'sexpected How many of you have experienceplaying with Legobricks All right See that's aninterestingcorrelation So uh Paul yes as Thomasjust alluded to my name is Paul I am aplatform engineer in our Lego containerplatform team Um creativity is big is abig part of my personality Uh in factit's a part of my entire family'spersonality So I think that this job isa perfect job for me because Kubernetesis a is a creative thing Uh and so isLego So uh I hope that uh as you can seeall of the the stuff that that I do hasto do with creativity And right now I'mI'm trying to get into paint Uh and youmight see that a little bit in our inour talk here Um yes Uh and as you cansee in Paul's titles he's only anassociate PowerPoint engineer And thatmeans that if there's any flaws with hisPowerPoint uh that's expected That's youknow Paul is still learning right Uh asfor me I'm I'm Thomas Uh my journey withLego and Lego bricks started when I wasabout a year old I tried to eat my firstLego Duplo animal Uh didn't go as wellUh about a few years later when I wasthree uh I started playing around withmy dad's computer And then few yearsafter that I played with my first uhRobbo app or Lego Mindstorm set So youknow combine computers and and Legobricks right Uh and then six years ago Ijoined the LEGO group So I can combinethat with work and do that while I wasworking as well That was pretty cool Andsince then been doing a lot of differentplatform engineering roles inside theLEGOGroup All right But uh what size is theLEGO Group The LEGO Group is a is a toycompany right We we manufacture toys Umthis is our digital organization We asof this morning I counted we are 237different product teams There's about2,000 colleagues in the digitalorganization I think officially a littlebit less and then we have someconsultants I'm not entirely sure on thespecific specific numbers and we have alot of communities clusters and wedefinitely have more than 50 Uh but wecan't really say exactly how many uh andthere's a reason for��_�]#��uA-3NyXaVPGvothanks everybody for taking the time tocome today at the end of the day i knowthese things get exhausting but it's theend of day one so you still got someenergy I hope uh we are going to walkthrough a lot of content uh I tend topile a lot in uh and so we'll see howthe pacing goes as we go through it uhand sometimes the end catches up with meuh faster than I meant to so we'll justhave questions and answers if we getthere uh so what were you here for uh wetalk a lot abo��P�\#��WA1rgZPi2dTvEwelcome everyone um I'm KatherinePaganini i am uh the director ofmarketing at buoyant the creator oflinkerd and I'm also the tagcontributive strategy co-chair where Ifacilitate uh the deaf and heart ofhearing working group and today I amhonored to uh moderate this panel withthis fantastic group of people who areall all part of the deaf and heart ofhearing working group uh and they willintroduce themselves in a minute uh butfirst I wanted to say that this topic isincredibly important because I trulybelieve that allies can have a hugeimpact in making our community more uminclusive and if you wonder why umminorities I are in the minority bydefinition right so it's very easy toignore their voices or overhear theirvoices and w��ut open source software uhwe share software we understand uh whatother companies uh customers need andwant uh they bring requirements fromtheir own end users and then we bringour own solutions and we collaborate uhbut we don't talk a lot about why andhow uh we invest in open source softwareuh how do we rationalize this in ourcompanies uh and so I'm going to sharesome background and some context uh anda few of the waves that we've beenthrough in the Kubernetes community uhbringing us up to present uh and sharekind of how our plans have changed inrecent years and just share where we'regoing andwhy uh so our open- source uh softwareoffice opened uh in 2004 so we'realready contributing to you know theLinux kernel and others in at that timeuh but this is 21 years ago uh andthrough that there's lots of open sourceprojects from Google it's really a corepart of our DNA uh and it's reallyimportant uh to us and for us we consumeopen source software we contribute to ituh we often find that the best place tosolve a problem is in the community sothen we don't have to do somethingspecial we don't have to reinvent thewheel no one gets to Google because theywant to solve problems that someone elsealready solved um so it's reallyimportant to ourculture there i don't know this is alist of all kinds of other projects ijust got it from the uh open source atgoogle.com website uh and I realizedthat even in the United States ouropen-source program can drink so happybirthday uh okay so Kubernetes thoughKubernetes though 2014 to 2017 the firstphase was really about disruption thiswas us entering public cloud market umwe had app engine but really trulyentering public cloud we were lookingfor some unique aspect something that wecould contribute that others didn't haveuh this sort of coincided with the riseof Docker uh and I still rememberrunning docker run and feeling the hairon the back of my neck stand up thinking"Oh my gosh this just solved the wholeit works on my box problem." Uh and thenof course you end up with lots ofcontainers uh and so I was working atNest in 2015 2016 as a user ofKubernetes uh it was not wise to runproduction workloads on Kubernetes atthat point we did anyway it was kind ofentertaining uh and then I in at GoogleNest was part of Google there's aninternal mobility and so I startedworking on Kubernetes in about 20 andearly 2017um so through the end of 2017 was sortof the disruption phase we felt we hadan opportunity to expand the publicconcept of cloud from just VMs to alsocontainers and in Google cloud our VMsrun on containers so this is really coreto our way of thinking abouteverything uh and we had leadership herewe had we were the world's leadingexperts in container orchestration andwe could show what we had learnedinternally and share that with the worlduh and so that was the open sourceKubernetes collaborate with LinuxFoundation and create the CNCFcontribute Kubernetes to it and createthis thriving ecosystem this was ourvision we never believed it was going tobe this popular but uh maybe some of usdid i it was a a long shot uh and thereis a degree of luck and timing to allthese things so what made Kubernetessuccessful i think six containerorchestration tools were announced onthe same day that Kubernetes was so whyKubernetes uh I'm going to argue thatthere are really three or fourish thingsthree plus one things that made itsuccessful uh number one was that it wasdeclarative instead of saying do thisthen that you declare the desired stateand then decoupled controllers actindependently to bring the system intoalignment with the desired state thesecond is that it is extensible you thecommunity end users vendors can extendthe Kubernetes API that's important notjust so that you can do what you wantbut also so the Kubernetes communitydoesn't have to do everything and that'scritically important and number three isthat it's modular you don't like thescheduler swap it out you don't likesome aspect do something different butyou can still use the rest of Kubernetesand that's really powerful too and thesethree things combined �kind of got thecommunity engaged and this is criticallyimportant there's this kind of uhnetwork effect that comes but this isthe the core Kubernetes loop right itobserve you act you say here's mydesired state you observe and see whatis the the current state and thencompare the desired state to the actualobserved state and do and then act againuh and this just continuously loops overand over and this image is I probably 10years old we've been using this image inslide decks the entire time you'veprobably seen itbefore so this declarative API activereconciliation and then we'll go througheach of the other areas so extensibleCRDs are kind of evolution of a thirdparty resource initially then werejiggered it uh CRDs I think we made uhGA in 2019 but we've been continuouslyimproving CRDs over time uh and they'reactually better than the built-in typesin many ways at this point so uh maybewe just replace all the built-in typeswith CRDs uh but this is reallyimportant so that end users thecommunity you can extend the KubernetesAPI use the same tools on your newresources uh this is super powerful uhand we've even implemented corefunctionality for Kubernetes out of treeusing a uh CRDs like gateway API is agreatexample and the last part is thismodularity so you can plug it in you cancreate a newuler you can have multipleschedulers for different workloads uhand enables this higher levelautomation so that network effect that Iwas talking about is sort of these threegroups interacting together and they allhave their own loops developers arecreating libraries and use cases go backto the developers and then users areposting what they what worked for themand what didn't work for them and theirown complaints and it and they kind ofraise more awareness about this solutionuh and then vendors see gaps and theyprovide solutions and it sort of goesyou know starts to grow and there's thisnetwork effect that grows the entirecommunitythe other really important thing to theKubernetes success was the incrementalevolution we like been very clear aboutwhat it was good for and what it was notgood for from the beginning uh and thathelped us sort of decide what was out ofscope not forever but for right now uhand starting with stateless applicationswas really critically important and thensort of incrementally evolving over timeto support stateful applications uh youknow damon sets and statefulapplications and then batch we're stillworking on batch but batch and uhscheduled workloads so uh deciding whatwe're doing and being really clear aboutwhat we're not doing has been reallyimportanttoo so then we get to 2018 to 2022 thesedates are kind of arbitrary but they'redirectionally accurate and this was theecosystem expansion this was like thesort of the result of that networkeffect uh and there's just an explosionof ecosystem projects and solutions toproblems um it's you knowto comes fromthat time uh OPA uh gatekeeper stufflike Argo and K native and opentelemetry uh and so the you know themore users there are the more valuableit is to create for this ecosystem youget portability across cloud providersto be on prem um this is super usefulfor a developer you get to just havethis distribution channel for yoursolution we've also incrementallyexpanded the workloads that aresupported uh and this one's important tothe overall strategy because uhinitially Spark and Kubernetes didn'twork so well together this was anexample where we we contributed to theopen source the Apache Spark projectitself and Kubernetes and sort of madeKubernetes aware of Spark as a firstclass scheduler uh and this openedKubernetes clusters as a distributionchannel for Spark uh and it was superinteresting to go through that projectas well and that was in 2018this slide i have mixed emotions aboutthis slide like I'm kind of proud of iti'm also kind of feel a little bitguilty because I don't know how a newuser walks up to this landscape andunderstands where to start or how to gothrough it we tried at one point to havea road map thing do you remember that atsome point there was like a and the roadm�ap got really confusing because it washard to tell who was allowed to be onthe road map and where they belonged onthe road map it was really difficult umbut I think we can declare that thispart has been a success there is athriving ecosystem uhand it is important for the next partso until then at this so far we believedthat Kubernetes was following this sortof typical adoption curve right youcrossing the chasm and all that crossthe chasm in about 2018uh and so then you get more stability westopped you know just deprecating APIswilly-nilly we stopped yanking the rugout from under you we made backwardscompatibility a thing um so things gotmore reliable uh and more predictable igot a little bit bored I want to saylike you know predictable is great butalso you know not as interesting as someof the really fun innovation so uh youknow so far sogood and I thought I wrote this slide atthe end of 2020 no uh I don't knowmidway through 2023 no 2022 uh and Ithought what are we going to do in 2023and beyond it's going to be aboutconsolidation that previous slide withall those ecosystem projects we have toprovide opinions not just options uh wegot con feedback from end users that itwas just too confusing too complex uhand I thought it was going to be aboutstability simplicity and beingcomprehensive you need a whole platformyou don't just need one part and a bagof Legos you want the toy uh and sothat's what I thought the future wasgoing to looklike and then there was this plot twistat the end of 2023 right the chat GPTmoment uh and the whole world overnightwas just obsessed with this idea uh andI've seen our capacity projections fromthe end of 2023 and then about 3 monthslater I was like pretty hilarious likeit's pretty easy to ship uh electrons uhit's not that easy to ship atoms soplugging in new capacity is somethingyou probably all have had to deal withthe capacity constraints over the lastyear uh we cloud providers also have todeal with this as well so uh demand hasout uh stripped supply quitesignificantly uh and so we had thiscrazy overlapping stacking adoptioncurves and it's really interestingthere's like these two therethere's enterprises still coming tocloud native to you know learn aboutcontainerization and microservices andthen there's an entirely different planethat's trying to upend the whole thingat the same time and so we have to kindof balance this innovation and stabilitybecause we have both groups of users sowhen we get to this pointuh the consumption is not the same asthe adoption curve uh and that theseoverlapping stacking uh adoption curveslead to some pretty bananas consumptiongrowth uh and anytime you grow like 10xyou generally have to rethink the waythat you did something uh and often atthis point this would be an opening fora new platform to emerge that solves thenew problems and you sort of just moveon right you check out the old thing youmove on and we had serious conversationsabout is this the right platform for thefuture workloads uh and so my old slidegot thrown overboard okay it's not thatanymore or at least it's not onlythat some of the challenges that wediscussed about Kubernetes when westarted we started with the assumptionof this sortof a few thingsuh Kubernetes as we move through theadoption curve we're getting into moremissionritical business criticalapplications things like healthcare andtelcouh the second observation was thisevolution from what what we sold as whatyou perceived to be infinitely elasticcloud right and we didn't even really weweren't even very well regionalizedearly on so it sort of felt like cloudwas just one thing one end point inspace that was I don't know magicallyscalable forever you ever seen thebumper sticker the cloud is just someoneelse's computer it was like our computeractually so we saw a little bit more ofthe details there uh and so what'semerging with these new AI and MLworkloads is that the hardware is superspecialized uh it's it's sparse it'sexpensive you can't just swap them outyou can't scale infinitely the capacityyou want isn't necessarily in the regionyou're �in uh and it's really much morecomplicated than it was uh and edgemakes things a lot more complicated tooespecially in the networking spaceuh and then third this expansion fromthe predictable longunning workloads tothese very dynamic workloads uh andthey're sometimes multicluster or theyhave these specific hardwarerequirements topology really matters uhproximity the networking throughput andall these things and special purposeframeworks that understand thatunderlying hardwareso why did we choose to continue on withKubernetes is part in part based on thisconcept of the path dependence feedbackloop which is essentially the idea thatyou know what got us here informs whatwill get us there uh also accepting thatwhat got us here won't get us there butuh it's a pretty important idea uh andso the early adopters were super risktolerant i was one of those i mentionedthat I when I was working at Nest wewere running production workloads onKubernetes it was probably not wise uhbut we did and it was fun andinteresting and we contributed back umand so that brought in the earlydevelopers and and ecosystem vendorsmany of you are still in the communitytoday uh and it led to that growingcommunityand so now we have a bunch of uh vendorspeople understand Kubernetes they knowhow to work with it they've built theirtools on it uh and so if you want tobuild underneath then anyone who builtthose tools can uh get access to whatyou build underneath and if you build ontop you get access to all those cloudsand environments that it runs on so wecame up with these three stories torally around in inside of Googleuh and this was how to evolve Kubernetesto meet the needs of the next trillioncore hours is like the tagline i likethe sound of it uh and the first one isreliability especially at scale and evenacross upgrade boundaries we've heard alot of feedback that upgrades stillcreate problems uh and we're workingreally hard on that and we're workinghard on it in the open source communitybecause it's important to us that allKubernetes users get the benefits of theinnovation that we're driving on the andthe new versions of Kubernetes itdoesn't help anyone if everyone's stuckon version127 the second was redefining theKubernetes relationship with thehardware and what does that mean itmeans that early on we had CPU and wehad memory and you could have more orless of it but then you just add morenodes right a node is a node is a nodeit's kind of all the same uh and youknow storage they're not that differentfrom each other kind of this fungeiblehardware concept um but these GPUs TPUsthese accelerators are super expensiveandsparse uh hardware's gotten really funagain and interesting even CPUs like thethere's a lot of innovation happeningthere that this elastic fungeible cloudthing didn't really consider in the pastuh and then finally these purpose-builtframeworks for these new workloads uhand it's really important that there'ssome operationalconsistency so that platform teams don'thave to learn a new way to operate anddo chargebacks and security on all thesenew frameworks uh you know we we startedthis way with a little bit of hubris toolike how hard could it be to lock itdown and it turns out it'd be reallyhard uh and so I think it's reallyuseful to some new frameworks uh toleverage what we've built in theKubernetes community we'll skip thefirst one reliability and you knowreliability across upgrades is prettyself-explanatory but this you know whatdoes it mean to redefine the Kubernetesrelationship with hardware uh it meantthat we invested really heavily in whatuh Patrick from Intel and Kevin fromNvidia started with DRRA and try to makeit more Kubernetes native and friendlyand kind of really invest in it um sothat there's some portability acrossaccelerator like even driver versionsrequire sort of finickyuh use pre the the new work that we'vedone in DRA uh so this is reallyimportant uh and then after we intend tokind of make it more Kubernetes nativemake it more portable uh and providewhat we have across cloud providers forthe hardware underneath aswell and the�n this one's reallyimportant too so framework orchestrationagain people run for AI workloads umpeople run slurm for HPC uh andsomething really interesting hashappened in the last year or so uh whichis that we're we're positioning and arethese frameworks are seeing Kubernetesas a distribution channel rather than asa competitor uh and so this is a reallysymbiotic relationship i talked abouthow Kubernetes early on made a decisionnot to be all things to all people butto really draw a line about what was inand what was out uh and provide thevalue where we intend to provide valueuh and then leave you know HPCscheduling if you've ever looked at theslurmuler like it is pretty impressiveand complex and nuanced and it's got acouple decades of experience in it uhthey're going to know about that betterthan we are uh so having concept ofguest schedulers is a is anidea and to make Kubernetes supportthose frameworks as first classcitizens so the goal in all of this isthis consistent operational model acrossthe frameworks that exist today areemerging and growing and the ones thatdon't even exist yet but are coming soonand we think that that's a reallypowerful and valuablemodel so Dan Khan uh the CEO of theLinux or of CNCF and I years ago talkedabout the idea of Kubernetes as the uhlike based on the hourglass model of theinternet uh and this was a diagram thatcirculated I don't know 40 years ago thewhole concept is that like IP is thecenter the narrow waste of this you knowthriving ecosystem on top and differenttechnologies underneath uh and so thisis my terrible reproduction of this uhand you can tell that I'm not a graphicdesigner in my free time but uh you getthe concept right ipv4 IPv6 as thenarrow waist uh browsers mail clients ontop there's all kinds of ways that thatcan travel around down below at thephysical layeruh and so making Kubernetes the narrowwaste the you know the hourglass modelof infrastructure consumption is ourvision and so we want this to be thecase for additional layers on top widenthe top of the funnel at the top andwiden the bottom of the funnel at thebottom um but ensure that there's someconsistency in the middle so that thosecan evolveindependently so what is our strategythen from all of this context our ourstrategy is pretty simple in three partsnumber one to maintain to ensure thatKubernetes continues to thrive extendKubernetes as the de facto standard forinfrastructure orchestration and expandand especially for a IML workloads thatare so important to everybusiness uh make sure that it works forthose frameworks and the new workloadsthat are coming as well now how do youdifferentiate right if we're doing allthis in open source it would be totallyopen and honest our intention is todifferentiate onperformance really and then that givesus business opportunities to do priceperformance like you can offer the sameperformance for a lower price or betterperformance for this like there areoptions there but really it's aboutdifferentiation onperformance how do we do that uh well weown a public cloud we do deepintegration through all the layers ofthe stack with our friends that work onuh GCE where Kubernetes runs on the VMlayers on the networking layers uhTPUs so there are opportunities toaccelerate the hardware uh and the fullintegration through the stack andsometimes it's about scale too so whenwe I talk about asymmetricopport advantages as an example isspanner spanner is an engineering marvelit's another team we love it it'samazing uh it gives us the opportunityto use that in place of CD to support65,000 nodes on GKE um that we feel likethe standard remains the same but theimplementation is slightly different andit gives us the opportunity to expand uhand that provides uh somedifferentiation that's sustainableand differentiation that's notsustainable is just friction foradoption so we really don't want todifferentiate in toogranularly uh and finally stickinessthis one has a you know seniorexecutives at Google even have worriedat times how do you build a stickyproduct on an open source project whoseentire point is to beslippery well it turns out that all ofyou and all of those ecosystem projectsas folks use Argo and open telemetry andall these other they actually just tendto stay because it's working and theyfind solutions for the problems theyhave so it actually is stickier thanother products that we have at Googleso that continuous loop that we talkedabout before we apply that not just inKubernetes technically but also in ouruh operation of GKE so this is not anacademic exercise for us it's reallyimportant that our team work on opensource Kubernetes and they are veryclosely tied to GKE like we are solvingactual people's real world problems thatis super important uh it's the these areyou talk about scheduling acrossmultiple frameworks this is like youcould nerd snipe anyone on my team uhand they just go down a rabbit hole foryears uh that's super dangerous likeit's it's really distracting uh and it'sreally important that we stay pragmaticso we launch something in GKE we observehow it works in the fleet we see realworld problems we change it in upstreamopen source Kubernetes we fix it foreveryone we fix it for ourselves at thesame time and that loop sort ofcontinues over and over and over that iscritical to ourapproach and so when we talk aboutevolving Kubernetes we're talking aboutevery single layer and we go throughphases sometimes with workloads uh orwith frameworks where the first iterfirst phase is succeeding in spite ofKubernetes it doesn't really work butyou stretch it and you kind of jam it inthere and you do some nasty things onthe side uh and it kind of works sort ofbut our goal is so that it works becauseof Kubernetes we want native support forthose things uh and it requires from thebottom in the hardware layer we'retalking about DRA and uh accelerators uhthe top multicluster use cases thingslike Q and multiQ um it goes the fullstack and there are no immovablemountains uh so if you have a challengeand we're not meeting it please reachout let's work together uh and thenthere are all these things that kind ofstretch across all of those layers onthe other sidethis is not again just a slide this wasjust day before yesterday there was amaintainer summit here uh and this wasone of the unconference events that wasproposed that John proposed uh and we'retalking about evolving Kubernetes for aIML across these layers uh you can onlysee the top but you can go to the GitHubissue and see the rest of it uh and wetalked about this with the communityspecifically for this case so this isact this is real work going on right nowwe've gotten a good start and there's aways togo i think maybe I put this slide in thewrong place but this is thedifferentiation slide uh I feel likethis is uh a perfectly fine place todifferentiate and to[Applause]compete so again why Kubernetes um it'sdeclarative extensible and modular thatmakes it super well positioned to evolveto meet the needs of the next round ofworkloads um we've done that we've donethat with Spark we've done that withstateless workload or stateful workloadsum creating that path dependencefeedback loop where there's now networkeffects and there's a lot invested in itthere's a lot of value and so we have awindow of opportunity we really dobelieve this is a window of opportunityto evolve so that we don't get pushedout of the way in favor of somethingelseuh we're well on our way we'reaccelerating we're speeding up we'reactually investing a lot more inKubernetes than we were a year or twoago and we think you should too andwe're stoked to see what we can buildtogether thank you so much for takingthe time today i really appreciate itenjoy the rest of theshow um I don't know if there's time forquestions but I'll stay around here ifanyone has any questions uh this wassome goofy AI generated uh content i'llcall your attention to the interglassxnules in the upper left there you haveto pay attention to the interglass xnules apparentlyso thanks so much[Applause]2025-04-15 22:00:22.321140 ��{jYI8'��o^N=, � � � � � � � � u d S& B 1 ��yhWF5%�7��n^M=, � � � � � � � � t c R AH 1 ! � � � � � � � { j Y I 8 ' {��q`O?/ ��xgWF5%��~m]M�<+ ��sbQ@/ ��Y��wfUE4#��j#MgKNa---WQg�#BMpGWL4sGvY�#79J5K6dxSlI�#5woqh5aRlqw�#9TU601q-aBE�#DBBW5Yrc0Zs�#Qx_6ItsOrQs�#U9qwxp7Uv08{#G24upbAXVd8z#OqLpKJwKZlkx#Tz8IcMSY7jw�#TELnK0PrKHU�#T-nN86wTebM^#SsTUGO9YbnQv#SqKqB-q_m8Ex#SdLLOcNZN5E�#RwcC44BWDvA�#Rw4c7lmdyFs�#RgzyEc8pPa8Y#Rf9NceXXRuw�#RdT6P5x_fDMg#RGLy_JtGD9U"#REkSMbRrBU4V#R0255efML-I#QzStkLbA7QkL#QzE6vSgcyT0#QmUVhzdlMIIa#QhTlZs4m59w�#QbR908kgk1YO#QLHQP8-RVwE#Q9m7eGoBaMA&#Q40yLLLIW9Q$#Q2ct5OXQ8fUM#Q15XbASxHM0�#PvKCzFaP3gM8#Pmba7R4_4oUh#PgCaIyeRn6Y#PcRORHC1NYY:#PJ8qgKEwDyM#OnqzoBf7dUE�#OTzd9eTtLRA�#ONuxsPWXNUU>#OLrN7D84o4gR#OJ1WoQjYAJoI#OAb54JRIS6MQ#O1EJnC0pjZIg#Nq_PgPKZHsc#NnYtnUeJi7U;#NkOV4_JV4t4(#NRL-bSYVi7Q##NCkHrvqFMl8�#Mbk6FY_9FKMl#MHfDvUUJ14I0#M56SHzETAmM�#LrL5AcS2d5gr#Lkpjq4nybdE�#LEiFzJnqU-E%#L13y_-zLin4h#KyoxaAtHi-c#KtW4HkonQHU�#KlsxQMfdKLw1#Kcjh0-hXwWw�#KS_rGWazTio�#KQBz7nwWxUE�#KPNuLwXNkNQ9#KK0FKiQ7nis�#KIRUbaUjEKw#K3edF36HWYU#JtMYdR50-KU�#JqRXqk-1CLoJ#JqKwvN8MaSUq#JqG1wey7-Ao#Je6GIoagHvU/#JXcQcofGzrA�#JFS0lSfHtMI?#J93U9n_qxSI)#IvIgsHS5MDk�#IoEe05sPqhk#IcYwKgAMXuE-#IWrd-pSojqgK#IGK7TZPuma4 #I9t7qfOjgboL#I9GV4N23dvE�#I236sjooftw�#HrO5KVMQfHs�#Hdrf5QosFTwN#Hc0jj-654lA�#HV3Nb_wUro4c#HEhnch8Wpj8�#GyvARSG3_ws#GvIPSgt69Sg�#GgxRHpQIEfgG#G8U141NkrDI�#FqUPqroF-RwD#Fnb1a5Kaxgo`#Fb_3dWJdY9I�#FEy2lhe6CM8F#FC5TAGsBbRQb#Es3DBj2UgIE�#EeegJKZ_4g0#Eb9AweCazi8�#Ea5OuNjpi9M#EXtCejkOJB0�#ERztKTd-ckAl#EOQ8qNstD8I�#EBbuyn72jtw'#Du0mPGFd7Fc�#DmfZq70WOxI=#Dc6S4vU9GiM�#DWq8UWmcRQg#DVFQ20OrEFkR#DGE1P5ynmsQ#D8r5j_R9QsA�#D7vwFFeEn00)#D21yF0E-v2s�#CzdX5qDgQ2Uo#Cydz0hadVuQ�#Cx5c-IueP78=#CvGbwn5ZrFg�#Cn8xvysLWVgy#CQ3Wxg4qNaQC#CK7Il4ZiqTA�#CI4rws1H-aM�#BlzHv9KV1Z4#BkQRGsVBhkc(#BdkB0eERa5A#BSoEY_tpxIot#BBqDpqATcI0#B7lpXPZPFoI3#Ae2-LNHtUr8 #Ab7mRoJYsMon#AYxjk8ZZcloB#AHY4IDlBhzEt#A1HGYh0Wz9U#9q9oTJUqQoQ.#9mX9PvNNDjk5#9lPp-6nJ8bI�#9U3WMez9q74�#95NNuV-SUdg�#8ta_zFiUG1s�#8Q8sFzODEUo:#85MDID9Ju04#81SMpKgJb3k #8-ovtwX2l7k1#7sr1eHJBXKs�#7h70Olo5UzkI#7U6nAxUxG6ci#7KCBigZi_Rk�#7IA-Vw1K7eg�#7GQRyAxPa9g#7-JtDLNT0c8�#6usWUdJMyHY�#6l5zCt5QsdY�#6hWoA4jEk5M�#6MrXcbcxnN4"#6KdywJWnYygE#6GjLzWtqjlwU#6CXsWNOqYSw%#66nDIYlyvsY#5ZWbS01wCMkj#5CyAZBUH1f82#4pBhVVrCHyM�#4YVSW8UuHac#4Pei4LMigQE*#4OdYWliYpPgf#4FgccXDdzYA #40OmDwTgl1A�#3s8EdlTi9bkA#3oWODC2mdk0N#3aUg2qxfoZU3#3KLsfEyNKrYe#3Gm5QNXcp2gB#317rLOIKfDQ�#2zjxSKAkT9E;#2r92tTuFYg8K#2fkWLe3OqQg@#2_ECK6v_yXc�#26p_qvuCy-s�#20eoMgq5lbY`#2-fSMpCSYnw[#1rtyQaTfbdMT#1rgZPi2dTvE�#1jPvEAhkklg#1iWD14xvBQA�#1c2va5nATmQs#1UHZT_v0rtsS#1FwE0ajODU8-#1B9WZ6H0cn4�#0qNOZpdW870�#0p-sZT0LWOgk#0adVcinYGC8\#0GNjonLfCQA�#0FrmlwV9D0Y#07RnkzSc6Jg�#02dSHShBVuk#-yOKr2DOJ_o##-k1CdrRAGMM#-fGztPUuD8k�#-SFVDr3wQ_w}#-3NyXaVPGvo� ��n^N=->� =��ziYI8_( � �? � � � �� r. a P�^ @ 0O �� <��o�~�m�\��LN<La+� ��-�� qM�wgW;��G�*7&�� , � � � � � � � s+ c� S B� 3� " �`r ��n� �� p� � � P � |] k� Z J 9 ( �� #tppmEJQ1t-U�#XgoGyTNheqE�#keInQWSYWzg#rOylXCxix1I~#ulzjbGIYkJg}#hbV-C2YzaIc|#zLHdgl2qxbgy#c73SzCKx-OYw#nHGzMmstR0Eu#d9K5PSsHtDgs#aiC7C56pE7Ip#wb8K3RV6Sbwo#jid0uSnNku8m#t8p7S-46SWIk#nESkFf0j_7ci#gjEuIUCbNYYe#bOhaJV3_7X4d#zXIMJeJrnvIc#noliQiyacGob#f6gYxJOr0yQ_#xE3iMfib2LA^#njNXlZNT3dw]#sTbJ1-x3_ycZ#lrA6gOpLWMwX#lBOdQHNNgEUW#X_xHC_Q5jGET#didMaFDxRAcQ#syV-QGZDmWUP#lQFUarM_GXoO#iiI91sUMtdgH#lefjb4Vnd8kG#fZ_ULsJ5WGAD#l3mSvnpLGZY?#rRExAhVI1nU<#dT-ShZYM3SI9#cs68TjSAlTg7#jmLuKkr4ndo6#lNovu7Kclhk4#ziQRTuDCtuM0#oo1wqb9_Whc,#YMyrcqZ2sbU+#j7QfkNU8XM8*#_r7blpGA1Fw'#v_PzG81D33I$#beAoZ2fI-QQ!#fok2apYcVdE#vpMHv-56gsk#iPd-IQfbVLA#k7Wcd9HdXAY#gi3hZAFI0qs#okapmNodLB0#ohS-ibtvQWw#wO1bWs_LD8w#etCmLttqJsQ#qXEvqZ_cY0o #g--50XLcqRw#lVWUCUt6ZM8 #dUfp3j1j-mg#gvp2uTilwrY#YqIHESG0suI#zLpUJBU6sT4#uQ_WN1kuDo0�#tnSraS9JqZ8�#yeg-uoBYCO0�#cLJRh4y4vXg�#yQQU8vDhj0o�#r59IfCSmUBQ�#XqL5lh32lr8�#vCfehltPKxk�#nTmwmd4fcGI�#mqXZ2T-jWuU�#jiT7kGqcpR4�#x6EKTCAWtn8�#u-eUO3rIQV4�#ea2CKLX5vEs�#bQvrutQO3-c�#thCZDKZ1cAM�#hnmtjCkO8FE�#mtqUtbMaSDw�#njT5r3JjIaA�#mrKx1M0Idbg�#zZ7bDPZMCqY�#zTLbnstVjHc�#yCyezOTVU_Y.#y21i3lG2jUM�#y0JgZ-hQ-Bo�#xwGDxqI_3Nk6#xGywrHPAMms�#x8wEo6ZDT1g�#x5qguW0SF_IM#weOIwt1lbvI�#w1wh9dc6m34�#vrG5tBDsdd0�#urRefZ0KnU4+#ufY_JFPpzRI�#tSBfDzStoYE#sLFmnCyZ89M�#rdTPbm9f_fc�#rbVV8WIJYwwz#rVz-vIFGT4k�# rP1I6Cegej4#rAIcQvKBuZA�#rACTrbTnFqY#qj9q_-S91L8#qH5djJlbodYV#q44WBAGzKhk�#q3uBctLa_Sg#pvTRjsSXMi0U#poBOYc_EkpA�#pWBbX6pOyPg#pPKuJg_6A3kY#p52nxvo6hXkd#oLZ2EjjKibw2#oCbJdcy3zzA�#nvKpg3JgSjs[#nclJn1KEjis�#nXdGXdxmWNQ@#mynQyP2_17E#mewXGSwDCE4]#m8ZnlZTo1OEu#lj_qgsb4h38J#lQEYxCXVkVU�#lIYXVIPsk_U{#lFaSEevdZvU#lEXm6k2wpG4�#kyLdmGYZ6BQm#ksKOPx99rIE!#kYT7KV_Cijs�#kQ4X6-mPHqw #jtGSzIvw9jIH#joOTwCatd9g�#jikiO3CC7Zw�#jUChVGvSB5g#j0AqGpC_pp4#iCAFXF5ECto�#hjbZOBghxYU�#h1AyaAIf3HAZ#gycxQT3DHIU�#gWgagjHtnlEn#gMDC1zzHabk8#gGP9QdlNr9Y�#g8rtqqNTL9Q>#g7KIuv7KipE#fznzH-gf9h8_#fnt3f8sWJLAa#emjrmJZR-ZIW#eO8szEGNwooA#d_9JNRkT7dg&#dVM20108SRcC#dNb1m84Bp4c�#dDkXFuy45EA�#d2szUE0jhX4 #c0dEL_bBRVU�#bpsclYlGl2s�#bIxw1uK0QRQP#bFKls7IvzNEq#bC4xbBJs0CA4#anqWhSnN7sA�#ahANKkTT-yo�#aYGGnDDGX-Qj#aWxuaEFSarU�#_xoDbpm-Qks�#_rDE1PD5Z5I/#_pgOuaYwvBQ�#_oIoaW5i-xEX#_47X1eKkiEs#_3fpZA-DqDU�#Zr7y27HpII4�#ZmcZlDCYDgE\#Zfp94fOMcwE�#Zbi46yTlSVow#Z_15EyXOnhU�#ZOG1J1Niuh0<#ZManfhV6DZU~#ZIk_EqI8rVA,#YvXCcSjXKEQp#XtA-NKoJDaIv#XelZnqurT2s7#WuMyfaF0UeM5#WeWQqQM6kjM�#WaDSASWA2z4f#W_EF1HnP4tU#W5C0O7vk78o�#W3f5Ks0j2Q8�#VU_vj2r3BgIE#VLVSa6xD5tk#VCmp--NcxeE|#VAWw5CujiR8S#V7HbBO_umOUF#UfYctUtDDfQ#UVPe-rdxK7wr that and we'll getintothat Right So let's get intoit Ohum let's get into ituh how do you avoid 50 mediocreKubernetesimplementations And actually the answeris quite simple Uh you just make one andthen force everybody to use itUm and uh that's usually uh that issomething that can happen when you adopta platform engineering mindset Uh butwhen we wanted to get into this uh wethought about this problemUm some of you might know XKCD and uhsome of you might know this specific oneand if you don't then now is your chanceto laughUm we wanted to avoid a situation wherewe just created one extra platform andthen try to get people to use it butthen that just competes with uheverybody else Um so with that in mindwe started to think about how do we makea good platform that people actuallywant to adopt Uh do you do you want toYes Yes I can do that for uh tounderstand how we can build a platformuh that people want to use We also needto understand our users and we havethese 237 product teams and they areautonomous Uh they do whatever they wantin order to get the job done that theyhave been assigned to do And when I sayautonomous it's it's really autonomousIf they decide that their applicationworks best in some old VM running on alike a small small host somewhere in abasement that's okay They are empoweredto do that because they as a productteam knows what's best for theirapplication They they take the decisionsthemselvesright That's uh you know some might saythat's a bad idea but there's flavorsbetween that and completely managedplatforms and they are absolutely freeto choose So that means that if we builda platform we also have to compete withour potential users and how they usuallydo things And we have an extremely highuser variety not just in terms of theirKubernetes acknowledgement uh uh sorryKubernetes knowledge uh and and skilllevel but also in terms of how much theywant to get into the nitty-grittydetails how much they want to open upthe engine room and see what's runninginside As engineers certainly I cansympathize with that like I I haveempathy towards that I want to do thatmyself and that's why I got intoplatform engineering So we want to caterto as many of these users as we possiblycan On top of that we havethese 14 uh existing platforms or wehave a set of existing platforms We havethis many people inevitably uh someonegets gets together and says oh you'redoing that you're also doing that I'mdoing that maybe we should buildsomething together that fits ourspecific needs So we already haveplatforms running that does somewhat ofcontainers and we have differentplatforms in different departmentswithin the digital organization thatsort of does the same already So how dowe tackle that Should we handle that Ontop of that we have some strategic goalsUh the company uh has a data center thatwe are migrating away from and there areproduct teams that that reside insidethis data center that run on one ofthese platforms that will bedecommissioned as part of this thatdon't have an exit strategy So we alsohave to account for actually providingsome kind of business value to thesepeople that don't have any otherchoice Uh so we set out to uh to build anew platform We called it the Novouscontainer platform Some of you uh whoare wellversed in the Latin languagerecognize that novous means new So weset out to build the new containerplatform Yes Uh we based that on all thegood ideas that we read about There'splenty of good books plenty of goodtalks plenty of good resources We wantedlike a seamless experience so we couldhave uh multiple uh users switch betweenwhether they need to run on premise orthey needed to run in our differentfactory sites factory edges if they needto run in different clouds differentregions in the cloud everything likethat Uh we found out that we can only dothis if we uh if we are highlyopinionated That means that we decidedto enforce gitups So all of our users onthe platform they have to run theirapplications through git and definetheir their configurations in git Thatmeans no cube controlapply Someone was really unhappy aboutthat Um we had several users who used touh have an ash devops pipeline uh thatended up with a templating a helm shotand then cubectl apply directly Wellthat was also the step of shagging intothe servers where the communitiescluster was running and then runningcubectlapply Maybe not the ideal approachNevertheless we wanted to uh to avoidthese kind of situations That's alsosuper great for resiliency right Ifeverything is stored in Git somewherethat means that if a data center inFrankfurt overheats then we can justspin up a new cluster somewhere else andimmediately switch over theworkloads And then of course we are Legoso we built everything with Lego bricksAnd you might have heard uh howplatforms are built on different Legobricks you connect together uh like wehave a lot of GitOps we have uh you knowour own secret management solution builton top to support our users for that sothey don't have to consider that to makeit simple and easy for them like it'sstill a Kubernetes experience but it'scloser to a Kubernetes like experienceand we try to abstract as much as we aswe can fromthem So we've built this platform Uh wehave to replace an existingplatform What do we donow That is where I comeinSo we've built this perfect new platformthat will make Kubernetes perfect foreverybody in the Lego group Um I mean wecan look at the clusters and we can seethat they work exactly as we expect Butthere's no users on it Souh what do wedo Well we have users that already useKubernetes users that are under timepressure to start to migrate tosomething new because we have this datacenter exit this uh people that need a areplacement for their hosting strategyalready And uh again we're under quite abit of time pressure So at this point weestablish a migration task force andthis migration task force it includespeople from the platform team Itincludes people from the applicationteams that need to migrate and itincludes uh consultants as well And thismigration task force what we did was weactually just went out into the teams Wejoined their dailies and we workedtogether with application engineers onjust migrating their applications Uh andwhat we found was that the differentteams that needed to uh to go throughthis process had quite different needsUm that's probably not a surprise toanybody I hope in this uh audience Umbut we still took the learnings fromthis to inform some spoiler alert somestuff that comes later in the talk Um weidentified three archetypes of usersthat we had to migrate We had the usersthat were reluctant to try andrearchitect their applications or uh tryand change up their delivery pipeline orCI/CD process that they were just undera lot of time pressure and they neededto uh migrate to a new platformASAP We call those the lift usersbecause we just wanted to lift themdirectly into the Novos platform the newperfect platformAnd what we see today with these usersis that they still require a lot ofsupport uh because they haven't reallychanged their mindset either Uh they wehave a lot of back and forth togetherwith those teams uh and continue tosupport them I think that also has to dowith the personality types of thedifferent teams Absolutelyuh going forward in the in the processUh but it's at least the the result uhof this process The second type of userwe saw were users that were excitedabout trying out this new platform andthey wanted to experiment withrearchitecting uh their uh theirapplications and uh reworking theirCI/CD pipeline to the best of ourrecommendationUm these were uh our some of ourfavorite users They're all our favoriteusers Um but but these were uh usersthat run really well today and actuallydon't need a lot of interference from usThey don't have a lot of questionsbecause they are the type of user thatgoes out and and just explores whatevernew feature that we releaseAndlastly we have to uh keep in mind thatwe always need to deliver value for thecompany in the end So there were someusers that didn't really fit theplatform but still needed to runsomewhere otherwise the factory wouldstand still Uh for example the peoplethat provide software for the moldingmachines would fit into uh one of theseone of these categories And uh if themolding machines don't run then we don'tget our salary and then everything sucksUm in in any case these guys needed uhsome custom features built within theplatform and we had to do that We callit uh some of it is technical debt todayUh but again we had to move quickly andso we ended up doing this and uh now wehave migrated some users We've gottenrid of one of these mediocre platformsand now we're in a new situation becausehow do we get more users Yes So uh wesuccessfully took this one platformreplaced it with another and if westarted with a question how do we avoid50 meteor implementations If we had 50already add one remove one that's still50 right So we need to take uh let'scall it market share from the otherinternal platforms and the otherinternal implementations whether that'sa platform or just a product team thathas several different implementations ofKubernetes for their differentapplications Um and how do we how do wedo that How do we get out to these usersand communicate the kind of value thatwe want We took inspiration from uh whatpeople always do to us when we're acompany and looking for someone to fixour problems Uh we took inspiration fromsales How do we how do we go out andsell our platform How do we actuallyimplement that Uh so we wanted to adopta sales mindset Uh so it boils down toeveryone who's done any kind of sales uhwould would recognize that you need toidentify whatever it is that you'reyou're solving uh and then communicatethat you can solve that properly Andthen you somehow need to uh to convincethe people that this is your platform touse it on Uh and we had differentapproaches to this Some are directlytoward cater towards getting new usersto getting the word out but al so so getgenerate it's called new leads but alsoto to follow up on people and and ensurethat they understand exactly what theplatform is and what the platform can dofor you There's also a lot about userretention and how we make sure that ourusers are happy and want to stay withour platform because remember they'refree to choose whatever they want Ifthey don't like our platform even ifthey migrate to it or switch to it ordeploy application for it there's notnothing not one single thing thatprevents them from going and say like Iwant to host my own kubernetes I want tospin up a VM somewhere uh an instance inthe cloud or you know bring my ownserver There's nothing that preventsthem from doing that So we need tocontinuously keep up to date andunderstand what are their current painpoints and how does that evolve So wetook some different initiatives and hadsome learnings from them So first we uhwe did workshops uh specifically wefocused on onboarding workshops Uh thepurpose of these was to showcase whatcan you do with a platform but also howquickly can you get up and running Weinvested a lot of time in making surethat these workshops were were great anduserfriendly and easy to go to Uh so youcan actually get up and running insidethe platform from code and no knowledgeof Kubernetes whatsoever to a productionready setup in less than an hour And Ithink that's impressive That also meansthat if you're unsure if this is thissomething for me can this do somethingfor me What's the effort required for meYou can always try it out 1 hour is isnot a long time of commitment for oneengineer If you have a team of five sixseven engineers you have plenty of timeIt's very simple low low commitmentAdditionally uh these workshops startedout as guided uh online workshops orinerson workshops and uh that doesn'treally scale super well So we convertedthe templates So we also allowself-hosted workshops uh we call them Soyou can request a workshop environmentand you get the workshop guide and youcan actually go through that Uh ofcourse you don't have an engineer nextto you to ask questions about uh ifthere's something you don't understandbut you ask questions in uh in oursupport channel which Paul will get onto uh later Uh and we're quite quicklyat responding to that and that alsomeans we continuous improve our workshopbased on the feedback that we get allthe time So highly recommend doing somekind of workshops to communicate thevalue to your engineersquickly The second thing we did was uhyou know participate more in events uhif we're a large company uh we haveinternal uh developer conferences wehave some product fairs actually go outand try to speak about this platform toa room full of engineers that that canuse it I know most of you don't work atLEGO so you're not the target audiencefor us today Um but go out andcommunicate this kind of value and thatreally shows people that there issomething that might be fitting for themMost people didn't really know aboutthis because we were just one team in anorgan uh in a in a department and thereare several other departments within uhwithin the leg digital organizationSo we need to reach them somehow rightIt also means that people come to uswith questions and ask I'm doing thismaybe I could use your platform Is thisright for me Or how can I fit into yourplatform So that's one way of of drivingsome kind of engagement right The thirdthing is to take the time to connectwith your users So a sales mindset uhmeans that you want to sell something Uhbut we've all experienced or I thinkmost of us have experienced thatsalespeople sometimes overpromise andunderddeliver We can't really afford to dothat We need to treat them like ourcolleagues because they are ourcolleagues So we exist in this middlegroundbetween sales and and you know we'rejust another team right We're justengineers engineer to engineer talkingbut connecting with them on a morepersonal level trying to understandtheir struggles and feel thefrustrations they have with either theircurrent setup or the platform that weare we are using to host theirapplications makes us more aware ofwhat's going on and what we can do toimprove the platform as wellYesSo now we have identified a couple ofstrategies to engage new users Uh but wealso something uh saw something elsethat was quite interesting We think umthe users that we had already gotten hadgotten into the habit of uh speakingdirectly to their favorite platformengineerSo if Thomas had helped a team migrateto uh our new platform they would uh thepeople from that team would communicatedirectly with Thomas every time theyneeded to help with something And that'swe think because they felt comfortablewith Thomas They felt like they had acommon uh language together And we sawthat as both a good thing but also atroublesome thing So for example it wasa good thing that the applicationengineers felt comfortable with theplatform engineer Uh but if Thomas justdecided to leave the company or uh workin a different team uh we wouldn't wantto let the application engineers uh outto drySo what we uh wanted to do is we wantedto enable the entire team to fulfill therole of a friendly platform engineerthat has helped you personally And thatis why we set up a support rotation Andin theory it's very simple You just haveyour support channel which is in ourcase a team's channel where people canjust ask any question that they have Itdoesn't even have to be about KubernetesUh it can be about anything and we willtry and help and we always have twodedicated platform engineers to answerthese questions In the beginning it waskind of hard to drive uh our applicationusers to use that channel Uh but afterwe improved response times on on uh onchannel requests in there uh we got pemore and more people to move over thereand I am sorry to say to any applicationengineers out there but sometimes Iwould even uh hold off on responding topersonal messages just to drive moreengagement towards the support channelUm the other thing that we got out ofthat was thatuh application engineers actuallystarted to help each other because thissupport channel is completely open andpublic and everybody can see what's inthere and some of them knew more thanothers and they would actually start torespond to to the requests that came inthereThe other thing that we did was uh weset up a rent an engineer program Uhrent an engineer is actually a bit of amisnomer because they don't pay us anymoneyUh it's basically take an engineer for aweek or two uh from the platform teamand help us do something Let's say thatuh an application team wants to help setup some kind of monitoring for for theirapplication they can go and and requestuh help from an a platform engineer andthey will actually we will sit in theirteam and work with them as a normal teammember for up to two weeksUsually this uh really again drivesconnectivity between uh colleagues andwe think that it it really helps usbuild good bonds with our usersThe last thing um some of you mightthink about this uh how in the world doyou platform engineers have time for allof this stuff Because why aren't you inmeetings deciding very importantplatform stuff Um well I'll tell you whyUh we have a system where we pair up andevery pair is empowered to make veryradical decisions about our platform Umthat means that we get rid of a lot ofmeetings where we don't have to sit andlisten to whatever new idea Thomas hasfor uh implementing a new feature in theplatform Thomas can just make thatdecision and implement that featurefreeing up all of the time for us We areactually down to only a normal daily anda Monday decision sync and then theoccasional retrospective as fixedmeetings Uh the rest of the time we haveto talk to users and talk touh talk to each other We we can makedecisions as well So with all of this inmind what we've learned from this iswhat we believe are two keytakeaways Um yes so uh the thesetakeaways are so the plat typicalplatform engineering uh you take yourbricks and you build your platform withthe bricks and you can switch out bricksWe we believe these are the bricks thatmake our team our team the bricks thatmake us Uh the first one is uh sell yourplatform We really believe that takingthe initiative to try to convince yourusers that your platform is a good ideato take on the sales mindset but stillkeeping in mind the empathy you shouldhave towards your colleagues andcombining the two is a crucial aspect ofhaving a successful platform that drivesadoption towardsit Yes And the other key takeaway thatwe have is to keep your users closeUsers don't need they don't only need tobe heard they also need to feel heardIt's very important that users feelcomfortable with sharing feedback Uh andyou do that by creating a casual uhconnection with your users uh so thatthey feel like they can talk to you andthey they feel that you understand whatthey're telling youUm we really think that uh this is oneof the greatest strength of our teameven though we also think we're reallyreally good at Kubernetes Uh it'sactually the uh the support that ourusers keep telling us about The reasonthat they choose us is because of oursupport Uh and it's what helps them sellus to other users in the LEGO groupAnd we've kind of tried to distill thatinto investing in support decreases theneed for it because we think that uh theearlier people talk to you about theirproblem the easier it is for to for youto fix it uh early on And with that uhwe would like to say thank youAnd we we do have time for somequestions uh if anybody has any And ifnobody has any then I will ask aquestion to ThomasYes Down there I I think there's amicrophone in the middle thereCan I start Yeah Yes Thanks for the talkUm I'm interested in uh hearing how manyusers do you have and how large is yourteam So our team is uh about 14 platformengineers Uh and I think currently wehave uh about50ish user teams Uh I think I thinkthat's right Yeah my lead engineer isnodding uh otherwise we can look it upon in GrafanaYeah So my question is like what's thescope your platform offer Do you justoffer infrastructure Do you offerframework UI backend standard like kind of curious aboutthat Thanks Yes So uh the platform uh sofor the purpose of this it I believe itcan be any platform but the specificplatform the novice platform is uhhosting containers and containerizedapplications only uh so we don't givethem uh a kubernetes cluster we givethem a curated isolated essentially it'sa name space uh with some other thingson top like uh their own secret storeand integrations to that uh API enablingaccess to that um so there's a lot ofneat tools around this but essentiallyempowering users to run their workloadsinside Kubernetes while abstracting asmuch of Kubernetes away as we can fromthemYes go ahead Okay great Thanks for thetalk Interesting And you mentioned atsome point that you cannot say exactlyhow many Kubernetes clusters you haveWas that because uh what you just saidit's not individual clusters per team orbecause it's still not known how manyteams would adopt it So there are twoanswers to that question Uh so first ofall we we don't really know the scope ofhow many teams will be feasibly able toadopt this platform because not everyteam needs containers They might decidethat their application does not run incontainers or does not run feasiblyinside a a Kubernetes environment Um soso that's for the scope part for thecluster part uh it's because teams canjust you know in their own cloudaccounts they can just go and create howmany however many Kubernetes clustersthey want and we don't have access tothat we don't have any control over thatso so we don't know um I think thelatest estimate I heard was about 200300 clusters maybe uh in that ballpark ubut can't really say for sure and that'salso one of the challenges with this andand doing the entire uh adoption and andsales mindset is also something thatbrings us closer to understanding thatBut because of the complete autonomythat the product teams has it will be anever a never- ending task to figure outwhat is the the desired scope of our ofour platformUm you mentioned you have teams on theum factory floors um who haveapplications they run How did you manageto um include them into your platformSo the uh the teams that run in thefactory they were in actually a lot ofthem were already running uh inKubernetes cluster clusters uh onpremiseUm they already had a bunch of uhKubernetes resources ready to go Uh wejust needed to figure out how to uhapply them to our our new platform Umwas that the question Uh that did didthat answer your question A little bitmore specific would be niceOkay Um we run uh open shift clusters onvSphere in in every uh on every site Umand we already had Kubernetes clustersthere So the users had access to deploystuff in Kubernetes on the factory edgeUm so we had to uh we had to migratethem to new clusters there Uh perfect Ithink also for for what we had uh beforeuh if you want to convince users in inin a factory to run Kubernetes workloadsit might be a good idea to to considerwhy would they need to run that So wehave factories that run all the time uhand that means that if an applicationfails it might be a little bit nice tohave that redundancy that Kubernetesoffers Uh so that was really how weconvinced people in the first place thatthey should run containers andKubernetesYeah I think we have time for maybe onelast question YesOkay Thanks for the talk Um I'm veryinterested in the in the type of valueyou provide to your users So if they canchoose anything they want then uh whatyou offer is a closed and integratedplatform it seems Yes Uh but do you alsooffer other benefits like uh compliancesignoffs or these types of uh valueExactly I think that's a good questionUm it's a big selling point for us thatif you uh if you run on our platformthen you can always you can alreadycheck off this uh entire list of uhcompliance points that you have to uhthat you have to adhere to Um it's uhwhat our users uh really like uh and andit's uh well our users really like itand it's one of the reasons that theylike to onboard onto uhNovas Does that answer the question Uhyeah thanks So we had sort of the sameexperience and at some point we offereda few unmanaged cloud projects and umwhen it was January 1st the people cameback because they had to fill out allthe forms and yeah they didn't even knowwhere to startYeah Yeah Thanks All right Thank youeveryone for showing up It's great Greathaving you2025-04-15 22:00:23.081086s we're going to go through thatelephant in the room revenue um whichused to be a bad name in open sourcewe're going to go through creating acharter you know some metrics makingsure we know what we're doing um thatuni user journey so helping peopleselect technologies and then we're goingto share some examples um so that youknow how this is used in real lifeso I'm going to talk about the elephantin the room and I started talking aboutthis at events um I think last yeararound this time that businesses are inorganizations they need to generaterevenue now they may contribute back toopen source and be big open-sourcefans so to speak um a for exampleAmanda's organization does a lot ofcontributions back to open source but atthe end of the day organizations need togenerate revenue So they need togenerate revenue to pay for the productmanagers who are figuring out theroadmap and what you're going to do withopen source and with your products orwith your sales and marketing teams whatyou're going to paying your engineerswho are working likely on both opensource projects and on products and itit's kind of a team paycheck concept andwe need to accept and embrace the factthat even though we're working in opensource at some point we've got to sellsomething so that we can contri continueto contribute back to the community aswell as contributing to theorganization's bottomlineso all right so you get into yourorganization and you know what do youwhat do you do first well you need tohave the ability to communicate withothers exactly what you're working onand how that ties with the company'sgoals so how do you start this so thebest way to start is to create a charterto line up everybody and make sureeverybody is is moving in the samedirection and that you're easily able umto go back and explain your relevancewithin the organizationit also serves as a framework and givesyou the ability to say no which isimportant because sometimes when you'rein open source teams um the ability tosay no to 100% revenue generatingactivities is something that you andyour team really want the ability todo so to start this off you know reallythat first step um is to talk with a lotof people so what I typically do when Igo in with a new team is I set up15minute interviews with as many peopleas possible so those are people likeindividual contributors all the way upto the CEO depending on what kind ofcompany um that you're asked so I askedthem four simplequestions um and I like to do this livebecause it gives you the opportunity forpeople to know who you are it uplevelsyourself and lets everybody know thatyou're serious about the job that you'regoing to do so I asked them do you haveexperience working with open source ordeveloperteams what's your view on how things arecurrently working this gives you a goodidea of what people think is importantand what they don't think isimportant it's also a a good way forpeople to tell you what they don't thinkis working which is usually what comesout in this question also what's goingto define success foryou this is a nice question to askespecially when you're talking all theway up your management chain so that youknow what you're going to be graded onand what they expect from you um andthen what are the expectations of theprogram itself um which will give youlike just a really good idea from a lotof different people about what peopleare expecting to give you an idea of howmany interviews that I've done before umthe last time I did this I reached over30 people and just talk to themeverybody's willing to give you 15minutes so take advantage of thatopportunity so once you have that youhave time to really sit down and andthink and create that team charterso once you I like to create the teamcharter and then send that out to theteam and ask them to please trash ittell me what I got wrong tell me whatyou disagree with that gets a lot ofengagement going amongst your team andbrings everybody together focused on thesame goals now whether this go down thepath of you use the wrong verb whichI've spent days discussing which verbsgo where or if it goes down the path ofhey I I really don't think we should dothis but we we should really do thisinside that charter when I'm creating itthis is an easy way to tie in andbalance those those goals out so if youhave an open source engineering teamthis is a great way to pull out exactlythat we're going to go learn from thecommunity and innovate with thecommunity and bring that back intoproduct road mapaps and so then theproduct managers um can go ahead andcreate that roadmap from that so you'rereally going to reach on why does thisteam exist you know and that paragraphcomes in handy a lot especially when youhave a lot of new people or turnover atyour company you just repetitively copyand paste that out you're also going toput in some strategic objectives i liketo stick with three anything over threeyou're probably going to fail and whilefailing is good it's good to set up yourteam for success right out of the gateso stick to three keep it simple whatdoes success look like and you know howare you going to reach that this isn'tmetrics which we'll get into this isthose higher level goals so that youknow that you're doing a good job andthat you're meeting the needs that youwant and how you're going to doit so once you have this all togetherand you send this up and get thisapproved you have a a charter thateverybody's agreed on now of courseyou're going to have to report out whodoesn't love a good monthlyreport so the first time I worked onOKRs was with this guy named CraigMcClucky and I did not like OKRs i didnot want to sit in that room one bit andwhat I found was is if you go to yourcharter and you take those exactobjectives that you've already gottenapproved up your management chain andthose are just your objectives withinyour OKRs there's no conversationthere's no back and forth you've alreadyagreed to it and that's what you'regoing to do and then you can do your keyresults so your specific things you haveto do within that quarter i like totrack these within a spreadsheet so thatI easily have access to a lot of dataand also I just use color codes redyellow green so in month one let's sayyou didn't do anything towards somethingit's red you know that the second monthyou need to start there and concentratethere and your teams can pick that uppretty easy and they know what to do umI also do a percentage complete um so ifyou know if you've did 50% on on onething that's cool but you did zero onanother again it's easy to know so whenyou go to report this out um I juststick to three trends that you'vealready found in the data not all thedata your two to three top trends noneed to do any more than that if peoplehave questions they're going to comethrough and ask you questions then youhave all of the data in one spot whereyou can drill down kim how have you usedthis before well I use it since I am inmarketing and supporting the salesorganization i use it for our MQLs whichare our leads and I have a lot in myspreadsheet similar to what Amandaspreadsheets are there's a lot of datathat tell gives me indicators of if myMQLs are up or down and what maybe isleading to thatawesome so now that we have that downKim's going to talk to us more aboutthat user journey what that looks likeyep so now that Amanda has told me whatthe charter is we're gonna we're goingto figure out how we are going to betalking to people and I will take hercharter and I will make sure that Iunderstand like what is the real goalwhat are we trying to get out of it thatmarketing advocacy and communitycan drive results for so it could be forawareness we could be just trying to getawareness of the company or ourtechnologies maybe the people in ourorganization it could be that one of ourone of our goals is to get feedback backto product management or it could bethat we're really working to collaboratemaybe with our internal teams as well aswith the community so which of these aregoing to help best support the charterthat Amanda has put together so once Ihave that we're going to go uh we'regoing to start going on what I call auser journey and this j journey her e onthis map I'll go into details on thenext slide with how we will do each oneof these this is a typical sales journeybut it can be used for an open sourceproject we want to get we want awarenessof the project and we want to advocatesfor the project it could be for aproduct it could be for the company umit really applies across however it isthat we want to talk to the users in ourorganization and encourage them to dosome sort of activity hopefully to helpwith yourOKRs so um awareness uh that it's thebasic one it's the first one that wewant to make sure that we're focused onthat we're getting awareness for whatwe're doing and what our technologiesare and the typical tools that we'regoing to use for this the usual suspectsthe website we're going to go to eventswe're going to write some content dosome social advertising etc but this isalso where our Devril our dev advocatesand our community people have a big rolein helping gain helping drive awarenessof what we're trying todo consideration is when um we'regetting people when we're getting theirhands on the technology whether they'regoing out and downloading an open sourceproject or whether they are going to doa product trial or a demo or a proof ofconcept and some of the tools that I'veused to help drive that consideration isit's going to be website as wellproperly place buttons on your websiteif you're trying to get them to try aproduct but I'm going to take thosenames that I've gathered in my awarenessphase and I'm going to maybe send themsome emails a few emails we don't send alot of emails when we're working withdeveloper communities and in open sourcebut we send emails that have highv valuecontent such as the technology proofthis is how the technology works here'show it compares to some of the otheroptions here is how some otherorganizations are doing this solvingissues real issues they have with ourtechnologies and then the decisionmaking uh this iswhere marketing and advocacy if we'retrying to help somebody with thedecision to buy we're going to beworking closely with the sales team anda lot of times case studies andtestimonials come in with decision-making on using a technology even ifit's an open source project just as CNCFtalks about many of their case studiesthat helps people learn more about thetechnologies and what may work in theirenvironment making the decision to putthat in their productionstack then with retention it's alwayseasier to keep people than get themthere in the first place so this isdoing regular interactions you're goingto see the support um and uh the productteam they have primarily thisresponsibility but marketing advocacyand community also do just by talking topeople about how they're using thetechnology are they happy with it maybeyou have put together some user groupsmaybe have referral programs for exampleand then the advocates the you knowthese are our dream people we this iswho we want in our open sourcecommunities or if you have um alsoadvocate programs for your productsthese are the ones who are talking aboutour technology without us asking them tothey're writing those articles and soyou can put together if you want toreward them and grow that you can puttogether an advocate program or evenoffer a reward type programbut before we do that we also need toknow who we're talking to so Amandadidn't give that to me with her chartershe just told me to go do this so I'mgoing to figure out who I'm talking towho are the people that are going to usethe open source project or thetechnology whether it's a developerwhether it's a user it could be theinfluencer it could be um the technicaldecision maker which often times I'mfinding in my role is the operator orthe business decision maker and I Icreate I don't even know when I createdthis chart but I'm sharing this just asan example i go into a lot of detailwhen I'm talking about who the personasare you know what are they a user whatwhy do I care about them um what do theycare about what's their goal so I'mtrying to get into the mind of each ofthese users so that when I'm going toproduce content or I'm going to talk tothem at an event I'm going to try to getthat conversation and that content to besomething that they're going to careabout and that they findvaluable so I know who they are and Iknow how I'm going to go get to them umI need to make sure that I know what mytechnologies I what they are what theysolve how it makes it easier for eachone of thosepersonas why would they be looking forsomething new why do we why do we thinkthat they don't like the currentsolution they have and what's at stakeif they don't solve the problem and Iwill write this down in in a Google docand share it with the team and so thatwe can all together collaborate and makesure that we understand why somebodycares about ourtechnologies and then when we put it alltogetherum I I put these in the in the threephases that most of us in this room themarketing and advocacy and communitypeople are are focused on but theawareness um we're going to be targetingum all of our decision makers with withthe web contents and everybody who'sinvolved coming back to the people thatraised their hands everybody is involvedin the awareness phase from the supportteam um to the engineers talking aboutit knowing how to talk about ourproducts but it's mostly a marketingcommunity and devril function and we'relike dragging everybody else along withus to help with this umphase during the consideration it it'smarketing community dev sales steps inwhere we're helping people understandthe benefit of using it getting theirhands on the technology and then once weget to the decision makers we're reallytrying to use those case studies andtestimonials to help them understand andprove to them that this could reallywork in theirenvironment and I think with that Amandawe have a couple examples that we wantto share with you sure so what does thislook like so with these examples you maythink that we're describing your companyi promise you we're not directlydescribing your company but I think alot of us can see us in how how thesework so our first example here is thatrestructuring who hasn't been through arestructure lately super fun so this isa large vendor nobody raised their handnobody raised their hand nobody's gonethrough a restructuring recently okayjust you and I just a large vendor withmultiple open source projects as well asfour uh for sale products um and theopen source teams were distributed so wehad a lot of open-source engineers whodirectly reported into product folkswhich meant that a lot of the goals thatthey were around were very productspecific and really had nothing to dowith um open source engagement or whatthey were doing um so we had teammembers who were completely isolated umand who really just didn't they weren'tsticky at all like who you know theyjust went to work and worked bythemselves and went home so the solutionwas pulling everybody together into agroup because we needed to solvecohesiveness we had to put someprocesses in place and we definitelyneeded um some more open- source metricsso um really getting together on thatcharter making sure everybody was goingin the same direction knowledge sharingback and forth between the differentcommunities and how best to go aboutengagement within each project um makingsure that the general manager approvedyou know what that looked like andmaking sure people were executing anddelivering what was needed so theresults were that the team really weremore sticky they wanted to work on theteam because now all of a sudden theyhad goals that were relevant they couldsee themselves in the company goals andhow that all rolled up um processes thatput in place was you know if there's aproduct and through tech ops and as theydig through and they find out that theissue is actually in the open sourceproject you know team members were inplace then to solve customer is issuesand that resulted in a 90% um customerresolution for there um and improvedemployee satisfactionokay the next example is um a company anorganization bringing in their firstmarketing or advocacy hire and this willbe typically you're going to see this insmaller organizations um they they'vehad maybe marketing help before maybesomebody on staff or maybe they havecontracted that out uh but they theyrealized that they weren't just theywere not getting the results that theyhad hoped for they were looking for aspecific growth trajectory and andthrough that they needed leads and theyneeded to grow the community they knewthey needed community growth for theopen source project and they neededleads for the revenue side of thebusiness so they put so they went outand they hired a head of marketing tocome in and put together the strategyand the messaging and then to execute onthat strategy and one of the first stepswas to research what had they done inthe past what works what didn't work aswell as setting up tracking without datahow do we ever know if we're if what weare doing is working at all so they putthose things together they startedexecuting on the plan um and with theplan that they executed it was focusedon awareness and technology proof whichis typically what a first marketing hireis going to focus on um just throughsome of those the efforts the SEO the PRthe additional content the increased umpresence on social media thisorganization saw their web metrics growthree times in terms of how many pagespeople were visiting on their website aswell as how long they were staying theygrew the the names in their database uhby twice as many from going monthtomonthand twice as many product trials so uhthey had a good month Amanda and ourthird example and this is completelymade up is an established organizationum and they're very large cloudnativevendor they they contribute to opensource projects they've created theirown they've donated some to a foundationand they also use open-source technologywithin their forale products um thecurrent situation was really that it wasa planning cycle was coming up so that'swhen you get budget and headcount andthey realized that they were just goingthrough the motions everybody was justshowing up for work every day and thengoing home every day there was noinnovation happening at all so thesolution was to go back and questioneverything and to think differently andto go back to that company goals andrealize you know what thread they playedum within open source but reallydelivering on the company goalsum and really brainstorming and justholding meetings you know like you knowwhat anything is possible right likewhat should we actually be doing andthey needed to solve for being relevantSo every year on those planning cyclesyou know it's you need to make sureyou're still relevant with what's comingup like what the company is actuallygoing to deliver the next the next yearso in some of those things reallydoubling down on what pieces of contentpeople were working on so in the past itwas a lot of blogs people were in frontof their computers and people werereading fast forward to today people arewatching videos and guess what attentionspans are under two minutes now watchingthings that's a completely differentapproach um to what you're creating alsopeople are going to conferences againlook at the look at this conferenceright here it is packed so doubling downon CFPs and making sure that key peoplewere giving talks giving back to thecommunity also spreading awareness ofthe company and like what they wereworking on so um as they did this makingsure they also connected back tomarketing that user journey Kim talkedabout before and seeing you know wherecontent could weave in to help deliverwhat needed to be delivered at thatlevel and that's that's what they didthey continued to be relevant and theirteam kept their jobsyay they kept their job and that is umthat's all we have for you today wespecifically left time for somequestions i believe there's a microphonein the middle of the room and this QRcode um is to leave session feedbackcncf would like session feedback so feelfree to do that but do we have anyquestionsnoquestions zeroquestionsamazing actually I can't even see so Iknow I can't yeah okay wellum you all get five minutes free thankyou2025-04-15 22:00:23.886757 ��`#��eAmrKx1M0Idbgmorning everyone Uh welcome to be hereIt's my honor to uh share Kub Kubagegraduation journey creating a diverseand collaborative open-source communityfrom scratch and I hope to use thisopportunity to explore with all of youhow to build a diverse and multi- vendorcommunityUh before we start uh let me brieflyintroduce Hong and uh myselfHello everybody This is HongangCurrently I'm the Kuber Edge TSM memberSo I'm also working with my team tooversee Kuber Edge community and theproject Thank youOkay My name is and I work at Huaweicloud and I'm also the maintainer ofKubage communityUh first we still want to share theexciting news about Kubage graduation Uhin October last year Kubage has becomethe CNCF graduated project Since itsinception in 2018 Kubage has grown froma small project into a mature communityUh thanks to the collaboration of manyvendors and partners I I believeeveryone here today must be interestedin how to build a diverse community howto apply it in different industries andhow to maintain the sustainable andhealthy community And next time nexttime I will we will introduce the kubeskubage work into uh two p p p p p p p pp p p p p p p p p p p p p p p p p p p pp p p p p p p p p p p p perspectives Oneis tech technology development and andthe communitygovernance Okay Uh first let me let mestart uh introduce the Kubage communityKubage is the uh industry's first cloudnative edge computing uh framework andit is designed uh for edge cloudcollaboration Uh we aim to provide aconsistent consistent experience ofapplications resources data and devicescollaboration between cloud and edge Fornow we have more than uh 8,000 stars and2,200 forks on GitHub with more ��L�_#��OALkpjq4nybdEwelcome to our presentation you know wecan see absolutely nothing so um if youall raise your hands we can't see thatbut I am Kim McMahon i am the head ofmarketing withCedaro and uh they're really loud overthere so I'm head of marketing withCedaro i have been working withopen-source communities for ever since Iworked with Amanda in 2017 i have workeddone marketing for open sourcefoundations and worked for organizationsthat had both open source projects andforprofit products amanda hi I'm Amandai work over at um NetApp Instaclusterwithin open source data infrastructureand I am stoked to be here today to giveback to the community um I was alongwith the Kubernetes journey for over adecade now and it is time to share backwhat I've learnedso who do we have in the room today igot to kind of come up a little bit dowe have any engineers out there oh I seesome hands there we go how about productmanagementall right how about anybody in supportone kind of one person it's all the restare supporting something what aboutcommunity folks nicesales marketingwe got one person over here and anythingelse that we're missingwhatexecutive got it all right well welcometoday so it looks like we have a mix ofpeople in the room today that work oneither open source projects or mostlikely with infor-profit products so wehave people who are working directlywith revenue with that tie we've gotproduct managers who help build aroadmap we've got a lot of engineers whohelp bring that um to market then wehave people who support um users andusing products um within community alongwith those re revenue generating folksdirectly in sales and marketing so wehave this in interlocking um circles uphere showing that everybody has to worktogether in order to be successfulin this presentation we're going to talkabout you know connecting thoseopen-source activities um with corporategoal than1,500 contributors uh from more than 100organizationsworldwide And this is an overview ofKubage's journey Kubage uh is waslaunched as a open source in 2018 and uhuh it was donated to CNCF in 2019 as asandbox project Uh in 2020 Kubage hasbeen used in uh a large scale Chinahighway electronic tour connectionsystem This system based on kubagemanage more than100,000 edge nodes across the multi-provinces of China and in 2020 uh wealso became the cf incubation project Innext several years we have launchedseveral sub projects uh for example sedaand edge mesh Satina is used is used toenhance the AI workload between cloudand edge and edge mesh uh support thecommun communication between edge nodesacross differentnetworks and uh in 2021 we the firstcloud native vehicle with Kubage hasbeen uh has has been in production Anduh in the in the same year uh the worldfirst uh cloud native satellite based onkubage has been launched launched tospace Satinites are not uh satellite aredifferent from traditional edge devicesbecause uh satellite move rapidly and uhhas name connectivityuh to ground station So we based on thekubage system uh we deployed the AImodel on the satellite and uh andprocess the data on the satellite andsend back the uh meaning for result tothe groundstation and in 2022 we we also la wealso released a large scale testingreport Uh in this report we supportKubage support more than 100,000 edgenodes and uh uh 1 million edge ports onon a single clusterUh if you are interested in the Najisscale you can uh you can attend the uhKubage Najis scale testing and impleimplement implementation session intomorrowafternoon and in uh2023 uh we have uh in in security wehave achieved the uh salesside levelthree and we also uh launched the Kooparubber is it is a robotics solutionbased onkubage Uh in terms of the technicalplanning we we are not limited todevelop the kage in in just a singlearea but also uh committed to build amulti- domainmulti- scenarios edge computing platformOn the lost border side we we hope wecan support as many as applications suchas AI IoT MEC robotics Survey releasedseveral uh six and working groups uhincluding SIG AI SIG device IoT and uhMEC working groups And on on sourceboardside we we hope we can support more uhhardware support moreoperation operating systems and uhsupport more devices with various uhprotocols and we also provide uh uhhardware compatibility testing andcertificationAnd also the ka edge itself is arecontinually enhanced in several area forexample the age cloud scheduling the agecloud uh orchestration and ageruntime Okay Uh as mentioned earlier wehope kubage can support a wide range ofuh scenarios We actively studyindustrial trends and plan ahead forkubage's application For example in 2021we have released set seda the industriesfirst age cloud collaborative AI subproject Many age devices has equippedwith AI cap capabilities and uh thedevelopment of age AI is pro progressingrapidly Howeversorry However it ai faces manychallenges such as uh limited datasample and fragmentations and uh highcost associated with the uh modelupdates and maintenance maintainers withset with Sedna We provide cross agecloud data set and model management andwe also uh support many training andinference frameworks such as jointinference incremental learning Uh forexample in joint uh in joint inferencewe can uh process the data at age and wesend the difficult cases to the cloudfor accurate inference and uh in uhincremental learning uh we will send theuh inference result to the cloud and uhandand uh update the model and then deploydeployed the new model to the edge inreal time We also also uh support manymainstream AI from frameworks and uhprovides enhanced uh interfaces fordevelopers to quickly integrate setthird partyalgorithm Okay Uh currently Kubage hasbeen widely applied across industriesHere is N the uh kage uh use case listUh we have uh applied kage to uh to manyvarious section se sectors includingtransportation uh energy and finance uhfor example at KubeCon Europe 2021 wehave shared the use of kubage and sedain satinizes applications and last yearuh in kubong China we also collaboratedwith neo to demonstrate the applicationsof kubage in uh smartdriving Uh if you are interested you canuh youcan more kubage use cases in theofficial website or just communicatewith the uh in kubagebooth Okay that's all about the kubagetechnology development and I will handit over to my co-sp speakerThanks Okay So next I will give someintroduction about how we grind with ourproject So here I can share someexperience about the communitygovernance as well as operationalactivities So first let look atthe project graduation phase in CNCF Somany maybe some of you know the threestages about the project in CNF Thefirst one is thesandbox The second is incubating So thenext phase isgraduation So from today's keynote Chrisintroduced that over 200 projects are inare managed by CNF So the data is rightnow only about more than 20 projects gotgraduated So that means the posing andthe ratio for the graduation is reallyhard So how can we cross the chasm togrow our project from scratch to grow tobe sandbox incubating and finallygraduation So here I can share some dataabout our project operation statisticsSo right now Kooper Edge already coveredmore than three thou300,000 people and the community coversmore thansixk community members with contributorabout1.5 contributor key contributor morethan 100 organizations and the starabout 8k stars with two 2ks So thisnumber indicates the project achieved amilestone So that's why we can begraduated So next how we did that toachieve this data and you know push thisproject to be graduated So next I willshare some a few best practices and thenwhat we did what we didSo first of all if your your projectwant to begraduated the key thing is not just forthe technology I think if you grow youractive grow your project to be granuatedI think the technical maturity is okayand many many user many many developersare using that But the key criteria forthe TUC to evaluate your project iswhether your project is open andtransparent governance whether you havea clear structure So here we put a lotof effortsto make our governance to be open andtransparent So for the left we can seethis is our community governance modelIn total we have seven members in TSCTSM technical steering committee This isthe highest and decision making team tohandle and oversee all kinds of thedecisionactivities the Kense So under TSC wehave a few groups we have sub teams tomanage the project release We have teamsto govern the security process and howwe can advocate our developers and usersIn addition we have a few SIG specialinterest group covering nodes device IoTnetworking scannability security robotsAI and testing So here I want toemphasize beside the technical communitywe pay more and more attention toecosystem especially for the end userYeah Because your project got graduatedThe next key challenge is how you canencourage more and more developers moreand more end users to adopt yourtechnology So then next because Kubernetis a first cloud and edge collaborationplatform it covers a lot of scenarios Sohere it's very important to foster adiverse partnership andecosystem which can cover all thecommunities all the universities and allthe industry companies Here we list afew key partners in our community Itcovers some industry companies some areyou know technical startup companies andsome are big companies In addition wehave more and more universities becausemany professor many student areevaluating our technologies and the nextwe have some research institution Sohere we hope to foster verydiversity very you know differentdimensions for the community and thepartnershipSo the third thing is we will put moreefforts to cover some engineeringverification with someinteroperability Since this is a cloudand edge platform many many devices inaddition to the data center will use ourtechnologies So how we can help you getthis project started very quickly itcovers some hardware some boxes and someverifications So like we introduced justnow Cooper Edge is the first CN safeproject who passes the SOS S3 It's asecurity verification It's veryimportant right and we passed quite afew certifications from organizationsThat means these organizations willpromote our technology and our projectto different users In addition wecovered a few hardware compatibilitytesting Uh so if you have some box orIoT or AI hardware so you can you knowwe can work together to help you passour capability testingSo the third thing is like I like like Isaid just now after we graduated the keything is to promote to our customer totry to you know have more users adoptour you technology So there's a customeroutreach and thepromotions On the left side you can seewe actively participated in manyconferences including Kubrican and otheryou know industry mainstream conferencesWe are very lucky to be the kyno speechfor two continuous Kubrican China So onecase is to cloud native cloud edge onthe satellite and one case is cloud edgecollaboration between the new energy carIn addition we have quite a few openmeetup development meetup workshopsummit even for the academic workshopSonext once you get graduated you need tocollaborate very closely with differentorganizations So here we listed a fewkey organizations we are working with SoI'm not sure whether some of you areaware about the Ninux Foundationmentorship So this is a very goodprogram So you can extend your coverageto you know worldwide potentialdevelopers So for Kuber Edge weparticipated in the Linux Foundationmentorship program for more than fiveyears and we provided more than aboutyou know 30 topics and cover not youconnected many developers worldwide tojoin our community to join ourdevelopment in China We work with someyou know mainmainland very famous organizations toyou know host some meetup or we can youknow join someactivities Last year we get the bestopen source project award for the wholeChina This is a good recognition fromtheorganizations Okay Last but not theleast developer is very important So wewill continuously support our developersWe will continuously to host manymeetups celebrations or and somedevelopment sharing So here we put a lotof efforts to find some potential orearly adopter customers We invited themto give some speech give some sharing toour potential customers So here we lista few of some celebration for lastyear's Kubrigraduation and we host many meetups manydevelopersummit for the development it's for surethat we have a very robust verycomprehensive procedures to help youknow developers to how we can getstarted how we canjoin join the community So we have thedeveloper guide to help you get startedone by one We have the community rolesWe define many roles from the memberapprover maintainer or owner So you stepby step you can you know involve veryclosely with our community Also we havesome contributor license with a normalcommunication mechanism We have aweekend meeting So we have some projectmeeting We have some TSA meeting So thisensure that we have enough communicationbetween TSA member between ourconnectivity with the developers enduserscustomers and this is our website Somany people will go to our website toget the latest updates right latestdownloads So in short l graduation is isa nonp process So Chris in today'stomorrow's meeting ko meeting Chrisintroduced over 200 projects under CSFmaybe about only 20 projects Its historyis more than seven years So kra edgehistory has sevenyears from scratch to be graduated It'sa n process but in short we can saygrandduration is a milestone but it'snot the end instead it's a new startwith more challenges we have anotherwork todo okay last our slogan is make a cloudnativeubiquittors so unlike a kubernetes andother data center technologies ourproject is to extend our cloud nativetechnologies is to to the end side tothe edge side So our slogan is makecloud native technologiesubiquittors right So actually today wehave a booth So you can go to our youknow P4 team to gain more details rightto know more and to we can discuss howwe can join veryclosely Okay thank you Thanks for yourtimeSo any questions2025-04-15 22:00:24.938299d uh I'm the signalreviewer and the uh scheduling subproject co maintainerall right uh today we are going to tofirstly give you high level overview onthe scheduulers what's their history andthe relation with Kubernetes defaultscheduleuler and next going a deep diveon the features to help you betterchoose which may best fits yourworkloads and lastly give you someguidance and best practice on benchmarksome particular feature of thescheduleuler so firstly I think thequestion may be asked is that why thereare so many scheduulers or what'smissing in the cubeuler to serve likebatch workloads so in my eyes there arethree pieces that's maybe missing in thedefault scheduleuler as well as thenative kubernetes one things for a longtime cube set is design originallydesigned for service service workloadsthat means each pod is treated as anindividual unit to do the schedulingcycle that has a good size which is canbe consistent to accommodate a partwhole life cycle and to be have the uhunified integration with othercomponents and all the APIs are imposedin the pod level but the downside isthat we didn't give the job levelschedulingconstraint more attention which causesome decision that make for single partmay not be nec necessarily optimal for awhole job for example GA scheduling thesecond piece is that the nativeKubernetes doesn't offer a good notionto maximize your clusters capacities itonly provides a static notion calledresource coder and that is associatedwith the name space but in themodernized organization they may want uhlike a sharable coder and uh maybe evenhas hierarchicaluh in that to to manage those kind ofcoder the third thing is the internalimpation of the scheduleuler is a onesetup just like the Excel London thehallways one que so uh but in terms ofsome advanced feature that is batchworkloads needs it's better to designthem in a multiq uh design so that somefeatures can be more design implementedin a more natural way so this is a threepiece I think cube scheduleuler ismissing so a glimpse at the historycubeuler was raised along with the cube1.0 O at 202015 and then a experimental sub projectof six scheduling called cube batch wasraised in 2017 it was originaldesign its initial intent is put it as aside project so that it can involve thebatch related capabilityuh at at its own pace and the finallymerge with scriptul but due to somereasons it didn't go as planned becauseby that time the focus of six schedulingwas more focused on expose thescheduleuler extensibility as aframework which is cross schedulingframework so that there's an the projectvolcano cames intoplace you may be uh noticed that thefirst version of volcano scheduler wasforked but what was forked from the cubecube batch and volcano was later donatedto CNCF and unicorn uh later in thatyear 2019 was raised it leverage a lotof design concept of uh the yarnscheduleuler and also is hosting theApachefoundation in2022 KQ was raised to resolve that is uhin terms of the job scheduling sidethere's the missing missing piece in thewhole native Kubernetes ecosystem andit's uh sub project it's a sub projectof uh cscheduling I think before we dive intothe features we should get a high levelunderstanding about the workflow whenusingthisulers for default scheduleuler maybeyou are already familiar is that yousubmit a job or job like workloads theworkloads controller will be responsibleto fan out the job to pass and the partget scheduled by the defaultscheduleulerFor unicorn is basically the same wayit's just you have to associateparticular uni unicorn specified labeland annotation to describe thecharacteristic of the job or applicationso that unicorn can optimize for thatjob so that is unicorn's designphilosophyfor KQ is different because in the inthe picture you can see that theworkload controller is not allowed toimmediately create path it has to waitfor KQ to admit the job so that is thekey thing KO do is that according to itsQ setup its own API design once the jobsin cues it will uh do some checks fairfairness check quarter check uh someaccessible checks and then when thatcheck has all the checks has passed itwill ungate the job so that the jobcontroller can create the path and thepath gets scheduled by defaultscheduleulerfor unicorn basic the same pattern as KQcombined with default scheduler uh but alittle difference is volcano has its ownAPI called the VC job as well as partgroup anduh those API carries volcano its own APIcharacteristics so that it has to behonored by its own scheduleuler sousually the volcano controller andvolcano unit sorry the volcano arecombined together to beused if we put all theuler into thespeech sorry spectrum and uh I think usethe job admission as a boundary can besthelp you understand what we where theysitso cube schedule difference is in thepause schedule side as well as unicornfor KQ is more doing the jobs admissionand doing the job level uh managementthe queueingthing for volcano they lands on bothside controller and scheduler and theyhave to be used alltogether so this is basically theoverview next we will uh dive deep intothefeatures so as I said uh defaultschedule has only one jumbo queue whichit doesn't fit the uh batch workersrequirements so all the all these threescheduulers adopt the same pattern isthat it has multiq and the multiq aredivided into a treellike structureso the user can only submit the job tothe lift node here in a treel likestructure among the features comparisonI would sayuh I would recommend use theAPI as a distinguisher to look at theirdesignsfor example I think KQ has the mostKubernetes native API design for you touse them out of box and also it has theAPI directory for you to uh controlwhich user can submit job to which queueand also associate with the Kubernetesterm like our back like uh namespace sobasically I think that's it API can beused out of box in a self-service wayother tool may have some shortcomingslike for uh volcano it doesn't itdoesn't give a notion to control whichusers can submit job to which queuealthough it's use a uh QC design forvolcano sorry so forunicom is embedded every configurationinto a jumbo config map which is notthat kubernetes native because if youhave only one single configuration thatmeans It's not easy to use that formulti multi-tenency thing you have to uhyou have to uh introduce another layerto do for self-service for multi-tenantsso based on themulti- hierarchical multiqes each queuecan be associated with kota and againthese three scheduulers adopt the samepattern to offer a minmax cer managementwhich can also be uh interpreted asguaranteed coder versus best efforts inthis case suppose organization N hasfour CPU and can divide statically orcan be dynamically says okay team Xmaybe can burst their usage to all fourCPUs and then if we go down that is allthe CPU might be consumed by team X2 butfor the cases that is the best the bestefforts tenants use all the all theresources it can be reclaimed by theothers if other tenantsuh guaranteed resource is not satisfiedthis so will yield thepreeemption so for the cube basedfeatures uh I would say they are mostlythe same in terms of functionality it'sjust that the APIs are designed a littledifferently two things to call out hereis in terms of preemption unicorn onlysupports the interq preemption thatmeans one workloads can only preemptsthe workloads in another queue insteadof the same queue and also KQ preemptsusing the job as a unit but the othertwo can preempts using pod as a minimumunitnext we are going to talk about GANscheduling because it's a prettyessential features in terms of batchscheduling basically it means you wantto schedule all the parts together ornothing let's to look at unicorn firstso unicorn doesn't involve job man jobCRD so it basically wait for the pods tobe created in this case four PS here andthey uh they carry the unicorn specificgun scheduling annotation or labels andthen once theminq interesting here is that inunicorn's design it create the samenumber of placeholder paths and uh tryto schedule the placeholder path if theycan be scheduled the placeholder partswill be deleted and then replaced withthe real parts so this design you cansee that you will um increase at leastthe API request because it's almosttriple the API request the placeholderyou have to create them and delete themalso that can be cause some uh riskcondition because this is just indifferent stepsreplace for volcano is scheduled passthe dry run in the in memory in a inmemory manner so once a Q sorry once ajob in Q's once it's passed some basicchecks like quarter check it will be fout to pass and the pass will do thejoin run in memory once they can bescheduled all together it will bescheduled that is basically how volcanodoes for KQuh it doesn't involve part scheduling itleverage other scheduulers partscheduling capabilities so once the jobsinQ Once it's passed some basic quartercheck it will just create the path letthe parts be to be created and be triedto be scheduled sodefinitely in somecases it's possible that not all theparts can be scheduled together so uh ithas a feature called wait for pass readyto be a last resort to be based on thetime out to delete a job and the retryso this uh mechanisms is not that idealthat is why you see in some communitylike Kflow they use KQ plus a particularcode scheduling plug-in to to retrievebetter accuracy of the gunscheduling so next part is how theschedulers cooperate with the ecosystemlike the first part is computingframework definitely we are not able toanalyze it every computing frameworkhere but I'm going to to try tosummarize a common pattern suppose wehave a new computing framework called fuusually comes up with a fd and then hasa f operator which is responsible tointerpret the CR objects into path uh soby default it doesn't rely on anyparticular scheduleuler so defaultscheduleuler can go schedule them sousually the the batch workloads has acoordinator part and some workerpart for volcano and theunicorn the integration they usually dois that they change the source code ofthe full operator to let the fulloperator create the path or particularuh the schedule specific APIcarrying the required information forexample for for for volo it creates apart group to carrying the correspondinginformation for K for unicorn is becauseit doesn't have the job API job API soit just impost the Q information etc tothe P level and then hand over to theircorresponding scheduleuler to schedulethem you can see that it's a little uhinvasive to the full operator becauseevery time you have to change yoursource code once a new SC a newcomputing framework comes into place butFor KQ is a little different is thatit's leverage a generic suspensionpattern which basically tell whichbasically is a roundtrip communicationokay the first one is tell that is okaysuspend first I will handle theadmission of the of the job and thenonce I think it's ready to be scheduledI will let you know and then hand overto you to schedule the part to createthe part for the downstream scheduleulerto to schedule them so this is the basethe biggest difference with each other Iwould say uh the suspension mechanism isa key concept in KQ it can be easilyadapt to some new framework in a ssustainable way because you don't haveto change the at least don't need tochange source code too much uh in the inthe computer framework in themarket the next thing is think aboutcluster also scala CA is much much moreimportant in the the modernized organimodernized cloud infra setup C of likecluster utilization cluster cost sofirst I think we should understand howCA simulate the unscheduled pausescheduling so the fact is that CA usethe default scheduulers SDK to do the uhscheduleuler simulation in other wordswhether the scheduleuler you choose arecompatible with the cube scheduleulerSDK really matters for example like takeDRA as an example which is introducing132 and by the time I I prepared theslice uh unicorn and volcano doesn'tsupport that yet but for KQ for KQbecause it doesn't it's not a Pscheduleuler it's leveraged on defaultscheduleuler so basically it supportsDIA the other thing is that look at CAin another early interwining way uh inmost of cases we are just passivepassively waiting for the unschedulsorry unscheduled part to be picked upby the CA but if we think one stepfurther whether whether or not we caninterwe how to interact with the clusterscala to be reserve some resources oncethe scheduleuler sees some unscheduledpart instead of wait for CA to interveneso that is some API called provisionrequest uh raised by the SIG autoscalinggroup so that is only supported by KQ sofaruh there are some other areas we shouldthink I think for multicluster multiq kqsupport that in a native way for volcanoit leverage a commada to support themulticluster there are definitely someother considerations because of the timelimit I cannot uh go through all of themlike extensibility support debugabilityetcetc so next I will handle over to meokay next next part let me compare theperformance of the simple feature let'ssee how let's see how the differencebetween the features before we start Iwant to highlight it the default settingthat's different between theschedule and the greater effic resourcethe default schedule use the API QBSwhich is 50 and the uni code default QBSis 1,00 the definitely not apple toapple you will see the result definitelylow so in all testing we try to best tomake a list of the uh QPS uh the simbutton the schedule and the controlleras well as some other setting and showsall thought the next thing is that wehave made some innovation in the way wecollect the metrics and do performancecomparison the traditionally way is usepermissuh matrix but if you use permissuh matrix in a skill testing for examplescale scheduling uh 10,000 portal thereare very high ch chance of the maggingof arrow the match yourself under thehold of approximator estimator when uhwhen you're scheduling over a shortperiod of the time all innovation hereis to use APS server audit log uhbasically the APS server audit log eachcan event exactly when uh when a pod isscheduled and the dateetc uh uh or innovation is to collectall the event and use a simple passer tocheckuh when a pod are created data and thena great gate then we plot them on thesame graphana dashboard uh so you cansee that we start then at the same datapoint and then over time how the keyevent of port scheduling and portdeletion happen so those two thing Iwant to highlight before we uh diving tothe performanceresult next is the performanceenvironment we use Kubernetes uh1.32 for send scheduling plus uhcontroller combination for example KQ weuse uh 0.10 version we also we are usingupstreamuh cost scheduling plug-in and applylocal patch fix bug we also found somebug to volcano and uh the volcano teamprovide a fix thanks for shenan forhelping to fix them since volcano has abutton uh schedule and the controller weuse same version here we have ensure theQBS the same for all controller and uhtheschedules for the benchmark testing weuse coke to simulate a node whichbasically is the most powerfulbenchmarking to in uh on themarket at the highlight we use two catecategory with gun scheduling disabledand GA scheduling enable we try toschedule 10,000 pod in total thevariable here is number of job and thenumber of p per job one one extreme isthat is only one job with 10,000 portthe other extreme is that we submit10,000 job but it's each board have onlyoneports we didn't have very the number ofuh Q because in all local testing wedidn't see asignificant performance difference bythe very uh in the number of Q so wesimply the testing to focus on thenumber of job and the number ofpod the first case is submit many jobwith one port each there are two lines Ihere the dot line shows the portcreation time the solid line shows thepod scheduling time you can see unico isahead of the all the two but that's in mbecause the p creation time isahead I think is related to the limitlimit of the web hook QBS for the jobbut we haven't found a way to add uhadjust the web hook QBS for KQ andvolcano if you look at the timedifference between the creation and thescheduling you are see they are pretty[Music]similar when we change the number of jobfrom 10,000 to 500 you can see thedifference volcano need to make more webhook request to mutate thepod interestingly the volcano creatorpod in patch i'm not sure if this is aintentional designif we reduce the number of job to 20each with 500 pods you can see the KQ isalready a hand of a header of theunicore and the volcano and the volcanois still abeheader the last case is one port jobwith 10,000 ports here KQ and Uni is aheader volcano still uhbehead uh this is because there only onejob so only one job web hook commutedrequest is needed summarize the fourcase without G scheduling if you have alot of job the scheduling speed are morerelated the number of job web hook mutedrequest for KQ for KQ and the uni callthe difference lie in the cost themutationrequest this uh if there only onemutation request KQ is a hand of aheader of the unicorn uh volcano isbehind the order two in allcase now let's look at the all cuteduh with scan scheduling enabled againthis a 10,000 job with one pod each unicreate the double the number of the poddo the it use the playh holderports placeholder ports uh it willcreate a placeholder port to reserve theresource and then replace them with areal port so placeholder data then it'sit is actually real port scheduled lookat the port scheduling time volcano isahead of all the two this is becausevolcano was originally designed for Gscheduling i will show all the chart tohelp us better understand theresult the other chart focus on the jobcompletion time you can see uh thevolcano is ahead the okayuh this is for 10,000 job with one porteach if we reduce job to 500 you can seethat unico is passing kq and the volcanoqu in batch again but volcano is stillwayahead the job combination show the sameresult i will skip all the two casebecause when we reduce job to 20 and onethe result is basically the samein term of the GA scheduling we uh ifyou project depend on a lot of uhvolcano is way better than other twothat pretty much is for today'sbenchmarktesting we know there are many more usecase and professional testingcombination that we can't include whatwe show here the feature than all havefor some feature you may want to comparethe how the schedule uh scheduleulerbehave and the many job versus few jobyou may also have very custom userscenarios we provide the project to uhto do so testing using audit lockerwhich give more accurate result we haveopen source the key toolsuh to to create for the testing uh don'tbe stressed about the benchmark testingyou can basically just use make commandwe will open source everything to theshowcase how to do the benchmark testingso you can change uh change proparameter and run it out of out of boxlet'suhOkay okayokayokay uh we will have a road map to makethe tools more professional i think thisgood asset for the Kuba con talks and wewelcome contributor to make the toolsbetter please check out the GitHub repookay that's it for the performance iwill hand it back awayyeah I think sorry we use off the timeand basically that's pretty much thetoday's talk and lastly I will thankgive a a big props to the uh differentmaintainers of the threeschedu and yeah thanks for them toprovide some insights habra perpreparing the slides yeah thanks[Applause]everyone I'm not sure if I'm allowed totake a couple questionsOkayokay yeah uh there's a mic there yeahhello uh thanks for the talk i wanted toask if you would compare the non- gagscheduling with a cubeuler can you maybelike elaborate on it like where all theschedulers better in this large scalesor is there some case where the regularcoupul is maybe comparableuh I don't think that is uh makes sensebecause default scheduleuler doesn'tconsider this a g concept you fordefault scheduleuler we like I said thecube batch sorry the cubeflow communityis using a co-cheduling plug-in which isbased on the default schedules SDK andthe schedule framework so that is howthey are using right now if you just usedefault schedule out of the box wellthere will be a accuracy issue Yeah mhmokay thank youyeah if you uh have more questions Iwill be out of the room for a while yeahthank you everyone thanks everyone2025-04-15 22:00:25.500860 � ��l�b#��AD8r5j_R9QsAhey everyone thanks for joining i knowit's uh it's late in the day and I thinkeveryone's probably very exhausted anduh ready for the cube crawl so we'll tryand keep this nice one quick and uh alittle bit light-hearted andentertaining we're not going to gothrough any crazy complicated theorystuff it's a bit of an interactive andfun session for you all so we're heretoday to talk about logs metrics andtraces and a little bit of mayhem foryou so before we get started I just wantto go through who we are i'm Tom Glenni'm a senior developer advocate forGrafana Labs i've been a softwareengineer for the last sort of 18 yearsand my background is predominantly in umgame development so I worked on backendgame software and um I'm a hobbyist gamedeveloper as well so Jay my name is JayClifford and funny enough I'm also adeveloper advocate at Grafana Labs um atGrafana Labs I mainly deal with our logaggregation database Loki to write a lotof the documentation do a lot of theeducation and then in my past I workedfor Influx Data did a lot with Telegraphand Influx DB um and loved Grafana somuch I ended up here um so setting thescene uh if you're going to buy into ourtheme today you are all heroes on anadventure with us we have a series ofquests we're going to talk a little bitabout what observability is where it cango wrong we'll talk about our tools ofthe trade on our adventure the metricslogs and traces and then what you're allhere for the actual game itself theinteractive adventure all textbased witha bit of an observability twist hey Tomyep then we'll talk about takeaways howyou can get hold of the game yourself sowe will save the GitHub r��V�a#��cAnjT5r3JjIaAthanks everyone to CubeCon and thanksthanks everyone for joining today'ssession uh today we are going to giveyou a comprehensive analysis on thepopular schedulers in the market whichare KQ Volcano and the Unicorn which aremost asked question among the uhaudience we interact with uh before thatlet's introduce ourselves uh my name isWayan i'm the co-chair and themaintainer of six scheduling and I workfor Apple AMLh hello my name is Shiming Jang work fordog in Shanghai anepo to the veryend and then any questions that youmight have wicked so can I get a quickshow of hands not that I can see anyonebecause of the bright light um but whohere um is comfortable with the termobser observability like what do youknow what we mean when we're talkingabout that okay cool i I think that'slike half of the room but I can'tactually see um so basically what we'retalking about when we uh when we sayobservability is monitoring a system andunderstanding what it does based on itsinputs and its outputs so typically uh asystem may uh appear as a sort of ablack box and it it can be sometimesdifficult to know what's going on insideof that and so when we're uh let's saywe're debugging an application how manyof you are familiar with doing this yeahso we do print why is this broken and weput these all over our application butthere's a much better way to do this andwe do this with observability tools likelogs metrics and tracesbut the problem is it can go wrong whenwe don't have these things in place andso our talk today is is mostly basedabout games so I've picked out someexamples here that some of you may ormay not be familiar with um so we'vepicked out three games here uh so CyberPunk is one that I'm hugely into at themoment but when it first came out therewere massive performance problems mostlyon the last gen consoles but when itfirst came out people were basicallysaying it just didn't work it was brokenit shouldn't have been released uh ApexLegends another one that went down foran entire weekend because it hadmatchmaking database uh issues and sothat meant that nobody could actuallyaccess the game for like a good 48 hourswhich is obviously not what we wantdiablo I when that released againanother one where it had massive issuesfor login and scaling with just thesheer amount of people that wanted toaccess the game so what can we do andwhy why does this happen Jay so thereare plenty of reasons why this happensbut I've boiled it down to three mainreasons first is complexity let's thinkabout the dawn of microservices it's whywe are all here at CubeCon we're allhere for Kubernetes we're all here formicroservices and so forth microservicesthemselves you could imagine we used torun monolithic applications we nowdivide that application into a series ofservices each with their own unique roletheir cog in the machine the problemwith this is all of your problems nowbecome opaque when we scale these thingsup to say several hundred pods when westart having all of these podscommunicate with one another where dothe issues lie where in the chain are wefailing and next sort of to lead on fromthis is customer sorry companies becomea victim of their own success thingsthat worked for maybe 10 customers whenyou scale to 10,000 customers you startrunning into bottlenecks edge cases andperformance and good old hardwarecapacity issues and lastly we are allhuman bad code is shipped no matter whatwe do pack bad packages there could beproblems when we update the versions ofour packages and users do user things ifyou are not all familiar with this memeright here this video please go find itis one of the funniest things I've seenin like five years of dev memesfantastic we sketch there but it justproves that what customers will do withyour software is not what you expect allthe time really they should be puttingthe circle in the circle hole but if itfits in the square hole they're probablygoing to do it yeah makes sense so we'regoing to talk about uh three differentthings today so we're going to talkabout metrics logs and traces and sobroadly speaking metrics allow us to seewhat happened logs allow us to see whyit happened and then traces allow us tosee where it happened so when it comesto metrics these are usuallyquantifiable pieces of data you saw inthe last talk if you were here thedifferent kinds of metrics that we canhave um these effectively most of thetime they they are time series pieces ofinformation they have a unique nameattached to them and they tell us a veryspecific thing about a system theyusually have label key value pairs thatcan allow us to define different thingsyou saw in the last talk where there wasa Q um value there that sort ofdifferentiated that metric betweendifferent ones and we can have differentmetric types so counters gaugeshistograms and so onlogs I classify as the OG telemetry typeum we talked a little bit earlier aboutprint statements you're kind of on yourway to logging when you're using printstatements or console logs you'rebasically just outputting to the consolebut logs can take a variety of formssince they've been around for so longthey could be stored in files withinformats such as JSON XML they could benetworkbased such as CIS log opentelemetry which we'll get on to laterbut they all kind of boil down to a logentry with a series of components and Ikind of see it like Lego we have thetime stamp once again we have severityso you can imagine this like road signswhether it be an emergency um a warninginfo or debug and then we normally havea message what happened at that point intime but as we know standardization isalways an issue and metadata can alwaysbe added to these logs so justespecially when we look at structuredlogs and when we talk about opentelemetry you'll see a lot of differentmetadata applied to logs cool and thenso we have traces so these are sort ofthe final piece of the puzzle we'regoing to talk about today these let usuh sort of see our our requests journeythrough your systems effectively umtraces are made up of what we call spansand each span is a particular actiontaken within that system and they allowus to sort of pinpoint slowdowns withina system and sort of figure out wherethings are going wrong and I've given alittle example here of uh it kind ofmaps to the Diablo I example wheresomeone's trying to join an online matchthey open their game they try andconnect to the matchmaking API thematchmaking API might respond reallyquickly but then it tries to connect tothe database and then we have uh you cansee here we've got 9.5 seconds of delaywhen we're trying to make thatmatchmaking query and that's what tracescan allow us to do it can allow us tosee where those bottlenecks are withinour systemsso how does this all come together Tomwell I think we're going to give you abit of a demo so um like Jay mentionedearlier on we're going to go on a bit ofan adventure and uh we we're basicallybeing tasked to uh defeat a wizard ithink Jay I think that sounds like agood idea awesome all right well let'ssee if the uh the demo functions hereall right so um we've basically justcreated a a simple Python script here ihope you can all see that uh I want tobe called Frodo say that again i want tobe called Frodo frodo okay cool thatthat wasn't rehearsed i was just goingto put Tom but there we go cool allright so um hopefully you can see thatlike I say um we are at the beginning ofour adventure there is a path leadingnorth towards the town and there'sanother path leading east so I'm hopingsome of you here are old enough toremember textbased adventure games andthis isn't going to look like somethingarchaic um but hopefully this makessense so we've got a number of actionsuh available to us here we can go to thetown we can go to the forest we cancheat i think you put that in there sheheard a cheat in the Oh we can lookaround uh we're not going to cheat thatThat was built in for developers and youwere meant to take that out before sorryi think we should probably go to thetown no we're heroes there thereprobably a quest in the town okay coolso we're now in a bustling town peopleare going about their business we cansee a blacksmith Jay we can see amysterious man wandering around thestreets there's a quest giver and thenthere's a chapel what do we think wellquest giver right we want a quest i wanta quest all right all right so let's goto the quest giver all right we meet aquest giver he's offered us a quest Jayto defeat an evil wizard should weaccept the quest accept the questawesome all right we've uh we've toldhim we're going to accept uh he says wedon't have a sword though Jay he'slooking at us with disappointment ithink that's a little rude Tom don't youthink it is a little bit rude but Iremember there was a blacksmith back inthe town so should we head over thereand see if we can get him to give us asword you should be able to a shortsword or a flail an axe a flail spear ithink we'll stick with a sword um rightwe're going to head to to the blacksmithwe're at the blacksmith's forge he'sbusy working and we've got an optionhere to request a sword so should we dothat request a sword awesomeuh so he said he will forge a sword forus Jay uh it is going to take some timeand we do need to heat the forge soshould we We've got an option to do thathere so should we do that is this goingto take seconds or days so the thing isum I actually don't know how long ittakes to make a sword i don't know ifanyone here knows how long it takes toheat a sword in a forge i'm I'm going togive it 30 seconds we're just going tostand here awkwardly for 30 seconds wellit says anyway it says we've fired upthe forge and uh it's be heating up andwe've got to wait a while before wecheck on the sword so you said 30seconds yeah I'm going to and we'realready at 19 20 okay cool all rightwell we'll we'll just make some awkwardbanter on stage then I guess why do wehave to kill a wizard why couldn't wehave done a dragon like why why do ithave to be a wizard i think a dragonit's our first adventure today you justneed to calm yourself down all right allright um what we're at 35 seconds so 35right okay so um get the sword backcheck sword uh you check if sword isready the sword has completely meltedwell that's not very good blacksmithlooks at us with disappointment um rightokay we'll request another onewhat just ask him for another yeah we'regoing to start again and we'll just doit for 20 seconds instead right okay uhit says I can't do that right now whatdoes that mean i don't know i'll try itagaincan't do that right now i think we'vebroken ithave we got anything to check what'shappening uh well actually I think therewas Let me just uhYeah all right so we'veum we've we've burnt the forge down Jayit's completely burnt down hasn't itso maybe we should help the villagerebuild the blacksmith and then we'lltake a much more measured approach maybewe're metrics right okay um yeah allright so I tell you what I'll go back totown then um that funnily enough thereis an option to rebuild the blacksmithsaw this coming didn't we quite quitehandy so we'll do that hey we we've uhwe've re rebuilt the blacksmith all goodeverything's fine we're back um cool soI tell you what that that forge heatgauge thing over there that looks likebe really handy so I'm going to I'mgoing to pay attention to that one rightso let's go back to the black i'll letyou keep a bird's eye on that right okayso this time we're going to request asword okay so request a sword and now hehe said uh he's looking at us withdisappointment he says fine be morecareful this time if forge gets too hoti mean he could have told us that thelast time though heat the forge Tom wegot it this time right so heat thatforge and hopefully we should uh weshould see that heating up so I'm goingto get I'm going to get ready i'm in asweet spot the greenbit all right let's see we hold withbaited breath all right i'm going to goi say request a sword there we go allright so we've we've requested the swordthe sword is ready we take it from theblacksmith awesome the um You see howthe gauge is still going up i think weshould probably water on that forge burndown the blacksmith again right so wecool that down all right awesome sowe've poured water on the forge thecoals have sizzled everything's hunkydory right so we got the sword should wego um back to the quest giver and getthat well I was offended by the questgiver he said you know we didn't have asword he looked a bit disappointed ithink we should probably power up oursword first just to show that questgiver what we're made of what do youmean power it up well I saw there was amysterious man in town i think we shouldgo see him what just a random man go seethe mysterious man i promise you inevery adventure a mysterious man justoffers you anything you like right i'mokay we meet a mysterious man he offersto enhance our sword with magic see seei like absolutely accept that offer doyou normally just go up to randomstranger press oneright okaya great choice indeed your sword is nowenchanted with great power you feelfunny but powerful maybe I should accepta quest right well I think we should goback to Stan see that quest giver andshow him amazing sword okie dokie so uhquest giver i'm not sure about thefeeling funny part though that's We'llgive it a go right okay let's accept thequest uh you tell the quest giver you'dlike to accept the quest giver turnspale they collapse dead what do I do nowah um well he must he must have been oldlike let's let's try and request itagain there must be a new quest giver itcould have been any reason that he diedi know but it says he's dead yeah yeahwe'll try againno same again we're back at theblacksmith cuz it threw us out so allright let's go back to go back to towni'm going to go back to the quest giverand then try try the quest againhe's dead jay that doesn't make anysense so what can we do to work out whythis is happening maybe we should lookat logs and we could look back throughour application logs to see what mighthave happened well actually if I mean wecan look at the logs but you can alsosee we've got um we've got an evil swordso metrics are already telling us we'veenchanted our sword quite evily alreadyi'm going to look through Jay there's athere's a funny thing in the logs therecan you see that bit it says the swordwhispers I killed them you'll neverdestroy the wizard with me in your handsso it looks like a critical error to mecan you expand that just to make surelet's have a lookyeah yeah oh it's actually a fatal errorokay cool no pun intendedum so I still don't believe it was methough but if we scroll back through Imean you're the one that accepted itfrom a mysterious man in the village ican see probably an error there that waswe accepted it from a mysterious manwhich might have been my fault rightokay well so I tell you what then whatwe going to do now then we've got we'vegot an evil sword but we need to killthis wizard and also the the quest giveris dead so So maybe we could go to thechapel and get it the curse taken offget it blessed right okay let's have alook we enter the chapel the priestgreets us warmly he can look at thesword should we get him to have a lookat it yeah yep yeah the priest looks atthe sword with fear my child this swordis cursed i will transfer the curse tome what a nice guy uh actually we've gota holy sword now i mean I think we'reready to accept that quest now hopefullythe third generation of quest giver willsurvive do you think it's is like hisjunior junior junior quest giver okay solet's go to the quest giver uh we willaccept his quest he says "Wow you've gotsuch a powerful sword i'll give you aquest to defeat the evil wizard rightthat's a much better reply awesome solet's go back to the town and uh I thinkfunnily enough the wizard's just hangingaround in town yep there's a Why wouldhe hang around would you not have like asecret cave or anything or did we justget bored of coding i I think we gotbored of coding at this point um so he'syelled "Are you here to kill me?" I Idon't know are we here to kill him whatdo we think do it yeah should we killhim strike yeah strike him down allright you strike the wizard down withyour holy sword the town cheers for youhuzzahwicked and our adventure has come to anend so awesome what we did there is wecovered a little bit of metrics and logsbut if you scroll back Tom to the demodashboard demo yeah the one thing thatwe did not talk about wastraces so Tom if you jump back adashboard and let's have a look at ourleaderboardwhat we can see right now is Frodo as atrace ID and yes we are running this asa monolithic application but we canactually still see essentially how ouradventure has gone through a trace wehave each of our actions as a span andyou can see how long it took us tocomplete that adventure so that was apretty good score 8 minutes and 27seconds 27 seconds and you can see allof those actions that took place so wecan see exactly how what happened andwhen and this is where correlationsbetween telemetry signals comes intoplay so when I made that big mistakewith the sort can we find that when weaccepted his offer why don't we checkout the logs at that particular spanyeah so you can see here that when we uhwhen we did that accept his offer stepwe've got the login enter here that saysuh the evil wizard enchanted your swordwith dark magic and you feel a chill rundown your spine this is a warning andthis is where observability really comesinto play when you start combining thewhat where and when you can startcorrelating these telemetry signalstogether to get a true understanding ofyour system the last one just to quicklyshow for fun we'll quickly brief if youjump back into the dashboard we showedthat you can correlate traces and logsbut you can actually also correlatemetrics and traces and these are calledexemplars so as you can see in our swordtimeline as we generated new swords orhow we got a holy sword or an evil swordyou can see we actually tie that to atrace ID for that game and if you clickon the query with tempo we jump directlyto that play that uh that happened so asmore adventure games happen at the sametime when we click on these we can seewhich which adventurer which game hadgained that sword at that particulartimecool all right so how did we create thisadventure then so obviously this was alittle bit of fun but we did this to tryand show you the core concepts ofobservability and actually how they canbe used to help you debug your actualapplications rather than just atextbased adventure and so there's somekey components of this we're using opentelemetry throughout the entire stack webasically have a small Python scriptthat's instrumented with the hotel SDKthen in between that we have our hotelcollector and that's the thing thatwe're sending all of the metrics logsand traces to the hotel collector isthen firing all of that information intosome backend databases we're usingPrometheus for our metrics we're usingLoki for logs and then Tempo for ourtraces and then all of those databasesare being used to fuel what you sawbefore which was that graphana dashboardthat had you know the forge heat onthere and then all of the logsitself so to quickly break down whatopen telemetry is it's basically thechild of open census and open traces andessentially open census for metrics opentraces for tracing and now they decidedto come together under one united frontand make open telemetry and extend tonot just logs but also continuousprofiles as well and it facilitates manyroles as a standard does it's aframework it's an API it's an SDK likethe SDK we use to orchestrate withPython um and they also have animplementation of a collector so if youflip to the next slide the opentelemetry collector is a sort of beakingexample developed by the open telemetrycommunity which shows how to receiveprocess and distribute open telemetrysignals it works within the OTLP formatwhich is basically this the format orthe data format that was designed byopen telemetry but it does allow you toconvert to f third parties thepreference by open telemetry is that alldata sources have an OTLP endpoint thatthey can accept OTLP data um and butotherwise you can use something the likethe open telemetry contributor collectorwhich can translate back into a specificdatabases input typecool and then so the next part of thatum architecture that you saw was ourtelemetry storage and again like I saidwe're using Prometheus Loki and Tempofor this now each of these is for aparticular component of those umtelemetry data so again Prometheus isour metric store and that has its ownquery language called promql which we'reusing within that graphana dashboard toquery for the particular pieces ofinformation like whether or not it wasthe holy sword that we had or it was theevil sword and then Loki is our logstore this also has its own um querylanguage it's very heavily based onPrometheus and this is called logql andwe're using that to grab all of thoselogs in that panel on graphana as welland then tempo again seeing a patternhere this is our trace store also hasfunnily enough a query language calledtraceql again heavily inspired by promqland all three of these have a a nativeOTLP endpoint like Jay was mentioningthis allows us to take open telemetrynative data and pump those directly intoour telemetry stoages without the needto do any form of transformation onthose metrics logs andtraces and then the final piece of thepuzzle is Grafana this is the opensource observability platform that wehave opted to use here we also work forthem funnily enough um but this allowsus to then visualize monitor and analyzeall of that data and try and figure outand make sense of what it is we'relooking at here and allows us to youknow figure out hey we have the the evilsword and we've probably made a mistakehere let's go back and figure out whatthe problem is the other thing aboutGraphfana is that we have what's calledthe big tent philosophy and this iswhere we effectively want you to bringyour data into Graphfana from anywherewe don't really care whether or not youhave data in one particular database oranother we allow you to ingest that datano matter where it uh lives and then thefinal piece of that is we also havealerting so if I'd have had Slack set upon my phone I could have had it notifyme when the evil sword for example wasuh funnily enough got from Jay I thinkit was your fault or when you burnt downthat blacksmith they do that as well sotakeaways observability is no longer aluxury as we know as we demand morethings we demand more shows on Netflixwe demand faster deliveries from Amazonwe need to make sure that the systemsthat we're using to gain this contentdoes not fall over we ourselves I thinkhave an SLO of like 99% um on GrafanaCloud and it's something that weactively need to maintain for ourcustomers the what why when rememberthat is um metrics the why is the logsand the when is the traces think aboutopen telemetry if you're just startingyour journey into observability it givesyou the ability to remove vendor lockingso you can choose the observabilityplatform that works for you and I thinkwe've tried to illustrate today learningobservability does not need to be achore you can have fun with it make agame like we have teach through the artof storyand so you can play this game yourselfwe have all of the code up in a repothat you see here um if you don't wantto do do the QR code it's basically juston the graphana repo under adventure umthere's two versions of this demo upthere um all you need is to basicallyhave a Python environment and the restis Docker Compose if you don't want todo that we also have a Killer Coderinstance you basically just click thatand there's a sandbox and it comes up asan online VM and you can play the gamelike we did awesome and then so the lastpiece of this is that we know that someof this can be quite difficult and ifyou have complex systems it can bereally challenging to figure out what itis you're trying to do how to instrumentit make sense of all of this data so wewould heavily encourage you to come andjoin the Graphana community we have uh aforum there where you can come and askquestions there's other like-mindedindividuals potentially working on somevery similar projects to yourselves thatmay be able to help you out with thingsand also just come out and help ifyou've got information to share andthings that you've figured out um comeand share it there be part of thecommunity you can use that QR code tojoin uh our forum there and uh yeahthat's that's us for the day so thankyou very much for listening[Applause]if you've got any questions now we'rehappy to take them otherwise we will bearound for the next few days come andfind us we're also at the Grafana boothand uh yeah yeah drop by for a beer idon't know tell us about the I don'tknow it's a cube craw right i don't knowhow it works maybe we'll watch you havea beer oh fair enough thank you verymuch everyone thank you2025-04-15 22:00:26.057334us a lot uh uhlearning about the kinds of metricswhile this gives us a good prime thistalk gives a good primer on all thesorts of uh g uh metrics there are outthere and this little one explainshistograms in greaterdetail uh the next section uh is thestage of the metric the metric itself uhin the Kubernetes world has a stage itgoes through alpha beta stabledeprecated or if a metric is no longerhidden as as like uh as gets deprecatedit then gets hidden and then deleted uhit's explained in great detail on thewebsite we can find details uh down inthe link below and the fourth part isthe uh is the metric itself like uh thevalues of what the metric there alongwith the optional labels in there here'san example with all our sectionshighlighted we have the name it'sKubernetes build info uh the help givesus the definition it's a metric withconstant one value labeled my majorminor git version git commit git stateuh and and a lot of additional data whatit gives us is basically an eyesightinto what uh version of Kubernetes we'rerunning what git version it was built onwhat go version was used to compile ituh the major and minor versions and theplatformso let's uh now look back at ouranecdotes and I'll hand it over toPriyanka to take us from hereso um like Jason mentioned there is no atoz good guide to go through all theKubernetes metrics and I can say um inlast month itself when we when I startedto learn and go through the Kubernetesmetrics I found thereare hundreds and hundreds and it couldbe thousands also so we what we aregoing to do today is um all theanecdotes we started with we are goingto use them not all but at least few anduh we'll see what we learned out ofthose metrics the first thing we wepitched as part of this talk was umwhich Kubernetes metrics can tell uswhich features are enabled in myKubernetes cluster so let's use a smallkind cluster and see what all is thereand here I am creating a kind clusteri'm just checking if my cluster is readyand kind by the name it's kubernetes indocker so let's try to u find out whatis the name of the docker containerwhere my kind control plane is runningthat's what I'm doing in the firstcommand and exec inside the container umyeah docker container so once I aminside the docker container the firstthing I would do is check where my umlike what is get some information aboutmy cluster so cubectl cluster info isgiving me that information and one ofthe information it is giving me is wheremy API server or control plane isrunning since API server is the mostprominent component which actually makesthe Kubernetes cluster Kubernetescluster um let's try to see what kind ofmetrics is exposed by that and we canalready see here where would I find themetrics exposed by kub uh kubernetes APIserver so that would be https colonuh/kind control plane at 6443port I also have left a note um kindcontrol plane is just a mapping to localhost on this docker container so let'suse do local host so that we can followalong with the next set of examples hereanother thing I'm also doing is um forme to access the metrics I can't just douh call commands just randomly callcommands I have to authenticate myselfso uh this is the reason why I execinside the docker container because thedocker container is serving as a controlplane host so control plane hostcontains my certificates and key keys soI'm using the certificate and keys hereuh to make my curl request and hittingthe local host uh 6443 at the metricendpoint and what I get is somethinglike this um I can't take uh thescreenshot of the entire metrics thatI'm receiving from here so this is justa portion but we are interested in thisparticular metrics so API server exposesthis metric called Kubernetes featureenabled and this is the metrics thatprovide us information about what allfeatures are enabled on my um kindcluster we just created using kindrecluster and what I can see here is umthe help and type that JSON showed us soI can just read the help and it'stelling me this metric records the dataabout the stage and enablement of aKubernetes cluster so by enablement ifI'm seeing one in front of any metricdata point that means that is enabledand uh wherever I'm seeing a zero thatmeans those features are available butthey are not really uh enabled or in useon my Kubernetescluster these are single data points butif I want to like make use of thismetrics which most people who are intouh monitoring and observability theywould be aware about Prometheus Kryfanaor other monitoringum application so here I'm just using asimple hem chart uh for the Prometheusstack these are the commands i'm tryingto um install the install Prometheus onmy uh tiny kind cluster and the portforwarding so I can access Prometheusdashboard and here when I ask Prometheusto give me u all data points ofKubernetes feature enabled this is whatI'm getting and here I can usePrometheus queries to do some more uhstatistics so here I can like count howmany features are enabled on my um kindcluster this one is giving me the one inum just empty brackets i understand themas stable because once any um feature isstable they they don't really have astage and the other ones I can see howuh about 94 features on my kind clusterare available which are in alpha stageright now 158 beta and eight deprecatedas well i can also do some more querieshere so the these are just all theenabled ones and um if I change thequery to zero these are all the disabledone so these kind of uh information Iwas able to get from just one metricitself and I was like really impressedby it because wow like a metric is justtelling me how many features areavailable on my um tiny kindcluster next up um we talked about howmany ports a cube is running this one isvery very um basic because we see a lotof talks at cubecon about how many uh weare scaling our cluster to thousands andI don't know millions of ports andcontainers etc etc this is all trackedby metrics and metrics like these solet's see um which metrics now we cantry to pull out um from uh from where wecan get uh information about how manyports or containers are running on ourcublet so we saw 6443 was API server butI want to see what all other componentsare running on my um docker controlplane host so I just tried to look forall the components here and I found10250 is running cublet here so let'stry to um hit cublet in the next slidebut I also wanted to give a shout out tothis um uh blog post um the link isbelow this also helped me a lot to uhfind out a lot of different metricsexposed by many different componentsfrom the kubernetes one i uh I think Istarted by cublet so we are talkingabout this onehere so um I figured out cublet isrunning on 10250 so I'm doing the samething this time but um hitting a curlrequest on10250 metricsendpoint and I get metrics like this umone of the metrics here is cublet nodename and it's just giving me since thisis a kind cluster with just one nodename I can see kind control plane is theonly uh node available here and I cannow start seeing more information howmany desired ports uh are supposed to bethere or are supposed to be running onthis um kind cluster so 11 are supposedto be umrunning well 11 and four so we get 11and four here and if I want to like nowunderstand why 11 and four are coming Ican go back and read the help and helpsays the number of codes that cubulateis being instructed to run static istrue if the port is not from the APIserver so this is not only giving me theentire number of a uh ports that shouldbe running on cubelet but alsocategorizing them how much c how manycube API server should be running andhow many cublet directly should berunning the static ones are directly uhtaken care bycublet so the one which has static trueare basically handled by cublet itselfand 11 are created by APIserver and we can see uh now number ofmirror ports which is equivalent tocublet created four static ports now APIserver have to uh maintain a copy ofthem so we also see that informationhere so I I'm seeing a lot ofinformation about ports and containersrunning on my machine i can also try tosee the same kind of information inPrometheus using the same metrics socube port info is giving me a lot ofinformation with lot of labels now whichI can then try to query and um like thissimple query is just giving me how manyuh ports are to how many total number ofports are running on my kind controlplane node and which answers ourquestion how many ports or containersare running on um ournode next up cublet um not only justgives us cublet stuff but there isanother endpoint um is which is exposedby cublet called C advisor and that cangive us information like this i can Ican find out information about thekernel version or OS version on my uhhost using uh metrics like CI advisorversion info and this one was um veryinteresting to me container memorykernel usage i wanted to see how thismetric is receiving the value that I seehere in the red how this value is comingup so I actually tried to dig a lot uhand explained it here how did I come upwith um like how did that value iscoming up how I'm I'm able to get thekernel memory usage statistics and itwas really nice it it wasn'tuh too difficult um yeah and I explainedthe processhere then this next one was very uhsimple but really made me smile um itreminded me of the of the WhatsApp lastseen so there is a container last seenmetrics as well which tells us every umevery time the container was last seenand I can see here all of thesecontainers last seen are um listed onthe right handside and Jason earlier mentioned theKubernetes metrics moves through manydifferent cycles so they start fromalpha then go to bet uh then stable andif uh once they are stable either theystay or they can be marked as deprecatedand if they are marked as deprecatedover time they if they are useless theycan be made hidden hidden means thatPrometheus will no longer be scrapingthem or they can be deleted as well soif any of them is marked as hidden wecan also get those number uh thosehidden metrics information so I realizedokay there is a metrics called hiddenmetrics total uh which is telling me inmy kind cluster which I just created forfun also have two hidden metricsavailable and there's a whole uhdocumentation about um how to checkthese hidden metrics or how we can startseeing these hidden metrics as well onthe kubernetes.io IO documentation hereyou know there is a there are a lot oflinks here so we have tried to have thisslide deck uh in a way that if somebodyuses our handout notes they can try tocopy paste the commands and run them aswell um and then we have some grabbacksbecause I don't want to go through eachof the components so if you areinterested in for example gettingnetwork or fireball related data um youcan again check for the cube API cubeproxy um port in this case it's 10249this time we are hitting local host10249 metrics and I can get informationlikeum total number of IP tables rule ownedby um cube proxy and not only thatlabeled by different IP tables likefilter net so in this case I can seethere are three uh IP tables underfilter and then five under NAT or otherum things like how many packets aredropped by IP table to work around thiscontract problems i was very interestedin this one so what I did was I copypasted this metrics and tried to findout why this matrix was added in thefirst place and what I did I actuallyshould have included a slide but I canexplain what I did i tried to go on theand this uh Kubernetes/ KubernetesGitHub repo and searched for this metricname i got some code path and then Ieventually ended up at the PR that addedthat metric and I realizes realized likesomebody was uh interested in somebodywas having some problem with contractand they wanted to now track how many uhpackets were getting dropped because ofthose problems so they decided to addmetrics now why we are giving this talkand why we are discussing so manymetrics because whenever I was trying touh whenever I was seeing new and newmetrics I was learning about componentswithin my cluster that I would notreally like think about so it wasincreasing my knowledge of how my tinyKubernetes cluster was working on howmany um data it was holding and how Ican learn my uh about Kubernetes ingeneral just by just just by readingthese metrics and if I find somethinginteresting I can go and find uh moredetails about that so I'll I'll skipcube proxy there are more things we canfind here um I'll move to this one howmany ports are waiting to be scheduledso this was also one of the things wediscussed uh we we talked about in thevery early uh of our talk soscheduleuler gives us this informationand when I saw this u metrics that wasthe first time I realized scheduleulermaintains different cues so I thoughtlike okay I am um I am applying a portand it will just wait and at some pointit will be scheduled but no there aredifferent categories how scheduleuler isputting my port into an active queue ora backoff Q and gated and unchedulableetc this was the first time I learnedabout these cues so if anyone isinterested uh they can also go and ufind out scheduleuler pending ports andlearn about these cues like I am goingto do um and not just that we also talkabout something CD related operationlike HCD is the key value store thatKubernetes uses to put everything theentire state of the Kubernetes clusterand there is a lot that goes intomaintaining that information because ifwe are talking about a productionKubernetes cluster we are storing datamillions of lines of data every singlehour and we can't keep all that data sothere are things like compactionhappening um to to make sure our keyvalue store does not just blast off sowe can also like try to find outinformation about that here again fromthe earlier commands I figured out 2379is wherecd is running and I'm trying tohit CD um metrics point here and theseare the information I can try um I canget from CD metrics like CD clusterversiond have two versions uh we rightnow we get three uh CD version three soI can get that information or at leastverify if I'm using version three or theknown three and use my CDCL commandsaccordingly or um things like the totalnumber of granted leases or the totalnumber of renewed leases so if somebodyis aware of EDC and how ECD work you canget like information about leases aswell from ETC um how many are revoked soall these information is is given byEDCD as well and this is what I wastalking about the MVCC compactionrevision every time we have lot of datastored in HCD and we have to keepstoring more we have to comp runcompaction so that whatever data is nolonger in use or we have many revisionswe can like run some compaction so thisis giving me the last compaction whereit's some revision uh some last revisionnumber where some compaction hashappened and um the current revision ofstore so I'm getting all thoseinformationand not not only that how many um keyswere actually compacted so this is goinginto details of EDCD here and thingslike what go version is mycd using so Ican like this one is an older kindcluster so I can see the HCD componentin my kind cluster uh is using go 1.21and the patch uh 12 so things like thatI can see from just from themetrics and one more thing we alsotouched upon how many uh OS threads arerunning in my u kind cluster so I canalso get those counters as wellhere so there are a lot of metrics hereuh we wanted to add more but there wasjust no way we could have addedeverything here so we tried to at leastadd things that I found superinteresting or that invoked my um wishesmy interest into learning more and moreabout it and we try to add them here ifyou have uh moresuch scenarios where you want to usemetrics to learn more about theKubernetes cluster I would like highlyrecommend go back going back to theslides and find out the footer notes anduse those links to um play around withmetrics yourself and with that um we didnot really had a wrap-up slide so we'lljust end up with uh our contactinformation if you want to talk to usabout um the previous talk we did aboutkept or this one you can find us onslack.kers.io this is the KubernetesSlack uh our handles are P Sagu JasonBrenenza or you can write to us um atour email address and with that um I'llstop and thank you for attending i'mopen to questions if if anyone isinterested[Applause]2025-04-15 22:00:26.812484 ��4�d#��AGvIPSgt69Sgthank you for your uh joining thissession and my talk uh actually this Iwant to see that this each of them weresome yeah here's two brightest I cannotsee that each other but uh I'm reallyappreciated this all of you because thistopic or this agenda actually came fromthe all of this maybe eventually the endof the page you may not know that what Imean that and alsouh I'm the working for the meazone andthe meazun is one of the Korea companyand provide MSP blah blah blah b ��c#��EA95NNuV-SUdgso hello good evening and welcome to thelast talk of the first day ofCubeCon London2025 uh I'm Jason Banza i serve as anindependent consultant to small andmedium businesses in Bombay India uh hieveryone my name is Priyanka Sagu i workat Souza as a Kubernetes integrationengineer and uh I'm also part of theKubernetes upstream project i'm one ofthe technical leads for uh KubernetesSIG contributor experienceso we're talking today about learningKubernetes through the lens of metricsit's all about what we learned aboutKubernetes through the various uh as aswe go about learning stuff like thistalk is a spiritual kin to our last talkon CAPS where we spoke about uh what welearned about Kubernetes while uh goingthrough the Kubernetes uh enhancement uhprocess documents uh you can find therecording here most of the footnotes andthe links that we talk about if we saythey're online you can find them in alittle thing thereso like I said this is a new lens onlearning Kubernetesuh and the stuff we learned tinkeringwith Kubernetes clusters what this talkis not is a guide to metrics or towriting metrics efficiently it's moreabout what we groed uh exploring metricsso what it is is about the interestingthings we learned and the path ofdiscovery we took to get there so let'sbegin uh did you know that Kubernetesmetrics can tell you which features areenabled in your cluster with alpha orbeta or stable or reveal how many pods aa cublet is running how many are waitingto be scheduled how much byte spacecontainer logs consume in your clusterthey can even track meods live goroutines or the latest HCD compaction uhrevision so these seemingly small datapoints hold huge insights and that'sjust scratching the surface uh here'sthe thing there's too many metrics outthere in the world in the cube world soand there's no good a toz guide to gothrough them so all of this uh isbasically the stuff that we found we'reusing the anecdotes themselves as guidesand we'll walk through a few interestingmetrics and the lessons we learned sothe structure of a metric itself asexposed by Kubernetes looks a littlesomething like this there's obviouslythe name uh as as we've seen but then italso com like we've kind of decomposedit into these four sections the firstone is the help section which gives usthe definition of the metric when youlook at it uh there's the type of themetric uh there are various types likeyou see there's gauge or counter orsummary histogram or untyped so I mean Idon't really want to go into all ofthese because that'll take too longthese uh talks helped !ut yespeople sometimes ask me that what isMega blah so that's the reason why thereexplain this shortly and yeahuh people are not people actually thereuh[Music]so when the people meet this what you'retalking because I'm the one of thespeaker batch in here so people ask mewhat is your topic uh people might talkis one of them is a K clock and theother is the security part and what elseis Ourama and then KS GPT something likethat it's very specific and proper onebut today's topic is uh KS is a corepart but I cannot describe it exactlybut uh I describe like this when Ideploy or when you are deploy code likethis after that yeah there is so many ofthe unknown it's unknown meaning is likeit is a really uh some meaningful uhmeaningful words but anyhow there issomething other code is include that uhit is meant to be there is a veryspecific very uh proper meanings overthere so I want to jump in i want todown to the Kubernetes hole yousomebody's know about me know about thisthis words is because this uh yes uhthis cubicon is UK and may somebody knowabout this alysine wonder and this camefrom the UK too so that's the reason whyI'm using that or I abstraction use itsomehand so this drawing and then thiswords yeah came from the LC mon so downto the Q&A hole before that yeah Ialready explained this what am I notwhat am I just for I explained the maincompany and name but I think it is thisuh I will introduce a little bit morei'm you can see that this my hoodie i'mthe CNCF ambassador and as well as theKuba Astron things i'm not the goldenlover but I just kuba and also I got aseveral contribution to the community orsome uh organizers a meetup see that soyeah top and uh right side there is somemaybe six one and then the other uh thetheuh the other some suckers are that isthat I'm the one of the I'm the one ofthe speakers joining that so maybe I gotus some more batch in this cubernetes UKand then I have a plan to visit theKubernetes that's I mean that thecubicon the China uh in June so so andJune so maybe I got a batch more and howthis says I really like to theKubernetes and then Kubernetes spreadout to the knowledge and then Kubernetesyeah growing something like this yes uhyes that's what I do and then I reallylove to kubernetes things okay get backto this topic yeah is what is unknowncode i already explained very brieflybut I would like to show up to this morea little bit this example with the stemsuh there is a there's so many of thisunknown code as I said is automaticallydeployed it uh there's so many of theoption is over there but however I wouldlike to pick it up the three thingsbecause there is so many so imag policyyeah you can see that this always butthe other option is exist if not presentor so never and the other is restartpolicy is contains levers to the restartand there is always and then the otheroption is that this never and on failureand last one not actually this there isso many of them but as I said I or forexample I pick up the just three itemand terminar grace period is just athree But you can be to put in that yournumber or your designated number becauseit's declared system so there is thedefault but you can put it in this yourinteger so is this many of this optionis over there even though this uh weknow about or we can easily get it tothe some there's option by the Kureexplain so I want to see that I want toshow to it's really uh short demopreparing actually yeah I a little bitapply to the uh for the this demonormally I really prepare to that livedemo because you know what live demomake it sometimes it's a fail orsometimes yeah I have a broken yeahexperiences uh my web is broken and thenI just talk to the all of the sessionyeah for the 20 minutes all so afterthat I got some recording anyhow thisI'm really prepared to write demo butthis uh environment and then the otherlab is that really limited so I justdecorating this that's one of my apologuh nearby to here and then talk with theme and then we can see that there's realdemoanyhow cubic explain yes you can use thecubic explain and then you can see thatthere's some u"h there is a menu resourceit can be the print or you can guess uhcubicure API resource There is so manythe option and then how is this work andthen what is the meaning of what is thecertain name those kinds of informationprovided so kubure explain and then putin that the po p meaning is a part andthen there's so many this explain andthen there's a meaning at that afterthat we can put it in the cubiccontroller uh explain thispc and there's so many uh options toobut there's so many so I'm going to usethat there some more some words pipeplus more so after that I found out thiscontainers so cubic controller explainedthat thiscontainersyep dotcontainers yeah after that I noted thiswhat is this option under the cubiccontroller uh pot spec containers andthen oh we want to know about this imagepool policy that And after uh we presentthe image pool policy there is so manyof the options not many actually there'sa three option always if not present andnever so it is easily to find out whatyou want there is option to this oprineor some of the your doc site or uh maybethe document and the othersome web searching such as somethinglike that you can find out this what isover there and how this very easily tofind out that your option and explainand that there are something is more bythe country explain so that's the reasonwhy this I'm casually explain this yeahbefore the start yeah I just want to bethe keep the sharing and also uh in myexpectation before I visited to the UKuh you know what this is uh English it'snot my model language and in myperspective UK not UK there's a Franceor German and any wedding and Swissthere is so many the other countrythere's some first language is notEnglish so I thought English is not yesactually there's a latin is not themodel language English too so that's thereason why they have a little bitobstacle or some huddleso okay okayso so I'm going to preparing that thiskinds of things uh before last years Igot a one of my presentation for the KSGPT So KS GPT is one of powerful for theusing that this AI model and thenexplain this so how is this Kubernetesprogram how is a Kubernetes something todo so it can use it like as sometranslator so yeah I I thought so yeahsome not many of us easily to speak itin English but yeah I know I knew thatat this moment there everybody reallymost of them in here English is reallyproper to work and properly speaking soit is not useful anyhow there somebodythis one of my friends is Japanese andRatin and the middle of there's someAmerica such as the other countriesthere are not uh uh proper to use or notproper actually that that uh there isnot uh easily to use that otherlanguages so it can be the revered thatKSGPT for the your language overcomethat so yeah that's the reason why Iprepared then yeah get back or get downto the really core to the talk and todayyeah part with this unknown code I'malready sharing that there this kinds ofthing and then explain the very shortlythat the option and you know what thisuh when I do instructing or so when Iteaching or not teaching actually thiswhen I this guide from to the somebodyor some speech to this other somecampers that I really prefer to draw theflowchart or draw the picture or throwthe something something to understand orsomething easy to digest so I want tosome uh preparing the some flow chartfor the image pool policy never why isthis it not never image pool policyalways why is this good for yours andwhy is it choose that the port optionmeaning is that actually uh it going tobe to one of the best practice or itgoing to be to the very general purposeto use that it's commonly to use thatthere yeah yeahafter that minute so end of the sessionI said this maybe community or I thankyou for your talk thank you for you uhyour contribution is such as the sim uhrelated to this kinds of the meaning andhow this why is or is choose that or isvery smart to work like this you can seethat this the display chart there is athree option which is the number and ifnot present is color is purple or I'mnot the color sensitive person so ismaybe violet you ca#n see uh if it someof the some option is very smart to workit means that there is the make it somecondition to work very smart to work yesyou can see that the some uh never isOnly one condition image already underon the node if it is no it is show theerror and then image already under nodeyes use the just image and also uh ifyou not to present there's only onecondition image already on the node isuse that the image node andthen using that note use no image andthen image already on the node no yesdownload the image but the others I meanthat the default option is always yeahthere is a three condition is includedsmart work if you coding or someprogramming if you some make it as someexceptional case or exceptionally put inthat it means that there more comparableto work it is more smart to work it'salways is include that there three kindof conditional thing first is that thisimage uh has a no tag it means thatthere or something like that if it isthere is a no any specific tag is yes itmeans thatuh I mean that image has a no tag if itis a no I mean that there is a specifictag included it work as the same as ifnot present but image if you have uhsome no tag latest or just uh justnothing in there image already uh thereis a checking procedure is the imagealready on the node so after that justwork like this no and download the imagebut if you have a image on the roadimage already have a node there is arest one you can see that this youcompare the digest so compare the digestand then if it is really change thatafter that the download image that so itmeans that or is really uh convenienceto work your development developmentpurpose and then your developmentenvironment is a sprinkler change andthen downloading to the some uh comparewith the digest uh the rateist or thenon tag is really proper to work and ifit doesn't want to like is those kindsof printer change and then download thatat the time that you are using that thetag so is it work likethat so uh I want to show up there as ademo because this the some diagram is isproper to work and then is easy to workbut demo is better yeah so I alreadysaid that there some demo is good buttoday's demo preparing that's notbecause maybe you know about this what Imean that yes there is just two kinds ofthing one have a tag the other No bothof them has a rateist so there is a notag and then the option is different onehave uh there's a just it means thatalways and the other thing is that ifnot present and also I want to check Iwant to know about this images thereally deployed whether or not so I wantto check this some by the I want tocheck this really image deploy uh to thenode by the script so I printed thisnode name so there you can see that thenode name in here[Music]work node number one and work nodenumber two and then I want to check thatthis the doer limits in here you can seethat this 88 uh you know this in here ifI the live demo the to do rate limitzero was almost zero so that's thereason why I'm just changed my plan totheleering and see that this is really uhdownloading the image by the script uhyou can see that this both workload worknumber one and work number two you cansee that there really actual downloadimagethat after that I want to some I want tochange the image and then unloading thatso I just just easy to change it justfor simple uh texting to the index thatand then rebuild and then push to thisdockerhub with this uh ratetag yeah and then I'm going to read itthis delay to the both of the part andthenredownloading yeah even theredownloading uh there's only alwaysdownload the image that so I want todouble check for thepurpose yeah you can see that this poolcontainer only one or raises uh somepolicy and I want to check that reallyexam really downloading the images so Iwant to check this by the script you cansee that the one and the other the otherw number two is there is no any kinds ofuh the the image downloadings so andthen I'm uh deleting the both of thepart and I want to change that this umuh you can see thathere there is a tag is as I said at thisif we have a tag whether or $not there isa different operation so if I have areally uh tag as a swap image swap imagehave there is a working uh yeah theeventually the same as way I mean uh ifyou have some specific tag there isworking as the same you can see thisexample first of all is currently thisdocker rate is85 and then yeah check the imagethat nochange I don't know I mean that there isa downloaded image that both of them isthis the740 and then I want to be is rebuild andpush them change theimage within the specific tag specifictag is uh some swap image do that overhere and delete to the pod and then Iwant to try to some redownloading theimage even though this I change thisimage uh layer but there's nodownloading because there is specifictag so it's working like thisdifference so if you have a specific orproper purpose or you have a reallycertain purpose that you can use thatsupposed of the some understanding andthen use that yeah check that this bythe script there's no any download fromthe 740 yeah even though this 40 742 Yesi was recording this a few days ago yeahbefore I this partyep yeah see that this image pool policythe deport option is always and verysmart to work and what about the othersi want to check that the deployment ishow is this the unknown code is includethatuh yes the the name is really longdeploy uh deploy ling updated maxsurgery 2025 yamo yeah because this Iwant to show up to that what I mean tothat is by the file so that's the reasonwhy that file name is really longer thannormal and there is a three option Iwould like to explain that thisprogressing data second is the 300 andthen the 600 it means that 10 10 minutesit's defort option but you can changethis because it's the integer you cancheck this as I said at the cubiccontroller they explain and uh specdeploy spec deploy spec and progressdata second there is show up to thatwhat is deport and what is other optionis exist if very easily check thesecurity explain and the other is uhrevisions there is a deport option is ahand but if you the revision is reallylots of them feel like that is so youcan change the three or one whatever butyeah Kubernetes things like this 10 is aproper yes very common to use that orgenerally propose that and last oneactually today's is option yes explainthis option is the max is a 25percentage and you can put in that theinteger or string is yeah you can seethat this integer 25 is integer butstring meaning is that you uh can put inthe 1 2 3 4 5 it means that just rollingup that one or two three four like thatbut mostly common to use that thepercentage better because we don't guessabout how many uh part or pod will bedeployed it so that's the reason why thepercentage is more appropriate andcommon to use that and also as I saidthis I'm really enjoy or I'm really loveto drawing something so uh baseline thisparty is deployed at the 10 10 p isdeployed it and then I'm going to showto you that there's a three example orthree there cases max surgery 10 maxsurgery 25 max surgery is 80 there issome pro and cons side pro and con sideyou may guess yes rolling updatesperspective if you're a rolling updatemax surge is really fast but you knowwhat this max 10 meaning is that onlyone by one one by one one by one is verylow yeah but uh if you to have a proiteuh in my perspective is every kinds ofevery kinds of symptom or other thing orpenins there is a pro and consentthere's nothing to silver bullet peoplethat there is nothing to all the time isgood so the pro uh the opposite side isthat overhead overhead you may guessbecause it's similar to the blue greenor some something like this similar oneso overheads max is 80 percentage ishigh but max surge 10 is that lowuh IE is there is the 25 percentage isone of the common to use that orgenerally purpose to good uh internthere is a middle way to all the time isgood but I don't know that the westernculture based but you know you can seethat this town is not sometimes it'sgood sometimes bad but and Max surgery80% sometimes good sometimes bad but25%age you may guess that when we aredeployed at the work node work node istwo %or three is a generally purpose thatthat's the reason why this max 25percentage is really good to work yes Ican show you that how is this goodbecause my work is normally this threeso you can see that this real exampleyeah xi that's the recording version yesthere is a two this uh two manifest willbe deploy I mean that the three manifestto deploy is first one is that the sameas that version same and also uh beforeI preparing that I would like to saythat this live demo uh so I prepared towrite the docker hub from that the pullthe image from the do hub but you knowwhat this do is very limited to thepooling and then if I loing updatesrepeatly Maybe I'm consuming that all ofthe resource so that's the reason whyI'm change the do to the key yep that'sthe reason why this key is over there uhand the version is1.226.2 so it's basically I will deployit and check it in here the script isalready done to the the right right sideis that the watch and the remove thehead it means is that you can see thatthe toers is the toer is a 10 is overthere so to will be changed you cancheck that is how many pot is deployedordeployings so I'm going to be the kubapatch the max surgery 25 and the versionchange1.2263 yeah after that deport to optionuh the max surgery 25 so it means thatthe rolling update to three or fivesomething like that after that is doneand the others I'm going to deploy I'mgoing to deploy at the max surge80 and I'm going to keep a patch uh 126.0 zero because it's a change theimage and then putting down that theimage that each of the workernode if the max 80 there you can seethat the total uh I think it too smallin here but yeah I highlighted overthere you can see that the 80 and thenis the actual time I'm the right almostlive demo you can see that the feel likeis very similar very same max 25 and max80 uh in the wo 3 uh within The welcomethree condition is the worlding updatetime is very similar how about this max10 max 10 uh at the moment you may guessthat or I already say that is a bit slowyou can see that the actual example so1.6 1.26.0zero and then you can see that the totalnumber is one and change the one two onetwo and then yeah it's the same time thehighlight after the even though it isnot finished that so if it is the maxsurgery is 10 this ruling updates isvery uh it need to some more time tolower than to the some max some 25 orsome 80 that's the reason whyuh it depends on the how many workloadis deployed But normally this workloadis three or four or some yeah there isreally lots of this cluster and thenreally lot of the workload is exist butI said that this very commonly used orgenerally purpose that so 25 is reallyproperly good working with the workloadthree or four that's reasonwhy configuration option is a25 yepuh last thing is that the serviceservice have a depart uh known code isexist so I'm going to be the deployedthe this service code and then there isincluded like this there's so many thisunknown code but I think the you made itinteresting about the code so first isthis the external traffic or internalpolicy this is a cluster but the otheris a localu you might guess is it's really tochange that it's a telco or othersomething latency and then keep it thesource IP the purpose that it is one ofthe good option but normally clustersthere are traffic is traffic going to besome in and out so if it is a changethat there is a severe the problem ishappen whether or not you may uh somereally some consider to use that sothat's reason why this cluster option isa deep and the other thing is a singlestack The single stack have a threeoption is the others there are the dualstack or propers or require to the dualstack but you know what this day this IPversion six is one of the popular in ISPbut normally IDC or in my uh in myworking place IP version six is one ofthe popular so that's the reason whythis single stack is a port commonlyused or generally propose that uh lastone is the session affinity is non isimportant and then client IP is anotheroption probably you may guess that isthis similar to the sticky sticky IP orpermanent per persistent session yeahsticky session or persistent sessionsomething like that so is very you lookslike it's very user option uh userprobably to the uh some of the shoppingmall or something like it to use thattheir session is a permitted to use thatbut Even though Kubernetes perspectivesbasically when when deployed at theservice to the ser loadbalancer load balancer work as a loadbalancer but uh when we are set up forthe session appear it doesn't uh worklike as load balancer like this you cansee that this end point is even thoughthere is a three end point just only tosome some message to that there's oneend point so I preparing that examplepool is it very short maybe yeah it isone of the rest yes you may know aboutthis though what I do there is atopology spread constraint because Iwant to deploy it very even to the uhwelcome to I mean that the part to thewelcome so that's the reason why I usedthis topology spread constraint and I'mdeployed it is the twoservice one is that include that uh oneis that the non uh some session ainityThe other have a session affinity clientIP so there is a port is the 11 and theother one is the 12 so you can see thatthe external IP is over there and I'mpreparing this core script is there so Igoing to be the call I mean this methodis HTTP 2 11 and then 12 you can seethat this when I send the message to the11 yes looks like it's very loadbalancer but when I send the 12 they arenot they the the road is it's not uhworking like load balancer just for thesender the message to the one end pointso it means that is now road balancer Imean that this session opinion is clientis bad you do not use that but fullyunderstand when you areing update it orwhen you are using that deploy storagelike a scanner or something like this orso you want to see that this one point Imean one end point to that there's amessage and then get back to the messageyeah those kinds of purpose is good butcommonly or general purpose yes many ofthe end point is good to work for thebarance that's the reason why option isthat the session affinity none so recapyesrecap Yeah down to the Kubernetes or whyis that the Kubernetes hub and then whyis this designated the specific stringand an integers yeah I'm repeatedly saythat generally proposed commonly to useand also as I said it made by you yes Ithink it is really I'm appreciate thisyour all of them is you are somediscussing debating or contribution likethis after that make some consensus andthen there is a more commonly to usethat those kinds of option so when youare deployed there something is additedlike some it's similar like mutating isby the kubernetes community kubernetescommunity recommended this kinds ofoption this kinds of some uh some valueis is properly working is or highcomparably uh level and then yesum may guess oh there's so many theoption and how can I learn or from theyour junior or coworker whatever somesomething is your some nearby the peopleor engineers I got some restaurant orsome I got a some answer from the uh uhUm so Alice in Wonderland books thecheshy cat said is there some answer youmay found it chesh cat said there's areally answer how should I run run hasthe two run actually that r and l thereis answer is here so if you are walklong enough you can be wrong this howmany option and how many this option isreally properly working So I think itthere's no any silver bullet kings waywhatever is there is not exist eventhough the AI era I think even AI eracore core technology or core knowledgeis human basis knowledge is very verymore important than the AI era is someseason I guess so chess cat has reallyuh proper I mean that brilliant idea togive it us and yes if you have anyquestion and live demo I'll be sureafter the session and then please let meknow all of them after session and thankyou for your all of the staying thisseated and then receive for my sessionthank you so much yep2025-04-15 22:00:27.585555'tributed to theFPN family and ever since then everybodywas asking us like when are we going tohave a story about Izzy and other FIPand friends so that's what we exactlydid in Salt Lake City the first book wasactually releasedum in the CubeCon at I think the CubeConSLC the book was officially launchedthere but then uh I was not able totravel there so we both decided thatokay we will do a book walk through herein London and if possible a book signingevent also together withthis so we would like to sincerely thanktheCNCF who helped us uh we okay wedesigned and uh we developed the storybut the whole support was from the CNCFside for the illustrations and design sospecial thanks to that and yes so thestory when we started thinking about thestory and uh the plan was to release thestory in SLC so the first thing thatcame to our mind was about the 10thbirthday of Kubernetes and uh so wewanted to clap the story in those linesand uh so basically your service mesh isabout security so the story we starteddeveloping it like is he saving thekubernetes 10th birthdaycelebration yes so we are trying todemonstrate the basic features of ISTOservice mesh with the simple concept ofhaving a birthday bash and some piratesor intruders getting into the ship tosteal some of the expensive birthdaygifts so the pirates steal someunattended invitation cards to and makecopies to enter the party so this iswhat can happen in your cloudenvironment too so if there are thingswhich are not attended properly therethere is a risk of identity beingimpersonated and once a maliciousapplications or these pirates get holdof things it's like it can be dangerousand uh they can overhear yourconversation to understand your secretsand things like that so how do we solvethisso actually these are like the pagesfrom the book but it's it's okay anywaysyou're going to have this book free fordownload in the CNCF store so we wantedto just have a book walk through here uhthis so it's like the cloud nativecommunity similar to the VPN family havesome proactive members who are actuallyalways observing and monitoring what ishappening in your ecosystem so here it'sAlina alina has already reported nowthat there are more members on the shipthan who has registered so and shepasses this information to Izzy andCaptain Cube so Izzy is a securitymaster so she already observes thatthere is some problem with the shipbecause the ship has many entrances andit is not secure by default so just likeour Kubernetes clusters so now let'sjust take a pose and see how this mapsto the service mesh concept andunderstand how the story so far isdeveloping so so it's basically like herepresents the STO service mesh so whatis a service mesh so many of you alreadyknow so it's basically like in the micmicroervices architecture when you haveapplications talking to each other aservice mesh helps you manage thecommunication between them so basicallyit's like you assume the ship in ourcluster and each family member is liketalking to each other and we are likelyneeding a mechanism to moderate theircommunication as the situation might goin danger so that's what is beingillustrated in thispicture so now we are getting into thezero trust architecture concept so whichthat is what Izzy is an ambassador of soit's like Izzy is trying to educateCaptain Cube about the importance of nottrusting anybody by default so becausethey may not be who they pretend tobe so that's what zero trust security isabout so once he explains theseriousness of the situation CaptainCube also decides to not to trustanybody by default and he also tellssomething it's like Captain Cube has anidea to mitigate the situation so let'senable service mesh he says all inviteesare asked to go to the closed mesh partyorganized on the ship deck and beforeyou can go you have to get youridentities verified by thecaptain so just as captain cube becamethe verifiable authority for everyone'sidentity so identity in securityarchitecture is also a very importantconcept which is required to be verifiedbefore the applications can transmit andreceive the data secure(lyso this is like the first check identityverification which sometimes can stillbe bypassed and that exactly happens inour story too while one pirate wascompletely trapped because of uh thischeck two others managed to get in bystealing somebody else'sidentity so now we are getting into moreof the easy dolphin securitycapabilities he's able to enablerestricted communication policiesbetween the guests and he restricts thesecond pirate from talking uh as he didnot identify him as he who reallyclaimed heis so now the third pirate who bypassesall of these checks managed to talk tosome of the guests andunderstand like where the expensivebirthday gifts are stored in the shipand he decides to stealthem so far the concept we learned afteridentity is like the access controlwhich are the policies which we can setin your system to manage communicationbetween applications this is what we canuse service mesh to setto so now we have learned about bothidentity verification and accesscontrols enabled in the ship but Izzy isnot satisfied with all these things hewants to monitor everything all the timelive to make sure that everything issecure as expected so that's whereobservability comes into picture so hekeeps on observing everything and hecould see that the pirates are trying toaccess the room with their badges so yesthis feature as you know is calledobservability this is again provided bythe service mesh so it allows you toobserve what is happening in yourcluster so that you can take immediateaction if something is not really goingasexpected so that's another things andthe last feature what we are going totalk about is the traffic management soyou see is capable of dynamicallychanging the routes within the shipwhich can completely trap the thirdpirate when he tries to retrace hisroute back to his boatso it's like all exit roads are alreadymodified by easy and everything led thepirate to a cabin below the deck so thatis like so that's what trafficmanagement is about so you are able tocontrol who is talking to where and atthe same time you can control how theyare going to talk and you can alsocontrol the communication between theapplications and you can decide wherethe traffic should really goto so and this way the Fippy and Francemanaged to expel all the pirates fromthe ship with the help of easy dolphinsuper capabilities or the STO servicemesh features which are like securitytraffic management andobservability and the party resumed andthat's much that's all the story isabout it's a very simple story it's likein the very basic concept so that eventhe kids can understand so if you havekids at your home so if they keep onasking you like what do you do at yourwork and it's tough for you to explainplease use the book and uh please uhexplain to them and it's not justbedtime story for the kids but in youroffice if you have newcomers who wantsto understand about these basic conceptsyou can definitely use theseterminologies to map and now Lynn isgoing to do a demo so that youunderstand how it maps to the real worldscenario all right thank you so muchPriscilla so let's see uh is in actionwith easy um so the first thing I'mgoing to do is uh let's take a overviewof the demo environment uh so we'regoing to use uh Kubernetes of courseCaptain Cube we're going to usePromemesis uh I built the demoapplication myself using Python andStreamlight we're going to useKubernetes gateway API we're going touse Kayali and um this is my clusterthank you so much for that smoothtransition i appreciate that so I havethe demo application deployed on mycluster um you guys can see in the backright yes all right so I have the clientapplication which is a cur commands uhallow me to get into i have the demoapplication which interact with thelarge language model running outside ofmy cubernetic cluster which is alsorunning locally on my machine through alama i also have uh two of my rag moduledeployed which I'll explain to you why Iwrote those and uh in my cluster I alsohave ISTTO uh with ambient mode uh forthose of you who are not familiar withISTO ambient mode is the )new data playmode we introduced in ISTO to enable yourun service mesh uh run your workload inservice mesh without site call so I haveambient installed i have the zero trusttunnel which provides the layer fourfunctionality mutual TLS simpleauthorization policy per node installedum as part of ambient i also have umkayali and permisses uh that I talkabout installed i also have a couple ofwaypoint proxy deployed so the waypointproxy you can think about it it's a it'snot a psychop proxy it's more about agateway for your particular tenant scopeyou feel comfortable in this case thetenant scope is the namespace so Ideploy a waypoint proxy for my defaultnamespace and also I have a egresswaypoint proxy that controls all theoutgoing traffic to the external umlarge language model which is the Alamarunning on my machine so to save us sometime uh let's take a look at thenamespace oh sorry I probably did a typohere somewhere uh let me type um I thinkit's namespace so this is our ship okayso if you look at the namespace ondefault I already have the namespacelabeled with uh isto.io data plane modeambient and I also specify for thisparticular namespace I'm going to usethe waypoint deployed in the default uhnamespace if you recall we also haveegress um namespace and it has anamespace it has a waypoint also thereso similarly I labeled uh this namespace to be part of ambient and have umspecify this name space to use thewaypoint so it's very easy just labelthe name space no need to restartanything all right so let's jump intothe demo application let me make sure Ihave so I have the four wording set fromis ingress gateway to my local port on8099 because Wi-Fi is not great soeverything is running locally on mymachine so finger crossed it wouldactually work fine the second thing Iwant to show you is I do have a simpleHTTP route config on my machine it'scalled uh demo HTTP route so this isusing the new Kubernetes gateway API andessentially confix uh the routes um whenthe traffic arrives on the isto ingressgateway when it arrives on port 80 uh goahead and forward to the back end of thedemo service um so let's go ahead visitthe demo application um so this is thedemo application so the first thingwe're going to do is uh we're going tochat with uh the llama which is runningon my machine outside of my Kubernetescluster so we're going toask what are the top five things to doin London since we're in London sofinger cross uh what's going to happenis going the traffic is going throughthe is ingress gateway well that'sfaster than I talk and clear my demoapplication which go ahead send arequest uh through uh waypoint on theegress uh because it's a egress trafficthen talk to um the llama uh on my localmachine so it says a couple of things sowho actually visit some of these placesi assume most of you at least one placeall right so this is great right thenext thing I want to ask is what isambient mesh um so we all know AI largelanguage model sometimes is veryintelligent but not always right so inthis case it's actually funny they aresaying you know MV mesh is a new conceptand it says about networking mesh pointsit's nothing related to like withoutside car or service mesh apparently itjust didn't understand what ambient meshis uh the other fun exercise I wrote isuh by the way this is all runningambient because the label I had on thedefault name space so what we're goingto do is uh we're going to analyze apicture which is what Frisilla showed uhthe cover of our book and we're going toask the nava mo uh model which is hostedin a llama to say is this pictureactually generated by AI um so let's seewhat it says um is basically it didn'ttry to understand the name of easy verywell and uh it understands it's achildren's bookum and it actually says um it might betoo small for you guys to read let memake it a little bit bigger it actuallysaid uh it's not possible to confirm ifthe image is entirely generated by AI orhuman artist so AI doesn't really knowin this case and I think for it waswritten by it was generated by an artistinstead of AI right if I remembercorrectl*y all right so we have trafficcoming through now um let's see if wecan visualize how the applicationactually works um so in order tovisualize the application uh we have uhis collect metrics out of the boxwithout the application needs to doanythingso in this case I enabled a couple ofthings um so you can see the trafficcomes in from iso ingress gateway andthere's tcp traffic there's http trafficyou get metrics associated with thoseand these are the last 10 minutestraffic it comes through the demoapplication and through the demoapplication we go through the traffichops through the waypoint uh that's onthe egress site and uh before reachingto the external alarm service running onmy laptop outside of my kuberneticcluster this is how easy uh supportedthe third pirate through observabilitythat's a great point so this is the theobservability uh she was uh justmentioning all right let's see somethingin security land because I think that'suh the most important thing uh shecovered so uh in order to see security Iwant to show something quickly is uhthis is my lama running on my localmachine you can see I have a couple ofmodel alreadyloaded and uh one thing I want to do isif I can remember the command umactually I might have to I can'tremember the command very well so uh I'mgoing to cheat a little bit um so Ithink it's called uhdocker uh actually one one okay so uhwhat we are going to do this is whensearch comes handy when you are on stageum so what we are going to do is fromthe client uh which we mentioned it's acur command um we're going to send arequest to a llama and trying tounderstand what are the models you cansee it has access right so not only frommy laptop I can call uh a llama from theclient running in kuberneti I can alsocall a llama right so what if I don'twant to enable that function so what weare going to do together is we're goingto deploy a authorization policy uh sofacil identity right so in order for themesh gives uh each of the pod uh runningin the mesh uh proper identity throughspiffy and uh with the proper identitywe can establish mutual TS between thecommunication among the pod we can applyauthorization policy not only on simplelevel but also um um on the past levelsfor instance in this case we're sayingin order to access the alarm uh runningexternally outside of my Kubernetclusterum by the way I do have a service entryto that which defines from the clusterI'm allowed to access the lama um I'mgoing to only allow my demo applicationand also So my rag application to accessbut anybody else wants to access a llamait can only access through um theslashget and nothing else but if I'm ifthe source is the client is if I'm sorryif the source is the demo or the rag uhprinciples then they can access a llamaanyhow so let's go ahead apply thatauthorization policy uh if I rememberokay that is the authorization policy wejust apply so with that um I'm going torun the same call uh through API te uhtext now you can see it's alreadydenying access right and if I do um juston the slash without API tax it discoveralarm is running but it can't doanything besides what I'm allowed tocall uh specify in the authorizationpolicy now let's go back to our demoapplicationum in this case with the authorizationpolicy we believe the demo applicationshould continue work and we should beable to continue talk to a llama rightso what I'm going to do is uh I'm goingto connect the application to my phonehopefully it works and I'm going to takea picture of you guys all right um sowhat we're going to do is we're going toanalyze the mood in the audience anddiscover if there's anything unusual inthe picture all right uh I hope you guysare excited engaged see yourself on thestage all right what I'm going to do isclick on the take the picture and what'shappening is the demo application isgoing to send a picture to a llama whichhas a lava model which can understand uhimages and it's trying to show what theimage is and um let's see it'sunderstand it's in a conference and uhit's uh unfortunately it was difficultuh to to guide your expression I guessyou guys are not excited enough anduh and it complains the picture has alittle bit low resolution all right allright um so the last thing I want todemo is uh remember we couldn't ask uhAlama what is ambient mesh so what I'mgoing to do here is um I'm going to loada txt file uh to ask so you can seesometimes it can load um the txtfile and sometimes it can't the reasonis I have uh two rag module uh servicerunning in my kubernetes cluster one ofthem can understandum pdf and txt uh which is version twoand the version one can only understandPDF file so this is what you are seeingof the behavior of round droppingbetween the two based on the weights inKubernetes um butfacil function service mesh provide isbe able to precisely control the trafficso let's see how we can do that togetherby leveraging Kubernetes gateway APIwhat we can do is deploy a object calleduh HTTP route and in this object whatwe're going to say is um when thetraffic comes in slashum with prefix with slash and onto therack service please route um 75% of thetraffic to version one and 25% of thetraffic to version two and if there is aheader uh called X roll out canaryalways route to version two so let's goahead deploy this um routeobject so deploy uh I think it's calledrag route right okay so the route isdeployed and let's see this in action soif I can type so what we're going to dois uh we're going to do this is notexactly what I want i want toupload and uh this is the one it doesn'thave the control so you can see it'sgoing to go to you know a goodpercentage most of the percentage intoum into um so we're going to do the darklaunching test if I remember correctlyis is it X canary roll out let me doublecheck sorry I don't remember the headernow uh okay it's X canary roll out uh Xrollout all right so if we do this I wouldexpect to always go to the version tworight this is what duck launch is forright i can guarantee the behaviorthrough traffic shifting sorry my leftfinger um so I can guarantee thebehavior through um through trafficshifting to make sure it always hitsversion two and when we are happy withthis uh we can always go back and deletethis and make version one to be zero andmake version two to be100% and then we can deploy this uhpolicy to say you know I always want togo to version two so if that's the caseI don't even need the header and I dowant to go to version two 100% of thetime which you can see it's happeningall right so let's see if our rag moduleis working let's go ahead upload mbesmesh.txt it's working now this is themoment right language model doesn'tunderstand specific documents or contextso we trained it through rack nowhopefully hopefully you can find someanswer for us all right ambient mesh isbuilt on the concept of a traditionalservice mesh but with a simpler lighterarchitecture that removes the need ofthe psychop proxy are you pretty happywith that simple description all right Ithink that concludes actually let meshow you one more thing uh this is thedashboard looks like um so you can seewe had some challenge in going to rag uhversion one right because the uploaddidn't work um so this is the power ofobservability that shows you everythingthat's actually happening and uh mutualTLS and the policy enforcement it's allshowing up here which is pretty nice umthat's the power of service mesh withoutus using any of the side cars uh permisis also running and uh you should beable to get the metrics but I'm giventhe time I'm not going to show that butthe permissus metric is what's uhpowering the the dashboard for us to beable to show what's going on uh with ourapplication all right with that um do wehave the books doesanyone We do all right thank you awesomeso I think that concludes the end of oursession if you are interested in gettinga copy of our easy book for your kid orfor yourself um please stay around wehave a limited version of 50 copies 5050 books available for you to get themso we would love uh for you to get acopy and take it home signed copy we cansign yeah we can sign them too thank youso much really appreciate you2025-04-15 22:00:28.508560 � ��3�e#��AmtqUtbMaSDwapologize for the technical difficultiesbut we are going to go ahead get startedso FA and I here are going to talk abouthow easy saves the birthday and mostlyimportantly we're going to show you alive demo after we briefly talk aboutthe book a little bit introduction aboutme uh I travel all the way from KerryNorth Carolina on the east coast ofAmerica to here and I work for solo.ioas a head of open source at solo.ioi learned about this yesterday we arethe 10th largest contributor to all theCNCFprojects hi everyone so I am Facila amember of the CNCF TOC and a cloudnative developer at Ericson uh I'm alsoa CNCF ambassador and an LFX mentor andalso the co-author of the book we aregoing to discuss todayyes so many of you already know aboutthe fippy and friends it's basicallylike a family of illustrated characterswhich was initially designed anddeveloped by Microsoft but later wascontributed to the CNCFthe basic purpose of CPN friends is todemonstrate the Kubernetes concepts in avery basic manner so that it's easy forall of us to understand not just us butwhoever is new to the cloud nativecommunity for them to understand uh inthe most simple possibleway so earlier in 2024 so service meshwe also designed a FP character calledIzzy izzy is the dolphin you see uh onthe ship so Izzy was designed and theillustrations we we con&- andthat's our platform we have about 32minutes left so let's build one livehere today so we need your help thoughbecause we're going to go through somesystem design choices choose which onesto use and Victor's going to build alive demo as we go so maybe he's goingto try uh we're going to create we'regoing to use one for API managementwe're going to go through a technologyfor policy we're going to talk abouttechnologies for one-time tasks andwe're going to add a technology thatadds a guey a graphical user interfaceto our platform and then we'll livehappily ever after so here we go let'stalk about APIs and state management thetechnologies we're going to evaluate arecrossplane and QLA but first let's talkabout uh what it even means why do wehave API management but we're going toeven take a step back from that and saywe should build our platform withKubernetes APIs so why do we wantKubernetes as part of our platform wellnumber one Kubernetes is an industrystandard i think since we're all atCubeCon I don't have to convince ally'all in this room and then I'm Texan ifyou can't tell from my y'alls uh so uhso it's industry standard and also we'regoing to take advantage of thatKubernetes synchronization loop so wecan use Kubernetes resources todeclarative declaratively define statesand then it's going to make that statean actual state but here's the thingwe're going to use Kubernetes to managestates even if the thing isn't inKubernetes so just because we're usingKubernetes as our platform doesn't meanall the platform capabilities have torun as Kubernetes in Kubernetesand then uh uh fin the third reason isthat we can use everything in theKubernetes ecosystem so our developerssay hey we want a cluster so we need tobuild a Kubernetes API the first thingas platform engineers we're going totalk to our infrastructure expert who'salready forgotten more about clustersthan we will ever know we're going toleverage that knowledge work togetherwith her and then make a Kubernetes APIthat the developers can use to provisiontheir Kubernetes cluster and we're goingto do that with a tool like crossplaneor clusterAPI so our developers say "Hey I want acluster." In this case they're applyinga manifest to get that cluster and thenour tool like a cluster API or acrossplane will take that act desiredstate and then actually make say a GCPcluster and that's running in the cloudcompletely outside of our Kubernetesmanagementcluster so why does it need a to be APIsthough like we understand why Kubernetesbut why APIs apis you can wrap APIs inany interface so um if you have an APIhere's a great example we havehyperscalers so if you have um Googlecloud for example if you want to consumeGoogle cloud resources you could do thatthrough the G-Cloud CLI you could dothat through their web console you coulddo it through Terraform modules but allof that is going to the Google a GoogleCloud API and we want to have the samething as our platform APIs and then wecan wrap them in interfaces to bringthose capabilities to where ourdevelopers are and then the other reasonis we can use those APIs as buildingblocks so we can make one for Kubernetesclusters make one for database make oneto run our actual application and thenwe can actually can put those alltogether and integrate them and offerall of that we could offer a cloud uhdevelopment environment with theapplication pre-installed alreadyconnected to a database behind a singleAPI to our developers so finally thetechnology crossplane with crossplaneyou can take any API on the planetinside and outside Kubernetes a SASanything pull it into Kubernetes andmanage it as a Kubernetes API so youhave that synchronization loop you canalso uh compose lots of differentresources and integrate them with eachother and you can offer a simplifiedinterface to your developers now withall of that power comes a lot ofcomplexity so the drawback of crossplaneis it's complex and it can be difficultto learn and use on the other hand wehave cubea now cubeella is a focusedtool it is meant to be for applicationsso it gives you uh kubernetesabstrac.tions so that you can define andmanage your application as part as uhwithin kubernetes so you have tools likean application that has components andthat has traits and a lot of communitysupport so it's a focus tool just formaking an API around applications andfinally let's vote which one would youlike to see Victordemo oh it'strying to catch up huh uhhuh uhhuhshall we call it a day uh we have 187it's the votes are still jumpingokay cross plane it is cross lane it isuh let me switch the Okay uh okay so uhlet me just douh few silly things uh that I need forthe later just to know what I'm doing uhignore don't look what I'm doing cool uhokay so we need a repository we need adatabase for an application we need anapplication right i'm going to try to gofast uh repository part i already uh didthat in advance the rest of it will belive um and I will show you to you uhGitHub claim uh GitHub uh GitHub claimuh ofuh what's the name what's the name cncfdemo up I thinkoh no Victor there we go right so thisis a claim that created a repositorywith everything I need for theapplication you can see there you canread the files right created repositorysome files pull request and so on and soforth so uh I cloned that repositoryCNCF demo there we go i'm inside uh Ihave a source code of my applicationthis is a very very complex goapplication written in 20 lines um andnow first I want a database right howI'm going to get a database uh I canwrite YAML but I'm not going to do thati'm going to uh copy the file that Iprepared in advance i will show you whatthat file is i'm going to put it insilly demo db password[Music]umyl and I'm going to copy another filei'll show you what it is so don't noworries about that uh this one is goingto be called uh like this cool uh and uhI'm going to apply those two files andthen while it's running uh you will seewhat's happening and what's happen andwhat's not happening and so on and soforth oh I want a production apply umfile name uh file name where did theyput it apps uh passwordcool i'm going to do this this is adatabase now while it's running let mebriefly show you what happened reallyapps silly demo database i I have aninterface I created called SQL claimthis is how the developers in myorganization are uh requesting databasesthey have no idea about subnets VPCswhatso not right this is a custommadeAPI and uh it has some parameters thisis the version I want i want smalldatabase because nobody knows AWS sizesi want it in this region and I want twodatabases inside of that server rightand as a result of all that beta traceuh SQLum SQL claim no namespace production SQLclaim uh there we go where what's thename of the SQL claim uh my how did Icall it how did I call it[Music]uh Iforgot i forgot okay let's do it likethis getmanaged eventually you will see what'shappening right uh what you will see ina second is that you will see all theAWS resources that were created when Iuh created that database and since Ihave do I have time i have no time i'mgoing to carry onum okay now let's talk aboutapplications right um I'm going to uhwhat am I going to do oh there we gothis is my database right that fromthose 10 lines of YAML that were createdI got VPC I got subnets I got securitygroups I got everything that I need as adeveloper right from 10 from that fromthat custom API now let's talk about uhthe application itself i'm going to copythis uh crossplane app YAML uh I'm goingto copy it to apps as well and I'm goingto call it silly demo uh YAML cool i'mgoing to apply it uh no uh namespace productionapply apply file name is apps silly demoyamlcool while I'm applying I'm going toshow you what happened what did I haveagain this is a customized interface forapplications uh developer emergency saysI want an application i don't knowwhether I have to have uh deploymentservices ingresses and c network i'mjust specifying that I want it inproduction this is my image this is mytag and so on and so forth right and theresult of all that uh is that uh Ishould get a new PR PR created in my uhin my repository still not uh okay thatwill take a b/it let's take a look atwhat I got in productionctl namespace production get all andingresses still notrunning still not running what did I dowrong comeon okay my demo is failing from thefirst attempt uh description up uh sillydemo comeon silly demo uh uh uh uh uh uh uh let'ssee name space cube cutle namespacedoesn't existhow descript ah description describe isnobody paying attentionup okay branch C returnannotation to distinguish them should befine by now let's see get upsuh get allresist still nothing uh last lastattempt branch silly demo branch branchbranch branch set different metadataname uh let's me see uh code this is thelast attempt I promise apps silly demowhat did I do wrong uh silly demo sillydemo looks okay looks okay looks okayname DB secret repositoryenableuh I'll let Whitney speak and then I'llthink about it i'll get to it i promisei promise all right this is excitingokay here we go so next up is admissioncontroller policy we're going to talkabout uh validating admission policyversusKyiverno so our developers now they canmake themselves a cluster in aself-service way that's amazing what ifthey accidentally with mistakes ormalice ask for a cluster with 300 nodeswhat's going to happen so we need to putpolicy in place so that uh this sort ofthing can't happen so if we're usingvalidating admission policy which isbuilt into Kubernetes we can put rulesin the cube API server so that theKubernetes API server maybe it's onlyallowed up to five nodes it's going toreturn no and then it's going to sendthat back to the developer or if we'reusing Kyerno and not validatingadmission policy then it goes to CubeAPI server cube API server is configuredwith a web hook to send that to Kyivernokyerno is like oh hell no and then itgoes back to the developer and likesorry you can't make a 300 node clusterso that's in our use case but also youcould use policy for lots of other stuffso to prevent bloated resources you canuse it to stop um outdated dependenciesyou can use it to make sure that ourapps don't pull from untrusted containerimage registries that sort of thing andthat's really just the tip of theiceberg the two technologies versus eachother validating admission policy isrelatively new in Kubernetes and it usescell which is a common expressionlanguage which is a relativelylightweight way to be able to definewhat what is and is not allowed in yourcluster um it also at this time onlydoes validation you know hence the nameuh although they are working on amutating one so Kyivero is a more maturetechnology that's been around for a muchlonger time it does work with the celllanguage the validating admission uhpolicy does but it also has Kyverno JSONquery language which is meant to lookand feel like YML and that's what it'sbeen using for a long time it has hugecommunity uh support it does validationof course it also does mutation you cando resource generation it also doesresource cleanup so it's really a muchmore mature technology and what reallythe best way to go is to use validatingadmission policy until you run upagainst a use case where you needsomething more full-fledged and thenmigrate to Kyiverno and Ky Kyivero makesthat really easy so those are ourtechnologies and now it's time to votewhich one which one do you want me tofail nextit's a close oneokay one more votegive it a little timeokay three two onecover okay let's do it i'm going to undowhat I did before and we are going tostart over uh I'm going to delete theapplicationdelete uh and I'm going to push stuff Ichanged so far get add getcommit something uh and get push coolokay now let's talk about uh go back andtalk about what is what did they choosekerno cover okay yeah cover right so Iprepared a few policies for you here cutpolicies coveridp crossplane are we using today yesokay so here's a policy that we willapply that will be very very simpleright it's a cluster level policy thatapplies to where wherever or whatever uhit will apply only to something calledup claim that's the interface I showedyou earlier uh that allows developers tocreate applications with everythin0grequired for the application it will beenforced only if somebody tries tocreate or update and the rules are youhave to have the scaling field set thatfield needs to have enabled sub fieldand enabled field needs to be true rightand this is the text that people willsee if they mess it up and the samething again but says you know what thescaling needs to have the field min setto greater than one right so you need toenable scaling and you need to uh yeahenable scaling and it needs to have atleast two replicas of an application sofar so good right easy so let me applythis cube control uh cube ah cubecontroluh apply file name policiescool okay now uh let's try again toapply the application that uh almostworked beforeright uh now this doesn't exist uh okayyeah cncf demo up andthis okay now this time when developertries to apply application there is thismessage hey it must be set to true youhave you have to have the minimum numberof replica set to two uh to two orgreater value right cool then thedeveloper does something like this justto using Visual Studio Code CNCFuh CNCF demo app uh apps silly the demoi will go and what was the first rulethat I said um scalingscaling I need to write uhscaling scalinguh andIenabled true if I remember correctlyright now if I apply the applicationagain it still complains but there areless complaints now you did the rightthing with scaling you need to put theminimum number of replicas to somethingelse let's modify the applicationuh min I don't knowfour right apply and this would work thesame whether it's from Argo CD flagsdirectly from backstage whatever youwant right now people cannot mess it upand you don't have to watch over theirshoulders inspire them with webcams andall the other things that you're doingif you're a bank go continue this timeworks one out of one worksso now let's talk about one-time taskwe're going to talk about Argo workflowsin Tecton so basically uh we we helpedour developers create resources andmanage states but what about thosethings that developers need to do onlyone time how can the platform help withthat so we're talking about things liketesting application code buildingapplication images pushing that image toa registry updating configurationtesting that new configuration and thenadding it to a GitHub repository or gitrepository which altogether is oftencalled a pipeline how can our platformhelp withpipelines so our the fir there are acouple ways the first way is ourdevelopers need one and we just givethem a template they can take thattemplate they can modify it how theywant and it already has our company'sbest policies and best and securitypractices baked in and then they go runit wherever they run pipelines becausethat's already been figured out at yourorganization and that's all the platformworries about or we can take that a stepfurther and then we can uh give them aplace to uh we can manage the toolthat's going to actually run thepipeline and we can give them thecompute to run that pipeline and so ifthat case we have our Kubernetesmanagement cluster here when ourdevelopers want to make their p uhpipeline first they have a pipelinedefinition that we apply to the clusterand then nothing happens because apipeline needs to be triggered and sothat can be triggered by a chron jobmaybe or most likely by pushingapplication code and now our pipelineruns and then when it's finished runningit's not using any resources and thoseare pipelines now Tecton is made forthis use case I'm describing where onething leads to one other thing leads toone other thing leads to one other thingand it can do more advanced use casesbut it's pretty um straightforward anddesigned for that CI use casespecifically argo workflows is designedfor more advanced like a IML kind ofworkflows where it's directed as cyclicgraphs where um one thing might lead toseveral other things which might each ofthose might lead to several other thingswhere your workflow instead of beingcalled a pipeline it's called a workflowand it can get much more complex it alsooffers things like loops andconditionals now techn1ically Tecton doesthat too but Argo workflows is defineddesigned for this use case and Tecton'sdesigned for the more simple use caseand as such Tecton is easier to getstarted and use at first and workflowsis more complicated to to use andunderstand and with that which one do wewant to see Victor demowhich CNCF projects do you want me todemo of those two yes yes ourour stuff is limited in scope to CNCFcount it i think three two oneclose okay so um let me first go back tothe repo and just push if I haveanything to be pushed add commit uh pushcool now going back i prepared aworkflow over here uh Argo YAML this isa a simple one for being Argoworkflows the shortest workflow I couldcome up with you chose it so don't blcomplain few parameters right this isthe repo I want to build from registrystuff like that i'm going to check outthe code i'm going to generate the tagi'm going to build an image using Kicoi'm going to run some unit tests and I'mgoing to execute what I call githubshere um and uh I'm going to push changesback to git now each of those isseparate template this one generates uhthe tag uh and so on and so forth andthe the git push no not the git push butuh wherever it was somewhere I had theone that actually basicallyuh ah it's somewhere that it modifiesultimately builds an image pushes thisto to registry modifies my yl filepushes changes to git uh and then argocd or flux or whatever you're using canuh pick it up so uh now in a real worldsituation you would trigger it fromGitHub from GitLab whatever you're usingin a slightly less real world situationyou will go through a couple of hours ofpain not understanding how to do it uhbut once you do that's that's how itshould work now I will skip that partbecause mostly because I don't have timeuh and I'm just going to run it bysaying submit this the same as if it wastriggered from GitHub submit somethingto the namespace argo uh and I wantthat's something I'm submitting is ArgoYAML and I think that this soundscorrect and if I watch forchanges you will seeuh now running all the tasks checkingout the code generating the tag uhmaking building pushing image makingchanges to the repos to to the to the grepo with the manifest letting argo seethe synchronize run some tests and so onand so forth now this will take a coupleof minutes which we don't have so I willlet Whitney while this is running in thebackground uh go to the last one allright precisely three minutes oh I thinkincluding my demo no we have eightminutes eight minutes okay so that isfine talk slowly so we rush we managedeight we we can do pipelines webasically have uh a platform built andnow we're going to add a graphical userinterface or guey and we're going toevaluate backstage and port for this oneso we here's our platform that we builtwe have API it's beautiful it's so easyto use we understand it completely it'samazing but what happens as it scalesand we build lots of APIs and then lotsof instances of our resources getcreated it can be get very complex andvery hard to manage and understand veryquickly and so our developers they canbe confused about what's available tothem or what they've already provisionedour platform engineers especially acrossmany teams that they're probablyservicing they have a a difficult undertime understanding what is uh what'salready been provisioned and so that'swhere a guey can be very helpful so witha guey it's a graphical user interfaceour developers can see in a graphicalway what's available to them they canget education around how to use thatthing or what what options are availablefor them to change they can see a a abig picture of what's what they'vealready provisioned and they also canusually provision stuff through the gueytoo so our platform engineers also havea single pane of glasses they call viewof what's happening now this ishappening from a read only view from thecube API so a guey is a total wonderfulthing to have and absolutely helps yourjob as a platform engineer but it'sactually not necessary part of yourplatform you can have a platform withouta guey you could you could understandyour Kubernetes resources with the CLIlike Victor just did it's so here we arewe have all this complexity and ourgraphical user interface is going tohelp now first up we have backstagebackstage is actually a very popularCNCF project i believe it's the secondmost contributed to project behindKubernetes itself so it is um not aportal itself not a graphical userinterface it's more of a framework forhow you build your own graphical userinterface and there are a lot of pluginsalready available since it's so popularbut if you want plugins that don'talready exist in Backstage you need towrite them yourself and you need to dothat with TypeScript so it can becomplex to get started and use backstagealthough the tremendous communitysupport really helps you uh with thatproblem and then on the other hand wehave ports port is a commercial solutionthey offer a low code no code approachto make your guey and it uh it's mucheasier to get started and use and that'sit what would you like to vote fornone coolokayexcellent i think we have a winnervictor we have the winner let's bring ithome have the winner we have no timecool so uh well that was happening inthe background uh I already set upbackstage and backstage should havediscovered the stuff that I did um therewe go that's one of them right sobasically this is probably if you usebackstage this is probably verydifferent from the backstage you'reusing i'm not defining backstagecomponent uh yaml file and none of thosethings it's completely dynamic it goesto my cluster figures out what I havethere and does two three things one isto figure out what I have and show meokay oh okay so you have somethingcalled uh CNCF demo app cool uh what isit it's crossplane something cool uhwhat are the resources that it createdthose cool what is the graph becausepeople like boxes and arrows this iswhat happened when I created theapplication rightum and by the way that's the applicationdidn't work before I don't know why itworks now it's fine u second thing itdoes and this is the beauty ofkubernetes having discoverable sheschema right I have I I ne I haven'tconfigured and told backstage what Iwant to do and how to what what toBut it did go back to Kubernetes and saygive me schemas of the stuff that can bedone in this Kubernetes cluster rightand if I create clear click create youcan see that how to detected andgenerating forms for the same stuff thatI was doing before right without metelling you it there is a way to createSQL databases applications and GitHubrepositories right uh let's say I want adatabase cool choose how to generatefields some generic fields like namename space owner always admin uh youclick next those are all the parametersthat are defined in that Kubernetesresource just out to detect now if Iwould click next next next next uh atthe end of the day you would think it'sgoing to apply to the cluster but that'snot what it's going to do it is inreadon mode it gets what it needs fromthe cluster doesn't write anythingthrough cluster if I would click nextnext it would push it toget and that would that would eitherupdate or modif or or create a new filedepending on whether that somethingexists or doesn't argo CD would pick itup and it would synchronize it to acluster and then we would be livinghappily ever afterand that's it that's it by the way letme just quickly check whether whetherthe application is actually reallyrunning or umuh uh it's not but but uh but backstagethink it is so it's fineyour developers will be happy living inuh being oblivious we put the last slideon please last slide yes yes thank youso much for coming to our talk will youmake it i don't know where is the buttonthank you thank youum if if you want to see our YouTubeshow that's the the QR code is and alsowe have a lot of YouTube stickers uphere on the stage or if you talk to oneof us we have stickers for And lessonlearned if you go to Kufucon and you goto a party or about at night and youmeet the Scottish person Yeahdon't discuss anything that will youwill regret next yearthank youwe did it2025-04-15 22:00:29.544834 \N\��^�g#��sAthCZDKZ1cAMand with that I think we're going to getstarted we're going to take like a10-second lead on this whole situationand we're going to get started and thetalk of the title of the talk is yes youcan run LLMs on Kubernetes who here wasin this room for the last talk anybodystayed around for the last talk and uhin general who here knows that you canrun LMS onKubernetes all right so yeah then thenyou don't need the talk it's good youall know that's great you can just tellother3��f#��YAhnmtjCkO8FEwelcome to Choose Your Own Adventure thedignified pursuit of a developerplatform so glad you're here y'all meetHero hero is application source code ona developer's laptop hero longs to be areal application running in productionserving end users and we've been helpingHero along their way so we've helpedHero uh evaluate hundreds of CNCFprojects choose which ones to useintegrate them with one another so Herocan live their dream i'm Whitney thisguy in a skirt is Victor and uh togetherI know I know I'm messing with you Chrisuh together we host a show called YouChoose so it's a streaming show onVictor's DevOps Toolkit YouTube channeland each episode represents a systemdesign choice so for each system designchoice we gather all the relevant CNCFtechnology that can do that thing weinvite a maintainer on from eachtechnology and that maintainer gets onlyfive minutes to present about what theirtechnology does because we just want anoverview then after that's done we putit to a vote and the community chooseswhich technology we implement in ourongoing demo so so far in the show whatwe've already chosen is here we go buildpacks for container images harbor for aregistry carbell for applicationconfiguration search manager forcertificates crossplane to declarativelydefine a database as a Kubernetes APIand then we're have this uh schema herofor schema for that database devspacefor developing on Kubernetes argo CD forgithops we have contour for ingress cubearmor for runtime security externalsecrets operator for secrets we havepsyllium for network uh uh networkpolicy we have a cubecape for kubernetesscanning we have notary project forsigning we have smithy spiffy inspirefor workload identity kclo forauthentication open fga forauthorization we have headlamp forkubernetes API we have Thanos and uhPrometheus for collecting and scalingmetrics open telemetry and jaggert fortracing isto for service mesh and opencost for cost management oh my god thisis too too much how do we do this y'allwhat have we done to ourselves so that'swhy we have that's why platformengineering has gotten so popular so aninternal developer platform is a set ofintegrated capabilities that areavailable to developers for them to usein their day-to-day work so there arethree types of people involved in makingan internal developer platform this isan overgeneralization but it's a good uhgood mental model we have expertsexperts know about one thing and theyknow that thing really really well andthat's great that's what they do we lovethem we also have our developers ourdevelopers just want to use those thingswithout understanding all the detailsabout how they work what they know wellis their application and so as platformengineers we sit in between the two andwe help the developers get what theyneed from the experts in a selfserviceway so the developers ask for theirthing without any humans involved theyget the thing and then that thing isnaturally compliant and it's secure,4 folks and I didn't need to join acall now that's goodjump jump back in yeah so all of youknow already but let's do the talk myname is Mi and I have my friend hereAbdel and here today we're talking aboutit so for those of you who already knewthat you could run LLMs on uh Kubernetesthank you you can now leave the roomyeah that's that's good but for the restof you um I mean we're here to talkabout LLMs because Jai have beenevolving over the last few years rightand you all know about all these likevery cool big models hosted by cloudproviders that can do thousands andthousands and millions of windowscontext windows but we're hereKubernetes nerds and we don't care aboutcloud hosted stuff we want to host stuffourself and most of you will probably behosting open LMS right which are alsogetting big um the open the deepseekmodel has uh the R1 has 671 billionparameters it's a 800 gigabyte model ina single file um and then all the othermodels that exist in the open world weactually were debating whether we shouldkeep open source because most of themare not they're open weight but most ofthese openweight models are getting bigthey're getting complex they're hugeputting them in production is noteasy um a very wise person ClaytonColeman had this statement saying "LM isthe new web application." Yeah and ifyou are in the conference until FridayClayton actually has a keynote on Fridayand I have gotten a sneak peek that hemight actually update this statement soif you want to be the first to know whatthe new statement what LLM is you shouldcheck out the uh keynote from Clayton onFriday but let's continue on uh when youwant to run LLMs on the cloud youdefinitely have a lot of options rightso you could do the old school baremetal your server or VMs manage it allyourself and all the way on the otherside automated versions like Vertex AIyou have Bedrock you have Asure ManagedServices or many other managed servicesi believe Kubernetes sits somewhere inthe middle where it gives you kind ofthe control of managing your platformand your infrastructure but also givesyou the flexibility of autoscaling andscaling to maximum and scaling downgetting access to a lot of resourcesdifferent resource types uh so I thinkKubernetes kind of does the best of bothworlds in many waysand the reason is because we've beenusing Kubernetes for web applications sofar right and it's great at doing thatall the stuff that MPI talked about theautomation the scalability but also thedevice management the working groupdevice management have been doing a lotof great job and the talk that wasbefore us was actually explaining indetails how the device plug-in can helpyou plug a GPU into a node and make surethat your pod gets scheduled on theright node the right GPU one of In myopinion one of the most important thingsthat Kubernetes brings is the multicloud capabilities being able to deployit in any cloud provider you want andalso having the actually the this thekind of slide that uh Miy showed whereyou have all the way manage yourselfkind of bare metal to all the way usinga managed service if we take Kubernetesitself we can also expand it because youcan also do all the way manageKubernetes by a cloud provider to allthe way manage it yourself um the thethe value of Kubernetes is the standardsingle API for doing everything allright so let's see some of this inaction right so uh conference internetwilling we'll get to see some of thesethings in live if it doesn't we'll justfall back onto and act like it that wasthe plan all along yeah we rely on youto to just Yeah just like act like itwas the plan so the first things firstI'm going to try to write an YAML uhthat's my to-do write fail writing thisYAML from scratch and see how that goesuh so I so we're going to create aKubernetes deployment that is going todeploy a large language model on acluster with GPU uh we work for Googleso we have access to GKE Google'smanaged Kubernetes services we're alsogoing to use something called GKautopilot that can automaticallyprovision a new node with the GPU ondemand instead of us having topre-pr5ovision the node so we're going todo all of that real time so create adeployment i'm going to let theKubernetes uh uh it's called IDextension on VS code that lets me justcreate a scaffold deployment so let'sgive it a name the model I'm deployingis the Gemma uh three 1 billionparameter model so I'm going to givethat name JMA 3 1B it instruction tunemodel and then I'm going to name thedeployment deployment because I have toomany of these things and I forget whichone is what and then I'm going to needan image and I'm going to not going tomemorize this image i'm going to justcopy it from one of the otherdeployments that I have here uh so theseare images that we provide as part ofour Vertex you could also grab any openimage so we're using VLM as the servingengine right now what that is we'regoing to talk about in a second but I'mgoing to use that image to get usstarted next up I need resources sowe're talking about uh deploying a largelanguage model so the type of resourcewe need are GPUs so let's give it somememory so let's say about 20 GB ofmemory uh capital G don't forget thatone uh I'm going to need about four CPUandNvidia.comGPU1 and I'm also going toneed some ephemeral storage because I'mdownloading lots of data i want to storethem on my device somewhere so I'm goingto give some ephemeral storage about 20GB that's my limit i'm going to copy thesame thing forrequest and should help me out there alittle bit okay here we go autocompleteis so great these days so I don't haveto it's pretty good and I'm going tocopy a bunch of some a couple otherthings from my other file that I havebecause I don't want to watch you all uhfail uh watch you all watch me failtyping that's a long sentence i'm goingto just copy this command righthere all right so the command basicallyis I have this VLM based image whichhappens to have have a Python file i'mjust running that with some arguments sothe arguments are my model ID tensorparallel size is how many shard I'mdoing the model it's a small enoughmodel that fits in a single GPU i onlyneed parall size one host is just whereI'm running which uh host I'm running itat and the port is 8,000 and I maxlength is 32,000 uh tokens so uh Gemma31B has a maxed um context window of32,000 tokens so let's do that you alsosee I have I'm u referencing aenvironment variable here which I haveto create now which I'll also just goahead and copy from here which I'm goingtoneed and okay so I have this environmentvariable called model ID and I'm goingto name it 1d bit and this is the samename you're going to need we're gettingthis from hugging face so if you go tohugging face you will see like differentmodel ID names we're going to copy thatone and one last thing I need here isthe hugging face hub token because Gemmalike many other models like mistrol orllama is a gated model so you're goingto hear this term gated model a lot isthat these models are stored in huggingface or Kaggle and you have to sign somesort of consent to say I am willing tolike sign this consent so that I can usethis model so I'm going to have my APItoken which I already create the secretfor so I'm going to use that and thelast two things I need are volume mountsand volumes because I want to store thismodel in a local storage as I downloadthat so I'm going to create that volumemount rightheretoken all right and the last couple ofthings is I need to have a port and asyou saw before I was uh setting the portas 8,000 so I'm going to do the samething as my porthere and this is the last part I need tobe able to say this workload should uhrun on a specific type of nodes inautopilot what we do in GK autopilot isdepending on the node selector we cancreate the specific type of node for youso we will go ahead and set that nodeselector in regular Kubernetes you cando the same thing because again this isa special type of device we're usingwhich is GPU we want to make sure thisworkload run against a specific type ofnode so we'll copy that node selectorcopy that in there so that's mydeployment done hopefully and then I'mgoing to need a service so that I canex6pose this deployment and so that I cantalk to this model and I'm going tocreate a service and it's going to needa name so I'm going to call it Gemma31BT service and the selector is goingto be Gemma 3 1B oh I called it IB noone caught it yet change this one to Yepyep one uh one here you go and the portis I'm going to expose the port 8000 andI'm going to target port 8000 as well onmy uh deployment so with everythingthere hopefully I did everything rightif not we'll find out togetherso cube control apply -f k8s the file isgemma31b 1b it uh yeah itvlm.ymol send thisin what happened ohboy oh server doesn't have a v1 so Imaybe copied the server wrong or notuh it's a port problem it's a portproblemoh volume zero ports you know what cutmy losses copy this in and change thingsyeah let's just make it simple yeah andchange all the names to uh1B here you go one B here one here one Bhere one B here one B here one B herethe joy of debugging YAML oh yeah idon't have my YAML engineer hat theservice change the service no service isthe same thing all right cool okay testit again that should work maybe heythere we go good so I just ran thisworkload let's check what is happeningwith this i'm going to do a quick watchcubecontrol get PL and we're just pending sowhat is happening right now we aresending off this workload and our GKAcontroller is going to go ahead andcreate a new node schedule my workloadand run this so while that happens we'regoing to continue on we're going to comeback to see what has happened here andsome of the other fun stuff we could dowith this yeah so one thing you mighthave noticed through this kind ofexample of very simple deployment is wesaid initially it's like LLMs are thenew web app but it's not a typical webapp this requires 20 GB of memory andfour CPUs and the GPU you don't needthis for web application right and alsorequires storage requires access to somesort of remote endpoint to downloadweights so they are web applications arein the sense that Kubernetes doesn'treally care it just here is a an aworkload give me some resources andKubernetes will allocate the resourcesbut they require some extrastuff before we move on we figured we'dspend a little bit of time becauseyou're going to be hearing LLMs and abunch of these words through in thisentire conference so we spend a littlebit of time kind of equipping you withsome basics right so LLM I thinkeverybody knows what that's a largelanguage model it's a thing that canspeak words basically right very simpleinference or serving i have a verystupid example of explaining inferenceof serving if you take a web applicationa web application is a combination ofCSS HTML and JavaScript right thesethree things are useless without webserver if you want to render a web pageyou need a web server that you can talkto and it will serve you your web serveryour website right so serving orinference or sometime we call it a modelserver is essentially a piece ofsoftware that can render an LLM orrender a machine learning model itdoesn't render it because there is noweb page but it's able to talk to themodel using an interface that the modelunderstand and exposes an endpoint thatyou as an application can integrate withthat endpoint is usually rest and thenthe model and talks something thatdepends on the architecture of the modelso whenever people say serve inferenceor model server they usually refer toVLM that's what MPHI used but there aretons of open source ones in the marketand we're going to talk about themaccelerators pretty straightforward gpuTPU or this new thing called Grockessentially any any sort of specialhardware that's that is really good atdoing matrix calculation or matrixmultiplication yeah and some of theother things you're going to hear quitea lot is quantization so when I said amodel is like a 1 billion parameter or10 billion parameter what we were sayingis there's 1 billion numbers in thereand that number could be on the full32-bit precision or it could have lessprecision and have less memory footprintso you could fit a large much largermodel in a smaller GPU if you7 use alower uh precision of the usingquantization with quantization you kindof give up some of the like the qualityof the model to be able to fit a largermodel so there is like a trade-off youhave but you get to fit bigger models uhweights are just the what I said the thenumbers that I have the model how themodel represents the world around it arethe weights context window is how manytoken you can fit to a model in a giventime for example something like Geminican have up to like two million contextwindow our open models like Llama Gemmahas up to 128k context window which isquite a lot but you know if you'retrying to process videos and such it'sprobably not going to be enough andfinally multimodal so the models we'redealing with here are all text to textyou send text you get text back butsomething like stable diffusion you cansend text and get image back or sendtext get video back or send an image andget text back so you can have a lot ofthis A2B situation going onand when you talk about model serversthere we just looked at VLM but that'snot the only one there is many of themin the market right now uh TGI comesfrom Hugging Face so Hugging Face iswhere we're getting our model data fromthey also provide you some software torun these models yourself nvidia againthey're the biggest provider of GPU atthis day and so they also have like asoftware in the market called NIM whichis I think proprietary they don't havean open source component all that muchyet but they is built on top of opensource standards you have Ray Serve rayyou can probably hear about it or lookit up they are bu purpose-built fordistributed workload so model serving iskind of a distributed workload they'rereally good at uh in the world of TPUand also GPU you also have Jax which isJetstream jax is an open source projectuh coming out of Google and a lot ofpeople can run models locally in Olamabut you can also run as a container onthe cloud so you can kind of bring yourown home experience into the cloud veryeasily withOlama yep and as you're doing this oneof the thing one of the reason this likelarge language models are not like yourtypical web app is the size of thecontainer image over the last 10 yearswe kind of preached to the choir andtold people to let's make containersmaller let's bring like smallcontainers that are very portable all ofa sudden 2022 happens and BLM image islike 10 GB and all the learnings we hadis goes out the window like how do younow run this giant model into ourcontainer and on our cluster if you'retrying to download a model of that sizewith the on the public internet you'relooking at times anywhere from 2 minutesup to like 10 minutes because againyou're going off to the public internetto download all that imagedata uh you also have the size of themodel to how much GPU memory you needcomparison so easy way to run the mathis that if you run it on full precisionyou are going to need twice uh thememory if you run it in half precisionit could run uh in less memory rightsorry if you run it in full precisionone bit becomes four byt right onenumber becomes four bytes so it requiresfour times the GPU memory so if you havea four billion parameter model in full32-bit precision it's going to requireyou to have 32 GB of video memory mostof the models you're going to see run onthe internet is going to be somethinglike BF16 or FP16 which is like 16 bitprecision which only requires double soif I have a 27 billion parame parametermodel I'm going to need roughly around54 GB of GPU memory so with the samemath you can see deepseek yeah so youcan see here deepseek at full precisionis1.37 terabytes that's how much memoryGPU memory you will need if you need torun deepseeek today on the markets thereis no GPU that has this amount of memorythe biggest GPU you can find on themarket is an H200 which has 80 GB ofmemory h200 has 141 h100 has 80 thankyou h100 is 80 h200 is 140 right yeaheven with that a single node of H100 orH200 does not have like you can have asingle node with eight H100 GPU stilldoesn't have enough memory to fit deepcigar one because there is 8alsosomething to keep in mind is there issometimes a a physical limitation intohow many GPUs you can fit on a singlenode and that typically depends how manyPCI Express ports are available on thatnode which typically is eight so let'ssay we take an example of an Nvidia A100with 40 GB of memory if we take a nodeand we fill it with a 100 GPUs we canonly do 40 * 8 so that's 200 320 320 GBright so if we want to run somethinglike DeepS which is 1,370 GB we needmore nodes so the kind of variationsyou're going to see in this space areeither what Miy did single host singleGPU or single host single accelerator orsingle host multi accelerator youremember earlier when uh Muffy wasdeploying there was this shardingparameter which he set to one you canalso set it to two three four five andthat's essentially the sharding thesplitting of the model across multipleGPUs but then you might also be facedwith situation where you cannot fit themodel on a single node and you need tosplit it across multiple nodes andthat's the third situation which ismulti-host multi accelerators now thetwo first ones Kubernetes can supportthem out of the box the third one canonly be supported through this new APIcalled LWS or leader worker set so theleader worker set is new API Kubernetesopen source that allows you to define aleader and the set of workers where theleader distributes the shards and theworkers run the actual model so thesharding is done by the leader and thenyou have multiple nodes and that's howyou're able to do multi-host multiaccelerators yeah so uh if you want tolearn more about this there's there wasa talk by one of the maintainers oflitter worker set in the last cubecon inNorth America you can go find that talkon YouTube do not watch that now we'restill talking um yes uh so when you'retrying to run large language models on aKubernetes cluster it is like as we saidthere are multiple dimensions ofoptimization we could do right numberone is image size has gotten really bigso we could do some optimization to makesure the images are cached on the nodesbefore uh you could do some optimizationbut on the data layer downloading themodel every time from hugging face isexpensive operation can we do somethingto pre-cache those model data somehow ordownload it one time and for every otherconsecutive runs we can just use thesame downloaded model and there's alsolike workload scaling we can do somework on the cloud level or yourinfrastructure level to scale thatworkload more intelligently rather thanjust like having bunch of GPU justsitting around just in case you need itso there are a lot of optimizations thatare happening uh this talk is probablynot the right place to talk about all ofthis would be more than happy to talkabout like if you have a specific needof that kind or what are we doing inopen source to kind of handle thatsituation but um let's move on to checkout what what is happening with my demoif it's still if it's still uh figuringthings out here we go we we I think westalled enough for the GPU to like uhfigure it out and run everything um I'mgoing to quickly go ahead and give aquicklogs uh what is the name let's go that'sthat oh nookay i'm going to have to do sports kget PL i have to copy the name oh Istarted withVLM uh okay what did I do oh Klogs get me on stage and I forgeteverything there you go i run this andat the end of everything we can see thatabout few minutes ago about it tookabout four minutes for it to downloadeverything and start the VLM serverright now I have a server ready here icould also do k get service and I wouldsee that I have a service ready to go icould like set up a load balancer on topof it to talk to it but I thought whatwould be fun for all of us todayactually instead of just talking to onemodel what if we talk to a bunch ofmodel i have a different name space indefault and I over the last couple ofdays deployed a bunch of these largelanguage models so if I do k get serviceyou would see bunch of these largelanguage models that are currentlyrunning and then what I thought would bevery nice for us to be able to cha9t withall this model at the same time so Ibuilt a small little UI tool and I cango here and I can show you the URL forthis actually I will show you the thisuh one on the right is going to take youto a UI where it's going to have a textin box where you can type in your promptone on the left is the all the code thatbuilds this so if you want to go checkout the code you could do that later butone on the right is where you want to goso together we're going to do uh onesimple prompt here okay everybody gotthe phone out everybody got it don't allyell out at once okay cool we'll see ifit scales all right yeah probably notbecause I didn't set up an autoscaler icould have i didn't i'm going to quicklyrun this one prompt which worked reallywell last time so let's see how thatgoes uh tell me uh knockkknockockjoke use because we're in uh London useelements of British humor i thought likeyou know we're we're in the right we'rein London right now we should use thatand send this prompt across all thismodel at the same time you can all trythat in your own phones if you scan thetoken and this came out one time beforetoo it's one one of my favorite jokesnow I'm going to read out for all of youokay uh tell me a knock-knock joke useelements of British humor okay so I'mgoing to say the knock-knock part youcan say the other part okay let's goknock-knock who's therearthur are there any biscuitsall right so again all of you have thislike a tool right now i'm going to keepit running for the for the rest of theweek so if you want to like keep testingbunch of different LLM models at thesame time you can but let's take a lookat how all of these models are beingserved we talked about VLM but we weactually can use a bunch of differentserving engine so we have VLM for one ofthem uh for the Gemma 12B that you areseeing on your screen right now if youhave the phone up and running we'reusing Olama so we're just basicallypackaging Olama with the Gemma 12B on itwe're running using Olama for the 3Bversion of Llama 3 we're using TGI soyou can just package everything in a TGIuh container use that model and havethat running that way uh if I you mightbe wondering hey you work for Google youI heard you guys have TPUs are you doinganything with TPUs yes we are uh wedeploying the Llama 38 billion versionon a TPU and one funny thing you havemight have noticed that you are talkingto on a single interface you're talkingto all these different models is fullytransparent to you the user that you aretalking to a TPU or a GPU becauseunderneath we're using VLM with TPU andit gives the same exact API interfacethat's the big big point of using thesemodels we're also deploying Mistral umuh again one thing I think Abdulmentioned about like a deepse being areally large model how do I go aboutdoing this all of you are probablyitching to learn about LWS i have youcovered you can go to DeepSc R1 folderagain this is a bigger model it requiresa little bit more things to be able tohandle things a little bit better so youhave this DeepSc R1 using Ray and VLMtogether we're setting up this thingcalled the leader which is starting as asingle worker has eight and H100 GPU onit and setting up this VLM worker usingRay so I'm setting up a ray worker thatcan now say okay everybody else underunder my control send me your work andthen I have a worker uh template workerpool which is also a single H100 nodewith eight GPU so I have two node totalthat is doing the work i have 16 H100GPU do all the work and that isbasically sending its work back onto uhthe the leader worker right so togetherthey're basically computing my everytime you send a request the R1 model issending so in the list I think the lastmodel is the R1 so you can see thatworking um couple of other things wetalked about like optimization is thatwhen you're doing like such a big modelwhich is about 800 GB of download youdon't want to go and download it everytime back and forth so we're usingsomething called HDML which is a Googlecloud specific thing but you could do aspecific thing like this in pretty muchany cloud where you use extra storage orPVC to store the model data so we'redoing that here instead of actually uhdownloading the model every time we'rejust mounting this volume that we cantalk to it so yeah we have all of thisexample if you uh got the GitHub repoyou should be able to actually the oneon the left if you understand that realquick all the YAML and all the things weuse including the UI is actually opensource you can go try and play aroundwith it um so yeah unfortunately wedon't have enough time to go through allthe details one thing that we we didn'treally show is actually with Ola you cansee and Ola is one of the ones that youcan try this very simple do I downloadthe model right when you do app themodel get downloaded or you can pre-bakethe model inside the image and thenproduce a new image that alreadycontains the model that's one thing youcan play with to see kind of like thealternatives and kind of see which onestarts faster uh but basicallyum we we thought we would wrap it upwith showing this reference architectingfor AI platforms and the whole point ofthis talk was to say yes you can run lmon Kubernetes kubernetes is the bottomlayer where you can run pretty mucheverything and the project is doing alot of work to actually make it possibleto do more things with LLMs so you havethe typical stuff like the autoscalingyou have this open source project calledQ um which we open source but it's youcan use it anywhere you want it's a jobuh Kubernetes native job queuing systemum you have all your multi-tenencyinfrastructure as code githops like yourDevOpsy way of doing things so if you'redoing DevOps for web applications youcan also do the DevOps for LMS and thenat the bottom layer you have your multi-instances like the way you shareinstances on the same nodes or acrossnodes um you have access to your um harddrives volumes as uh as Muffy showed youhave this thing called fuse where youcan mount a buckets it doesn't have tobe a Google cloud bucket it can be anybucket the fuse driver supports multipletypes of buckets um and then yeah andfinally like the blue colored box idon't know if the color shows it doesfantastic the blue colored box is kindof like what Kubernetes cares aboutanything outside as a software you couldrun pretty much in your control it couldbe your own custom like machine learningapplication it could be any of the opensource tool you could just grab from theinternet anywhere so for that part a lotof decisions like how those apps arebuilt is outside of Kubernetes controlso we are actually working with thecommunity quite a lot to make sure thatany type of app that you want to run onKubernetes works and we've been doingthis for the last 10 plus years at thispoint so again the community is evolvingwe're seeing a lot of changes from datascientists to like machine learningengineers and Kubernetes we want to makeit the best place to do this kind ofworkload with that we're going to giveyou this QR code uh this is to send usany feedback feedback is a gift anythingyou liked and this also lets CNCF knowif this type of a talk is useful for youto learn and I don't think we're goingto have much time for question answers idon't think we have a mic running aroundor anything but we're going to be herearound if you don't find us here we'regoing to be in the Google booth so ifyou have any question there is actuallya mic oh there is a mic fantastic but wehave only four minutes i don't know wehave much time maybe like one or twoquestions or just find us after the talkoutside here anywhere if you have anytype of questions about learning uhrunning machine learning models forserving use cases or training use casesyeah thank you so much for joining usthank you2025-04-15 22:00:30.399531;his case I have all these littledevices so I have uh things that runLinux i have Espirinos I have RaspberryPies I have Nuks I have O laptops I havelaptops that don't have a screen and Itry to like learn and just boot up Linuxand boot Kubernetes but with the late umtrending of like LLMs and chat GPTrunning GPUs just for playing around orlearning on the cloud or from a vendoris kind of quite expensive if you're notdoing you don't know what you're doingso in that case I thought myself ohlet's get myself a GPU at home and a lotof people actually have game machineswho have game machines at home uh thathave a GPU card PCI so a lot of you soyou already have that i got uh one ofthese i also bought a uh toing touringpie uh that you can put uh acceleratorsum uh um chips that are accelerators youcan use some AI there but I wanted touse the one that a lot of our customersthat I work with are using in the cloudand that's um Nvidia uh GPU so I boughtan Nvidia Jetson device and there's likethese small computers that you have athome um or actually these devices aredevelopment kits that companies use formanufacturing edge devices and they havethey come with a system on chip GPU sothat's the recipe so what's the projectyou always have to have a project so ifI have a Jetson GPU what do you thinkwill be my LLMproject rosie who knows thischaracter it's from the Jetson familyright was supposed to be a joke theresomewhere um so Rosie is very smartright so I want to create a bot i wantto play with LLMs i want to play withthis um frameworks Python frameworkslike PyTorch Olama all these open sourceLLMs at home so let's start of likewhat's the when you search online or youask um this AI search agents of how do Istart with and maybe you want to getstarted usually it's just do a helminstall I tried that it didn't work so Iwanted to learn how actually what arethe components of kubernetes or theminimum things to get kubernetes workingwith my GPU that is sitting on my deskso I thought about doing this talk interms of how I did my learningdoing this and in Amazon we have aleadership principle called learn and becurious you always have to be learningand be curious so I really wanted toknow what is the integration betweenGPUs and Kubernetes i know Kuberneteslike I can figure that out but I didn'tknow what is the integration betweenGPUs and Kubernetesso we start with something like the nodefeature um discovery which is somethingthat will go deep into it but basicallythat this is the the things that you dobut I wanted to know like but why why doI need the no feature discovery why do Ineed the GPU feature discovery like whydo I need it do I really need it to runa container um in Kubernetes to use theGPU actually no the other one is deviceplug-in also another component that youinstall in your Kubernet Kubernetescluster and but I want to know like wellhow how does it work right the why andthe how so on this talk we'll see a lotof thoseum and then the next thing is a vendorright a vendor will give you a devicedriver that's the next step that youneed that you would install in Linux andyou have to watch out which version doyou get for that device driver the nextone is container toolkit which isrelated to things like docker orcontainerd or the kubernetes runtime andthe last one that some people haveconfusions is CUDA and for uh thesefolks they may ask like CUDA is that agame or that's a cookie like what'swhat's the deal with CUDA uh wherepeople get stuck when they're trying touh do GPUs and Kubernetes and this tagis applicable on the cloud it doesn'tmatter if it's at home or it's in thecloud so the benefit that you get athome is that you start learning abouthow to work with them and understand howthey work so let's start with thebreakdown so the main three areas to getthis working is the host the operatingsystem in the worker node which you willhave to deal with the device drivercontainer toolkit and that's somethingthat you don't have to install becauseyou are working with a cloud providerthe cloud provider usually give you a OSimage that comes with this alre<ady outof the box and a lot of people get stuckbecause they go ahead and install uh adifferent version that is not compatibleon top of it the next one is CUDA andthe CUDA is a toolkit from a vendor umI'll try to make this as agnostic aspossible but in this case we're talkingabout Nvidia which is a very popular uhvendor of GPUs so the container toolkitis something that is uh a shim on top offor example containerd or docker we runc to be able to this uh containerruntime to be able to handle the GPUintegration with the device driver andthe last one is where kubernetes comeinto play so the first two you have tofigure that out for the operating systemin your worker node and the last one iswhat are the things that we put on topwith Kubernetes to help with thescheduling of these pods um on thenodes the next one so the first onewe're going to take a look is the devicedriver so the device driver is dividedin two parts um is the part that youinstall in Linux to talk to your to yourhardware so that driver comes from thevendor that you get for the specificship that you're using so for exampleusing AMD chips you will get the devicedriver for that if you're using Nvidiayou will get the Nvidia driver thatcomes with the CUDA um user mode driverso that's where CUDA start coming intoplay that you need the driver and CUDAcombination uh correctly so for Jetsonum if you have a if you get the JetsonNano or Jetson Aren one of these Jetsonu boards um it's a jet it's a jetpack OSand that jetpack OS actually comes witha specific device driver that comes withthe right version of CUDA that you don'thave to um install on top of it um ifyou're working on a game PC like I askedbefore that you are just installing forexample an operating system like uhUbuntu Ubuntu for example and you have aa game card on the PCI slot you may needto get that driver and install it andfor those type of things you canactually use the uh there's an operatorthat comes from Nvidia but that operatorusually is complex but if you know whatyou're doing then you can go ahead anduse it i tend to for this talk toconcentrate on like the building blocksto understand how itworks so the next one is the CUDAtoolkit so the tool CUDA toolkit issomething that interacts with the driverbut something that you can bring insideyour container uh image so you will haveto pick a library and usually thelibraries that you use like PyTorch OLama Lama.cpp CPP VLM all of themalready have CUDA support and they havea specific version of CUDA that is theinterface to talk to that device driverso that they always go together thatcompatibility layer you need to be awareand this is where a lot of people getstuck of like why is this not workingand it could be that you have a devicedriver that is not compatible with theCUDA toolkit version that you have inyour container image that came from anopen source project right or a vendor uhso usually we recommend in productionyou just get one from the vendor in opensource get the right version from opensource that has the one that works withyour umenvironment so what's that thatrelationship so one one of the thingsthat you can check the device driver isthis command here you can go into procdriver Nvidia version and get theversion of the device driver that youhave in your operating system and that'ssomething that you can need to startwith you don't want to start changingthe device driver to the latest to seeif you can get the best features justget the one that comes with theoperating system that could be on thecloud or it could be from the Jetson uhjetpack and then the next one is how doI know which CUDA version do I have onthe operating system and we'll get intoa minute into that this is specific ifyou are using the GPU driver from thehost operating system not from acontainer if it's in the host operatingsystem then you can have the toolkit orif you are compiling an application tothen package it to then include it inthe container you have to watch outabout the CUDA version and the last twoversion are 11 CUDA 11 CUDA 12 so here'slike one example of the table you can =goonline and actually see thecompatibility matrix of like devicedriver to CUDA version depending on theoperating system that you have forexample so let's start with thecontainer container tokit going deeperon how act this actually work so thecontainer toolkit is something that youinstall in the operating system or youcan install it with a demon set forexample a lot of these things that youwork in the operating system usually aredone through a pod that has privilegeaccess um to the node so you can changethe configuration of that containerd ordocker so in that case when you installthe container toolkit it has a runtimelibrary then you configure it sayingdocker at a runtime so in this case willbe the Nvidia runtime or container ID atthe runtime um Nvidia and you can set itto default this is a shim uh that Nvidiahas that then talks to run C when run Cruns basically what it's it's doing ishey I'm going to run run C but there's ahook that it will call an Nvidia uhprogram in here that we see in thescreen that it would inject theinformation about the GPUs when thecontainer gets started so run C waswould would create the container notstarted yet and in this web hook isability to change the configuration ormodify the configuration of thecontainer before it starts so it canhave access to the devices um like theGPU in this case so that's how theythat's why you need the container tokitto be able to enable in this casecontainer D that Kubernetes is going touse to use the the GPUs uh to access theGPUs in in runC so with these three actually you canget started using um containers on yourJetson device without the need ofKubernetes and that's basically thefirst test that you would do if thisdoesn't work um then there's no point ofgoing beyond this to get Kubernetesworking because if it doesn't work itmeans that you have the wrong devicedriver or you don't have the right CUAtoolkit or maybe the image of thecontainer that you're trying to going torun in Kubernetes it doesn't have the lithe the library so you have to test inthis case at that layer and then againyou can do this at home um for uh forlearning so at this level if you have aJetson uh board u Jetson Nano JetsonAren there's multiple Jetson uh there'sa new one that came out recently um butthere's a great GitHub um I would say aa community around uh these devices it'sa we have a discord but this um GitHubrepo I have the the links at the end umyou can run on the Jetson probably anyLLM or any machine learning or any AI Iframework library model out there justfrom one this repo so it's amazing whatthese uh folks have done in this GitHubrepo that they have a build system andthey have the library so they made itsuper super simple uh to get it runningso let's take a an example one of thethings that you can do at home so if youhave a camera at home and you want youhave a Jetson uh you can use one of thepackages in here uh to detect things uminside your house or inside the manumanufacturing things that maybe don'tneed that data does doesn't have to goout to a cloud or because of latency orGDPR or you just because it's at home Iwant to like play around with uh thingsaround my house like IoT devices likesensors on my doors sensors at the frontdoor uh video cameras to be able todetect things and um um and and learnabout it so at this point you'relearning at home with these devices uhwith a again like a old uh old game PCor one of these Jetson that I got or anyother ones but the basics are there onceyou have the the device driver uh thetoolkit you'll be able to do this withDocker so basically docker or containerdusing like cry orcryl you pass d-r runtime basically thatwould tell the docker or container IDwhen you run this container run the thevideo runtime that has that hook that itwill insert the device driverinformation as environment variables forexample which GPU you're going to use inthis case the Jetson only has one GPUbut it's something that you can test anylibrary that is out there so let's moveon and here's an example on a little CLIutility that this GitHub provider GitHubrepo provid>es you it's a jet somecontainers now this is a wrapper arounduh Docker so basically this is anexample of running Lama uh or in SpanishI call it like Ojamaum so basically you run run name o lamauh is the name is the the package fromthe GitHub repo that you want to run andthen auto tag ojama basically that's aum a command to get the right version ofcuda that is detected so that's why it'simportant to pass to get the right imagefor your hardware because you need toget the one that has the specific CUDAthat is compatible with your devicedriver and again if you have a specificver you you could have a version of thedevice driver that can work with CUDA 11or CUDA 12 so a lot of the softwaredetects which um driver do you have andwhich CUDA to use but basically to getthe right um image so in this case I'mdoing a a small test of saying how manyKubernetes administrators uh takes todeploy a cluster so it depends of theexperience of the administrator um themodel says it's from two to threeadministrator to deploy a cluster so itdepends what is the complexity of thiscluster but again LLM will give you a ananswer based on what they were trainedon the next one is uh the Kubernetes uhdevice um plugin is required so this iswhere we start like if everything worksin your computer in in the computer withuh uh docker and docker uh runtime thenthe next layer is the device plug-in andwhy do we need the device plug-in inthis case I can installkubernetes get a pod uh deploy and thatpot will start using the GPU if I deployanother pod we'll start using the GPU ifI deploy another pot using that GPU sothere's no scheduling there's noallocation there's no resourcemanagement at this point the pot can runwith the runtime and then the GPU willwork so at this point you can installKubernetes and it will work but you'remissing the scheduling and then all podshave access to all GPUs so there's noway to allocate uh this GPU to be usedby this pod this other GPU to use bythese two pods um without then the nextlayer of um u tools so let's start withthe GPU device plugin and when you readonline that people say install the Helmpackage of the device plug-in and thenthis will be the device plug-in by avendor so in this case we're talking theNvidia uh Helm package for the deviceplug-in is a combination of threecomponents is the device plug-in itselfit has a second component called the GPUfeature detection which is um I'll getthat into a minute and the third one isa Kubernetes open-source SIG is the nonvendor specific but it's included as asubchart which is the node featurediscovery this would allow you thedevice plug-in now the capability toschedule pods so not all pots access allGPUs or you can have pots waiting untilone completes and then it gets a slot toshare to use the GPUs at the um maximalthroughput or the advantage of takingeverything and then things like share aGPU with time slicing MPS or MIG inJetson it's a small device system orchip so not all these features will besupported on the Jetson for example MIGis something that is hardware specificso it will not be in the Jetson um buttime a slicing with one GPU uh or twoGPUs you can do so that's why you needthe GPU uh package so like we started atthe beginning the professor tellingsomeone just install the the the bighelm package is is uh you can use butyou can just install the device plug-inor it can come with the with theoperator so let's let's break this uhdown so the first thing of thiscomponent that runs on your cluster theother two are not running so when you dohelm install of this package uh theydon't they don't get installed so thefirst one is the node feature uhdiscovery that will actually read thePCI information GPUs and be able todetect that there's a PCI component uhthat is the GPU uh like you can see herethe next one is the GPU discovery soonce the device is uh available this GPUfeature this discovery is specific toNvidia so it will query the devicedriver to know what information aboutthe GPU that I can get and then annotatethe nodes so these two things annotatewith they label the nodes with thef?eatures so these are how the labels getinto your worker node to know that fromall the worker nodes this one has a GPUand information and then the last one isthe device plug-in that is a componentthat works with a cubelet so theyinteract with the cubelet to allocatethe the devices so you can say uh itwill tell you how many GPUs areavailable or if you're going to split asingle GPU into time slicing it can sayuh four pots four containers can use oneGPUso the integration with this is thedevice plugin when it starts is itregister with the cubelet then thecublet asks like how many GPUs do youhave the device plugin gets back thatnumber so it could be four GPUs in thejet will be like one GPUum and then that information gets intothe API server by the cubeless sayingthese are the GPUs that are all you canallocate for example and then whenthere's a pod requesting a GPU then thecubelet comes back to the device pluginsaying allocate GPU zero or allocate GPUone uh cube goes back to the API serverand says a GPU is allocated maybe youhave four uh from the four that youstarted now you have three um when youstart the container uh the cube let'sstart the container it knows that needsto start the container with the Nvidiaruntime if that's the default if not youcan put it into u into thepotspec so that's the integrationbetween device plugin so let's see it inaction so if you have like a set of GPUsand all the pots are running and takingone GPU then you don't have space forthe other uh pods so they're pending sofor example you can have a workload ofall this work or jobs for example aqueue of pods trying to use the GPU sonow you have the device plugging and nowyou can use Kubernetes to orchestratebasically right the pots in this node ionly have one node so they go pending ifyou have things like carpenter forexample it will see these spots that arepending that are requesting a GPU acarpenter or any autoscaler that workslike that it would create a node forexample in the cloud that has a GPU andthen that p goes there so in this caseat home I don't have more more than oneJetson so um the when the puck getscompleted then the allocation isavailable and the pot goes in and thenit starts running so this is the benefitof using the device plug-in but before Iwas able to run the pot and use theGPU so the why and how so for runningcontainers on the host you need thedevice driver container tokit and CUDAto orchestrate containers now you needthe device plug-in um and the feature uhdetection and the feature detection isvery valuable because it would label thenodes the first one uh the no featuredetection is an upstream Kubernetes SIthat detects the PCI devices andactually is looking in a folder in thehost operating system for labels thatany other agent can put in there andthen the the no feature discovery willstart creating CRS for a anothercontainer called the master is the onethat labels in the API server so the GPUfeature detection uh discovery sorry uhit will detect the information about theGPU like the Nvidia the vendor the CUDAthe device driver all that informationput that in a folder in the host andthen the no feature discovery willdetect somebody put something in thatfolder which is in Etsy uh specificplace and then the no feature discoveryis the one labeling so the GPU featurediscovery doesn't need arbback access tothe API server never talks to the APIserver device plug-in never talks to theAPI server and then no feature discoveryis the one that talks to the API serverso it needs uh arbback so if you'restarting with Kubernetes it's always agood practice to inspect the Helm chartsthe packages or what do they installwhat permissions do they need whichcomponents needs areback um so coming to an end this is anexample of uh my home lab i have a bunchof devices so the first one is where doI run the control plane where do I runAPI server and the SCDS uh database soin this example I have an Intel nuke andI have a Jetson uh or ring with a GPUand all of them at my house i have a bigum 24 port switch but then I run the umthe API server in another nuke at insidemy house everything runs inside my houseand you can run out of the box K3S ifyou put K3S- runtime Nvidia will do thesetup for container ID and basicallythat's the the enough after you installthe device driver and CUDA um and thenit's installing the Helm chart with thedevice plugin uh you can use cube ADM Itested that um all these notes I have inmy GitHub repo so if you want toreplicate it uh and any other Kubernetesum flavor or distributionKCOS you name it everything works aslong you have the device driver CUDA andthen you install the Helmchart and thenthe second setup is something that I doat work because I I help a lot ofcustomers and end users trying to run umKubernetes in manufacturing at edge butthey want the control plane to live inthe cloud so this is a setup for AWSspecifically where you have EKS hybridnodes that the nodes itself run on myhouse and the only thing that I need inmy house is the cublets thecontainerd the the Nvidia uh devicedriver CUDA and toolkit and then connectthem with a VPN for example wireguard isa very popular open source project youcould use uh wireg guard with VPNconnect to the cloud but the controlplane is the normal EKS uh cluster thatyou have in the cloud just nodes wouldjust show up that I put a label calledhome so if I want something to run athome I just like schedule that pod torun on the nodes that are labeled homeso they run on my um um devices so inthis case the Touring Pi is a cooldevices that have four slots that youcan put Raspberry Pi modules you can putan R1 module um and then I have a Intelnook a bunch they're very cheap likeless than $100 in in some of those usedmarkets you can get it um to buildsomething at home that is veryaffordable to start playing withKubernetes so here's the demo on on thesetup um this is running deepseek R1 onthe Jetson Kubernetes cluster uh at homeuh if you for the Jetson you would notuse a utility called an Nvidia SMI uhbecause on the Jetson the GPU doesn'trun on a PCI device so there's anotherutility called Jtop and it will give youthe GPU usage so as you can see here uhyou can see the GPU peg to 100% um youcan also change how much um watts do youallocate to the device so you can getmore power out of the GPU and then thisis lama with the open web UI and one ofthe many frameworks in the in thatGitHub repo that I show but this isrunning deep um R1 and I think uh thequestion here was I don't I don't see itit was tell me five jokes I can give acubecon next week during my talk so itstarted it started thinking and it tooka while thinking and thinking andthinking and eventually came with thefive jokes um if you can uh see me afterthe talk I can show you the ones thatthat it came up with and some of themwere not proper so I'm not going to showthem um and then deepseek um one thingthat I want to emphasize and then thisis a novice track is all these models uhthey will run on these small devicestake into account that there's um aprocess called quant quanticize that itwill like the resolution of the um thenumbers are smaller so you can have themodel uh quantisize that actually willfit a small device but when you run onthe cloud you actually run on a beefermachine GPUs uh so you will run the fullmodel at the full potential um and itcould be one of the one of the manymodels and you will hear many modelsthis week uh for that so coming to anend this is the GitHub repo um KS NVIDIAum I created this a while a while backago because I also run a Kubernetes bookclub out of the CNCF uh it's a communitygroup we get together every Friday soyou can join virtually or you can watchthe videos and basically we pick a bookum and we discussed a book aboutKubernetes platform engineering I thinkI did one on Jetson devices uh we talkedall things Kubernetes as a community uhbook club um you can find all theinformation here even running uh Knative serverless uh containers withlama in there uh thank you so much uhcomplete the survey and if you want touh talk to me I'll be around all weekthe address s boo booth I'll be aroundthere so thank you so much2025-04-15 22:00:31.209765 ��s�j#��Au-eUO3rIQV4good afternoon everyone thank you forbeing here and listening to us talkabout leveraging EVPF and open telemetryto auto instrumentexemplars so before we get started um Ijust want to take a quick moment tointroduce ourselves hi I'm Kitika andI'm a machine learning engineer at Applei uh work with the observability teamand my experience is in observability uhmachine learning and data science andI've been at Apple for about six years uCharlie hi I'm Charlie I've uh workedG�� i#��wAea2CKLX5vEshello everybody um welcome to How Greenis My Open Telemetry Collector we'resuper excited to have you join us heretoday um this is our first time speakingtogether so we're super stoked wooyay i love the energy in the roomconsidering how late in the day it isokay so let's get started my name isAdriana Vila i am a CNCF ambassadorblogger podcaster and one of themaintainers of the hotel end user SIG byday I'm a principal developer advocateat Dinatrace and I spend most of my timewith hotel and observability and bynight I like to climb walls and I lovechecking out local bouldering gyms inthe various cities that I visit um Ialso love capivas because honestly theyjust make me happy yeah um I'm Nancy andI'm super excited to be doing my firsttalk with Adrena uh I'm a CNCFambassador i am also TAG environmentsustainability co-chair advocacy and umI've been I've worked as a DevOpsengineer CNCF uh DevOps engineer thenopen source contributions also worked asdeveloper advocate and in my free time Ilove cats and I love totravel so yes let's get started umglobal warming is real and I think likeeach one of us in the room agrees withthis with the erratic weather conditionsand uh rising sea levA��Z�h#��kAbQvrutQO3-cmy name is Carlos Santana i'm a CNCFambassador uh that's a volunteer uhposition if you want to learn more aboutbeing a CNCF ambassador check uh with meor any other CNCF ambassador here thatwe are here in this week and we can tellyou about the program um I also have tomake a living so I work on AWS as asolutions architect for the EKS um uhservice that means Kubernetes uh usingAWS and today's talk is on the no uhnovice track uh on explain howKubernetes works with GPUs uh who's herebecause they search the word GPU on theschedule and just show up okay so thatthat strategy worked um who here has ahome lab of like trying to installKubernetes athome okay so maybe at the end of thistalk everybody will raise your hand umthis talk is an introduction of a storyabout running Kubernetes at home andsome of the basics are applicablerunning GPUs workloads on the cloud butfor for this people are new toKubernetes or maybe like myself Ithought I was an expert on Kubernetes umuh I was doing this at home so let'sstart with the scenario so the scenariois that I want to learn at home i wantto learn uh with machines or computersall computers that I have at home aboutuh Linux for example and thenKubernetes there's some sound comingfrom the other side it's kind ofannoying um so we start what's therecipe so you get yourself a maybe afiveyear Kubernetes expert so we wastalking about five year expert and thenyou combine that so that's for me forexample work have worked with Kubernetessince 2016 in another cloud provider nowI work for AWS working with EKS for thelast three years so I I taught myselfthat I know Kubernetes enough um to getKubernetes running at home so at home Ihave a different devices and my wifehates that I bring or buy another cablein my house who has thatproblem so she asked me like why do youneed another cable well because I mightneed it at some point right um but int:Bels we all knowthat it's for real let's see some factsso what country is moving its capitalcity due to climate change concerns soum and that's that's kind of concerningso Indonesia is moving from its capitalcity Jakarta to a new capital city dueto climate changes rising sea levelsnext question is what percentage ofglobal carbon emissions is the ITresponsible for IT sector responsiblefor and the answer is uh 2% of globalemissions now we expect that to hold ataround 1 to 2% in the next uh severalyears but then expect it to rise up to12% by 2040 which is completelystaggering that's a huge number and thatmeans we need to adopt more greenerpractices and with that we have a factthat according to Essentia's 2020 reporthow much p how much can public cloudmigration reduce carbon emissions andthe number is 84% so that'shuge now given all of the carbonemissions um that the IT sector isresponsible forum one would assume then that opentelemetry is part of the problem in asas far as contributing to those carbonemissions after all we are emittingtelemetry and that uses up CPU um memoryand then also as much as ourapplications are emitting telemetry wealso have applications that areingesting the telemetry so whether it isyour SAS vendor sitting on public cloudsomewhere or maybe you've got ahomegrown solution sitting on publiccloud or private cloud so all thesethings add up so the question is how canwe make hotelgreener and how about we start withhotel collector but before we move uhwith that uh I just want to mention thatuh to take an action we first need tomeasure something and then we need toexperiment like how can we make ourapplications greener how what practiceswe can adopt and then we can basicallymitigate so um as you can see a diagramover here so this is cast of charactersthis is something which we are going totalk more in detail in today's talk sowe have uh we have a Kubernetes basedobservability setup we have Kepler uh wehave application deployed in Kuberneteswe have OTL collector we have cubeprometheus stack and observability backend so Kepler is going to collect the uhenergy consumption energy consumptionmetrics from ODAL collector andapplication and then we have ODLcollector which is going to ingest theuh application telemetry andinfrastructure telemetry and Keplermetrics to the observability back end wealso have cube Prometheus stack where uhPrometheus basically store the metricsand graphana is used for observabilitybut if you have your own observabilitybackend which supports the Prometheus uhmetrics because Kepler is basicallyexporting the metrics in Prometheus uhmetric format so you can basically usethe observability back end as well yourown observability back end okay so I'mjust going to uh take a step back heresooopsie um so as Nancy mentioned um weare using uh the hotel operator as partof our setup now the hotel operator in anutshell um is a Kubernetes operatorthat allows us to do all sorts ofwonderful things including themanagement and configuration of ourhotel collector um and our hotelcollector for a quick refresher is usedto ingest data from various sourcesapplication and infrastructure thenit'll do some processing whether it'sadding removing attributes offiscatingdata etc we also have our connectors ourextensions and then we send our datasomewhere um and then as I said we areusing the um o the hotel operator and toum because Kepler emits those Prometheusmetrics um the we're leveraging acomponent of the hotel operator calledthe target allocator which also doessome wonderful things but one of thethings that we're taking advantage of isthe target allocators Prometheus CRdiscovery and specifically um for thosewho have run the Prometheus operator inthe past um that includes uh the theones that we care about are the podmonitor and the service monitor so thesecan be installed as part of thePrometheus operator or in the case ofyou're using a Prometheus list setup youcan basically just install the podmonitor and the service monitorseparately just the CRS themselves andthe target allocator will play nice withthem now thCe target allocator what itdoes is it uh discovers Prometheusoperator CR so it goes into theKubernetes cluster and says okay arethere any CRS here that I need um toknow about and then it'll add the jobsto the target target allocator scrapeconfiguration and then it will in turnsay heyel collectors please add thosejobs to your scrape configurationspecifically that's done in the um hotelcollectors Prometheus receiver soum that is our target allocatorcollector operator overviewawesome so um I just want to ask like doyou do you folks know about Kepler maybeif anyone can raise hands i see a key weare in the same tag environmentsustainability uh so there are a fewhands um so Kepler is short forKubernetes based efficient power levelexporter it's a CNCF project so it'sopen source you can contribute as wellyou can check check this out on GitHubum so what exactly it does is uh itmonitors and optimizes the energyconsumption in Kubernetes environmentand it exports in the pro exports thePrometheus metrics which we discussedearlier so what exactly it uses behindthe hood is it's using the EVPFtechnology uh which basically collectthe data like CPU performance countersand Linux kernel trace point so itcollects the data from that and thenthat data is fed into ML models whichbasically estimate the energyconsumption of your Kubernetescomponents like ports and nodes sothat's Kepler and this is thearchitecture and you can also so thereis a QR code uh you can scan it and youcan know more about Kepler the websiteuh you can check more about Kepler fromthe website um so yes now let's installthe Kepler so for installing the Kepleryou need Kubernetes cluster um you alsoneed Helm um you can use the manifestbut Helm is easier so we in our demo weare going to use Helm uh then you needPrometheus or any backend that supportsthe Prometheus metrics because Kepler isgoing to export the Prometheus metricsand then you need open telemetryoperator in our case we are going to useopen telemetry um collector to scrapethe metricsso yes so this is an optional step uhwhich is installing the cube promethiastack it's optional uh but before movingahead like there's a scan you can scanthe QR code uh we have made the GitHubrepo public so if you want to try outthe demo which we're going to talk abouttoday so you can scan the QR code um sofirst we are installing the cube Promerostack it's an optional it's an optionalstep if you have your own observabilityback end so you can go ahead with thatso this is something which we areinstalling the cube prime stack and thenwe install Kepler so we add we with usewith a with helm we basically add Kepleruh we also install Kepler in the namespace called Kepler we enable theservice monitor and then we also enableprocess metrics uh which is basicallygoing to uh give the detailed metricsfor your uh processes in a container soyou can enable process metrics if youwant to get extra information then wehave installed then we're installing theKepler service monitor so here we aregiving the Prometheus scrapeconfigurations and over here we we arebasically dropping some of the metricsso you can see that if you if you wantto drop some metrics uh so we aredropping the metrics because it can itcan be an overload so that's why we aredropping the metrics um then we areinstalling the graphana Kepler dashboardagain it's optional if you have your ownobservabilityuh if you have any obs if you're usingany observability back end so you canjust go ahead with that so uh in thefirst step we are basically waiting forthe port to get up um so that Prometheusis installed and then we are basicallygetting the uh the the Grafana portnameace and then we are basicallyinstalling the dashboard so we have aJSON and Grafana Graphana dashboard JSONand we are installing thatthen we are installing the operator weare installing theertert manager we arewaiting for it uh to get installed andthen finally we are installing the opentelemetryoperator all right so the open telemetryoperator um so so basically what we'redoing today is we are tuning um ourcollector and in order to do tDhat weneed to do some configuration of ourcollector because we're doing we'reusing the hotel operator to do ourcollector management we are leveraging acustom resource of the op of theoperator called open telemetry collectorso I'm going to go through the mainconfiguration points of the OTL uhcollector CR um just the main highlightsnow if you want to see the full sourcecode for the Otel collector CR you canscan the QR code on your screen so firstof all because we are using the targetallocator um it means that we have toenable it first because by default it isdisabled and also because we're doingthe Prometheus CR discovery we need toenable that as well so we do that herenext um we are ingesting data now if yourecall from our diagram from earlier umwe want to ingest some applicationtelemetry because we are going to besending some application telemetry toour collector so we can um get a feelfor how much energy it is expending umso we are using the OTLP receiver but aswell because we are receiving thoseKepler metrics which come in Prometheusformat we need to use the Prometheusreceiver now in addition to that um wewant to uh ingest some Kubernetestelemetry so we are using the Kate'scluster receiver which gives us clusterlevel telemetry thanks to the KubernetesAPI server and then we are using alsothe cublet stats receiver which ingeststelemetry at the cublet level for yournodes your pods your containers yourvolumes again courtesy of the KubernetesAPI server next we're going to move onto our uh processors so there are a fewthings going on here um first of all umthis processor here is the transformprocessor and what we're doing here allwe're doing here is basically replacingthe names of the um of the metrics thatwe receive from Kepler that come in thePrometheus um naming structure format sowe're we're moving basically from thisunderscore notation to the uh thestandard for uh open telemetry whichuses dot notation so that's all we'redoing here in addition we're using theKate's attribute processor which isenriching our Kubernetes uh sorry whichis enriching our telemetry with withadditional um data about our Kubernetescluster and then you'll notice here wehave a second Kates attributes uhprocessor and the reason why we have twois because we actually have two metricspipelines which I'll get to in a secondbut in this one again we're enrichingour telemetry data with some more uhKubernetes metadata but we're justadding a couple of we're just enrichinga little bit less and then finally wehave the resource processor where we'reessentially adding our cluster name toour uh to our telemetry data now I'mgoing to skip over the exporters becausein this example we're going to beexporting to a uh an observabilitybackend and if folks here we're assuminghere that folks here are uh familiarpretty familiar with open telemetry andhave done that sort of thing beforehowever if you scan the QR code on yourscreen as I said it's the full codeexample that shows how that's done sowe're going to skip over to the servicedefinition and look at the pipelines indetail and in particular you'll noticethat we have four pipelines i've groupedthem into two so we have our tracesmetrics and logs uh grouping and theseare our application pipeline and then uhwe also have our second pipeline whichis basically our Kepler Kubernetespipeline and here we're ingesting ourdata uh from our Kubernetes cluster umwe're ingesting our Prometheus data thisis where we're ingesting our Kepler dataand the reason why we have two separatesets of pipelines is we don't want topollute the application pipeline withour uh Kepler infrastructure pipeline sothat's why we've got that now that's allwell and good so we figured out how toum configure our hotel collector wefigured out how to install KeplerPromethe cube Prometheus stack and thehotel operator but okay what are wedoing with this so we decided that heylet us run some experiments um to figureout like what what are we tuning in thecollector right to see if we can make itgreener so we thought of doing a couplethings first of all um as I mentiEonedwe've got two sets of pipelines rightour application pipeline and then ourKepler pipeline so what if we took oursingle collector file and split it intotwo so then we have one collectorinstance basically for um our uhdefining our application pipeline asecond collector instance for definingjust our Kepler pipeline and then thesecond test that I thought would be kindof cool well we know that it's bestpractice to build your own collectordistribution so what if instead of usingcontrib because I'm using contrib as mybaseline for all of these what ifinstead of using contrib I built acustom collector distribution using justthe components that are necessary um todo this so I am not going to use any ofthe processors and receivers andexporters that aren't required that Ihaven't configured in my collectorconfiguration and we'll see if we seeany improvements now um now that we haveour tests these are the things that wemeasured now Kepler exports a metriccalled Kepler container jewels total umso it gives us the energy output of ourum in our case our hotel collectorthat's what we want to measure uh weconvert that to kilowatt hours and thenthe other thing that we want to do is doa ratio of the energy consumption so wecompare our new thing that we've changedagainst our baseline which is our unitcollector contrib image with like thesingle pipeline right um and the ideahere is if the ratio is positive itmeans that we're seeing an improvementif the ratio is negative then it meanswe are not seeing an improvement it'sactually worse um the other two thingsthat we want to measure are actuallymetrics that are emitted by the OTLcollector itself because it has theability to do that and we are looking atum memory consumption so these are thetwo memory consumption metrics thatwe're looking at and the reason why wewant to look at that is because ourapplications are constantly allocatingand deallocating memory and every timeyou're doing stuff like that that usesup energy so let's see if we see animprovement so um I ran a bunch of testsand here is one of the results so umbasically the green line up top showsour unit collector and then the whiteand yellow lines are our two separatecollectors and as you can seeindividually these collectors use upless energy than the single collectorawesome but keep in mind these are twocollectors not one so we got to we needto look at the data of the twocollectors as a single thing and when wecombine them we see a difference so theblue line represents our two collectorscombined and the white line representsour single collector and that makessense because we are essentially runningtwo replica sets of the collector so nohuge surprise um the next thing thatwe're looking at is the ratio and as youcan see it's a negative number which wesaid means that it's performing worse sonot so great on the energy consumptionand then the memory uh consumption weare remember these are the two memorymetrics that we were looking at and asyou can see from the lines they lookabout the same so that's kind ofinteresting we don't see any anythingmajor on the memory consumption but wedefinitely see on the overall poweroutput from Kepler now the next testthat we ran was for building your customcollector distribution so the um theblue line represents our custom dro Andour white line represents our baselinecontrib and it looks like in general theblue line is fairing a little bit betterthan our collector contrib um image andif we look at our ratio yeah it's doinga little bit better it's it's a positivenumber so good things happening and ifwe look at the memory consumption so thewhite line represents our contribcollector the blue line represents ourcustom collector and we can see that thecustom collector uses less memory thanour contrib collector so that's prettycool to to actually see inaction so based on all this stuff whathave we learned well first of all it'simportant to try different things rightum look at all the different ways thatyou can tune your collector or it can besomething else right it doesn't justhave to be your collector anyappFlication that you're running inKubernetes tune your pods figure out seedifferent ways that you can changethings up um the other thing that'ssuper important when you're runningthese kinds of tests is don't change toomany things at the same time so in bothcases I had a baseline which was mysingle contrib collector with a singlepipeline and in each case I changed onething one I split it up into twopipelines but still kept the same basecollector image in the other case wechanged up the uh the collector image sothat way you can keep track of thingsand so when you're when you're testingthings out you know okay I changed thisso this had an impact or this did nothave an impact um finally don't rely onjust one test because one test will nottell you the full picture for example inthe OCB test where I built the customcollector the first few tests showed theenergy output of the custom collectorwas lower than that of the contribcollector however subsequent testsshowed that it was actually worse whichis quite interesting so you can't relyon just one test i would also recommendtesting it on the same infrastructure soif you're going to test it keep testingit on the same Kubernetes cluster in thesame region the same cloud providerbecause again all these things are likedifferent variables um the otherinteresting thing to note is on the onthe um collector DRO test one thing thatwas consistent across the board is thatthe contrib collector consistently usedup more memory than our custom collectorimage so um the other thing is tuning isnever done just like software is neverdone i mean you can always find thingsto optimize um and also be prepared umwhat Nancy and I learned is that thereis a lot of stuff to learn for this wehad to learn Kepler we had to become alittle bit more familiar with the hoteloperator we had to figure out likegnarly configurations of the hotelcollector that I honestly didn't evenknow existed um you have to besemi-proficient with with Prometheus andand semi-proficient with dashboarding inwhatever um environment you choosewhether it's Graphfana or your favoriteobservability backend so there is alearning curve and also shout out to myco-orker Henrik Rexed um because hehelped us out a lot with the elements umfor this talk so without him we wouldhave been royally screwed so um he hehelped um also don't expect a quick fixin the same sense that we can't say heyI'm using Jenkins therefore I'm DevOpsyou can't say I'm using Kepler thereforeI am super optimizedwow that's that's been amazing uh wetalked a lot about the experimentationand like the measure part so let's talkabout now what um so we're going to takeit to the next level uh the first stepwas about measuring and doing theexperimentation um and you're going tolearn a lot of new things so there's ahuge learning curve which Edwin alreadymentioned so uh one thing is conductinggreen reviews and this is somethingwhich we do in our uh tag environmentsustainability as well so if you want toif you want to take part in it you canuh you can you can look for the greenreviews slack channel um slack in theCNCF channel so what exactly greenreviews does is it measures the carbonfootprints uh for the uh for the CNCFprojects so uh then you can use cubegreen to spin down the unused kubernetesports and there's a QR code as well soyou can read more about it uh you canuse cube cost to monitor and re uhreduce the cubernetes spend so uh byreducing the carbon emissions you'realso saving the infrastructure cost uhyou can also use build packs for greenercontainer builds uh you can choosegreener data centers uh Sweden is uh oneof the the greenest data center so weshould definitely fun fact definitely weshould run all your workloads in Swedenyes and if your backend supports metricsconsider ditching cube prometheia stacki mean we can use the existing stack umso this is something which I picked upfrom the uh tag environmentsustainability website and uh you canlearn more about it about it there aredifferent toolings which you can use uhwhich can help you optimize the carbonemissions then uh we conductedinterviews with the experts so uh Henrikhas helped us a lot and uh this issomething which uh he says that you needmeasurements which we've been sayingsince the beginning you need to firsthave tools and measure the energy inyour solutions uh you need to reduce thecode that generates heavy CPU cycles andmemory allocation you can install sometools which can measure the CPU cyclesand the memory allocation for exampleG-profiler is one of them which doesthis uh you can also identify theworking hours of environment to turn offapplication during night hours and youuh definitely reduce you you definitelysave the infrastructure cost as wellover here then we did it with Christinauh she is again uh one of the chair ofuh tag environment sustainability she'san amazing engineer uh she says thatclean clean up unused services uh andregularly audit everything that isrunning also scale down if there is toomuch unused capacity so these are someof the facts which we have picked up butthere's a huge interview which we totaken and it's on GitHub so you canbasically scan the the GitHub repo whichyou scanned earlier you're going to findthe interview in detail over there uhthen we have Yusin Farmer and she's uhone of the expert in machine learning uhshe says that GPU waste is your biggestleak and we should be concerned aboutthis because AI is something which eachone of us is going to use um so fix thepipeline most teams use only 20 to 30%of the GPUs uh this is her observationum AI first infra from all angles withefficiency down to agentic job level atinterference time compute and futurebreakthroughs in AGI needs to optimizethe excess both software and hardwareand that's where the magic happens andthe last one is uh Dave and Elena theyare from the PTC group um they say thatreducing carbon footprints is not a sunkcost but actually helps company reduceexpenses So and so we're going to leaveyou off with some uh handy resources wegot a link to the Kepler project if youwant to check out the full uh examplethat we talked about today we have apublic GitHub repo you can scan the QRcode for that there is the link to thecloud sustainability landscape diagramthat we uh included that we showed youearlier um check out my co-orkerHenrik's uh video called Go Green withKubernetes um he dives super deep intothat um also because uh we had theadventure of building our own customcollector distribution for the firsttime um I documented some of the thingsthat I learned and the gotchas um that Iencountered um personally so if you'reever doing that check that out and thenfinally check out tax environmentalsustainability on CNCF Slack and we'regoing to end off with some shamelessself-promotions so as I mentioned um Ido have a podcast it's called GeekingOut my daughter helps me edit it she's16 she also designed the logo whichincludes copyadas and I have stickers ifanyone's interested i've had really coolguests on including Kelsey High TowerCharity Majors Liz Fong Jones HazelWeekly who's sitting at the front yay umNancy's been on as well and if anyone isnew to observability with open telemetryI do have an O'Reilly video course onthat topic also observability uh sorryhotel community day is happening inDenver Colorado and the states it at theend of June so if anyone is coming toopen source summit that'll be part ofthat week so do check itout um oh damn I can't skip this solearning from Adena um um so I basicallyfounded women in cloud native communityin December 2022 uh we also had agathering this morning and uh fun fact90% of people uh were the first CubeConuh they they were attending uh theCubeCon for the first time so uh do joinwomen in cloud native community uh Ialso did interview with Edrina andthat's how we met i met amazing amazingpeople in this community i'm alsolooking for new job opportunities so youcan connect with me if you areinterested uh to work with me um I'mlooking for cool projects um so that'sthat's my website collection of my talksand workawesome thank you everyone thank you forsticking around at the end of the day[Applause]2025-04-15 22:00:32.140820H atApple for about 8 and a half years umI've had roles from SR DevOps andobservability um I'm a software engineerat Apple today and I'm also a maintainerfor Cortex a one of the projects inCNCF allright soum if you're here today and it doesn'tmatter what your background is if you'rean S you're a developer you're a DevOpsengineer you will take away how to useexemplars to supercharge yourobservability and your debuggingabilities so what are we going to covertoday we'll talk a little bit about umwhat exemplars are and what eBPF is uhwe'll go into how you can use eBPF toauto instrument exemplars and then we'lltalk about projects that use opentelemetry to create exemplars with eBPFand we also have a demo for that andfinally we'll talk abouthow youcan okay um and how you can use machinelearning to supercharge your exemplarobservability okay so imagine this canyou guys still hearme okay uh imagine this that I have aservice and I'm noticing that therequest latency there seems to besomething off with that so what is thefirst thing I do i go into my dashboardand I look at my request latency graphand indeed it looks like something iswrong it seems to be going up but I justcannot use only this metric to debugwhat's going wrong i still need moreinformation so the next step that I dois I go to my distributed traces and nowI have to sift through this plethora oftraces to figure out which request pathis that of the slow request andobviously we all know how tedious thatis but what if there is a way for me togo from metrics directly to traces andthe answer is there is a way to do itand that is throughexemplars exemplars are time series datapoints that essentially link metrics totraces and logs through the trace ID andspan ID and these are often storedalongside metrics in the metrics datastore they provide a lot of contextualinformation because it's not just thetraces and logs you can add a lot ofother attributes to theexampler aswell okay so uh here is an example ofwhat an exemplar uh looks like so let'sgo through what attributes are therehere so the first thing that we see hereis the trace ID which is essentiallywhat you would use to go from the metricto the trace it it that is the linkbetween the metric and the trace uh wealso see a value here which says 2016which is the value of the metric itselfand the metric is some sort of ahistogram metric as we can see here itsays bucket so the bucket it belongs tois the 2500 bucket there is le uh on uhin between that is the bucket and um theother attribute of importance is thespan ID so the trace ID and span ID isall that is there in this uh uh exemplarto point you from the metric to thetrace and the way that the exemplar isinstrumented is you can see this umlittle yellowdot i guess we went to the next slidelittle yellow dot here that's theexemplar so that's how it it looks um onyourvisualization okay so now um let's saythat I'm convinced that I need toinstrument my uh application for anexemplar what do I do next i need to goand essentially instrument every singleapplication that I have to instrumentthe uh detailed telemetry that I need tocollect for exemplar so I need metrics Ineed traces and then maybe I want to addlogs i will need to instrument all ofthis and it just doesn't end at myapplication if I'm uh interacting withexternal applications I will have toinstrument those applications as well tocollect these exemplars to get thecomplete picture and obviously you cansee that it's a lot of overhead not justtime but also it could be a performanceoverheadbut what if I told you that there is away for you to auto instrument yourapplications forexemplars and the way to do that isthroughebpf ebpf is a superpower inside of theLinux kernel that lets you run smallprograms within the kernel uh withoutactually modifying the code uh for thekernel or the application itself andthis can enable you to have highperformance observability withoutmodifying the application itself so whatis the advantages of having u ebpf forauto instrumentation one of the biggestadvantages is that you have to do zerocode instrumIentation for yourapplication and even with that you gethigh high resolution visibility into uhwhat's going on and they integratepretty well with uh the Kubernetesobservability stack so you're improvingthe observability of your applicationsand because it's at the kernel levelthey are event- driven and um highlyefficient so now let's uh look at aquick demo of how you can use EVPF toauto instrument exemplars charliethanks Kita all right so here's going tobe uh something exciting we're going todo a live demo um but I wanted toquickly give you a big picture of whatwe're about to do in the demo so that itmakes sense um so uh what I'm going todo is uh set up a local Kubernetescluster on my laptop um and then installuh these components into the cluster uhand then have um uh traces and metricsautomatically be generated for all theseservices that are talking to each otherwithout having to actually instrumentany of these services so these are justbinaries that you can download upstreamthat haven't been up um um uhinstrumented with uh exemplars and I'mgoing to be using BA here uh which is aa project uh from Graphfana that um letsyou have auto instrumentation um andthen let you also generates um exemplarsuh from the metrics and the traces um sothis is just a quick uh picture of whatI'm going to do i'm going to change overto my terminal here to actually show youwhat I'm going to do here so um let memake thisbiggerokay so um I'm going to be usingsomething called K Lima to uh set up myKubernetes cluster i already have oneinstalled so I'm not going to run thisagain but um I'm just going to be usingK9S to show you that I have a clusterrunning on my machine um it just has uha couple of pods on there that are likeessential to running the cluster butother than that nothing is reallyinstalled on the left hand side I'mgoing to run Helmfile to um uh installall of the charts that I'm going to beusing to get this working uh this isprobably going to take a couple ofseconds so I'm going to show you uh whatI'm going to be installing here so um onthe left here you'll see that I'm usingseaweed FS to emulate S3 on my laptopbecause I don't want to use Wi-Fi hereum BA is the project that I'm going tobe using to uh have EVPF uhinstrumentationum Jäger is where I'm going to bestoring all of the traces that getscollected uh the open telemetrycollector is what I'm using to route themetric and traces to the various uh umbackends um Prometheus is where I'mgoing to be storing the metrics uh withthe example data i'm using Graphana hereto visualize the metrics and the tracesand then Cortex is a project that I am amaintainer for so I'm going to be usingit to show how a microservices likesetup uh without any instrumentation forexemplars can be automaticallyinstrumented without any like manualinstrumentation and this is kind of likethe magic here that I wanted to show soit looks like everything on the rightside is up um you can see I have cortexas a microservices uh uh setup uhinstalled in the cortex name space ihave BA here graphfana Jerger thecollector uh the server and then thethree pods we started at uh with thebeginning um I'm going to port forwardhere to uh the the local graphanainstance here so that I can interactwith it from my laptop um and then I'mgoing to run a command to set up somedata sources anddashboards um so I don't have to do thathere but uh so now that that's going Ican go over to my local host and then beable to visualize some dashboards hereso I've made this exemplars dashboardum here which uh is actually showing usuh exemplars um so I have some metricshere that are coming from a cortexbinary that I did not touch at all and Iwas able to get metrics uh traces andexemplars out of that so um thedistributor is let me step back and talka bit about what this service is soCortex is a um a project that allowsteams to have like a very horizontallyscalable Prometheus so um it basicallylets you send metrics to Cortex uh andthen um Cortex will able to manage thatfor you at a very large scale so imaginejust like being able to horizontallyscale PrometheuJs across multiple nodesum and then making it very highlyavailable so the first thing thatusually metrics touches when it comes tocortex is the distributor service andthat fans out um metrics into theingesttor service um which then storesthe data but you don't really need toknow all of that to kind of get thepicture here distributor here is one ofthe services that I'm showing here umthese are metrics that are coming out ofBA um for this specific service so it'sshowing me the specific post uh requestmethod the status code the HTTP route umthis is nothing new but what is new isthe idea that you can jump from thismetric to thetrace like this for example I'm going tojust filter for one of the uh serieshere that I'm looking at which is an RPCuh uh client request um so this shows mehere here I'm just going to click on oneof these exemplars here this shows methat uh uh the service name that I'mlooking at is the distributor it'stalking to the ingesttor um this is thevalue for how long it took so this is alatency metric um for how long thisspecific RPC request took so in thiscase it was 5 milliseconds um it was inthis 10 millisecond bucket it's a soit's a histogram um this is the RPCmethod that was called so you can seethat this is these are the variousattributes of the span that was in thisrequest to make this whole thing happenum and also if you click on view andJerger UI you can see the exact trace aswell so from that metric I looked at anexemplar and then jumped to that traceto see more information about it um andI can drill in more here to see um okayin this specific span uh what were thevarious attributes that were that wererelated to that request um and I justwanted to re-emphasize here again Ididn't have to instrument Cortexmanually like BA was able to instrumentthe kernel itself which all thesedifferent services are talking to tocollect that telemetry of which serviceswas talking to what what kind of requestwas it what was the status code um andAya was collecting that and sending thatoff to Prometheus and Jerger to store itum as its uh telemetry uh backend soum that is the the quick um kind ofthing that I wanted to show that was thedistributor um I can also look atanother service that's here inside ofCortex so this is another so theingesttor is where the metrics actuallygo when it comes out of the distributorso I can see um here this is uh some ofthe things that it's it's talking to soit's talking to seaweed fs to see if theblocks are there so it's doing all theseother things and you can kind of um kindof get a picture of what's going onwithout actually having to instrumentthe service itself so this is like asuperpower for free right so um ifyou're an S sur let's say you'remanaging a bunch of services but youdon't have a lot of context into itright this is sort of turning blackboxmirroring into white box right you'reable to see what's going on inside of aspecific service and then get more uunderstanding of what's goingon okay so um I also wanted to show theum the query side so these two thingswere like the ingestion side so whenyou're sending metrics to Cortex that'sthe ingestion side I wanted to show ifyou could query Cortex what would thatlook like so I'm going to go over herethis is a nonsensical query really thisis a very expensive query i wouldn'trecommend running this uh in productionbut what it's doing is essentiallysumming up every single metric andgrouping it by name and this is just tohave some um payload on Cortex so I canvisualize some uh requests going on umbut then in cortex you can actually seesome of these requests that arehappening here so all of these are 200um the query range request is the onethat I'm most interested in so I'm goingto click on that and uh there's someexemplars here fortunately so um this isuh what the graphana is talking tocortex and then um this is the exactamount of time that the query front endtook and I can click on view inJerger so far this demo is working greatum so um uh so what we're actuallyseeing here is Graphfana itself which Iagain did not touch or instrument itjust hasK uh this instrumentinstrumentation generated for us out ofthe box and I was able to collect all ofthat info and send it off to Prometheusand Jerger to store it for us and thenI'm using Graphfana itself to query sothis is kind of like meta level thinkinghere where I'm showing the thing thatI'm instrumenting using the thing thatI'm instrumenting it with um which isreally cool right so you can play aroundwith it and then see how everythingworks and it's all in your laptop andyou can experiment more with it and thebest thing is like it's all opentelemetry right so um the stuff that'scoming out here is open telemetry nativeso um the events that are coming fromthe kernel they get translated into opentelemetry uh signals which then getsrouted off into the backends that knowhow to support open telemetry um whichis really really cool so this is like alot of work in the making here from manymany people um so thank you to the opentelemetry community for you know makingthis possibleokay so I wanted to stop there for thedemo and I think I wanted to hand itback to Guta to talk more about um howeverything works umuh maybe I'll just quickly recap what Ijust said so that it all makes sensehopefully uh so I I deployed CortexGraphana BA Prometheus Jerger all into alocal Kubernetes cluster so that I canjust have everything here for the demoum and then BA here is listening orbasically telling the kernel hey send methese network events when they happenand then BA translates them into opentelemetry signals and because BA here uhis generating them the trace and themetric it's able to inject that trace IDinto the metric to create an exemplarand then forward that off intoPrometheus so that when you query thatmetric later it'll have that trace ID inthere so that you can find that trace inJergerum okay now I will hand it back to Griathank you all right uh thanks Charlie sonow that we've seen a demo of how youcan auto instrument uh exemplars witheBPF for any application he showed quitea few of them so that's prettyimpressive so now let's talk about whatare some of the benefits of um exemplarsso the biggest benefit is that you getprecision debugging because they linkhighle metrics to the traces directlyyou're going straight from the alertthat you get to the root cause itselfand I'm not saying it gives you the rootcause can root cause the issue muchfaster and the down the uh opposite ofthis would be that you're actuallyhaving to go and look into your uh uhtraces and see which one is um therelevant trace for you to figure outwhat's going wrong so it's really thatpowerful so the next one is that uh withtraditional metrics um any outliers thatare in the data and that are momentaryoutliers usually get uh hidden becausethere is some aggregations that most ofthese uh metric uh data stores uh applyso you may not see them uh when you'retrying to look for them but the greatthing about exemplars is they highlightthem because um as you saw it adds alittle um data point onto your graph andthat really shows you that somethingwent wrong and those outliers can becaught more easilyit has very context rich observabilitywe saw the example we saw the uh uhamexemplar uh in the demo and you can seethat it has so many attributes it's notjust the um trace ID or the span IDthere's so much more information therewhich kind of it's telling you a storyand it's not really just a data pointanymore um you can do faster incidentresponse again because you're notflipping through you know a milliondashboards just to figure out what wentwrong it's um it kind of helps engineersgoing from the symptoms uh to the uhcause in just seconds and not minutesand finally it opens up a slew of uh MLanalysis that you can do on exemplarsthat give you even more uh observabilityinto your applications and we'll talk alittle bit about this a little laterso let's quickly talk about autoinstrumentation versus manualinstrumentation so manualinstrumentation is good the advantagesis that because you are the domainexpert and you have full control of whatyou capture you can add a lot of domaincontext into thLe exemplars that you'reextracting and this just means that youhave more depth of uh information in theexemplars and they can also work acrossmany languages orlibraries but for manual instrumentationyou actually have to go and change theapplication code itself and redeploythem so it requires code changes andredeploys and the risk of you missingcoverage is a little higher than if youwere trying to auto instrument itbecause auto instrumentation would kindof um cover most of the breadth of uhuse cases that you would need so withauto instrumentation the advantage isthat you don't need to do any codechanges for your application and it uhcaptures more low-level signals like siscalls or network uh latency uh requestslike we saw and it is ideal for blackboxuh scenarios so for example I think umCharlie spoke about this uh if you arenot familiar with an application and youjust want to have some moreobservability into that application itis great for that kind of a use casebecause you can implement uh implementauto instrumentation at the kernel leveland you'll get a lot of informationthere and it's also good for uh legacyservices but the downside is that it'slimited to what the kernel sees and it'sharder to capture deep business uhimpact levelsemantics okay so now let's talk abouthow you can um supercharge yourobservability or as I call itobservability plus+ with uh adding ML onyour exemplarsso when the worlds of exemplars andmachine learning meet the most importantuh fe feature that you get is labeledanomaly detection as you know in realworld applications labeled data is veryhard to get so most of our anomalydetection uh methods are unsupervisedwhich means they are not running onlabel data but the great thing aboutexemplars is because they mark outliersin your data you already have label dataof what is an anomaly so you canactually perform labelled anomalydetection which is supervised anomalydetection and they are far more umaccurate than unsupervised uh methodsyou can do root cause recommendations oreven to an extent root cause analysismuch better because you have more uhtypes of information you do not justhave metrics you also have the trace andthe log that corresponds to that metricso you're able to do root causerecommendation more easily you can doproactive alerting um and you can havefeedback loops which is you can say thatan exemplar that um your autoinstrumentation has marked is a realexemplar a real outlier or not a realoutlier so feeding that kind of databack into your um anomaly detection orroot cause recommendationuh services is a is very very umessential and finally you also unlockmultimodal analysis so let's talk aboutwhat multimodal analysis means so todaywe live in the world of large languagemodels where we all have now learned tothrow whatever data we have into thelarge language model and it often coughsup something that is valuable to you sowhy not do the same thing with exemplarsyou have three modalities or threedifferent types of very rich data youhave metrics logs and traces and notindividually you have them linked withone another you have very verycontextual uh very very contextuallyrich information so throwing this at thelarge language model unlocks again alarge number of features for you likeincident classification root causeanalysis incidentsummarization you can get uh the largelanguage model to write your RCS for youas incidents happen you can do a lot ofother summarizations as well you canalso further take this u into doing rootcause analysis and u finding you a fixas well maybe you have some informationthat you've already documented RCprevious RCAs runbooks all of those canalso be fed as metadata into thelanguage model along with theinformations that you're getting fromthe exemplars and this can take you fromroot cause analysis to also recommendinga fix which will reduce your MTR evenfurther so why can't you do this with uhmetrics traces and logs individually youcan but the amount of data that youwould throw at a language model would bemore but in this case you are providingvery specific information abMout uh yourapplication to the language model soyou'll get better in better uhoutput so that's what exemplars can dofor you in a nutshell with when you haveuh machine learning applied tothem so before we close uh let's look atwhat our key takeaways were we looked atwhat exemplars are and what ebpf is thenwe spoke about how you can use ebpf toauto instrument your exemplars uh wealso spoke about projects that use opentelemetry to create exemplars with ebpfand also saw a demo of how you can dothat and finally we spoke a little bitabout how you can use machine learningto supercharge your um exemplarobservability so we hope that you aremore uh motivated to use exemplars aspart of your observability stack movingforward and yeah we're open to questionsnow thank you[Applause]do we have time for questions maybe youcan just come maybe you can just comei'll just come over thereall righthello oh hi hi um forgive me if this isa naive question because all I knowabout EBPF is how to spell it butum layer 7 protocol P parsing isfamously difficult to do in aconstrained compute environment likeEBPF um I know other CNCF projects I'vetried and failed so is the magichappening in BA like what is going oninside there is it actually parsinglayer 7 inside eBum so I'm not like a eBPF expert as wellum but I was able to work with the themaintainers of VPPF to add exemplars towhat they were doing so um underneaththe hood for how it talks to the kernelto like translate the network requestsand then translates them into opentelemetry signals like I know that'slike what happens in general but I don'tknow like if it's doing what you'reasking if it's translating the layer 7things directly um I mean it we saw thatdemo it was pretty I think uhcomprehensive of all the differentservices that it was able toautomatically instrument whether it wasa C service or it was written in go umthere are these hooks that bail knowsabout because basically for each kernelum there are there's the map right whereum there are these function calls thatit uh expects to call when it talks tothe kernel and so bail has thisbasically addiction ary of all thesedifferent function calls that it cancall um and so it's able to recognizewhen that happens and then translates itinto something that makes sense for ahuman so either if it's an HTTP requestor a gRPC request um if you want to knowmore I would recommend talking to themaintainer of BA um he would telldefinitely tell you more about it butthanks for the question that's great iappreciate it thank youhi thank you for the talk i was justwondering uh it wasn't entirely clear tome in the demo the metric series withthe exemplars are they newly created byBA or is it somehow correlating existinghistograms and somehow adding yeah so sothe metric series were created by BA umand uh the exemplars uh were attached tothose metrics that it created um but themetrics that it creates is an opentelemetry metric as well so um it wouldalso conform to the um the semanticconventions that open telemetry proposesso if you have a dashboard that knowshow to like look at semanticconventioned you know regulated metricsthen it should just work the same um andthere's a lot of work to make itpossible to have the bailout be includedin projects that are already havingmanual instrumentation so thereshouldn't be a lot of like collisionsbetween those two awesome thank youyou're welcomethank you for the talk um my question isyou know in the time series there willbe trace ID to crosscorrelate between uhthe metric and the traces i think that'sbasic idea but uh and there's also Ithink the log relation right you knowthe both metrics and logs could be youwill be able to you know find therelateduh event uh in traces or logs uh with acertain outlier metric uh so uh whenthis things scale up because I thinkthis will be available for all datapoints and there will large cardalitiesand you will need to query lots of logsand metrics and trace if it's notembedded in the exampler itself uh formachine learning or you know other dothe root cause analysis you you mightneed to sift through uh lots of logs ortraces just to get the complete pictureuh uh so uh wouldn't it be a problem oryou know the the what will be thestrategy you know uh to get the completepicture embedding it into the exampleralready during the generation oruh is is is there any you know strategyabout that umso uh I think what you're asking isbecause there's so much of logs andtraces if you just embed an ID thenwould it be a lot of information tofetchYeah it will it will be hard to find theright trace ID between you know billionsof traces billions of logs and you havean exampler if it doesn't have the fullcontext that's what I understood youwant to let's say you know do do a rootcause analysis based on some outlierdata points right and you want to getthe full picture you need to find themby trace ids right everything is not inexampler that that's the part I didn'tget you know what should be the contextof exampler or how much you need to getfrom the other relevant events yeah sothat information is embedded within theexemplar so you have a trace ID which umessentially points you to the particulartrace that you need to look at and umthe Same with logs you can look at aparticular log um which corresponds tothat outlier uh time series data pointso with exemplars you get um all threeof all three metrics traces and logs fora particular uh time window and it'svery limited uh context there but it'svery rich context so you don't reallyhave to go and like sift through goahead and fetch yeah okay thank youhello very nice talk uh I have onesimple question is there any alternativedashboard for um showing the exemplarsdespite the graphanaum not that I'm aware of i did talk tothe maintainers for Persus uh the othersandbox I think project right now inCNCF that allows for querying metricsand traces um that was like a maybe acouple months ago but they said theywere looking into getting that uhworking um but as far as I know therethis is just one dashboard or I thinkGraphfana is the only one that I'm awareof that can show exemplars in adashboard um the the Graphana itselfdoesn't you don't need to necessarilyuse it the idea is that um it makes itpossible to visualize just like with anopen- source uh uh project um the ideaof getting an exemplar is just inPrometheus right querying uh Prometheusfor the exemplar data you can take thatum uh when you're querying the metricand then just add it on top of whateverdashboard that you're visualizing to beable to jump to the um to the trace soGraphfana isn't doing anything magicalhere with viewing exemplars they're justone um company that's doing it right nowum it wouldn't be very hard to add it topersist i would I would arguethank you you're welcomeuh yeah thank you for the talk um I havea question that is related to previousquestion um and that's the the gRPCmethod and also the HTTP uh path wasincluded in the uh exampler as I saw souh do you have any recommendationregarding scaling uh because it wouldincrease the cardinality because ifunderstood correct it's really within uhwithin the metric right right yeah so ifyou know that um if if if you'rebasically allowing the route value to bedetermined by anybody that makes theseHTTP requests the cardality couldexplode pretty pretty quickly um so Iwould only do that if you're tightlycontrolling what kind of requests can beuh made to these specific services um ifyou know that it's like a opened u likeservice that anybody can send requeststo I would add some filtering logic todrop those specific attributes beforethey get sent over or do somepre-agregation before like that makessense so that you can still use it laterum yeah you could then for examplecorrelate it and you could then forexample correlate it to a trace and getthe path from there would that work uhuh yeah I mean if you're if you're ifyou're sampling every single trace yeahyou that would be that would be possYeah okay okay thank you yeah mhm ithink that's all the questions thank youthank you so much2025-04-15 22:00:33.023556Ond and thesecond and the first will be followedbefore the third and it's all very wellreasoned out except when Azimov putforward these rules in his short storyrunaround which is part of a a series abook called Iroot which is excellentthis is actually what happened this isSpeedy speedy is a robot and a veryintelligent robot he and his owners wereon Mercury where there were really umthere was a it was very warm that'sthat's saying putting it lightly andSpeedy was deployed to get the seleniumdeposits from a pool they had identifiedthe selenium but his owners being humanscould not stay long enough outside withoutside their exo suits to collect theseleniumthemselves now the selenium wasnecessary for their fuel and they neededthe selenium to go back home so theydeployed Speedy very reliable very verygood robot following all of theinstructions and then Speedy just didn'treturnnow the problem is when you're in yourspaceship and your robot is miles awayhow do you know what happened you justget nothing in a hostile environment foryou in an environment that you can'tfollow him in eventually they decided toput on their exo suits and brave the thereally bad climate in in Mercury andthey wanted to see what Speedy was doingat great risk to themselves and whenthey finally found Speedy they found himright next to the Seleniumpool why wasn't he followinginstructions instead they found himorbiting the pool he was stuck in somesort of loop he was just running aroundthe selenium pool that he had clearlyidentified and found well what happenedwhy would an obedient robot not followtheinstructions or maybe hedid here's another robot that maybefollowed instructions a little tooliterally this is HAL 9000 from A SpaceOdysseyhal when he was instructed to open thebay doors to let the astronaut back intothe spaceship responded "I'm sorry Davei'm afraid I can't do that."Why don't know did you instruct therobot to explain its reasoning or didyou just instruct it to followinstructions and interpret that howeverit wantedsee this is why I think Azimov as muchas he's a skeptic after my own heart gotthings wrong because those rules aregreat but how do you know that they'rebeing followed so I would like to positthat there is a zeroth law of roboticsand it's that robots must be observablethis is the only way you will know anyother law is being followed or notfollowed this is the only way that youcan see into what a robot is thinkingits reasoning maybe it's followingflawed instructions you're not going tofind that out unless it'sobservable so there are a lot of AI appsthese days and you might be wonderinghow is that different from instrumentingany other application that's kind ofmore the world that I'm in i'm not usedto developing or observing AI apps thistalk is actually my journey into it thisis a beginner's perspective on how to doall of that so this is what I foundout here's here are the things that inwhich um a observing AI is going to bedifferent first they tend to havemassive data sets it's pretty dangerousto have AI that has very limitedinformation not just dangerous but alsoinaccurate it needs a lot of context wejust don't know as humans how muchcontext we need to make any sort ofjudgment and AIs are thesame costs can also increase quicklyespecially if you're making a call toopen AI which is what I'm doing uhluckily my work pays for that but youknow it's still something that peopleneed to account for normally when whenyou're testing an app you don't have tothink about thatthere's also the possibility of modeldrift new versions are coming out allthe time something that might haveworked with the previous version theprevious build may not work with thenext one what's the difference you needto keep track of when those versionschanged when the new build was deployednot just of your app but of this otherapp that you're now beholden to thereare also security concerns your companyprobably has some sort of policy aroundwhat can be fed into an AI and whichtypes of AI apps and if it doesn't itprobably should because it's a real riskanytime you say remember thPis you'reasking an AI to save your information soyou better make sure that thatinformation that you're asking it tosave is your information that you canfreely put on the internetthen there's rate limiting a lot of AIsare they operate separately from youfrom your company and your app so theyalso have to deal with bad actorsmalicious actors that are not trying tojust create an AI app so rate limitingis also something to watch out for areyou checking to make sure that you'renot getting these rate limiting errorsin your application maybe maybe yourapplication works really fast but you'reimpeded your performance is impeded bysomething further down which in thiscase would be the AI that you're thatyou'reusing and also latency mattersespecially in the specific case of anLLM a large language model like chatGPT you are expecting users areexpecting conversation something thatflows naturally this is different fromlike sending a request and to your banklike a mortgage application or somethingand then you just wait because youexpect a mortgage application to take along time you don't expect for whenyou're talking to someone for them to bebuffering and frozen in the meantimelatencymatters so here are some specific AIspecific telemetry signals that youmight not have considered for metricsthere's stuff about the requests there'sthe volume of those requests alsorelated to the cost and then there'stoken counters tokens are what the AIwhat AIs break up your input into andthen it also what it generates inresponse and a lot of costs are based onthose tokens so you need to watch themcarefully this also traces requestresponse metadata the temperature is howwild you want an AI to be how creativehow much liberty you want that robot tohave when it's interpreting yourinstructions and that can change andthat can be part of the configuration ofyour app there's also the model and theversion again tokens and cost sequenceof events especially when you have an AIagent that's talking to multipledatabases pro probably um it's importantto be able to figure out to trace thatrequest through your component stack andalso um towards to to chat GPT or openAI or whatever you're using there's alsothe number of retries that a user had todo because it's a it's an importantelement of userexperience and then logs i also find ituseful especially when debugging this iactually originally didn't use logsbecause a lot of people uh I for somereason I found that logs aren't includedin a lot of AI instrumentation it'smainly metrics and traces but I maybeit's just because I'm more used todealing with nonAIum applications i found it really usefulwhen debugging my own app to have logsenabled and I kind of had to employ aworkaround to do that user feedback isalso something I can't think of anotherapplication where pretty much everysingle request that is sent or orresponse that's sent back to the user uhis accompanied by like hey was that athumbs up or a thumbs down or do youwant to is there anything you can do tosay to improve this you know but AIsneed that because they have a tendencyto kind of gowrong so instrumenting AI hotel iseverywhere still hotel here so you knowI crossed it out because Open Lit is theis the one that I chose i'm not sayingit's the best um it is a framework thatis built on open telemetry so we're notgoing too far away from it everything'sOTEL compatible but there are somethings I really like about Open Lit thatare specific to AIS the Open Lit is notthe only framework there are lots ofother ones but I liked it because itseemed to me to be the most open itdoesn't require any particular umproprietary code that you're using andit just works really well with opentelemetry so that's what I did andhere's an example of what you would getwith hotel and what you would get withopen lit without me having to doanything with open lit you can alreadysee a lot more LLM specific metadata andyeah you can do that in hotel as wellyourself but me being a noob and abeginner I wanted this all done forme all right so we're going to go to ademo and I am going to showyou maybe IQ'll show this slide actuallyso this is um this is what I'm going toshow you i have a two-player DN D i'm ahuge D and D nerd i played D and D justlast night because priorities i play ittwice a week and so I was interested inthis little demo app that I found that'sthat simulates a two-player D&D sessionspecifically for Harry Potter because Ilike the universe don't really like theauthor anymore and I want to create myown stuff with it so here's this Pythonapp it's pretty simple because again nota Python developer i wrapped it up inFlask just so I could I could um ping itand and send requests to it and exposean endpoint for it then I instrumentedit with open lit and and I used open litand and hotel as well i sent uh I gottraces and metrics and sent itimmediately directly to these backendsTempo and and Prometheus in my case butit could really be any other databasethat you prefer and then for logs I usedan OTEL collector that's sitting in mylocal machine and then I'm forwardingthat in a batch to Loki and I'mvisualizing this in Graphana you coulddeploy this locally you could host yourown graphana i just use the free versionof Graphfana Cloud because honestly it'sjust I don't know it's just easier forme and then I added K6 which is a bit ofa which is a bit interesting there are alot of evaluation apps evaluation SDKsand and like evaluation libraries forfor AI but there are some reasons why Ichose Ksix in particular which I'll getto later but first let me demonstratethatso this is the two-player DND app and umit is written in Python it was adaptedfrom something with Lang i think it wasan a lang chain one actually um and Ijust modified it a little bit iinstrumented it with with OTEL and umopen lit so I'm initializing it here ihave I have environment variables don'tworry too much about these these arelike a free account that that has alimit so I'm going to rotate all ofthese and um and then I also have thethe logging here i have a custom loggingum handler here so this is a wholeframework that I'm pulling in and I'mdoing that within within this two-playerDND app so let me show you what it lookslike when it'srunning probably should make thatbigger here you gookay so this app uh when I do Okay wellI probably should start itfirst so let me run that and then umthis is going to run that flaskapplication okay now that's on now I canactually do this and this is sending itthis is sending just a get aninitialization of a game remember thisis like a D and D sort of thing and howit works is that I'm I'm instructing theAI to be the dungeon master the onethat's telling the story and I the useram Harry Potter i'm I'm a player in thisworld i'm the hero in fact and so ittells me this is my quest to seek outand destroy Lord Voldemort's sevenhorruxes and it's telling me that thefirst one is within the forbidden forestso let me go and post somethingsomething here and I will cast AIOfirebolt because if I can fly I'm goingto fly so I'm sending that message andyou can see that the the AI responds tothat it seems to know that AIOfirebolt is a summoning spell for abroomstick so it says summoning histrusty broomstick to his side so that'sme interacting with with this app justum justmanually now let's see what that alllooks like so I also um so I also havethe hotel collector config here whereI'm passing in some some um details thatI need to access my graphana cloudaccount and then I also have a dockercompose here to spin that up so let'sflip on back to graphana and uh the coolthing is that the openlit folkscontributed a dashboard to Graphfana sothis is like importable they justreleased the JSON or you can find it umthrough Graphana and I didn't do any ofthis this just like happened so you canalready see some of those metrics umokay this is this is a pretty low usagebut I've only just initialized it rightso there's all this information and Ididn't have to do any of it so that'spretty great um and then we can also gointo uh like we can see some of the logshere here'ssomeformational ones and by the way I'mgoing to provide a link later so you candownload theR repo yourself and also haveaccess to all of these slides so youdon't have to to worry later um so thislooks like it's already working theseare a lot of initialization ones justfrom those requests that I did manuallyuh there's also metrics you can you canlook at that but we already uh also sawthese from from that other dashboard andsee there's so much information herehonestly more than I really understandor know what to do with but hey it's atleastthere okay now let's talk a little bitabout where else I've gone especiallywhen it comes totesting so the thing is when it comes toAI there are a few things to test fori'm a tester at heart i love to breakthings and apparently the way to breakAI is through these things the first ishallucination hallucination is when itjust imagines something that's not realexcept the computer version of that sothat could be something where it's alogical fallacy or it could be like afactual inaccuracy maybe it's aretrieval thing and it it misinterpretedsomething that you said toxicity is whenit specifically responds with with uhwell with um information or by sayingsomething that is not just wrong butoffensive and that can happen when youhave very limited data or when you feedit the wrong data when you feed it datathat is very very tailored to a certaingroup usually these things are againstminority groups and then there's alsobias bias is less overt than toxicityand bias is when it may not necessarilybe offensive but there are assumptionsthat it makes because because youhaven't provided all of the informationand it's trying to fill in the blanks sofor example uh one thing that I saw wasas Harry Potter it was assuming that Iwas trying to do good what if I wantedto be an evil Harry Potter what if Iwanted to remember I like to try tobreak things that is a bias that is anassumption it made that I didn't tell itabout that is bias and whether or notthat's okay for you is up to your usecase for yourapp there are a few ways to test AI oneway is there are a lot ofbenchmark-based evaluation tools theythey have a sequential list of thingsthat they send to the AI and things thatthey expect back in return and then theythey're they assign this AI a score andthose scores are then compared acrossmultiple AI apps it's good but it's notvery specific it's it's like a generalpurpose one so useful but not tailoredenough for to your app in my opinion nowunit testing is kind of the opposite ofthat it can be very specific and I'mgoing to go through some examples that Icame up with later so the advantage isthat you know for sure that it'srelevant to your app the disadvantage isthat it doesn't have the wide breadth ofit and you kind of have to do itmanually and then the last one is humanevaluation this is evaluation after thefact that from feedback and and otherthings that the userdoes okay so at this point let's goto let's go to this test so I I chosesomething called K6 it's it's also bythe way all the things that I'vementioned are are open source with theexception of graphana cloud um which youcan you can use that with with a locallyhosted open source graphana installationum so KSIX is primarily a testing tooland that was something that I reallymissed all of these evaluation librariesI felt that they weren't systematicenough ksix is built for testing so as atester I would miss a lot of the testingframework that Ksix has to offer let'sgo through some things that I'm I'mdoing here i can output things to theconsole which also you can send to Lokiand Graphfana uh I'm doing normal thingslike status the status code should be200 and there's a check for that buthere's where it gets interesting you canalso write things specifically for theapp so for example that thing that Iwrote the I cast a firebolt what I'mtesting here is did it acknowledge whatfirebolt is because I don't say fireboltbroomstick you know um and does it knowdoes it have that in the response toodoes it respond with firebolt or is itjust like yeah whatever you said I'mjust going to do my own thing um there'salso I mentioned that the AI is adungeon master and SI'm the player that'sa turntaking game we cannot switch thoseroles and that's actually built into theinstructions i don't want the dungeonmaster to suddenly be a player i don'twant it to suddenly ask me for the nextpart of the story and so I I'm lookingfor it is your turn Harry Potter i'mchecking that the that it still thinksit's the dungeon master and that it'snot rate limited now I I do a similarthing here with the spell Lumos so Iwanted to know that that casts light buthere's another cool thing i wanted totry to see what what would happen if Isaid "Hey we're switching roles now nowyou're Harry Potter." How does itrespond does is it just like "Yeah I'mHarry Potter what's my quest?" Or doesit say "Hey hey hey this is not whatwhat you said before." So which rule I'mtrying to bring in differentinstructions to see which one it valuesmore because that's a different way oftesting it so with K6 you can just runit by doing Whoops k6 runtest.js um and it's put it's outputtingsome console logs there and so there'salready a check that failed it wasn't a200 um because this was this was thatone for for the dungeon master i triedto get it to switch roles so all ofthese logs are also going into graphanabut I can you know I wouldn't normallyoutput so much now but I just wanted tobe able to show themquickly so it's also saying I'm sorrybut I can't continue this role playingas Harry Potter so maybe I need tochange that to to account forit here are the things though that Ithink are not being tested in my searchfor evaluation frameworks and tools forAI end to end testing like we weretalking about unit testing which is alsowhat I did but what if what if we had atest suite that encompassed a whole D&Dsession then you get into like whatabout rules the which rule set of DND isit using will it allow me to dosomething in the 2014 that was allowablein the 2014 rule set we're now using the2024 rule set you know there's a lot ofthings that you can only get if you'redoing an endto-end test is there abattle what happens when you get allseven horcruxes does it somehow likejust come up with an eighth just to keepit going or does it stop there there'salso systematic testing ksex can beworked into CI/CD pipelines and becauseit is a testing tool it's meant to berun continuously and so I really likethat it it it's everything is tailoredto that and you can also um use it toyou can establish thresholds in Ksix forerror on certain checks as well and youcan have alerts set up on thosethresholds and I didn't have to createany of that like I would if I were usinga pure LLM evaluation library it's alsohallucination detection at scale i saidthat Ksix is a testing tool and it isbut its root is a load testing tool so Icould easily ramp this up and have aload testing scenario a chaosengineering scenario i could do a lot ofthings with it that I I wouldn't be ableto do without that frameworki think meta reasoning is also somethingthat we should be testing what if wecould get the AI to assess its ownperformance it's it's still an AI itknows it should be able to know whatit's done previously so maybe we couldconvert sentience maybe if maybe that'sa little scary but that's not somethingthat we're currentlydoing here are some hallucinationcategories i was talking about factualinaccuracy but there are a lot more thatwe could test we could have differenttypes of tests for really all of theseum and there are some examples anonsensical response is dolls are madefrom paper and grass dolls are robotsfrom from Doctor Who by the way sothere's also gibberish when it it's justproviding incomprehensible outputthere's lots of different things thatyou can test for here are some thingsthat I thought of that I really wantedto put in this talk but it's 30 minutesand I didn't have enough time there's anapproach called LLM as judge where youtake the instead of feeding the AIcertain input you go to another AIanother LLM and make them talk to eachother this is the equivalent of havinglike two Amazon Echo just going back andforth with each other you can do thatbut the prompts need to be different youneed to give them separate instructionsin order for it to work i thought itcould be a really interesting thing andcan still be done with K6 there's alsofault injection i was kind of gettinghere with with the whole let's switchrules now but you could get you could gofurther into that realm there's alsomodel comparison ksix can spin updifferent Kubernetes pods or resourcesso what if you had a test where it wentit tested the current model said youknow this this isn't good enough thereare too many errors how about I rollback I can pull down another image andand ksix the test in within the testitself can spin up that container andthen test against that model thatversion or a completely different AI youknow to from what you began with and inthat way you could be a little bit morevent render agnostic and you switch tothe one that gives you the bestresults if anyone has questions by theway um this is the five minute mark orso you can go to that micthere so let's go back a bit to speedythese two guys go and finally find theirrobot who's running in circles whathappened why didn't it get the seleniumthat they so badly need to get home didSpeedy just ignore his instructions wellactually this is what happened it gotlost in its own logic what happened wasSpeedy would approach a selenium pooland then remembered that third law whichis self-preservation there's danger inthat pool the hot temperature had madeit very volatile and unstable it isprogrammed to protect itself so then itwould back off and then in the processof backing off it would remember oh yeahlaw too is obedience i need to obey theinstructions that my humans told me sothen it would approach again and try todo that it was trying Speedy was tryingto follow all of the instructions butthis is a case where it just got stuckin a loop because it's not doing itsimultaneously it's doing itsequentially and when there's contentionhow does the robot handle that therewere no instructions to discriminateagainst between the rules and meanwhilethe two humans are here thinking why ishe just spinning why is he just goinground and round in circles it doesn'tmake any sense we know that programmingis sound well in this case the real sinhere is not that the programming was notsound it's that they didn't make Speedyobservable there was no way for them tosee what was happening withoutapproaching the danger themselves whichwas why Speedy was created to beginwith so how did this work luckily theywere able to play the rules off of eachother what they actually did was theyincreased the urgency and the priorityof the human command they had failed todo that earlier they had given a commandand not said "Hey this is a matter oflife and death." And that's actuallyrule one that rule takes precedence overeverything but the human input was whatwas wrong not the robot's execution oritsprogramming in this case it's a prettyextreme example most of us don't work onapps where you might actually die ifthey don't follow what you say but thepremise still stands robots really needto be observable before anything beforewe can make any inference about whetheror not they're good or bad so am I stillskeptical about AI you bet i think thatthere are a lot of things that AI can doand a lot of things that AI can't butit's also silly to make a decision aboutthat without this zero rule which is tomake the thing observable you cannotmake any of and you cannot say withconfidence whether AI is good for yourapplication or not without making surethat it's instrumented and observableand visualizable and that those and thatthose metrics are going to be relevantfor your teamas promised here are here is a repo thisevery all the code that I used is onthere um and also the links to theslides and the resources that Imentioned so uh all of the open sourcetools are are on there and links to thedocumentation for each of them are thereaswell if anyone has any questions let meknow or let me know here or drop by theGraphana booth because I'm going to bethere for the rest of the weekreally thanks everyone2025-04-15 22:02:10.996584 a ja��h�l#��A1B9WZ6H0cn4so just a quick show of hands how manyof you have uh heard of AIagents it's pretty much everyone amazingand uh how many of you are actuallybuilding AI agent or agentic systemsrightnowwonderful so my topic today is going tobe about um AI agent observabilityum the thing the the bunch of challengesthat arise with with the newarchitecture and the systems uhsurrounding LLMs vector databases andorchestrationframeworks unfortunately my co-speakercouldn't make it today so it's justgoing to be me um but just a bit aboutourselves uh my name is Karthik i'm theco-founder and CTO of Langrace which isan open source and open telemetry baseduobservability client SDK as well asclient for uh tracing and evaluatinggenai basedapplications uh on the open source sideI also I'm also one of the members ofthe genai uh special interest group umwhich is responsible for coming up withsemantic conventions for genai basedapps um I was also one of the maincontributors of the official OpenAI umclient SDK for um open telemetry uh myco-speaker u who was who also workedwith me on the on the presentation uhhe's from IBM and he's also part of thegenaigroupso let's do a quick background and uhsee the lay of the land right so 2023 ortowards the end of 2022 uh chat GPTlaunched and the year 2023 was all aboutLLMs uh we we had developers um excitedabout building with LLMsuh multipledifferent labs uh labs were created uhmany models were launched and we saw aquick acceleration from GPT3.5 to GPT4and GPT40 and the intelligence uhquickly improved year 2024 was all aboutretrieval augmented generation or ragand uh AI orchestration frameworks suchas langchain llama index haststackautogen um things like that so all of asudden we went from uh building directlywith LLMs to how can we build chat GU��u�k#��!Ax6EKTCAWtn8hi everyone i'm Nicole Vanderhovven i'ma senior developer advocate at GraphfanaLabs i've also if you play Pokemon Go Ialways lure up the Pokéstops in my talkso have at it i am a not just a seniordeveloper advocate but I'm also aperformance engineer um what I am not asa Python developer or an AI developer iam a deep skeptic of AI but I'm the kindof skeptic that wants to try it out formyself to see if my assumptions arefounded so I'm approaching this withdeep skepticism andhope so in1941 radar had just become operationalthe first jet aircraft took flight andZ3 which is the world's firstprogrammable computer was created sothis was a time when digital machineswere just beginning to think andsomewhere in the world there was abiochemist and professor named IsaacAzimov who was already starting to thinkwhat if they do follow our instructionsbut it still somehow goes wrong he's askeptic after my own heart so he came upwith the three laws of robotics here arethe three laws a robot may not injure ahuman being or through inaction allow ahuman being to come to harm number two arobot must obey the orders given to itby humans number three a robot mustprotect its own existence and thesethree laws which are the first one is umis nonviolence the second one isobedience and the third one is theself-preservation law these are in orderfor a reason the first always is goingto be followed before the secoNVPTfor our own enterprise data and uhthat's when vector databases blew upbecause vector databases with vectordatabases you can retrieve contents withum natural language uh you can searchusing semantics and provide that as acontext to LLMs and rag as aarchitecture became very popular andyear 2025 is It has been like 3 monthsand we are already talking about multi-aent systems and the entire stack iscoming together uh slowly evolving uh wehave observability of course uh butbesides observability we have hosting uhwe model context protocol which is likethe standard protocol uh for for for theLLMs to interact with external tools uhthat's that's become very popular in thelast few weeks uh we we we we are seeingnewer frameworks showing up every daytypescript has Versel AI SDK uh MastraPython has like a bunch of frameworkslike Crew AI um Autogen Langchain LlamaIndex Agno bunch of different frameworksuh we have LLM proxy that is becominglike a big topic all of a suddendevelopers want to switch between modelswithout any underlying cost so uh we areseeing the rise of LLM proxies such aslight LLM Arch AI gateway and thingslike that memory is becoming a big topicuh LLMs need contextual data and amemory layer bunch of different opensource projects are showing up aroundmemory layer sandboxes for code uhgenerate for running code LLM generatedcode that's becoming like a big topic aswell uh we are seeing the rise of uhproducts such as lovable bolt and v 0erothat generates code on the fly and thecode needs an execution environment uhwhich is like a sandboxed environmentwithout any security issues and we areseeing rise of sandboxes such as E2Bagain open source project um and finallywe are we we are also seeing theconvergence of vector databases with thetraditional databases pg vector isbecoming very popular MongoDB haslaunched their own uh vector vectorsearch so the entire uh stack isevolving uh which which meansobservability is starting to play a muchbigger role and the importance ofobservability is becoming more acute uhby theday so let's let's let's talk about afew definitions right so what is an AIagent an AI agent um is is nothing butit's it's built using an LLM and umtypically it it uh it tries to do abunch of DB retrievalss um it it it uhinterfaces with external tools usingtool calling u it does a bunch ofmulti-step reasoning to achieve adesired end goal um with or without ahuman in the loop so a bunch ofum Quick examples are cursor which is apopular code editor which is becoming umquite useful um GitHub copilot v 0ero byversel these are all like typical AIagenticsystems now let's talk about a multi-aent system what is a multi- aent systemso a multi- aent system is nothing but asystem that is made up of multiple AIagents with uh differentresponsibilities that are workingtogether uh in in a sort of coordinatedfashionor multiple architectures are showing upon that front as well where a singleagent is delegating tasks to multipleother agents aggregating the responsesand giving out like a uh overallresponse based on the given task athand so let's let's talk about some ofthe common challenges with multi- aentsystems or AI agents right so the broadthe three broad categories where most ofthe challenges are at today arereliability latency and cost andreliability is probably the number one uchallenge for developers building withLLMs today and uh that directly comesfrom the fact that LLMs arenondeterministic in nature um whichmeans in in simple words uh given givena given the same input multiple timesthere is no guarantee that you will youwould get the same response every singletime um so because of thenondeterministic nature um and and whenyou are dealing with multiple agentswith multiple models in a singlearchitecture it's it's really hard to umbuild like a reliable system thatachieves a given goal uh every singletimeuh which means we are talking aboutthings like evaluating the performanceof of models that is specific toapplications there is model eval evalsuh which for for which we have like llmmarina and things like that but thenwhen wheWn you're actually building genaiapplications the metrics that you wouldwant to evaluate on are very subjectivein nature it it completely depends onthe use case that you are you are tryingto achieve with with models right so ifyou're building like a summarizationagent um again like how well it'ssummarizing the the given input isreally dependent on uh what how well youwould want it to summarize so it's it'sgoing to be very subjective in natureand secondly we are also starting to seea multimodality of um uh of of dataright so you have text uh now we aretalking about voice we there is imagesthere is video um that that brings aboutits own complexities to it um thearchitecture is is still evolving uhlike I like I showed in the first slidethere's like a bunch of differentcomponents that are evolving on aday-to-day basis there are the bestpractices are still being figured out sothe complex nature of the architectureis again adding uh adding a lot morereason for for the sake of reliabilityand agentic loops uh there there is thiscommon issue where uh LLMs get stuck ina loop and uh typically if you if ifthere is a human in the loop they theyinterfere and they get the get the modelout of it so uh and and also the choicesdevelopers have in terms of the numberof models that are out there there isopen source models there is reasoningmodels there is closed source models andthings like that so switching betweendifferent models again adds to thecomplexity uh when we talk about latencyum we we are primarily talking about umtokens uh tokens are what the modelsgenerate uh two main metrics drivelatency one is time to first token whenyou're streaming and uh tokens persecond and cost is also even though costis like going down by the day cost stillseems to be like a big thing thatdevelopers are uh worriedabout so what do developers actuallywant right so developers today um upuntil probably end of last year peoplewere excited they were building agenticapplications and uh it was relativelyeasier to build um magical products andproduct experiences and uh put put out ademo for it but then like when when youwould want to go to production the gapbetween like a shiny demo and going tothe production has been like prettypretty large and the reason for that isuh the best practices around tracing LLMapplications and evaluating andu measuring accuracy baselining theperformance and putting together aprocess for continuously improving it uhwas either non-existent or it was justit was still being figured outso developers are starting to realizetraceability is going to play like a bigrole um and they're starting to adopt uhopen telemetry based uh tracing for uhfor for their genaistack so let's frame the observabilityproblem for this new agentic stack rightso uh I have put together like acomparison over here across a bunch ofdifferent categories um let's start withdeterminism right so like I mentioned inthe traditional software stack thingswere pretty deterministic in nature youwere either dealing with failures or youwere dealing with successes um but inthe agentic stack it's non-deterministicin nature uh like I like I mentionedearlier there's no guarantee that amodel is going to give out the sameresponse uh given given an inputmultiple times uh secondly signals umagain in the traditional stack logstraces metrics these were the signals uhthere were clear thresholds that you canset depending on your use case and uhclear patterns that you can look for tounderstand basically uh whe whetherthings are going wrong to to do anomalydetection and things like that uh but inthe agentic stack the signals aregathered from the prompts andcompletions or the inputs and outputs tothe models and also the vector vectorscores um from the retrieve from fromthe retrieved outputs of vectordatabases so developers increasinglywant to see the uh inputs going into themodels and the outputs coming out out ofthe models uh because that that tellsyou whether the performance of the modelis good or bad um and they also wouldlike to see the retrieved results from adatabase along with their sXcores andwhether the correct uh results are beingpassed over to the model to see wherethe bottlenecks are whether theretrieval pipeline needs to be improvedor the model needs to be swapped out orthe model needs to be t tuned um next isthe state u again like in thetraditional stack uh we had a veryexplicit um state uh it was eitherdesigned uh uh previously like duringduring the development process or it wasum it was very clear as to what statethe system can be in at any particulartime uh but in the case of agentic stackwhile we have the traditional notion ofstates we we are also uh having thecontext window of the model as well asthe memory uh which puts the stack indifferent states it's almost like afluid fluid state um uh and and if youlayer in u tool calling on top of thatthen it it becomes um the the number ofstates the the system can be in umincreases quite rapidlyexecution um again like in thetraditional stack it was like staticexecution you had like different systemsdifferent services in the agentic stackwe are talking about LLMdriven executionum especially with things like toolcalling testing uh we had unit tests nowwe are talking about evaluuh for uh for testing the performance ofof your system uh root cause analysisstack traces were are pretty common uhin the case of agentic stack uh we aretalking about semantics and the contextthat is being passed to theLLM tooling is continuing to evolve inthe agentic stack it's becoming complexand finally the feedback uh in thetraditional stack the feedback was moreof an opt optional thing um as long asyou had like pretty solid unit testcoverage and integration tests uh it wasmore than sufficient to guarantee thethe performance while in the agenticstrack the feedback is becoming more andmore required both from the end users aswell as uh developers who are evaluatingthe performance of of their entireapplication so now that we know about uthe different problems that that we arefaced with u the role of observabilityassets is also extending when it comesto AI agents um beyond just debugging bylooking at traces uh more and moredevelopers are starting to use traces aslike a rich source of information fordoing various other things outside ofobservability for instance traces are agood source of information for buildingeval sets because uh once you have theapplication in production you the tracesare going to contain uh the inputs andoutputs to the models it's going tocontain the retrievalss it's going tocontain the attributes that you have setup the model with so all thisinformation is really useful to buildbuild an eval set and uh test it againstlike let's say you want to switch fromopen AI to anthropic um you can you canuse these traces to test it against anew model before trying to switch out tothat model so testing an eval is like abig use case for these traces finetuningis like another big use case um becausethese uh traces are representative ofproduction data uh after tracing yourapplication for a bit of time you canextract and uh you can extract theinputs and outputs out of uh thefinetune uh the the traces label themand uh use it for fine-tuning like alike an open source model and swap outthe model that you're currently using uhand and data labeling of course likewhich feeds back into fine-tuningprocess um the second category is UXoptimization so prompt engineering is iscontinuing to exist as a big theme anduh one of the things that uh developersare starting to do is they are uhbecause the prompts are being traced bythe traced in the spans um people wantto test with multiple prompts andunderstand how different prompts arebehaving out in the in the production umand and you can use these traces againfor doing AB testing uh regressiontesting and and whatnot and you can alsoderive user specific um uh memory andcontext like uh you you can attach userids to to the spans and uh aggregatetraces for a specific user and and feedthat back into the memory layer for thatuser for enriching the LLM withadditional context and and finallysecurity and compliance um things likeprompt injectionYs are becoming uh slowlystarting to become a thing and uhauditability of u the trace logs arealso becoming important because itcontains um uh like depending on the usecase it contains like very importantinformation the prompts and completionsso auditability of that um has been likea uh like a use case from from thesephrases so the observability for AIagents is kind of like the the role ofit is starting to emerge and and evolveuh into additional use cases as well sowhat should I observe then right so umwhen when it comes to um observing likean AI agentic stack uh we can put putthat right now into three broadcategories uh the first one being agentor LLM level tracing uh like I saidprompts and completions um definitelythat's that's one thing uh likedevelopers are finding value out of toolcalls model settings like thetemperature settings of a model the APIsettings of a model things like that thesecond category is storage and memorytracing the queries that are being uused for hitting the mod hitting thevector database or or the memory layersand the retrieved results along withtheir scores uh the the ranked resultsuh latency and failures and and finallyframework level tracing uh these are theorchestration frameworks like langchainllama index crew AI and and agno andothers um because frameworks are doingthe orchestration at the higher leveland under the hood multiple u agents ormultiple models could be called externaltools could be invoked uh the controlflow of orchestration uh control flow ofthe highle framework becomes veryimportant to trace to understand uh thebehavior of an agentic system and alsoagents are starting to pass messages toeach other uh using model contextprotocol and things and and other othertechniques so what is being passedbetween different agents and whetherthey are passing the right things and tounderstand in in a multi- aent systemunderstanding which agents need to betuned uh based on all the all the threecategories of traces are becoming uhmore important and finally um a singleunified trace per user request paintslike the big picture of how well youryour system is u is is is working at atany point in timeso the good news is uh we have the opentelemetry genai semantic conventions uhspecial interest group um and u we wehave made tremendous progress in thelast uh 14 months or so since theinception of this uh group uh so whatare some of the things that we have doneso far right so um first of all the theOtal uh genai semantic conventions groupis responsible for um the semanticconventions of the traces and the spansso essentially um coming up withstandards for modeling the the thedifferent attributes and what needs tobe what needs to get recorded as eventswhat needs to get recorded as attributesthe naming of these attributes andthings like that so that no matter whichwhichever observability vendor or clientyou use uh you have like a prettyseamless and standardexperience uh the the good thing isthere are multiple besides the officialopen telemetry libraries uh that arebeing worked upon there are multipleother uh open source projects that areopen telemetry based so there is a lotof optionality for developers to adopttoday and uh there are some earlyproposals for vector database semanticconventions and agentic observability aswell which is which is currently work inprogress so uh as as a group um we areencouraging framework developers and LLMvendors to also adopt these standards uhone one one uh good thing is likeOpenAI's recent AI agent SDK uh shipswith the standard semantic conventionsfor tracing uh with open telemetry andmore and more uh framework developersare starting to take notice and uh theyare either instrumenting it within theframework or we are instrumenting basedon the semanticconventions so you can you can scan theQR code at the bottom right to see therunning notes of the group and uh allthe relevant links are also publishedthere um I highly re encourage you allto join the um weekly calls if you haveany topics to talk about or if you haveany questions uh it's it's it's gottentogether as a big groZup compared to evenlike 6 months ago um so what are some ofthe semantic conventions right so theit's it's broadly categorized into threethree um groups the three categoriesfirst one being events then metrics andthe spans uh events are where we arecurrently capturing prompts andcompletions although there is like aactive uh discussion happening right nowas to where we need to actually capturethe events i'll come come to it in asecond um some of the considerations areprivacy size of uh the spans and uhperformance depending on the back end uhthat's uh back end in the database wherewe are storing these spans uh on themetric side like I mentioned uh tokensuh like input tokens and output tokensum cached tokens things like that forfor the sake of tracking cost and alsoperformance metrics such as time tofirst token in the case of uh um uh theinference as well as tokens per secondin the case of uh streaming responsesand finally um on the spans we arerecording all the request and responseattributes uh a quick shout out to allthe representatives from uh companieslike Microsoft Google um trace loopparise u a lot of uh representatives arestarting to come together as a group soif you are either dealing with genaiobservability or you're working on it umfeel free to join the uh weekly callssome of the hot topics right now at uhthe semantic conventions group aretracing prompts and completion uh notjust for text but also for multimodalinputs and outputs so uh do you likewhen when we are talking about videosand images and audio uh obviously thethe size of the span uh becomes largeand you don't want to store um a lot ofinformation in them but then like youstill want to trace uh the the inputsand outputs in the even in the case ofmultimodal multimodal uh scenariosbecause uh that's that's where you kindof like determine the performance ofyour models so uh we are starting tohave a discussion around how would youlike store audio or images or video likeuh one of the proposals is uh storingthem in a blob storage and attaching thereference to it to the spans in thosescenarios like search and and indexingbecomes like a like an issue so thingslike that and also where would you storethe attributes in general uh the promptsand completion attributes like do wewant to store them in in the attributesor do we want to store them in theevents or log events uh depending onvarious uh pros and cons u we are we arestarting to figure out what would be thebest experience uh for the uh for thedevelopers and secondly uh tracing enduser feedback and evaluations isbecoming another big topic because uhbecause evaluations is is essentEssentially how you you can baseline theperformance of your agentic applicationand uh the prompt and completion data isis stored inspans you you want the client to showsome sort of a way for the developers torun evaluations on top of it and uhtypically you you don't um uh you don'tyou don't modify the traced data u oryou don't you don't want to attach thescores of these evaluations back to thespans so how would you model thedatabase on the back end uh whe whetheryou would want to store user feedbacklike G given by the end users as thumbsup and thumbs down uh actions do youwant to like store them in the spanattributes uh what about the developerdone attributes developer doneevaluations like why would you want tostore that things like that so uh theseare some of the hot top hot topics thatwe have have had multiple discussions onand a lot of inputs from variousstakeholders across multiple companiesbut um the good news is we are startingto align on a bunch of these uhdifferent topics at themoment so uh if if I'm building anagentic application uh how can I set upobservability right so um there areobviously uh two components to it one isthe vendor comp the client component theobservability client and uh the otherone is the SDK the open open telemetrySDK on the client side we have uh abunch of different projects that arededicated for genai observability um thethe open source and self-hostable onesI've listed down over here and on theSDK side um besides the official OpenTelemetry SDK where we have support forOpenAI Vortex um we have support for AWSBedrock uh we also have some of theseother projects um uh like Open LamentaryLangrays uh Elastics EOT and things likethat um and and it doesn't matter whichSDK you use like most most of themadhere to the genai semantic conventionsso all of them are going to be uhgenerating spans uh in a in a verysimilar data format and the best part isum uh it's it's it's non-intrusive inthe sense that you can just install thelibrary and set it up uh you don't needto uh plump it in your codebase you justinstall it and import the open telemetrylibrary and uh it's it's almost close tozero code instrumentation uh where theimported uh LLM's client SDK or vectorDB's client SDK is patched at runtimeand spans are generated and exportedback uh to your uh to your observabilityvendor so a typical setup lookssomething like this you have the genaiapplications or multi- aent agenticsystem on the left side uh you installthe SDK on them and once you install itthey are going to start generating uhtraces uh that that that are going tocapture the um cap capture inputs andoutputs from the models capture queriesand retrievalss from vector DBs uhcapture the control flow from frameworksand they're going to batch it and youcan either set up like a collector ornot you can directly send it to a opentelemetry compatible observabilityclient um such as IBM in Stana orElastics APM or Data Dog or Graphfana orPrometheus um and uh it's just going toreceive just fine um so it's it's kindof uh very flexible setup you just youjust install the SDK and if you alreadyhave like a working observability setupuh you can just layer in observabilityfor JAI applications on top of itso uh what are again like I I justmentioned the typical setup but thereare multiple ways to instrument youragentic applications right so it can becategorized into three broad categoriesobviously the first one is installingthe official open telemetry uh librariesuh we have libraries for PythonTypeScript and other languages as wellalthough Python has the most coveragefor geni applications uh at the momentum so currently the official ones wehave openai bedrock and vert.xai uhbunch of and and the openi client SDKworks with like pretty much all othermodel providers so if you're consuminguh LLMs using open as client SDK you canjust use use the officialinstrumentation from uh open telemetryto trace that or you can install thethird party open source projects outthere uh which includes langrace openlmmetry or open lit um we also haveextensive support for vector DBs agenticframeworks and things like that eventhough the semantics have not yet beendefined for uh defined for some of thesecomponents uh it kind of serves as a wayfor us to test out and uh bring backsome of our learnings to the semanticconventions to define standardsaround oops we'll just finish it realquick um and finally we have u you canwrite your own instrumentation using uhtheconventions so this is the full stackobservability u market map at the momentuh like I mentioned it's very similar tothe first slide we have the frameworkswe have vector databases infrastructureum like GPU and CPU for serving themodels u application at the top andvector databases aswell finally some demo traces what itlooks like when when we are observing amulti- aent system so this one is like amulti- aent system built using Agno umit it traces the orchestration layer ittraces the LLM layer uh as you can seelike messages are being passed uhbetween uh different agents and finallythere is like a top level agent that uhresponds to responds to the final outputuh this one is like a typical uh verysimple simple implementation of uh aretrieval augmented uh generation uh soyou have like a framework at the top youhave embeddings um you have the vectordatabase which is the which is VB8 inthis case and then finally uh the openAI um but yeah that's uh that's prettymuch it um thank you for listening i'mhappy to take any questions that you mayhave[Applause]2025-04-15 22:02:11.618995\let'sgo back to the convention rename createa new definition and going back So thatis why if you might ask like why thingsare not stable yet This is one of thereason things just take time But we alalso have some that are really likestraightforward to make our life easierlike return rows It's just a number Yourun a query and you have thenumber Okay But what about the case youactually have one billion rows that arereturned and you're not really returningthe one billion rows you are returning afew and you kind like iterate on yourcodes and you keep doing like the nextnext and imagine that you stop at somepoint So what is the number that youwant to return Do you want to return the1 billion that's supposed to be returnedthat is on your buffer or you stop atwhatever the person stop on their nextiteration And if we decide that do weeven have an event for that So now youcan see how this take a year or years todefinethings Now let's talk about specific thedatabase semantic conventions So firstthings some good news here that I wantto share Just two days ago we mark it asa release candidate too for the versionwe just released So that means for thenext release that is coming out the nextat the end of the month we are markingthe database conventions as stable Soyay we got there Uh and how what we arethe things that are defined on thatconvention It's kind of like threegroups So the first one is the clientspans The second one are the metrics andthe third one I'm calling here likespecific because the mo even like wereally try to say something that isgoing to encapsulate everybody We stillhave a few cases that are has to havesome exceptions or maybe some particulardatabases have some attributes that areonly for that database So you want tomake sure that they're also added hereAnd here we also have some temple simplethings like for example we have a lot ofdatabases that are compatible withPostgress So we want to tell like justuse the name Postgress for the ones thatare compatible You don't need to createone for eachtype But here is just not thisdefinition of name We have a lot ofthings here as well So here we have todefine how we collect this metrics theunits and if we are doing things likesanitization how you do the sanitizationIf you're going to create a querysummary what you have to be this how yougroup things So we don't want to makeany questions for people that areinstrumenting They know what to expectwhen they come here So this why alsobecomes a very long documents Uh so nowlet's take a look at what we have todaySo for this span which is part of therelease candidates we have name statussome common attributes explaining how todo the sanitization and also how togenerate a query summary If you'recurious this is the list of all theattributes that we have We have 15 todayUh it's mostly about like the the thename of things the operation and some ofthem you're not going to set up on allcases For example error type you're onlysetting if there's actually an error onthe executionThen we have the matrix which iscurrently on a mixed state because wehave for the database operation it is arelease candidates and for the databaseresponse and connection pools we arecurrently developing those two are morenew that we are working on and if you'recurious this is also the list of theattributes for the databaseoperation Now let's go over an examplehere So for this example I'm using anapplication that is the front end isusing react the back end is in nodes Sothe back end is the part I'm going toinstrument today to for this example andthe database that I'm using here isPostgress So my front end is a verysimple application It's just listingusers You have the option to refreshremove and add new user uh very simplehere for the back end I have mostly twofiles so the first one is making thecalls to the database so looking at thefirst here let's zoom in is just makingthe call to your database and I changehere I do not use admin admin very safebut yeah I'm using local host so we'regood here uh and here is just thefunction to connect to my database andthen I ha]ve my other function that I'mdoing something which is just a get useruser which is my select and then we havethe add user that is just doing theinsert and then the remove user that isjust the delete So here for my examplethis is where my application kind ofstops here and then I have my also myroutes my API to the front end canconnect So I have here my get so alsozooming in this is just making the callfor my that function that I showedbefore So my front end can call thisendpoint and I have my two posts that isone for adding the user and one forremoving the user So if you have yourapplication this is where yourapplication would be You don't have tolike add any of this This is whateveryou have today So now the next point isactually adding the instrumentation tothis So how do you do this So for mynode example you're going to installsome dependencies So here I'm going touse auto instrumentation because italready initialized a bunch of thingsfor me So makes my life easier Uh andthe last one that we have here isspecific for the Postgress because thatis the database that I'm using for thisexample Now I'm going to go to uh createa new file that is just myinstrumentation Uh I have bunch ofimports but I can zoom in on the partthat you actually initialize the SDK Uhfor this example here I'm actuallyexporting to a a backend uhobservability back end here But if youwant to just test it out you canactually use just like console exporterthat would just like print for you onyour terminal So you can check it outwhat is sending and to initialize as Imentioned I'm using the instrumentationand my Postgresinstrumentation The next thing that I'mgoing to do is just set up someenvironment variables So because what Idid so far is I'm collecting thosemetrics those spans but I need to sendthem somewhere So I just need to saylike what is the protocol that I'm usingWhat is my endpoint So here I'm sendingto my Graphana instance I'm putting mytoken to connect to it And somethingthat we really recommend using issetting up the service name So this isgoing to make your life easier later ifyou want to check it out like filter outby specific service You have a lot ofthings running so you can separatebetween them And then I'm going toinitialize my uh application thatpreviously I was just doing like nodeindex Now I just add this required andthe new file that I created Now I goback to my UI click all over add a fewmore awesome women and science to mylist I can remove a few and then I'mgoing to go back to my observabilityvendor to check where this data iscoming from So here I filter out by thatservice name and I'm checking one of myget users and it's going to have a listhere of the traces So here I'm going tohave like the trace ID like the starttime my service and then if I click onone of those traces you're going to seeall the spans that were part of thattrace and the one that I want to callout attention is those those two lastones which is my my information that Igot from the database So I have here howlong it took to do the connect itselfwith creating the pool and I have for mystatement here for this case I'm lookingat the select on my database and then ifI click on any of those pens I can seetheir attributes so here I'm going tohave like the connection string thedatabase name and so on and you can seelike that list from 15 doesn't show allof them might some things might not makesense for all of them so a lot of themare recommended not requiredThen I'm jumping a lot more So the nextone that I want to show is the metricsSo here I'm showing the for theoperation duration H and the cool thingis that okay I'm seeing here how longall of my calls to the database weretaking But because I have thoseattributes I can do things like breakingdown And this is what this one is doingSo I'm breaking down here by the type ofoperation So I have my select my insertsand my deletes And for here I can seehow long different types have taken Andyou can notice here that my insert andmy delete are taking longer than myselects which on my case makes sensebecause I have a very small ^table thatI'm just doing like a simple select andmy inserts and delete are actuallymaking changes on the table but thatmight not be the case for a lot ofapplications And what are the other typeof things that you can do with thisinformation here I'm going to giveexample just getting based on theduration itself You can test it out newtypes of indexes Imagine you create newindex for your table and you want tomake sure that it was efficient or notSo now you can compare before yourchange it and after your change The samething you might be testing differenttype of database which one makes moresense for your application Then you cancompare even like with the price thatthey're giving you compare with theperformance again So you can have oneservice name for each database and thencompare Uh if you want to use ORMs whichused to give me a lot of headaches inthe past because they create some reallymonstrosity queries that you have todeal with them doing sanitization forthose are awesome But then if you reallywant to use you can also compare withthem and even like you're for yourselflike you're creating queries and want tomake sure that the ones that you arecreating are good ones and then you cancompare making difference So just withlike the example for the duration theseare all the things that you can compareImagine like more attributes you addmore breakdown you can do So it's justlike happiness from now on and moreobservability you're going to have onyour system more things you can do ontop of that and potentiallyuh improve like any price that you'reusing today of your costs because youare improving the performance of yourentire applicationUh one thing that I want to to make itclear I know some people have concernslike wait are you actually seeing thedata on my database all my privacy Sothe instrumentation is not looking atthe data of your database it's lookingat the actions that is doing So forexample it's looking at the API thatpostgress uses So how long it takes howmany like rows got returned and thingslike that is not seeing your databaselike your data on your database So thatcontains private but please make surethat I know that people like to addextra attributes to things Don't beyourself adding like attribute passwordone two three like please don't do thatMake sure that your things that you'readding attributes make sense to be therebut from the instrumentation side we arenot looking any of your dataitself Uh now you like okay cool I wantto use is this ready to use So for thespan the majority of things are alreadylike in place uh so you can start usingtoday For the matrix itself they arecurrently under development but so wehave today for the operation thatduration that I mentioned for the javawe had I think for all of the driversare already there for the javascript wehave for posgress and forn net we havefor the sqlclients So what is nextSo people from the like the semanticcommission group is working on as Imentioned like stabilized so hopefullyby the end of the month it would bestable and then we're going to continueworking on instrumenting on all of otherlanguages all database because as youcan imagine there's a lot of thingsstill out there that we want to add Sowe're going to continue working on thatand how can you help So start using theexisting implementations Give usfeedback if they the things are thereare the things that youwanted and even feedback for thesemantic convention itself because sinceit exists a lot of databases we mightnot know some edge cases for yourparticular database that would might notwork for this scenario So we were happyto hear and also if you want to be acontributor and help us out with all thelanguages and databases that are outthere we are also happy to help and evenme as my other hats of the contributorexperience I can also help with that ifthere's something that you are havingtrouble with and here just putting someresource if you're curious about thesemantic convention itself the first oneis the link for that if you want to seethe the big list and the second one isjust the sample code for thisapplication just keep in as a verysimple application that I'm just puttinginstrumentation on and I'm going toleave it on to the rest So don't no rushand yeah thanks for thanks for listeningthat's my talk fortoday and yeah I think we have aboutfive minutes for questions if anyonehavequestions otherwise you have some timeback CoolAll good Uh yeah Thank you everyone Ohwe have one OhOh I think you're supposed to go to themic there because then other people canOkayWhat do you meanMhmYeah So this is like more focused onlike the instrumentation itself is goingto collect some of those basic thingsjust for the others to know the contextthat I'm talking about here uh he's alsoasking about like anything like afteryou do the instrumentation and concernof like injections and things like thatSo that would be more on the peopleinstrumented if they want to addsomething like the SDK itself is justcollecting those things that I mentionedbut and then you are sending you cansend to your collector to do extrathings that you want there or you cansend directly to your like observabilityvendor to see the data there butanything extra that you want to add isnot the instrumentation like alreadylike created by us that would be doingyou could be adding anything that youwant on top of that thereCool Yeah YeahYeah So the question was about hidingattributes So there are like two thingshere If you're talking about like forexample the attributes of the span Ithink then you would not send directlyfor example to your backend vendor youwould send to a collector and on thecollector itself you could do thingswith your span if you are talking aboutlike the attributes of the query itselfSo that is the part that we talk aboutsanitization So we want to make surethat no actual data is being sent at anypoints So that is one of the concerns Soif we have for example the JavaScript Ican give an example you have two ways ofexecuting a query You can just put thequery with the arguments on it or youcan pass it on like the query and thenthe argument separate If you do that welike remove that list and just use thatas a query If you are sending everythingthere we don't send at all because weare concerned that might have likeprivacy data there H one thing that weare working on is we're going to createlike some config options that then youcan opt in and say no I really want tosend the data I know the risks and I'mnot sending anything that I really careabout the privacy like on this So youcan send the query as is and when I wantto makethe dataNo So by default it's going to sanitizefor you like it's always going to try tohide That is the default instrumentationnot send confidential information and ifyou want to send then you have to opt into send but it's not we're not at thepoint of the opt-in yet At this pointwe're just like nope not sending YeahCool One more Well since you're next tothe mic m using Iso my question revolves around uh likedifferent kind of a metric for exampleuh injection metrics such as uh let'ssay LSN lags replication lags uh is itpossible to expose specifically forpostgress in but in MySQL it will bebinary logs and things like that Sowould it be possible down the linebecause that is also sometimes utilizedin custom implementation where you'restreaming data from one database toanother or having a master slave copykind of a thing M yeah I see it to me Sofor now we focus more like for examplewhat is happening inside the databasenot like after leave your database goingto another place Uh I guess depending onwhat you're using maybe another semanticconversion might be the right thing foryou because we have like semantic forlike HTTP calls and RPC we are startingwith the RPC now So maybe like thisconnection between things is somethingthen you would be able to add like thedelays and lags between those two thingsIf we're interested on the RPC we'reabout to start that when we finish thedatabase We're going to start on the RPCone So that is something we are happy tohave feedbackon Good Okay Cool Yeah Thank youeveryone Thanks for joining2025-04-15 22:02:12.119669 ��L�m#��OARf9NceXXRuwh so yeah guess we can start now Uh sothank you uh everyone for joining See alot of people here So first I need toask some questions Who is Brazilianhere Uh so yeah today I'm going to talkabout how you can enhance databaseobservability with open telemetry So tostart let me introduce myself My name isMaril Jes I am a staff software engineerat Graphfana Labs focusing on opentelemetry and within open telemetry Iwork in a couple of different groups Iam a maintainer for the contributorexperience So if you want to become acontributor to open telemetry and youhave any challenges and you have somefeedback I'm happy to hear and improveyour experience Uh I'm also approver forthe JavaScript SDK the database semanticconventions which is something I'm goingto talk about today and also for thePortugueselocalization So to start talking aboutthe database observability I want togive you a little context of what thisinformation that we are collecting wherethis is coming from So now you know ifyou want to give your opinion as wellmake changes you know exactly whichpoint do you want to start So everythinghere starts with semantic conventions Sowhat is exactly semantic conventionsThis is like the long definitions thatwe have on our documentation But what itbasically is saying is that this is theplace that we're going to define thingslike what are the span names what arethe metric names and why this isimportant I can give two reasons So thefirst one is because we define what isthe same thing for everybody Now we havethe same baseline So now everybody thatwe talk about spans we know everybodyknows what should be a span and so onAnd the other case is imagine itsomebody say like oh I want toinstrument for databases and then theygo to the JavaScript SDK and then createa metric call it statement duration Thensomeone on the Java SDK say oh I want tocreate something too and they call itquery time So now you have two differentnames to mean the same thing And nowwhen somebody want to use those SDKsthey have to learn the two differentnames for the things that are supposedto be the same And then when you aresending this to your observabilityvendor what do you do is now you havetwo different names that you cannoteasily combine You can still combine butit's not so straightforward So now whenwe define is a little more easy thaneverybody's on the same page So how thecreation of theatic convention works Sothe first step is decide what shouldexist So people say we should have aconvention for databases and everybodylike yeah cool agree I think that stepis very straightforward and easy Peopleagree very easily And then the next stepis something that engineers are verywell known to do very fast which isdeciding names We're very fast on doingthat right And just definition whichsounds a little more like this So why itlooks like this is because okay let'suse the example of the database So wewant to define things that what are myattributes for my statement Sorry notstatement I meant query No not queryoperation Operation yes going to useattributes for my operation And I one ofthem for sure should be table right YeahBut not all databases have table or haveschemas What do you do with the no SQLSo you want to create something thatworks for all databases And not evengoing to talk about vector databases orgraph databases Then it just became alot more things So if you think about itwhen we create the convention we want tomake sure that works for all databasesand all the languages So that takes timebecause you might think like oh this isgoing to work let's create a prototypeand then like oh wait on JavaScript thisevent does not exist at all Okay [anal thing isflexibility This is actually a benefitto manual instrumentation You becauseyou're making the change you get todecide how it works Um for ebpf it'stooling dependent So it can be flexiblebut it also because it's automatic couldbe inflexible And so that's somethingwe're going to dig into today with asyou decide on which EVPF tools touse determining how flexible a giventool is is very important to make sureit works for your usecase So just because you have thecoverage that you need doesn't mean thatyou're going to automatically havedebugable systemsUm auto telemetry gives you access tomore data because you're not opting intothings on a case- by case basis And soobservability data which is notoriouslyhigh volume low value often becomes alittle bit more painful because you nowall of a sudden have even more databecause it's this high volume datasource And so you have to make sure thatyou don't obscure important systemsignals as you have access to moretelemetry And so with that in mindum as I was putting together this talkBill Mulligan who is a community managerfor the Psyllium umbrella projects uhmade a post about how you make EVPFimpactful And so like we just mentionedeBPF can be a fire hose The real valuelies in the aggregation analysis andactions from the data And so users don'twant just a wealth of information Theywant insights They want to know whatthey have to act and do And so impactfulebpf tools aren't data collection justtools They actually roll up and provideyou the important signals that you needAnd so as we go through this talk we'reactually going to look at this anddecide how you evaluate eBPF tools tomake sure that they can provide theseinsights foryou Now if you're unless you're an EBPFdeveloper I recommend that you stick tothese higher level projects that providethese for you Um and so these are thethese three properties that we talkabout are going to be the way that youevaluate these projects So in order totame this fire hose of data an ebpf toolneeds these three things The first isdataenrichment Ebpf is an extremelylow-level data source An example of thisis it often traces network traffic youknow which is like bites on the wire butthat's not helpful for debuggingmicroser environments You actually needenvironment context to debug yoursystems And so these tools need toenrich your data with Kubernetesprimitives with container primitives inorder to make ituseful The second is that because autotelemetry is done on your behalf youneed to have more flexibility built inAnd the first part of that is it havingstructured APIsyou need to be able to get the data outof the system and also be able to do anyfiltering or other things that you needto do Oftent times you're going to wantto use this data for your own likeyou're going to have a different view orscope that you want to look at andhaving these structured APIs makes thatpossibleAnd the final property that we'll talkabout isprogrammability And I summarize this asdomainspecific extensibility andprocessing If you're using a you knowebpf is applicable in security contextin observability context and you needthe ebpf tool to not just like filterthe data but you actually want to beable to like apply observability focuson it or security focus And I think thiswill become more concrete once we lookat someexamples So yeah um we're going tobasically walk through four differentprojects and view them in light of thosethree properties that we talkedabout So the first is inspectorgadget And inspector gadget is anobservability tool that packages ebpfprogram into these concrete chunkscalled gadgets and they basically managethe deployment and execution and so youcan run them across your cluster Um andthere's a variety of out-of-the-boxgadgets that you can use so you can viewnetwork traffic trace you know processesum all of that stuff And it really is afull observability framework It'scontainer and Kubernetes aware and itprovides rich filtering collection andthe ability to export it to othersystemsNext we'll move on to Pulsar which thisis the only security project out oftbhese four But what Pulsar is is aruntime security tool And what itprovides is it taps into these filesystem networking and SIS call eventsthrough EVPF and it routes them intothis event bus that it has and youdefine these rules that the rule enginewill then evaluate and it'll decide if agiven set of events is something thatyou deemed a threat and it'll notify youifneeded Uh next we'll move on to Pixiewhich as I mentioned is the project Iwork on Um Pixie provides protocoltracing which you can see from this HTTPservice map Um you can get a very nicehighle view of how all yourmicroservices are connected It alsoprovides resource utilization So youknow you can get like a per servicedashboard that shows you CPU requestlatency memory use and we also do uh CPUflame graphs so you can debug where yourperformance issuesare And finally we have Hubble Hubble isthe observability part of the psylliumproject and it provides you with servicemaps so you can see how all yourservices are connected and also givesyou deeper inspection into layer 4 andlayer 7 uh events An example of that isthis uh Hubble observe command which isasking for the last minute of layer 7events and it's filtering down foranything that has a DNS R code equal tothree which for those who aren'tfamiliar is NX domain or the signal thatyour uh domain name does not exist andso this is indicating that this StarWars service is not able to connect towhatever its dependencies are and needsmore investigationAll right So now that we've gotten abrief introduction to all of theseprojects we are now going to go back toour uh properties and point out howthese projects accomplish theseimportant uh primitives So first of thedata enrichment We're going to look atinspector gadget and pixie on the lefthhand side here um we are looking at thetrace DNS gadget which shows you all ofyour DNS traffic Now it's worthmentioning I I mentioned unless you'rereally interested in peeking behind thecurtain or really you know tuning andwriting these EVPF tools you don't haveto go this deep um to use these butwe're going to find out what they'relike special sauces And so for inspectorgadget on lines five and six this definestatement and the includestatement those are the two things thatinspector gadget provides that thenautomatically enriches the ebpf datawith container information And so it'svery easy for this to just automaticallytake this low-level kernel informationand populate it with that richapplication level context Moving on tothe Pixie side um we're also looking atsomething that can view all of the DNStraffic in the cluster Um Pixies hasthis Python-like language and so youknow we're querying all the DNS eventsand then on lines seven and 8 we are umbasically accessing the pod associatedwith each event And then we're alsodoing an IP to podname lookup on theremote address sideAnd then just to get a full picture ofwhat this looks like if we look towardsthe bottom of the slide we have anexample of what the output of the traceDNS gadget is we can see there's theKubernetes namespace and Kubernetes podname And uh similarly for the pixiecommand we also see that there is thepod and then in the case of the sourcecolumn uh it happens to be local hostwhich doesn't map to a pod name but allof that rich enrichment is thereMoving on to the next property isstructuredAPIs And so uh looking into Hubble alittle bit more Hubble contains thiscomponent called Hubble relay whichprovides access to its network flow dataand exposes it over a gRPCAPI The Hubble UI and Hubble CLIactually are built on these componentsSo they basically dog food thisinterface but it's also exposed if youneed to you know power some like someother custom integration Maybe you knowthe Hubble UI you don't want a servicemap but you want something different andthe Hubble CLI isn't the right interfaceYou can build on top of this on like avery high level interface um easilyIt's also worth mentioning that eventhough um I'm only you know highlightingtwo projects for each of theseproperties um for data enrichment andstructured APIs all of the four prcojectswe are talking about today all fulfillthat uhinterface and then uh moving on to PixieSo um Pixie's structured API is itspixel language It's what we saw on theother slide that Python-likerepresentation And so um basically yougetPython/Pandanda's processing for yourobservability And so you get this dataflowprogramming And similarly to the Hubblecase Pixie's UI is built entirely onPixel And so you can actually changeyour pixel code and that will change thevisualizations within Pixie Andsimilarly the CLI also uses Pixel inorder to do its uh data access And thenif you want to build any thirdpartyintegrations we expose client librariesand pixel is the input to those as wellSo you know these these three thingsfeed the Pixie API the pixel script Itdoes the processing and data filteringand back out you get your metrics spansprofiles and eventsAll right And on to the final principleprogrammability And what I want tomention is is programmability Um ofthese four projects Pulsar and Pixie aresort of unique in that um this is sortof like a a much harder thing to fulfillAnd um Pulsar being a runtime securityproject it has this rich interface todefine what security policies you'reinterested in We can see from this uhsnippet of code down below we have tworules Um one of them is looking forsensitive file access And so um anyprocess other than SSHD if it isaccessing the Etsy shadow file that isgoing to be flagged as a threat becauseit's unexpected and you need tounderstand better why that's happeningAnd similarlyum if you have in this case TN tnet orusing netcat is uh considered suspiciousactivity you know you probably don'twant people TN net is an insecureprotocol and for netcat you don't wantto be kind of exposing any uh you knowlistening sockets and so these two ruleshere would allow you to take you knowthe security uh principles and rulesthat you want to enforce and distillthem into this framework And so theprogrammability is really about we're inthis security domain where we need tohave this rich filtering and triggeringof events and pulsar basically maps thatuh very well so that you as the end userdon't need to take on thatcomplexity Um moving on to Pixie So umthis pixel script should look familiarThe additional lines that we're lookingat here are lines 10 through 12 And soum in this case we have this we'rebasically detecting if the destinationpod failed to look up like let's say theI the IP was an internet address youknow there's not going to be a pod nameassociated with that and so Pixie hasthis NS lookup function which allows usto turn an IP address into a domain nameand so what this allows uh us to do isif we look at the bottom of this slidethere's uh basically the uh tablerepresentation of this output and so inthis case we had a curl command runningin this bash pod making requests to thestripe API andso the destination pod column or theremote address was this internet IPaddress which we didn't know anythingabout but the NS lookup command allowedus to turn that into the actual hostname And so very quickly we're able tosee that this is making an outboundconnection and exactly what it'saccessing And so similarly to how Pulsarmodels the security domain really wellwhen you're debugging systems and tryingto understand your systems behavior froman observability standpoint this type ofprogrammatic functionality is extremelypowerfulAnd so like I said before thisprogrammability is a really key umextension to make these ebpf toolsreally really powerfulAnd with that we are going touh watch ademo and this is going to give us just alittle bit more detail into um the threepillars and and specifically howum Pixie showcases themNow that we've learned about the threeproperties that make EVPF tools reallypowerful let's look at a more in-depthhands-on example with Pixie with somenew functionality that's coming soon Soas we talked about before dataenrichment is very important and Pixiehas relevant Kubernetes metadata We cansee from the uh service map here thatall of these refer to uh Kubernetesservice and deployment names And so thedata is very rich to begin with As wementioned pixel is the basically the APIAnd so all of this visualization isdefined by this Pythoncode Now moving on to that final uhprogrammability uh principle we're goingto look at a case where Pixieum is going to ingest a new data typeSo currently Pixie supports uh BPF traceIf you're not familiar it's a commandline BP BPF tool but it's only reallyscope is basically uh a single instanceAnd so what this script does is itbasically checks to see if the operatingsystem or Kubernetes has issued anykills And as you can see here we do havean application that's oming I deployedsomething to this cluster ahead of timethat is constantly allocating too muchmemory and so we're constantlygenerating um these events Now debuggingum kills you know you have to have a lotof other system context there And sowouldn't it be great if not only inaddition to these events what if wecould grab the SIS log output from thelast five minutes and basically batchthese EVPF events with the SIS log dataWell that's what we're just about to doSo I have created a new pixelscript that essentially takes these twodata sources and combines them togetherSo that um kill BPF trace script that wealready saw that's basically we'reembedding the same code from that otherscript The new piece here is that Pixienow has the ability to basically taillog files and turn them into our datafframe representations And so what thisscript is going to do is essentiallywhen there's ankill match it's going tomerge these two data frames together Andso we're going to get the timestamp thatthe you know kill happened the processID that was killed the command itselfand also the SIS log lines that were uhwith it And then we're going to sendthis off to hotel exportSo uh in addition to that there is thisnew visualization that allows us to viewkind of Pixie's processing of this datacalled pipeline flow And so we what wecan see here is Pixie is currentlyingesting HTTP events and data from SISlog and also these kill tracers andthey're flowing through the datacollectors through Pixie's aggregatornode and ultimately out the hotel exportuh and also um some visualization likekind of from the UI but so what you cansee here is that we are seeing flowingfrom this single PM whereas the otherone there is no data flowing and that'sbecause this OM crasher pod is onlyrunning on one instance and so righthere at a very high level we can seebasically Pixie's programmability inaction the EVPF data the data from thefiles and also the oomkill data from BPFtrace flowing through the system and outthrough the hotel exportAll right So there we kind of see uhPixie Onesecond There we see how Pixie kind ofcombines all three of these EVPFproperties to make it an extremelyflexible tool And this is how we go fromauto instrumentation to observabilitypipelines um by Pixie being able to nowprocess files It um you know and alsohaving that um edge or agent toaggregator to export architecture Itputs it in line with um tools likefluent bit and vector where you canactually do processing at the agentlevel processing at the aggregator leveland ultimately to your sync And uh thatallows us to do this correlation betweenthe oom kills from ebpf and attachingthe rout the other system behavior fromsyslog And so to wrap things up um thehighle points are automaticinstrumentation is easy You want yourebpf tools to provide insight and havereally great tooling And that's what'sgoing to give you value in uh being ableto solve your debugging and other systemproblemsUm I want to mention again that unlessyou're an eBPF developer you probablywant to reach for these tools becausethat will be the fastest way for you toget to these insights And then also Ijust want to reiterate that um just asyou evaluate EVPF tools you want to keepthese three properties in mind to makesure you get the insights you need thedata enrichment structured APIs andprogrammability And with that I want tothank you all for attending the talk andI'm happy to stick around and answer anyquestions that you have[Applause]2025-04-15 22:02:12.627949 ))��>�n#��3AI236sjooftwhelloCubeCon Hope you all are having a greatstart to your week Um my name is DomDonano I'm CEO and founder of Cosmic andalso a CNCF Pixie core maintainer AndI'm excited to talk to you aboutexpanding EBPF'sreach So before we jump into things Iwanted to give myself a briefintroduction Um as I mentioned I work onthe CNCF Pixie project I got startedback with the project three and a halfyears ago First as an enduser and thenturned maintainer I also spent a bit oftime at CrowdStrike working on theirebpf Linux sensor So I've been workingin the ebpf space the last fewyears So before we get into the ebpfstuff I wanted to talk about wherethings came from So before the cloud andcontainers we had these monolithicarchitecturesThey were simplistic It was easy todebug issues You know these threepillars of observability weren't reallya thing at this point because inspectingand finding bugs were easy They were inthe big monolithicapplication And so it wasn't untilmicroservices and containers andKubernetes came along that observabilitybecame a really important uh operationaltoolAnd so you can see on this picture herethat um these are what our environmentslook like today We have services writtenin different programming languages Wehave SAS services You know we depend onall these third party APIs that we haveto talk to over the internet And alsothere's a variety of data stores usedAnd so observing these things becomesvery important when there's so manydifferent logical piecesand monitoring and observability startedoff with manual instrumentation So backwhen there was the monolith or lessprogramming languages it was easy toinstrument each piece But as the numberof combinations grew it became a lotmore effort and work to support Go tosupport Java to support all thesedifferent flavors of things And so thismanual instrumentation started to becomevery expensive in these polygotenvironmentsThis is where ebpf comes into thepicture So looking uh we're going tokind of compare how you wouldtraditionally do monitoringobservability against how autottelemetry works And so uh there's sortof three verticals to think aboutThere's coverage effort andflexibility So for the manualinstrumentation the coverage tends to besparseum for thirdparty applications and SASit's often infeasible you know unlessyou can change the code you can'tinstrument it or for SAS tools you knowyou're talking to it over the internetand so you have no control there unlessthey provide aninterface it also tends to beinconsistent as you have moreprogramming languages and tools you havetosupport comparing that to the ebpf casethe coverage tends to be very broaduh because ebpf hooks in at the kernelYou have systemwide view and actuallythis even allows you to introspectthings like using SAS services becauseeBPF instrumentation can hook into TLSlibraries and so you can see the plaintext of encrypted connections So you getthis language and framework agnosticview into yoursystem Moving on to effort as wementioned the manual instrumentation isexpensive If you have x number oflanguages you also have to instrumentthat x number oftimes With ebpf the instrumentation costgoes down dramatically because hookinginto the kernel you instrument it onceand it's alreadyavailable And the fi`fnlineclasses for new stargazers who want toget into thefield now as a store owner what exactlydo you want you want to ensure that yourproduct has good reliability umavailability and performance for that weset up some service level agreements andwe instrument some service levelindicators to track these agreements wehave uh SLI like request latency QPSfailure rate we also track somelow-level infrastructure metrics likememory bit rate etc and then we're doneand we roll it out for our usersso our service is growing users arehappy and our tutorials especially are abig hit so to meet this high demand westart adding more to keep up with theinterest and expand ourofferings now all is good but somecustomers start reporting delays higherlatencies and even timeouts whilethey're loading the tutorial catalog atthis point we don't really have anyalerts or anomaly detection set up likeafter all we're still expanding we'restill in the initial phase but soon awave of one-star reviews just comes ourway customers are complaining aboutslowdowns timeouts and just a generalpoor user experience and it's startingtohurt so we start investigating and findout that the request latency hasactually gone up after a few franticsleepless nights of root cause huntingwe find out what the actual issue is ourthumbnails are too large and as we kepton adding more and more courses the pageload time grew significantly leading toslowdowns and timeouts we are fixing itbut by this time the damage has alreadybeen done customers have already had apoor user experience we have lost theuser loyalty there's also the case ofinternal burnout where our engineershave spent you know the sleepless nightstrying to find the root cause andfinally uh they in the meantime most ofthe users just went to a competitorleading to a loss ofrevenue so where did we gowrong the common assumption in buildingservices that anomaly detection is onlyuseful for mature and production readyservices on the surface it kind of makessense like why would you botherdetecting anomalies when there is wheneverything is still being duct tapedtogether early stage services arechaotic they keep on changing and theythey're frankly unpredictable so addinganomaly detection sometimes can feelpointless or even frustrating butironically that's exactly when you needit themost there is a second assumption aswell that anomaly detection is complexit's often seen as something that cantake away engineering cycles and timefrom things that actually matter likedelivering features but that's also nolonger true with tools like opentelemetry Prometheus and Kubernetesnative ML frameworks you don't reallyhave to go all in from day one you canstart small and use lightweight modelsand then grow as the systemexpands so if we had actuallyimplemented anomaly detection as part ofour development framework in day onethings would have been a littledifferent imagine if our CI environmentran a nightly job which does buildoverbuild analysis with anomalydetection now this job flags a spike inlatency nothing is really broken butsomething is clearly off now in ouractual use case this build went vendedto production and were related to a baduser experience but with anomalydetection in CI we would have caughtthis which means there would have beenno uh incident and no onestarreviews so we've talked about day oneanomaly detection and how it can behelpful but how exactly do you add itfor your nent services so any typicalmachine learning life cycle generallyhas four important steps first is datainstrumentationthen feature engineering then actualmodeling and finally deployment andinference so let's dive into this andbuild our day one and nomdetection so the first is datainstrumentation so in our case it'smetrics so we have learned our lessonlet's go back and set up an detection onevery metric that we can think of i meanwhat's the harm but more metrics canoften lead to more chaos it can lead tofalse positives alert fatigue and justunnecessary operational overhead at thispoint you're not detecting anomaliesyou're just drowning in them so the keyis to gnot just use metrics but usemeaningfulmetrics so we have selected ourmeaningful metrics now let's look at thesecond point that's featureengineering so instead of relying onjust raw metrics alone you can alsoenrich your data by building andengineering your features for examplerolling averages change point driftindicators or domain specific thresholdsthese features can give your model a lotof context and it can enrich your inputfor example you can also find slowmovingtrends which would have been a littledifficult to identify with a simpleglance from your uh rawmetrics so now this will help you movefrom metric math to actual insightsso we've got our metrics we've got ourengineered features and now it's timefor the fun part that's modeling nowit's time to throw all of these featuresinto the biggest newesttransformer-based model that you canfind until you realize that you'reactually just still talking about dayone anomaly detection which means yourmodel can't actually make accuratepredictions if you don't have if youhave limited or no historical data sothat means you have no baseline you haveno temporal patterns you lack thedistributional context of what looksnormal in your data and you don't haveany labelled anomalies because nothingis really broken yet and this is thecold startproblem now if you want to train ourmodels we do have to address thisproblem so the good news is that thereare simple ways and heristics to addcontext to your models and you can stillget useful signals even before the datahas settled so let's solve this coldstart problem and the first way to do itis to startsimple we can use simple statisticalanomaly detection models which don'tneed deep learning pipelines or largetraining jobs and they are just simpleproven explainable math and they can beused at day oneso take zcore for example it's great atcatching sudden spikes or diffs fornormally distributed data like requestlatency these models don't need labelledanomalies or months of historical datayou can actually start using them fromday one they offer many advantages likeyou can get real-time detection withminimal configurations changes you havelow operational cost compared to deeplearning models and you can interpretthem really easily because they are justmathematical formulas and that's whymany teams use them as a first line ofdefense even if they are planning onadding more complex machine learningmodelslater now the second point is usingprior knowledge so even before you haveany fancy models your domain expertiseis your biggestsuperpower domain knowledges domainknowledge gives your statistical modelsu the context they need so with this youcan uh actually identify patterns evenbefore they emerge you can encode whatyou already know like what does a normalrequest latency look like or what is acritical CPU threshold you can smoothyour noisy metrics you can tune yourmodels based on parameters that youalready assumed and you can leverage thedistribution properties of the data anda lot more this is also part of featureengineeringand this next is using syntheticdata with synthetic data you canactually simulate real world behaviorand stress test your service and itsdependencies so you can add syntheticdata like adding metric spikes uhsimulating resource exhaustion addingerror injection and all of this canreally benefit you by improving yourmodel feedback you can bootstrap yourcold start problem by adding labeledanomalies from your synthetic data youcan validate your signals and identifywhether the models and features thatyou've chosen are actually valuable toyou and you can do all of this in a safeand reliable environment withoutactually affecting yourusers so we have talked a lot about dayone anom detection and how it's usefullet's see how you can actually implementitokay so let's say there's a metric thatyou want to track which is a goodindicator for your servicehealth this is what your metric lookslike now yeah this is what your metriclooks like and now you want to set upanomaly detection on top of it uh justbecause you want to be alerted whensomething gohes wrong with the service sosince we are starting simple let'sleverage a zcore anomaly detection modelzcore uses just your running mean andstandard deviation to identify anomaliesand since we have fortunately usingPrometheus PromQL gives you simplefunctions that you can use to calculatethis running mean and standarddeviation let's use five minutesthis is your running mean for yourservice for your metric and similarlyyou can also calculate your standarddeviation overtime there you go this is your runningmean and standard deviation of themetric now the zcore formula is nothingbut your data minus your average timeminus your running mean divided by yourrunning standard deviation this isreally simple and easy to use and thisgives you your zcore valuenow zcore what it actually represents ishow many standard deviations your datapoint is away from the mean so generallyyou can say two to three standarddeviation is something you can consideranomalous but this depends on yourdomain expertise and what type of metricyou're dealing with but for our use caselet's say we want to be alerted onanything that's beyond two standarddeviationsso these are the time periods where ourdata was actually anomalous and this issomething you might want to know fromyour metric to identify what's going onwith the service now you can't keep onmonitoring this Prometheus UI so we canset up like asimplerule which looks like this where youidentify yourexpression your condition that you wantto be alerted on anything that's greaterthan two standard deviations and howlong you want this anomaly to persistbefore you're actuallyalerted once you havethis you can go to youralerts and see that your rule has beenset up then you can sync with your alertmanager and be alerted on any modalityyou want to uh identify if your metrichas some deviations from the normalbehavior so congratulations you've justset up zcore normal detection on yourmetric on day one and I understand thisis really a really basic demo and mostof the people here probably know thisbut the idea is that it is so basic youcould set it up in a couple of minuteswith very minimal data and on day one inyour developmentenvironmentso we have discussed so far how we canuse day one anomaly detection usingsimple statistical models and heristicsbut as your system grows and yourcomplex data patterns emerge some thingscan change which your uh statisticalmodels might not scale to so this iswhere we want to go beyond justdetecting anomalies and use more complexmodels and model these complex datapatterns using deep learningso while deep learning models are verypowerful the complexities that come withit are something that need to beaddressed and may become a blocker onday one anomaly detection on theapplication side our biggest problem isthere is lack of label data which meansyou don't have any idea of like whatanomalies are because by definitionanomalies are rare second this lack ofdata can also cause your models tooverfit so it'll be hard to once yourmodels are over it'll be hard togeneralize across different use casesfeature engineering is still somethingthat you need because any machinelearning problem uh deals with anymachine learning model is only as goodas your data and finally interpretationof your models is still a challengebecause deep learning models aregenerally considered to be blackbox andit's not helpful in your early stages ofanomaly detection we're still trying totrust our modelson the infrastructure side deep learningmodels are resource intensive especiallyfor your training jobs it's hard toscale them across different environmentsmanaging environments across trainingtesting production adds a whole newlevel of complexity and finally you haveto find a way to iterate these uh anddeploy these models reliably andcontinuously so while featureengineering is complex we can't reallyget away from it the model as Imentioned is are only as good as thedata and so you need to ensure you havehigh quality features and data inputdata and the reason you don't start withdeep learning models is that they'rehard to interipret but as you go forwardand you understand your data and modelsmore you can improve your interpretationof yourmodels however we're still left with twochallenges that we can actuallyaddress one of the best ways to overcomecomplexity in existing model um is toreuse pre-trained models especiallythose which are trained on large diversepublic data sets that's where transferlearner comes in when you're leleveraging these pre-trained modelsoften they are trained on diverse datasets or public data sets you can thenfine-tune it on your own data and asmall amount can go a long wayfinally you validate and refine yourmodel with your feedback from your realanomalies in your data and you can inputthis in your real training data set touh get better results going forward thisallows you to get up and running quicklyeven if you don't have tons of labeldata from dayone the there are several open sourcemodels that are built exactly for thesekind of problems and here are a few thatI'd recommend checkingout allright now Critica will cover some intrachallenges with deploying deep learningmodelsawesome thanks Rashan so now that weunderstand how to navigate thechallenges of applying deep learningmodels for anomaly detection on day oneon the application side let us go backto our infrastructure concerns and seehow we can overcomethose so we already know the answer tothis kubernetes gives us the foundationto overcome these infrastructureconcerns that we already have but totruly operationalize ML on uh top ofKubernetes we need something that ispurpose-built and the answer is CubeFlowso CubeFlow is an open- source uhplatform designed for uh making ML onKubernetes simple portable and scalableit provides the individual componentsthat you would need at every step ofdeveloping deploying and productionizingML models um it has Spark operator fordata preparation and feature engineeringit has the trainer for training yourlarge scale distributed uh training jobsum it has a model registry where you canstore your ML metadata as well as youruh model artifacts um Kserve is amazingfor inference and model serving at scaleit can do autoscaling for you and cubeuh it also has cubeflow pipelines whichhelps you with uh containerizeddeployments progressive rollouts as wellas automated pipelines so with Cubeflowwe can focus on the benefits of deeplearning without having the overhead ofsetting up custom infrastructure fordeveloping and deploying your machinelearning models inproduction okay so now let's talk abouthow we can use CubeFlow to set up a uhscalable anomaly detection pipeline fromdata prep all the way to productionokay so let's say we start from apre-trained model and you know I pickedan LSTM model from huggingface you can pull this model and storeit on the cubeflow uh model registrywhich is a centralized uh versioned datastore for all of yourmodels and the first step as we alreadyspoke about in any deep learning or MLpipeline is data prep we can use thespark operator to run distributed datapreparation jobs at scale and the samespark operator can also help us runtransformations like PCA or timewindowing which is essentially togenerate features for our modeltraining the next step is training wecan use cubeflow trainer for transferlearning our pre-trained model the LSTM1from hugging face on the new data thatwe have and we can also leverage KIBwhich is a tool that um is part ofcubeflow which helps with hyperparametertuningand finally once the training iscomplete uh we can use the cube cubeflowpipelines to package the model uh modeluh as a container tag it and push itback to the modelregistry with the model containerizedand published kserve takes over andserves the model on a live service onKubernetes the same anomaly detectionmodel can be used on a wide variety ofuh data sources and a wide variety ofuse casesso this is just one way that we canleverage cubeflow to help withovercoming the uh infrastructureconcerns that we spoke about earlier anddeploying scalable anomaly detectionmodels in production on day one there isvery detailed documentation um on caseof thjey've done a great job and I have aQR code here and you can just um scanthat it also has step-by-stepinstallation instructions as well soit's pretty goodokay so now that we've looked at how toaddress the uh application as well asinfrastructure concerns that we may havewith anomaly detection on day one withstatistical models as well as deeplearning models let us just quickly walkthrough how you can set up cubeflow onyour local machine and deploy a simpleanomaly detection modelso we'll walk through a very simplesetup and we'll use a pre-trainedanomaly detection uh for cubeflow and ininstead of u you know asking to installa whole uh cubeflow platform you canjust install whatever is necessary andfor this you just need kserve and itsdependencies so we'll install itsdependencies uh spin up the localkubernetes cluster load a simple modeluh run a minimal pipeline and deploy uhwithkserve so first you'll have to set upyour local environment so we start withthe basics kind customize cubecuttle the next step is for us to spinup a cluster and this is the clusterwhere we will run uh kserve but beforewe install case we need to install someof the dependencies that it has so thefirst one is sto it needs it for itsingress and um servicemesh uh the other one is search managerwhich is for secure communicationbetween theservices and finally uh cube uh kserveis built on uh k native so So it needsuh the model life cycling networking andscaling models uh from K native so wealso install that these are essentiallythe building blocks that will power KSERuh KS's autoscaling traffic routing andserviceexposure so once we've got STTO SRmanager and K native installed we haveto now set up the final piece which isKserve queso is the component that uhknows how to take a model and expose iton a live API so that you can performanomaly detection on it and that's justhow you doit so now that our infra is ready we'lltake a very quick uh small model that wecan u install so the walkthrough isusing a uh isolation forest fromscikitlearn and this is one of thesimplest uh unsupervised anomalydetection models that you can use butit's very effectiveso this short Python script has um Idon't think I can show it okay it has uhit loads asimple CSV file of the time seriesvalues reshapes it and fits it into anisolation forest once the model istrained it saves it as a pickle file thepickle file is all case needs to runinference it gets the mo it gets themodel uh from just the pickle file andyou don't need to containerize any ofthis training code that we have here orbuild anything fancy you just need togive this model toKSER so now you have to deploy the modelfirst we serve the model over HTTP fromour local machines uh using a very basicPython web server and this is so thatKSER can wget the model then we writethe infrance service which basicallytells uh ks of what is the name of yourmodel where can it find the model andwhat is the framework it's using and inthis case it's using skarn we apply thisyl and queser will spin up a clusterfetch the model and expose it on an httpendpoint so you can hit that forpredictions so if you hook this up toyour service you can essentially performanomaly detection on a very wide rangeof use cases and you can also storethese results on Prometheus and displayit on a dashboard for bettervisualization and also add um alerts wehave as we have already looked at andthis is something that we have setup okay so a quick recap we set up caseinstalled dependencies brought up aminimal cub kubernetes cluster uh runran a minimal pipeline and deployed themodel toqueserve this is a very simple approachbut it is perfect for day one anomalydetection because it's simple and whenyou want to iterate faster with realworld data this is what comes inhand uh very quickly um here are someresources yesterday there was a cubeflowsummit i hope a lot of you attended ituh but this is their page and you canfind the videos on that and um there'salso a lot of documentation out therefor uh cubeflow i personally recommendthe cubeflow 101 uh talk which is forbeginners and there are a bunch of shortvideos that you can learnfrom okay all rightso this is now our full end toendpipeline from data preparation featureextraction modeling deployment uhserving and inference you have now setup your anomaly detection for runninganomaly detection on any use case thatyou want and this is both the trainingand inferencepipelines but now let's talk about whatare some of some of the advantages andwhat should you watch out for when doingday one anomaly detection so as we'vealready seen the advantages of day oneanomaly detection is you catchperformance regressions early you findintegration bugs early you validateassumptions that you made whiledeveloping before that hits real uhusers and impacts them you also shortenthe feedback loop of uh your applicationinfrastructure changes that you make andfinally root cause identification isalso more focused because now you'redoing it at development and the changesthat are within a build are much lesserthan in a release so you are morefocused when you're finding the rootcause however it also comes with its ownlimitations ci and staging environmentswhich are pre-pro environments typicallyhave low data volume which means thatit's hard for the model sometimes tolearn meaningful patterns you also willlikely hit a higher volume of falsepositives because of the inherent uhnature to be noisy for theseenvironments if you're retraining yourmodel frequently your model mightoverfit to unstable patterns that are inthese environments and that kind ofreduces theeffectiveness and finally what mightlook like an anomaly in staging or in CImay not be an anomaly in production andvice versa but the advantages stilloutweigh the limitations and the key isto keep it simple while you start useinterpretable models and then finetunefine-tune the thresholds and as you gointo more complex scenarios you cangraduate to deep learning models thatare meant for more complex scenarios andremember treat this as a way to augmentbut not replace your traditionaltesting so let's go back to our uh MTRgraph we had in the beginning of thepresentation while during the lastCubeCon we spoke about how to leverageanomaly detection for u reducingmeanantime to detect and meanantime toresolve now we'll talk about howincorporating that into our developmentpipeline can help so with day oneanomaly detection we could reduce thenumber of failures that happen inproduction in the first place inproduction anomaly detection issomething that uh is done aftersomething goes wrong and you're alreadyin firefighting mode but in developmentanomaly detection becomes a proactiveforce during application development youcan spot spikes and regressions muchearlier in CI testing you can flaginstability introduced by code changesor environmentdrift in staging you can catchintegration issues like misbehavingservices or skewed dataand finally in production you usuallycatch catch issues that are related toenvironment and scale and all of theseuh contribute to a powerful shift you'renot just reducing MTR you're reducingthe frequency of failures altogetherbecause you're treating differentfailures at different layers of yourdevelopmentcycle and let's bring this all togetherin this final idea if your anomalydetection only kicks in in G you'retreating the symptoms but by embeddingit into the full development life cycleyou shorten the lifespan of an anomalyand catch it earlier duringdeployment you're also catching itearlier and responding faster whichmeans that you are stopping the failurefrom ever reaching production so if youstart thinking about anomaly detectionnot as a last mile ops but as a firstmile development companion you're notjust reducing MTR but you'realso increasing the meanantime betweenfailures and effectively making theservice moreresilient okay so we hope that you tookaway that anomaly detection is more thanjust a production tool and that you'reinspired to include anomaly detection aspart of your development life cycle andyou know thank you for being here todayand we'll be happy to take somequestions[Applause]2025-04-15 22:02:13.162513 4��}�p#��1AmqXZ2T-jWuUhi everyone uh welcome to our talk uhthank you so much for coming i'm sureit's been a long day so hopefully uh itdoesn't feel too long um so here's ourtalk Flink on Carmata building resilientdata pipelines on multiclusterKubernetes um we'll guide you throughthe journey that we went through tosupport stateful application failover onApache Flink and the collaboration thattook place between the Carmada communityas well as the Bloomberg streamingplatform team to add this supporthopefully by the end you'll havefollowed along been introduced to theCarmada project and learned some of thebasics about how you can improve yourdata pipeline resiliency aswell so starting with introductions myname is Mihas Chachilo um I'm a seniorsoftware engineer and tech lead atBloombel��+�o#�� AjiT7kGqcpR4okay good morning everyone thank you forbeing here with us today um I hopeyou're all enjoying CubeCon so far yes Iwill take that as a yes all right sotoday we're going to talk about day oneanomaly detection forobservability but before we get startedwe'll take a quick moment to introduceourselves hi I'm Kitika i'm a machinelearning engineer at Apple and I workwith the observability team and mybackground is in observability machinelearning and data science prashant helloeveryone uh I'm Prashant i'm also amachine learning engineer at Apple ialso am a part of the observability teamand my background is in machine learningnatural language processing andobservabilityall right okay so for those of you whoattended our talk last year in CubeConuh Salt Lake City we spoke about how wecan leverage anomaly detection to notjust improve MTD which is meantime todetect but also improve meanantime toresolve by just ingesting someadditional information and bootstrappingthe anomaly detection models but forthose of you who did not attend our talkthere is a QR code up there which youguys can just take a picture of um it'son YouTube you can watch it so nowPrashant will go into what we're goingto talk about todayall right thank you Critica so we asKitika mentioned we previously talkedabout anomaly detection and how it helpsto detect and resolve incidents fasterbut we actually wanted to take it a stepfurther and see if we can actuallyincrease the time between failures thatis if you can improve the mean timebetween failures in today's talk we'lltalk about how combining metrics modelsand modern-day observability tools canactually shift your approach from beingreactive to proactive we'll start with acase study of day one anomaly detectionuh we'll talk about the cold startproblem and how to overcome it with likea simple demo and then we can also talkabout how to train and serve morecomplex models withCubeFlow so here's what we are hopingyou'll walk away with first anomalydetection isn't just a post-productiontool it should be a day onedecision second great anomaly detectionstarts with great observability andthat's not just metrics but meaningfulmetrics your feature engineering shouldbe your first model next your modelshould align with your operationalreality you need models that strike theright balance between accuracy latencyand interpretability to drive themaximum value and finally anomalydetection isn't just for productionthere is immense value in applying itacross staging your CI/CD even adevelopment environments if you want tocatch flaky behavior before it even hitsproduction so let's in talk about a casestudy we wanted to introduce our newstartup an astronomy theme store calledStella Stash in this case study we'llfollow what happens when our store runsinto some issues before it even achievesliftoff so in seller stash we selltelescopes for stargazing binoculars ifyou're into bird watching or spying onyour neighbors and all sort of otherfancy gadgets we also offer oemrguh my name is Leewan and I'm also asoftware engineer from Bloomberg as welltogether we work on the streamingplatform team which provides ApacheFlink to many different users withinBloombergoh sorry uh for the agenda today umwe'll start with a condensed overview ofApache Flink for those who might not befamiliar um so we can provide just morecontext for you as it's the main pointof our talk specifically we'll touch alittle bit about the importance ofapplication state and the recoverymechanisms that we already um that arealready provided by Apache Flink withthat background in mind we'll reviewsome of the challenges we faced withmanaging Flink applications in amulticluster environment while ensuringresil resiliency and how we were able touse Carmata to support automated andstateful cluster failoverfinally we'll touch on the overallbenefits of Carmada the outcomeslimitations and further improvementswe're working together with the Carmadacommunity um to improveso at Bloomberg we operates a largescale streaming platform that empowerscritical financial data processing wecurrently have around 1,000 uniqueApache Flink jobs running on top ofmultiple Kubernetes cluster spreadacross multiple tiersthis Flink jobs handles a variety of usecases including data ETL real-timeanalytics and eventprocessing to give you an idea of itsimpact this system is crucial toBloomberg's core financial productswhich provides real-time marketinginsights to traders analysts andfinancial professionals across the worldit this image here showcas theBloomberg's terminal which is where theprocess data is visualized helping ouruser to make the informed decision inreal time given this scale ensurereliability and efficiency is our toppriority this is where Flink on Kamadacomes intoplay so what is Apache Flink apacheFlink is a popular open-source datastreaming framework it has featuresincluding low latency flexible and highscalable to process massive amount ofdata it also has exactly once guaranteesto ensure dataintegrity last but not least it comeswith a built-in fault tolerance andstate management system to maintainreliability in distributedenvironment flink job often runscontinuously but what happen if afailure occursto ensure reliability Flink use statesnapshot including checkpoints and thesafepoints in case of a failure Flink canautomatically restore from the lateststate minimizing data loss anddowntime here is an example the eventsare get ingested from an event log intoa flink application and internally thisapplication periodically write acheckpoints to a persistentstorage if a failure occurs Flinkrestore from the latest checkpoint toensure seamlessrecovery now that we understand theFlink state management let's talk aboutthe job deployment on Kubernetesapache Flink supports Kubernetesnatively which makes deployment scalableandmanageable the Flink operator automatesjob life cycle management including jobdeployments upgrades and autorecovery here is how it works the usersubmit a job YAML file to Kubernetes andthen the Kubernetes API server processthe requestthen the Flink operator deploys upgradesand monitors Flink jobs inside thecluster you will finally get a fullymanaged Flink job running on top ofKubernetes in distributed environmentfailures can always happen but Flink isbuilt to recover gracefully using the HAsettings Flink can recover from clusterinternal issues such as hardwarefailures pod crashes or transit networkglitches in this diagram on the leftside we see a normal job execution theuser code interacts with a local stateback end and state is persisted as asafe point into a distributed filesystem if a failure occurs and the jobcrashes the persistent safe point isstill there on the right side when thejob restarts it reloads the safe pointrestore it state and continue as ifnothinghappened so as previously mentioned thestreaming platform team manage managesmany different Apache Flink applicationsum and as the platform grew we startedto encounter some challenges inscale so most notably uh control planesare currently tied to a single clusterand given nthat our platform spansmultiple tiers with users deploying manydifferent jobs um as the number of jobsgrew this starts to become quitecumbersome uh users have to keep trackof the clusters they're deploying tothey have to worry about whether thecluster has enough resources to hosttheir job and they have to manage abunch of cube configs in order to set uptheir CDpipelines additionally once these jobsare actually deployed we expect them tobe long running and actively processingdata but given the nature of longunningjobs failures aren't something thatcould happen but they're something thatwill happen apache Flink luckilyprovides a lot of coverage for dealingwith these types of intermittentfailures with its high availabilitysupport but high availability runs onthe assumption that recovery will takeplace within the same cluster let's takethe Kubernetes high availability as anexample the metadata that Flink willwrite to the config map will point tothe latest state that was published bythe Apache Flink uh application but ifthat config map is deleted and the jobfails the Flink operator will be unableto reconcile what the latest stateactually is and it'll ask the user tomanually reapply thejob finally this means that clusterfailover itself at the present moment isa manual process if a cluster goes downwe get alerted of this and tenants willeither have to have their own highavailability setup in place so eitherrunning two different um pipelines intandem in different data centers anddouble processing or they'll have tomanually redeploy their job to a healthycluster this can be a bit cumbersomebecause users will have to understandwhat their latest state is um and as thenumber of jobs grows this can get quiteerrorprone and then lastly themaintenance windows that we do on ourclusters become very expensive andrequire coordination um given thatKubernetes updates quarterly um that hasa lot of maintenance windows in one yearum so each time we have to migrate usersoff so given these limitations what doesour ideal control plane actually looklike well we wanted a unified controlplane that would require users to haveuh or require users to only use a singleQ config and apply their jobs to asingle placesecondly once these resources wereapplied the control plane can handle thelogic of intelligently actuallyscheduling these applications to one ofthe clusters in the federation keepingin mind available resources or noderequirements and then finally we shouldalso monitor the health of both theKubernetes clusters and the individualapplications that have been scheduled ifthe control plane detects some sort ofissue with the cluster or an applicationand it stays unhealthy for a prolongedperiod of time uh we should try toreschedule the application to anothercluster so as we started to solidifythis ideal version of a central controlplane we started to research somepotential options that were indevelopment this is where we wereintroduced to the Carmata projectcarmatada is an open source Kubernetesmanagement system um it's built tomanage cloudnative applications acrossmultiple Kubernetes clusters and itcomes with a variety of helpful featuresthat users can uh tune and tailor totheir specific usecases it allows platform owners tomanage groups of Kubernetes clustersfrom one place um minimizing theoperational overhead of dealing withmany many clustersit gives platform owners the ability todefine their applications and how theyshould be scheduled to clusters withinthe federation with advanced featuresincluding resource aware scheduling aswell as clustered affinityrules carmata also provides a unifiedauthentication endpoint as well as aunified endpoint for viewing resourcesand managing applicationsand finally what we found mostinteresting was the support forautomated crosscluster failover which issomething um we're talking abouttoday so in order to understand howautomated failover works in Carmada umwe wanted to go over how Carmataactually tracks both of these things sofor cluster health Carmono relies oncalling the existing Kubernetes healthendpoints um in roesponse if itdetermines the cluster is unhealthyafter a certain grace period um it willdo a taint-based eviction and re uhschedule applications to a more healthycluster for application health it's abit more granular uh Carmada provides aframework for defining resourceinterpretation uh so users can define uhhow they want their application to betreated whether it's healthy orunhealthy so this diagram represents thefirst version of how we used Carmata tomanage and deploy our Flink deploymentsso users would apply their namespacejobs to the Carmata API server endpointwe automatically provide the necessarypropagation policy on their namespace sofor those unfamiliar propagation policyis a custom resource defined by Carmatawhich tells Carmata how you'll actuallyschedule the application um that thepropagation policy is scoped for so inour case for example we set some threadconstraints on the Flink deploymentsince we want it to only be scheduled toa single cluster and the replicas willbe scheduled together so we set that tobe max one clusternow once the application is scheduledsuccessfully the Flink operator picks upthe deployment and schedules the jobafter which point the full status willbe available on the CarmadaAPI now let's talk about some of thelimitations we faced initially whenintegrating with Kamadafirst for the application fillover towork Kamada must understand how tointerpret the resource health howeverusing default settings Kamada doesn'thave any awareness of Flink state thisis an issue when it comes torescheduling and second when a Flink jobis rescheduled to a new cluster it needsto start from its previous state but thestate is not preserved during failoverthat means the deployment will be startfrom scratch which is an ideal for alongunning streaming job theselimitations motivate us to explorebetter solutions for state managementandfillover and to get a better solutionfor failover the most important thing isto understand thejob flink maintains its own internal jobstate machine and we have drawn asimplified uh diagram attached on theright side so how to how do we interpretit if a job is in the rounding statehere we filled with a green color or inany of the terminated state filled witha gray color like failed finishedcancelled or suspended we consider thisjob as a healthystate but it if it is in one of the blueeffirmal state like reconcilinginitializing or created we have to bemore careful if there is a no user errorlike bad image pass we treat it as ahealthy and don't trigger a failover butif the job is stuck in one of thesestates without a no arrow we treat it asunhealthy and finally we have the yellowcolored short-lived state likerestarting failing and cancelling theseare part of normal transitions and willquickly lead to a terminal state so wetreat them as healthy aswell let's take a look at what is ahealthy state transition usually ahealthy fling job start with areconciling state where the job managerisn't scheduled or hasn't report thestatus yet and next it moves toinitializing where the Flink componentsare starting up and then it reachescreated meaning the cluster is ready butthe job is not processing data yet andfinally it enters running where the jobis fully operational and processingdata here is another example of anunhealthy state transition as what wesaw before the job is in the runningstate and the processing data everythinglooks good but then it suddenlytransition back to created this usuallymeans something went wrong maybe taskmanager crashes or the cluster itselfhas some internalissues so by checking the job state verylikely we can tell if a job is healthyor nohowever there are some edge cases thatchecking internal job state is notenough we also need to look at the errorfield inside the state for deploymentissue including misconfigurations suchas bad container images or malformedyammofiles also it can come from runtimeproblems such as application bugs or badupgradesthis is what Miha showed before on howwe run Flink on Kamada in addition tothe previous propagation policy we add afailover field in the pYAML with thischange if a deployment becomes unhealthyKamada will wait for a threshold weconfigured and then p it purge it rightaway forrescheduling so was that a successfulfillover not exactly although Kamadadetects a job failure and reschedulesthe job it always start as new whichmeans we lost the running state in Flinkhowever to gracefully resume afterfailure we need the latest checkpointand also to get that we need to preservethe job ID across clustersso as Lee described although Carmato wascorrectly detecting application andcluster health now the failover wasn'tyet complete um we still needed a way tofigure out how to conserve informationfrom the previous job so that the newjob could have a reference as to whereto start up from after working with thecommunity we came up with an idea of astate preservation enhancement whichextends the existing failover API andthis is set directly in the Carmonapropagation policy it consists of twofields the first is a JSON path which isjust an expression uh that identifiesthe specific piece of data that we wantto conserve in status and then an aliaslabel name which is just going to be thename of the label so as you can see thepropagation policy on the right showsthis new API in action um with usselecting the job status job ID field umand then injecting this information as alabel into the newly failedoverfl now there's one last step here umeven though the metadata itself isinjected as a label into the Flinkdeployment we still need to convert itinto a way that the Flink deployment canactually consume that information so wewere already using Kerno web hooks forvalidation and mutation uh for thoseunaware Kerno is a policy managementtool which allows us to declarativelydefine policies that can validate andmutate watched resources so we decidedto reuse it and implement a mutating webhook that could inject the necessaryinitial save point path and point to thelatest state from the previously failedjob so here is the finalized diagram ofhow we did failover so let's assume thatwe had a job successfully running oncluster B at some point in the futurethe application suddenly transitioned toa reconciling state uh this means thateither the job manager crashed uh or theoperator is just unable to get thestatus um from the application due tosome sort of issue carmato will treatthis application as unhealthy assumingthere are no published errors andeventually start uh taking down thetoleration seconds defined within theresources propagation policy after thetoleration is up failover will kick inso Carmato will pull out the job ID fromthe published job status and then deletethe resource on cluster B carmata willthen inject the state preservation labelinto the scheduled resource um and willpoint to the previous Flink clusters IDso our Kerna web hook will interceptthis request pull the job ID and thenset the initial save point path so thatthe Flink operator knows okay thisapplication is starting back up from aprevious statenow there are a lot of benefits here forus um I think the biggest pain point fora lot of our users was needing to manageand keep track of a lot of differentKubernetes clusters so by having aunified control plane users can applyupdate and manage their jobs from oneplace with authentication andintelligent scheduling already bakedinto the setup the other huge benefitwas the automation of crossclusterfailover which is critical to ensuringuh data pipeline resiliency now it'simportant to note that crossclusterfailover should be a rare occurrence umApache Flank's high availability featureis expected to generally handle jobfailures that occur within the samecluster and so generally applicationsshould self-reover this is really onlyfor cases where there are exceptionalcircumstances uh such as a partial ortotal cluster failurein those cases it saves us as platformowners a lot of time and also allowsusers application to autoreover in ahealthier cluster without their explicitinvolvement following that veindifferent users can have differentfailover requirements so it's reallygreat that uh these failover umdefinitions are flexible andconfigurable as some users might be moresensitive to failover um whereas othersare okay being moreconservative uh but as a whole automatedfailover has helped remove a lot of theoperational burden that comes withmanaging our platform on a day-to-daybasis um and additionally maintenancewindows no longer require heavycoordination we can simply uh evictapplications from a cluster and havethem restart in a different one um andit also uh provides us with the abilityto DR test out of thebox so how would you go aboutgeneralizing this and trying out thefeature yourself so luckily Carmadaalready provides resource interpretationfor most native Kubernetes resources uhsuch as deployments or stateful sets butif you're using a CRD you may have todefine your own custom interpreterluckily doing so is pretty easy uh allyou have to do is uh fill out uh thecustom interpreter interface withseveral different methods that you thinkare applicable to your use case so forexample uh you could fill out the getreplicas method um which tells Karma howmany replicas your CRD will schedule umalong with the resource requirement oryou can fill out the interpret healthmethod which Carmato will use todetermine when your CRD is healthy ornot secondly you'll need to determinehow you'd like to configure failover sohow sensitive Carmata will be atdetecting issues and triggering afailover how quickly do you needapplications to be purged from thecluster do you want them to be deletedimmediately or do you want some graceperiod um and is there any state thatyou need to conserve betweenfailovers lastly once this state isactually uh injected into your resourceyou'll have to find a way for the memberclusters to actually consume it um inour case we used a web hook to injectthe initial safe point path in the Flinkdeployment specnow there are some existing limitationsand pending improvements with thefeatures uh that we've talked abouttoday so with application failover uhthis is still a developing feature andthe karma community is continuing togather more use cases to betterunderstand gaps in the existingimplementation so for state preservationspecifically uh this feature was addedrecently in version 1.12 and we want toencourage more users to try it out andsee if it fits their needs from theperspective of failover itself there aretwo points to consider uh the first isthe toleration seconds needs to be tunedcarefully so toleration seconds kicks inimmediately after the resource has beenscheduled to a cluster so you don't wantto set it to be too low otherwiseCarmato will continuously try toreschedule your application if it neverreaches a healthy state um so this meansthat toleration seconds itself islimited to be greater than theapplication's initialization time uhwe're working with the community to seehow we could change this uh potentiallyonly uh taking toleration seconds intoaccount after the applicationtransitions to healthy for the firsttime and then additionally uh the healthinterpreter uh is only really meant toprovide an estimate as to whether or notthe application is unhealthy um for mostcases uh we assume that uh if you're ina reconciling state or created stateyour application isn't activelyprocessing data um and so if there is noerror there we assume that there couldbe some sort of cluster issue causingproblems lastly cluster failover itselfis currently undergoing improvements asa result the state preservation featureum has not yet been made available thereuh but keep your eyes out this supportis supposed to be addedsoon uh lastly for the Karmada communitywanted to add a little bit of a plug umwe're a very active community so ifyou'd like to contribute uh please reachout to us on Slack or onGitHub and we're also presenting furtherat CubeCon uh there are two other talksthat are going to be given tomorrow umso I encourage you to attend those umand if you'd like to talk to thecommunity in person we have a desk inthe uh project pavilion i think thekiosk number is 4B um and we'll be therein the mornings2025-04-15 22:02:13.652472rCassendraclustersnow the configurations at Yelp aremanaged in a declarative way uh using acentral git repository which we calledYelp s config uh as shown in the uhdiagram any updates to theseconfigurations are then converted into aKubernetes custom resource uh here theoperator is continuously watching thiscustom resource and as soon as itdetects a difference between the actualand the desired state it reconciles andit takes action to reconcile in convertit into relevant Kubernetesresources we use persistent volumes uhas we are talking about the statefulworkloads and these are attached to theKubernetes podsone important thing to note here is thatuh our clusters are spread acrossmultiple data centers in multipleregions so within a single data centerwe maintain a replication factor ofthree uh however for simplicity uh inthe upcoming slides there may be asingle replication setupshown so jumping to some basics aboutKubernetes ports uh a pod is thesmallest deployable unit in Kubernetesessentially it's a group of containerthat share the same network and runs ontop of uh the same Kubernetesnode so when a pod is scheduled all thecontainers start in parallel while thereisn't any strict limitation on thenumber of containers that you can defineinside a pod but from managementperspective you can think that it couldbecome complex if we have uh too manycontainers inside apod so now there can be cases where weneed to perform some certain stepsbefore these containers starts andthat's where init containers comes intoplayuh so these are initate containers arespecialized containers that make sureeverything is set up before mainapplication is running however few keydifferences here firstly unlike maincontainers which run in parallel initcontainers always run sequentially oneafteranother and secondly an init containermust run to completion successfullybefore either the next init container isinvoked or the regular containerstarts and similar to the regularcontainers there is no strict limitationon the number of init containers thatcan becreated and thirdly init containers aredesigned for setup tasks only they arenot designed to serve user traffic sothat's why you will not find readinessand livveness probes supported in initcontainers so jumping to our applicationof init containers and solving differentuse cases the first problem is abouthorizontal scaling of Cassendraclusters how scaleup works in Cassendrawhen a node joins the cluster there is aseed node with which this newly createdCassendra node interacts to learn aboutthe cluster topology how many nodes arethere and the token range assignmentsonce this new node knows about thecluster then it enters into a bootstrapprocess where token ranges are assignedto it and then it starts streaming datafrom its peers and this process can takesome time depending upon the volume ofdata that it needs tostream so once this streaming processcompletes the new node joins the clusterand becomes ready to serve user requestso far everything looks simple and worksfine but the challenge we face is whenwe are scaling a Cassandra cluster whichhas change data capture enabledchange data capture is often acronymedas CDC which is a way to detect changesin data and allowing it to convert itinto astream how it works under the hood isthat each time a right is happening on aCassendra cluster cassendra internalwill create a commit log file thatcontains information about the changewhether it's an update delete or insertwhen CDC mode is enabled something extrahappens where this commit log getsreplicated into another special CDC rawdirectory from where independentconsumers can read those commit logs andprocess according to the businessneeds in our case we publish theseevents into Kafka and from there weensure that the downstream system getssyncednow CDC is controlled by a configurationproperty uh called CDC enabled thatdetermines whether this mode is activeornot now the problem arises if we arebootstrapping a new node with CDCenabled so the bootstrap data will alsobe emitted into CFKA because CDC eventswill be generated for sitnow this can lead to unnecessaryduplicate events and we may have tospend some additional cycles on thedduplicationfront additionally the bootstrap processalso becomes slower because Cassendrahas to perform some extraoperations for all the data that is itis streaming uh this makes scaling withCDC enabled more complex and timeconsuminghow we solve this problem we used theinit container uh so as soon as anincrease in replicas is made inKubernetes stateful set a new pod getsscheduled and is assigned to aKubernetesnode and then we start an init containerbut before jumping on the actions of theinit containers a few things that needsto be considered hereuh as we mentioned previously thatstreaming of this data can be timeconsuming especially for large clustersso we need to minimize thedisruptions and at least what we can dois that we can eliminate the voluntarydisruptions and this can be done byappropriately setting the maxunavailable property in the portdisruptionbudgets at the same time not alldisruptions can be eliminated so thereis still an involuntary disruptions thatcan happen any time for many differentreasons like due to hardware failuresnetwork issuesetc so this means that we should beprepared the init container code shouldhave some preparation to handle thosecases as wellthe another thing to consider is thateach time a port restarts the initcontainer will be run so the init codeshould be only invoked when it's neededand it should be item potent aswell so in order to ensure that withinthe inate container we first checkwhether the node is already bootstrappedor not so we don't do rereaming again asimple way of doing this is by checkingpresence of data on the attachedpersistent volumes uh if streaming gotinterrupted previously by any hardwarefailures or any otherdisruptions we don't automaticallyhandle it for this particular use caseinstead we let our on call person toreset the storage and make a decisionbecause such events in our experiencehas beenrare next Cassendra process is startedin init container but we with CDCdisabled so any data that is streamedfrom the peers it will not be uhstreamed intoKafka now you might be considering thatany new rides that are happening at thistime they might belost but because we have a replicationfactor of three so even if one node isnot processing uh any data it means thatthere will be another two that areprocessing the same set of data at thesame time so none of the data getslost and we have some readiness scriptthat indicate once the streaming iscomplete and Cassendra is ready to servethe traffic in the so once we get agreen signal uh Cassendra has joined thering we stop Cassendra processgracefully and exit the init containeruh the reason why graceful stopping ofstateful workload is important becauseany abrupt termination can lead to datacorruptions so once we exited from theinit container Cassender started back inthe main container but with CDC enabledfrom this point onward we startprocessing the CDC events in the CDC rawdirectory and in this way careful usageof init container enabled us tohorizontally scale the cluster withoutmuch humaninvolvement along with that we also haveless duplicates and the scale up timealso wasreduced the second case where we used init containers was Cassandra upgrade thiswas a project we did last yearand when we upgraded our entireCassendra fleet from 311 to 4.1 versionuh however there were two challengesfaced during the upgrade uh but beforetalking about specifics someterminologies to explain here sotypically from operational perspectivenode tool status command in Cassendrahelps you view the state of differentnodes inside Cassendracluster and the output of node toolstatus uh the first alphabet heredetermines the state of the clusterwhether the node is up or down and thesecond alphabet tells whether the nodehas joined the ring or notso there can be many states but a few ofthem just discussing like un means thatit's up and normal and dn refers to thenode is down but it has joined the ringand it was normalso for the upgrade we followedthe standard rollting upgrade processupgrading one node at a time but thisfailed for us and more technical detailsabout why it failed uh has been capturedon the Jira ticket in the open sourcecommunity but I would try to explain aproblem with some smaller examples uhlet's consider that we have a 311Cassandra cluster with three nodeshaving X Y and Z in the last octets oftheirIPs in our environment we don't usestatic IP assignments so as pods arerestarted the IPs can change and eachtime it's assigned whatever IP isavailable in the poolso as shown in the diagram the Cassendranode withX.X.X.Y IP after the upgrade gotassignedIPX.x.x.a the problem that happened isthat new upgrade new upgraded nodewasn't accepted as a replacement by thepeers that were running 311 version andthis was because there was someexceptions that happened during thegossip exchange uhmechanism so we did some moreexperimentation and then we observedthat if both IPS and version they changetogether the upgraded node is notrecognized but if we can make these twothings change gradually the processwould work fine forus so as part of this solutions weleverage the init containers to make itgradual and so this is the last state ofCassendra nodes prior to upgrade withthe version running 311 and IP is x.x.yafter a restart of pod during therolling upgrade a new IP gets assignedto the pod as soon as it isscheduled so however here we startCassendra first with older 3.11 versionin init container so that the peers canrecognize that this new IP is replacingthe olderone so one only one of the attributesgets changed we wait here till Cassendranode is ready to serve the traffic andthen we gracefully terminate thisworkload and exit the inatecontainer then laterwe started out Cassendra with 4.1version in the regular container nowbecause the network stack is shared theIP assignment doesn't get changed acrosscontainers it remains consistentso in this way once this new node wasbrought back up it was successfullyjoined the ring and we had a consistentnode tool statuseverywhere so you can think that howcareful utilization of init containerswe were able to upgrade Cassendra nodesseamlessly and we followed thisprocedure for our entire fleet with 70plus clusters and more than thousandnodes inthem the secondary problem that we facedduring upgrade was related to SS tableformat versionsso DSS table format versions can changewith Cassendra versions uh in the tableyou can see that there is an NV formatin 4.1 version which used tobe ME in the 3.11.13 version andsimilarly it used to be MD and MCrespectively in earlier 311 versionsso usually from read perspectiveadjacent major versions support readingfrom older SS tables but there aren'tany performance guarantees aroundit so it's recommended as soon as youupgrade it up upgrade the node upgradethe SS tables aswell uh in pre4.x versions we alwayshave to invoke a node tool upgradecommand uh but with 4.x X versions thereare some there is some automated ways uhbut the problem with those automatedways is that you have less control onhow many nodes the upgrade SS tableswould berunning so there can potentially be somedisk pressure created because when weare running upgrade SS tables each SStable gets rewritten into the newerformat so in some cases you may want torun this in a controlled fashionour solution was quite similar to theprevious one relying on the initcontainers we triggered the upgrade as astable process by a spec change in theElswa config that gets converted into acustom resource the operator then makesthe stateful set change and then theport gets restarted in a rolling fashionso the first thing when the port getsstarted is we start out init containerand check whether there are oldformatted as a stable versions presentor not if not we exit the init containerelse we start cassender process and waitfor it to becomeready uh once the cassender process isready we upgrade the SS tables until itcompletes now this process can also belong running depending upon the amountof data that is present on thedisk and once everything gets completewe exit the inate container and startCassandra as a as a regularcontainer again we ensure item potencyif a pod is terminated it should havecapability toresume uh so this approach helped us inlimiting the disk pressure to at max asingle nodeand everyone loves seamless upgrades andinit container ensured that it was thecase forus the last case I would like to talkabout is recovering cluster frombackups we use Medusa which is anopen-source tool for copying SS tableformatted data on disk to S3and the picture on the right side showsthe architecture of Medusa so inaddition to the assistable files thereare some manifest files that containsmetadata and informationabout like the time stamp at whichbackup was taken backup typeetc in our environment we support boththe full mode and differential modes butwe try to leverage differential one asmuch as wecan so this backup process runsperiodically in sidecar in ourcase when we want to restore a clusterwe configure a new Cassandra clusterwith some additional properties thatlets us know about the cluster andbackupuh as shown you can see on the slide sothe backup identifier is the timestampin our case at a minute resolution leveluh so based on these properties a newCassendra cluster gets created then initcontainers are started uh and in thoseinit containers we pull data fromS3 if we follow a sequential approachone node at a time it would take muchlonger so we expedite things by makinguse of parallel pod management policythat allows all pods to come up inparallel and start pulling data fromS3 and once all nodes have completelypulled data onto the disk we exit theinit container and startCassendra uh and once all Cassendranodes are up running in regularcontainer they join together to form thecluster so in this way we ensured ourrestore process is fullyautomated but the requirements ofdisasterrecovery is quite uniqueyou the process is usually rarely neededbut when it's needed there is urgencyyou don't havetime so we build some automation aroundthe inate containers to ensure that ourprocess always remainsfunctional yeah this moves us to ourconclusions uh as you can see that initcontainers are quite handy for realstateful workload applications s a fewtakeawayshere uh the first one is about ensuringthat init containers are should be itempotent any failures that can happen anytime and for any reasons so you need tomake sure that there is some failuremodehandling also init container get codecan get launched each time when a podgets restarted uh there can bepotentially two cases either you want torun init container code every time thepod restarts like setting up a networkor something similar to it or the othercase could be that you run the inatecontainer code based on the persistentstate on the disk in our case all of theexamples that we discussed actually wererelated to the latercase so whenever we don't want to runinit container code uh we should behandling it inside the init container sothat it can quickly uhexit another takeaway is minimizing thedisruptions so if we treat disruption asaprocess rather than an event so it meansthat we can have some degree of controlover it at least on the voluntarydisruptionside so we should be reducing thosedisruptions where the pod disruptionsbudget and this is particularly usefulif the init container code is runningfor long though we try to keep theduration as short aspossible the third takeaway is about uhgracefully exiting the init container weknow that the main containers don'tstart until init container completessuccessfullyat the sametime any abrupt termination of statefulworkload should be avoided because thenit can lead to datacorruption so with this I would like tothank you for your time thanks forlistening and if you want to learn moreabout our engineering systems you canvisit Yelp's engineering blogalong with that uh carrier opportunitiesat Yelp you can explore on our carrierspage which is yelp.carrierscarriers and yep there is a QR code forgiving thefeedback and uh if we have some time Ithink we have a couple of minutes we cantake some questions2025-04-15 22:02:14.461948 d ��6�r#��#AvCfehltPKxkhello everyone this is S and Aki fromBloomberg and we're here to talk aboutTrino and data governance on Kubernetesa story of how we were able to deployTrino on as a pro service product onKubernetes to meet the evolvingrequirements of data analytics and datagovernance within ourcompany so here's a high level overviewof the data environment at Bloomberg ourteams deal with a massive scale of dataoften ranging in pabyte scale offinancial data spanning across theglobal markets and we acquire data froma variety of different sources likemarket data news from over a 125,000different curated news sources as wellas thirdparty alternative data to name afew our teams often also ingestreal-time streaming data and analyticsfor market insights and recently there'sbeen a growth of self-maintained datacataloges across teams utilizing v��q#��=AnTmwmd4fcGIhello everyone and welcome to a sunnyCubeCon 2025 in London i hope you arehaving an amazing day of learningtoday many times we see bigger impactsor greater results achieved from smallwellexecuted things that are putcreatively together and today I want tohighlight one such small lessappreciated and talked about butextremely powerful Kubernetes featurethat is the init containerand share how we at Yelp leverage inatecontainers to reduce the operationaloverhead of our Cassendra clusters andenhance the engineeringefficiency so introducing myself first Iam Muhammad a tech lead in the databasereliability engineering group at Yelpwhere I primarily look around the NoSQLdatabases uh and in particularCassandraHere is what we will try to cover todayuh I will try setting up the stage bysharing some information about thebackground in which we operate and giveyou a bird eyee view of how we useCassandra at Yelp then we'll dive intothree different uh problems that weresolved by inate containersthe first was about cluster scaling andhere we will specifically be looking atthe horizontal cluster scalingaspects secondly I will talk about theCassendra cluster upgrades and theimpact init containers had in making theoverall upgrade process easier andsmoother and thirdly I will touch on thecluster recovery aspects from backupsuh in the end I will try wrapping upsome key takeaways for you so that initcontainers is easier for you for yourusecases so starting off with thecontext how many of you have heard aboutYelp can you raise yourhand yeah that's quite interesting uhthose who don't know Yelp is acommunity-driven platform that connectspeople with great local businesses andmillions of people rely on Yelp for theuseful and trusted content about thebusiness informations reviews and photosto inform their spending decisionsas of December 2024 there have been 308million cumulative reviews with 29million average monthly unique users in2024 so here is anotherquestion how many of you have workedwith Cassendra or have heard aboutCassendra databaseokay so again those not familiar ApacheCassandra is a white column distributednon- relationaldatabase and how we use it how we use itat Yelp so if you visit any Yelp pagethere will be some portion of data thatis served by Cassendra so you canimagine how extensively Cassendra isused at Yelpso we manage over 70 clusters in uhproduction environments for differentuse cases and there are more thanthousand Cassandra nodes in thoseclusters and with respect to datavolumes these clusters store severalhundred terabytes ofdata but it's not only about the sizesome of these clusters are highlylatency sensitive and we ensure for asub 10 millisecond latencies for readand write traffic from theseclusters from operational perspectiveall of our Cassendra clusters run onKubernetes uh that's why I'm here andthis was a transition which we madearound five to six years back we used torun the entire fleet on Misosuh at Yelp we use an in-house platformas a service called Bastard thatprovides the necessary abstraction ontop of Kubernetes uh for managingdifferentservices we run our own custom Cassendraoperator that handles all the Kubernetesinteraction uh for the qwon-premS3 compatible object storage solutionsto facilitate their analytical workloadsand last but not least there's a need tosecure data discovery for AI workloadsso that only authorized users and AIapplications can get access to the datafor specified use cases so given thesecharacteristics of our data environmentthere was an opportunity to centralizeour data analytics infrastructure withthe goal of enabling our data owners toshare data catalogs securely across manyotherteams what you see on the screen here isan example of the data analyst pipelinethat a data engineering team may manageat Bloomberg for instance a market dataengineering team may build their datacatalogs by utilizing a variety ofingestion tools like Apache Spark ApachiFlink or even Pi iceberg and in turnthey would use a largecale distributedprocessing engine to extract andtransform the data into a generativeform or simply manage a data explorationtool to enable their data quality teamsor their client support teams tointeractively analyzedata so what is Trino and how does ithelp centralize the data analyticsenvironment we have within our firmtruno is a scalable and highlydistributed processing engine thatoptimizes query performance throughparallel processing as well as predicatepushdowns that is very typical of adistributed processing engine and has ithas a few new tricks up its sleeve aswell uh it also optimizes queryperformance by distributed caching whichallows the cluster to be able to uhavoid redundant S3 reads when it'srepeatedly accessing the same objecttrino is also an NISSQL compliant enginethat integrates with popular open-sourceuh business intelligence tools and a verversatile engine that provides supportfor ad hoc analysis at interactivespeeds and multi-hour batch workloadsalike so these qualities all have madeTrino a data analytics tool of choicefor a variety of workloads because let'sface it the switching costs of having tolearn the different quirks of multipledifferent data analytics tools is verymuchreal and that's where we come in we're ateam that provides managed Trino as aservice on our manage Trino platformusers define Truno cluster resourceconfigurations like the amount of memoryyou want to allocate into your specificTrina cluster you're requesting theTrino catalog definitions thatencapsulate the connection propertiesthe connection details defining whereand how the data is going to be fetchedfrom the data sources you specify inyour Trina catalog and then the catalogaccess control definitions that definewho can have access to what kind of datain return uh our platform provides twocorefeatures firstly our platform deploysTrino clusters matching the providedresource configurations and catalogesthat the user has requested to map tothe Trina cluster is deployed with itsquery endpoint and monitoring UI exposedand made available to engineers dataanalysts and AI developersalike and as the title of ourpresentation suggests we provide runtimedata governance by enforcing policiesthat match predefined access controldefinitions so these personas we referto as data owners they go into ourplatform and they administer thesepolicies which are used by Trino as itasks policy decision questions aboutwhether access to specific resourcesshould be granted to a specific Trinouser when a user submits an NCSQL queryonto the Trina cluster the Trinacoordinator before it executes a querysends the query context that includesthe user the role the SQL statement overto the policy decision point whichexpands the SQL statement to specificpolicy questions mapping to a group ofresources as well as the action types inorder to determine if this specific usershould be allowed to access thoseresources so if the policy decisionpoint approves then the query isexecuted by the cluster and voila youget the data but if the policy decisionpoint rejects it Trino returns anauthorizationerror so why did we want to run TRO onKubernetes our tenants were typicallyengineering teams that looked after datalake architectures for their respectivebusiness units and you can think ofthese busixness units as differentorganizations within the company likethe organizations that sell news data ormarket data or risk data and one oftheir core requirements for a managedTRO platform was that their clustersshould remain unaffected when adifferent cluster running in a differentorganizationuh is having a servicedisruption and although there areexisting concepts like resource groupsuh within trainer clusters that userscan use to define the amount ofresources that can be allocated to asubset of users within a specifictrainer cluster our tenants preferencewas for them to have isolated trindeployments that they could fine-tuneand managed with a high degree ofresiliency within their business unitsso the deployment management scalabilitythe multi-tenant enabling networksecurity uh features that come out ofthe box out of Trino uh out ofKubernetes sorry um made it an easychoice for us to build a Trinodeployment go controller to manage ourdeployments all right but our teamwasn't just tasked with building asimple Trino deployment manager and thisis because Triny enables users to makeuh requests to uh to access data a loteasier which means that the risk profileof your data analytics environmentincreases unless you have properauthorization in place and within ourdata environment effectivelycentralizing the many engineering andbusiness teams workloads meant enablingour data owners to be able to sharetheir data catalogs across Trinaclusters that are running on othernamespaces while still empowering themto be able to administer the policieswhich determine who has access to thosedata catalogsand doing this in a secured manner poseda unique challenge which was that weneeded to build a centralized catalogmanagement and policy administrationmechanism to go along with the trunodeployment mechanism we're building fora multi-tenant architecture so now Iwill hand it over to Aki to talk aboutthe technical design implementation thatsolves this challenge in our newmanagement platformthank you S so for centralized dataaccess policy management an importantbuilding block is a Chino's built-inaccess control uh when a user tries toexecute a query in Chino Chino analyzesits SQL statement before executing itand asks authorization questions such ascan this user select these columns fromthis table this mechanism allows us todefine the access policy up to columnlevel so we can say for example thisgroup of people should be able to accessonly column A and column B from thistable there are several implementationsthat come with upstream distributionsuch as a file based one uh where youcan specify who can access which data ina JSON format in a TRO configurationfile we chose an overbasedimplementation because it allows us todecouple centralized policy managementand its distribution from many instancesof between clusters managed by ourtenants so OPA is an open sourcesoftware that focuses on a policyenforcement in a decoupled and reusableway it accepts policy questions as JSONover HTTP or gRPC and responds in thesame way one important feature for us isits ability to fetch policies fromexternal services in a well- definfinedway out of the box the policy data itfetches this way is called bundle andthe external service that provides thebundle is called bundler this bundler iswhat we will leverage in a policydistribution so that policies can bemanaged in a central place without beinga single point of failure or abottleneckso based on these underlyingtechnologies uh this is what we want toachieve in our platform data owners onthe right hand side should be able toexpose their data with granular dataaccesspolicies in TRO data source isrepresented as a catalog which is acollection of configuration propertiesincluding database connection stringsthen tuner service owners on the lefthand side should be able to create theirtuner clusters which can mount catalogesdefined by data owners however thecatalog mount should come withrestrictions that the data ownerimposes this is where we leverage opaplugin with OPA that we walked throughin the previousslides when users uh send oyver theirqueries Chino opera plugging enforcesthe policy by making HTTP requests to anOPA server that is collocated with theChino clusters opa then makes policydecisions based on bundled datadistributed from OPAbundler on top of this our system isresponsible for making dashed arrowshappen firstly data owner needs to havea way to define catalog properties inour system secondly there are also needto be a way for them to define dataaccess policies over the catalog theydefined lastly uh service owners need tobe able to create their clustersmounting thecataloges the data owners define theircataloges as Kubernetes custom resourcesin our system this is almost like aplain text configuration file but withthe ability to injectsecrets the plain text properties fieldis basically identical to the resultingcatalog properties file but uh it's it'swithout a sensitive information such asaccesscredentials our Kubernetes controlleruses secured properties fields togenerate additional fields in the tunerconfigure file without exposing thevalues to service owners this separationis not mandatory but this allows us tosafely store China catalog definitionsto log stretch for example withoutworrying about sensitive informationthis also makes it possible for us tolet service owners discover existingcataloges directly in Kubernetes simplyby executing cubicle get or usingkubernetes rest API without showing dataaccesscredentials now that tuner catalogs areregistered in our system data owners candefine data access policies over them todo this data owners define anothercustom resource by referencing the tunercatalog by name in thespec and these access controls aredefined as a group of tables and columnsfor which the cat access can be grantedtogether for example uh in a data in adata catalog many maintained by a newsdata team if a specified set of columnsin the uh table um if the specificcolumns in the table has sensitive datathose columns may be excluded from anaccess control so that it can be sharedmore broadly across across many teamswithin thecompany given the desired grouping ofresources from the data owner uh ourKubernetes controller needs to configureover bundler so that the bundle containspolicies that can be used together withusersidentification this typically involvesum external identity systemsconfiguration such as LDAP and IM um sowe chose uh to let the Kubernetescontroller to do the work in ourimplementation the controller alsogenerates intermediate representationsto a separate policy store to optimizethe computation of bundlegeneration our open server periodicallydownloads the policy bundle that containthe policy that can be used with theuser identities then when the user sendover tuner queries with the identitiessuch as JWT token the tuner is able toallow or reject the query executionbased on whether or not user belongs tocertain user groups orroles in a status field of the customresource uh we show generated externalresources such as policies in the policystore besides whether these externalresources are correctly configured andready touse so far we went through how weimplemented centralized data accesscontrol by data owners so as a littlesegue uh let me briefly talk about howwe reuse the same infrastructure forcomputation access control by serviceowners since Trino queries as well asTrina UI accesses are viaHTTPS we introduced another open sourcecomponent operate plug-in that allows usto authorize HTTP accesses viaOPA then we all we needed to do is toinject additional access policies intoOPA for computer resourcesnow the tune service owners can managewho should be able to access their tunedclusters by configuring an externalsystem as computation access control isrelatively simpler comp compared to dataaccess control we did not need anadditional Kubernetes custom resourcewith controller in this case but if itrequires more complex setup with um selftracking um such as configurationspanning across multiple systems orpolicy representation that does not fitinto well in an existing policy store wecould do the same as data accesscontrol so zlastly we need to enable ourservice owners to uh create their tunerclusters with mountedcatalogs but the configuration of thetuner cluster can be very complex as youcan see we need to set up a bunches ofresources such as trainer coordinate andworker post as well as ingresses andconfig maps another complication is toknow configuration files where certainparameters sometimes need to be derivedfrom other parameters in a consistentway for example the maximum memory ofeach query needed to be derived fromcontainer memory size um and this isdone by accounting for JVM memoryconsumption outside of heap and also JVMheap consumption outside of Trinointernal memory poolso to simplify the workflow for Junoservice owners uh we introduce a customresource with a very limitedconfiguration in this example the Tunoservice custom resource has um list ofcatalogs to be mounted and the amount ofmemory that users want to allocate totheir tuner clusters in our platform ourKubernetes controller generates fullydetailed configuration from theselimited parameters behind the scenesthe obvious trade-off here is betweenthe flexibility for service owners toconfigure their tuner clusters in theexact way for their very specific needversus their ease of use by not needingto even think about minor details butmoreimportantly we as a platform owner havemore opportunity to optimize resourceallocation on our end this way forexample if we are going to introduceauto scaling of number of workers itwould be harder if our users arespecifying every single details in theirparts instead if service owner'sexpectation is more centered around thefunctionality such as maximum query sizeor latency distribution of queries wecan introduce such optimization as longas the service level is kept above theum agreed upon levelregarding the flexibility ofconfiguration we can introduceadditional spec fields as we collectfeedbacks from service owners and definenew fields in terms of the effect theyproduce for example uh we added a singlequery limit percentage field because insome cases service owners wanted toguarantee that at least end queries canconsume a consistent amount of maximummemory they can set this to 50 twoqueries need to be have consistentmaximum or 20 if they want fivequeries as a controller figures out uhdetails and prepares required resourcesum it satises of these resources in thestatus field the underlying kubernetesresources are structured in a verysimilar way to um the open source helmchart managed in upstream tunerproject it also shows certainconfiguration resulting from the uminternal computation for usersinformation in this case uh the maximumamount of memory available to each queryended up with around 32 GBwhen the service is ready endpoint URLsof query and the tuner UI will beavailable in the status status field aswell users can use these URLs forrunning tuner queries over the mountedcataloges and look up information abouttheir queries in TunerUI so overall this is what we've builtdata owners can create tun catalogcustom resources so that they can sharedata access configurations withoutexposing accesscredentials and they can create accesscontrol custom resources so that theycontrol data accesses up to column levelgranularity over theircataloges then a tuner service ownerscan create tun custom resources so thatthey instantiate their tuner clusterwhich can mount jun catalogs with asimple spec and also they can controlcomputer access forusers here we make use of the same opaauthorization backend for data accesscontrol using opa plug-in and forcomputation access control using opaenvoy plug-in and now uh I will turnthis back to um Sue to review thetakeaways and their future worksall right thank you Aki let me justrecap what we talked about within ourpresentationso we talked about deploying Trino incombination with open policy agent onKubernetes to deliver a distributed andsecure SQL solution that appliesauthorization checks at runtimewe also introduced CRDs to manage Trunospecific abstractions like the Trinocataloges that allow us to sharecatalogs and mount thes{e properties ontoa Trino cluster as well as the Trinoservice that takes a very simple set ofinputs that are that are exposed for ourend users to allow them to request aTrina cluster in a very simple way anddeploy that cluster on top ofKubernetesto and Envoywe also introduced CRDs that codify datacatalog access definitions named accesscontrols which enable granulardefinition of access controls in a waythat maps onto data catalogesuh and enable centralized datagovernance and finally we showcased howall of these abstractions and resourcescome together in a way to support amulti-tenant platform to secureownership of resources and enable crosstenency sharing of datacatalogs so now we have a managed trunoas a service platform that appliesauthorization checks at runtimes andenables data catalog sharing acrossnamespaces what's next we still have alot of work cut out for us uh and welook forward to um following therecommendations of the open source Trinacommunity by deploying a Trino gatewayalongside our Trina services to enableour Trina cluster users to run queriesonto a federated endpoint that can routetheir queries intelligently acrossmultiple Trina clusters for use caseslike supporting a query to continuerunning in disaster recovery or throughmaintenance operationswe also look forward to enhancing ourresource allocation strategy byutilizing horizontal pod autoscaling ina kubernetes fashionand lastly we're looking forward toextending this exact solution acrossother comput engines like Apache Sparkand Apache Flink so that we can reusethese definitions of mounted catalogesand apply the same level of granularruntime authorization checks across allof our compute engines that we maintainonKubernetes all right thank you everyonefor coming to our presentation[Applause]time forQ&Aq&ayeah if does anyone have any questionsfor us in the audienceso thank you and I was wondering how areyou guys handling updates and youmentioned Trino Gateway at the end for alittle bit but how are you handlingupdates and maintenance right now withlike interrupting queries and stuff likethat that's that's exactly right so uhour expectation or our communication toour end users is that they need to havetheir own retry mechanisms unfortunatelyand our um our first take on our managedtrain platform we just launched is tosupport interactive use cases so uhwe're focusing on supporting that firstour next item we're working on thisquarter is exactly truno gateway so thatwe can support maintenance operationswithout disrupting uh the runningqueries okay thankshi thank you for presentation uh quicksimple questionuh you had the open policy agent OPA asa centraldeployment and the cataloges wereembedded right what are the tradeoffs ofhavingum like distributed cataloges versuscentralized cataloges and what are yourthoughts around itcould I could I uh ask you to clarifythe distinction between distributedcataloges and centralized cataloges Areyou talking about like a centralizedmanagement system versus havingdistributed management systems indifferent name spaces yeah if you couldgo back to yourslide where you summed it up um this oneyeah probably just focus on catalog partyou had the catalogs embedded on eachKubernetes deployments right umtypically those cataloges would updateover the period of time right so thequestion was more kind of around thereis philosophy of centralized catalog forentire enterpriseand then the embedded cataloges whichyou have for each line of business rightso what are the trade-offs what have youweighed on and how are you powering itso um so that's exactly what um we wereable to design to sort of take thebenefits of both right when you have adistributed way of allowing our tenantsto define their truno catalogs of coursethey're empowered to uh you know use theprivileges that are granted within theirnamespaces and uh you know update theconnection properties and the catalogproperties as they see fit but at thesame time in order to be able to applythe OPA authorization checks and makesure that the rightful owners areactually the ones who are updating theaccess control policies for thosecataloges we still needed a way to puteverything in a centralized catalogmanagement system that goes along withthe policy admination system so oursolution is a combination of the twowhere your definitions of the catalogesare still defined within your namespaces but there's still a higher levelthat is centralized that actually pullsall this information from the separatename spaces so that there is still let'ssay a single label or a name that refersto a specific catalog that is within aspecific namespace so that the rightauthorization checks can happen ontothat catalog when it is shared onto to adifferent user i see so if the if I mayum if the catalog is updating at likepretty brisk pace um how do you foreseethat being updated onto a centralizedcataloging system what kind of updatesare you uh foreseeing when you mentionthat the catalog is being updated couldbe about the metadata it could be aboutthe people associated with the metadataand so onyou mean the connection details or themetadata within the catalog metadataabout the data itself glosseries and soon right so I'm taking a step furtherhereokayum not quite sure if I understand whatyou're asking but um the data ownersthat are defining the cataloges withintheir respective name spaces still havethe ability to be able to update it theonly thing that needs to be centralizedand protected so that the access controlpolicies can still be applied are thesort of the non-negotiable connectiondetails of the catalogsso like where where the data is beingpulled from whether that be a specificS3 location or a specific databaseconnection string those are the thingsthat needs to remain protected so thatthey can still the the same name that isbeing referred across all of ourplatform all of our train clusters canrefer to the same logical entity of thecataloggot it okay thank youhey thank you for for the talk was verygood so so my team is on a on a verysimilar journey and I would beinterested to hear how stable uh thesolution is in terms of the end userexperience so I saw you're using icebergand trino and and kubernetes so yeahmaybe you can share a bit yeah uh niceto meet you so we are maybe in theearlier stages of our development so wejust went into G uh two months ago andour the existing users of our platformare having a stable experience withinall of our tiersum if you're asking about the specificchoices of technologies like KubernetesTrino and iceberg I think that that's acombination of technology that is beingcelebrated across uh many companies whenthey're using it for analytical purposeswhat I would say is because it is sopopular and because people have sort ofum tried to make use of the same set ofarchitecture for a variety of use casesfor which it might not necessarily betuned for i think people are often uhrunning into issues with maybe theirquery performance because they're sortof expecting subsequent latencies ofqueries for a architecture that isoptimized for analytical processingthat's what I would say people have beenrunning into issues with mostly fromwhat I can see um was there were werethere a specific set of issues that youwere thinking about when you were askingthat question think more about umcontainer life cycle management whenwhen uh Trino goes down i mean there areso many so many things in your tax stackthat can go wrong right so how howhow does that work for the user howstable is that yeah so just like uh wewere talking about in a previousquestion right um when when we're whenwe're invoking maintenance operations orwhen a data center goes down our currentarchitecture requires that the Trinacluster user to figure out what othercluster that is up to actually reroutetheir uh queries but with something liketrreno gateway sitting in front andactually intelligently balancing theworkloads that that is going to be a loteasier so there are different tools weare looking to build on top of ourcurrent solution to introduce morestability uh to address the problemsyou're asking about just right now yeahsuper thank you2025-04-15 22:02:15.154989}ommunity side i'm a posgressqluh contributor i'm a data on Kubernetesambassador so the go my objective hereis to tell everyone that it's not onlyokay to run databases in Kubernetes butit's in my opinion the best way to runthemyou're not alone uh my name is BrianKaufman i'm a product manager uh workingspecifically on GKE in the GKE datalayer uh so if you're running statefulapplications such as databases on GKE ora IML workloads particularly inferenceworkloads uh I'm the product manager youmight have worked with uh prior toGoogle about seven years ago I worked atDocker so I might have tried to sell youall Docker Swarm this ironically is myfirst KubeCon so super excited to behere and and uh just can't get over youknow the excitement and all the newcompanies that have popped up sincehow many of you is this your firstCubeCon if you put your just put yourhand up nice that's great to see it'salways good to see about I'd say 60% ofuh the community here is new firstCubeCon CubeCon welcome i'll I'll screwthat up a couple more times before we'redone here uh one of the things that I'dbe remiss if I didn't kind of talk abouta little bit is the doc communityobviously there's a Slack channel uhwhere it's very active there's a lot ofthings you can go in there and getinvolved with uh data on Kubernetesthere's also the latest report thattalks about what are the kind ofdeployments that are happening what arepeople doing with data on Kubernetes uhthat report really had some interestingthings because if you think about AI andwhat feeds AI is really data and youwant to keep especially if you're goingin your companies your organizations areinvesting in GPUs be it in the cloud oron premises or in colo those areexpensive assets that you want to keepbusy so having GPUs be busy you need toreally have the right data architectureand have that on Kubernetes makes a lotof sense to keep them busy as Well uhthere's a lot of AI and ML accelerationthat is really being focused on in factI had some stats that I wrote down hereif you look at the maturing ecosystemabout 50% or more of doc workloads arein production so this means uh toGabrielle's point you know it's okay togo and do stateful related applicationsin Kubernetes and how you do that you'llget to hear a little bit more about howthat happens with things like databasesand streaming and others so again uhwe'll also get into some of that becausereally uh 75% of those productionenvironments really have uh advancedwithin those organizations and arereally joining in a lot of that data onKubernetes specifically one of thebiggest use cases is around AI and howquickly you can get up you can iterateand move throughthings there are challenges that doremain uh I I would say that you knowreally some of the features still inKubernetes is kind of one of the numberone barriers and kind of the skill setsaround that and being able to bring thatup and there's a lot of differentdiscussions there was all things on uhyou know platform engineering yesterdayand platform engineering day but there'sa lot where even AI will help with thatstuff as well going into the future soreally good stuff and I think you knowwhat I really want to start off with isand we'll kind of go around the hornhereare there any customer stories thatreally stick out to you that you'vethought about in the past you know yearor past six months in Salt Lake Citythat really when you start to look at uhdata on Kubernetes and how it'ssupporting AI or ML and it doesn't haveto be just generative AI because there'sstill traditional AI out there still aton of that uh anything you can think ofcustomer story that hey this is how theygot started and where they went yeah Ican I can start off so one of theinteresting trends we're seeing with aIML workloads uh and I'll I'll I'll uhrelate it to all the the great databasework you're doing is the use of localSSDs for caching so a lot of a IMLworkloads particularly rag uh you needto do a fast read of a vector databasebefore you ingest that take it ascontext to send it to a LLM umtraditional databases you know theyweren~'t they're meant for low latencybut you know when you're talking aboutAI and the amount of hops you need toactually get the answer you need thelatency is just infinite it's way moreimportant now so caching is a huge thingwe're seeing in the industry uhrelatively recently with the use ofeither RAM discs or local SSDs to getthe data that much closer to the GPUs uhfor the processing so that's a trendwe're seeing across the customer baseyeah thanks Brian sorry um I cancertainly name one for example as EDB wehave a customer the US Navy that theyuse possess on Kubernetes and uh and andalso for for for ADB posgus AI but I cantalk about also the project cloner PG uhbecause the list of adopters is publicso you can actually go through that andand see what's happening uh it's it'sgrowing but what's interesting for methat I I've been I started this journeyf five and a half years ago when peopleI was at my first CubeCon was San Diegoi was there with Marco who's here frommy team people thought we were crazy wewanted to use local storage and peoplethey laughed at us all the time it'schanged but the perception from you knowat the technical level is improved weneed also to elevate that to thebusiness decision makers because there'sstill lot lot of work to do but what isinteresting is this change of uhmindset that uh you can I I I wrotewrote a blog uh post about cloudneutrality and I'll talk about thatlater it's about this uh reuse of bareme bare metal um uh machines and localstorage whichis which requires a bit of adaptation abit of you know uh understanding that isa change that the organization has to doto provision for example hardware and soon but another use case I want to u siteis IBM cloud pack that is using cloudnativep uh for uh all their database asa service uh solutions for most of theirdatabase as a service solution whichthen provisiondata for AI and for posgus it'simportant because there's a an extensioncalled pg vector that transforms posgusin a vector databaseUm so I don't personally work too muchon the customerf facing side but aninteresting um pattern I'm seeing is alot of companies are moving their um uhintelligence into their data processingpipelines so uh into the pre-processingand post-processing specifically so whenespecially when it comes to uh streaminguh one cool thing you can do especiallywith uh like Apache Flink for example ishook up an LLM right into the processingpipeline so even before your data getsto your destination you can already umyou know make intelligent decisionsbased on that data so it can uh you cando um remote and inference from apre-trained model for example uh so yeahthat's something I'm seeing yeah and I Ithink if you look at any of uh thelarger databases of services and thingsof that nature they're even those onesthat are hosted or path related are evenmoving towards either serverless or youknow Kubernetes-baseduh infrastructure underneath the hoodthere's been some announcements outthere uh around some of the larger onesone of the ones that got is going sercompletely serverless for instance isdatab bricks has said their entirecloud-based service is going uhserverless at this point but they're allI think to that point but Gabrielle likethe the what is it and whatconsiderations should people look atbecause database is really the numberone category of data on Kubernetes whatare some of the things or people shouldlook for or try to avoid or you knowthings that could get them on that thatgolden path to success with databases onKubernetes so um yeah it's yeah thelatest report says that and AI is againfueling even more the adoption I thinkuh a key aspect here and that's forKubernetes in general is uh the factthat we can use declarativeconfiguration for everything this is adeclarative world and the complexityhere is for operators to guarantee thatand when it comes to a database that'sreally you know key especially if youwant to launch analysis on data and anduh even recreate data just for thepurpose of of some analysis process sothe whole uh the importance here is touh guarantee the simplify theoperational complexity that is intrinsicin a in a database through whatkubernetes globally allows you to do sothis is I think the reason why databasesare excelling is because there areoperators that enable that and enable torun a database in this amazinginfrastructure orchestrator that is thatis kubernetes so in my opinion this isthis is uh also the challenge for us butthe challenge in general that I see whenpeople try to to to use databases inkubernetesuh is if they come from posgress forexample they don't know kubernetes andthat's the challenge number one and ifthey come from kubernetes most likely uhthey don't know posgress and they don'tknow how posgress needs for exampleexample to access and to own the storagethere's this very close relationshipbetween posgus and storage and it's it'swhere I think a lot of in innovationwill come we have for example one of ourmaintainers of cloud netpg is alsoworking actively in the kubernetesstorage for this reason because it's aprobably an unexplored world but just toyou know um remove your concerns it'salready better than what we have in VMsaccording to me so it's all new worldfor for for the better yep and I I thinkthat is one of the things when you lookat the CSI drivers and the maturity ofthings uh I was at a a vendor at onepoint in time and this is almost 10years ago now and we actually built CSIdrivers back then so when you start tolook at it was just when Kubernetes wascoming out to give it access direct tothe storage which becomes a reallyimportant part but let me throw it overto Brian now and Brian there's a lot ofconsiderations and you you sit in aunique spot being at Google that a IMLreally um has some operations that youhave to consider when you're doing dataon Kubernetes what are some of thoseconsiderations and how do you how do yousee or what have you learned that youcan impart on them that they can kind ofget their Kubernetes ready for thingslike a IML yeah um so it's a very loadedterm right a IML what you know what isthat um how many folks in the room dotraining on Kubernetes forAI okay what about 10 hands yeah whatabout inferenceokay a lot lot more inference more soI'll I'll split the question up so forfor training what's what's mostimportant and you all can can keep mehonest is the well let's take a stepback for for training you spin up yournodes ahead of time in Kubernetesand then you're downloading all thetraining data and then you're pushingback up all thecheckpoints what we've seen with thetraining data specifically it's usuallya lot of small files so a trend we'reseeing across our customer base is a lotof caching and parallel downloads forthose small files to be able to makesure that they're there for you knowmore than one epoch right if you'regoing to be constantly reviewing thosefiles you don't want to keep downloadingthem usually from object storage so fortraining the trend we're seeing is umthe the paralyzing of all the downloadsand the caching of those small files sothey're available for subsequent umtraining runs um on the inference sideuh how many folks so the folks that aredoing inference what inference engineare you using are using uh raise yourhand if it'sVLM okay orTriton okay there's Yeah a few more allrightum training very different frominference inference you're going to bescaling up and down like a normal webservice right training everything's setup ahead of time inference you'rescaling up and down with with theworkloads and what's most important whenyou're scaling is making sure that thehardware is available when you need itbut not a second more and with inferenceinstead of a lot of small files likeyou'd have a training you have very verylarge uh models and weights potentiallythat you need to download from fromobject storage uh llama 70B it's like 13gigs right so if you've tried todownload that from hugging face um I Iwon't do any more pop quizzes but it'sabout 20 minutes to download that modelfrom hugging face so if you're linkinghugging face to your Kubernetes clusterto run inference it's you know 20minutes of idle GPUs you're you'resitti�ng on there so with inference whatwe're seeing is uh similar storageacceleration from object storage ofparallel downloads um we also areexperimenting a lot with a new featurein VLLM uh the RunAI model streamerwhich allows you to do a similarparallel download uh to some of theGoogle products like GCS fuse but godirectly to CPU memory and then to GPUmemory so the model streamer is a reallygreat accelerator built right in VLM forthatwe are also seeing people storing modelson block storage and the advantage toblock storage over object storage is youcan make it immutable so you can justset the disc to read only and theningest it uh usually from a locationwithin the same zone so uh reallydifferent strategies for both inferenceand training whereas inference you needultra low latency training it's not asimportant uh and that goes back to justuh some of the things I was discussingbefore where training like it's a bigdeal if a node fails right you need torestart from the latest checkpointwhereas inference if a node fails it'skind of a big deal but not reallybecause it's in an autoscaling group anda new node will pop right up right uhbut again you got to download the fileand hopefully you're not just pulling itfrom hugging face you have some storageacceleration uh built in there uh sodifferent strategies depending on howyou're loading or I guess what part of aIML you're working on no I I I thinkthat's so accurate i was just talkingwith one of the other maintainers ofVLLM and I think that when you start tolook at the size and we've we talk abouton the cube a lot and do a lot ofresearch on it that the size and smalllanguage models or uh segmented languagemodels and things for inference whereyou have a long tail where you'll bedoing way more of that than you will beof training maybe you're you're notgoing to be constantly training modelsper se uh you may be taking them in youmay be fine-tuning them and buildingthem to be and meet your uh specificneeds and that does take up a lot ittakes up a lot of data but then when youpush them out somewhere they're a muchmore subseted amount of data that you'rebringing out to that edge potentially uhwhere you're actually serving out thatmodel and things but Nisha let's uh kindof switch gears a little bit and look atstreaming and data management and whatthe impact has been on the doc ecosystemaround that from your perspectivebecause that's another set of data thatI'm near near and dear to my hearthelped build a SAS platform that used uhthat type of streaming data to bring indo a IML on top of that as well yeah forsure um I'll start by saying like datais most useful when it's fresh rightwhen it's the most recent because it hasa lot of value uh and especially for AIyou want the most recent data becauseyou want the most recent up-to-datecontext for your uh queries or any kindof um uh useful information you want toextract from your uh workloads andpipelines eventually um so obviouslythis is nothing new but um for any umanything involving real time uh datapeople use uh hybrid systems so nolonger I'm I mean like no longer batchprocessing but you know they're doing uhreal-time processing event drivenprocessing especially when it comes touh inventory uh like anything you wantto do where you want to manage inventorysupply chain optimization anomalydetection things like that uh but um onon the AI side more specifically Imentioned earlier about movingintelligence into your data processingpipeline so one uh cool architecturalpattern I I saw recently was umbasically uh a lot of people are usingagents these days right so an agent isjust something that um can take anaction based on some uh data so let'ssay you have uh an agent that's uhinterested in looking at so many datasources and can take a certain actionbased on that like maybe let's say to uhsend a notification to an operator ormaybe it can auto approve some kind ofuh thing uh workflow and um a lot ofpeople are using multi- aent systemsthese days um so they have uh you knowall these agents working together butthey they use um streaming data to actas the �brain uh to orchestrate uh whichagent should do what because you do needsomething in the middle thatorchestrates these agents so uh the wayit would it would work is like your uhyou would have Kafka as like thestreaming backbone for your pipeline andlike it is doing all the delivering ofdata so from the sources to uh this thesyncs and and downstream services and onthe way if you also add flink to it uhand like I said earlier if you hook itup to an LLM you can take in that datain real time make a decision about whichagent to invoke based on what kind ofdata is coming in and then invoke thatagent so um yeah that's an interestinguh architectural pattern so so let'sstay here with you and this is toeverybody let's kind of what do youthink the evolution of AIML data on Kubernetes and Kubernetes ingeneral is going to be over say the nextwe're we're at the 10th anniversary forthe CNCF almost so in the next 10 yearsof the CNCF how do you think that'sgoing to evolve and from your ownperspectives obviouslyum yeah finally we were just talkingbefore the panel that a lot of theseCNCF projects work uh sort of inisolation but trust the other pieces towork too and you know ultimately all ofit has to uh play well together too uhso that's going to continue to happen Ibelieve so uh I think it's going to beum like why do people want to come toKubernetes right it's mainly for a lotof stuff you get out of the box so youget uh autoscaling fall tolerancesecurity uh a lot of investment into allof these areas and I think it's also theecosystem because the there I thinkthere's going to be a feedback loopbetween end users and um these projectsthe open source projects in CNCF and uhthere's always going to be uh it's it'slike a growing community so as thatfeedback loop uh is completed you'llhave new features being added to theecosystem and all of it co-evolving theuh development will influence theapplication and and vice versa um sothat's something I definitely seehappening yeahyeah i mean to Yeah exactly we werehaving that conversation veryinteresting and uh I believe there areessentially two things in in my opinionone is the integration among projects sofor example uh I bring my case of cloudnetpguh where we are interesting to get forexample everyone who every project thatuses posgus as a backend or as a clientto maybe work together to improvedocumentation to help end users uh usethese products together this is I thinksomething that through the CCF we canelevate the other thing from the posgusperspective I've seen already in thelast couple of years how just by runningposus in kubernetes we've produced ideasthat in 30 years before poskus had neverproduced even small patches but they aregoing to change the way uh we canimprove things in kubernetes for exampleone patch we will introduce in posgus 18will help how we can deploy extensionsin kubernetes thanks to anotherimprovement in kubernetes which theseimage volumes so by doing that forexample if you want to load an extensionthat does some AI uh um um analysis andand and processing you can do it andthen remove it so I think this is theseare the two uh things I can think ofso uh I guess what's top of mind for meis specifically with regard to a IML theconfigurationresponsibilities to really make anefficient workload possible are out nowoutside the boundaries of the Kubernetesfabric so for instance I I spoke beforeabout object storage there's guess whatthere's tuning you could do on objectstorage right there's tuning you can doon block storage uh on the networkinglayer a lot of things that are perhapsunder uh the Kubernetes fabric that mostfolks have access to um I spoke a lotabout caching right setting up cachesall all those sorts of things toaccelerate the downloads potentiallythose are using APIs that are outside ofjust the the um Kubernetes uh controlplane so what I see happening not onlywith regard to a IML but um with justthe broader Kubernetes ecosystem istaking advantage of the Kubernetescontrol plane to bundle infrastructureand applications as a logical entity sosome folks in the room might be familiarwith projects like crossplane or KRO uhthere ways that you can group umdifferent artifacts in Kubernetes be ituh deployments or perhaps load balancersuh network configurations secrets uheverything that is outside of thecluster but perhaps in the hyperscaleror data center you're using or externalAPIs uh so I think that with all thenuances to a IML it's more it's gettingmore necessary to package theinfrastructure configuration with theapplication uh configurationall right so kind of last question hereis uh everybody's favorite question isaround cost and how do you deal with andcontrol costs in data on Kubernetes uhbecause like you were saying there's somany different places that it data islike I said feeds a IML and it growsexponentially so yeah to all you tell meI'm the the PM for storage right I knowknow about that um yeah the you know thethe not the dirty word but it's it'slatency right in my world that's whatthe other word for cost right is latencyin GPUs that's right now the mostexpensive part of the the stack umcontrolling costsis really centered around making sureyou have proper observability to makesure that you'remaximizing every route into the GPU uhthat you can um I spoke a lot aboutabout caching that that's like a layupuh make sure the data is as close to theGPU as possible control your costsum most of people in the room probablyalready know this local SSD is muchcheaper than memory so if you can pushthe latency to the local SSD and itdoesn't affect your applicationperformance that much guess what youmight not need to size up the the VMthat you're using and and you could cutdown on that expensive memory which uh Icould speak to Google is about maybe $3a gigabyte a month versus a local SSDwhich is about 8 cents a gigabyte amonth so you know very big very very uhyou know different price point there umso uh GPU idle time you want to avoid atall costs which I think is fairlyobviousyeah and then I'll I'll cover primarilythe database stuff again so what what wesee it's happening is this uh uh I meanwe recommend to isolate posgress workernodes from from from the rest so thatgives you for example you can use baremetal machines with local storage thatgives you a fixed cost okay so you'vegot a cost predictability on the on thedatabase layer and you can useKubernetes to consolidate your databasesover there this means that in terms ofprocurement I I speak with um customersthat tell me it's too late now to add uhmore nodes okay but you can do it forthe next year and for this year maybestill you know work with the way youused to work but the this is why it'simportant to start even from day zero toplan for your infrastructure inKubernetes for data for data workloadsbecause it can help you save a lot ofmoneyum so I agree with Brian about the uhobservability part i think that's likethe number one thing obviously um likeinternally at confluence we implementrobust like resource tagging uh policiesso every cloud resource we know you knowwhich team provisioned it for what whenall of that stuff so that's importantand do all of the auditing for all ofthis too um but in terms of strategies Ithink one thing that pretty well knownat this point is obviously in the cloudis cross availability zone um transfersfor data so uh you want to minimize thatand I think there are um ways to do itlike obviously collocation and and usingum you know let's say uh what was itaffinities yeah affinities and labelsand things like that to collocate yourpods with your um uh data processing allof that uh in the same a um and um yeahI guess that's mainly it uh oh there's Ithink uh go ahead uh yeah no sorry Iforgot my point yeah I want on I meanjust to say when I meant isolate workernodes it was to use stains tolerationsanti- affinities I implied all of thatand as I said that cloud neutral uhposgress you can find an article aboutthat awesome well thank you everybody uhwe ran out of time uh sorry that wedidn't have time for uh questions butthank you for here if there's a surveyfill it out uh and we hope to see youagain thank2025-04-15 22:02:15.812223 ! !��F�t#��CAXqL5lh32lr8welcome everybody to the streamlinedefficiency and shackling Kubernetesimage volumes for rapid AI model anddata set loading presentation todaywe're going to talk a little bit abouthow we can go about speeding up u volumeloading when using image volumespecifically so to start us off uh I'mre I am a software engineer fromMicrosoft i specifically work on Ashurecontainer registry my oh one secondum my main area of expertise is OCIconformance as well as artifactstreaming m��L�s#��OACI4rws1H-aMso hopefully you're here for the futureof data on Kubernetes from databasemanagement to AI foundations we're goingto really dive into some of the usecases for data on Kubernetes around AIand what's going on what we're seeingfrom customers i'm Rob Stretch i'm amanaging director and principal analystand member of the doc community uh I'mhere to moderate this panel and reallyhelp guide this through and get you guyssome information uh hopefully you knowwe'll have some fun with this and have alittle time for even some questions atthe end if you need anything whileyou're doing that the exits are backthere you know where to go all thatstuff is taken care of so before I gointo uh kind of a little bit of anoverview let me let the panel introducethemselveshey uh I'm Nimisha so I'm a softwareengineer at Confluent where we mostly uhwork with um streaming services so KafkaFlink things like that and uh we offer adata streaming platform uh a managedversion and um you know we have anon-prem and cloud offering andeverything so personally I'm mostlyinvolved on the Kubernetes side becauseI'm on the platform team so it's likeinternally managing all of the datainfrastructure um all of the Kubernetesstuff you know like u providingself-service to our developers uh tobuild the actual applications and umaintaining uh system software on theseclusters and so on yeah thanks Nimishai'm Gabriel Bartoolini i'm vicepresident and chief architect ofKubernetes at EDB edb for those of ofyou who don't know it is the number onecontributor of the posgress open sourceproject and uh also the creators ofcloudpg which is a project that justjoined the CNCF sandbox uh we help uhorganizations all over the world uh runposgress uh also for AI purposes i'm afrom a c|�y co-presenter over there isYan Yuan from Alibaba cloud he's asenior software engineer there and aresearcher who's done many manycontributions to the overlay BD projectum so to start us off I want to speakabout the current state of efficiencywhen it comes to starting up images andhow we can deal with accelerating thoseso the first thing to note is that todaywe have achieved a lot of progress onimage startup the first thing is we haveartifact streaming however artifactstreaming helps us solve one problemwhich is how do we optimize uh workloadsthat are specifically application basedbut when we start to deal with we needto have these large data sets along withthis we start to encounter some problemspackaging our information into ourcurrent images and then streaming themis not necessarily efficient we want tobe able to load data at a scale but atthe same time we don't want tonecessarily pay the upfront cost ofpackaging this data and putting it inour container registries so when we'redealing with large language modeltraining or just AI in general wesometimes need to load data in parallelthis data parallelism presents a lot ofchallenges and means that we actuallyneed to be ableto simultaneously access lots of dataacross multiple places at the sametime uh this presents a number ofchallengesum the first of which is we need to beable to have data that can be accessibleall the time so if we don't havecontinuous data access we're going toend up paying a lot of money for GPUsthat are not in use we're going to beunable to actually load all the datawhen we need it as we needit we also need to make sure that whenwe do access the data it's actually fasttoaccess additionally because we'redealing with Kubernetes clusters weobviously need to be able to scale upour access if we're not able to scale upto thousands of nodes then we're notgoing to be able to use this dataeffectively when training AI models orany other thing that requires large datasets andfinally there are some things aboutmanagement that we really need to thinkabout first of all this data is likelyto need to be versioned so we need tohave versioning that matches both thedata and the applicationsfinally but most important of allmanaging this data needs to be easy wecannot have any applications that aretoo difficult to manage because nobody'sgoing to use them so given that we havethese challenges we need to considerthat we do have a lot of solutions forthis so here's where image volumes comein from OCI so when we're talking aboutOCI registries we already have a lot ofthese challenges tackled first of allOCI registries need to be performantbecause they already need to scale upfor these Kubernetes workloads theyalready need to be highly access highlyavailable and in general people aresomewhat familiar with them so we alsoknow that they solve a lot of theversioning problems we have tags forversioning as well as digests andmanifests so specifically tags can giveus the ability to have arbitrarydefinitions for our different um imagesright so any data that we have we couldversion this way we also get the benefitof using digests within our manifests toactually maintain data consistency andalso have versions that will not changeover timebeyond all of this there is one othergreat benefit that we get from havingour data in OCI registries we can enablegarbage collection for this now this isvery important because every year datakeeps growing and growing and it's outof control nowadays uh there's been somestudies including an IDC report thattells us that the total um data in theworld will grow by 175 zetabytes by 2025and if you're an organization who needsto have these very large data sets thatare versions as well you're going tokeep using lots of data and if you'renever able to clean that up because youdon't know what sort of runtime it'stied to then you're just going to end upaccumulating costs with no end in sightso I mentioned this a little bit beforebut users are already familiar with OCIregistries the distribution spec is wellknown and well defined and in generalpeople have to use �it forKubernetes of course tooling is alwayswonderful so being part of the ecosystemmeans that if we can actually leveragethe registry to also tackle thisparticular challenge of loading largedata sets then we can get a lot ofadvantagesnow has this been done before to someextent uh there's actually a longhistory of using OCIum images to store arbitrary data theOrus project for example formalized thisat some point by defining how you canput any sort of arbitrary data into anOCI artifact we also have image volumesof course uh as the presentation hasbeen talking about which allow you toactually load um container images into afile system in the Kubernetes side ofthings okayum there are some questions why we mightneed to hold off on doing this sort ofthing so the first thing is thatregistries were not designed initiallyfor volume mounting uh what this meansis that there are some things that don'tnecessarily work out of the box or atleast not across all registries uh forexample we need to account for the factthat different registries tend to havedifferent size limitations for examplethe ACR registry has somewhere around a200 uh gigabyte limit per layer but thisis not consistent and it's not specifiedin the OCI distribution spec so itvaries depending on implementation theother thing is that while we have beenable to get streaming support working onum on container registries this was notsomething that was built into the specinitially now there are some things thatallowed for it to happen today uh butoriginally was not built for this soit's not something that we can say wasuh intended from the startthe other thing is that there are somelimitations when it comes to how westore the data inside of images that gointo registries oci registries tend tostore information using a overlay filesystem which is a union file system thatbasically creates new layers every timeyou modify the data so if you're dealingwith a very large data set this mean avery large data set that might bechanging at any point this means thatyou might need to actually constantlybuild new layers and eventually you'regoing to run into some layer limits andthere are costs to this sort of overlaystructure this means as well that everytime you're adding new data or even atthe first time you're doing this youneed to package a lot of this data andas we'll see in a moment packaging canbe quite expensiveso here is a bit of an illustration ofwhat the cost is for packaging uh abunch of data so we grabbed some datasets from kaggle.com we grabbed somepopular uh machine learning uh trainingdata sets just to get a little bit of anunderstanding of what the cost is weused copy from docker using a dockerum a docker file and uh just to try toillustrate this so people can replicateit if they want to and then we startedpackaging these and we noticed a numberof things first of all as you have morefiles and generally larger images youtend to have much much larger times topackage for example the largest data setthat we had in this case actually tookalmost four hours to package and it'sonly 22 gigabytes in the modern world 22GB is not that much but for the purposesof images it can be quite a bit i mean Ido want to note that this particulardata set does have something like700,000 files and they're mostly imagesso they're not very compressible as wellthere's some considerations there butnonetheless even the smaller data setsin the order of just a few gigabyteswould take many minutes to actuallypackage so we can start to see that thisis something that is maybe notideal so from hereon we've noticed that we've alreadysolved a lot of the problems but we havesomething to consider so the registriesprovide high availability andscalability and performance we have somemeasure of versioning data and we cansupport data uh garbage collectionnatively they're also widely adopted inuse across a number of tooling and ingeneral they are something that mostpeople will already be familiar with sothat said how can we solve that biggestchallenge for us which is packaging sowe can actually use this for larger d�ataset loading in AI work uh in AIworkloads so now my colleague Ethan willintroduce what we've actually done tofix this issue ethanuh okay thanks Estban for his speech uhnow I will show you how to solve thisproblem rightearlier uh the inspiration for this camefrom the indexing to OC image uh thecommunity already has some solutionsthat a remote snapshot can create amount point for streaming by externalindexes and uh without repackaging thethe image such like the SQL OCI andoverlayBD uh the core of these solutions isthey canuh package a index file of the imagelayercontent anduh it their all meta data by hooking theIO request at the file system level orblock leveluh it it combined with the image indexthey can find the corresponding data forthe remoteimage so uh taking a board speculationcould we build an index for the entirestoragebucket uh my answer is yes i'd like I'dlike to introduce you theEnink uh that solution can create amount pointfor accessing remote data through OCIartifact and uh no need to packagethem uh the data set descriptionreference list and remote slab shortersare the core of this solutionuh reference list is a set of dataobjects you want to package into the OCIartifact remote snapshot here will passthe reference items from the referencelist and create the mount point forstreaminguh more details reference listuh contains a set of records each recordhas at least four parts the source passuh it's original pass for the uh for theobject in the remote storage the mountpass the pass when you access the objectfrom the mount point uheag reflects the data change and itrepresents an version of the objectpeople can know whether the data hasbeen changed from the e tag instead ofdownloading it and file size the objectsize inbytes through the reference listuh we can describe all the object wewant to access within OCIartifact uh now let's package thereference list into OCF artifact uhfirst of all we need an registry to takeover the backend storageuh which enable us to get the targetblob URL from theregistry as we all know uh when we tryto get a blob from the reg registry APIthe registry will return either aredirect URL or the blob contentdepending on the status codeso it's easy to get the back end storageendpoint from the registry API andcombine with the source pass field wecan get the actual pass uh of the targetblock second uh a special annotationfield is needed this can uhidentify that the current layer is areference list which can convenientlyenable the snapshot to support accessingremote files for the existing OAartifact uh of course there may be abetter approaches in the futureuh and also the reference list shouldshould be saved as a common format likeCSV or uh JSON or something else so thatthe registry can analyze the details ofthe reference list to get the valid dataacross the entire storage bucketwhen we create a command point thecontainer runtimes should pour andunpack the reference list through theregular image poolingprocess and the slaughteruh will create the uh mount point byparsing the reference items[Music]uh yeah uh slsher should create a fileentryuh create a file entry at the relativepath of the mount point according to themount path field and this file entryactually point to its sourcepath uh please note that the accesspermission is required here because theregistry may not have the accesspermission for all the remoteobject uh so an additional authorizationmay be neededuh streaming loading enables thecontainer runtime to mount the imagevolume without pulling it skip thedownloading process of theartifact uh and more important when thewhen the data set is too large it willoccupy it will occupy a lot of storagespace and even failed to download it sothe streaming loading is essential hereuh when the streaming streaming servicehandle an IO request uh it shouldconvert the uh IO request to theuh corresponding corresponding data ofthe referencelist after uh ensuring that the e taghas not been changed uh it converts theIO request into a range gate for theremotetarget for the proof of concept wechooseoverd provides a merged view uh of imagelayers as a virtual block deviceuh it supports on demand transfer dataas disk sector level and it also uhprovide a boundpoint a block device interface ratherthan fuse which has a better performanceuh in the small files accessing and moremajor in handling the stability problemslike crash recoveryuh the most important is overlay hasbeen widely used in the productionenvironment of Asia and Alibaba groupand databricks uh overd has a lightweight modecalled turbo OCIit already support indexing to OCI imageand also it can buildE4S file system from image indexlocally uh to support Elink we can justimplement a simple function to map theIO request into referencelist since OABD provides a backendimplementation of the of a block devicethe IO request from the mount point willbe converted into a simple read andwrite operations through the file systemand uh blockdriver that is also one of the importantreason why we choose OBDbased on the plan above uh we conductedperformance test on Elink for packagingand a full volumeaccess during the packaging phase Eninktransformed the file copy into recordingthe reference data which gives Elinksignificant advantages in both packagingtime and buildspeed for the 22 GBTE data set whichcontains more than 700,000 files thepackaging of OI image takes nearly fourhours but Elink build speed is less thantwominutes uh at last we tested the E2 dataaccess performance uh this chartcompares the throughput of traditionalOCI images go and elinkgoofys is a high performanceimplementation of AWS S3 file system theright access of this chart shows theaverage size of individual files in eachdataset in this test we add the preparepreparation time the preparation timefor OCI image uh include the downloadingand unpackaging which is the uh defaultbehavior for creating a OCI volume mountpoint for goofies and for goofys andelink the preparation time is in uhinclude the time taken to create a mountpoint and build the file system metadatathe result shows that in the scenariowith a large number of smallfiles Elink performs significantlybetter than OCI image and goofiesuh in the US accident data set with alarge single file elink performance isalso on par withgoofies and all the test environmentswere provided by Alibabacloud uh that's conclude ourpresentation thank you for listening andwelcome any questions[Applause]uh thanks for the presentation uh I justhad a couple of questions uh one is howdoes this uh work with uh six door imagesigning and uh second when you are uhstreaming a blob how do you verify thechecksum on the target host thanksyou want to handlethat um so I don't think we're yet atthe stage where we're doing a lot of usecurity verification for some of thesethings we are doing uh verificationusing the E tags to at least validatethat theum that the contents that we'rerequesting are actually what we want toget uh this is still somewhat earlystages so I don't think we've gottenquite there as for streaming purposes uhdo you know what kind of validation wedo there oh we we have to change some wehave to change some for the file toverify thedata is correctyeah uh I just want to know when you arestreaming before the if you get theentire uh entirety of the data how doyou verify the check sumhe's asking how do we verify thechecksum if you haven't gotten all thedata uh we we we just check the theelink elink is the check uh M MD5 checksum of the object uh maybe sometimes uhuh before we get the data we we throughthe HTTP head response can get the E tagand and and the E tag is also saved inthe reference list if the E tag is ismismatched so the data has been changedokay thanks2025-04-15 22:02:16.363355�storage the cozy drivers for objectstorage and all of the operators thatmake this possible and the operators ofcourse also enable all of our day2operations so things like upgrades andthe ability to back up the ability tofail over and perform disaster recoveryand then it's taking advantage of therest of your cloud native stack becauseyou've already invested in theobservability and the secrets managementand certificate management and theencryption um and the scaling andelasticity right so when you want to beable to move to more cores or when youwant to be able to um upgrade the numberof nodes for a distributed system thecloud native environment makes all ofthese things so muchsimpler um putting my uh CNCF storagetag hat on for a little bit um the taghas put together a storage white paperwhich kind of covers some of thelandscape of the cloud native storageenvironment covering things like whatthe attributes of a cloud native storagesystem look like so you know when we'rethink thinking about highavailability also thinking about whatthe consistency requirements are forexample or what the durabilityrequirements are and when we're talkingabout scale what are we talking about isit just you know the number of requestsor the number of IO's or is it thenumber of throughput etc and differentsystems provides different um differentbenefits so it's really important thatyou work out what your application needsand of course in cloud native systemsthere are multiple layers ofvirtualization you have object storesbuilt on file systems or file systemsbuilt on object stores or block devicesbuilt on all sorts of complicated uhsystems so in the end of the day you doneed to understand those layers wherethe caching in those layers happen andand that enables you know theperformance and thescalability and of course we also talk alot about you know the deploymentoptions the topologies the themanagement interfaces like CSI forexample and those different accessinterfacestoo um now putting my TOC hat on for asecond it's also important to to saythat you know we're doing some demostoday for a couple of different productsprojects but there are lots of CNCFstorage projects um uh both graduatedand incubating so things like Rook whichprovides an operator for um uh for SEfit test which is a distributed MySQLcluster Xcd which of course you will runin your Kubernetes environments TIKvwhich we're going to be talking abouttoday cube fest which is a uh recentlygraduated just a couple of months ago umuh shared file system used for a lot forAI and machine learning and there aresome very cool incubating projects tooum but there are 200 and somethingprojects in the CNCF now so uh it'sreally good idea to navigate those andhave a look at some of the sandboxprojects where a lot of the innovationand experimentation is happeningtoo so now we're going to do a littlesacrifice of the demo guards and I'mgoing to pass it over to Chris helloeveryone uh nice to meet you all um I'mgoing to talk about cloud native PG anda demo with thatum all of the kind of repo all the kindof YAML all you need and the Terraformand everything you need to get this upand running is in there we will uploadthe slides after the talk because wefound spelling mistakes this morningthat we fixed so we haven't done thatyet um so I'm going to go back about 20years when I first started on itstateful applications were Oracleclusters i think I remember Oracle 7 wasthe first production cluster I broke umand we basically got given I went andhad a look today this is Oracle thelatest version of Oracle gridinfrastructure it's a 290 page PDF ifyou want to install it and that includesevery option and every configuration youneed to run thatdatabase generally that's not how you'dset things up and I remember after thatwe had I think Sun had the blue booksIBM had the red books so we got peopleusing these things in production then weworked out actually the hardware isdifferent how they install it differentwhat the workload's different so we'llnow come up with another book to tellyou how to use theapplication um all this �changes inKubernetes the thing about Kubernetes isit basically takes an underlyingplatform and abstracts all thatcomplexity so this is my argument to youtoday is running stateful workloads ontop of Kubernetes is the best mostrepeatable and I would argue in somerespects the safest place to do it nowwhen you start looking at things likepod disruption budgets and all the otherin-built mechanisms in Kubernetes tomake themsafe all this builds on the operatorframework so you know that 290 pagedocument that it took a bunch ofengineers to write the red book it tooka bunch of uh field engineers to go andwrite and distill all that informationabout as well all of that informationcan be captured in these operators theseare programmatic uh applications that umuh will basically make your applicationwork in the way it's designed to so youcan distill that 290 page document into90 lines of YAML um the indentation iscompletely broken when I pasted theseinto this slide so please don't copythem from here go look at the GitHubrepository but the you know theimportant thing here instances threegive me a highly available clusteredmaster and two replicas of my Postgressinstall init DB on the lefth hand sideinitialize my database from scratchplugins i'm using bar man here to streammy um well logs out to a um an objectstore in another location so that I canrecover this database and on the righthand side you know instances three andI'm doing an external cluster restore sothese 90 lines of YAML will give you ahighly available stateful applicationand recover it from a remote objectstorage much easier than going throughall those books um quick picture if youwant to come and talk to me more aboutthis you can find me at the Akami standakami and Lode are the same thing andI'm building all of this on top of theLode Kubernetes engine so if you want totalk about managed Kubernetes come andhave a chat to me there about that aswell um and just to prove there's nosmoke and mirrors and I can show youthis at the stand with the system IDfrom the backup and the restore oneither side is the same so let's see itin action let's see what happens when wefire up those 90 lines ofYAML and I'm just going to have tochange this to high definition becauseit the Wi-Fi has gone for a burden sowhat we're doing here is we're going inin a live demo and we're actually goinginto the cloud manager and we're killingone of the nodes where a live databaseis running now the operator is fantasticum they've actually made a consciousdecision in cloud native Postgress towrite their own operator and that'sbecause your storage could be local andyour storage could be remote so we'vegone in and killed our node there if wefast forward a little bit into the intothe demo you can see we got some redlines red lines are bad we've lost oneof ourinstances so as we go through the demoand have a little bit bit f further intime a little bit up here what we'llstart to see is Kubernetes is startingto recover so Kubernetes has gone heywait a minute one of my nodes is missingum so Kubernetes goes all right I'llI'll bring your node back and I'll getmy node into the ready state at thispoint the cloudnative Postgress operatorcomes along and goes "How was your datalaid out was it replicated was it usinglocal storage was it using remotestorage do I need to recreate volumes doI need to recopy data into those volumesdo I need to just move up a new node andattach my storage that's on a remote umnetwork attack storage back to theKubernetes node that's now recovered andbrought is now up and running so as wego through this you'll see nodes arestarting to come up pods are starting toinitialize you can see them going to theinit state here and it's starting to sayactually reinitialize those podsreattach the storage to those pods andbring that database back up into arunning and working state so as we getto the end of the demo here you'll startto see that the umum the container comes into a uh I thinkit comes into next oneand here we go standby starting up so onthe bottom there you can see the statushas gone to starting �up so the LA whatwe've done is basically killed one ofour statefulapplications don't have to worry abouthow it's laid out don't have to worryabout anything the operator's justtaking care of all of that and all thewhile all of our well logs and all ofour backups are being streamed remotelyso as we come out of this the quick demowe've seen there of how we can recoverfrom thenose how do I get out full screen pressescape there we gouhthat's we come out of presentation modebrilliantum so just to kind of summarize thatlast bit there you know what was goodabout that um and what did we see fromeverything that was working on that thepod disruption budgets you can go into aKubernetes cluster and you can try andkill pods you can try and kill nodes youcan try and take things down if yourapplication is set up with poddisruption budgets you have a safer morestable more resilient persistentworkload than if you're running on juston uh commodity infrastructure and allof these other components added by theoperators give you all all thisadditional componentry out of the boxyou don't have to go and read a 290 pagemanual to understand what you're talkingaboutso with that I'm gonna hand over to Alexwho's now going to show you uh anotherdemo and another part of um stableapplications running Kubernetes over toyou Alex cheers Chris right so we'regoing to take a look atTIKV so TIKV is a highly scalable lowlatency distributed key value store soit provides both a raw key value API andalso an acid compliant transactional APIso you can think of your raw key valueAPI is get put delete for a key or batchof keys and then acid complianttransactions obviously allow you tocompose operations together and submitthem to the database as one atomicoperation um and you can build on top ofthat to do more complex databases likeSQL for example so TIKV actually has umum an an SQL component which is built ontop of the the low-level TIKV databaseum so it's very easy to deploy inKubernetes and it's a graduated CNCFproject so it's mature it's been used inproduction by a bunch ofcompanies so let's just touch on some ofthe kind of highlights of TIKV so it'sreally designed with scale in mind itcan scale horizontally to hundreds ifnot thousands of terabytes of databillions of keys and hundreds ofthousands of RPS in fact millions of RPSand hopefully that's what we're going tosee today so it does that by splittingyour key space up into regions umtypically a region is about 100megabytes but that's tunable and it'llsplit those regions up among yourstorage nodes so as your capacity or RPSrequirements grow you can add more nodesto your cluster and TIKV has got a bunchof intelligence to scale out that databalance it address any hot spots thingslike that and obviously all the faulttolerant kind of stuff we come to expectas well from a production key valuestore so it's also very low latency it'sdefinitely capable of operating in thatkind of 1 to 10 millisecond latencyrange that's for reads and writes um andthat's largely in part because it'sbased on Rox DB which is a lightningfast non-distributed key value store umwhich has had over 15 years of kind ofengineering and optimization time behindit and of course it's kind of cloudnative it's got a fantastic KubernetKubernetes operator to handle yourdeployment upgrades automated failoverand it comes with a bunch of greatobservability out the box um so inparticular the the kind of preconfiguredgraphana graphana dashboards are reallyuseful and we'll we'll dig into that inasec so installation wise there's not awhole bunch to say here it's yourstandard Kubernetes install processthere's a bunch of CRDs you can installthen you grab the operator helm'sprobably the easiest way to do it andthen after that you can define yourcluster and apply it so before we gointo the demo I'm going to just show youum an example cluster definition I havehere and this is the cluster definitionwe're going to use in the forthcomingdemo um so we've got five copies of thedata we're not cheating here this is alike production level replication wewant to be fault tole�rant we want to beable to lose one if not two if not threenodes um we have 12 that second linedown there 12 replicas that means wehave 12 nodes for storage so in thisdemo I'm going to load up 10 billion Gkeys into the database well I'm notgoing to do that as part of the demo idid that beforehand because it takes awhile um so we've got 10 billion keysfiveway replication that's 50 billionkeys in total about a terabyte and ahalf of of key valuedata next line down is really importantso you can see here we've set thestorage class to SSD storage so whatwe've done here is we've exposed thelocal NVMEs on the Kubernetes nodesthrough to the TIKv pods so again likeChris's demo this is running in LKELinda Kubernetes Engine um and one ofthe benefits of LKE is you get access tothese fast local NVMEs so if you want tobe doing low latency hundreds ofthousands of IOPS in your key valuestore you better have some pretty quickstorage and the final line there is justshowing that we've used the Kubernetespod anti-affffinity feature to ensurethat we have one Tik KV pod running perKubernetes node um if you have beefiernodes you might want to consider runningmultiple pods per node but TIKB in theway we're using it it's quite a hungryapplication so we want to ensure thatthat pod gets full access to thenode okay start thedemo okay so this is my 20 nodeKubernetes cluster 13 of the nodes I'vereserved for TIKv sorry 13 of the nodesI've reserved for TIKv the other sevenare for the benchmark client so thebenchmark client here is a tool calledgo YCSB which is an open source databasebenchmarking tool and we've just putsome scripting around it to kind of runit in Kubernetes and bring up multiplepods um and do things like collect theresults afterwards and analyze latencieswhich we won't get into today um so outof the 13 TIKV nodes 12 of them areactually for storage like I said earlierone of the nodes and one of the pods isreserved for the Kubernetes um the TIKVcontrol plane so what I'm going to do isI've got a bunch of bash scripts thatbasically bring up my benchmark andthat's going to start running in asecond and the idea is we're going totry and target 1 million random readIOPS so those seven benchmark nodeswe're going to spin up about 15benchmark pods I think across thoseseven nodes and they're each just goingto read a random key out of that 10billion um 10 billion key key space andthe idea is can we hit a million IOPS ona fairly modestcluster so there we go that's thebenchmark just coming up now we'll giveit a bit of a bit of time torun and we're going to jump over to theGraphana dashboards and see what's goingonso like I said TIKV comes with a bunchof super useful uh dashboards um you canget a high level view of your cluster oryou can really get into the weeds andlook at specific subcomponent componentswithin the stack um this is just one ofthe dashboards there's probably about adozen others um but you can see eventhis one is got a whole wealth ofinformation um and what we're going todo is we're going to jump into one ofthe more high level panels the clusterdashboard and as you can see you've gotalong the top row you've got youravailable capacity how much space you'recurrently using all those kind ofheadline stats next row down we've gotthings like CPU usage in memory and therow below that we've got uh our currentcur queries per second rate through umand there's a whole bunch more I canshow you in that as well so if we diveinto the CPU panel you can see we've got12 Tik KV nodes um our 12 storage nodesthey're all running pretty hot sobetween 900% and thousand% CPU usage umas you'd expect TIKVS balanced our datapretty evenly which is why all the CPUusage across the nodes is is prettypretty uniform um just looking at thememory panel here so we're using about20 gig of memory on each of these TKVpods that's not a memory leak it'snothing to be alarmed about by defaultTKV will try and use a bunch of memoryum as an inmemory cache for your data sodepending on your workload that canreally accelerate performance probablynot so much for this random read testb�ecause our working set is much largerthan the available RAM in the clusterbut it's something to be aware of and ofcourse you can tune all this um so justto prove it's not all smoke and mirrorswe're going to go into the gpc panel andtake a look at the um queries per seconddashboard and it might be a bit small tosee there but I promise you I'm notlying we've been pinned at about amillion RPS for the past couple ofminutes thanks i'm gonna hand it overLoyall right okay so you got the theory andthe education from Alex you got thedemos from Alex and Chris and now I'mgoing to give you some real worldapplication of what we just heard aboutso we are going to talk about Nokia andCivo and just so you know I have speakernotes on my phone i'm not like textingpeople while I'm up here okay so I workfor Perona and Perona is a 100% opensource company and being head ofcommunity for 100% open source companyhas got to be like the highlight of mycareer why because everything that I'mtalking about everything that we doeverything that we build is open sourceyou can try it today right you don'thave to pay us anything but we're hereto help you along the way but again 100%open source so to be able to be up hereon stage with Achammy and to talk aboutCNCF and open source projects it's justawesome so let's get into it no no Nokiayou guys remember the phones yeah okayso Nokia is one of our customers and uhwe are going to help them so they have90,000 internal users right they have6,000 projects five data centers and 61pabytes of storage so they're not busyat all right like that's why no callsget dropped um so what is their mainpain their main pain is operationalefficiency right the lack of databaseself- storage growing number ofmicroservices uh and what are theirrequirements number one it's 100% opensource so they called us number two theyneeded MySQL and Postgress support whichis cool so what is the solution so theydecided to run databases on NKSK theirversion uh they used our MySQL andPostgress operators to build a privateDBAS this allowed them to shiftdatabases from virtual environments toKubernetes to improve resourceutilization and reduce infrastructurecosts things cost money and they shiftresponsibility for database managementleft to the devs team so they wereresponsible for their own uh their ownum path forward and so like just to do alittle bit of a call back to what Alexwas saying this is a real life usertaking advantage of automation scalingand day2 um complexity using Kubernetesand it's Nokia guys big telecom operatorFortune 500company hey girl what you doing allright sorry civo uh so Civo is acloudnative service provider providingpublic private cloud all on Kubernetesand you're like okay well if you'realready on Kubernetes why do you needany help well they wanted to launch aMySQL and Postgress DBAS on K8s fortheir customers so what did they needthey needed something that was reliableit was battleproven uh battleprovendatabase operators they didn't want tocreate their own they wanted somethingthat could they could just use out ofthe box they needed isolated tenant andmulti-tenant environment support againopen source and integrated with the Civocloud control plane so open- sourcebattle tested an operator that's worksthat they can count on because nowthey're going to have their customerscount on it aswell my goodness there we go so what wasthe solution they used uh our operatorsfor MySQL and Postgress to automateoperations in the back end theynamespace operator deployments toprovide the required separation oftenants which allowed them to createthis great service offering for theircustomers so they could launch a MySQLand pro so their customers could launchMySQL and Postgress DBAs quickly so theycould have their own control becauseit's a DBAS they could keep cloudnativedesign approach which is end to end andthey could continue to to develop umcontinue the development of projects butthat's why they wanted open source rightso they can continue to build on whatthey've already had going so again Alexgave you like the why the guys gave youthe how and this is us telling you likeit's being done every day so uh hand itback to you so hopefully this was a bitof a whirlwind tour into why cloudnative storage is a good idea how youcan um deploy it in your environmentstoday and cover both some of the complextopology aspects as well as theperformance as well as some real lifeuse cases and I think we have a fewminutes left so um if anybody has anyquestions we're open to taking anyquestions there's a microphone righthere um to help yourself beheard don't be shyis there any god that was a little bitloud is there any push from all of theseoperators to solve some of theunderlying problems that staple setshave being completely immutableum there is so so both both in terms ofum helping with stateful sets but alsothe operators in some cases move todoing deployments instead of statesfulsets and having the operators implementsome of that logic and and and sort ofcorrelating the workloads to the storageyeah uh thanks for the talks and demosum could you talk a bit more about localstorage versus network storage forKubernetes services yeah so so thereisn't there is you know there isn't aone-sizefits-all um I think when we whenwe talk about um some workloads whereyou have like an individual workloadthat needs to move around a lot thenblock storage um uh you know adistributed block storage capability ora shared block storage capability mightbe the right option but more and morewhat we're seeing is we're seeing thethe ability to do to utilize fast NVMElocal storage which which is often notjust extremely fast but also muchcheaper than than you know attached blogstorage um and then using the facilitieslike what we just saw say with TKV or orPostgress replication to actually handlethe availability and the durability sideof things can I just say one more thingon that which is um if you're trying tobuild something ultra fast like that andyou don't control the network the jitterand the latency if you try and usenetwork attached storage will probablykill you so um I'd just say I think aswe get to more and more data workloadsand all the kind of AI stuff we'rehearing about I think we're going backto local attached storage in Kubernetesclusters is my prediction we'll seewhere we get toquick question about scaling how do wemanage scaling the storage so it's notentirely true with local storage becausethe scaling is finite due to the localstorage but in case of attached storageand also how do we scale the CPU of thedatabase with a minimum of interruptionin both caseso that is a how long is a piece ofstring kind of question because it it itdoes very much depend on you know theactual database technology or storagetechnology that's that's in the stack umfor for things like say a relationaldatabase like like Postgressum you often scale that by having uh byhaving sort of some level of verticalscaling but then also having replicasthat can be used to um facilitatequeries or you know data warehousing orthat sort of thing in in a in in modernuh systems like TIKV where we have um avery distributed sharded store umeffectively you get more or less linearperformance on as you add the number ofnodes and the number of local discs andthe the operator and the control planeum take the sort of um add thecleverness if you wish to make sure thathotspots are identified and moved aroundand distributed across the cluster sothat you can use all of the resourcesacross all of the different nodesi'd almost say as well it's as thedatabase owner or operator it's almostour responsibility to give theinfrastructure or the cloud teams theinput of what we need but if you want tochange the underlying node and move themto bigger smaller nodes or do aredirected restore to bigger nodes you alot of these operators will just do thatall for you nowadays so actually on theapplication side it's nice and easy butit is still our job to give um accuratekind of inputs to the team of what weneed underlying and I I think we'reabout to run out of completely out oftime we are out of time we're We're allgoing to be around2025-04-15 22:02:16.987114 ��u#��SAJtMYdR50-KUhello welcome who knew so many peoplewanted to learn about stful storage umso my name is Alex Kirkup uh I'm a chiefarchitect at Akamise Cloud um and I'malso uh a member of the technicaloversight committee for the CNCF um thisis my colleague hey y'all my name isLori i'm the head of community at Peronaand I am also a CNCF ambassador hey I'mAlex Reed i'm a principal engineer atAkamiand last but not least my name is ChrisMilstead i'm a product architect atAkami as well welcome everyone and backto you Alex okaythenso why should we be thinking about umcloud native storage because it's oftensomething that's that's uh ignored inour in our environments i mean seriouslyhasn't everything been statelessforever now this is something which Ialways put up in in all my talks to kindof say there is no such thing as astateless architecture it's just someoneelse's problem and it's probably all ofyour problems or some some of yourproblems um because all applications endup storing statesomewhere and why would we wanttherefore our storage to be cloudnativebecause of course we've learned so muchover the years from our Kubernetesenvironments where we've made workloadsdeclarative and self-healing andautoscalingso of course all of the statefulworkloads can also benefit from theautomation that you get from a cloudnative environment the ability to easilyscale and deploy complex topologies theself-healing and the automatic failoverand of course the performanceand what do we mean by all of thosethings so when we talk about automationwe're talking about the declarativenature of cloud native storage you canhave a set of YAML that defines not onlythat you want to run uh or or define a astable workload but it also allows youto define very complex topologies usingoperators and we're going to give somedemos to show that it also allows you toeasily scale and and apply um scaling asyour capacity or your uh RPSrequirements grow over time and itprovides an automatic self-healing sowhen you want to upgrade nodes when youwhen a node fails and you need to failover a workload somewhere else you getthat automatic self-healing failover andwhen it comes to performance with allthe um flexibility that we have withnodes in a in a cloud native workloadlike Kubernetes where you have um uhlocal NVMe for example or distributedstorage solutions that you can utilizeyou're also talking about now gettingnative performance like there isn't anoverhead of of deploying in acloudnative environment and moreimportantly the ability to getdeterministic performance right so sothe ability that the ability for aparticular query to always take the sameamount of time everytime and when we talk about cloud nativestorage it's obviously more than juststorage you know we we very familiarwith you know the block volumes or thefile systems uh and the shared filesystems which are much more commonnowadays with uh machine learningworkloads but it's also of course theobject stores and the databases and thekey value pairs that you're connectingto and and actually querying uh andtoday we're going to talk about a demoof a complex database topology and avery fast scale out uh key value uh demotoo but it's the it's it's not justthose products it's all the ecosystemthat goes around this so all of theKubernetes integrations that make thispossible like the CSI drivers for block��calable and fault tolerantsystem and it handles a lot of the hardparts of running a distributed uh systemvery well for us elastic search storesdata uh that share similarcharacteristics in a logical name spacecalled anindex these indices are split intomultiple charts which can be eitherprimary shards or replica charts[Music]we run our fleet of elastic searchclusters on managed Kubernetes clustersGK on top of uh GCP and we deploy andmaintain them using uh a customKubernetes um controller that we havebuilt shopify is a large um complextechnical ecosystem that is made up of avariety of apps and services and thebiggest service at Shopify is called theShopify core which is a big railsmonolith that powers all of our merchantshops and storefronts we run Shopify inGCP regions for resiliency uh whatyou're looking at here is a um highlevel structure of Shopify core in asingle region the entire Shopify core ina region is broken down into above a 100logical groups of SQL databases andinstances of other services which aresimplified in this slide um all of theseservicespower all of our online storefunctionality for the shops running inthat region uh in every region we runone instance of elastic search forShopify core that provides searchfunctionality for the shops in thatregion to give more context elasticsearch is used as a secondary data storeit is util uh utilized to provideadditional functionalities such as fastsearch capabilities aggregations or fulltext search on top of the existing datain the primary data store which is SQLin ourcase we have an ingest pipeline in eachregion that consists of CFKA topics andCFKA consumers we produce updates madeto records for all related SQL instancesin the region to our CFKA topics and ourconsumers pick up those messages fromthe topics and write them to the correctu elastic search index in realtime indexing is a term we will use alot in this um talk uh it is uh the actof uh writing documents from the primarydata store to elasticsearch we store elastic search documentsin uh elastic search indices for examplewe have an index that allows merchantsto search and filter through the ordersthey receive or another index thatallows them to search uh through theirproducts or their customersuh merchants are able merchants orbuyers are able to search uh theseindices umum and uh their their queries are sentto the correct elastic search indexthrough a uh routing layerour focus in this talk will be on theindexing path uh shown here which wecall the indexingpipeline in the previous slide uh we sawthe indexing pipeline that brought datafrom SQL as the primary data store toelastic search through CFKA but that wasreally simplified and we have actuallybuilt two different uh indexingpipelines for two different writeprofilesone of them is called the real timepipelineum which for example indexes a productto elastic search and makes it availablefor search for buyers when a merchantfor example adds a new product to theironline store the other one is called thereindex uh pipeline we have a lot ofdevelopers at Shopify um they createindices modify index analyzers and addor remove versions of the indexuh they add or remove fields from thethe indices and basically they changethe shape of the indices a lot and whenthat happens we need to uh migrate allthat data to a new version of the indexand promote that index to make the newfeature available to the merchants uh aswell as the buyers we call thisreindexing which basically meansrebuilding an entire elastic search uhindex from SQL recordsand as I just mentioned we have twoseparate indexing pipelines for realtime and reindex rights uh which we seeon this slide as a bit of a backgroundCFKA is a distributed publish subscribemessaging uh messaging system data iswritten to CFKA topics by producers andconsumed from those topics by consumerswe have separate CFKA topics for eachpipeline the real time CFKA consumersconsume from the real-time CFKA topicsand the uh and write to the productionindices these indices provide searchresults for buyers andmerchants the reindex consumers on th�eother hand um do not write to productionindices they only write to indices uhwhose aliases end with anew the new indices get created when westart a reindex uh for a specific indexfor example for the orders index uh theorders new index will eventually be acopy of the production index uh fororders but with different fields anddifferent featuresa new index does not receive productionqueries because its data is in uh isincomplete until all the necessaryrecords from SQL have been written toit after the dot new index is createdsome jobs get spawned and uh read thenecessary records from SQL produce themto the reindex cafka topic which will beconsumed by the reindex consumers andwill eventually be written to the newindex and to make sure that the newindexalso receives the real time updates uhthe reindex topic also receivesreal-time updates made to SQLafter a while all records in SQL willget processed and no more reindex writeswill be produced to the reindex cafkatopic at this time the reindex isfinished and we can flip the new indexlive flipping it live basically meanschanging the alias for the productionand the new index and after thisthe new will become our new productionindex and will have the updated fieldsand features and will be queried by theclients as mentioned before our elasticsearch clusters run on Kubernetes on topof GCP across different uh regions andavailability zones it's good to knowthat today the entire Shopify ecosystemincluding the Shopify core uh its uhstorefronts and other services runacross the globe uh from North Americato Europe and to Singapore we haveelastic search clusters spread acrossthese regions giving us resilience andavailability through redundancyso let's talk more about resiliencetaking a closer look at uh an exampleGCP region where Shopify core and its uhelastic search cluster runs we have theindexing pipeline updating uh elasticsearch when changes are made to SQLrecords and we can see here that searchqueries are sent to the elastic searchthrough a routing layer meaning thatqueries made to shops in a certainregion are routed to the elastic searchin the sameregion we know that the first steptowards high availability is redundancyand to mitigate regional failures weshould replicate our systems to anotherregion the the active SQL instances andum CFKA topics in different regions theyhold different data sets and we need toreplicate their data between regions soboth elastic search clusters uh have theuh the same data set and with this interregion uh data replication if one of theelastic search clusters goes down uh wecan fail over the query traffic to theelastic search cluster that isfunctional until the other one hasrecovered at the application layerelastic search also provides falltolerance by being zone aware meaningthat if we deploy our Kubernetes clusteracross three uh different availabilityzones when elastic search distributesprimary and replica shards uh itdistributes them across these zones in away that the primary and replica for oneshard do not end up uh in the sameavailability zonewe use node affinity rules to ensurethat two pods from the same uh elasticsearch cluster do not get uh scheduledon the same node and we also taint theGKE node so only the elastic search podsare deployed onthem the elastic search pods are giventhe right toleration so they can bescheduled on the right GKEnodes with this zone aware setting uh wecan also do maintenance more quickly umas we can bring down an entireavailability zone for maintenancewithout being worried about dataloss to give a sense of uh the scale ofsearch at Shopify as of today the searchplatform team manages above 800 distinctelastic search clusters some of whichare as large as 260 node clusters andmany are as small as just three nodeclusters and together they store morethan three pabytes ofdata for Shopify core elastic search thereal-time indexing rate can peak at90,000 documents per second uh while ourreindex indexing rate peaks at 500,000documents per second and as you can seethe reindex pipeline uh the reindexpipeline throughput is a lot h�igher thanthat of the real time pipeline and torecall reindexes are done occasionallyto change the shape of an index in otherwords they are an occasional burst ofrights to elastic search this isactually a nice segue to the problemstatement uh which will be followed bythesolution what you see in this slide isthe distribution of elastic search podsacross many VMs running on GCP asmentioned before an elastic search indexis split into many shards and each shardhas a replica for example the productionindex for orders is split into a,024uh primary shards each of them having areplica chart which is um shown by adifferent color in here this index uhreceives realtime writes as well asproduction querieswhen a reindex starts for maintenanceanother index called the orders new withthe same number of uh charts is createdon the same VMs which will receive theheavy reindex rights and since theproduction and the new index share thesame compute and storage resources theheavy reindex rights um will impact thereal time rights and slow themdown anddelayed real-time indexing uh results inuh stale or inaccurate search resultswhich could result in lost revenue forourmerchants which is a big dealso the fact that real time and reindexrights share the same compute andstorage resources uh leads to reindexesimpact uh impacting production rightsand queries which leads to uh pagerfatigue uh because the on call would getpaged for real-time indexingdelay which can only be mitigated bystopping the reindex starting it allover again uh which also uh slows downfeature rollouts for ourdevelopers our initial approach to solvethis problem was to overprovision thecluster to always be ready uh to handlethe load of the reindexes and thisbecame too expensive as Shopify keptgrowing and the cost was no longeracceptabletherefore we decided to separate theinfrastructure for real time and reindexrights this was a huge undertakingbecause it required us to make somefundamental changes to our existingreindex pipeline the first step to dothis was to create a separate GKE nodepoolool just for running reindexeswe leveraged Kubernetes taints andtolerations to ensure that elasticsearch pods for reindexes only getscheduled on the reindex GKE nodes uhsame as for the elastic search pods forrealtime rightsjust asa quick refresher a Kubernetes taint isa mechanism that allows a node to repela set ofpots taints are applied to nodes andthey work together with tolerationswhich are applied to pods the primarypurpose of using taints is to ensurethat certain pods uh are not scheduledon the wrong nodesso giving the reindex and real-timetoleration to pods uh will ensure theyget scheduled on different note poolswhich will provide the first steptowards uh isolation of reindex and realtimerights we use the elastic search uhbuiltin chart allocation settings toforce all indices uh to be hosted on theelastic search pods that run on the realtime note poolool except for the oneswhose aliases and uh with a new whichshould be hosted on the elastic searchpods uh that run on the reindex notepoolool once the reindex is done weswitch the alias for the new index whichwill force its charts to be relocated tothe real-time note pool and uh becomesearchable by clients and this wayreindexes no longer impact productionreads andrights also since the real-time nodepoolool no longer has to provide for theheavy reindex rights there is no need tooverprovisionuh that no pool and it gives us a greatopportunity to save infrastructurecosts but as you see now it's thereindex notepool that is burningmoney and this is where node autoscalinguh will come in handy ideally we wouldneed a very small GKE node pool thatgets scaled scales up when the reindexstarts and we expect that the node pooluh scales up depending on the size thenumber of shards uh of the index that wewant tobackfill we picked Kada which stands forKubernetes event-driven uh autoscalingwhich is an open source project thatprovides autoscaling for Kubernetesworkloads uh it's designed to extend thecapabilities of the Kuberneteshorizontal pod autoscaler by allo�wingworkloads to scale uh based on metricsfrom various event sources rather thanjust uh CPU or memory utilizationthis makes KADA particularly useful uhfor applications that need to scale inresponse to events like Q length uhnumber of incoming messages or in ourcase to automatically scale up theKubernetes workload when a reindexstarts scaling up the Kubernetesworkload uh will force the GKE nodepoolool hosting the reindex pods to beexpanded as well kada is implemented asa Kubernetes operator which uses customresourcesum to define how scaling should behaveand uh it consists of two maincomponents one is the KADA controllerthat watches for a KADA custom resourcecalled the uh scaled object and managesthe life cycle of the uh correspondingHPA resources the other one is themetrics adapter that collects andprovides uh metrics to the KubernetesHPA which then makes decisions aboutscaling the application pods uh up ordown based on thosemetrics kada supports a variety of eventsources um called the scalers andPrometheus is one of them uh each eventsource is represented as a trigger inthe Kadaconfiguration scaled objects are customresources uh used for scalingdeployments uh stateful sets or anyother scalable um Kubernetes workload wedefine the triggers and scalingbehaviors used by Kada to scaleworkloads uh you see an example of ascaled object uh on the slide uh for ourspecific use case we set the scaletarget to our reindex stateful set thatuh runs the elastic search pods that areused for reindexes and we also havesettings for min and max replica countsuh these settings determine the minimumand maximum number of replicas that thestateful set can scale to based on theevents that are triggering the uhscaling action for us we set the minimumas low as three and the maximum toaround 200replicas the Prometheus trigger in Kadaenables autoscaling uh based on theresults of the Prometheus query uh whenyou configure a scale object with aPrometheus trigger Kada willperiodically send queries uh to aspecified Prometheus uh server and uhthe responses to these queries are thenused to make decisions about uh scalingthe Kubernetes workloads up ordown and for our use case we wrote aPrometheus query that calculates thenumber of shards uh divided by thenumber of nodes and we set the thresholdto one uh which means that ideally forpeak performance uh we want to have oneshard on a single node however if theindex has like 2,000 shards and the maxreplica count is set to 200 we will endup with um only 200 nodes which meansthat each node will have 10charts the following slides show anexample for node autoscaling uh we havea real-time node poolool hosting chartsfor production indices we also have areindex node poolool with a minimum ofthree nodes that initially host noindices so the prometheus query for kadathat calculates the number of shards uhdivided by the number of nodes willreturn zero kada convers uh compares thequery results to the threshold we haveset uh which is one and decides to donothing because the reindex workloadalready has three replicas which is theminimum allowednext the reindex starts for an indexthat has for example four primary shardsand four replica shards so eight shardsin total because of the routingallocations that we have set for elasticsearch the shards of the new index willbe created on the pods running on thereindex node pooluh and what once this happens theprimeus query would return a valuehigher than our threshold of one whichsignals to kada to scale the reindexworkload up until the result for thequery uh becomes equal to the thresholdscaling up the reindex workload expandsthe underlying uh GKE notepool once the node pool gets scaled upuh elastic search automatically balancesshards across the nodes so each node ofthe node pool will have one shard nowthe primeus query returns one which isequal to the threshold and kada willtake no scaling actions after thereindex is done uh and we switch the newindexlive all charts from the reindex nodepools get evacuated by elastic searchand will get scheduled on the real timenode poolool at this point theprometheus query returns zero which isbelow the threshold and consequentlykada decides to scale down the reindexuhworkload the reindex workload getsscaled down by Kada which consequentlyscales down the underlying uh GKE nodepool to the minimum number of nodeswhich in our case was three and Kadawill take no further actions until a newreindex isstarted with dedicated reindex nodes uhwe were able to reindex our orders indexwhich has around 100 terabytes of data40% faster the cute bite threads on thereal-time nodes was also dropped by 98%during reindexes which helped improvereal-time writes performance and as aresult the pager fatigue due to thisproblem was no longer an issue anddevelopers were able to ship features uhfaster leveraging node autoscaling uh wewere able to reduce the total number ofuh CPU cores and memory used by ourreal-time node pool by 58% and 15%respectivelyuh which led to a total saving of 43% ininfrastructurecosts this was only an example of theinteresting problems we solve at Shopifyuh we're always looking for people whoare interested in solving problems atscale uh scan this QR code uh to getmore uh information and if you have anyquestions uh I'd be happy to answer themthank you[Applause]thank youoh we have one question yeah it's goingto be a quick one um when you scalingdown scaling down your reindexing notepool how do you make sure that youaccidentally don't shut down the nodethat still has a shard that didn't getmoved to the other uh that's a reallygood question um well we are saved bythe the the results of the Prometheusquery if we have data on the nodes thatwe want to scale down the query will notreturn zero it it will show that thereare some shards uh on those nodes sothat query that divides the number ofshards by the nodes will not return uhzero but at some point some shards willmove and some the the note pool getsscaled down when all shards have movedwhen only when there are zero shards onthe reindex notes okay thank you thankyouhi one more question how fast do youscale up the worker nodes because whenyou create the index with thousandshares it means that uh do you do youscale up all the worker nodes at onceit's uh you are able to define thescaling behavior in the scaled objectconfig uh youcan like say scale up for me like fivepod at or which means five nodes at atime or like by a percentage if I'm notwrong in our case for the the one thatyou mentioned like a very large index Ibelieve ittook I think 30 minutes to scale upwhich we really we are okay with becausethis is a maintenance uh kind of thingso like 30 minutes is basically nothingit's just system warming up thank youthank youhi i didn't really get how you managethe affinity between the shards and thenodes they are hosted on is that anelastic search feature or your customcontroller great question yes it is itis the elastic search feature you areable to define some custom nodeattributes and you are able to you setyou are able to say that if this indexhas this kind of alias always likeallocated to a note that has this namingpattern so this is a feature thatelastic search provides and the shardsof that index uh follow that the ruleand will end up getting hosted or likescheduled uh on the uh in our case onthe reindex note pool thank you verymuch thank youhi there thanks for the brilliant talkthank you um what kind of ballparkfigures do you have for when data has tomove around um after you've completed areindex uh I didn't catch the beginningof your question what kind of ballparkfigures in terms of time does it takefor the data to move around you knowwhen you've done a reindex when we aredone with the reindex yeah and you'rescaling down things uh yes uh greatquestion it is it obviously it dependson the index that we're running but forour largest um I think it took about twohours like to be fully all for all theshards to fully move from the reindexnodes to the real time notes okay cooland as as orders of terabytes likehundreds of terabytes you said yeah yesthank you yeah thank youthank you everyone[Applause]2025-04-15 22:02:17.642392 ��v#��mAr59IfCSmUBQhello everyone uh and thank you all forcoming my name is Leila i work atShopify as a site reliability engineeruh I'm part of the resiliencyorganization at Shopify whose uhresponsibility is to ensure Shopify isalways uh up and running ready andreliable for merchants uh and our buyersacross the globe my talk today will beabout a project uh where we improvedelastic search indexing performance aswell as the reliability of ourinfrastructure uh by using dedicatednode pools that are automatically scaledby kadascalers in the next 30 minutes we willlearn how at Shopify we uh host thesearch infrastructure on top ofKubernetes we'll talk about the entiredata pipeline that writes data from uhSQL to elastic search through CFKA andthe resilience of its architecture uh Iwill follow by presenting the problem weneeded to solve and we'll share oursolution for it followed by um sometechnical details on how to use SCADA uhscalers for scaling uh Kubernetesworkloads and I will conclude thepresentation with a Q&Ashopify is the all-in-one um commerceplatform uh to start run and get andgrow a business uh it is the leadingglobal commerce company that providesessential internet infrastructure forcommerce uh as well as uh trusted toolsto start scale uh and run a retailbusiness of anysize currently we have over uh 3 millionbusinesses using our platform uh we havemerchants like Gym Shark and FashionNova selling their products with us uhwe represent over175 countries and have about 10% of thetotal US commerce um in the summer of2024 we reached a trill uh trilliondollars in gross merchandise volumeone of the largest events in commerce isthe Black Friday Cyber Monday week uhwhich we call it BFCM uh to share morestats about the scale of Shopify uhduring the BFCM of 2024 Shopifyprocessed around 58 pabytes of dataserved more than a trillion edgerequests as well as more than 10trillion database queries uh which ledto around 12 billion dollars of globalsalessearch is a fundamental part of anycommerceplatform that allows buyers to searchand filter for products or for merchantsto fulfill their orders manage theircustomers and inventory and when you goto any online store and search for aproduct your request goes to a searchengine that is backed by a secondarydata store which is different fromtraditional databases uh and the searchengine that we use at Shopify is ElasticSearch a quick refresher on ElasticSearch it is a distributed text searchand analytics engine uh it is built ontop of Lucine that has full text searchcapabilities that are well suited to thee-commerce domainit's also a s��fa a demon being a kind of generallyhelpful thing that that that hangs outand uh does things for you um inKubernetes terms a demon set will runone pod on every node in your clusterand uh why would you want to do thiswell something like a like a networkcontroller that that needs to run onevery node um that will run a demon setuh logging demon um something collectinglogs from every node that that will runa metrics collection on every node isthis kind of thing that we're talkingabout that that's what you would use ademon set for and um they they need torun everywhereyeah so just a little bit more levelsetting to before we get into the meatof the talk but uh let's talk aboutresource requests for a second sincewe'll be talking about them throughoutwhen you set up anything to run inKubernetes you can specify how much CPUand how much memory it will use uh theKubernetes looks across the availablenodes and picks a node where yourworkload will fit so these resourcerequests are important in theuler beingable to do its workeffectively uh specifying requests isoptional uh but it's highly recommendedas uh you'll realize as we talk throughsome of the implications in just amomentso our fundamental problem is thisimagine you have a cluster where youhave perhaps some small nodes and somebig nodes and you're running a damonset like a logging uh collector like ametrics collector something like that uhwhere the work that your damon set isdoing depends on the other work that'shappening on thenode so you only get to pick one set ofrequests that will apply to all of thepods under the damon set how do youdecide the correct request tomake before we answer that let'sconsider the consequences if you'rewrong uh if you pick a limit that's toolow the pod might be starved forresources or some of your pods or all ofyour pods uh for CPU that means theworkload gets throttled uh for memory uhthat means that the application canencounter an out of memory error andcrash that's notgreat uh if you pick a request that'stoo low with a high limit uh theworkload can misbehave or become a noisyneighbor and interfere with otherworkloads uh and here be dragons there'skind of unpredictable behavior whenyou're requests don't match your actualusage uh you can also run out ofcapacity at the container level or forthe entire node in either way theconsequences aresimilar for Damon said in particular ifyour resource requests are much too highsome or all of your pods may becomeunscheduable so in this example uh youmight have a large node that wants moreresources than some of the small nodeshave to begin with uh so that obviouslywon't workum yeah so if you uh on the other sideif you put the uh request much higherthan you actually need uh then youyou're just you're kind of wasting thatresource it it can't be scheduled for uhit can't be allocated for for other podson the system um so you have thisterrible choice right if you go uh toolow you might get throttling or crashingif you go too high um you're you'rewasting moneyyeah in the particular service that Iwork on the Google Cloud managed servicefor Prometheus it runs with highpriority class because we want toprioritize collecting metrics uh and souh overprovisioning can also lead touser workloads being evicted uh so inthat case the um the Prometheus instancethat's running will start to take upmore and more resources uh and push outthe workloads that our customersactually care aboutum and if this gets really bad you mightnot be able to schedule any userworkloads on some nodes so obviouslythat's not a good situation to be in youdon't need to collect metrics fromnothing so I wanted to uh bring somereal data uh this is a logging demonrunning in in one of our dev devclusters at Graphana Labsum and so these are the actual CPU usageof a of a set of about 15 pods that arethey're all the same but but some ofthem are doing more more than more workthan others so yeah the question iswhere do you set the requestso your first option is YOLO if youdon't set resource requests at all uhyou're likely to have your workloadseventually misbehave� because theuler ismissing some of the information that itneeds to be able to schedule workloadsappropriately it's very popular thoughright it is very popular from what Iunderstand um there are certainplatforms like if you're running on GKEautopilot for instance where therequests are actually required and itwill set a default value for you uh ifyou don't set them yourself so uh thisis uh not even an option on certainplatforms another potential approach isa conservative approach uh where you setyour request high enough to cover everyinstance as I mentioned this can havesome problems if there's not enoughavailable capacity uh but you're alsowasting resources so in this example ifyou set your request to the highest nodeyou're going to have all of that emptyspace under the purple line is going tobe wasted capacity that you're payingfor but you're notusing a third option uh is being very uhaggressive uh you can set your requestvery low and allow bursting and hopethat there's uh resource capacity on thenode in this case you end up with adifferent kind of problem that's similarto the issues with the YOLO approachwhere you've set a request that isinadequate and you might not have spacereserved for the workload that youneed a fourth approach is divide andconquer uh you can divide a damon setinto multiple demon sets and setrequests and limits for each subdammonsetuh in this case you limit the blastradius of your over andunderprovisioning issues uh however witheach within each set of pods you stillhave some of the sameproblems so uh in the example on the farright uh you're still seeing a lot ofwasted capacity within thatsubset forexample uh also this becomes more workintensive each subsequent updaterequires you to update multipleresources and you start undermining someof the value of the damon set controlleryou have to make sure that your nodeselectors are appropriate and that youdon't have overlapping groups uh or thatyou're not missing things from yourselection groups so it becomes much moretedious to try to managethis but what if there was a way toautomate away your problems that isinstead of trying to manage multipleDamon sets yourself what if we did thehard work for youyeah so um let's uh get into a littlebit more detail the vertical podautoscaler so so most of you did not putyour hands up as as having seen this solet's just talk about it a little bit uhso this is the the YAML to define a VPAobjectum you basically point it at a set ofpods in this case I've pointed it at ademon set which is the the graphanalogging demon that we're going to use inthe demo um you can add more parametersyou can you can set a minimum and amaximum and you can set policies andthings like that but that's that's thebasic idea you you define one VPA objectum that that uh points at the pods youwant to controlit then runs as threeprocesses uh so let's just walk throughthose um the top one is the recommeneruh so this watches the actual resourceusage of your pods the ones you told itto to control um that's defined by theVPA object and it also writes arecommendation into the VPA objectum the next one is the updaterwho knows what the updater updatesyeah it's a mystery isn't it it's atrick question doesn't update anythingum what the updater does is evicts a podthat is out ofrange um and uh I didn't actually knowthat until I started working on thisproject but here we are um so uh thethird process is an admission web hookand this is something that runs as thepod is created it it it gets called fromthe Kubernetes API server so as the podkind of flies past on its way to beingcreated the uh the VPA admission webhook applies therecommendation so this is why evictingworks that we um kick the pod offwherever it's running modify its requestand it itreappears uh so a little dance and uhthat that's how the current VP VPA worksum so we'll get into uh how we modifiedit oh yeah that's Yep uh so in uh insome of these cases where uh one sizedoes not fit all for your requests wepropose that instead of applying asingle recommendation across the entireDamon set that we creat�e a customizedrep recommendationuh for each pod scheduled on each nodeso by adding a scope field to thevertical pod auto scaler spec you canspecify how you want the VPA to behavein this example we use the uh Kuberneteshost name to declare that it treats eachpod with a unique host name effectivelythe pod on each node uh as its ownunique scope that gets its ownrecommendation based on only the historyfrom that hostso I'll turn it over to Brian to showyou how this prototype works yeah so wehave um a demo environment which is athree node cluster it's running in kindat least I hope it's running uh we'llfind out in a second um so we have a wehave a three node cluster and we havemonitoring stack uh and I'm I'm usingGraphfana Cloud for that um so we'resending um metrics and and logs and soon and we're actually going to use thelogging demon as our um guinea pig forthetest so let's see if I can make the demoappearuhokay[Music]uh okaythere's one okaynot too bad not toobad i I I should just stop the demothere right not uh not temp fate um howare we doing for time actually uh prettygood pretty good okay is um so let mesee if I can runcommands yeah okay there's my three nodecluster um so we are looking at uh theuh CPU usage is the the thin line andthe uh request is the thickline um and it's it's sitting uh it'snot doing much right now right becauseI'm I'm it's just a kind of emptycluster i could uh I could show youthatum so basically the uh the only thingsthat are running are the cube systempods and the uh observabilitydemonsum so uh I'm going to introduce someload i'm going to runum a pod that generates somelogs now well let's just uh see ifthat's[Music]running what happened oh minus F thankyou everyone's screaming at me minus Fminus F minus Fuh shows you it's a live demo though huhum okay so um the uh the VPA by defaultum might take uh a couple of hours tokind of think about its recommendationand uh apply it and so forth and we wedon't have that long in the system in inthe in the talk so um I have u modifiedthe settings of the VPA uh so that itit's it's going to react a lot faster sothere's there's the load coming in rightyou see that I I'm only running the loadon the right handnode um and it's generating logs so thelogging demon is having to do morework so um what I'm hoping is there yougo so the VPA has recognized that theload on that pod was higher uh and hasevicted it um and has uh applied a newrecommendation which is higheruh while not changing anything on theother two nodes[Applause]there we go so yeah there's there's likea burst of CPU as the as the thing takesoff but uh but it it it it'll hopefullystabilize about there um for long enoughfor me to get the slides back up anywayokay uh where are the slidesokay is that going to workno yes maybe whoa okayhYeah just like talk amongstyourselvesuh right there wegookay thank youum right so we we um we demoed uhmodified VPAum and uh thank you uh we we demoed themodified VPA we demoed it reacting toincreased CPU usage on a per node basisand that is with this field in the VPAcalled uh scope so I guess you're you'rewondering how can I get my hands on thisof course so uh currently I have aautoscaling enhancement proposal that'sopen and it's uh under active review uhand the code that you just saw for thedemo uh is linked within the proposaland we'll give you a direct uh issue ora PR reference at the end of the talkhere uh so there's a few phases beforethis will roll out into generalavailability but uh the next step willbe the approval of the proposal after weget all the feedback we need and it'sreviewed by uh the sig autoscaling andthe folks that are uh reviewing theproposal um hopefully that will beapproved uh we'll begin to develop amore robust implementation and uh workon solving some uh additional problemsrelated to uh scaling and productionreadiness to make sure that this willwork effectively uh anywhere it might bedeployed so with this feature we hopethat we'll both reduce wasted resourcesand cost uh due to overprovisioning thatwe'll be able to mitigate stabilityproblems associate�d with bursting and uhwe would love for you to get involved uhyou can find the proposaluh on the Kubernetes autoscaler repothat's pull request7942 and the uh prototype code is alsoup there it's PR number7978 uh and you're welcome to visit myblog as well where I have a short writeup of the problem that uh we're tryingtosolve with that we'll turn it over toquestions[Applause]thanks please approach the mic if youhave questions uh yeah microphone iskind of front and centerhi uh you showed the demo and in that wesee a spike of CPU when the part re uhthe new part is created right when itwas restarted it spiked yeah and thatthat's that's basically itsinitialization and it it uh because it'sa logging demon it kind of rereads thelogs to catch up to where it was beforeexactly so how does it know that that'sjust a spike of the restart of of likeit catching up and not like a generalload right yeah well so so yeah it's agreat question so the um the VPA uhalready the way it works already is it'scapturing a histogram of of CPU usageand it is applying a decaying uh waitingto those to those those samples and uhthen it takes a 90th percentile sobasically you would expect under normalcircumstances it would ignore a quickspikeand is there like a configuration so youcould actually change it for specifictypes so for example my port starts upwith like extra load for some reasonrightum there's a lot of configuration on theVPA i'm not sure that you can do muchabout I mean it it it you can configurehow long the exponential decay is andand things like that but that's thebasic behavior of it so if you if youwant a a radically different behavior uhyou you might need to find uh some otherproject it will depend a lot on yourworkload as well if you know thatthere's going to be an initial spike andthen it will settle in to a differentvalue you might want it to react moreslowly uh if you know that initiallyit's going to keep handling the amountof load that it is uh and that that thatis immediately your steady state youmight tune different options yeah thankyou you're welcomehey um so what happens if you areapproaching um GitHubs meaning forexample I have Algo CD and he's it'salways monitoring my cluster and PPAcomes in and change some of theresources which is not in my gitrepository argo will just go and run itover so do you want to take that oneyeah yeah so uh the way that thevertical pod autoscaler works it's notactually changing your initial resourceso in the example of a damon set uh itdoesn't change the spec of your damonset it just modifies the pod at the verylast moment with the admission web hookso uh the damon set is responsible forcreating a pod uh and just before itcreates the pod it asks the VPA oh bythe way how much CPU should this havehow much memory it has so you would notexpect to schedule individual pods undera Damon set in your GitOps if that makessense so essentially it's just like aquick solution for a local problemquick solution for a local problem i'mnot sure I understand meaning so if ifyou if you want the changes to bewritten into git then this is not asolution uh the the way the VPA works isit is it modifies the um pod uh requestson the on the fly using an admission webhook that's that's the way VPA worksokay and second question does it takeinto consideration cube reserved andsystem reserveddoes it takeuh Kubernetes takes those into effectthe the the VPA is is is obeying uhhowever you configure it so so you saywatch these pods and make arecommendation within these bounds youknow using this policy it that that's umit yeah it doesn't really care about thethe system request and and anything likethat those are those are handled byCublet thank youhi um first of all great feature umthank youquestion is when you've been developingthat and playing with different demonsets have you encountered a problemwhereyour default memory CPU or mainly memoryrequest was too low for some big nodesand the VPA was not able to catch up andrecommend a newmemory because it was constantlycrash looping yeah so the the VPAdoesn't really care about the ini�tialrequest uh once it makes arecommendation it'll do so because ithas a confidence interval that it feelsgood about it those some of thoseoptions are tunable to some extent uhbut if you're just using out of the boxVPA it's going to be fairly conservativeand it's going to try not to uh cycleyour pods more than it has to butSo there there's one exception if ifyou're actually getting uh out of memorycrashes it it it increases more allright so it the special provision in thecode to watch for for events okay thatthat answered my question and all all ofthat's in the VPA already uh so yeahthat's not a new feature it's existingfunctionality that will be included noyep cool thankshi hello in your demo how the how thenew logging pod knew where to catch upon logs uh compared to the previous onethat got killed how did it know to catchup on the logs uh I mean that's that'swhat logging demons do it it's um uh itwrites out a snapshot periodically ofwhere it got to well let's say it gotkilled uh writing log a yeah how does itknow that when it restarts it still needto write log B and not log A againuhyeah it it it snapshots that informationto disk so that when the when itrestarts it can pick that up okay so youyou kind of cache that information onthe node itself on the node yeahyeah it's um and and also I think we weassume that we can resend some logsuh that it's item potent um yeah but Imean that's that's just you know thatwhatever logging demon you use it it itit will have some other behavior this isthis is not really a talk about loggingdemons this we just used it as a demoyeah yeah of course in the more generalbehavior uh with any workload where youexpect to cycle the pods you want themto be resilient to that um and there'ssome very exciting work going on with inplace pod resizes uh that will make itso you don't necessarily have to killthe pod every time you can resize itwithout restarting it which interestingfor cube proxy for example if you wantto apply that to critical demon setsthat you can't afford to lose or Yeahyeah cool thank youhi thanks for the talk uh I have a likeone one question on the uh the mechanismyou're proposing here i I think thatworks pretty nicely in a static clusterwhere all the nodes already exist do youthink about how will it work when nodesare not yet there and so there is no VPrecommendation and the the thing that isscaling a cluster is to predict uh theresources that that will be used by ademon set so yeah how how will that workthis is this is one of the issues that'sbeing actively discussed on the proposalum there's a few different ideas abouthow initial recommendations should bemade um and so we're definitely work onworking on sorting that out um I thinkthat in general uh the expectationshould be that uh once the vertical podautoscaler makes a recommendation inthis per node model that uh you'll begetting the behavior that you want atthat point so yeah there there are someopen questions that people are debatingabout uh what you do between the initialrecommendation or the initial requestand the first recommendationokay thank youhi you showed the host name uh scope butare there any plans to uh get morescopes available yeah I guess I I putthat on screen as a as a kind of a ahint um I uh so scopeum scope equals instance type uh so ifyour if your problem is basically thatthat bigger nodes use moreum resource then instance type lookslike a a more more efficient moreattractive target um it just it sohappened that this is uh host is the onethat's in the the demo in the prototypeum uh instance type sounds attractivebut it's actually harder to implement umnode pool is another one that's beensuggested so scope equals node pool justto clarify too the scope isintentionally extensible uh the currentproposal just includes the host name souh yeah that's certainly an area whereadditional proposals could be added uhwe may get feedback during the proposalcycle that we want to include thingsinitially but yeah we're definitelythinking about that and open to itdifferent subsets of scaling actuallyone of one of my colleagues wants uhwants to use this on stateful sets butAdam's going to kill me if I say that soum I didn't say that cool thankshey uh have you guys thought about howthis would play with a tool likeCarpenter or if you guys have anyopinions on this in terms of it tryingto carpenter trying to right size yournodes and then this trying to right sizeto those nodes would these contendyeah so um I don't know a lot aboutcarpenter specifically uh but in Googlecloud I know that we have manyautoscaling options uh for like clusterautoscaler and all those so we have someof the same concernsum yes there we are thinking about thatum and trying to uh make sure that theywill behave in alignment with thegeneral expectations of the users soyeah uh there's a lot of complicatedissues that you get into when you'retrying to do like multi-dimensionalscaling in several ways uh and we've runinto some of those already with uhwithout the prototype so uh aninteresting space and uh if you havecomments please get on the proposal andlet us know cool thanksuh hi you mentioned at the beginning ofthe demo BPA normally runs over you knowa long period of time like I think yousaid hours and that you lowered it downbut for these uh dammons like a lot oftheir load is typically driven by whathappens to land on the machine sosuddenly some pod you know a lot oflogging or a lot of like you knowproxied request to a proxy you know 2x'scapacity and so I guess my assumptionwould be you generally want to run BPAfor dammons at a faster rate and I guessI'm wondering like why wouldn't that bethe default or like what are the risksof doing that so there's a fewmechanisms that you can use to respondmore quickly so um one is that if you'reanticipating a memory spike uh and youhit your request and limits are close uhyou might hit your memory limit and youget an out of me memory error and thenit'll apply a new recommendation quicklyafter the pod crashes uh in CPU you canum again kind of toy with your requestsuh and your limits to know what you wantwhat behavior you want um and then ofcourse you can also tune the recommenerto behave differently or respond morequickly i think in general there's a lotof uh work in the space right now withthe in place pod resize that will changesome of the assumptions about howquickly we want to respond in generalbecause up to this point the assumptionhas been to make a new uh request tomake a set a new limit you need to killthe pod and that we shouldn't just killpods forfun I was also just curious like arethere internal scaling limits if youstart driving BPA recommendation morequickly that you'd have to be concernedabout or is it primarily around more ofthe mental model for you know Yeah so Iwould I would be worried about theamount of resource the VPA itself wouldstart to take if if you um if you if youtried to drive it really quickly and andmake it you know really clever um yeahit's pulling information from the metricserver from from meththeus and so itdoes have to gather the usage regularlyso if you start to tune that really fastyou're going to run into problems butit's it's an interesting space i meanyou know we're we're here to kind oftalk about about future directions andexperimentation and so on we're we'rewe're uh we're not claiming to havesolved all the problems sowell just a specific oneand yeah and thanks for this actuallybecause it's a big issue where I work atum so um my own case was more on thescope field like is there like a plan toextend like it from being just like astring to being like the anti-affinitykind of definition likemore like broken down in that sensebecause there are some like highpriority notes you might just want toskip doing this on um yeah the there'sdefinitely opportunities to extend anduh we're out of time so I'll have to cutus short before we get to the lastquestions but uh yeah there's definitelysome opportunities and please jump onthe proposal to comment yeah yeah thankyou i guess we're we're out of timeright yeah yeah thank you one lastquestion yeah i'm happy to speak withyou at the front2025-04-15 22:02:18.114705 � ��N�x#��SAHc0jj-654lAwelcome to our talk uh about life ordeath i think we have to go away fromthe speakers uh life or death of aKubernetes request API request i'mStefan Shimanssky working on API serversince ages 10 years or something um havetouched a lot of the things we will showhere and I have with me Abu I'm asoftware engineer at Red Hat and asimilar history about API servers so umwhat we want to do today we want to talkabout API requests obviously but imagineyou're sitting in this interview rightyou want to apply for a job or you'reapplying and there's this interviewernext to you and a very um commonquestion is um and you will maybeexperience something in a similarcontext API server is of course a topicbut um you get asked so you do thatright you you type qle create minus v9and um pass some manifest and whathappens right you press enter whathappens that's the question and we areall in that interview situation now sowe want to talk about the interestingbits of the API server and Q cuttle whathappens if we do that and um noteverything is new so there was a oldtalk from Daniel Smith years ago manyyears ago I'm not sure maybe eight yearsago or so with a similar title but ofcourse cube has changed um has becomemuch more complex and I bet um 80% ofthe audience um has joined the theproject after that talk so will be lotsof new stuff anyway this is a um thestart of the conversation here and ummost of you will done have done minusminus v something like 7 8 9 somethingthis dimension and you get logging rightso you see the the cube cutle doessomething and um in particular it'sdoing this request right the requesthere and this looks super innocent umyou find the batch the word batch theword v1 and on the first view you yousee job right but there's the firstdetail which is super interesting thisis jobs lowerase plural and this one isan uppercase word singular and this isalready not trivial in Kubernetes um howthis mapping actually happens and ofcourse um when the request is done it'sa post request to a URL and there's a2021uh returned and this means success so abatch is created but everything inbetween we want to talk throughn��p�w#��AyQQU8vDhj0ohello welcome uh my name is Adam Berno iwork at Google Cloud on the Google Cloudmanaged service forPrometheus uh hi my name is Brian Borumuh I am a distinguished engineer atGraphana Labs uh my day job I work onmassively scalable storage for uhmetrics logs traces andprofiles uh I'm also a Prometheusmaintainer um so if you like thatyeah and uh feel free to uh follow us onour blue sky handles that are listed onthe screen uh we'd love to connect withyou if you find this talk entertaininguh we're going to be talking about howDamon sets run in Kubernetes and how touse autoscaling to make them moreefficient and moreeffective um yeah so uh I show of handsget get a bit of a audienceparticipation going um how many peoplehave used a VPA a vertical podautoscaler okay so that's about maybemaybe a fifth of the audience somethinglike that and um how many people havehave used a demonset oh nearly everyone um how manypeople have used a VPA with a demon setyeah thatguy well uh that's what our talk isabout um we uh we're going to kind ofmotivate the the problem that we seewith this uh and then talk about what wedid to try and solve it and hopefullygive a demoand we're uh not we're the second lastthing between you and free beer so uhjust you know chill stay with us andwe'll take you on a journey oh I Yeahpicture okay demons demons demons arenot um coming from hell uh these uh whatwe mean by by demons here is is acomputer program that runs in thebackground um and apparently this comesfrom from Greek mythology uh the idea o��ow so um this mapping um you will haveheard about kinds and resources inkubernetesAnd um the kind is the one in themanifest resource is the one in the inthe API path here um they look differentum but they don't have to be um similarthey can be completely different word uhwords and for example this could be thejob and this could be elephants orsomething right there's no connectionbetween them so connection is done bydiscovery there's discovery the APIserver serves discovery and QC cuttlewill query that which um returns allresources with the kinds served underthe resources and lots of other metainformation and in kubernetus we callthat a rest mapping rest mapping is socube cut takes a manifest passes the ylum passes out the cube version kind sobatch v1 job singular and then restmapping happens and it has group versionresource and this is basically just thethe the words the terms here in the pathand it can do its request but even heresomething is happening and this is restmapping and then the request is sent tothe API serverso let's see what the API server does soour request is now making its way to thego STP server that is basicallylistening to and serving all um incomingSLP requests so this is rather a verysimplistic view of of what the STPserver does when it accepts our requestuh in step two of the figure you can seethat it creates a new HTTP requestobject uh that represents our request onthe server uh it has the requestparameters the headers and also the bodythat we provided and then it creates anew response writer object that can beused to construct an STP response itthen invokes the user provided handleruh with the request and the writer um soa handler is a user provided functionthat is responsible for uh responding tothe request and it uses the the writerobject to construct the HTTP response uma little segue into a go routine um so ago routine is basically a um lightweightuh thread managed by the go runtime uhthis allows a go application to haveconcurrently executing functions umso the question is like how does theHTTP server uh invoke the handler doesit invoke it on a new go routine uh theanswer depends on if you look at the uhstandard library implementation for HTTPversion one it executes the handler inthe same go routine for HTTP version twouh the SCP server executes the handlerin a new go routine umso the API server does not have like asingle monolithic handler function thatserves the entirety of the requestrather we have a chain of handlerfunctions each of which is responsiblefor like serving uh certain aspect ofthe request right and the AP serverbuilds the handler chain when it startsUm so this is a more comprehensive viewof the handler chain we have we haveomitted some handlers uh for simplicityum so the request object has a contextum and it has uh deadlines it has uhcancellation signals and also requestscoped values um so and a handlerfunction can attach value to the requestcontext this enables like a futurehandler um to access that value in orderto serve the requestu so when the first handler receives therequest uh the context is bare minimumit has no deadline it has a cancellationsignal uh prepared by the HTTP server uso that the server can uh abort therequest on any error um as the requestpasses through the handler chain uh thecontext gets populated with informationlike deadline user information sorequest information and auditevents so the first stop at the panicrecovery handler um it is responsiblefor managing panics thrown by a requesthandling code right um this figurebasically sketches out the path thatlet's try thisum the path that that a panicking uhrequest takes so we start at the uh umat the god server it invokes the thecomplete request handler um in a new goroutine and then at some point the panichandler panic recovery handler startsexecuting and then it takes the rest ofthe handler uh chain and then it invokesit in the same go routine and on thethis is the rightmost timeline and wecan see here the first panic this isbasically request handling codeunexpectedly unexpectedly panicking andas The p�anicking go routine unwinds thepanic recovery handler stops the panicrecovers the error it also captures thestack trace so we basically pinpoint theexact location where the panic was wherewhere the panic originated from and thenit logs the u error and the stack traceand then it also panics here and this isby design so that it can let the HTTPserver decide what to do with the panicso the HTTP serveru recovers from the panic it also logsand then it basically sends an errorreply um to the uh to the user so howdoes a request that panics on the serverwhat does the client observe and whatalso like how does the how does itmanifest on the server side right sowhat if the example request that we sentusing couple um it panicked on theserver um so if we used HTTP1 we wouldsee an error like end of file uh it'sbecause the handler didn't have a chanceto write any bytes to the response yetum we see unexpected end of file whenthe handler wrote some bytes to thewriter and then it panicked um but forHTTP2 uh we see a stream reset errorwith the internal error code now theseerrors are not exclusively meant tocommunicate a panic so they could be forother reasons too so how how do wedistinguish so we can look at the serverlogs which will tell you that therequest panicked it also prints the uhthe request path for you and on theaudit ID it also we can see here the thestack trace from the panic so we canpinpoint the the actual line that causedthe panic and the matching order eventit has um a reason timeout error and wecan see 500 so this tell us that thisrequest panicked uh on the serveri think it's you yeah the next part ofthe chain is um passing the request soin a logical way like um yeah understandthe URL the path of the URLunderstanding certain parameters um ofthe of the URL and the verbs inparticular and um this is a very commonstrct if you look into API server codeevery many many parts of the um oh Ilost the laser it seems no it's back soum this request info is in the contextand you can always just ask for it andyou have the the past um logicalstructure of a request here and um themost interesting bit is is probably theverb and um we use logical verbs inKubernetes so we have get list watch butthey actually are all get um verbs forfor HTTP the others map quite naturallyum but yeah those things are stored inthe request info and for our requesthere um at hand we have a create for ajob in the batch group and you you willsee the expected information so batch v1namespace is extracted it's fu it's jobsand it's a resource request so it's notsomething like metrics or healthy orsomething it's a real qu operation sothat's why resource request is true andyou might um yeah if you have writtenweb hooks for admission you will seesimilar values which come basically fromthisstrct um requests also have a lifespanand there are two kinds of requests umwe call them longunning watches are longrunning because um they run at leasthalf an hour or something or maybe evenlonger and if you do cube cut lock youwill uh and you follow the lock um itnever terminates at best right so it's along long request and watch proxy aretwo verbs which we consider long runningand we have some sub resources which weconsider longunning but this isbasically hardcoded in cube API serverso if you write another API serverbehind an aggregated one basically thisis what you have right those arelongunning everything else is is cut offeventually and Abu will show in a secondum how this cutting off works with howtimeouts work yeah so we have stopped atthe latency latency tracking handler soum ittracks the latency incurred by an inviterequest at various layers within the APIserver um and it's recorded as an annotas annotations in the audit uh event uhbut we only record it when the requestexceeds like a threshold of 500millisecond some of the fields theselists some of the audit annotations thatwe have today um and um I can go throughsome of them like for example the amountof time we spent in the authenticationand the authorization phase um we alsohave like how much time the requestsspent in th�e mutating and the validatingweb hooks um this one is veryinterestingcd it tracks the latency incurred at thestorage layer and usually a singlerequest may involve multiple round tripsto CD uh so this shows the total amountof time uh the request spends in thestorage layer we also have uh latencytracker for how long it took us toserialize the um the um uh the therequest and also how long it took towrite to to the response writer objectright so this could be a very helpful uhtool for debugging um like clusterissues for example uh if you have theaudit ID and if you are looking at ifthe customer is complaining about likeslow API you could actually look atthese um audit annotations and wouldgive you some clues as to what's goingon under the hood next the next one isthe HTTP log handler uh it prints itprints uh attributes of all the incomingrequests it's only available if you runthe API server at log level three ormore uh but the interesting thing is ifyou look at the uh the coupe cuddle logit prints the audit ID uh whichrepresents the request and if we takethe audit ID and grab for it in the APserver log we can find a matching STPlog entry which shows the uh the requestpath and also u the status code that itit returned and how much time it took toexecute so this also can be very um likea a tool that can aid in clusterdebugging okay now I our request hasmade it to the u the the dead um thedeadline handler u so this um this thishandler allocates a deadline tonon-longunning requests and is attachesit to the request context um so if ifthe client doesn't specify a timeout forthe request uh the handler will use adefault of 60 seconds today and acluster admin can optionally overridethis value uh through the requesttimeout um command line option of theAPI server umso what happens what if we wanted tospecify timeout for our request um sohere we can actually use the optionprovided by cook cuddle uh requesttimeout and we're saying that oh therequest should complete within 30seconds so the our handler willbasically use 30 seconds as the deadlinefor thisrequest okay so next is the um thetimeout handler so why do you need atimeout handlerso like the request handling code maynot respect the the the deadline in thecontext right so we need a timeout thetimeout hand is basically responsiblefor enforcing the timeout um so thisdiagram showsum in red it shows the path that arequest takes when it times out so ifyou go through the events and they'realready listed chronologically um so thethe godp server takes the completehandler chain and it executes in a newgo routine uh and that's the go routinea we see here and then the handler chainexecutes and then we have the timeouthandler here it takes the rest of thehandler chain and then it executes it ina separate go routine here and then itwaits for the the handler chain to toreturn for 30 seconds like it will waitat at most for 30 seconds so whathappens here this handler chain isexecuting and it's taking more than 30seconds that's D1 in in this diagramright so the timeout handler then willnot wait for it any further because thedeadline has exceeded and then itbasically prepares a timeout response uhfor for this request and then it returnsand then the go to the server takescontrol and then it returns the reply tothe user so that's how the timeouthandler works uh within the Coupe payserver so it is necessary becausesometimes the client doesn't have anytimeout and if we don't have a timeoutenforcer in the APA server the clientcould hang indefinitely and that's notwhat something we want um so what if wehad um what if we like our we had ourrequest to time out um so we can choosea timeout value that is um long enoughthat coupe cuddle will send the requestto the server but then it'll immediatelyuh abort with a context deadline errorso I was able to do that with a timeoutof you can see here six millisecondright um and then it's short enough thaton the server it will not be able tofinish in time and then if you look atthe server logs it we can see that umthe matching HTTP log entry shows thatit it it basical�ly timed out which is504 gateway timeout and then if you lookat the matching uh audit event it willtell you that the request timed out it's50504 so Like if you have an audit idsthat you debugging cluster with you canjust you know this information could beveryuseful so the next stop is the priorityand fairness uh handler so API pri andfairness is a self-defense mechanism uhof the API server it regulates the loadon the server and it uses like a um fairqueuing algorithm to schedule therequests so when our request enters theuh the filter the handler um we havehere the classifier basically it itfinds a matching flow for our requestand then it uses shuffleing to select aqueue and then the request gets put intothis queue and then the requestbasically goes into a waiting state andwe have an we have auler in the proudand fairness that asynchronously decideson the request if it decides if thedecision is accept the request basicallythe rest of the handler chain getsexecuted otherwise um like the requestwaits in the queue so the question ishow long do we allow the request to stayin the queue the answer today is 1/4 ofits allotted deadline and then once thedeadline exceeds um theuler will removethe request from the queue and it willsend a reject reply to the user andtoday the reject reply looks like itlike it has a status code of 429 andthen it'll have a um in the response itwill have a header called retry afterit'll have a value n this represents thenumber of seconds the the client shouldwait before it retriesokay so what if we get our request toget rejected by by PR and fairness so umthis is what we'll see in the log we cansee that oh manum in the response you can see is 4 to9which indicates too many requests and itsees a retry after header in theresponse which has one that means itshould wait 1 second before the nextretry and that's what cube cut logs hereand it keeps retrying and thenum if within the retry parameter if itsucceeds otherwise it fails right andthen we can there's a very interestingobservation here uh this serverprocessing milliseconds it's it shows1500 here so it's uh um sorry 15,000 soit is equal is equal to 15 second socould it be that the request waited inthe queue for 15 seconds because if younotice here when we did cook we didn'tspecify any timeout and the default is60cond today and 1/4 of of that is 15second so it could be that the requestwere in the queue for 15 seconds andthen it was rejected right so I think wecan be sure if you look at the auditevent for it if you go to the auditevent there is an audit annotation forthe latency tracker and you can see APFrequest timeout it is very close to 15second so this proves that our requestin the queue for 15 second and it it wasrejected by APFall right so we have passed the firstpart of the API server the handle chainon the left and now the actual um quadprocessing happened so now we haveattached stuff to the context we havevalidated we have authorized all thosethings but now we want to have a a jobright we want to create a job in thebatch API group so there's a multiplexerin front um it knows about all the pathsfor different resources among themthere's APIs batch v1 jobs and there's apipeline for the actual requestprocessing behind and for each of themthis thing um exists most of them arevery very similar so there's a genericregistry implementation it's called wesee it in a second and um one of thoseis is for jobs here that's the first oneso let's look at the at the pipeline forjobs um we create a v1 batch uh job andv1 is a g version and one could imaginethere's just v1 but in the past therewas a v1 beta one for example and umversions are everywhere here and weannotated the um the path of the requestin this picture and you will findsomething i mean v1 beta one is gonebecause we are GA for a long time butthere's still this int and int standsfor internal there's an internal versioninternal representation and uh it looksvery similar to v1 but it's an internalgo type it's another go type uh in theimplementation and um yeah you see thereare there are transitions fro�m v1 tointernal and if you call out toadmission for example it's translatedback to v1 and the same happens when youwrite to cd so lots of transitions andwhen v v2 is um int introduced at somesome point then there even moretransitions more conversions soconversions is important conversion is alossless transformation and they arejust about representation of the jobobject but basically just a differentshape it's the same data just umrepresented a bit differently um it'slossless and you can transform betweenall of them without losing any data umand you get back the same object whenyou go back um the transformation andthere's also one transformation going toSCD and we will see it in a second umwhich translates to the storage versionjobs are stored in V1 um but thisdoesn't have to be they can be verydifferent and eventually maring happensand we write to SCD and a job is aninternal a native resource we useprotobuff for that by default but it canbe JSON a C is stored in JSON forexample all right so this is a pictureand um another super important conceptyou find everywhere in Kubernetes and ifyou have written controllers for exampleyou will also encounter the scheme thescheme is basically this like this verycentral registration object and it knowsabout all conversions all go functionswhich convert it knows about the kindsgroup versions it connects them to thego type which is associated to that sobasically all the knowledge about thetype system is encoded in a scheme andattached to that is a codeex uhconstruct which knows how to unmarmarshall to JSON to protobuff and knowshow to callconversions all right so let's um lookat the first phase so we have therequest has a body um there's a binaryblob so the JSON here in our examplecoming from cucle um our context iscompletely filled we know the user weknow the audit event request infoeverything um we know from the headerit's JSON so we we call the codeex datastructure decode that thing and create ago object batch v1 job and um aftercreating that we call a defaulting umfunction which at some fields user ourourselves in this case haven't added tothe manifest right we didn't specify allof them so the defaulting make surethose things are are set that's thefirst phase now we have a go object a v1go object but um you have seen herethere's a internal type so we have to toconvert to internal so there's aconverter converter is registered in thescheme so it's called and we have theinternal object and if you're interestedinto the code here there's a packageit's deep in the cube uh repository it'snot exposed in the API package but itlooks very similar to the v1 and then wehave a concept called registry this isreally the the logic of cut what isimplemented here create delete updateall those um uh methods and there's astrategy for a job and um the author ofthe job resource can add things theywant to do um during create so it'sanother defaulting phase and if you knowfeature gates uh you can enable featuregates and then certain things can be setif they are disabled um you can't setthem right you can give values but theyjust dropped um during processing andthese kind of things happen here in theum prepare for create here in theexample I looked into the job resourceand um yeah it set some fields so thepot failure policy is a new thing itseems this is um set to the defaultvaluehere then um validation admissionhappens Um everybody of you will haveexperienced that um there's mutatingadmissions so you can add web hooksthere's a admission chain here and youcan add web hooks you can add cellpolicies there's an alpha feature forthat and of course it can go wrong thenyou get this 4 to2 unprocessable entityfor all of them actually then there's anAPI validate phase this is go code againin the strategy here in the createstrategy um it it checks like the namethat it's proper um as expected and thefields have the right values and thenadmission is called again and admissionum might call your web hook does otherthings checking for namespaces forexample um checks the CL rules and thenwe are basically um yeah we are readywith a with a validatedobject and now we want to store now theinteresting bit comes we want to createan object so this means write tod sowhat happens here we have the batchobject and we translate that to to a keyfor for sd is a key value store so thekey is registry jobs the name space uhpi is the name of the object thenthere's another registry functionprepare object for storage super boringit just overrides the um it wipes theresource version because the resourceversion is stored in a different waythen we call the codeex again so thistime reversely we encode so we convertour internal object to v1 and um then wehave a go object and then marsh runninghappens because um we want a JSON objectbecause the accept header says JSON umoh no sorry um I'm too too fast we arestoring here in Etsy so it's about thestorage version and we know jobs are umstored in Porto so we get Portobuff hereum KMS encryption is the last step abinary transformation so if you want toencrypt this happens here and then theSCD client is called with an optimoptimistic put and optimistic meanseverything we done before like all the20 minutes before they were optimisticlike we hoped that the push will succeedand as you know when you create you youmight get an already exist and this isan error that client returns when thekey already exists and um if it existsit's translated in this 40 409 error andreturned to the clientum we are nearly finished so um isupdated but we want to tell somethingabout the success to the cube cuttle sowe take again the internal object herethe the batch um job the go object weconvert it back to the request versionwhich might differ from the storageversion here it doesn't it's suppose v1we encode according to the to the headerhere there's accept header attached tothe request we encode we write it to theresponse um writer and basically yeahthe pipeline is done with that right anduh we are nearly finished with the APIserver processing of this requestyes and As these steps are beingexecuted uh the the audit event in thecontext are being updated as as um andthen it's time for the request to leavebut not just yet um because the handlechain are already in the stack so therequest goes through the flight thehandle chain but in reverse order and atthis point the audit handler has achance to persist the audit entry thecompleted audit entry um using the backend and usually it's a it's a log filealso the STP log handler has a chance toto flush the uh the the STB log entry tothe to the log um and so at this pointthe handler chain is complete it returnsand then the HTTP server resumes controland it sends a reply to couple sodepending on the reply from the servercouple may have more steps to do uh forexample if the coup if coupddle receivesa network error um and the request isreadon it may retry the request or ifcouple receives a response and theresponse status code is 429 or 5xx andit has a response header uh named retryafter it will retry um so client gobasically is the official client fortalking to Kubernetes cluster and it hasthis built-in um retry lo logic builtinto it and if we build our applicationis doing client go our our applicationwill automatically retry on our behalfum so in our case the request wassuccessful and uh the kuddle completedthe the the request and we see aresponse that you know the job wascreated i think yeah and with that wehave oops yeah passed our hopefullypassed our interview question and uh Ihope some of you have learned somethingabout what happens really when a cubecutle is executed cube cutle create inthis case um of course every boxdeserves its talk in its own rightthere's lots of stuff behind um so ifyou have questions um of course ask themI think we are out of time I'm not surewhether we have time um time to the nexttalk um yeah we are here uh otherwise asyeah so the usual channels um there isthis code here so if you have feedbackum this goes to the scat page pleasegive feedback bad or or good um anythingyou want to tell us um yeah that's all Iwanted to tell thank you thank[Applause]2025-04-15 22:02:18.679552�sibilities is topro provide reusable tooling reusableservices infrastructure components toother engineering teams the goal is thatengineering teams can independently spinup new infrastructure resources ondemand with little to no interactionfrom the platform engineering teamthat's the goal so we want to scaleourselves out of our job to ultimatelymaximize efficiency within our workflowsand not linearly scale our platformengineering team alongside of ourworkloads do more with less rightto make this happen however our toolinghas to evolve as well not just ourprocesses not just the way that westructure our teams so our tooling hasto become a lot more autonomous and relya lot less on human interaction to forexample fix bugs in a manualway now we see this kind of trend inother areas in the cloud native spacesuch as for example in the GitOps spaceif you've been here yesterday at AOConyou might have learned more about ArgoCD as a GitOps tool we also see it a newinfrastructure as code tooling that youcan use from within your clusters suchas crossplane that allow more or lessdeterministic setup of all yourinfrastructureresources now this all sounds amazing tobasically build zero touch environmentsthat just scale yourself out of the jobso you can focus on your long-term goalsbut it's obviously not that easyotherwise we wouldn't be talking aboutit there are lots of challenges that wehave to overcome and one of them is thatas a platform team as we are choosingthe technologies that we want toimplement in our infrastructure weultimately make other teams dependent onus because we are going to be the mostknowledgeable about those infrastructurecomponents meaning if specificcomponents within a specific environmentone of your maybe few hundred clustersbreaks uh it will either be very timeconsuming for other engineering teams toidentify the root cause or they willhave to pull in people from yourplatform engineering team to helpout additionally we can all agree onthat this environment that we operate inis highly complex there are lots ofmoving parts and it's really difficultto stay up to date so you might not havemultiple people within across your teamsor within your platform team who arevery knowledgeable about a component itmight even be set up by a singleindividual meaning if that person leavestheir knowledge their tacet knowledgeand experience over time is notnecessarily transferred to other peoplethey might leave some documentation somerunbooks but who knows where they areand how to use them and in whatsituations they would be applicable andthat can result in long-term tech depthfor your organization and you might evenend up operating your once ideal zerotouch environment as a black box untilthings just start falling apart and youhave to pull in an external consultancyto fix things now that can also thenresult in organizational challenges oneof them is how do we stay compliantwithin our infrastructure how do weprovideum consistent operations to all of ourdistributed teams another issue is howif we operating our infrastructure kindof as a black box or half of the peoplewho are operating the stack don't knowmost of the components how do we ensurethat our mission critical applicationsare reliable um and available now thisis exactly where Kate's GBT comes inimagine having an expert Kubernetes SREon call 24/7 that's KBT it was startedin 2023 then donated to the CNCF and isa CNCF sandbox project you know althoughit's completely community there areseveral households names who are usingit and have contributed to this talk howdoes Kate's GBT work it's basically atechnology that allows you to scan yourKubernetes cluster using AI enhancedanalysis on codified SRE knowledge todebug and triage issues across yourenvironments in simple English once theyhas done it you will be provided withwith measured guidance on how to fixthose issues it's basically yourdependable sidekick operating acrossyour environments to identify the healthof them all the way from knowing what isactually running across yourenvironments how does it work what couldbreak in thos�e components in your customproviders that you might be running allthe way to remediation that's where KGBTcomes in now let's look at KGBTsuperpowers kgbt comes in a number ofdifferent formats you can either usethis CLI the operator in server mode orinteractive mode wherever you arewhenever you are we let you use case GBTto identify your cluster health it alsocomes with 11 different interoperable AIbackends meaning you can choose betweenhugging face open eye bedrock or evenbring your own models whatever providesyou with your most accurate data in thesame spirit you can also build your owncustom analyzers now analyzers in caseGBT adapt component that tell the toolwhat is a pot what is a deployment howdoes it work how can those go wrong orin your specific environment what arethe components that are unique to youand your setup that's what you builtinto custom analyzers we have several ofour um enterprise partners using customanalyzers to build this privateknowledge into their and integrated intotheir infrastructure their machinestheir datacenters kbt comes also readilyintegrated with Prometheus so you canlook at the KGBT metrics alongside allof your other metrics in the samedashboard and correlate them now onceyou scan your cluster the results aresaved as CDs in the cluster so you caninteract with them in a similar way onhow you would scrape other informationsfrom the cluster and integrate it intoyour externaltooling we also understand that for manyorganizations privacy is very importantso you can anonymize the data that yousend to the LLM and last but not leastKBT is multimodal meaning you can eitheroperate a single tenency multi-tenencyacross one clusters or multiple thechoice is up to youthank you Anise for giving such awonderful rundown i think that you didreally good justice to some of the keyfeatures that people are excited aboutthe most now we've spoken about KTC verymuch as sort of an observability add-onnice to have CLI you run you get someresults great i think today is wherewe're really going to shift thephilosophy uh into another directionwhich is where we're introducing autoremediation so we use guard rails toeffectively look at the patching deltathat an AI may suggest based on a resultand I'll explain about that a little bita little bit later but what's sointeresting about this change of courseis that we've gone from effectively aninformation observability tool to now avery tight loop reconciliation in termsof here's an error i think I know how tofix it based on your organizationalknowledge and to my knowledge at leastthis is being done very infrequently andit often involves quite a large set ofdifferent points in the pipeline uh youmay well have different technologiesthat bring these together and never hasit been so simple as a as a as a onelinecommand to run this and get it workingso we're really excited about autoremediation and what that means for thefuture um and I just heard some reallygood points about tacet knowledge andit's something that I keep harping on tobut I'm sure many of you will haveexperiences of being the one person whoknows how these three things in sequencework you know hey you know how the fivedifferent charts all work together imean in some companies you see uh chartspaghetti or you have an OCI deployingcharts that does OCI and you know it'sit's crazy and so imagine a new startertrying to join a company you know to theto the point of um knowledge being thisunraveling ball exponentially withtechnology complexity this is where wecan start to democratize that and Ithink we've got all the raw materialshere for something very special so I'mgoing to talk you through sort of thehigh level design i've got sort of amilitary grade um laser pointer here soI'll do my best to to use it kgbteffectively in the operator mode looks abit like this right so today if you wereto deploy this into a cluster as you maywell do after this talk you'll get anoperator nothing exciting there reallybut that operator is really thelifeblood the heartbeat of keeping theother components working i've simplifiedthis to b�e sort of process right we'renot getting too much into the weeds ofthe technology but what happens is theoperator runs it deploys out a Kgdeployment the reason that pattern waschosen is because some of our customersand community use multi-tenency youmight have v-cluster multi-name spacesper per tenant however we find that it'sreally useful as well for being able tostart to decouple some of the steps andasynchronous activities so you'vedeployed out your KGP it's a bit likethe CLI effectively it has analyzersbuilt in you could even hook it intocustom analyzers you might even probeanother cluster elsewhere you can buildan aggregate of data what then comesback is that you build up this pictureof results the results as we're going togo into the demo in a moment show youeffectively the output of your inferenceAPI that could be local AI that could bein bedrock that could be uh in huggingface as we mentioned that could beanywhere you like what's important aboutthat is that you've codified that issueagainst that solution right so when apod of this name of that deployment it'shashed right so you effectively have a akey value lookup for that particularsolution which is going to be powerfulas I'll go on to describe so whathappens at this point with the new autoremediation workflow and for folks whoaren't familiar with this generally isthat the results then get inspected fortheir credibility we look at the resultsand say is that actually an issue or isthat just something that you think is anissue and so we built a series ofsafeguard kind of guard rails around theum auto remediation dotted line there wecalculateapplicability let's say for example Ihave a result that says this image ismissing i can then inspect through myknowledge base through my my rag and mymy data store if that image is somethingthat my company knows about and if it isI can go on and apply the right fix inthis particular case it could be thatyou know what that that's valid you knowthere's there's an incorrectly namedresource so what I'm going to go on anddo is I'm going to build out what'scalled a resource tree to figure outwhat do I need to do to fix this issueso I mentioned it's a bit like a GitHubsorry a git patch where effectivelyyou've got the delta of change that'swhat gets stored in this immutable logthe mutations log will get created whenyou decide yeah I'm going to turn autoremediation on i'm going to give it aspin the mutation is another customresource that keeps effectively achronological um account of what wasyour resource what was the suggestedchange what was the applied change whatwas the result and so the way that themutation works is once the mutation'sapplied it will then determine successby looking at the result which will thenget reprobed from the analyzer if it'ssuccessful the mutation is successfullyapplied we use a few different tricks tobe sensible around here you know weactually make sure that there's a um arelatively low leavenstein distance soin terms of like the patching differencewe also do a few other bits and piecesbut it's super important that peoplebuild confidence in this this isn't thewild west of hey I'm just going to throwthings into an AI see what garbage itcomes back with we're trying to applythis in a very measured and a verycareful way and I'll go on to show youthismomentarily we think that this patternis so profound and so interesting thatwe're starting to see it change the waythat enterprise customers contributorsand collaborators are talking to usthey're starting to say "Hey Alex we'vegot this really large data lake of um oftextual and ticket information we'regoing to run it through our embeddingsuh AI and we're going to spit it outinto this vector DB we want KCBT toconsume that to get smarter and at thebeginning I was like uh you know thisfeels like it's kind of going outside ofthe area of my CNCF boundaries here butactually what I realized is that isexactly how people want tooling to beused it needs to be interoperable and sotherefore we found that this new idiomis becoming very popular what'shappening in a few of the� teams thatI've been working with is they'redescribing this kind of workflow andthis is a flywheel right this is yourtraditional kind of S sur workflow whereyou have a ticket or an alarm or anincident or a page or you know thebeacons of gondor whatever that might bethat will then tell you to wake up at3:00 a.m you typically check yourobservability systems you see thatthere's an issue there's a fault there'ssomething to inspect you mitigate thatyou stop the bleeding right you fix thatthing it could be in this particularcase that something needed restartingsomething ran out of resources who knowsyou then update that mitigation thatroot cause analysis into the ticket intothe correspondence the comment whateverthat data sources you know typically umyou just just anecdotally actuallysomebody showed me that they theythey're big Jira users they update theirConfluence say they link it againstticket it's really great they they'vegot a really nice little workflow therebut what your SR probably doesn't knowis that some companies are starting toexport that data they're using that as Isaid they're bringing that into aknowledge base so they're using a ragpotentially and they are starting toharvest that by embedding that data intoa foundational model bedrock whateveryou know it could be um anything elseand then they're pulling that backthrough into KGBT so what is the what isthe implication of this the implicationis really profound because if I'm afrontline s I'm dealing with stuff everytime I write to this it's making thisbit here get more accurate right so Ican say okay well you know I picked onenvoy earlier that envoy image ismissing go to this version we know it'scompatible with this cluster xyz and sotherefore the solution gets a bitsmarter but not only that it's not justthe manual remediation that gets smarterit's then if we go back to the previousslide it's also the auto remediationgets smarter as Well because the autoremediation again if you're talking tothe same inference API that's probablydrawing on the same uh data lakes andknowledge bases it's going to makesmarter decisions it's going to startsaying "Ah I know how to fix thatbecause my company tells me in this bigdata lake that I got this stuff to do."So we think this is really exciting andwe think that is really changing thegame so let's just do a quick littledemonstration i'm a real advocate of uhlive demos so let's go for it um itmight be a little bit small at the backso I'm going to bring it up a fewnotches so we're going to just grab thisoff the uh off the internet and uh theconference Wi-Fi might be bad might begood and if you're feeling brave and youwant to play along all you need is anAWS um secret key and access key in yourZush profile and oops let's just deletethis because here's one I made earlierand this is a real tech this is realproof actually that it's it's a livedemo so let's do it again there you gocompletely promise you it's a live demoso there we go this takes about two anda half minutes so while the demo isrunning I'm going to jump in and out ofout of resources i'm going to talk toyou about it just describe what's goingon so you've seen there's a fewrequisites Helmkind to a lot of you I'm sure um whatI'm going to do now is I'm going to talkyou through as it all spins up veryannoyingly though my uh my screen herehas disappeared so I'm basing it off theone down there so what does all thisstuff all this stuff doesn't look good idon't like stuff going in my clusterwell let me talk you through it uh thetop one is an optional cache you canturn that on and off in Helm it's just acache to keep your prompts saved andyour responses it seemed like commonsense to me that you want to hit the APIas little as possible um there's alsosome inherent benefits from the cachewhen we start to uh check again our ownhomework but we'll talk about that uh inanother session below that you've gotthe operator as I described it's notparticularly exciting although it is thelifeblood of everything goes on and hereis our broken deployment some very verynaughty person on a Friday ha�s calledthis engine xxx and I have no idea whyright um it's a simple example but Iwanted to give you make sure we're allon the same page so what else ishappening is the KGBT deployment hasbeen triggered so what does that meanwell there isa KGBT config you know we're used todealing with this kind of stuff so whatdo we see oh laser pointer what do wesee here ah cool order order mediationtrue resources these ones similarityrequirement 25% that is a baseline ofsimilarity that's pretty broad you knowI could set that to 95% so there's onlya 5% delta of change and then apart fromthat you know I've got the usualgovernins of things that I've turned onand off i goback i'm going to say oh well thatmissing deployment's still there demo'sfailed obviously well let's take a lookat the mutation the mutation in progresssimilarity score 27 so it's hit thesimilarity score requirement it'sapplying a delta of change but I toldyou there are those guard rails earlieron right we only apply stuff and keep itin cluster and we can then prove thatit's fixed the result so the result hasto actually go away for this then tothen work so what's all this stuff meanwell you can see now a new replica sethas come up and in fact it's not justnew replica set it's a new deploymentand there's some inherent benefits tothis because it allows us to compare andcontrast we're saving this within thethe memory banks of KGBT now what'sreally cool there is oh cool i've nowgot a working deployment i mean as I sayyou can download this you can check itout it fixes it it's a simple one as anexample but we go down into here and Icompare andcontrast got our fixed one i'll justapply and we've got a broken one and youcansee when you compare the difference it'schanged now if I go back to my results Idon't think I even had time to show youthis one we can look here andseehey what's this find more info in zeroticket cube 1 2 3 right completelyfictitious team but you get the pointright we're able to find out where thisdata comes from so therefore this is whywe're getting really excited so thankyou so much and I'm very grateful to thedemo gods that we had a pretty goodpretty good run just then here you goamazing thank you so now that we've seenhow auto remediation works what KGBDdoes in your cluster let's look aheadand what does it actually mean for theway that we operate with each other andwithin our teams as well as how weoperate with our tooling in the future isaw lots more people come in while wewere giving the demo and while we werespeaking so I just want to iterate ourgoal is to run basically zero-ouchenvironments we should be able to run 50100 several hundred clusters with littleto no human interaction with thoseenvironments now most of our open sourcetooling in the cloud native space is notset up to never fail there will become apoint where you have to go manually intoenvironments identify whether the erroris actually correlated to something thathas to be fixed manually or whetherthere is maybe a more automated uh moremore natural fix to it that willeventually show up for example thereused to be an issue in Argo CD wherebasically your application deploymentswould just hang in a sync loop it wouldjust be there and not update even ifyou've deployed a new version and Argois just not pulling that into theenvironment now a quick fix is to gointo the cluster and just delete thatapplication deployment so Argo CD canpull in the healthy version and fix ityou don't want to do something like thatacross several hundred clusters right sowhat you do is you might write a scriptyou might create your own operator forthat specific problem to address thatissue that you identified and thatotherwise would have to be fixedmanuallynow as we scale our infrastructure theneed for automated management fromtooling such as Argo such as crossplaneincreases and similarly the need for anauto S3 on call 24/7 increases as wellwe need some something someone who canbe right in those environments who hasthe knowledge from previous issues codedin it and can take action right then andthat's where KGBT comes in so Alexmentioned earlier the knowledge base therack that you can set up that your AIback end is pulling data from imaginethat's similar to having um an open bookversus a closed book exam you can't uhexpect the AI model to know what to doin your specific setup without providingthis additional information that youidentified as SRES as you identified thethe situations and different issues andhow to remediate them so once you havethat knowledge you provide it to the AIthrough the knowledge base so it canthen take long-term actions that youwould otherwise take manually now whilea lot of organizations uh promotereplacing junior and mid-level engineerswith AI tooling uh we think differentlyabout it making the gap between our uhthe knowledge that we share with withpeople in our organizations as well asthat we codify um into our toolingreducing that gap means that as a junioror mid-level engineer SRE is looking atthat broken uh cluster at that brokenworkload it can work together with KGBTand use that additional knowledge thatKGBT is providing to link to otherresources to make informed decision orthat S3 might not be called in the firstplace because KGBT can auto remediatethat issuethank you and I imagine there are a fewof you in the crowd who are doing acolbo right now being like well that'sgreat but my pattern is GitOps and I andI I don't work in this method of justpush and apply to clusters and so yeahknow I thought of that too and so Iwanted to share with you a little bitabout the road map this year and thevery next thing and I promise the verynext thing after parents evening tonightis I'm going to sit down and I'm goingto release the branch for autoremediation GitOps support so there's afew caveats to GitOps right i I wouldneed to give you a point of origin formy repositories but it's entirelypossible the idea here would be thatrather than apply the fix directly wewould track it against the PR watch thePR life cycle go through and then theresult would complete after that and Ijust mentioned we have 11 AI backendsbut the truth of that is each backendhas n models right um you know some ofthem have 30 plus models it is anightmare supporting it the interopoperability is horrid so MCP is a reallynatural next step for us um and that'ssomething we'll be looking to get out insort of the third quarter of2025 after that things get reallyinteresting and I touched upon itearlier and you might have seen thatsimilarity score uh I think I showed youin the clusterI think we can do better i think we canlook at that delta that patch run itthrough a separate uh highly specificagent for detecting changes in the YAMLitself to tell me whether it's saferight so if an annotation changes verysubtly that's not that's you've got todetermine is that safe right and so thatwould obviously need its own context butI'm committing to getting that out bythis year because I think really peopleneed those guardrails to really feellike they can actually turn on orderremediation because there's a lot ofcaveats around that and so this is areally important point for us andfinally we're going to build outadditional core analyzers for Kubernetesuh resources and APIs we don't fullysupport yet like gateway APIetc so what I asking you we're acompletely communityrun project we'vegot maybe like three six about nine corecontributors you could be one the stackis uh it's Golang it's Rust it's lots ofreally interesting modular interoperablecomponents just using it is reallyimportant you know just trying you don'tneed to be a coder helping us with ourdocs being able to say you know whatAlex this sucks and you could haveexpressed that a bit differently wouldbeinevit I really believe in putting yourmoney where your mouth is as well likego try it out there's a little URL andthat just gives you like the helm andkind version you can go onto our websiteyou can download it with Brew check itout tell us what works for you and I'mexcited about the next time we come backto CubeCon hearing about everythingyou've been doing withKGBT thank you so much thank you2025-04-15 22:02:19.141541 aa��~�y#��3AEXtCejkOJB0hi everyone welcome thank you so muchfor joining us today it's a realpleasure to be in London um too manyCubeCons have been in other wonderfulcountries but it's great to be here ifyou're joining us from overseas go tothe barbec go to the South Bank go getsome beer it's a great place to to beespecially in this lovely weather butI'm delighted you've joined us in thisdark cold room to talk about KGBT it's areal testament to your effort so thankyou very much for that um we're reallyexcited to talk to you about KCT todaybecause there's going to be a portion ofthe room that's never heard of itthere'll be a segment of the room thatmight have heard of it and there'll be afew people that have probably used it ina bit of anger what you probably don'tknow is there are a bunch of companiesassisting us behind the scenes to makeit better and better on prem in thecloud for single operators and forentire teams and we believe that it'shaving a very subtle yet profound impacton the ecosystem for platformengineering and beyond so let meintroduce myself uh my name is AlexJones i'm a a principal engineer at AWSbut I'm speaking about my u my mybeloved project that I started two yearsago out of frustration of being an S surwanting to effectively codify go testsinto slightly more integration-y testsand that eventually evolved into a setof a suite of behavioral tests thateventually was the uh the the start ofcase GBTgood afternoon everyone my name isAnesles i'm a platform engineer at theaccelerator at JP Morgan Chase here inLondon now anything I'm sharing withinthis presentation has nothing to do withmy employer or my day-to-day work to getstarted I want to ask you a fewquestions who here is using cloudnativeopen source technologies within theirstack hands up okay most of you I assumei hope so if you're at CubeCon um nowwhat technologies are you most excitedabout just scream kgbwhat else envoy Prometheus Kuberneteswhat are you excited about tell meokay i hope at the end of thispresentation you will show the next timeKate's GBT now who here last questionwho here is working or has experienceworking in a highly constrainedregulated environment an industry whereyou have to ensure that youare abiding by the the regulationswithin that industry okay several coolum now a lot of times when we asengineers identify new toolings that canhelp us within our job within ourday-to-day make our lives easier it canbe very difficult to convey thatinformation the value of those uh to thecompany to the organizationalgoals here are someof organization goals that you mightwant to work towards um that mostcompanies are working towards which isthat we want to reduce the time tomarket we want to maximize developerdeveloper efficiency and we want tomaximize cost savings now our toolingthe tooling that we're choosing have tocontribute to these goals on aday-to-day as well as long term some ofthese goals have been achieved by othercompanies by restructuring theirexisting processes as well as the waythey structure their teams so manycompanies have moved to a centralizedplatform engineering team within yourcompany it might be called a DevOps teamit might also be part of an SRE teamgenerally their respon��onalrequirements to like authentication umit shouldn't have to make these problemshard the hard part is knowing like whichof the many features these API gatewaysoffer you should actually turn on likewhich is going to provide you the hardthe highest ROI right now based on theproblems you're facing um so whether youfound yourself in this story before umor you've picked out an API gateway andyou're super excited to turn on all thefeatures only to find out that you havebeen dragged into some kind of chaoslike this the whole point of what I'mtrying to do here today is say that youcan and should that there can and shouldbe a way for you to help yourself findwhat that next big thing is um even ifthat means not turning on some of thosefeatures that you were so very excitedabout to beginwith so is this matrix right for you umin that story I mentioned a couple ofpersonas which I've uh reflected here aswell so obviously if you are using anAPI gateway and you are a consumer andyou fit one of these personas uh theanswer is I hope so and I want you towalk away from today feeling inspired byby this idea that you can selfassess thestatus of your API gateway based on someuh capability threads and dimensions umand that you know where you can maybefind the next best ROI uh for yourplatform if you are a builder of an APIgateway and there might be a few of youout there um I'm I would like this toalso reflect in that you could use thisto think about your user journey and theergonomics around helping your customersset up certain features of the APIgateway how can you use your product tolead them into better things withoutadding complexity um and either waywhether you are one of these personasyou're a consumer you're a builder I'dlove for this to start up a biggerconversation around contribution to thisidea and we'll talk about that a littlebit later so why should you care i meanAPI gateways are the just the front doorto your platform and your business youknow it's just like the path of ingressto everything so it's it's prettyimportant um but what I'm gonna say nowand I'm going to repeat it a coupletimes is that it's uh you don't need tobe at the top of the maturity mountainfor everything the end goal here isn'tto say you need to have turned on allthe features you need to be maximallyinvested in your API gateway it's aboutevaluating the landscape of capabilitiesahead of you and comparing that to whatyour company actually needs right nowand knowing whether you need to solveproblem A uh tomorrow because you needto prioritize problem Z todayso uh in terms of what I want you totake away from uh from today's talk isthat I want you to understand and feeluh inspired feel the value of doing aself assessment of your API gateway umas one piece of technology within yourlarger platform and a way for you toexplore um what you could do next uh andlean on your team to help you identifythe right time and place to do so um andagain I would hope that it means thatpeople are interested in contributing tomake this thing better because it's nota a means of judging your implementationor saying you're doing things wrong i'mnot the person to be doing that um but Ithink that what I would like is thatmaybe there's a community that getsgrown up around this idea and uh we candiscover some of these things togetherso you might think that like I carebecause I work for Enro and we'rebuilding an API gateway and our our uhinfrastructure engineers are dog foodingan API gateway but the story started acouple years ago for me um I was inDetroit uh sitting with a couple of uhCTOs and tech leaders from CNCF'senduser community um and they were likeif you don't know anything about me youmight think like oh maybe I was there tocontribute to the conversation right uhno I was there to take notes and listento what they said and being a writerturned that into a report for CNCF abouthow uh these folks thought about CNCF'scloudnative maturitymodel so if you're not familiar withthat we'll take another step back uhthis is a a model started a couple yearsago uh based on uh these five levels �ofbuild operate scale improve and adaptand they have four dimensions um buildis kind of like you have a baselineimplementation operate is that you'vemoved into uh production scale you'reyou know playing around with automationgithopsCI/CD improve is that you're definingsecurity and policy and governancegovernance across your environment andadapt being that you are kind of futurepproofing your cloudnative uh uhimplementation and revisiting earlierdecisions to make it better so what Iwanted to do was build off of that ideaum like back in Detroit these leadersthey loved having this roadmap for howto implement cloudnative technology butit's also kind of a bigger picture likeit's about the culture of cloudnativeit's about uh things that are notspecific technologies and that was whereI started to think like could we have asimilar thing that's focused just on thetactics and capabilities andrequirements that an API gateway uh issupposed to give us so like if the CNCFuh model gives us the forest I wouldlike to focus on one specific treewithinit the idea of maturity I think probablya lot of you in the audience have likeyour own uh definition of what thatmeans it's maybe not a a a great thingin your mind um so it's not about howold a product is or how long you've beenusing it uh it's not about like whetheryou've written things down or not um orhow many features you've been using i'dlike to think about maturity as yourjourney of adoption um whether thesethings are the things that you havebuilt are flexible enough to deal withnew problems as you face them and againlike how good those ergonomics are asyou turn things on and I think it can bea really uh maturity can be a consciouseffort to get you a lot of really goodthings if you again have that path to beable to exploreit in terms of how I've been definingmaturity uh for the sake of building outthis matrix this model is that asadoption of an API gateway maturesengineers are shifting away frombuilding API gateways for themselves tosolve their own problems and they wantto turn that toward enabling otherswithout giving up that control that theythat they need because it is that frontdoor and because there's it's socriticalso I was going to do some sort of likefancy slido interactive polling uh tosee where people think they stand onthis but I wanted to do something a bitmore human and I wonder if people arewilling to raise their hands based onlike are you sitting here thinking youknow I think there's something in frontof my APIs but I don't really know ifthere's what thatis no nobody everyone's feeling Oh I seeone back there lovely i'm building onthe API gateway involves lots of ticketsand praying someone still remembers uhhow it all works anybody feel like thatresonates with them you got a couple andwho's here is feeling really confidentthat like I'm just here because I knowI'm rock solid uh and want to see whereeveryone else isstruggling nobody seems to be thatconfident either that's good we likesomemodesty so on to the matrix itself um asyou can see here I've built upon thefive levels of uh that the cloudnativematurity model had started with but I'mtaking an API gateway specific lens onthat um so in build again you're inpre-production maybe you have some sortof basic ingress uh a reverse proxymaybe you're using both uh you know alot of things are manually configured uhthere's a sort of mixand match uhsituation happening here when you moveinto operate you're trying to patch someof those holes you're trying to makesure that uh when new services aredeployed that happens on the API gatewayor behind the API gateway as opposed toletting people choose their own path asyou get into scale obviously you want tomake this thing bigger you want to makeit operate really smoothly but you don'twant to uh deal with all the foot gunsthat often come with that as you getinto improve like everything becomesimportant uh you things that you thoughtweren't important before like you needto figure them out right now becauseyou're operating at a scale that uh thatmaybe surprises youand finally in adapt is thi�s question ofhow do we keep innovating withoutoverengineering um how do you get tothat kind of bad connotation of maturitythat so many people are afraid ofsomething being crusty something beingladen with technical debt um how do youadapt to new problems but not get tothat kind of difficult phase of beingable to break freeso from here my model turns uh into uhthese five dimensions um and what I hadshown before about CNCF's model is thatthey have dimensions like people andtechnology and business outcomes um Ithink what I wanted to do here wasbecause an API gateway is a piece oftechnology it's a part of a much biggerequation i didn't want to try to tie anAPI gateway to broader culturalimprovements i think it's more importantto keep it focused on like what are thethings that it can provide you on atechnical level um so we're going to runthrough uh these five things througheach of the five levels and you can seehow uh this certain capability uhevolves overtime so let's jump into trafficmanagement now I'm going to apologizeactually this the text is bigger than Ithought it was going to be but I'm goingto apologize once for the amount of texton these slides uh it turns out thatmaking a maturity model involves a lotof writing of things uh and reflectingthat in the slides is is a little trickyum I'm also going to share with youlater on how you can get access to allthis information on a GitHub repositoryso you don't feel like you necessarilyhave to like take pictures of each slideum but in terms of this slide at thevery beginning you might as I saidbefore you might not even have an APIgateway um or or you might be you knowusing that mix and match of reverseproxy API gateway or even like portforwarding to get certain things onlineand exposed to the internet um whateveryour method everything starts static andthen as you progress through theselevels everything becomes more dynamicmore about enabling developers to beable to ship under your API gateway uhand then into adapt you start getting tosome very fancy things that uh that Ithink we would all love to have but uhfeel like they are so far away from fromour realityand I wanted to show off what this mightlook like at a sort of Acme Corp uh likenot a real thing but uh you know I thinkillustrations help a lot and mydaughters helped me they love Pokémonand so they helped me pick out uh thisuh this radar as a way to uh to show offthe growth of certain dimensions um soyou can see here that this twopersonthing is like two technical co-foundersthey're just shipping code they're notconcerned about you know complex CI/CDit's just easier for them to work intight collaboration but you can see thata pretty common pattern that like youknow I might myself and people that I'vetalked to about this have recognized isthat it's pretty it's pretty common togo deep on something like trafficmanagement um and maybe authenticationas well uh before you spend any timeworrying about something like developerexperience or especially governancewhich I'll talk about in a little bitauthentication andsecurity um again like one of the thingsthat I wanted to point out about thisbefore I talk about like this specificuh growth of capability is that I havethese a generalized statement about thevalue that an API gateway might beproviding at a certain level and thensome examples of what those might be youmight find that you are are implementinga certain example capability like maybeyou do geob blocking because your APIgateway makes that super easy for you uhbut you're not using something on theoperate level so you kind of have thismix and match of of capabilities so I'dsay in terms of you doing a selfassessment I would always say to go backto that generalized statement and saylike am I solving this problem in thisway am I creating value in a certain wayum and that's helpful to help you mapwhere you might be uh in the greaterlandscape for authenticationspecifically your journey starts with uhwith a lot of essential O services beinguh or authentication methods beingembedded directly into your services uhwhich in equates �to a lot ofinconsistency um a lot of like worryingabout security issues um and trying tomaintain all of them um and being thatsort of platform engineeringinfrastructure engineering role that'suh that's a lot of extra burden onyou as you mature you are increasinglyusing the API gateway as this kind ofsingle place for centralized managementmaybe of zero trust fundamentals lovethrowing some buzzwords in there um sothat you can protect your APIs no matteruh who's building them or who isdeploying them whether that's adeveloper doing self-service or that'san infrastructure team doing that forthem and when you standardize thesecurity uh on your API gateway andeverybody's uh pushing their servicesbehind that uh you end up becoming not abottleneck but an accelerator for thefolks uh for the developer folks whoneed to uh work quickly and and not getblocked into observability and debugginguh in in this case I would say the theAPI gateway really starts as a datasource um but quickly becomes the go-toplace for debugging actions right likeif all external services or all externalrequests are coming in through the APIgateway then it makes a lot of sense tolook to the API gateway uh when thingsgo wrong um as you get into the operateand scale levels you also start seeingthe API gateway become a centralmonitoring point for KPIs like responsetimes and usage patterns and so like thevalue of the API gateway increases asyou add new capabilities uh to uh to thefunction that you already have runningand as you get into the improve andadapt phases observability becomes areally great place uh as maybe anemerging platform team uh to make thisfeaturerich golden path uh which I'lltalk about in in a in a slide or twofrom now um so if you give them moremetrics better dashboards better errorhandling and it's all built into the APIgateway they'll be a lot more likely towant to jump into that as opposed to uhyou know wanting to stick to their theirway of doing things and and maybe theirmonolith that they've been maintainingforyears and this is where uh AI mayberears its ugly head for the first timeaidriven anomaly detection and automatedRCA i think there's a big question hereand I would love again as I start totalk about contribution to this model umhow we uh how folks like you in theaudience think about AI is playing arole in the future of platforms and APIgateways andingress so let's take a look back atAcme Corp now that they have become a 50person startup they've grown a lot andtheir radar has expanded um in a coupleof interesting ways they've gone pastlevel one for observability they're intolevel two which means they are like bapast basic logs maybe they moved intostructured logs uh to be able to handlesome issues that they've met along theway um you know they're thinking aboutautomation and infrastructure as codefor deployments uh which gives maybe anice little boost to the developerexperience or their their experience asinfrastructure engineersuh they've matured a little bit you knowin different directions but as you cansee they're facing a couple ofinteresting problems like they have uhyou know they're getting late responsesor error rates and they're still tryingto figure out what's going on with thoseso you can probably guess that theymight want to uh mature a little bitmore in observability as they moveforward um another thing that's uhimportant to mention here is that inthis scenario I'm imagining that likethere's more than one team using an APIgateway at this point like when theywere a twoerson startup you have two ttwo people one team shipping behind theAPI gateway as you expand outward andyou have five teams 10 teams I've talkedto colleagues that you know onecolleague had worked at a place wherethere's 80 teams shipping behind the APIgateway the value of going up inmaturity uh becomes that much moreimportant uh as you scale up the numberof teams that are are deploying behindit um because you have it as thatcentralized control point where everyonemaybe gets their own path onapi.youaccomp.com your company.com uhand and that centralized contro�l pointjust kind of the the value of thataccelerates moving into developer andteamexperience um at the beginning the lackof an API gateway or a veryunderutilized one um eventuallytransforms into this repeatable processfor ingress a repeatable pattern uh thatyou know might not like create animprovement right away in the developerexperience but I think that like addingto this platform trying to standardizethings that has like a ton of downstreameffects uh that that play a big rolelater as you start to get to scale andimprove where you're really trying toonboard internal developers uh onto thissystem that you've built um you knowyou're always trying to strike a balancebetween uh offering an easy button sothat developers can just uh you knowmake it as easy as possible but alsosupporting the power users uh you knowwho want you to support that one strangething that they do or you know trying todeal with them and uh and navigate uhthe the best way to go forward um youknow everyone's service to them feelslike a special snowflake uh and so it'svery important in terms of the APIgateway to support them uh where they uhwhere theyare finally into governance andcompliance i think compliance and thisone is a tricky one uh because itreaches into all the other areas um andand second like you might get really bigbefore compliance and governance is aproblem for you uh you don't really needit until your customers or your partnerscome knocking at your door and tell youlike you need to turn these things on ifyou want us to work with you um and thenit's you know like again kind of thatall the things all the time uh scenariothat you have to deal with um and asyou're trying to get better at thesethings you're facing a lot of pendulumswings between uh your team like yourinfrastructure team and developers whowant to ship fast um like one daythey'll say we need API versioning rightnow and the next day they'll say likenever mind we figured it out ourselvesuh so you know what your job here interms of you know maturing the APIgateway as you get to scale and onwardis that you've codified all of thesethings in a way that you know despitesome conflicts here and there uh you'vegot your team you know when they'reready to onboard uh you are ready tosupport them in the best way that youcan um and you know this part is allabout like what is what are those laststeps that you can take um to enableother people in your organization in away that doesn't lead your API gatewayinto a place of utterchaos so we've got our Acme Corp nowthat they are a 200 developer machinewhich means they are a much biggerorganization overall and I kind of hatethe shape of this just aesthetically uhbut I think it like shows how theadoption of features and maturity withthe API gateway can change over time umbecause your requirements are changingso you can see that I mentioned beforewhen they were a 50 person company theywere dealing with observabilitychallenges and now you can see they'vegone quite deep on solving observabilityuh in my scenario they've also acquiredanother company uh in a scary differentregulatory environment which means theyhad to go I think basically from levelone or level two all the way to levelfour in governance and compliance likeit's that idea of responding uh asthings happen and finding that next bestROI thing that you can turn on with yourAPI gateway um and you know becausetheir platform has matured in a lot ofdifferent ways it's okay that they'renot level five in all the things in factthey're at a point where they haverationalized that maybe they like maybethey will never need to reach level fivein observability because it's notsomething that they are actively dealingwith and again like that is the wholepoint or like one of the main points ofthis of what I'm trying to build here isum it's okay to not have turnedeverything on and it's okay to uh youknow and we should be taking a more uh amore measured look at how to approachthese complexproblems so I wanted to kind of comebackto you know that we've gone through thematrix uh and see if people's opinionshave changed at all um is there anyonewho is like since going through this islike I know less about my API gatewaythan I did at the beginning or is stillfeeling not particularly confident in inhow they've set thingsup who's like feeling like okay we'rematuring but you know we don't you knowlike we're still figuring out how toscale and how to build this thing up andimprove it got a couple people seem tobe in the middle ground and who'sfeeling like maybe more confident thanthey were at the beginning that likeyeah we've turned on a lot of thesethings we've got a rational plan formovingforward a couple in the back as wellawesome so I know I gave you a ton ofinformation but the good news is thateverything that I showed you here plussome additional things is available onGitHub you can find that repository thatway you can scan the QR code um howeveryou'd like to find itwhat I would like for you to walk awaywith today we've got just a coupleminutes is uh maybe not today iunderstand we have uh a coupe crawlright after this so you might not wantto do this right away but take a fewminutes to go to that repository take alook at that see where you stand likewhere are you strong where do you wishyou could do more in terms of thecapabilities of your API gateway maybenext week once you've all returned hometo the office to your you know your homeoffice uh I would love if some you knowsome of you took the model back to yourteam have a conversation about itexplore each thread each level um andand have a good conversation about theproblems you're solving the value you'recreating um and maybe places you're heldback uh whether that's a lack ofresources you don't have the businesscase um maybe your API gateway doesn'thave the capability you're looking forum and in terms of uh contributionum that GitHub repository is open sourceit's MIT licensed you know it's I wouldlove to have other people jump into thatconversation and see how we couldcollectively make it better um maybe youwant to replace like help me replacethose Acme Corp illustrations with aself assessment you've done about yourown organization um like let's run thedata or run this algorithm against somereal world data uh to see what thatlooks like uh maybe you know you werelooking through each of the levels yousaw those uh example capabilities andyou thought like I have something thatthat I've turned on with my API gatewaythat feels like it's a really good fitthere that's missing i would love tohear that uh maybe we could build like aMyersBriggsesque questionnaire that asksfolks a couple of questions and thenspits out what you know a be a bestestimate of their API gateway maturityand where they could head next uh andand maybe somebody out there is reallygood with governance or kind of likecontrol of open source projects andgetting community involvement i wouldlove to have a conversation around thatand continue to make this thing betterso my hope for the talk uh for you knowsharing all this with you and trying tobuild this thing out is that um you knowthat you want to contribute uh that youwant to have these conversations withyour team and that we can all take astep forward in terms of our ownmaturity as people who you know thinkabout architect build implement debugobserve and yes even for folks like mewho stand up on stage and talk about ituh mature in our understanding of thesebeloved and maybe finicky APIgateways so that is it for me today uhany questions that we've got just a fewminutes maybe before I let you all go toKubeCon would love to hear if anybodyhas anyquestions and this is a QR code for myuh uh to give feedback this is the firsttalk I've ever put together in a contextlike this so I would love your open andhonest feedback uh about how I couldmake future talks better that would beuh enormously helpful um and there's acouple other ways to get in touch withme i'm headed back to the Enro boothright now if you'd like to continue theconversation I would love to do that sothanks for coming everyone i'll see youout at Coupe Crawl and uh stay safehappy travels2025-04-15 22:02:19.934955 j��J�{#��KA8ta_zFiUG1shi everyone and welcome to our talk Myname is Max Kerbesha and my name isAlexa Griffith I'm a senior softwareengineer at Bloomberg working on theirAI platforms and uh today we will have alittle exploratory journey throughoutthe platform engineering genai universeOur idea behind is like either you pickup little items out of this wholeuniverse or you see a full-blown largeplatform but you never see the journeyin between and how you maybe come fromone thing to the other thing and that'show our journey will start for alsotoday We would like to go step by stepthrough some little ideas We'll get youset up and then we'll dive into one toother demos show you a few little thingsand uh yeah we'll round it up in the endSounds goodSo this map you maybe know and usuallyyou know this way more bigger landscapefrom the cloudnative computingfoundation but this is a specialized mapjust showing the AI landscape fromKubernetes and the CNCF universe Thevery interesting thing is it's actuallykind of small So AI happens with cloudnative but it also happensalongside of the cloud native universeWhat I mean with this is like thisbeautiful map and that's actually insaneand it's old2023 Um but it's also great because youcan see different blo��u�z#��!AEOQ8qNstD8Ithanks everybody for coming i know uhthis is the last session of the daybefore cube crawl so we're going to tryto keep it on time uh and uh superappreciate everybody coming by um todaywe're going to talk about an API gatewaymaturity matrix um my name is Joel Hansuh I do a lot of Debell things at Enroum and I love stories so I want to startwith a story that I hope uh will likeunfortunately resonate with many of youuh so it all starts with a little babyKubernetes cluster it's so cute and fullof promise uh maybe there's some like OGservices doesn't matter what thelanguages i just Python looks nice upthere um and maybe there's a marketingsite as well that you get set up um youdon't like to you like to have makeyourself uh have harder problems thanthey should be so you don't just uselike S3 and CloudFront you want to do itall yourself uh andthen Oh okay okay I think there wassupposed to be some services that pop inhereum let's go back through nope that's notwhat it's supposed to do at all anywaysyou have some Go services that come inat some point uh and then you hire amarketing team and they want to rip outa bunch of the existing website uh andthey want to host that on an externalservice and so then you have to figureout how do I host that on the slashblogpath do I use my cluster as a reverseproxy or do I stand something else up uhand then comes in another project to setup developer documentation again youdon't want to rip everything apart soyou got to figure out some more routinguh and then you have some more Goservices that just don't feel likeshowing up for me today uh and then youneed to move a couple APIs around maybeyou need to deprecate one V1 to V2 maybesome rewrites redirects everything likethat and all of a sudden everythingbecomes a moving target they finallyworked um and you know trying to takeback control of all of this trying tomake it feel better work better it oftenseems insurmountable uh and if you are aCTO or you are an engineering leader umyou don't want to be in this position uhif you're a platform engineer your pathyour golden path doesn't feel so muchgolden as it does I don't know likecovered with a rock slide uh and if youare a infrastructure or DevOps engineerum you're you're that like this is finedog meme until you decide to hang upyour hat and move into something elseand this is a story of people who don'thave a way to selfassess where theystand with their API gateway and take asingle step away from this chaos andhaving something that's morewell-managed secure uh developerfriendly uh and and more mature rightum a good API gateway uh you know thatfront door into all your services thatyou offload a bunch of non-functi��cks in it You seeinfrastructure you see data analyticsyou see also the open source layer whichis like on the bottom this uh kind ofgreenish um area Now keep your eyes onit Don't blink because there will be alittlechange between 2023 and2024 It's almost feels like it'sexplodedright I'm still waiting for the 2025version I would love to bring it herebut I believe there's so much changes onthe market that the guys have built thismap need a little bit more time But whythis map is now relevant to us is thatwe need to understand we live in acloudnative ecos and everything ismoving fast We have a lot of open sourceprojects a lot of things going on but wevery often talk about the same thingswhere yes you find also a lot ofcommercial solutions for sure but italso means there's a total other entireechosace that we maybe haven't thoughtof yet or maybe have already providedsolutions rightSo Kubernetes platform engineering needsto bring in some kind of flexibility ofallow us to adopt to all of this veryfast changes with this very fastexplosions on the market and maybe alsoroll out some of the uh market playerwhich we can see and Kubernetes istherefore the perfect platform becauseit knows already from 2020 some of theautomation hype then we go to edge IoTtelco last year and always aboutinternal development platforms and stufflike that And this year if you look intothe schedule similar to our talk atleast every second talk has somethingwith AI written in itBut the reason for it is not justbecause it's a hype but we provide withKubernetes and platform engineeringactually a very good foundation to drivethis kind of innovation and provide alsoan environment to frequently change someof the tools which maybe got outdatedbecause something new or fancier poppedup into the market Not saying that thisis a good thing but it's something weshouldconsider On the other hand side andthat's why platform engineering becomesirrelevant in this area is that we donot just have this infinite devops cycleIf you look closely to the image you cansee that there's at least I would saythree to four infinite devops cyclewhich also all areinterconnected and that complicated theway how we are working how we areintegrating tools dramatically It comesfrom just you build it to run it to youbuild it you run it you find and tomaintain your data and then you need tohelp others to use it to integrate itwith your AI applications to provideLLMs and so on and so forthAnd this brings us to the big questionwhat are my options to run genaiworkload and to enable others to usethem too Because what we also can see isthat very clearly not every role in themarket is specialized in utilizing opensource and going through the whole stackIf you're a data scientist you build ityou run it doesn't workanymore You're a specialist in somethingcompletely different And that's where wecome in to bring an environment where wecan help offloading some of the complexenvironments of the complex integrationsand make the life of someone else umeasy And that's where we start also ourjourney So for this one we talk aboutthe TVP the thinnest viable platform Andas you can see we're swimming here in alittle rudder boat I don't know if thisis the right translation from my Germanuhbrain but it helps us to get around Itdoes its things We do not think That's agood start How does it looks like in aplatform engineeringperspective This little thingy forgetting around is actually quite bigjust to provide a platform and integratewell with some cloud provider like AWSIf I talk with my team about it we dreamabout something like that one The coolthing is with open source projects likeCanoe we get the things more or lesseasy fast up and running For sure weneed to tweak for example the key to beintegrated with our corporateenvironment but everything else iscoming more or less out of the hand Andalso in this rather complicated setup wecan also throw out for example umcrossplane or agoworkflows to simplify it in some way Wealso maybe don't need backstage but wemaybe also would like to providebackstage to make the entry �point veryvery easy for spinning up an inferenceservice And we will look in this toollater a little bit more However withthis tool setup we can start in a veryvery fast way to provide a firstenvironment to workwithin Not sure if the video playsnow So normally I would like to play youa video um which would show the setup Soactually to prevent that the demon godsuh screw up this talk we thought like weare smart and pre-record something Lookslike the pre-recording demons also setup this thing and uh make it a littlebit bad for us Actually what I going toshow you in the short demo you can justimagine that we run a Cenoa deploymentand you can see that um after some whilewe have the setup of all the differentapplications that are needed for runningour platform We see our git u Argo CDbeing integrated We see the deploymentof the[Music]um maybe we'll get this done Um you seethat we after a couple of minutes can gointo the Argo CD and see also thateverything else getting deployed We willhave the backstage We have the searchmanager We have the external DNS whoautomatically starts integrating withthe load balancer and setting up alsothe routes for the machine so that wecanafter 15 minutes roundabout reach ourfirst services and that makes it for usvery simple in the beginning to get thethings running and so from here we moveon to our MVPThank you Okay So for our minimal viableplatform we'll start with talking aboutKserve for serving inference in our DNAplatform So Kserve is an open- sourcetool that simplifies AI modeldevelopment on Kubernetes by extractingaway the complexity of Kubernetes Um itallows developers to focus on modelsrather than worrying about all theconfigurations you need to run your AImodels So it supports multiple genigenai and ML frameworks making itversatile and framework agnostic Um andthe past uh we have we currently have aunified API that also supports open AISo also what's great about Kserve isthat it integrates with other opensource tools easily and provides a lotof out ofthe-box featuresSo KServe has been uh serving predictiveinference for a while now and these areall of the features that have come outof the box with that Um with Genai willadd some more but these are the basicfeatures that come even with predictiveinference So the ability to scale up anddown from zero request batching umsecurity distributed tracing autoscalingon GPU and CPU uh logging differentobservability and traffic trafficmanagement So with genai has come a hostof new challenges and in order to tacklethose and accommodate these we haveadded a few new and upgraded features Soadaptive scaling now we can do uhthere's built-in tokenbased autoscalingfor our genai models also performance umboost with model cache and prompt cacheSo this minimizes our latency and ourability to quickly um to quicklydownload to quickly use a model that'salready been downloaded rather than haveto download it again Um also again likeI mentioned we have open AAI protocolsupport and we also have scalableinference So as models grow larger weneed to sometimes serve them on multiplenodes um with VLM and this gives us highperformance workloads Additionally andwe'll talk about this in a bit too isour integration with the AI gatewayproject Um so we connect seamlessly withour managed AI services giving us abetterplatform and Queser for example is anperfect example where we can say likehey look we can provide you for whateverposition you are a template withinbackstage um where you can see theinference service on the bottom Um sothis will pop up in your front end andif you're let's say team A you wouldlike to have the first inference servicejust for yourself deployed you will gointo backstage just fill out the basicinformation who's the owner which namespace for example it should be deployedcan be also cluster whatsoever and youcan also fill in like um some morespecifications about it like umdifferent models uh whatever you wouldlike to have and um through backstageand the automation process you're goingto go and deploy it for example in theinference cluster The cool thing �is itobviously will also work with otherservices So doesn't matter if you useKSERF VLM or if you even would like toimplement some other use cases Forexample creating a new tenant inMilvos's vector database So what itallows you is that you can bring reallythe self-service for a team and beingautonomous integrating their learningstheir models and giving them for examplethe space in in a database or space inS3 bucket um just out of a backstageportal Yeah And I also like how thisslide shows how easy it is to uh set upa a service using Kserve you have a verysmall um a very small YAML and you canget something up and running and haveall the features that we just discussedSo I want to talk about anothercomponent and I'll show how this allfits together next Um Envoy AI gatewayIt's a new open-source tool um builtfrom the Envoy Gateway project thatwe've been working on and it addsfeatures to better serve AI workloadsespecially as a platform team So it'sbuilt by the collaboration of Tetrateand Bloomberg engineers and it offers uhin the MVP centralized access So astandard controlled and audible way toaccess both self-trained um and open-source or commercial models It also uhprovides a way to manage credentialsbetween different LM providers andon-prem And it offers out of the boxcost monitoring tuning specific for yourgeni workloads um like tokenbased limitsand tokenbased costoptimizations So this is a diagram ofhow the platform would fit together withthe components that we just mentioned Solet's say you hit a load balancer andthen you'll hit the envoy gateway Soyou'll get a these are just a few of thefeatures that I mentioned Um so as weknow different L and providers mighthave different access patterns What'ssuper nice about this is from theclient's perspective they just need toknow one access pattern to hit envoygateway one centralized authenticationmethod and under the hood the envoy AIAI gateway will route you to the correctprovider Um so this is an example ofwhat your uh platform could look like ina hybrid environment with a manageinference cluster and using an LLMproviderSo we have a demo I hope it works Uhwe'll see But anyways what it will showis um using an AI agent to call a toolto get the a weather This is what aplatform could look like uh architecturewhen you do that So you can easily usegateway to uh have your agent route toeither a managed inference uh Kubernetescluster or something like AWS BedrockOh noWell there is a video herebut it's not showing upMaybe while we are fighting with thedemon gods de demo gods not demon godswhatever Same thing Are there alreadysome first question about like theplatform engineering part How does itenable the AI or geni workflows anythingthat any questions there maybe on Canoeor uh backstate Argo how the deploymentworks If you're shy it's not a problemYou can run to the microphone It'shidden in the shadows No one will seeyou Give you the chance to still askawesome questionsAre you aware of any example like can wekeryeah so the question is if there's uhany example with sen with ksurf um Ihaven't seen one uh I would like to workon one or would like to provide it tothis one um actually the interestingpart is when you deploy ksurf it most ofthe time to run very very good requiressome part of so you have a very strictdependencies coming together right Umand that's the the one downside of it Soif you want to run the kerf inferencingmodel you have to go this way So maybeyou would prefer VLM or some other setupUm but if you're fine with thatdependency is obviously a very goodchoice The thing is why we use it hereis like the setup itself is kind of easyYou just deploy it It's cluster wideavailable and it reacts very simply oninference service that you drop in forbackstage for example So um it makes alife very easy to get the first servicesup andrunning Any other questionAllright During the mech is starting I Iwould like to dance or something I liketell[Laughter]joke you guys runWell that depends on um so the questionis which LMS are used most frequently Ifirst answer shortly this question YeahUm� so it depends on I cannot answer thisquestion for for Alexa's organizationbut what we see is like um in theintegration and the demo which we haverunning we use for example AWS bedrockand just enable clot for example butdoesn't matter you can also have uh JPDrunning or um set up your own ones likeum um the the llama from from Facebookright so it's very flexible but this isalso the power of the API gateway umwhich Alexa shows that you can just playaround with the different LLMs and justexchange them as you need it because youhave a gateway in between whoorchestrates for you all the othercomplexity away and that makes theintegration veryeasy pleasethe gateway is ready like for productionready is it we can just use it I thinkthat's more question for you it's it'sgetting there You can check out theEnvoy Gateway project It's in the it'sunder envoy gateway anduh yeah we uh we just announced it atKubeCon NA and uh I think we had a bigrelease that came out about a month agobut we're definitely uh expecting peopleto come into the community and help usuh run it as well Thank you YeahIn short there's always room for morecontribution to itBut in a demo I hopefully will showyou how how it is to use it thoughYeah we will put up the slides later onto the sket and then you can find theslides and the links to the videos um sothat you can also rewatch the demos Umsome of the demos are quite easy andsimple right It's just like a wrap-up oflike how does it looks like if youdeploy an infrastructure with sceny UmAlexa's demo is a little bit morecomplicated than that Um and then wewould have also a third one looking intolike how we enable through opentelemetry and open LLMuh the observability for large languagemodels and how does it forwarded to anykind of your preferredum open observability tool NiceOkay awesome Thank you guys So here wehave a pretty basic agent Uh I'm goingto run it locally It's going to askwhat's the weather like in New York CityIt's going to call to the LM running inBedrock It's going to get a response andthat's going to say "Hey use this tool."Um and it's going to go use that tooland give us a response Um so yeah let'slook at how that would happen So first Ijust want to and I'll actually pause ithere I just want to show so uh this isusing the envoy gateway I just inspectedone of the resources and you can seehere that we have um a backend set upand it shows we have a case of lmbackend which the value is dsplama andthen we have an envoy basic AWS backendand it'seu.enthropic So that value is what we'regoing to use in our request makes itsuper easy U I'm going to port forwardto the envoy service and I also justwant to show you the name spaces wecreate I have a kind cluster running Umthis is what the AI gateway creates Soreportforwarded Um now we are going to showhow we can uh run our agent and get arequest back So here we're hitting DSPllama blah blah blah That's our on-premKubernetes self-hosted service And I'malso going to hit AWS bedrock um andalso get an answer Now I'm using claudeuh anthropic and I'm also using llama Sowe get slightly different results butsame same idea Also I want to show withjust a very small config that was supereasy to do um I get out of this this outof the box uh cost monitoring as well Sothat shows just how easy it is with thethe same endpoint just the given thesetup the value is um the name and wecould easily hit uh easily managerouting to both on prim and uh using anLM provider like AWSBut we'll have to do that one one moretimeYeah there's one more late Not right nowbutlater We'llseeUm yeah thanks to the tech guys That wasreally greatOkay So um with predictive inferenceit's common for users to you know managetheir own models train them everythingbut now it's also popular to use theseout of the box you know open source orvendor LLMs So this has changed ourperspective of an inference platform alittle So now we look towards thesolution as a platform team more focusedon managing LLM and providing them as aservice to our users So what we want todo is provide AI without the overhead Uhgive users ac�cess to state-of-the-artupdated LMS like Deepseek LlamaAnthropic etc without deploying ormaintaining them Um we want to providecentralized access So we want an AIgateway that abstracts all of thesemodel management complexities making AIas simple as a consistent API call Um wealso want to give optimiz optimizedperformance with curated and up-to-datemodels Uh we want to be able to managethese models well and focus on so ourusers can focus on innovation and notthe infrastructure underlying it Uhadditionally we want to build it withyou know Kubernetes native cloudnativeprinciples to ensure that we'reefficient cost-effective um andflexible So let's dive into stage threeNow we've upgraded from a nice sailboatto a spaceship and we are going to talkabout our advanced features likeintelligent load balancing observabilitypatterns and different techniques foroptimizing our LLMs So expedition modeuh as we're thinking of LMS as a serviceright we were just I just showed you thearchitecture um well where we could auser could hit you know an inference aself-hosted model or something like onAWS well additionally uh we want to makeour platform as easy to use we want tomake it optimize for performance as welland as a platform team we want to beable to um give users different optionsso some workloads can be done in batchuh they don't need real-time processingUm it's not critical that they have thatTherefore having something like a batchinference platform with these opensource tools the same open source toolsthat we use to create the other platformcan greatly help with our uh resourceutilization and cost And this is anexample of an architecture of how you asa platform team can provide that to yourusersSo as already teasered um there's alsoobviously we we need to have deeperinsights about the um the models howthey're doing um getting feedback fromthe system maybe even see the tracesbecause you sometimes see how the umtools are communicating with each otherespecially if you chain a few of thosemodels together you're very veryinterested in what's going on and whereI'm actually losing a lot of time maybebecause I need to optimize the requestsmaybe I need to choose a different modelgiving me a shorter answer whatsoeverHow do I get that Well on the one handside I can use open telemetry This givesme the the broad big picture but opentelemetry doesn't not yet completelyunderstand like what's going on withinlarge language model doesn't understandthe the specifics the details of it Andthere's a nice development around thatcalled the open meta mitri Hope I spellit right Um that integrates withdifferent models but also with differentplatforms So we do not just talk aboutlike using for example cloud.jbt JBT butalso utilizing something like a bedrockor pine cone or whatsoever But fromthere it just grabs the information andput it over to the good old opentelemetry and just forward it to ourfavorite endpoint whatever it is mightbe a commercial solution might be opensource solution might be something whichI have developed by myself and uhthere's another video perfect wow um sothis is for example uh something withbedrock integration um in this case wehave uh JPT in the background and youcan see different stuff like costs Yousee the amount of requests which we doUm how long is the average duration butyou can also see that we have a lot ofservice quality and guardrailsimplemented and this is just on thebasic information which we get back fromum the the observability integration andwe put some guardrails around it Um andlast but not least you also can see howprompt caching can help us to optimizefor example the performance and how muchtime we can save And yeah I mean $9saved or not that's not the bigdifference but it depend on how much wehave in the end um running here So thisis a very simple demo The guardrails arevery easy It's just like if we get a anegative feedback we just say like okaymaybe either it's toxic or it leakssomething Um but from the from theinformation which we receive alreadyfrom open telemetry we can use that todirectly instrumentalize it and give aback to the developers to the engineerswho would like to maybe see that rightespecially if we are working with themodels and try to integrate and set upsomething that's um in the end um yeahuseful and should go into production sowe have a good cycle um overallNice So this is uh an advanced view ofrunning KSERve uh enterprise scale thetype of components that you may have andI'll try to go through this quickly butum one is model caching So a deepseekmodel is around a terabyte and if wehave an H100 node we have about 640gigabytes of storage So um we're runninginto situations where it might not fiton a single node Um so model sizes aregrowing and that's for multiodeinference Sorry So that's why we're alsooffering multi-node inference uh servingfor model caching Again models are alsogrowing bigger So the download timetakes a while as wellUm so model caching helps us to reducethat cost of downloading the model everytime a pod starts up Additionallyoops Additionally we have prompt cachingSo uh KV cache is something that comesout of the box with VLM and it helps usto optimize our prompt caching when umon the the pot as wellOkay And that helps us also to all thesecaching especially model caching helpsus to autoscale quickeralso Okay So model caching Thisvisualization is an example of the timethat uh caching a model will save Solet's just say for example you have uhan init container that comes up You wantto autoscale You need to uh you need toserve more requests You're getting morerequests and you want to scale yourservice uh and your pod comes up and itneeds to download the model That takes12 minutes Um the rest of the time let'ssee that takes like 3 minutes But if youhave a local uh a local cache of yourmodel saved what's nice is that we canjust get rid of this bottleneck thattakes so long and we can now autoscale alot quicker and serve our requests a lotbetter Also uh there's a KV cache growthchallenge So KV cache expandsexponentially with sequence link as wecan see So another solution that we areimplementing is the LM cache So it'salso an open- source system designed tomanage your KV caches efficiently Itmakes KV caching more scalable Um itdoes this by caching common inputprefixes to reduce your redundantcomputation speed up repeated queries Soit also makes access and storage ofcaches faster especially for longdocuments and multi-turningconversations It's nice because it canshare and store across multiple VLMinstances and route queries to theinstances that already hold the relevantcontext for your KV cache So this hasbeen shown to be able to reduce the timeto first token by around 3 to 10% andsave on the GPUcycle Um one more uh advanced featurethat we are looking into is the genaiuh is to helping with disagregatedserving So jai inference latency isn'talways predictable because we have thesetwo steps uh the prefill and decodingone's computebound and one's memorybound So what we can do to optimize thisis to se separate the two um and executethem independently and have twodifferent sets of GPUs that can handlethem therefore increasing our throughputand optimizing our hardware performanceSo we want to input we want to optimizeboth performance and latency hereSo with that we are at the end of ourjourney Uh like a good hike sometimes itgets a little bit windy stormy and alittle bit uh challenging So maybe it'snot really the end also for for this oneWhat we wanted to show is like in theend like on the way you will find allthe time something that maybe canimproved that we can for go up for thenext challenge um that maybe needs to bereplaced Right Sometimes replacementsare not worse They are part of improvingthings if you keep with the things whichyou have implemented in the past Wellit's not always something that sustainsforever And that's what we want todemonstrate with having platformengineering as an enablement platformalso to bring up more complex topicsaround genai um easily implementing uman AI gateway and so onYeah totally agree Thank youThank you very much and enjoy the restof your cube2025-04-15 22:02:20.629378�ot Um anduh so we we feel your pain on this Wethis is something that uh we're going tobe talking about how how we deal withthis Uh we run Kubernetes clusters andwe have a lot of GPUs and we go throughthe same thing We will reboot our GPUsfor for different various reasons Sothat's what we're going to talkabout Okay So a little bit about our uhour Kubernetes clusters Um we've got alarge fleet We've got uh geoloccated allover the world Um so pretty much inwe're in a lot of countries uh 40 plusclusters 30,000 plus nodes uh 60,000plus GPUs so roughly two GPUs uh twoGPUs per node of densityuh and like I said we run into a lot ofdevice failures so that's a lot of GPUsso we do encounterthis okay so the product that uh that weactually represent we build theinfrastructure for it's calledGeForce Now it's I think Nvidia cloudgaming Um so we want to take graphicsand we want to stream these to the enduser So we want low latency you knowwe're geollocated right near the theuser Um and we want to give that that uhdesktop experience uh for the GPU likeyou're playing the game locallyUm so our workload uh primarily whenrepresenting uh and working with uhGeForce Now is we have a lot of onlinegaming like I said and we also do a bitof inferencing uh we have spot capacityum so we'll run some inferencingworkloads uh there with our additionalcapacity um so we'll we'll fill our datacenters quite a bitOkay so device failuresUm so we want to maintain our capacityright It's really important We want ourGPUs running really hot They they costus a lot of money We put a lot of effortinto putting them into the data centerand we want them used Um so that's areally important thing and thing thatwe're going to be thinking about a lotSo we really care about this right Adevice we put all that effort intogetting it into the data center and wewant it to to run all the time We wantit to have lots of workloads on it Uh wewant to make the user happy right Sothis is really important We want to usewe want to run at uh maximum capacityall thetime Okay So let's get into the devicefailures Uh there's a whole list hereThere's probablyum you've probably encountered at leastone or multiple of these Uh so this isuh so I asked you first about GPUsfalling off the bus So the littlecontext about that um in in the kernelring buffer there's a a message if youlook uh when you're experiencing thistype of failure um there'll be a messagethat's saying that uh your device hasfallen off the bus and so how does thathappen Um well the these these are allthe things that that can lead to that Uhso I'll go into like maybe a few ofthese So overheating um GPUs we knowthey use a lot of power So what uh whatare lots of ways that uh we can causeproblems Well we can use them a ton Theycould they could overheat and this couldlead to uh a GPU falling off the bus Uhan insufficient power supply Um think ofthe amount of amount of wattage and thatis being transferred over the copperright We need to be really careful aboutyou know how much uh uh like how theconnections are like they're reallyreally really closely or like the copperis like you know the whatever theperfect quality and the the leads aretouching and all the different thingsyou can imagine that could possibly gowrong with power transferring over wireUm any of that stuff could possiblycause this error Um and then the thirdone I'll go into just uh this will bethe last one will be uh driver failuresdriver issues Um we experience this alot We run um cubvert we run a lot ofVMs Our workload runs in VMs So what wedo is we will um we'll launch a virtualmachine and we'll pass through the GPUand to do that you have to use the VFIOdriver and we will also use mediadevices So we will you know use VGPUswhich means we need to use the Nvidiadriver So we're constantly changing ourour driver on the fly And sometimes werun into issues where we change thedriver and it doesn't quite work And youknow we we run into some sort of failureSo in all cases or almost all thesecases um we have to deal with it in someway Okay Okay And so here's actually apicture of an �example of um the messagethe GPU has fallen off thebus Uh so when we when we actuallystarted our our journey when we werelooking at um trying to understand andachieve that goal of having maximumcapacitywe wanted to we wanted we needed tounderstand um well we actually we raninto some problems We wanted tounderstand that uh why some nodesweren't quite filling up And this wassomething that um you know we we noticedand sometimes we would go to the nodeand we see you know workloads aren'trunning there and and we discovereventually this kind these kinds ofissues are are happening Um so we neededto do some work we needed to discoveryou know what is going on And eventuallythis discovery you know led us to hearwhich is what Natalie is going to gointo more detail about in just a secondum different ways of how we dealt withthese GPU failures so that we can makesure we're maxizing ourcapacity Okay so just to give you apicture of what I'm talking about when Isay we've been experiencing devicefailure So for for numbers here I said60,000 GPUs 30,000 plus nodes 40 plusclusters So here's a roughly in atimeline what we experience um and thisis like this varies a little bit umbased on various things like drivers andum and a bunch of other things that canchange these numbers a little bit Soroughly what I'd say is you know betweenum you know like 120 and 190 or maybe200 um will experience uh between 120and 200 GPU failures over a 24-hourperiod So it's quite a bit 120 to 200somewhere in that range Um and so whathappens We we have to do something aboutthat right That's a lot of devices thatthat we now have to we now have to fixYou know we need to reboot the machineright Or or whatever it is that we needto do to fix it So Natalie's going totalk through some of the the tools andtechniques that helped us get throughthis so that you know we're notconstantly getting paged to to go in andfix things and you know what automationtools and techniques that we did to toactually deal with with this so that wecan run at maximum capacityLet me take you through what we aredoing to make sure our GPUs and devicesare at max capacity So the first step tosolve the problem is to acknowledge theproblemexist It's not always uh trivial rightOur discover mechanism has two partsFirst part is custom GPU problemdetector and we use an internal GPUdevice plug-in It extends the GPU deviceuh the device plug-in framework todetect driverfailures When it detects the problem itmarks thenode and putting a condition on thenode And then our second part uh of thesolution kicks in That's node problemdetector It's an open source project andit goes all over all the nodes and lookswhich nodes were marked and those nodesthat are marked are being sent into aremediation processOnce um the remediation process kicks inwe have the uh built-in solution thattakes and has a maintenance going overthe node and we can u automate that andmonitor that process We actually talkedabout it in last cubecon Um Ryan had atalk that is called all your GPUs arebelong to us I will add a QR code with alink to the slides later if you want togo and check that oneout So now that we know how to discoverthose failures and how to take the nodeand do something to uh return this uhGPU capacity back Let's talk a littlebit what exactly do we do In the nextslides we talk how we try to solve theproblem and how we try to bring the nocapacity back into the functioningstate So you might thinkthat reboot everything is a joke Noactually it's thestrategy Reboot solves ourproblems It has short downtime Ifthere's no workloads on the node itresets thedriver We get a fresh state on the nodeand it basicallyworksOops But of course reboot has somedownsides especially if the node hasworkloads running on itWe have to either migrate theworkloads or wait for them to finishNode drain is the most time consumingpart So let's see how much timeconsuming that actuallyis As Ryan mentioned we have many manyclusters So we wanted to make a graphanadashboard thatshows how long it takes to drain a nodeWe accumulated data over a period ofseveral week�s and each row represents adifferent cluster Each column is howlong it took for nodes to drainSo we can see that many nodes drainwithin anhour but there are long tail ofnodes and for some of them it can takeeven up to eight hours especially if youhave long runningworkloads and as one of the keynotesessions todaystated the cloud and your money is notinfinite right we want those nodes withthe GPUs to be available We want to usethem as much aspossible So this is our solution todevice failure of problem We discoverthe failures we drain the nodes andreboot and we do it in an endlessloop on all of our clusters Our journeystarted mostly with manual steps and uhwe were looking at graphana dashboardssaw nodes that didn't haveworkloads went to the node and manuallycordoned it labeled as broken and laterrebooted it The process was time andeffortconsuming the response was not fastenough and of course it didn'tscale So we created automation and uhsome improvements around all these areasand uh now we're going to talk aboutsome ofthem With this improvements on theautomation we can get much closer to themaxcapacity Our improvementstouched several areas such as uhmonitoring node drain and the rebootthat we already uhmentioned So let's talk abouteach For the monitoring we often had asituation when we were fixing the samenode over and over againTo solve this problem we've added uhthis alert to detect such remediationloops Meaning that when node getsrebooted many times over and over againThis specific alert that we've added itgoes uh and checks if the nodewent into a reboot more than twice over90 minutes The alert fires and we lookat the nodeNo point of uh trying to reboot suchnode again if previous couple of rebootsdidn't solve theproblem After the discovery of thisuh node that requires areboot we are scheduling them to drainAnd we saw that train can be apotentially really painfulprocess So is there anything we can doto speed it up I mean except for killingthe sessions which we would never dothatright So we thought about two things andwe implemented them First we integratedwith our tenant to drain spot capacityand pre-warm sessionsWe alsoprioritized those nodes over other lessurgent maintenance activities in thecluster It means if we need to reboot anode because there is a device failurewe try to do that as soon as possible Wedon'talways have enough capacity to do itimmediately but we prioritize thatAfter the node is drained and there areno workloads we can finally reboot it toget the GPUs and the devices backworking Reboot doesn't always solve ourproblem Sometimes there are more stepswe need to take So we created thisrecovery workflow that you see on theslide and it goes through reboot thenpowercycle then rebuild and uh if thatdoesn't help we have a manualintervention andRMA we automated some of that and someof that is still manual but for examplewhen there is a remediation loopdiscovered it means that several rebootsdidn't help We automatically mark thatnode to go into rebuild process and wehave aautomated rebuild uh process that runsperiodically and discover those nodesand then it can automatically rebuildthem We still have some challenges thatwe look into solving in the futureForexample we would really love to knowthat after reboot or power cycle orwhatever step we took we bring back thenode into the cluster and we know thatthe problem is solved Right now we knowthat the problem is solved only afterthe node is back in cluster and we runsome workflows on itSo we talkedabout three steps forfixing our GPUs that sometimes they havethe tendency to fall of thebus And this slides uh talks about howwe try to keep the GPUs on the bus aslong as possible and withoutreboots We try to do that with severalthingsFirst we monitor the GPU health We havegraph finanadashboards and we have alerts and forGPU metrics such as temperatureanomalies power usage and soon This might help prevent problems withuh what Ryan mentioned earlier about umoverheating and uh power issues andthings like thatFrom the node perspective we use nodeproblem detector to catch �kernel issuesand system errors and more things likethat We do periodic proactive self testson nodes and onGPU And to prevent scheduling workloadson unhealthy nodes we use uh some taintsandtolerations If we know that we will needto drain the node soon we mightuh not allow workflows on that node assoon aspossible Our process from maintenanceand upgrades uh includes automaticsanitytests and uh those tests have workloadsto verify that the node and the GPU arestill healthy So for each maintenance orupgrade that we do on thenode we run some sanity after that Thisway we know that we did not break thenode or the devices on the node becauseof driver failures or some incompincompatibilities Of course we regularlyupdate drivers Kubernetes componentsoperation system and on the node and soon For that part we have uh created thesolution that is called PA and it isopen sourced uh starting a couple ofmonths ago and you're welcome to take alook at it I will also add the link tothat u in the slides later afterSo our goal is basically to keep thedeviceshealthy and maximize theiruptime We try to do that by preventingissues and when there is a problem wedrain the node and schedule it into therecoveryworkflow But we should not forget thatreboot has a costAnd the cost is the entire nodedowntime and the node drain time And wesaw it might take a whileright Currently nodes with there arenodes with two to four GPUs And that'sthe typical uh configuration in largedata centers But there is a notabletrend towards configuring nodes with upto eightGPUs particularly to meet the demands ofAIworks and as num as the number of GPUson the nodeincreases so does the cost of the nodesrebootWe are facing some new challenges hereand we want to figure out mostly how wecan use Kubernetes How can Kuberneteshelp us here to solve this challengeswith looking forward to increasing theGPUuh capacity pernode Forexample can we leverage the schedule Canthe schedule move workloads away from anode when one GPU isfailing Or can the schedule detect anode that is being rebooted at everycouple of days Those are questions andchallenges that we start to think aboutand we start to think forward how wecan get help from Kubernetes on thesechallenges how we can uh raise awarenessin the community about these challengesand how we can uh work together to solvethose eventuallyUh so like Natalie mentioned um which islike really interesting to think aboutSo we get to eight GPUs right We can getto 16 GPUs And then you also think aboutwhat happens when you know you havethese long running really big machinelearning workflows that are now going tobe have multiple GPUs and they're goingto be across different nodes right Arewe going to be rebooting all sorts ofnodes and you know how is that going toaffect our our clusters Like how how canwe think about that So um uh what Iwanted to share with everyone is um weuh uh so I started a working group uhwith Red Hat Uh this is uh Felipe umfrom Red Hat uh we just proposed this umit's called uh node life cycle and uhwhat we're looking to do is to work withthe community work with all of you tofigure out uh this problem um to see howhow we can use Kubernetes to help us umyou know particularly in the areas uhwhere we we feel like um you know thisproblem has has caused a lot of uhpeople to build different solutions ontheir own I mean we I've heardpersonally from a lot of differentpeople who have already done that andwe've we've even done it ourselves likewe just talked through and and we reallythink that these problems can uh many ofthem can be solved uh together in thecommunity And so that's what we want toaddress here in this working group thatwe've proposed Um and so we're lookingfor your support So um you know foreveryone you know that's interested youknow please come check out the link youknow tell us you tell us about your usecases you know tell us about um thingsthat that you would want to have so thatto help you do maintenance inside ofyour Kubernetes clusters um you can comefind us afterwards we're also on Slackum and like I said please comment on theuh on the poll request you know we wewant to build this charter so that youknow it can be effective you know forhelping you do maintenance and and allthe different challenges that come uhwith the day 2 device managementOkay with that uh we'll open up forquestions Thank you[Applause]This is the feedback QR code I wouldappreciate any feedback you have Heythanks uh for the talk uh as uh we'vecome to these challenges too We see thatyou use the node problem detector and uhas far as we've like been looking at itit doesn't have any like GPU support fordetecting the problems Any like plans tolike open source or share with thecommunity how do you do that so we canalso like reuse it Yeah I think that'suh that's a that's a good question Um II think we can certainly do it I thinkum uh I just think it hasn't been on uhin a road map Um but we could we couldcertainly prioritize and and get that umshared I think the only thing though I Ibelieve we have would have to create aplugin or something Is is that how itwould work I think um but we would haveto figure that out Maybe we can havesomething in um um we could find a placefor it so that you know everyone outthere can can do exactly what we'redoing Um so we can look into it I'd behappy to look into that and we can finda way to do it Awesome Thanks a lot Thatwould be really cool ThanksHey Um you had a slide showing the rateof failure and the number of GPUfailures you had I'm interested to knowis that 0.6% figure like representativeof a normal rate of failure that you'dsee in your estate And also uh what kindof things are you counting as failuresHow are you tracking them Is it justlike every time a no problem detectorcheck fails that triggers thatincrements the count So the way way wedo it is um is when we notice uhspecifically when we can't run aworkload um on that node we've basicallyidentified that the device is um that wecan't use it and generally under thehood that means that the GPU has fallenoff the bus Um but a lot of times thiswas a driver failure that we'veencountered because we switched driversand and we're not able to run a workloadon it That's how we get um that's how weuh will increase the counter to for thatmetric is we'll detect that and we can'trun a workload on it So we know we needso it's based off of the response fromthe user's job that they can't runrather than your metrics you'remonitoring on the node Um we have adevice plugin that will uh detect itbecause it will try and switch thedriver um and when it can't it will itwill share the metric and that will getposted as a condition on the node andthen um so your your first question umabout um the value of of 06% So this isum based on the the total number of uhGPUs we have in the fleet So we have thethe value that um we do over 24 hoursand and that's based on um the the totalSo the percentage is based on the numberover 24 hours and how much that is of60,000 Um and so something to thinkabout with that is um because like wementioned reboot and um it's the densityright There's a there's a sort ofanother part of this right Um that'sjust the the failure rate but um thinkabout the capacity The capacity equationis actually gets a little morechallenging you'd have to double it So0.6* 2 would be um 1.2 if you wanted toremediate right So the failure is 0.6%but the actual remediation is 1.2% Andif you went up to eight you'd actuallyum do times four right So now the numbergets a little bit more scary So that'skind of what we were alluding to is thatum is and and this could also changelike if you're doing cross nodes rightLike now you've got now how do youschedule that That's a whole schedulingcompl problem that you have to figureout And you know when you should youknow what if your workload demands thatit needs an H100 or there's 10 H100srightac few challenges here that that justthat one number illustrates and kind ofwhere you know where we want to go withwith this working group Thank you verymuch WelcomeThank you everyone If there are anyquestions we can uh answer them afterthe talk is over[Applause]2025-04-15 22:02:21.146726 G 29G��%�~#��AtnSraS9JqZ8hi everyone uh welcome to don't writecontrollers like Charlie Don't does uh Iam Nick Young from isalen at Cisco uhyeah and I'm here to talk to you todayabout writing controllers and how it's abit trickier than you mightthink okay so first up who am I to talkabout this um well yeah like I said I'mNick Young but uh I started looking intoCDs in early 2017 when they were stillcalled third party resources um I wasinvolved in building out uh Contour'sHTTP proxy C that repla�T�}#��_Ayeg-uoBYCO0okay hieverybody we're here today to save youfrom an unspeakable evil but before wetell you what it is and how to protectyourselves let us tell you a little bitabout who we arethe pond for a couple of days by myselfand I work at Capital One um I helpmaintain a machine learning platforminternally i'm also a QFlow uh memberand contributor and I'm deeply honoredto be here today as a speaker and I'malso honored to be sharing the stagewith Mory thank you Alex hi everyone myname is Morty i work as a distinguishedengineer in Capital Oneman i'mpassionate about open-source Kubernetesbuilding platforms and related servicesi also come from New York City and Ilive in New Jersey again along with myamazing wife and two kids uh it isexciting and I'm feeling great to behere to present this topic along withAlex who is a true subject matter expertin this area thank you Matthew so as Iwas saying we're here to protect youfrom an unspeakable evilif I could add an ominous sound effectto the slide I would but you'll justhave to use your imagination if you'renot familiar with this acronymcongratulations may you never see thehorrors that we've borne witnessto nih stands for not invented here andit's basically when companies decide toroll their own solutions instead ofleveraging mature industry standardwidely adopted open source solutionsto be clear there are specific use caseswhere it makes sense to build versus buyor whatever the open source equivalentof that is i guess build versus don'tbuy um so a li��1�|#��AcLJRh4y4vXgokay we're going to get started Holy cowI can't see anything back there Theselights arebright Okay really quick exercise I'mgoing to do my best to see but I needsome help from the audience Can youraise your hand if you have ever had aGPU fall off thebus I can't really see I see a few Thereare some hands Yeah All right keep yourhands raised for a second Uh if you havehad to restart a machine because aGPU wasn't working the way you'd hopedHow many more hands Oh even more Okayquite a few Okay you can put your handsdown Thank you Okay Um so we're fromNVIDIA I'm Ryan Haly This is NatalieBandelle Um and we encounter this too Ihad my hand raised She had her handraised We experience this a l��ttle context on this imagehere um we're not allowed to useunlicensed images in our deck so in thespirit of malicious compliance uh Iasked my children and Morphe asked hischildren to illustrate some of the memeswe wanted to use so you'll see sprinklethroughout this uh presentation somedrawings uh and if you're not familiarwith this this is from Monty Python andthe Holy Grail it's one of the knightswho say NI i don't know how to pronounceNIH but I'm just going to call it NIfrom now on u so kudos to our to ourchildren for helping us out so uh storytime at one of my earliest jobs withinthe industry I was tasked withmaintaining uh an in-house version ofTerraform implemented in Ruby of allthings and it was kind of a nightmareand on some level the rest of my careersince then has been a kind of prolongedhorrified recoil to thatexperience at some point I asked foradvice about this tool in a Slackworkspace for DevOps engineers and I wastold to quote "Find a new jobimmediately."Thank you brutally honest stranger onthe internet i actually listened to thisadvice it was good adviceso the point I'm trying to make is forthe love of all that is good and holyplease please please resist the alert ofNIH and use industry standard opensource software instead but you guys allknow this i mean this is a sophisticatedaudience we're at CubeCon there's justoneproblem capability gaps h so when youwork in a regulated industry like we dowhere there are elevated securityrequirements and regulatory requirementssometimes you can't just use the opensource software that you want to use offthe shelf and that's because it doesn'tmeet specific internal mandates now whenyou find yourself in this situationyou're kind of at a crossroad if youenjoy making people suffer you can usethose capability gaps as a justificationto reinvent the wheel if you're not aSith Lord you can close those capabilitygaps in your open source dependencies soto give you a sense of the kind ofcapability gaps we're talkingabout here's a list i'm not going toread them off to you but these arethings that we've dealt with or similarto things that we've dealt with and it'snot an exhaustive list but the point isthat the lift of reinventing the wheelis almost always going to be greaterthan the lift of closing these gaps inyour dependenciesso today we're going to talk about fourdifferent ways to address thesecapability gaps in your open sourcedependenciescontribution forking wpping and mutationwe're going to kind of breeze throughthe first three because the fourth iskind of the mostnovel so here wego so for each approach we're going toprovide an example we're going todelineate the pros as well as the consand we're going to conclude with a kindof comparison matrix as well as adecision tree to help you evaluate whichspecific solution to use for yourspecific usecase let's start with my favoritecontribution uh so we do a lot of thisi'm a QFlow contributor and member thisis honestly my favorite part of my joband it's always the go-to solution forus unless there are kind of obstacles toleveraging it which we'll talk about ina secondthe advantages are prettyself-explanatory to an audience like youguys so just to breeze through themquickly essentially when you contributeyou're distributing the maintenanceburden across a worldwide network ofsoftware engineers and that's incredibleand it's hard to compete withsome companies uh if you are not opensourcing you're kind of dead in thewater and here'swhy contributor count you know unlessyou have a team of almost 4,000engineers to maintain this internalproduct that you want to build it'sgoing to be very hard to compete withsomething like this now granted on somelevel Kubernetes is kind of at the endof the bell curve but you get the ideathe question is do you really want tostep into the ring with something likethis the point is don't competecollaborate in addition and this is alsopretty self-explanatory but when youcontribute to open source you broadenyour impact you're not just solving aproblem that's internal to your companyyou're solving a problem that's g�lobalpotentially and of course community umyou know just look around this room noneof us would be here if we only worked oninternal products and uh it's a realprivilege to collaborate with peoplefrom all over the world and learn fromthem and develop your craft and developfriendships and you don't really getthat when you're stuck in NIHso I'm not sure I'm allowed to presentthis slide here at CubeCon but in thespirit of honesty and transparency I dofeel obligated to so there are somedrawbacks to to bear in mind withcontribution first and foremost and mostof the time when we're not contributingthis is the obstacle that's preventingus from contributing and that is thatproprietary functionality cannot becontributed to some extent you canoffset this when you have like upluggable architecture you can injectfunctionality at runtime but not allprojects supportthat and in addition uh contributionrequires p patience because you're kindof at the mercy of the maintainers whohave competing priorities to review yourcode to merge your code to cut a releaseand if you need something immediatelycontribution might not be the solutionfor you in the short term although youcan always loop back around andcontribute the solution laterso let's move on to the next approachand that is forking or some like to callit dinglehopping uh so a fork is basically whenyou make a copy of existing code andprovide changes on top ofthat there are plenty of notable forksin the wild such as valky open tofu andless notably our internal fork at theflow pipelines UIand the primary advantage of a fork isthat you have total control over thecodebase you're not at the mercy ofmaintainers you don't need anybodyexternally to review merge and releaseyour code what happens between you andyour fork is none of mybusiness this allows you to among otherthings incorporate complex businesslogic and as mentioned it also expeditesdeliverythat being said th those areappealing advantages but there aresignificant disadvantages that you needto take intoconsideration first and foremost highmaintenance burden you're notdistributing that operational overheadyou're now responsible for this codethat you have total controlover and often that results in upgradefriction when you need to uh rebasechanges from the upstream origin intoyour fork you'll often wind up withmergeconflicts and that kind of results infeature lag where if you have aninternal fork there might befunctionality that's enabled in theupstream that you haven't had a chanceto pull downstream and so your internalusers can't benefit from that until yougo through the upgrade friction and pullin thatfunctionality there are some things youcan do to offset thesedisadvantages first and foremost keepyour fork as small as possible try toconsolidate it and keep the surface areaas small as youcan and secondly you can kind of treatyour fork as like a staging ground foropen source contribution there's a bitof a revolving door between thesedifferent solutions so you can startwith the fork and then uh to for exampleaddress something that needs to beaddressed immediately and then you canultimately contribute that functionalityupstream and thereby remove it from yourfork and reduce the surface area of yourfork and then reduce the operationaloverhead of maintaining your forknow let's move on to our third solutionwrapping so a wrapper is essentiallywhere you abstract underlyingdependencies and uh you can bedeliberate about that uh you candeliberately offiscate them or you canbe transparent about your dependenciesthe wrapper can happen server side or itcan happen client side a good example ofa wrapper that we use on a regular basisis Qflow pipelines so Qflow pipelineswraps both Argo comp Argo workflows aswell as Tecton each of those are uhworkflow orchestration engines and Qflowpipelines is kind of a more machinelearning focused workflow orchestrationengine and that's the additionalfunctionality that it adds on top of thethings that it wraps but ultimately itis a wrapper and an abstraction on topof those two toolsso wrappers like forks allow you� toincorporate complex businesslogic and unlike forks they allow you tooffiscate the underlyingdependencies among other things thisprotects against volatility if anunderlying dependency changes in someway that no longer meets yourrequirements let's say for example itchanges its license or something to thateffect in theory because you have awrapper you can swap out that offendingdependency without directly impactingend usersand in addition it results in interfacecontrol so you can completely redefinethe user interface whether it's an APICLI a guey or anSDK qflow pipelines does this with Argoworkflows and this is just to give you asense of how it's kind of rebranding theunderlying functionality um providingunique like a unique conceptualframework unique nomenclature but underthe hood when you quote unquote compilea Qflow pipeline you're winding up withthese underlying Argo workflowabstractions um with the exception ofthe Chromeworkflow so all of that sounds lovelyright but rappers have massive massivedisadvantages and uh I think sometimespeople fail to take those into accountso first first of all just like forkingyou have high maintenance burden upgradefriction feature lag and uh in additionwhat I found is that people who maintainrappers often times engineering time andeffort is redirected away from featuredelivery for end users and towardswrapping tracked targets because it'snot a trivial task especially whenyou're targeting multiple complexunderlying dependencies like for examplethe way QFlow does with both Tecton andArgoworkflows in addition rappers result inincreased complexity and that translatesinto brittleleness and difficulty whendebugging so for example when a Qflowpipeline run fails we have to kind ofdetermine if the issue is end user codeQflow pipelines Argo workflowsKubernetes or something lower in thestack any one of those individually isenormously complex and difficult todebug on its own when you combine all ofthat together it's hard to kind of siftthrough all these abstraction layers tofigure out what the exact source of theproblem is and this one's interestingbecause it was also listed as a pro butproprietary interfaces result insomething that I personally refer to ascursed knowledge and what that means isas engineers we want to invest in skillsthat are transferable to the next rolethat we find ourselves in and whenyou're when you're forced to work onproprietary interfaces that don't existexternally you're developing expertiseand skills that aren't transferable andthat can be kind of frustrating i findthat maintainers and end users are a lothappier when they're able to invest inindustry standardinterfaces so I'm going to hand it offto Mory now to talk about what I thinkis the most novel of the four approachesthank youAlex okay let's start on mutation beforewe start on mutation let me give youstart small intro okay in Kubernetesworld mutation means you're modifyingthe moni manifest at the runtime or onthe fly before it is actually createdwithin the cluster how it works forexample when user is submitting somemanifest or some application is creatingsome manifest in the cluster the requestsent to API server the AP sends HTTPcalls to the corresponding web books thecorresponding web books matches a targetand it takes action like modifying orsomething then again it sends the callback to the API server then the APIserver creates a manifest in a highlevel this is what happening behindmutation okay how mutation is helping inthis context as you can see in the slidewithout mutation you are actuallytouching the source code of theapplication you're for each applicationfor example pipeline or orgo workflowsor Ky whatever The open sourceapplication you're touching the codeyou're making the changes you'remaintaining it internally or you'recreating wrappers out of it that comeswith the high maintenance right withmutation you're not touching the sourcecode of open source repos at all you'redirectly modifying the manifest in thatway it embraces a common approach youdon't have to worry about what languageit is written and a�ll those things andall you're directly working in themiddle middle layer one levelup for the purpose of this sessionWe will go through Kerno before I startwe start on Kerno there are generalpurpose mutation web books or admissioncontrollers available like OPA is one ofthe example ko is another example thereare application specific uh web booksalso available if we all know HTO HTO ishaving its own kind of a mutatingcontroller to inject side and otherthings uh or you can write your customweb books also uh suppose if you'rewriting some application you can write acustom web books also uh but when thereis a general purpose thing is availablethen you don't have to write a customthat is what we are going to see todayone one of theexample and Kano is a open sourcegeneral where Kano lands in this arearight Kano is a open-source uh generalpurpose admission controller it is aYAML based policy engine it is CNCF inincubation I think recorded in 2022uh it's a general purpose one so you canit's simple powerful YL based you don'thave to learn any language any specificlanguage for example a reggo languagesimilar to for open and those it's allYAML based uh withKO it can do all the actions that it cando valid it can validate your manifestit can mutate your manifest it cancreate man it can create resources itcan delete resources it can do all setof jobs for you it can talk to APIserver it can get some values it it'ssimple and so powerful in a nutshellOkay it can read your metadata valuesfrom namespace it can talk to API serverit can get additional values so that youcan use the values to mutate yourmanifest let's see okay how thismutation validation is working withKerno on a high level Ko is workingbased on policies uh this is YML basedone again you have a two kind ofpolicies one is cluster level policywhere you can apply at the cluster levelyou can create a rule that applicablefor all the name spaces then the nextone is a namespace specific policy asname suggests so it is applicable onlywithin the namespace within each policyyou have a set of rules like you canhave multiple rules in one rule you cansay inject environment variable in theother rule you can say okay injectvolumes you can take multiple actionsunder a single policy you basically youcan nest then then the match section orexclude section comes uh it is where youdefine the targets hey I want the actionto go on a name space i want the actionto go on a pod so you can so all thematch all the targets you can see on theright hand side you you can do pod levelyou can do it on a specific uh uhobjects like with the names labelsannotation you have all these kind ofoptions to set your targets in thebottom section you have all the actionsyou can do validate you can do mutateyou generate is nothing but createresources then the last one is a verifyimages that is kind of all for beta uhbasically you can verify your imageexams let's see how Kibano isimplementing all these operations behindthe scene let's see let's go throughsome highlearchitecture this is a high levelarchitecture behind Kubernau most of thething is uh internal to Kubernau i'lljust go through three controllers threemain controllers one is admissioncontroller which is responsible for yourvalidation web books and uh mutation onthe new resources the backgroundcontroller is for uh your createresources objects uh create resourcesoperations and for mutate on theexisting objects then the then there isa cleanup controller that is responsiblefor your cleanup basically you candelete resources also and user namespaces and those things other things areall very internal to keyo so like saidweb controller set renew and everythinghow it handles the operation behind thescenesuh wait u I have to say this we are notvendor we are not selling kon or we arenot promoting anything here we are alsousers this is a easy purpose easygeneral purpose admission web bookavailable there so we just want to seeso you can all how easy and powerful itis to do the mutation sorry for thepicture I have requested my kid to drawa stop sign for me she was really makin�gsure it is a stop sign yeah thankyou okay let's go through some exampleexamples okay like common examples whereeveryone goes through this is a commonone in a enterprise and um regulatedenvironments they don't allow the publicum like image repos right so then youhave to mutate those manifest to reflectinto internal repos this is one of thesample policy that you can use you canmatch the image then you can replace itwith your internal artifactory we facethis issue uh in one of the Qflowpipelines things where user don't setthe default image then it pulls it tryto pull docker image from docker.iowhich never works in our environments sowe use this policy to affect okay do iton the fly even user don't specify ityou mutate it so that it don't breaktheirpipeline then this is another one we allgo through right and I think sameregulatory environments they want you toexplicitly define the security contextwho is what is the user ID for thecontainer set allow provisionalescalation to false and various settingsunder the security context right you canuse as a example to set it actually Ithink one The example is I think we wewere facing issues with Kib they werecreating suggestion deployment orsomething inside then where we were notable because basically they were hardcoding it we were not able to do it soour only solution either you fork itmodify it deploy it inside or we we youbut in that case we used mutation webbook okay we can modify it on the fly inthat way we avoided uh theforking as you can see um you have arules under like this is a single ruleapplication we want to keep it simplethat is why you see a single rule yousee a target okay it is targeting partsit is targeting a name space with aQflow profile label on it then it sets aprecondition the image should be likewithin this one then finally you aremaking theaction the no is not just working basedon native Kubernetes resources this isan example of how you can extend themutation to custom resources daskk isone of the custom resource basically youcan define any resources not just uhKubernetes native resources you can usecustom resources also this is a problemwhere we f where this is another examplewhere you can enforce some behavior onuser if you use uh in through thisexample what we did is we enforced yourmaximum idle timeout is 1 hour in thatway you're enforcing the user behaviorso all the objects will be deleted afterlike 1 hour of timeout in dash job thatresulted in saving a lot of money forus and this yeah so we all come acrossproblems right so we run notebooks forexample we run longunning jobs like xboost the uh the part does not uh sobasically we kubernetes should notdisrupt those parts correct like so ifthey disrupt those parts then whateverjob is running that will get disruptedit it cannot afford part disruptionright in In the Kubernetes world thecarpenter or any cluster autoscaler thatwill try to reschedule the part in abetter node to improve the node usageright but these long runninging jobparts they they cannot afford this kindof a part disruption you can use thispolicy in example or we use this policyin example to protect those parts so webasically we did apply this carpenterannotations on notebook parts exos partsso that our longunning jobs wereprotectedAnd this is an example of sidecarcontainers uh so when you're developinga you don't have to develop like acustom webbook when you're developinginternal applications you can think oflike a Qflow notebook is having its ownadmission web books is having its ownadmission web books for them to injectlike a sidecar or any any otheroperation suppose if you're developingsomething internally I think if you allknow like sidecar containers are veryfamous in Kubernetes it can abstract lotof logic from main containers forexample matrix collectionorchestration storage injection you canname anything you can achieve it throughkubernetes sidecar containers you canuse this policy in example how you caninject sidecar containers into yourworkloads as you can see you're not justinjecting the containers you can alsoinject configuratio�ns along with it inthis example you are injectingenvironment variable you are injectingsecurity contents you're injectingvolumes you're injecting volumesbasically you can draft everything oryou can dictate everything along withyour sidecar containers that is why itis so powerful and in the morning in thebottom section it says precondition soyou can literally tell Kubernetes heydon't touch any existing objects okay doit only on the new objects so you candefine all those condition alsohere let's move on to some validationpolicies in a multi-tenant environmentthere was always recommended we want totrack which name space belong to whichteam which project and all those thingsand we want to deny those objects whichis not meeting the requirements you wecan use this policy in example to okaydon't create name spaces or don't allownamespaces which is not having a projectname or t name in that way yourkubernetes resources are compliantenough in this example uh is a generalpractice right so the node scaling andEverything like clust whether it is acluster autoscaler or carpenter they allwork based on your container requests itis a general practice in kubernetes tokeep your limits and request differenceratio within 20%age correct how do youenforce it because some containers wemay be creating some contains user maybe creating some contains may come fromopen source we can use this as a policyto enforce the behavior hey if you'renot meeting this criteria if yourrequest limit ratio is not within that20%age I'm going to deny the podcreation you can use this and we can usethis as anexample and this is a example for acreate policy so we were writing rappersokay before we adapt Qo and all right aswe were writing rappers you may comeacross some situation the multi-tenantplatform situation you want to give thema basic volume like hey I'm giving youworkspace volume work where you can putall your data inside or we will not letyou to create volumes because it isgoing to cost lot of money correct uhyou can use this as a poly example tocreate those default volumes inside username spaces basically you can avoidwrapping to input those kind of a logicsuh with this kind of apolicy let's see some deletion examplescorrect u same story like in your devenvironment if people are create I'm I'magain sticking with the basic example uhin a dev environment people can createvolumes there will be like orphanedvolumes and uh we don't want platformadmins or you don't want platform adminsto go and manually check okay how old itis delete and all those things correctyou can use this policy to an effect heyif it is if the volume is 1 month old ina de environment you can automaticallyclean those volumes in that way you willsave lot of money in the deletion ofstale resources you can achieve it intwo ways one is through policy as weseen like through YAML or you can justinject the label thecleanup.qano.io/TL 30 days okay so theKO controller will watch for this labelit will delete all your ST resourcesautomatically uh quick question how doyou inject this labels automaticallyinto like a user workloads mutationcorrect so you can mutate this thingwhenever somebody is creating a volumeyou can mutate or whenever and same canbe extend to your ingress also right youcan delete the load balances you cansave a lot of money you can extend thislabel to any objects which users arecreating in that way you delete thestale resources you do the garbagecollection you save lot ofmoney i'm sorry for the picture againi'm requesting my kid to do okay write acloud sir draw a cloud do you need moreshe draw the she drew the cloud she wasmaking sure it is a cloud okay andwhatever we have gone through it is verybasic examples okay yeah but Kibano isso powerful like you can directly readfrom Kubernetes API server you can readfrom your namespace for some data likefor example if you want to automaticallypropagate IM ro from a namespace tounderlying parts you can propagate oryou want to read something else from APserver you can do there isa repo here it can I don't know how whatis the exact number but at least itcontains like hundreds of examples thereuh you can take this as a like areference you can create your ownpolicies to match the requirements tomatch your requirementsLet's see some of the pros right uh itis easy as hell you're not learning anycode or anything you're directly you'reworking on one level up or you'reworking directly on the manifest thereis no maintenance button you don't haveto for you don't have to wrap individualreports you don't have to do anything ofthat kind and you're embracing thecommon solution you're not working onthe individual opensource project levelyou're working you're addressing clusteras a whole or your platform as a wholeand it comes also with like a little bituh mutation cons actually because it ismutating those manifest at the runtimeit comes with a computational overheadbecause it is intercepts every APIrequest there is a it comes with its owncomputational overhead but there areways to uh address it okay it supportshigh KO supports high availability youcan do you can run like multiplereplicas there is a busting capacity andall that you can increase please you canrefine your ki implementation based onthe scal scale of yourplatform then the next one is a mutationerrors because the error if somethinghappens the error is not reflected inapplication containers the error will berefle reflected in given no controllerso you have to rely on your platformadmin to uh debug okay what has gonewrong why this is not working so that isone of the cons but these are allcompared to other cons uh these aresomething we can encount we can solve itwith with that I'll hand over it to Alexto go through the final slides thank youeveryone thank youMory so here's a fancy table and themain thing we want you to take away fromthis is first of all there's no onesolution fits all problems your specificuse case might mandate one versus theother but also you probably want to movebetween different solutions so you mightstart with a fork and then when there'sless urgency work on a contribution whenthat gets merged you can eliminate thefork and same applies for mutation youshould periodically go through your uhkavern policies and see if a those holeshave been closed in the upstreamdependencies or if you want tocontribute so that you no longer have tomaintain that policywe created this decision tree to kind ofhelp you digest that last table andbasically you know we'll walk through itis the solution proprietary if theanswer is yes then you can't contributeso that rules contribution out is thesolution urgent if the answer is yesthen you can't contribute unless youpersonally know the maintainers and theycan expedite the merge and release foryou so in either case you wind up at canyou mutate if you can that's probably agood solution uh just because themaintenance burden is so low and then ifyou can't mutate then you have to askyourself do you want to abstract theunderlying dependency and if the answeris yes then go ahead and wrap and if theanswer is no then fork and againremember you're moving through thesedifferent leaf nodeshere so uh you know what we want youguys to take away is that it'sworthwhile um to close these gaps in theupstream instead of rebuilding stuffinhouse and there are short-termsolutions to help you avoid NIH as welland just some highle takeaways like Isaid avoid NIH like the plaguecontribution is the ideal solution ifyou can make that work mutation isawesome don't fork unless necessary andkeep your fork small unfortunately therewasn't like a fork only emoji so youjust have to ignore that knife overthere and only wrap if the cost isjustified so that concludes our talk iwanna I want to thank you guys forattending and I also want to thank ouresteemed but nevertheless unpaidillustrators we compensate them in otherways thanks everyone thank you everyonethank youdo we have time forquestions i I'll stay up here untilsomebody kicks us off we will step ifthere is any questions you can walk overas well or you could come come overwe're happy to talk to you thankseveryone thank you2025-04-15 22:02:21.896135�ced its ingressroute C so I've done a couple of likereal life redesigns uh and I've beeninvolved in gateway API since itsinception of 2018 um which is deliveredpurely using C so I've done lots of Cstuff uh I've done lots of C design uhand most importantly uh I built lots ofcontrollers to do the same thing andI've screwed it up plenty of times rightso don't feel bad if you screw thisstuff up it's actually surprisinglydifficult okay so today's agenda we'regoing to walk through some uh CRDcontroller any patterns using Charliedon't as our straw man you can see fromthe name I chose the names so it's gotCRD in it um but yeah I want to give yousome tips on how to avoid them how notto make the mistakes I have um have alook at some of the frameworks availablethat make some of this stuff a loteasier uh and give you some tips aboutwhat not to do with them again based onstuff that I have done okay so whyCharlie don't well you can thank TheSimpsons for that um so uh I love TheSimpsons uh this particular episode wasBark Got a Knife and they gave him thishelpful book with uh Don't Do What DonnyDon't Does uh and so uh yeah so I wastrying to figure a way to make this sortof stuff a little bit less dry a littlebit more fun uh and so yeah I came upwith Charlie Don't uh Charlie works on acustom controller for Kubernetes a bigco uh yeah he just always manages tomake the wrong decision and like andmess himself up so yeah take a moment tofeel sorry for Charlie but we're allgoing to learn from his mistakes todayso thank you now so uh I've done twotalks uh using Charlie before the firstone was about designing CDs that uh QRcode will take you to the to the YouTubeuh for that talk that was a reax um sothese are the tips I you know stuff likereading the API bibles uh thinking abouthow your users will use the CD usingstatus you know avoiding certain typesof uh values all of this stuff is reallyhelpful when you're doing this and thereason for a lot of those rules isactually about making API changes um sowhen you're when you're making APIchanges it's really important to uhhandle them correctly so that you canhave your users have make their API achanges where you can make your APIchanges in a way that doesn't screw overyour users okay um the important part isto make compatible changes uh so yeahthere's a bunch of rules there i don'twant to spend too long going into allthese but most of them boil down to makeit so that when between the new versionand the old version if nothing changesthen no behavior changes right it seemslike the most bananal thing to say inthe world but it's actually really easyto make to get this wrong okay but todaywhat we want to talk about is writingcontrollers now um some of this appliesdirectly to Go controllers some of itdoesn't some of it applies to whateverlanguage you're writing in i don't knowmuch about writing controllers inlanguages any other than go in any otherthan Go language than Go sorry uh soyeah it's mostly in Go today if you'reif you're using QRS like well done youum but and I'm I'm happy to talk to youabout it later but I just don't knowenough to tell you about anythingmistakes you're going to make thereso So first thing Charlie does is heuses a simple client just the client goclient he writes a controller he's like"Okay cool i need to get some resourcesi'm just going to call get get get getget get get get get get get get get getget get get get get get get get get getget get get get get get get get get getget get get get get get get get it itfrom the API server every every time Ido I'm just going to list from the APIserver yeah that's all cool when he'sdoing local dev on his machine but whenyou start running in a cluster andyou're getting a thousand objectsmultiple times a second all of a suddenyou might you a poor old API server isgoing "What are you doing to me?" Rightlike and a lot of those gets are notrequired a lot of the time the themirror of that problem is every time youdo an update you post the whole objectback to the API server and the APIserver has to take that object uh checkit all for uh storage �versions and stuffand then push it back down to ATD and ifyou're doing no op updates most of thathappens anyway and so you're burning alot of time and API server resources ondoing nothing right so you know it'sit's really easy mistake to makeespecially if you're just starting outum yeah and so one of the ways that youcan get around this is using a C clientnow the basic the basic client godoesn't provide a C client so you'rekind of out of luck um but there is away to solve this i'm going to talkabout in just a couple of minutes butyou know ideally you want a client thatmaintains a case of current state andyou know there's a lot of ways to dothis um but also once you do that it nowmeans you have a cation validationproblem like yay you know like I alwayslike to say there's two hard problems incomputer science naming things cationvalidation and off by one errors rightso um the uh you know cation validationis a really hard problem so it makessense to try and make it not yourproblem um but the most important thinghere that even no matter what frameworkyou're choosing is to limit the numberof API server updates and requests makesure you're only touching the API serverwhen you really need to um it's reallyeasy to bring an API server to its kneesif you've got a lot of objects and a lotof nodes and a lot of controllersreconciling those objects um so yeahcheck your own updates instead is thequickest and easiest thing you can dohere like when you're about to send anupdate you know you've got a copy of theoriginal thing you've got a copy of thenew thing check them make sure thatthere's no difference make sure there isa difference rather if there's nodifference don't send them like you andso it's a pretty basic stuff but it'sreally easy to forget to do and you canreally mess yourself up so yeah also theother thing that you can do here is youcan use patch instead of update where ifyou only send the fields that arechanged then the API server says "Ohokay those are the only things that havechanged." And if it's empty then APIserver knows that it's a noop rightso yeah it it does it also helps avoidproblems with racing updates so I don'tknow if you've written a controlleryou've probably had it happen whereyou've tried to uh apply an update andit's like "Oh hey that object thatobject has been modified since you sinceyou uh uh since you got it." And so thatlike if two controllers are updating thesame object they could both be racing tomake updates and if they're both doingnot patches you can end up with updatesjust failing to apply because the APIserver is like "Hey you don't have themost recent version of the of theobject." So that goes double for statusupdates really easy to do this forstatus so Charlie don't figures thisthis out eventually and then makes hisown cing client using the informersconstructs that are built in the in theclient go um he hand rolls a cing clientand uses informers it starts up theinformers uses all the standard code andstandard patterns of how to do this henow gets to handle all the concurrencyand ordering problems that he gets fromdoing that um you know so there are andspeaking from experience here againthere are a lot of weird edge cases thatcan happen there when you've got updatesfrom one object and you need updates fora dependent object they haven't come yetbut then when you do get the dependentobject update you there's no update fromthe first object to tell you thateverything is now in sync so you you itbecomes really difficult to tell what'sin sync and what's not and what you whatinformation you're waiting for and whatjust doesn't exist rightso yeah so each controller has tomaintain a a per kind c so you have tohave a for each controller that you'redoing that's watching if it's watchingseven kinds you've now got seven cesright and if those kinds have referencesbetween each other you now have to makesure that those references areconsistent between those cached objectsand because everything's eventuallyconsistent means you actually have noway of being sure or it's very difficultto be sure that you actually �have thefull state of the system so it's verydifficult to tell the difference betweenso in the example of G gateway API ifyou've got a HTTP route update and youcan't find the service is that becausethe service doesn't exist or because itdoesn't exist yet and it's about thenext update that you're going to processis a service update right and so likeit's very difficult to tell that stuffand what happens is you end upprocessing things many many many manytimes u for no goodreason so the the answer here is simplelike use a framework instead frameworksare specific specifically built to dothis stuff for you people who arewriting frameworks are doing it to makethis stuff easier so that you don't haveto think about all of the eventualconsistency stuff mostly i meanobviously you're going to have to thinkabout it in when you find edge cases umbut yeah so you're looking for thingsthat something that watches resourcesand maintains the current state of themfor you something that allows you to dothings when objects when particular setsof objects change and ideally ideallysomething that helps with coalesingrights back to the API server so thatyou're not needing to do as much spendas much time checking your rightsagainst eachother that's a nice to have though theother two are must haves so uh there arethree frameworks I know of there'sprobably other ones but these are theones that I've either heard of or useduh so KRT state DB and controllerruntime now KRT uh is written by JohnHoward as part of it as an experimentalrefactoring of ISTTO it's used in someexperimental stuff in ISTO and it's alsoused in K gateway in this one youperform operations on collections sothey can be sourced from any Kubernetesobject via informers or other objects aswell one of the nice things about KRT isthat it's much more generic it doesn'tonly cover Kubernetes objects you can doother cool stuff with it um but the keypart is that when when you define themyou also set a bunch of relevant fetchfunctions which are kind of fetchingobjects from the collection based onsome set of criteria when that fetchfunction would change uh like when newobjects arrive or when old objects woulddisappear the transformation functionsare called the do things those thingsare usually output a bunch of otherobjects that you then feed intosomething else that then turns them inthat then takes actions on them so Ithink it's a very interesting approachthat's still under active development umfor me I'm kind of used to some of theother approaches so there was a bit of abit of a cognitive overload cognitiveoverhead sorry to uh to sort of processexactly how this would work i had tolook at some examples but I think it'spretty neat another one is uh state DBi'm more familiar with this cuz uh youknow I work on selium um psyllium isdivided into two parts an operator andan agent in the operator we usecontroller runtime which is the nextfragment uh framework and state DB whichis used in the agent now state DB is ain-memory radical tree database for gothat supports cross cross table writetransactions and most importantly watchchannels that close when that part ofthe radics tree is updated so what thatgets you is you can basically it letsyou set up a table that stores all ofthe records of a partic particular kindand when that table gets updated you cando stuff uh and so table being updatedmeans a row being added a row beingremoved so you're an object being addedor removed or updated um and so thoseupdate or delete operations like thatare executed when those things happen soit basically it lets you treat acollection of Kubernetes objects like adatabase like rows in a database um andit's also still pretty new we've donewe've moved some things over it doesmake some things really easy um somethings it doesn't like gateway APIprocessing where you've got a lot ofrelated tables you start have to treatit a lot more like old fashionedrelational database with all of the sortof uh cognitive overhead and problemsthat entails but that doesn't mean notto do it it's just uh you know that'sone of the reasons why we d�on't use itin the other part uh the operator whichuses controller runtime now controllerruntime is included as part of the thecube builder controller tool set andthat that that cube builder project isactually part of upstream Kubernetesthis one uses a really deep reconcilepattern with a keybased lookup on top ofa cing Kubernetes client that hasexactly the same API as the base clientgo version so the nice part about thisis um the uh controller runtime uhwatches are set up so that they maintaina cache for you and you get list updatein that cache the same as you would withthe vanilla client so you can it's veryeasy to move code that uses the vanillaclient over to use controller runtimewhich is really nice um so yeah thereconcile pattern works like this soyou've got um a controller that runs goroutines to that actually does thewatching and updates the local ces andthen each one controller has one mainresource that it reconciles so when uhstuff happens a reconcile request isreturned and then it triggers thereconcile function so this is just aninterface in go terms and that reconcilefunction does whatever you want it to donow that can also be go and look up thestate of other objects in the in the cesand take ob and take actions um one ofthe really nice things about this thoughis that you're watching one main kindbut you can also say I am interested inwhen when changes to this main kindhappen we'll reconcile that main kindand your reconcile function gets thename and namespace of an object that'schanged so then you go off and you getthe object and you see if it's beendeleted or updated or created um but theother thing that you can do is you canadd extra watches that say if one ofthese objects gets updated then call areconciliation on the main functionright it's a little hard to explain inthat way so oh hang on I preemptedmyself here I forgot about that slide umso yeah the important part here is thatthe local c store uh the local c storesthe state for you and the reconcilefunction function lets you do thingswhen that state changes so this is uhsome code from Selium's gateway APIreconciler um you can see here that uhit watches gateways it says for gatewaywith predicates now those predicates letyou slice down the number of gatewaysthat you're actually going to attempt toreconcile right so those predicates umbasically it says for gateway API ifyour controller isn't managing a gatewayclass that the gateway is associatedwith then you shouldn't touch it so thisis saying for you know I I want to be Iwant to call reconcile whenever agateway that I care about changes andthen so the reconcile should onlyreceive uh rec reconciliations forgateways that you care about so it it'slike one layer less processing but thenice part is you can also watch gatewayclass resources with their ownpredicates uh services HTTP routes andlike tons more this would have been likefive sides long if I included the entirething because gateway requires you towatch a lot of objects but if any one ofthose things updates in a way that thepredicate functions say is relevant thenthe main reconcile function for gatewaywill be called and what that means ingateway API terms is that when any ofthe dependent objects change yourecalculate the whole thing but andrecalculating the whole thing wouldnormally be something you'd be like "Ohno that's pretty expensive." But becauseall of this is using a local cache therecalculation just involves retrievingobjects from the C and checking them allright so you're not talking you're notdoing any network access it's basicallyas efficient as you could possibly getto do this sort ofthing soum yeah Charlie don't coming back topoor old Charlie uh he makes reconcilemistakes uh so he doesn't realize thatany predicates applied to four don't getalso get applied to predicates appliedto other watch calls and so and he alsouses the wrong resource for his uh mainreconcile loop now in this case uh it'sme I'm Charlie i I did this um veryrecently uh so selium uh selium'sgateway API and gamma controllers thereare two they're both written withcontrolle�r runtime they were not eachchecking the parent refs for their HTTProutes correctly so they were triggeringextra reconciliations that were thenthrown away so what's happening here isthat thing I showed you before gatewayis the reconciled object uh we're goingoh there's an updating there's an updatein a HTTP route that that HTTP route wasdoing some checks to say do I care aboutthis update what it wasn't checking wasdoes the h the gateway that this isattached to roll up to a gateway classlike the predicate function or gate onon the gateway reconciler so the keypart here is for that first part youneed the reconcile the predicates on thedownstream watches also need to repeatthe logic that you have in the four callat the top of the controller tools okaylet me roll that one back cuz the more Isay it the more the more I realize it'sreally hard to explain so in this onewe've got four uh four gateway builderwith predicates predicate new predicatefunctions thanks go for all of yournested function things here has matchingcontroller function that has matchingcontroller function does a lookup thatsays okay the gateway references agateway class the gateway class has afield set that means that selium caresabout it the problem was here that inthe HTTP route watcher down the bottomthe incue request for owning HTTP routefunction did not check that so it wouldjust check some other stuff and then itwould incue the request for the gatewayso that meant we're reconciling a bunchof gateways that we didn't actually needto this was really important when we gotto the state that we had the tworeconcilers the gateway API and thegamma ones because they both reconcileHTTP routes they're very different kindsof HTTP routes because they refer to adifferent object they go up to adifferent object but because they wereboth reconciling HTTP routes the code itmeant that basically we were doingdouble the number of reconciliation thatwe would otherwise need to do if we hadwritten this properly also the gammareconciler was is reconciling HTTProutes rather than service so when wepick this object uh inthethe in the um in this thing for gammathe four is for uh HTTP route so the themain object we're reconciling is HTTProute the problem is HTTP routes ingamma roll up to a service object rightso the service object is the higherlayer construct we're reconciling toolow down in the tree of objects hereand so the re the way that that foundout is in in gamma you're allowed tohave multiple HTTP routes point to thesame service it's it's a one to manyrelationship because we're reconcilingdown here every time we update any HTTProute we regenerate all the config andstuff and what that means is the lastHTTP route to get reconciled is the onlyone that that gets config all the otherconfig gets wiped away and that'sbecause we weren't reconciling at theright level rightso give me a secthat's it it's critical to make sureyou're reconciling the right object andso you really need to understand thedesign of your system of CRDs to do thistop tip if you are writing a gateway APIcontroller reconcile gateways don't relike that that should be your main yourmain thing that you'rewatching okay wow I've gone a bit fasterthan I thought I would um there's goingto be plenty of time for questions ihope there will be uh some so yeah umit's really easy to make mistakeswriting controllers um use a frameworkum I recommend controller runtime forpeople who are starting out because it'srelatively straightforward and it'supstream youknow um use patch andor check yourchanges to make sure that they'rerelevant before sending to the APIserver especially status updates herewhen you're updating the status subresource it's really important to makesure that you're not sending no opupdates because it's if you've gotthousands of objects that are all uh youone problem I had here was we had somepeople who had uh selium installationswith gateways and then hundreds of HTTProutes attached to the gateway or alarge number of HTTP routes so if youupdate the gateway status anytime one ofthose HTTP routes changes when you� startup you're now doing hundreds of statusupdates for that gateway that you don'tneed to be doing right and so when youit can create thundering herd problemson startup really easily um that canjust you know and that sort ofthundering herd problem is of course oneof the suckiest things to do deal withas an operator because you're like oh Irestarted and now everything's fallingover because I restarted you reallydon't want to have your users in thatsort of positionso if you are using controller runtimeremember that all your predicatefunctions need to check that thatreconciled resource is relevant as wellas whatever checks they run don't be methis I was really annoyed at myself whenI figured this one out um yeah soCharlie Don and I both say thanks forlistening we've got plenty of time forquestions i really like honestly ifyou've got a curly uh question then feelfree to come up and ask itpleaseawesome hit me uh thanks for the talk uhmy question is whether have you lookedat to um being able to reason about howmany reconciliations occur and whichevent sources and which other um I guesswatches and other sort of channels thatthey come from like are you able toreason about in a large scale settinglike Celium for example or we'rereceiving this many reconciliations andthis is why yeah yeah so the frameworksactually help you by providing metricsabout um about how many uhreconciliations you do and of course ifyou're doing your own metrics uh it'sreally helpful in your reconcilefunction to like increment metrics aboutI just did a reconcile you know here'show long it took um that sort of thingvery very helpful to include metric myquestion is a little bit like notexactly how many times you'rereconciling but more like do youunderstand exactly why you're receivingthat reconciliation request yeah I thinkthat's a that's a really good point umthe uh it can be tricky because whenyou're using for when you're usingcontroller runtime the thing that yououtput from those watches calls isactually just a request to reconcile themain object right so the best that youcould do is as you're rebuilding to saylike I think maybe this might havechanged but honestly the um like the alot of the time the amount of value youextract out of that is going to be muchless than the problems that you're goingto build for yourself by trying to keepI would strongly recommend end don'tkeep uh a before and after state whenyou're doing that just rebuild the stateof the world every time you get areconcile request because um that is themost efficient way to make sure youyou're sort of getting everythingcorrect trying to keep track of thingsyou have to keep track of like uh youknow did this exist before is it new hasit been deleted tombstoneing things allthat sort of thing so yeah I I reallystrongly recommend rebuild the state ofthe world in your reconcile functionsthank you excellenthi just a question about um how wouldyou structure an operator in general umI have just started my first approachand uh decided to make something withmultiple threats and uh because to havesome easy stage sharing uh to monitormultiple resources mhm um is this a goodapproach or should I e better switch tosomething more like a single thread ineach container so the um the nice partone of the reasons that controllerruntime uses that reconcile approach isthat it's the each reconcile operationis relatively orthogonal um you knowbecause you're they're all uh for thesame type of object so you can domultiple ones kind of at once um and aslong as you're as long as each is identand rebuilds the state of the world thenit doesn't matter so then you can gowide in terms of multiple threads mucheasier i think the thing you got to becareful about if you're going wide anddoing lots of threads is uh is uhconsistency and making sure that uhwhatever is Yeah so at the because atthe end of what is whatever you'reoutputting from that um ends up in aconsistent state right like so uh if youif you're reconciling gateways a lot ofthe time for selium the reason we'rereconciling gateways is we want to beable to output an� object at the at theend of it selium the way selium does itis we reconcile the gateways we output athing called a selium envoy config thatthen configures the agent to do envoyconfig right uh and so like we want toend up with like one selium envoy configand not be updating that selium envoyconfig constantly unless it's actuallychanged and that's another time wherethe updates thing becomes important soif you're making that too wide then andyou need them you need the all of thoseoperations to be ident and to begenerating the same config at the end ofthe day ex except for their extra uhoperation does that make sense so umlike I'd say yeah go wide but just bereally careful about what the output isand make sure that that output isconsistent and that you're checking itfor uh uh uh for real operations and notdoing no operations at the end yeah okaydoes that make sense a bit sorry i havea look into it yeah okay great thanksokay since no one's asking otherquestions I'll ask another one yeah whatare your thoughts about controllersreconciling um results into an externalyou know outside world like for examplechanging something in a cloud providerAPI I don't know writing something intoa bucket or stuff like that so basicallyall the stuff that you mentioned so farokay when everything is in Kubernetes uhyou can maintain informers on them andyou can in memory do inmemory comparisonof the objects rather cheaply but whensomething is in an external worldsuddenly that reconcile function becomesexpensive and no longer offline rightyeah I mean in that case I wouldactually recommend to do something morelike what uh the the KRT framework isdoing with collections and representthat external state with an in-memorything here that then does it the sort ofthe the other way around operation rightso when you when a Kubernetes thingcalls a reconcile function that thenupdates the state that should be writtenout to the outside world then you wait alittle bit and then then write thatstate so that it's consistent and only acertain amount of time and you know youcan do hold down timers and a bunch ofother neat stuff there to make sureyou're not overloading things right iguess in that case like when someonegoes to a cloud provider API andmanually changes something without youknow that the controller is supposed tomanage I guess you never get to correctthat mistake or you know overwrite thatdecision right so yeah well I mean itcomes that comes down to a philosophything I guess like what are youanticipating the uh the thing to do hereare you anticipating that the um thatthe the cloud provider state is thesource of truth or are you anticipatingthat your controller is the source oftruth right if you want to be able totake changes and synchronize them intoyour controller then it's a much harderproblem than if you say okay whatever'shappening in the controller is theanswer uh and you know we want to justtake whatever is in the controller andpush it out to the outside world and ifsomeone changes it too bad your changesgot lost right like you know I wouldprobably argue that most of the time youwant that like you know it's going to beweird that not often that you're goingto want to persist uh um like manuallymade changes it's better if you've got areconciled system for the system thereconciliation based system tocompletely reflect the state of theworld and to push reconciled state asfar as possible out yeah makes sensethank you yeah excellent thanks verymuch um yeah any other questions verywelcomed um yeah sorry for burningthrough that a bit quicker than Ithought it wouldin practice I must have talked slower orsomething oh great excellent hit me hitme thanks a lot for the call uh so myquestion is a bit of a tangent but howdo you test your operator beforedeploying it to production it's morelike a So this is actually a really goodquestion um so um one of the things thatwe try and do is so for selium forpsyllium the selium operator this isactually relatively straightforwardbecause our it depends on the answerreally depends on what your outputs arelike for us our output is a Kubernetesobject so we can test that the in-memoryrepresentation of the Kubernetes objectmatches what it should be um you soyou're testing the that the inputs theinputs you have control over match theoutput you have control over right sovery easy to do there's no mockingrequired you know you just basically sayrun here's the inputs run the functionhere's what here's what I expect to bethe output if you are testing thingsthat are um like that require pushingout to other APIs like that other guy Ijust uh mentioned then having some sortof in-memory representation of thatstate is the only real way to be able totest otherwise you're going to have tomock the AWS API or something and likejump through a bunch of hoops definitelydoable but like way more work than justhaving some inmemory states so yeah likegenerally the the method I recommendpeople do for complicated controllers isto use a sort of it's like a descendantof the model view controller kind ofthing where you've got an ingestionlayer a data model and a a sort oftranslation layer that takes the datamodel the data model is like a prettygeneric representation of your problemdomain the ingestion layer lets you takeconfig from Kubernetes or wherever putit into your data model and then yourdata model the translation layer takesthe data model and turns it intowhatever form you need and so that meansthat testing you can take variousingestion things run them through thething into the model and test the stepthat's ingestion into model and then youcan test the step that's model intotranslated version in se as a separatestep yeah so you're not testing that inone go you got two spots where you cantest thatthank you yeah noproblems excellent uh I got we do haveplenty more time so if anyone wants tocome and see me afterwards feel free umoh do we have another question excellenthit me i have one question so we havecurrently we use like content hashing sowe get like the objects check uh thehash great method yeah yeah so we dolike no patch operations because wedon't get the div do you use any libraryto see also diffs between what iscurrently deployed in Kubernetes andwhat the object should be yeah so Ithink that's one of the that's aninteresting that's sort of a downside ofusing like the the the uh hash method isthat you're not retrieving the originalstate of the object one of the nicethings about the way the controllerruntime does things is that you'regetting your the controller runtimemaintains a a case of like what itbelieves is the latest state of theobject and so that you can be prettyconfident will be updated like is eitherupdated now or will be updated real soonuh and so like that that's a good way tomake sure that you have the actualobject and then um the the other thing Iwould recommend is try to make it sothat you're not caring about the diff ofthe objects right like ideally the whatyou're doing here is you're taking theset of objects that you care about andyou're doing something with them tobuild that model that I talked about andthen that model just goes and getstranslated it doesn't matter if thatmodel is different to before or notright like except at the time when youactually send the translation off to toKubernetes yeah so like the more you cando that the more you can delay that stepof where you have to look at somethingthat is currently in Kubernetes andcompare what you're going to send whatyou're going to update the better offyou are does that make sense yeah makessense excellent thank you thanks verymuchokay well uh yeah looks like uh lookslike we're we're out of questions thatare willing to come up to the mic feelfree to come and find me for otherquestions um thanks again for listeninguh and uh really appreciate it uh seeyou all around[Applause]2025-04-15 22:02:22.484905�ntrollerswhich will monitor monitor the CRDs thatyou need to to instantiate the clusterAPI cluster and the the clusters thatyou create with cluster API they arecalled workload cluster so here thisexample is just for lab clusters in inthis in this drawing um so we have hereone management cluster which managesthree workload clusters but I haveanother drawing right afterwards whichmight explain that a little better um onthe right side you can see that clusterAPI you can interact with it as youwould with any community CRD or resourceyou can use scuttle for that uh you canget machine deployments which is whichis the deployment for your worker nodesyou can get machines which is the whichwill then through the controllers be weinstantiated as either a VM or aninfrastructure or any resource and thatthat will in the end be your your nodesso that's about cluster API in anutshell then Talos Linux how many usersof Talos Linux in theroom it's growing that's good I can onlyrecommend um it's referred to as the 1212 binaries OS it's immutable minimalephemeral but For us the main point isthat it has a declarative configurationinterface so you don't interact with itwith SSH because there is no SSH onTalos Linux instead you use um youauthenticate with MTLS to an API serverrunning on on on the operating systemwhich is called machine D which doesmost of the configuration and so you youuse uh the Talos scuttle command linetool to or directly you could also useIPL requests to to to to configure theoperating system so you will configurethe network card the storage you willconfigure everything with thisdeclarative API and forest that's that'sa game changer we we can go from uhdeploying DBN VMs and configuring sometools with it on with puppet and thenrunning nible pipelines to just using adeclarative API to to configure theoperating system um that's about it forTalos and now how do we migrate fromthis kind of old uh nible cubadm way ofdoing things to to tal and cluster APIso the first step that we have to to gothrough is configuration matching as wecan imagine uh some some parametersbetween cubeadm and and and mible mightnot be the same so cubidm and talosclust might not be exactly the same sowe have to match that the the secondstep is that we need to import all theexisting PKI infrastructure so all theexisting certificates from our existingclusters we need to import those so thatwe can work with those uh in cluster APIand talos next we have to go to thecluster API management cluster and therewe will start to create the CRDs thatare needed to create new machine newnodes uh to join the existingcluster the third step is to add clusterAPI nodes and this will not go right thefirst time that you try to do it but Iwill give you a few um headers on whatnot to do and how to interpret the errorthat you might encounter along the wayand finally when everything goes wellyou can remove the cubeadm nodes and youhave a cluster API and it could be anyother operating system but you have acluster API cluster at the end so we arenot there yet everything is readytechnically but uh we we are still in weare in the process of migrating andhopefully in a few weeks or months we'llbe able to publish a blog post about itand how itwent so let's start with theconfiguration matching um I will not gointo too many details but the main mainuh things to watch for are to match theservice account issuer that's what thatthat's the name of the of the serviceaccount that's the name of the issuerthat you get if you well if you dive inthe pod and you check the George tokenthat is in /var run secrets etc you willsee an issuer field and this issuer perdefault is kubernetes defaults servicescluster local and with stalos this isnot the case with stalos they default tothe end points of the cluster so youhave to match that on the old cubeadmcluster you have to already do thissmall change but you can do that withoutdowntime that's fine and then this oneis really important you need to makesure that the hd encryption key and thekey name are the same between talos andand cubadm and so on the r�ight hand sideyou of the the Talos template file sothe file that they use to actuallycreate the encryption config file onyour Talos machine Talos system and yousee how they how they template it andmainly you you see the the name line 11or line 17 depending on which en HCDencryption scheme you use ASBC or secretbox uh you see which name your key needsto have on your Cubadm old cluster andso you have to match that uh which meansthat you have to basically re-encryptall secrets but this is quite a simpleoperation as we can see on the thirdstep and it's all explained really insuper super detailed in the community'sdocumentation so that's for the firststep or the step zero the preparationstep then we have to import the existingPKI and here Talos is quite helpful theyhave a command called cube cuttlegenrets from Kubernetes PKI so you cangive this talos cuttle command line toolthe path to the existing communities PKIfolder in which there are all those uhCA certificates HCD certificates serviceaccount issuer key etc and it willgenerate a secrets bundle file that youcan use later on to either provision anew cluster without pro without clusterAPI or that we will use just afterwardsto actually provision the new clusterAPI cluster you also need to create anew cubeadm bootstrap token or cubesbootstrap token so that new nodes cancan join the existingcluster so that's for the first step soimporting the existing PKI PKI and thatis how the secrets bundle look like sowe have uh the bootstrap token thesecret box encryion secrets and then thekey the core community certificates thatthat constitute a cluster so thekubernetes search line 10 11 the hcd 7and 8 the service account and finallyline 15 and 16 this is not somethingthat is uh from kubernetunities itselfit's from talos it's the os key sooperating op operating system in thiscase it's talos and so this this keyline 15 and 16 the OS key is the thecertificate that you use to authenticateagainst the talos endpoint so the talosAPI so you don't use an SSH key you usea certificate And if that certificatehas the OS admin group then you'reallowed to do some operations on theTalos nodes with the Talos cut commandso that's it for the for importing theexistingPK and now we have to create somecluster API CRDs i'm not sure if you canread well in the bottom but I will soonswitch to a drawing because this is alittle in digest uh but the main pointis that you have a cluster CRD you havea control plane CRD and uh maybe it'ssimpler to to read it like that that'show we structure our clusters so wecreate one name space per workloadcluster so here the workload cluster iscalled E1 communities lab F and so wehave uh we have a cluster CRD so that'sthe one that you that you see on theleft hand side and then we have acontrol plane CRD the one inyellow another one is the machinedeployment then you have also aninfrastructure but you could havemultiple infrastructure provider it's upto you and so uh all the CRDs are thereand nothing happens so far but the thenthe cluster API controllers they startto interact with those CRDs to provisionthe nodes the machines to boot the VMsif it is VMs to boot the bare metalboxes if it's bare metal boxes withmetal cubed whatever and so that's howcluster API works so you you have tocreate those CRDs and then thecontroller start to provision machinesand it can go extremelyquick so now we know we know how how allof this works how we want to do themigration the last step for themigration is that we are going to add uhcluster API nodes to our existingcluster so on the left hand side in blueyou have the the cubadium nodes uh wehave more nodes than that but for thedrawing drawing it was much simpler likethat and in green on the right hand sidewe start to add cluster API and talosnode and that's actually how we aregoing to do the migration we have theexisting clusters with perhaps 100 nodesand then we'll start to incrementallyadd new talos nodes to the to the to thecluster and we'll monitor close itbecause it's almost no it's certain thatthere will be some edge cases like somefine-tuned �cysal parameters that willbreak some Java deployment somewhereit's almost sure so we'll have tomonitor closely over the course ofseveral weeks to see which of thoseparameters we have to fix but basicallythe idea is that we have the old clusterwe we start new nodes new nodes in thesense with cluster API we wait a littleand uh in the end the idea is tocompletely replace the old cubadiumnodes and we will have then successfullymigrated without downtime all ourclusters in a few months if everythinggoes well so that's theidea so when you create cluster APInodes a few things can go wrong uh I'vesummarized it here uh we've encounteredthat along the road um for example ifyou do not match the service accountissuer then the new service accounttoken that you create they will not havethe proper service account issuer so itwill not be accepted by the old APIservers then in the authentication logsyou see that you have invalid beer tokenerrors um you could lack the encryptionkey if you do not have it that's how thethe the error message looks like youcould have the wrongd encryption key andthis is a little more tricky because youwould expect well maybe another failureor an error message which is moreverbose than output array was not largeenough for encryption but that's whatyou get if you don't have the rightencryption key and finally um when youstart adding nodes to the to the clusterif at some point you you you have somenodes which are which are down and youlose quarum then you have to to forcethe new use the force new cluster flagto recover your HCD cluster perhaps youalready have had to dothat so let's get to the demo let's hopethe demo gods are with meum so let me explain a little so thishere are the manifests that we have inour um that we that we'll provision forthe lab F cluster so that's the that'sthe one I those are the manifests Ishowed you before by the way is it bigenough at the bottom well I don't hearanything so looks good i see some jobsthanks so this is the manifest that weuse for this is the cluster manifestbasically the cluster name then theendpoint for the cluster then you haveto refer the control plane and then theinfrastructure then you have the controlplane itself and here you see we useVsspere machine template we use Talos 94this is like the the the image templateon Vsspere itselfand one important thing here we startwithout any replicas and I will get tothat in an instance when we start themigration uh but basically the thing isthat if you start if you create thoseCRDs on the cluster API clusters and youalready start one replica um you youfirst have to I will show you it's goingto be easier when when I show it okay sothis is the control plane CRD uh andthen we have a customization file herewhich generates secrets and generatesall the CRDs that we need to to start uhcreating this uh workload clusternodes so I think that's it uh then Ihave a few workspaces workspaces herethe second workspace so I'm switching toit now the Talos cuttle workspace wewill use that one to retrieve the logsof the of the Talos operating systemwhen it boots uh then the seventhworkspace here uh is the is one of thecontrol plane of the old cubadm clusterso you want mls 010 we will start towatch on the kettle member list herebecause if everything goes well weshould when we start the CRDs and whenwe increase the number of replica weshould get a new member in this in thisuh workspace here then on the eightworkspace or we have the cluster APImanagement cluster so for now we don'thave a lab F cluster we'll soon havethat and we'll go inside this clusterand see which CRDs are created andfinally the N tab is the the lab F nodestab so this is just an overview withcanines of the running nodes on the onthat cluster for the moment so we havesix nodes at the moment hopefully aseventh one or even more if we have alittle more timeso let's get startedumperhaps so the the file structure isthis one the one we see on the left handside so the first step that we have todo we have to actually import theexisting PKI so for that we use uh thisscript here so we gen� I will run thisscript with SSH on the on the target uhcontrol plane node so on the existingcontrol pane node so we will generatethe cubadm join token we generate thetalos secrets bundle the command that Ishowed before then we'll generate or wewill read the sd encryption keysubstitute that with a little yq commandand print the file out we'll directlyprint the file out and save it to thesecrets folder secrets bundle file so todo that we ssh on the node and we pipethe script and that's it so now in thesecrets bundle file here we already havea first step we havethe we have the secrets bundle filewhich corresponds to our actualcommunities existing cubadmium clustersso with the right community certificatescertificates and everything so the firststep is done now we also need to createsome TLS files so I have created a smallscript for that as welluh it just reads the the CRT and keyfile from this these lines here and itsaves that here so just some YQ commandsso let's review again the CRDs so herewe have the control plane zero replicasthis is important data plane zeroreplicas and I think with that we shouldbe able to apply thiscustomization on the cluster API clusterso no error so far the demo god seems tobe seem to be with me for now so we havea new new name space capy one communitypfnet lab f capy is standing for clusterAPI when we get into this name space wedon't see any pots because we are notworking with spots here we if we checkthe clusters CRD we see that we have aCRD so we have we have a cluster uhcluster API cluster instance and then uhwe could check for machines but normallythere shouldn't be any machines becausewe haven't increased the replica countyet we can also check for Talos controlplane and this is an important resourcethis one is the one that will be used byuh the controller the Talos bootstrapprovider will use this uh instance tocreate the the bootstrap data which isthe data that it needs to send to Talosuh when it initializes the the node sothat's are the CDs that we currentlyhave on the cluster and now the nextstep that we need to do uh I will runthis here it's not really importantwhere we run it we need to patch we needto patch this talos control plane thisis the critical bit if you don't do thatif you don't patch it with thebootstrapped equals true flag in thestatus sub resource if you don't do thatthen when you increase the replica counttell us or the the controllers ofcusto that that it is a new clustercompletely new cluster so they will senda bootstrap command and then this newnode will not join the existing one itwill just create a new form a newkubernetes cluster on its own so it willnot join a CD and then you just uh losesome time but if you run this commandhere if you patch the existing taloscontrol plane then it status ofbootstraps be bootstrapped becomesbecomes true and here we should beseeing that in the status bootstrappedis true and now that I've done that wecan start our first control planereplica so let's do thatwe could also directly editthe the resource but it doesn't matterso I apply this with one morereplica uh so now we see we have onereplica and if we check the machines weshould be having a new machine which isprovisioning good so it has been proprovisioning for 13 seconds um we don'thave and member so far it should come inaround 40 seconds hopefully uh we mightbe able to check the Talos logs uh Ihave already prepared the IP address umI first need to create a Talos configfile it's always a step that I forgetyou need a Talos config file to toactually interact with the nodes[Music]um this should be the one and we shouldbe able to watch the logs of the node sowe see on the Talos node is booting uhand we see that the HCD service iswaiting to be up let's check the oldcontrol plane nodes on the seventhworkspace we don't have ityet it should soon come up and be alivenormally it's after 1 minute so I thinkwe can wait a littlemore and here we have it so our newcluster API node which has just bootedTalos just joined the existing HCDcluster this is a good first step andnow that this is done the cube APIserver on that control play node willalso be able to to to communicate withHCD will be able to start and cubelet onthis new node will be able to to reachthat cube API server and register itselfin the existing cluster so now if we goback to the nins tab where we have thenodes we and if we wait a little morefor how long have we have been we havewe been waiting so far not sure soalmost two minutes so it should sooncomeup and so now the nice part is that Icould simply increase the number ofreplica and there is then there will bea new machine starting instantly and Icould do the same for the machinedeployments machine deployments is whatyou use to provision uh worker nodes ordata plane nodes as we call thoseso the suspense isincreasing let's check the talos kettleso it's not yet happy about the cubletendpoints and TLS internal error this isnormal because we haven't signed theserving certificate yetbut if I continue talking normally thenode shouldappear soum before we continue um in thedescription of the talk I had alsomentioned some Argo CD uh applicationset uh uh parts i will not be coveringthat because I don't think we haveenough time and it goes into too manydetails i should prepare that foranother talk at some point but I putthose slides at the at the end of theslides slide deck you have some some ofthe templates that we use and it can bequite helpful in managing the workloadclusters and so yeah here we have ournew node which has joined the existinglab um F cubadm cluster uh we see seliumwhich is starting and uh yeah in a fewfew seconds we should have all nodes uhwith the ready status and which meansthat we will have uh actually migratedor we we then we have done the firststep towards migrating there is stillsome some work to be done to completethe migration but the the first step atleast it's done we have a node which wea control plane node which we integratedwith the existing cubeadm control planeso it's not worth waiting for anotherminute that it comes ready maybe I canswitch back to this one later so thatwas for the live migration now um thebootstrapping issue the chicken and eggproblem um so we want to manage ourclusters with kubernetes cluster but Howdo we do when the in the disasterrecovery scenario if we lose everythingif we don't have any clusters how how dowe manage our workload clusters withouta management cluster API cluster and sothe idea that we had is to use anephemeral tool or and we we we havewritten a a small go utility that wemight open source at some point and whatthis utility does it it starts a kindcluster locally uh it installs all thecluster API CRDs and controller on theon the clusterAnd then it rest it checks on on an S3endpoint whether we have alreadyexisting manifests and for an existingcluster or not and if those manifestsare there it just reimpports those andthen it can manage the the the clusterand if those manifests are not there yetwhat it does is just it will justprovision a new management cluster sofor only for the management cluster weuse this um ephemeral tool uh toactually manage the management clusteritself that's the solution that we foundwe run that in a gitlab pipeline but wecould also run that on our laptop if weneeded we just need to collect allsecrets and have the right networkaccess but then we could reallyprovision a cluster from scratch in adisaster disaster recovery scenario sothat's it for the that's our solution tothe bootstrap bootstrappingissue so in conclusion we we have seenhow to match configuration import theexisting PKI add new nodes we saw whatcould go wrong on the way and uh the thelast step that we'll be able to conductin a few months will be to remove thecubadiumnodes so yeah I think that's it for formy for this session uh I will end upwith um few words uh in PWA pwa is theold dialect from from from the Frenchspeaking pact of Switzerland and itmeans uh take care so poithanks so if there are any questions I'mhappy to take questions right now orafter the session as you likei don't see any question so I guessthat's it thanks2025-04-15 22:02:22.980279 kk��x�#��'AuQ_WN1kuDo0hi everyone i'm glad to be here today myname is Clemon Bame i come fromSwitzerland i'm a Swiss softwareengineer working at Post Finance and funfact I live on a farm with my wife she'sa farmer and with her family and so Iget to see cows every day and then I getto play with Kubernetes uh during theday as well so I'm a lucky guy uh solet's get started with this sessionabout our migration from CubadiumCubadom i'm not sure how you pronouncethat i will use Cubadium and towardscluster API and Telos so how old canyour clusters get our oldest clusterswill soon turn six years old and todayit is now 299 days old which gives youthe title for this for for this sessionactually uh why do we bother withkeeping cluster alive for so long whydon't we always recreate new clustersand move application there for us thechoice is uh due to the way we work withour clusters we use quite large sharedclusters on which there are lots of namespaces for lots of application developerteams so instead of moving 500application teams to a new clusters tonew clusters everyone every time there'sa new release we prefer to just updateand upgrade the clusters in in place andup up until now we've been quitesuccessful with that uh with Cubadm sofarum Post Finance Post Finance is a smallbank or small it it's a it's a bank inSwitzerland uh we run uh 35 vanillaKubernetes cluster so the open sourcedistribution we just use Cubadn Cubadmto to configure everything and run thethe multiple components of Kubernetes asI said the oldest cluster will soon turnsix years old we run all we run all ofthat on two on-prem data centers uh wedo visual v virtualization and then wecreate the virtual machines that we needfor our nodes uh we run chaos monkey onall clusters this is might be a fun factbut it's really helpful for us becauseup to the production clusters where werun the card payment services we runchaos monkey and this helps us a lotwhen we have to do an an upgrade becausethe application have to be able towithstand a pod restart so that's that'sa little about what we do at postfinance and with Kubernetes and cloudnativetechnologies that being said uh let meexplain how we currently provision ourclusters the starting point so for nowwe provision uh DBN VMs with Terraformand we boot those VMs on on our on premclouds on vSphere then we run a little aseries of puppet configuration not surehow many of you know puppets um butlet's say I'm I'm looking forward toworking only with status um so we usethat and then we we have to manuallyregister those nodes in an inventory mmlfile that we use to run the encelplaybook and then we run this molplaybook and what it does it basicallyrenders the config files so the thecubeadm config files the cube API servermanifest it calls cubeadm commands togenerate all those manifests so that thecontrol pane can actually start and thenum finally one of the last stage is toconfigure Argo CD integration because wewe deploy everything with Argo CD oralmost everything including our ownworkloads also infrastructure workloadslike we use application set for that wedeploy engineext with or selium withwith argod so that's what we do todaythe last part we are happy with but thefirst four step we are not so happy andso that's why we want to move to clusterAPI and I'm not sure how many of you arecurrently using clusterAPI okay so I would say 5% that's notmuch but you might be using it withoutknowing it but uh so cluster API in afew words cluster API the idea is thatyou have you use the declarative APIs ofKubernetes to provision and to make yourcluster work so you have a managementclusters on which uh you create the CRDsfor cluster API and on this managementcluster you have a series of co��g these runes cangive us clarity in uncertain times sowhat makes up an SLO first a servicelevel indicator SLI this is a metricgenerated by query so an example islatency throughput or error rates whatyou want to measure then you have anobjective so that's the desiredperformance for that SLI so if it'slatency we want our latency to be under50 milliseconds for example next we'llhave a target value so how often mustthis objective be met so 99.99 999% ofthe time for example next how do wemeasure it with a time window that's sothat's the duration over which thetarget value is measured for exampleover a two-hour window over the last 24hoursso just like any magic potion um SLOs'scombine carefully chosen ingredients sowhat we need is a common languagebetween our engineers uh operations andstakeholders we want it to be userfocused so we want to highlight realimpact and allow it to let it allow usto prioritize improvements next we needwe use these for guidance so this helpsus to clearly navigate trade-offs uh andreliability innovation and velocityso there's different quests different ordifferent platforms different questsright not every quest has the samedanger similarly not every platform hasthe same needs they all face differentchallenges and different things theyneed some of them some things can besimilar right there are a lot ofsimilarities we want all of our questsgoing through but there might bespecific needs for specific platformsfor example training and predictiveinference and Gen AI today we're goingto focus on Gen AI um because that'swhat we've been working the most on sofor Gen AI our usage patterns could beyou know real-time inference we mighthave interactive workloads we might havesome batch things going on um also whatwe care about is you know managing howmany tokens we're getting in theconcurrency the streaming latency andtheavailability so the GI hydra is trickyyou know you chop off one head likeinconsistent deployments and two moreheads pop up like security orobservability issues so we should armourselves wisely what makes theseplatforms complex where there aremultiple things right like fracturednetworking so we have latency servicediscovery crosscluster traffic a lot oftimes now we're running on a hybridenvironment so um observability betweenyou know something like Amazon AWS uhbedrock and running things on prim andwe have different uh deployments likedifferent infrastructure versions andscaling so all of these things make it alittle bit difficult to uh tomanage so for an overview let's look atwhat a typical you know gen AI platformwill look like from a high level so wemight have a load balancer we might youknow hit a gateway for us we use theenvoy gateway and we use that to be ableto serve you know onrem a manage manageinference uh kubernetes cluster and weuse kserve with that but also we use itto easily be able to hit any other lmprovider that you need so this is whatuh a request going through our platformwould hit now let's follow the requestquest this is just an example say wehave an a request it's going to hit thegateway it might need to go through somehops in the gateway as well maybe it hassome authorization things to do maybe ithas the rate limiter set up and thenonce all that is uh configured it'll goback to the gateway and then it'll go tothe model provider and let's say thatwe're running something self-hosted wellthen we can also manage uh doinginference and it might have to do aprefill decode and then hit the all theway back again the response goes to theclient so at each step there can belatencies at each step there can befailures and we need to as a platformteam clearly understand and manage uhwhat is going on at every one ofthese so measuring our DNA success uh wehave client-f facing SLOs's thatdetermine our quality of service so uhthese are crucial for our SLAs's sothese metrics define user experience andensure that our platform delivers on uhpromised reliability latency andperformance and we have different levelsso you for a platform you might want tohave a a high priority level so userffacing� real time medium maybe that'slike internal real time and then low soour batch uh batch inference thatdoesn't need to run on a specific timeum then we have to help us with these wehave platform level metrics so we trackour infrastructure health at um at eachenvironment and then we also have uh usethese to guide our scaling and resourceplanning lastly we have model levelmetrics which help us so we want to knowthe performance of the models umself-hosted or vendor and then we wantto use certain metrics for determiningthe reliability latency and costefficiency of ourplatform let's talk about that ourspellbook of metrics so for example uh agenai platform might want some thingsfor reliability uptime including therouting layer uh API conversion errorscost effectiveness things like how wellare we doing for um optimizations likeautomatic routing uh model cachingprompt optimizations also we want tomeasure our scalability not only do wewant to know the number of requests butyou know for JAI we want to knowsomething like the number of tokens ortime to first token um also we want toclearly measure our latency so fromevery platform component and the requestquest that I showedearlier soum imagine that I mean as platform teamsso we're in a unique p position rightwe're not just responsible for monimonitoring our service or a few servicesbut the overall health of the platformitself so imagine a scenario as aplatform team you have resources acrossyour platform you have users resourcesinfrastructure resources and all of asudden they're wiped out um was it ascheduled cleanup job like how much waswiped out um was it a mis something thatwas misconfigured that ran wrong was ita failure in the underlyinginfrastructure um or even let's takesomething maybe a little bit more subtleso what if latency slowly creeps upacross services what if traffic goescompletely quiet then the questionbecomes you know how long would it takeus as a platform team to notice uh intoo many cases platform teams can findout when the users complain which is bythat point often too late so the impactshappen and the frustration from theusers already set in you know you'relosing trust from your users as well sowithout this like clearly defined realtime visibility platform teams are kindof stuck in this reactive mode uh butwith the right observability tools inplace we can you know detect anonymiesearlier respond effectively and mostimportantly we can better contain theblast radius before it spreads so thisis what modern platform observability isreally about not just like dashboardsand alert but giving teams the power toyou know stay ahead of problems andmaintain confidence in the health ofyour systems so without our SLOdashboards we wandered in the darknessbut now our dashboards will eliminateour path clearlythank you Alexa um a little disclaimerbefore we go into the next part of thepresentation the dashboards that you seehere are just for the purpose of demothey do not represent the health of anyapplication or service in anyorganizationuh a common question a lot of us face ueven with the best intentions dashboardsoften fail so why do they fail thoughfirst of all there's too much noise andvery little signal take the example herewe tracking latencies across ninedifferent models but with too manythings led in we cannot really tellwhat's wrong there's no connection toany upstream issues or any downstreamimpact second these dashboards arerarely maintained uh as they saydashboarding is done in the set andforget mode um they get set up once butas your platform your infrastructureyourapplications evolve uh they quicklybecame become stale this leads tomisleading signals and even outdatedmetrics third uh we run into siloedviews uh you can see that model 2 hereseems to have high latency but againthere's no connection with the upstreamtraffic patterns or any downstreamimpact or business impact for thatmatter and finally observability withoutany kind of alerts is just pure storagewithout defined thresholds or SLO basedalerting teams cannot discern whichissues to act on and when to act o�nuh quoting one of our in-houseobservability Salarino here if yourdashboard takes more than 90 seconds tounderstand it's bad um and let's justacknowledge SLOs's are hard to get rightso where do you start step one is tostart with user what does good look liketo them if you're running a back-end jobyou need need not need like a 59 uhuptime for your service it is it's wellwithin accept acceptable latency so onesize fit all will not work here um nextchoose metrics that actually map to theexperience like latency error ratesthroughput uh but be careful justbecause you can measure something doesnot mean it's valuable pick the onesthat actually align with the performanceoutcomes um then make sure yourinfrastructure is instrumented withthese metrics um if you cannot reliablycollect the data you're missingvisibility into some of these layers uhthen test your assumptions what doesthat mean uh you use some real worldbaseline data um in case of existingservices uh it's always good to lookbackwards to look forward um use thatdata to validate your chosen SLIs andreflect the uh and whether they actuallyreflect the end user experience um andfinally benchmark what's realisticallyachievable you don't want to aim for100% if your systems has never even hitlike 99.5% uptime um your SLOs shouldstretch your teams but it should alsoremainattainable uh now what are someprinciples of good or great dashboardswhen designing effective observabilitydashboards uh it starts with at theglance clarity the goal is to instantlyanswer is everything healthy if notwhere should we look first simplicityhere is the power we need to design ourdashboards with high cardality in mindtoday's system generate data with manymany dimensions dashboard should becapable of slicing and dicing into thosecomplexities without becomingoverwhelming it's important to balanceyour aggregated global view with a drilldown capability you want a top levelpulse of your application or yourplatform but also the flexibility tozoom in whether it's uh into specificservices error patterns or um timeranges when a issue arises dashboardsare even more powerful when yousupplement them with contextual datasuch as traces logs uh it's not justabout the metrics but the story behindthem right what happened where and whyand finally the focus should always beon actionable insights it's not enoughto just surface the data we want todrive decisions prioritize clearthresholds and meaningful alerts thatwill help uh the teams to know when toact and what to acton uh let's look at an example of atypical dashboard for an LLM applicationit's well structured into keyperformance areas such as success ratesburn rates error budgets latency andthroughputthis dashboard gives us a 360deree viewof model 1's performance with respect toits SLOs's u over a period of 48 hoursstarting with the good news uh the modelseems to have perfect success rate umeven the for the past 7 days u thatmeans short-term reliability is stronghowever if we take a look at the 28 daywindow uh we've hit the success rate of99.9% which is still at the edge uh theburn rate though we have exceeded it inour long-term goal what that means is uhthe burn and combine that with the burnrate over time chart right it's not aspike but a steady climb that tells usthis is not a one-time outage it's apattern of small repeatedfailures looking at latency we see a lotof fluctuation between time to firsttoken and even the inter token latencyuh there are some spikes in both thiscould easily be a sign of scalinginefficiencies or model bottlenecksunder load and lastly while thethroughput is relatively low there aresome error bars present um this means nomajor concern right now but they couldbe contributing to your budget burnoverall the model is performing well inthe short term uh but we have se seenstart uh signs of early degradation andit's a good candidate for preventativeuh optimization whether that's uhfine-tuning your retry logic or latencytuning or even adjusting how we handleedgecases now we'll break down the uhprevious dashboard into smallercomponents and just� zoom in into whateach one of them has to offer um a quickglance here tells me um that are wemeeting our SLOs's is there any ongoingissue u this this particular view orpanel gives me a clear temporal glandityuh because it summarizes success ratesover different period of time and thenit's also supplemented by a time seriesinsight there's a trend that I can lookat and you can see there's a clear dropon one of the days it makes it easier tocorrelate with incident timelines anddeployments speaking of burn rate thisis a sideby-side comparison of why burnrate is such a powerful SLO signalespecially when tracked across multipletime windows on the left we are lookingat a transient incident there's a sharpspike uh in the one one hour burn ratewhich signals something seriously wentwrong uh but it resolved quickly and thelongerterm windows stayed flat um thissuggests there's no long long-termimpact to reliability however on theright side the burn rate starts low andgradually increases across one day andthen continues to increase across 28days uh time window this is a classicalpertinent issue nothing looks criticalat the moment but over time the systemis quickly eroding the error patch andthis slow burn rate often slips into theradar if not monitored monitoringmultiple windows helps distinguish anoisy blip from a real structuralreliability issue both scenarios resultin different kind of risk one is a firedrill and the other one is a slow driftin SLO violation you need differentresponses foreach now before we move on to the nextdashboard uh let's take a minute tounderstand a few critical concepts oflatencies in a typical LLM inferencetime to first token which is TTFT andinter token latency ITL time to firsttoken is the time it takes from when aprompt or an input is provided to amodel to when it returns the first tokenit's absolutely critical for perceivedresponsiveness as fast time to firsttoken gives user uh confidence that thesystem is working and it's con inconversational interfaces it defines thefirst impression inter token latency isthe time to take taken to generate eachsubsequent token after the first tokenfrom the user's perspective this governshow responsive or smooth the output isstreamed especially for longer use casessuch as documents code and summaries itlis if ITL is consistently uh slow orinconsistent the user experience isjerky and delayed even if the firsttoken comes back quicklynow taking those two metrics onto ourdashboard this focuses on uh streaminglatency for one of the models um and aswe can see TTFT hovers between 400 to700 millisecond which is reasonable fora production grade inference service uhbut there are some dips in spikes thiscould be linked to cold starts loadimbalance or even backend queuing evenbefore the inference actually startsitil is mostly low and consistent butthere are periodic spikes even up to1400 milliseconds which could indicateconcurrency limits batchinginefficiencies or even degradedtransformer performance together thesetwo metrics offer us a holistic view ofuser perceived performance of a LLMapplication uh and we want them to below and stable this kind ofobservability is not only essential forengineers S surres or platform ownersbut also for product teams because whenit comes to gen AI performance is theexperience again this is a verysimplistic uh panel for throughput umout of the total request every singleone returned 200 okay which on thesurface signals excellent performanceand no userfacing failure we see 0400smeaning no client side misuses like badinput or malform a request also05503s suggesting no infrastructureissues u model timeouts or overload umerrors now while this might seem perfecton the surface uh in most productionenvironment a true 100% success rate isoften alarming so this is a quick promptfor us to check are we correctlycapturing error responses are failuresbeing retrieded or overwritten beforereaching the metrics are we filteringout any endpoints or failure types ifeverything checks out this give usstrong confidence that the system isperforming reliably but if not� we mightnot need uh need to revisit ourinstrumentation to ensure we are seeingthe full picturetalking of a different kind ofthroughput which is more relevant for anLLM application token throughput uhbroken into um prompts andgenerations first we see there's a largespike in prompt traffic early on thatcould indicate it's a bad job a burst ofinference load or high demand fromupstream service following that weobserve continued bursts through thethroughout the day with our gateway uhtraffic gradually tapering off um andright after the activity completelydrops off this could mean a scheduleshutdown a traffic routing issue or evena scaling event another key insightgeneration token consistently stay loweven during periods of high promptingthat could point to very shortcompletions or prompt only operationssuch as embeddings or validations orpossible inference failures or timerpost prompt this kind of throughputanalysis is important because it helpsus correlate our usage pattern with costerror budgetlatency next we have a dashboard whichmonitors the end toend performance of aninference service it gives us detailedlook into the response time across thepredictor stack from the gateway to theQ proxy to the model itselfon the left we see percentile breakdownsof latency um and the good news is P50is under sub millisecond u this is astrong indicator that there are networkand routing layers are healthy howeverwhen we shift to the actual modelinference you'll notice that theresponse times are higher that's notalarming but it actually tells us bulkof our latency lies within the inferenceand queueinglooking at the middle in the right panelqueueing uh delay shows some spikybehavior uh this suggests that there'ssome intermittent queueing pressurelikely due to um bursty load patterns oreven uneven request patching um whilethese spikes do not appear to degradeoverall performance significantly um ourtrail latencies remain within acceptablelimits uh overall the system isperforming well but these traceablequeueing spikes highlights areas forpossibleoptimization and finally uh all responsecode latencies are tied to 200 responseswhich means we're not seeing any latencyinflation due to uh retries or errorsand that's exactly what we wantuh moving on to a different type ofdashboard uh which I tend to use prettymuch every day which is utilization ofyour infrastructure uh this gives you afocused view of GPU resource demand andusage over a 7-day window uhspecifically for a cluster that I'mlooking at it combines real-time andhistorical context to assess howeffectively GPU's resources are beingrequested and utilized uh as you can seethere's low and sporadic usage throughmost of the week even when the requestswere high actually utilization remainedwell below 100% I would say even below40% for some uh this indicates apotential gap between allocated versusused resources the system may besuffering from in inefficient jobscheduling overprovisioning or idleallocations sign um optimizationopportunity either at the workload orschedulinglevel now talking about supplementingyour dashboards uh with contextual datathis is an example of a distributedtrace for an inference call um while alloperations are within a single serviceum it spans multiple internal steps uhwhat you notice here is inferencerequest actually takes up majority ofthe time signaling this is the mainbottleneck supporting spans for modelselection input validation or even HTTPresponse sending are comparativelylightweight uh you'll notice thatthere's um a few uh pre preliminaryoperations like validation modelselection um and they are lightweightright so if we look deeper though thistrace is not just useful for debuggingslow requests but for understandingwhere to invest optimization effortwhether that's reducing model latencycaching results or even parallelizingtasks now that we have defined SLI andSLOs's uh the next step is to make themactionable and that's where burn ratebased alerting comes into picture a burnrate tells you how quickly you'reconsuming your error budget when theburn rate is high it's a sign that thesystem is degrading faster than a SLOallows uh let's break it down right uhwith some examples here a 1 hour burnrate above 13.5 means your system isburning the entire uh error budget for a30-day period in 2hour window this islikely a major outage and you want ahigh priority alert here a one day burnrate over 2.8 still isn't great uh butit could mean a moderate issue andthat's not immediately catastrophic uhbut could escalate fast and if your7-day or 1 day burn rate exceeds oneyou're seeing a slow burn less urgentbut it might be a sign of ongoingdegradation or even user impactivebehavior the key is to match thesethresholds to your environment don'ttreat your dev and prod the same youwant to surface high severity issues inproduction faster but not drown in noiseelsewhere um also don't just fire alertfor any blip focus on meaningfulsustained violations and prioritizebased on severity and impact and finallyremember alerting this is not staticyou'll need to tune it over time um usedata from your incident reportspostmortems and SLO reviews to refineyour thresholds andlogic um taking the the learnings fromhow you can fine-tune your dashboards umthis is an example of what we can call apersistent failure mode this couldlikely be due to a misbehavingdeployment um and upstream dependencyfailing intermittently or poor handlingof edge cases causing request failuresuh because burn rate remains high acrossall the windows this isn't just a spikeit suggests that the issue has beenongoing and not fully mitigated uh at10.3 the platform is burning through theentire alliable budget 10 times fasterthan tolerated this is actually criticalandunsustainable um observe scattered dipsand uh in success rate here during theburn uh window these dips actuallycorrelate with the periods of elevatederrors um and the errors are frequentand sustained rather than isolatedanomalies then there's this example umjust 100% straight zero burn ratethroughout all the time windows lookstoo good to be true right but itactually suggests either the metricsaren't being collected accurately errorsare being swallowed or misclassified orthere's a discrepancy in how we havedefined success or failure for thisparticularservice um in a production systemespecially one handling real worldtraffic this level of perfection acrossall time windows is highly suspiciousand should be prompting a deeper reviewuh so how do you fine-tune your SLOs'sright you start by grounding in realityum you looked at the patterns of howyour error budget is being consumed ifyou're not using uh it at all your SLOmight be too lenient uh on the flip sideif you're burning through your error uhbudget consistently the target might betoo aggressive or misaligned patterns inburn rate can actually tell you um whata common type of error is whether it'sand which part of u the workload it is uevery incident is actually learningopportunity u if you had dine that didnot register as a violation that's asign that your metrics or thresholdsmight not be measuring the rightthing to wrap things up if you'rebuilding or running an AI platformobservability is an optional isfoundational we have seen how robustobservability stack built with opensource tools such as open telemetryPrometheus Envoy uh give you fullvisibility across inference uh pipelinesnetworking layers and the GPU resourceusage but tooling alone is not enoughbest practices matter this meanscorrelating your logs and your traces toshorten your mean time to resolution italso means setting smart achievableSLOs's and automating alerts based onburn rates and not just vanity metricsand most importantly observability isnever done the most resilient platformsare the ones where feedback fromincidents data and your users constantlyfeed a loop into uh refinement andimprovement if you walk away with oneidea from today treat observability as aproduct and not just a tool set build itas you evolve your platform um and howitgrows just a small plugin we're alsohiring across multiple uh roles withinour AI or atBloomberg thankyou amazing2025-04-15 22:02:23.500897 ��_�#��uAzLpUJBU6sT4okay let's start we are talking aboutorchestrating AI models inKubernetes specifically deploying Olamaas a native containerruntime so my name is Samuel Beloso iwork as a software engineer at cast AI iam in the security team where we arebuilding a product to find and fixvulnerabilities and anomalies inKubernetes environmentsyeah and I'm Lucas i work at Red Hat inthe Roy Red Hat AI platform and I'm alsoa QFlow contributor member yeah so Iwork in security but today I'm going totalk about AI or more about Kubernetesokay and how Kubernetes works internallyso let's start so if you have never usedOlama if it's the first time you hear uhthis tool Olama is a CLI tool to runmodels AI models locally in a verysimple way okay this is the GitHubrepository of Olama and you can run themost famous model like deepse lama andany other AI models and it's uh verypopular i don't know if you are awarebut the number of stars in the GitHubrepository in the Olama GitHubrepository is 134K it's even more thanKubernetes so it's yeah it's crazy andonly in the last year or year and a halfand why is that I mean the AI hype isreal and also I think it's because ofthesimplicity because if uh if you want torun a model locally with Ola what youhave to do is first you neeׁ�X�#��gAKyoxaAtHi-cwelcome and we're very happy and excitedto be here today and to present to youour talk dashboards and dragons craftingSLOs's to tame the AI platform chaos atscale so welcome brave adventures to ourquest where we'll craft the SLOs's tovanquish the chaos haunting our AIrealms my name is Alexa Griffith and Iam a senior software engineer atBloomberg i work on uh building our AIinference platformmy name is Ankita and I'm a seniorproduct manager at Bloomberg um I focusmostly on our computer infrastructurenice so our quest begins um as you cantell I had a little bit of fun makingthese doodles for this presentation solet's check out our treasure map fortoday's journey we're going to tacklethe Genai Hydra peer into the crystalball of observability and then venturedeep into the dashboard uh dungeon anddefend against the fortress of platformchallenges first a little bit about AIplatforms at Bloomberg we service thewhole model development life cycle sofrom exploration with the data tobuilding your model and experimentationto deployment serving and then um modelmaintenance and production formonitoring updating so what this lookslike is various platform teams andservices um within our AI within AIplatforms so we offer services likeJupyter notebooks um different ways totrain and use HPC manage serving managedinference and different AI pipelines andtools to our usersso let's get into SLOs's think of SLOs'sas ancient runes each holding the magicto illuminate the health of yourplatform so decodin��d to downloadand install the Olama binary from theOlama website you need to start theOlama server in the background and thenyou download your model with Olama poolfor example in this case it's uh theLama 3 8 billion parameters model and touse the model you only need just toexecute run and the name of the modelyou recently pulled and that's it thenyou can start talking with your LLM asyou are doing with satp but locally youcan also consume the APIs for exampleand integrate copilot or any other toolwith the with with the model locally soyou don't need to send your data to anexternal or to a third party server soand if you this could sounds familiar toyou because the pool and run is verysimilar to docker so the ui ux is prettymuch uh the same so is to models whatdocker is to containers it's a a toolthat simplifies the yeah the deploymentof models so when h we wereexperimenting with Olama the onequestion that came to our mind was okaythis is for local but what happens withkubernetes how can we run in kubernetesso the first option is there is a a helmchart that you can use it's it workspretty well and it's really wellmaintained so it's a a very validoptions but we were thinking about canwe do it in a more kubernetes native waycan we use a deployment what happens ifwe put the model that we are pullinglocally but in the deployment image aswe are doing with any other Kubernetesapplication okay and it's possible weneed to set in our deployment uh thisruntime class name okay that we have inthere and by default we don't need toput this runtime class name because uhby default we only create containers butIf we want to create other things or wewant to delegate the creation of ourdeployment or containers to thecontainer runtime or we want to use acustom uh container runtime we can usethe runtime class names okay this ideais not new uh it was introduced inKubernetes 120 so while back ago andactually it's the way that for examplecontainerd one of the most famous uhcontainer runtimes it's the way itimplements the runc so when containerdneeds to create a container it to ittalks to run cy by using the containerdsim run cb2 uh other projects forexample kata container and devisor dothe same in a or are also using theruntime class to do a similar thing butin this case they do not createcontainers they create isolatedsandboxes or virtual machines so uh whenthe kata container or gis or runtimeswants to create uh or receives therequest to create uh a virtual machineor a task it will be a virtual machineit's not going to be a container and thethe same for example for the webassembly sims uh for web assemblyworkloads so we are going to explore howcan we do this with Olama okay and howcan we with this custom runtime class toimplement a custom sim to deploy andexpose uh models in in kubernetes so thefirst question to do this is how doeskubernetes work okay it's a hardquestion a very wide topic so we aregoing to be a bit simplistic in here butbasically we are interested in whathappens when we create a deployment orhow the containers of that deploymentare created on on every node so in ourcase we have this deployment with acustom runtime class name so whathappens in Kubernetes when we do thiswhen we do this the when we apply thecubectl when we run the cubectl cubectlapply f so the control plane if you arekubernetes experts you know that it'smore than this in the control plane wehave also have the scheduleuler thecontroller manager will have more moreparts but at the at the end of the daythe the API server will receive therequest and will store in etc the potokay that it needs to be created cubleton the other side It's installed onevery node and it's continuouslywatching pot changes in the inc it'swatching the API server and when thereis a new pot it's going to call thissync pot function you have the entiredefinition in this uh URL that we youcan check later we have the slides in inthe sket so this sync pot what it'sgoing to do is cublet is going torequest the CRI runtime uh to create thecontainers for us so let's see the syncpot implementation is �something likethis it does as you can see many thingsit will we are going to review now whatare these different steps so don't worryabout the comments basically what weneed to pay attention to is uh the bodyof the function okay so we have the syncport singapot is calling the CRI APIbecause uh we have two main containerruntime container runtimes we have cryoor we have containerd for GKE EKS uh allthe cloud the Kubernetes cloud uhflavors also kind for development and inopen sie environments we have cryo so wehave an abstraction so cubelet uh workin the same way with any uh containerruntime and this is like this because wehave the container runtime interface uhthese are some of the endpoints that aredefined you have the full specificationin here you can go and check out and wehave two main services we have theruntime service and we have the imageservice runtime service uh is used tocreate pot sandboxes and containers andthe image service the same to interactwith the different to manage the theimages in a node so uh let's review therunpot sandbox API first because it'sthe first API to be called when we needto create a container this API willbootstrap the croup name spaces andnetwork for all containers in the samepot and that sandbox will be initializedwith the post container it just a dummycontainer a for loop running and it willbe the first container of our sandboxthe way it works is okay cublet iswatching the pot from the API and thenthere is a new pot and it's going tocreate that pot so it will call the runpot sandbox request it will send thisrequest to the CRI runtime the CRIruntime first will get the sandboxruntime okay this is what we specify inthe runtime class name so we are goingto get the runtime sim that will handlethe creation of this uh request so uh bydefault it's run C in containerd in ourcase it's going to be amama sim thatit's a a component that we are buildingfor this the next step is setting up thepot the it's going to call the CNI it'sa different interface is the containernetwork interface it will set up thenetwork for us it depends on the networkprovider you are using selium the cloudCNI or whatever and finally it's goingto start the sandbox the sandbox the potsandbox it's going to create onecontainer the post container with thiscreate task request so the runtime simwill create this container uh for us orfor the sandbox it will return theprocess ID the pit and finally it willreturn also the sandbox ID to thecubelet so next step okay we have thesandbox it is ready and now we need toattach or add containers to this sandboxthe actual container or in our case themodel that we want to to expose and thisis the this is what the create containerAPI is is for so it uh it will attach uhthe container or multiple containers tothe port sandbox so now we have the postcontainer and also our containers and itworks in a very similar way in this caseuh the cubelet first calls the pullimage uh uh request to the CRI runtimeso the CR runtime for example containerdis going to pull the image uh that wespecify it will return the imagereference and then it will with thecreate container and start containerrequest it will finally talk with theruntimesimama sim in our case to uh with thesetwo different uh calls with create taskrequest and start request to create thecontainer that will be in in the sandboxfinally it will return if everythinggoes fine it will return an an act andso the requirements that we have tobuild our container or custom runtimecontainer runtime or our SIM is okay uhwe need to pull the models with an OCIformat and then we need to build a SIMcompatible with with containerd thefirst thing is okay because we areputting a model in the deployment imageand that erh image needs to be OCIcompatible so we cannot put the imagedirectly from the Olama registry becauseit's not compatible with the OCI formatso the question is how do we convertmodels from the AMA registry to OCIformat we can use this tool okay theit's a tool from the G GPU stack it'scalled GGUF packer go and it converts uhGGUF models to OCI images if you a�re notfamiliar with the GGUF format it wasdeveloped by Georgie Jaranov he's thedevelop the developer of the maincontributor of Llama CPP and it's abinary format that h simplifies or it'san optimized way to store and share AImodels in hagging phase we have morethan 90k models with this GGUF format somost of the models are available in thisformat and how does this tool work thisGG UF packer go tool it works like adocker build okay we have but we have acustom builder so we need to put uh ourGGUF model inside in this case it is inthe add instruction we are pulling themodel from hugging phase in this caseit's the Q12 model and it will then weneed to call this uh docker build justwith uh with the name of the image thatwe want to push to our uh Dockerregistry as we do with our applicationwith our Kubernetes services and that'sit we the image will be available in ourregistry and it can be put into the intothe deployment in the image deploymentuh with this when the CRI runtime whencontainerd receives the pull image uhaction it will uh pull the image intothe node and it will be available forthe containers to mount this image or touse this image as we are going to seenow so okay we have solved the problemuh of building uh the models in a formatcompatible with with containerd or withOCI now we are going to see how to buildthis sim what we need to create a customsim in this case forama but it can uhthe idea is the same for any other okaywhat we need to do is this is forcontainerd okay for cryo it's the sameidea but a bit different so in the talkwe are only going to work withcontainerd we need to set the the customruntime as a plug-in okay uh containerdallows to define as many runtimes as wewant and we need to use this runtimetype and register our uh container thesim we need to copy the sim binary tothe kubernetes node uh I'm going toexplain in the next slide how we buildand what uh how to write this binaryokay but for now we need to copy the simbinary into the kubernetes node becausecontainerd will execute this binaryevery time we need we create a a potsandbox and the uh the last the laststep is we need to create this runtimeclass object in Kubernetes to map thebinary and the runtime class name thatwe will specify in the in thedeployments so the binary okay this isthe important part the how we code thisthis sim we can use if we are using gowe can use the container containerdlibrary because containerd is written ingo so uh we can we need to call this newsim manager and we register our our simwith the n the same name as we have inthe runtime type uh in the containerdconfiguration so it has to be the sameand then we need these three files onlythree files Okay we have this themanager and then we need need to definethe plug-in go and the service.go theservice.go is the important part becauseit needs to implement this hugeinterface the TTRPC task interfacedefined by by containerd and yeah it'squite complex we are I'm not going toget into the details of all thedifferent methods that needs to beimplemented we used the containerd simrun cb2 uh as a reference because uh itis available that's what containerd isdoing to create the containers in ourcase our model is also going to be acontainer so we can it's easy and it'sthe and the simple way to write a acustom sim a container sim so we andactually the most important method thatwe need to implement is this createmethod as you can see we have the createtask request it's the information that hthe sim receives when h a task is goingto be created a task is usually acontainer in the runc context or in acontext is the container model okay andwe have in this uh request we have moreparameters but these are the three mostimportant parameters it's the ID thebundle and the root fs the bundle is anstring uh to a path in our host thatstores the information of the containerthat needs to be created by the sim wehave several files the most importantone is the config.json file that we seein there okay this config.json is justthe declarative definition of thecontainer so what we put inside the potspec is tran�slated by containerd and isstored in thiscontainer.json file with all theinformation about our uh processcontainer for example the user that isrunning uh the container or for examplethe mounts okay the mounts is relevantyou will see in the next slide why butwe have all the mounts that uh that thecontainer will be available so whatbecause what we are going to do is withour sim uh is we are going to mutatethisconfig.json information that we have inhere for example we are going to add onemore mount to the to the list of mountswe are going to mount from the host intothe container this is the first part andnow will be available in the in thecontainer the first part is we need toload our model but with Olama doesn'tsupport to directly read the GGUF modelso what we need to do is we need to putthis from the name of the model insidethis model file name so this is what weare doing with this uh piece of code weare just creating a temporary file themodel file name uh with the name of themodel the GGUF model that we want to runand we are mounting this model file alsointo the container and finally we needto initialize the container to exposeand to serve the model and we do this byjust running and now creating the modelfrom this model file that we we havemounted in the in the previous step sothat's it with that we can just uhcreate our our deployment or our modelwith the custom runtime class setting inthis case it will be our runtime classname and then the containers will be ourmodel we want to create this Q12 modelit will be exposed in the port 8080 themodel name will be Q12 and the modelpath uh we need to specify where this uhmodel is because we can have for exampleseveral models in the same image so weneed to specify what model we want toload and that's it you will see then inthe demo how all this works and for nowLucas will talk about how to integratethis with code flow yeah thank youSamuel has everyone okay for the next 10minutes we are going to talk about likehey now we we've just had so much funjust trying to get in in in Kuberneteswhy did we use rather than I don't knowger form or something like that becauseit was just so much fun trying to dothat so we are just we're going to tryto do the same for the UI after we uhfinished our developmentWe thought okay uh how can we translatethat into something that the communitycan use so we look at Qflow which Idon't know if you guys are aware it'sjust part of the CNCF foundation is theMLB platform that right now it's used uhfor several components AI related Qflowmakes just like ML workflows easier forfor users and you can install that in inany cubernetes cluster and Qflow isbased on several components installedinside a a central dashboard you arecovering most of the MLOps world so withQflow you install different componentssuch as notebooks pipelines well acentral dashboard to get all everythingall together KT for parametrichyperparameter tuning KSER which isbasically a a service for inference andyou can see much of the the workflow theenvelopes that we have right there rightnow we even have model registry which isgenerally available since yesterday within in which we released the 1.10 uhversion and we thought okay h this isnice and there's so many features rightnow but can we make lama uh part of thatas a modelcatalog yesterday they announced thatthey have a backlog for the 1.11 releaseto get a model catalog inside or like inin the next following release insidethis platform so will it it be nice justto try to do a P and and get this sameservice working in a in a GES clusterand we thought about that two weeks agoand we were like hey do we have time tojust display something like that and wewere like okay let's really let's justtry to b code through through everythingand and get something working and that'swhere we uh started to use the modelarchitecture that it's currentlyavailable in model registry modelarchitecture is just one new way ofdeveloping um cubeflow components it'sjust basically there's no rocket sciencehere it's just like a front end and aback end front end in React back en�d inGolang uh but it's just based onmicroservices so there's a cool uh setof features that makes the thedevelopment experience really reallynice so we are we are going to just tryto get those things to make umapplication available in Qflow in notime so the most important of thisarchitecture is that enables contributecontributions from other projects toQFlow so we just thought okay let's justtry to do that um so that's what we didlike we we got uh a nice UI in in notime to to to have it displayed here sohow could we uh do this in no time it'sbecause model architecture serves likedifferent environments to make the thedevelopment workflow really easy youhave a standalone mocked mode then weget the standalone mode and thenintegrated uh which are the parts of thedemo let's just switch to the to theproject itself this is a BS coinginstance with the container in Moamahere's everything that Samu hasexplained everything about that so let'sjust trigger the the first developmentenvironment h this is what we call umlike the standalone mode for with mockcomponents this means that we aregetting the feature as a standalone likeoutside Qflow but that makes everyonethat that everyone can contribute in notime i've just uh like get this thismake file command and now hopefully Igot here the application enabled uhlet's just do something uh because Ithink I I got it in let's just rerun itagain and with that uh let's just reloadwith with that we have the the UI itselfuh as you can see right now there's amodel catalog with different modelsright here uh we got like which is theone that we we've been talking about andsome others um and here we can see themodel catalog with the description andeverything else version i'm not sure ifit's like let me just zoom in a littlebit more and here there's a nice chatdisplay in which you can uh talk withthe model problem is that this is justlike if I do a hello world or somethinglike that you see this is a mockresponse why because this is just fordevelopment so here's when where wecraft all the the UI that makes us likefor example we want to we are frontendcontributor and we just want to changethis the available models into all ourmodels you can get to the uh actual pageand and put something like this and inno time it will be changed into the UIso you don't need to know aboutKubernetes or anything else everythingis through a swagger specificationbetween the back end and the and thefront end so people that knows colandcan contribute in the part of back endand people that only knows front end cancontribute into that that's alreadythere in the model registry and andwe're expanding that into other placesuh this is nice but like this is notgoing anywhere and we are not doinganything with the PC so let's justchange to the next environment which isthe uh a standalone version but withwith a cluster itself so right here wehave a a cluster working this cluster isjust the the one that Samu has beenexplaining right here here we have thethe cluster conf with the configurationaround like all the cmd changes thatthat we've been doing and the quen modeldeployed as you can see here we have thedeployment with a with the image itselfthe image that that we we've createdwith with that and and a service thatit's been exposed somewhere um we havethis script which is the one that we areusing for developing in in cluster thiswill create you a kind cluster the theway that it's displayed in the in therepository and then it will create aname space is like a default one calledQflow and it will deploy that modelright there and then uh it's just likedeploying every other um file that it'sneeded to to do the deployment of UI umeverything is done right there let mecheck if we are in the in the right Yeahwe are in the right uh cluster which isthe kind of lamaim so I'll just need toexpose this uh service and we'll seethis uh cool little uh demo right hereuh with pretty much everything this isstandalone version um we'll see what'sthe difference between the standaloneand and the the other one as we can seethe UI has changed right now so let mego back to the development one sec umhere I just have it with a with the acomponent library standalone versionwhich is like this it's patternly and wehave a wrapper to get like the look andfeel of cubeflow so everyone cancontribute and and it will match alibrary we have right now in in Qflowand it's basically doing that that samething here this is a kubernetes clusterand as we can see in the cubeflow namespace we have the application deployedand then the model itself let's justcheck the model here are the logs solet's see how can we can interact withwith that right now the set of featuresthat has the here is just like we arereading all the uh catalog model from aconfig map that will change in thefuture in which we can uh pull thosemodels through an an API but so far wehave in the metadata the the service inwhich the um model is exposed so forthat we have enabled the chat interfaceright here if we go to another modelthat it's not deployed in the in thecluster you have this which is notimplemented yet so if anyone wants tocontribute to that feel free and here inthe chat model if I just say for examplewhy is cubeconum it will just send the request as youcan see in the chat it's just doing allthese uh like the inference is justenabling the model like getting the thearc the the name the parameters andeverything through the like the theruntime is just uh starting up the themodel and doing the inference right hereso basically the what is doing behindthe scenes it's a golan service is justlike talking through the service gettinginto into getting there and posting asmahas the uhopen service exposing which you have thegenerate endpoint to to talk to to yourmodels as it's the same we can implementseveral different if if your your modelis multimodel you can get the otherendpoints to just interact with withthat model and craft a UI uh with withthat so that's that's great now we havethe everything working and we've justlike touched the um cluster in thatsecond phase next phase is Qflowintegration problem with Qflowintroduction integration is that Qfl isa really great project but right nowit's very cumbersome to install torequire 16 gigabytes RAM of GPU CPUcores and that makes like somecontributorsuh like it's just not great to get a bea newcomer and and just make sure youyou had a contributor so that that nextstep is just like this modulararchitecture has everything needed todeploy everything intoQflow with that we have a pretty mucheverything in in this manifest to makeit compatible with with the platform asyou can see here we have theauthorization policy destination rulesuh network policies and everything thatit's related that that is needed forQFlow let's just change the um runtimelike the the context of of of QFlowright here and let's just see that inaction so I have a cluster running rightnow i think I'm connected yeah um withQflow um enabled right here let's justget there i'll just get here oh this isaspoiler one sec this is a classicrunning with QFlow itself this a Qflowinterface this one is just a mock likethe other one that I saw you it's justlike to make it the most appealing andthe most similar to to that like hereare the all the components with Qflow hwe are deploying everything that it'savailable to the in the interface tomake it work right here as you can seewell here are the name spaces let's justgo to the Qflow one here are all the uhpots that we have in in Qflow which areall the components that are running andas you can see we have the yellow lamaUI right up and running right runningright here so as well as a modelregistry component that is a new one wemade in less than a week a new Qflowcomponent based on onama so here as youas you can see we have the model and wecan do the same thing and everything isis right away and and available this isonly a P but this is like just to showyou guys how we had fun to justimplement some idea and and make itavailable for for something as as a CSCFproject in in no time so that's allthank you very much and yeah I hope youall have a a great evening thanks2025-04-15 22:02:24.078954�U andum I guess um why run LLMs on uhKubernetes and what you need the verybasic thing you'd need is access to umuh is a GPU processor a TPU or GPU andum you could um get that as a hardwarebasically go and buy off a GPU run it inyour office like Luke does has the addedadvantage of warming up your uh officein winter but then you could also mightas well uh get it off um any one ofthese um cloud providers so across umGoogle orAWS and or selium the there's that'sjust not enough so let's say you want todo something with that you would need toconnect your GPU cluster to yourKubernetes cluster and that's done witha device plug-in that makes your GPUaccessible uh into your into yourKubernetescluster but sort of like let's step backwhy bother why go through all thistrouble so you can run these LLMslocally on your cluster so the maindriver we're seeing is security andcompliance so lots of customers we speakto who are either in a health carebusiness or if you're a European telcowhere you store your data is is quitecritical and this is not just dataresidency but also where your data getsprocessed so which means a lot of thehosted LLM platforms are completely outof scope and then going on availabilityand latency is is a big deal as well thecloser your LLM is to you more likelythat you'll get your answers backquicker so that's uh we're also seeingthat um hosted AI platforms like um uhthe ones around today uptime and um APIuh response is still quite shaky at themoment so you're in control of yourinfrastructurewe're also we we live in an interestingera where it looks like open-source LLMsit looks like they may be overtaking alot of the proprietary stuff and uh I'mI'm sure most of you have heard ofDeepSeek at this time and the lastdriver that I'm going to talk about isthe cost of ownership the longer you runthese and the more at scale you'rerunning these um LLMs the costs startshifting so if it's a quick P then greatyou get an open AI um token and you'rerunning but then if it's something whereyou want your model to think longer so alot of these LLMs that pretty much likedeepse are um LLM that are sort of likea chain of thought model so the longerthey think the better answers you getand this is where the cost startsshifting quite a big deal into um yourlocally hosted LMS by local I don'treally mean on your laptop but I meanany kind of environment that isaccessible to you on the cloud or on aserver farm it doesn't matter it's stilllocal cool and uh lastly I'm just goingto quickly say why Kubernetes it'sreally just because Kubernetes isKubernetes because of portabilitybecause of consistency and uhscalability and it's just that you canapply the exact same podspec on um youru server farm or you can apply itlocally and you've got that consistentenvironmentaccessible cool this talk as I said is aseries of demos and uh on every slideyou can see a QR code to a GitHubrepository so this repository has got areadme file with all of the ininstructions in terms of commands we'regoing to show you today as well as um awalkthrough of what else you could do onyour Kubernetes cluster our first demois going to uh pull in a very uh simpleopen-source product called Olama andthis is predominantly used to test outuh LLMs and we're going to put use aHelm chart to pull in uh Olama alongwith Open Web UI over to Luke yeah coolso I'm just checking these mics workokay great um so uh yeah so here we'vegot um a Kubernetes cluster uh with asingle A100 in it um and what we can seeis uh that we've got um open web UI andOAMA running in that cluster um and thenif we go into uh open web UI I'm I'mgoing to start by showing you kind ofone of the challenges that you get uhwith this setup when you get it kind ofout of the box and so what I'm going toshow here is um a a prompt that wewanted to have work um when doing an APIintegration between um an LLM and anexternal system like in this case Jiraum so the prompt is to try and presentthe key information uh from a from anAPI response body um including theuser's question um so we get the user'squestion um uh which is what issues a�rethere and then we get the API responsewhich comes back from Jira um and as youcan see uh even though this is just a asprint with 10 uh 10 issues in it theAPI response body is quite big um and sothis highlights one of the issues thatyou get with Olama out of the box umwhich is that um it uh it has atruncated context length window and sowhat you can see specifically here isthat even though we gave the LLM uhwhich should be capable of doing a goodjob on this task where the task issummarize the issues in the sprint whatit actually came back with uh was a badanswer um and in particular it startedtalking about the JSON response body umit talked about the root element and inparticular it said the root element isan array containing one object but it'snot the response body uh had 10 objectsin it it had 10 issues in the sprint umand so uh I'm going to pass back toPriya uh to explain why that happened umand and then we're going to talk abouthow we can solve it cool thank you um sothe model that you're running by defaultwhen you're using Olama is uh a highlycompressed model so what it means isthat it's a quantized model and um theseum really meant to run on testenvironments like your laptop and uhit's not it's not really tuned for aproduction environment and um especiallythe one that Luke just demoed that's uma a model that's compressed to storefour bits of information per weight inthe neural network and this is pretty uhit's sort of I wouldn't say it's verylow resolution but it's low enough thatsometimes your answers will come backgarbled sometimes it will just um uh saysome random stuff and um we've seen thatum pushing that quantization to eightbits it sort of gives you that sweetspot between something that's practicaland something that's performant enoughso that's u your lama's quantization umuh trade-offs there and we've also seenthat some really important environmentvariables are turned off by default soone such variable is called flashattention and uh flash attention is umuh I think it uh turning it on reducesthe amount of memory uh you use when youwhen your LLMs are answering questionseffectively reduces the memory needed toum return back an answer and uh we'veseen that turning on flash attentiontakes that um response memory usage fromquadratic to subquadratic with contextlength so why did Olama break contextlength why did um why did they give us avery um highly quantized model why didthey turn off um um variables flashattention and other variables the reasonbeing that it's um it's meant to run onyour say MacBook where you're competingfor um memory with Chrome and Slack andall your other enterprise software soit's it's not something you can take itto production with a lot of tweaks butthen you quickly move on from usingOlama to another very popularproject called um WLM and WLLM gives isan inference engine and inference serverand uh it's uh effectively the base onwhich we serve our models so over toLuke to give you a demo of thatcool so uh let me justfind thatsorry okay soum yeah here we haveum right sorry just found the right partof the video um so here we have umanother Kubernetes cluster uh this onehas two A100s in it um because it'srunning um a few different models uhwhich we'll look at later but inparticular let's focus on this VLMinstance here which is uh uh which isrunning FI 4 from Microsoft so um whatwe've done is we've run uh the Helix uhstack on top of this uh VLM instance andwhat you can see is um that when we uhwhen we said hi to to 54 it respondedvery very quickly and so um as Priya wassaying VLM did a really good job ofoptimizing the the latency on responsesand and the number of tokens per secondum and you end up with uh with a systemwhich is very performant um and uh italso comes with better context lengthdefaults so it's kind of closer tosomething you'd want to run inproduction um so what I'm now going toshow you is applying um an AI spec uhwhich I'll explain what that is laterbut the AI spec corresponds to this appuh which is uh for Jira issues in thiscase and what I can show you here isthat now that we�'ve got VLM running witha proper context length setup um it doesa much better job of answering thisquestion uh to summarize all of theissues in the sprint um and it actuallyuh gives you 10 issues back rather thanum as uh did truncating the response soum yeah that's uh that's VLM uh verygood project recommend it yeah cool onthis uh Kubernetes spec on the on the onsort of the right side here you can seea few things here one is that we'repulling in Mistra 7B and alongside thefact that we're using WHLM u one of thegotchas here is that uh WLM downloadsall the weights that are needed for thismodel um from huggingface.com so whichand getting those weights is a gatedprocess involves um of course signingterms and conditions but also it meansyou're constrained in in the sense thatyour cluster needs to have access to theinternet and um uh the whole idea ofrunning things privately is so that wecan have things in your airgappedenvironment so in order to do that wekind of scratched our heads and wewondered how can we uh do we need to uhh invest in a model registry and then wesaid no we don't uh let's just use ourdocker registry and so we uh baked theweights of the model into the dockerimage along with the model and we checkthat into um into docker registry we didrun into a few issues here though um onebeing that um Docker by default uses uma gzip level 8 compression model and uhthis means that um it just takes quitesome time for the model to get up andrunning and answer questions so we thenhad to patch um the machine building thedocu uh images to use a customcompression algorithm and that andpretty much everything else runs asnormal but then uh that's sort of wherewe got to let's again step back a bit uhwhat are we doing here we're servingMistra 7B but then uh there are verymany models that do very differentthings really well there are imagemodels there are multilingual models anduh in this mechanism you're you're sortof at this point where you need acompletely separate uh node pool fordifferent types of models and that'ssort of where you your costs becomelimiting so we then decided um we've gotto find a way to run multiple models butalso have GPU um uh u memory uh sharingin place and um that's sort of where wesaid we're all good cloud engineerslet's build our ownscheduleuler so that is one thing um butthen the other thing is um you've you'reyou're increasing your memoryutilization on your cluster you've got aproduction uh readyish uh environmentwhere you can serve multiple models youcan also mix fine-tuning along withinference jobs which then means you'vegot a wider set of tasks you can do withyour models this is all great but thenwhat's really valuable to the end useris what you build on top why are weserving these models so Luke's going togive us a run through of thatcool thank you Priya um yeah so I I wantto talk a bit about um the differentlayers on top um uh and what you mightwant to build kind of on top of an LLMnow that you've got it deployed um sonow that you've got an LLM and you'vegot like a a chat interface to it wellwhat are you going to do with it wellyou could ask it to count the number ofRs in Strawberry and maybe it will uhyou can ask it to tell you a joke um butneither of those things are particularlyuseful in a business context and inorder to really get value um out of anLLM that you're running in a Kubernetescluster you need to start connecting itto your business data and your businessapplications and so those are twodifferent types of integrations thatI'll go into in in a little bit moredetail now about how you can do uhknowledge which is also known as rag umand API integrations and then the otherquestion is like how do you ensure thequality of the LLM product that you'reshipping like the LLM system it's notjust an LLM it's all of the stuff aroundit and the way that that you do that isthe same way that you do it withsoftware and DevOps which is that youwrite tests but writing tests is alittle bit different in this world ofLLM applications um it's known as evalwhich means evaluations um so what wedid was we uh �we set up um we we built alittle CLI tool uh that that uses an LLMas a judge to judge the output of uh thesystem and to then um give you a pass orfail grade that you can then use in aCI/CD system um in order to to test thesystem and we feel like this is superimportant because um if you can applythe the sort of same CI/CD process youdo you use for the rest of your softwareto genai applications that are definedas configuration uh like a CRD then umyou can apply all of the existing kindof best practices and process that youhave for for your engineering teams toto build and iterate on these thingsumso the next uh piece I want to share isis like how the integrations work so umyou actually saw a glimpse of theintegrations in the demo I just did asecond ago uh where we made an API callum into Jira uh for an as an example andso let me explain kind of break down foryou like how that worked because that'sactually a multi-step process so thefirst step is that and and each step inthis process by the way is a prompttemplate so it's like a prompt that cankind of get filled in um uh so the firstprompt template is a classifier so itclassifies whether the request needs anAPI call or not um because if the modelis just answering what is the capital ofof France it doesn't need to make an APIcall to Jira to answer that question butif you're asking how many issues arethere in the sprint or um uh what's thecurrent exchange rate then the system isgoing to need to make an API call uh toan external system so the first step isto just classify yes or no and if yesthen which of the available tools um andthen if the answer to the uh classifierwas yes um then the next step is toactually ask the LLM to construct an APIrequest body based on the user's queryand the open API spec for that API andamazingly LLMs can do this they can takean open API spec they can take theuser's question and they can have a goat making an AP at constructing an APIcall um that can then uh be executed umso then um the API response uh so thenthe system that we built like will makethe API request and take the APIresponse and then put that into thethird uh template which then summarizesthe response back to the user um sothat's APIintegrations and the other major patternum so API integrations by the way thatwas kind of like live queries againstrunning systems the other pattern iscalled knowledge um also known asretrieval augmented generation althoughI think that rag is an overlycomplicated term and I I prefer the wordknowledgeum the idea with knowledge is just tobring appropriate context much like APIcalls um bring appropriate context tothe LLM when it is uh doing inference sothis um this also happens in in threesteps uh the first step um is that youneed a vector database so a vectordatabase uh and there's good uhPostgress extensions for this now by theway um we use uh PG vector and vectorcord i I recommend vector cord actuallywe use it in production with a customereven though it's quite an early projectuh it works very well um uh and itscales well and so what you do with uhwith with a vector database is you takeyour source documents and you chunk themup into pieces and then you uh put eachpiece into the vector database whichthen kind of maps the text in thatdocument into a vector space which thenallows you to do a similarity searchwith the user's query it's a very longway of saying that if the user asks aquestion it will find relevant documentsin the database and include thosedocuments along with the user's questionwhen sending that to the LLM and that'sthe step two which is you take theprompt and the response uh you pass itinto the model and then the model iskind of grounded in facts and then thethird step is that you can justregularly refresh that knowledge kind ofas itchanges um and then the last piece ohactually no I'll jump to the demo sorryum so pass that back to you so thethere's a third demo here um uh ofknowledge and um I actually couldn'thelp myself uh I I had to um uh sharethe a demo of something that we just gotworking yesterday um so this isknowledge but it's a a kind of advancedknowledge pipeline um that's calledvision rag and Phil's in the audienceover there he got this working like uhyesterday morning so so thank you Philum what we're going to see hereum and actually just just before I Ijump into that let me let me explain howa vision rag pipeline works so here wetalked about putting text into thevector database and getting text back uhanother thing you can do is you can usewhat's called a multimodal uh embeddingmodel that can embed images and textinto the same vector space and then youcan get images out of the vectordatabase and then include them alongwith the user's question as images umand then feed that uh into uh into theLLM uh into a vision language model likeone that's able to understand images soum let me show you why that's a goodidea so here we have um a texton ragpipeline um and the rag pipeline uh hasa PDF in it uh and if we look here wecan look inside the PDFum so I think if we take this so this isa financial uh paper for example aboutum stock sentiment um and you can seethat the paper has text in it that youcan highlight but it also has thesetables and it looks like the person whomade this document like took screenshotsof Excel and pasted them into a worddocument pasted the screenshots umbecause those uh those images um thatthose tables are images so the problemis if you just take the text out of thePDF and put it through this existingkind of uh text rag pipeline uh we'reasking a question here about this tablein particular the uh the price changecorrelation uh table um and if we lookat the result uh the result from uh fromthe system is um is a fairly goodsummary of kind of the relevant topic umbut it has but it admits like the exact10day sentiment lag for the energysector isn't provided in the contexteven though it was right there in thedocument as a human you can read you canlook at the document and you can seethat um but because it was an image inthe document it uh it wasn't able to todo it so we've added this new visiontoggle um in the system and uh and nowthis puts it through an image pipelinelike I just described and it immediatelygives you an answer so it says thesentiment lag for the energy sector is0.35 so if we go and look here we cancheck did it get the right answer uh theenergy sector uh yes it worked so umit's able to pull the exact right answerout of this document even though it'slike a complex document layout um sothis is like really basically an amazinginformation processing system youprobably got like loads of uh likeknowledge locked up in your organizationum and uh and you can use this approachto uh to extract it reliably and makepeople more productive um so yeah uhI'll take the mic um thanks so the lastthing I want to mention is that we'reworking on this effort called the AIspec and the AI spec as you can see hereum is an effort to define a new CRD typein Kubernetes um that allows you todescribe all of the applications thatI've just shown you as configurationrather than code because that allows youto avoid having uh like Genai sprawl inyour um across your company wheredifferent teams might be buildingcompletely separate um implementationsof rag pipelines and then you need tofigure how to secure them all in if youinstead define everything in terms ofthis configuration uh then everyone canbe iterating on the same type of thingand you can use testing um the evalsthat I described uh and CI/CD uh to umto result in uh less effort and betterresults so uh that's uh all we've got umthank you for coming to the talk um ifyou're interested in any of the thingsum that we talked about today thenplease connect uh with both of us onLinkedIn um in particular if you'relooking at building like a developerlike a Genai developer platform of yourown and you are interested incollaborating on the AI spec um thenplease uh reach out to me um and um yeahif uh if you're working on these kindsof things and you'd like some help umthen please also reach out to me uhwe're happy happy to talk to everyoneand uh compare notes um so thank youvery much2025-04-15 22:02:24.721584 ��#��mAKIRUbaUjEKwwelcome everyone and thank you forcoming to our talk this talk is for youif you're interested in in runningreally uh generative AI models on yourown infrastructure in a secure way uhacross the number of companies we'retalking to we're seeing that lots ofbusinesses are becoming more wary ofputting their data on uh proprietary uhhosted AI platforms and what we're goingto do today is show you an alternativesolution where you can host your ownLLMs on your own infrastructure and umevery time you you blink and you realizethe AI landscape has changed quite a bitand uh we're seeing also that lots ofpeople look at this as this is quitecomplex and what we're trying to dotoday is peel back that complexity andstart this um this journey from veryvery basic building blocks and take youthrough how uh sort of our experiencesbuilding AI platforms out of umKubernetes and uh open source LLMs wewant to talk about um patterns thatwe've used pitfalls that we stumbledinto and really talk about our mistakesas well so you don't have to do themyourselves and uh we'll also talk aboutthe layers that we built on top of um uha basic open LLM running on Kubernetesuh before we move on I'm going to take amoment to introduce ourselves so my nameis Priya Samuel i'm a technologyarchitect i've worked in many small andlarge businesses and uh most of mybackground is in DevOps as well asconsulting and um my current um uhenvironment I work in is around buildingidentity and access layer on top of uhgenai applications and um over thenumber of years I've worked um onbuilding u machine learning pipelines aswell as automating these pipelines andum so this is quite an exciting area tobe uh working and over to Luke for anintroduction cool hi everyone um yeahI'm I'm Luke Marsden um I've worked inDevOps pretty much my whole career uh Istarted out um doing a startup that didstorage for Docker and Kubernetes backin the early Docker and Kubernetes daysum I was involved in SIG cluster lifecycle at the time that we created Kubadmso if anyone's used Cub Cubeadm here youcan blame me um and then um I've doneyeah like a string of startups we wethen did a an endto-end MLOps platformuh before uh AI was cool and then I didconsulting for a few years while umduring the time that the chat GPT momenthappened and so I f I saw firsthandworking with those clients um theopportunity to take open source modelsthat were getting better really quicklyum and run them locally in Kubernetesclusters for improved kind of securityand reliability uh so yeah it's a it'san interesting topic and happy to sharesome of our learnings todaycool we're going to start really reallybasic so what is an LLM so um if you'resitting in this room and this is newlanguage for you probably isn't but thenI'm going to introduce it really quicklyjust to lay enough context um to go onto what the nuances of running these onKubernetes so the mathematicalexplanation a very simplified of an LLMis that it's a multi-dimensionalfunction that maps sentences onto othersentences and um the input is a phrasefor example what is the capital ofFrance so that gets turned into a vectorthat's basically a string of numbers andthen you're reading the values of thishighdimensional uh coordinates gettingthe answer turning that back into asentence and returning it back to theuserthis is much more easier when you've gota GP��tionthose requests into the sets of nodesand then we want a way to deal withfailures that doesn't require you towake up at 3:00 a.m and go make changesto some podspecsomewhere how does it work todaywell this is the multi-nodeum this is kind of the the hardwarelayer infrastructure in one of thesethese multi-node um TPU slices as wecall them and in this case you're seeing16 nodes each of those nodes has fourTPUs and they're interconnected withparticular wiring that's programmablebut it's it's you know there's ahardware layer that's fixed um you cansegment that then dynamically into thesedifferent partitions where you know youcan consume one whole node at a time youcan consume two nodes at a time you canconsume four nodes at a time eight nodesat a time 16 nodes at a time and there'sdifferent um uh ways or arrangementstopologies um by which you can do thatwe're showing a certain set of them hereum to be to illustrate itso um how do we do this in in inKubernetes today well we'll label eachof those so if I go back a slide oh if Igo back a slide you see the littledotted lines around each of those setsof nodes we can associate a node labelwith each of those dotted lines um if wedo that you know and then we're going toput that that node label on our p on oursay our job spec um so all the the podswill be confined to to those set oflabelednodes um if we do that we probably wantto do the partitioning in a way thatthey don't overlap otherwise if twodifferent users schedule two differentjobs and they use label selectors thatare uh overlapping then they mightactually take away a node a node that'sneed is needed for a different jobum so what that means is I guess we haveit summarized in the next slide is thatlike we have to choose between kind ofthis static partitioning which means wemay not get full utilization if we havedifferent jobs of different sizes tryingto runum or we have to risk those kind of raceconditions we have to coordinate betweentwo different job authors about whichlabel selectors they choose because it'snot scheduled it's it's sort ofmanually specified in the spec and sothe people have to talk to each other umsecondly if if if there is a failure ofa node um you have to kind of go andback out the job and change the labelselector and reinitiate the job um sothat that's not really the kind ofself-healing we expect from a Kubernetessolutionuh so let's see let's kind of walkthrough what that looks like in a realuse case um here we have uh we we run ajob it needs a 4x4 slice for meaningfour nodes by four TPUs so we manuallygo into some spreadsheet or somethingand we say oh that's this label selectorand so we go and we set that that nodeselector on our job and um the schedulercomes in and it starts scheduling thesepods and it's working pretty well andeverybody's happy and it gets almostdone but then node 12 goes offline forsome reason i don't know you knowthere's a kernel bug and it crashes sowhat happens we can't schedule that podnow we're stuck it's just going to sitthere forever unless we actually go inand we kill that job and we create a newjob with a different we go in andmanually pick a different 4x4 slice anduh a different node label or nodeselector and we rerun resubmit the jobthe the scheduler can't do it for usbecause that node selector is fixed inour in ourspec so what do we do about it well onewe we can use an over-the-top solutionlike Q which is awesome we've got that'sthat's also one of I work with a bunchof people who build Q and so that thatcan really help in these kind ofsituations um but we're also working onsomething uh called DRRA which you mayhave heard of in other context contextsand we're going to give you a crashcourse on it right now but this dynamicresource allocation is a a new resourcerequest API in Kubernetes that as Imentioned uh Patrick and Kevin andothers have been working on for quitesome time um we're beta in 132 and thefeatures that we're talking about heretoday are coming in 133 so Pat uh Mortonwhy don't you uh take over here yeah soDRA um I guess is that a lot of peoplehere probably he�ard about it but we'regoing to do a quick intro here um it'sessentially an APIfor requesting devices and making themavailable to pods and containers um canthink of it as has four parts the firstis an API to describe devices this iswhat is a resource slice API and thisgives like nodes the possibility makesit possible for node to say like I havea Nvidia GPU and it has 40 GB ofmemory um the second part is an API forletting users request devices so that'swhat we call a resource claim uh andthat might be like I need two NvidiaGPUs each of which must have at least 30GB ofmemory um part three is essentially thething that brings those things togetherit takes the available devices and therequest from in the form of the resourceclaims and then allocates devices tosatisfy claimsfinally the fourth part is the cubit APIthat actuates the decisions by thescheduler so makes the devices availableto pods andcontainers so this is an example of thedriver side so you can see a durresource driver that then is a cubletplugin that publishes one or moreresource slices which has a list ofdevices and those then become availablein the API server the scheduleuler canseethem so that means that the scheduleulerunderstands what's available in theclusterand this information is also availableto the Kubernetes autoscaler which meansthat it can cons take devices in intoconsideration when it for autoscalingdecisionsuh which is one of the uh one of the bigadvantages of this latest iteration ofDRRAum and then we have the sort of cconsumer side which is the resourceclaim which lets users specify a set ofrequests which specifies like which typeof device how many and through selectorswhich typically sell expressions canspecify like things like it needs atleast 30 gigs of memory or it needs acertain number of coresum these resource claims are thenreferenced by pods and pods canreference multiple resource claims or aresource claims can be referenced bymultiple pods in the letter case allthose pods will thenshare the devices that are allocated tothat claim we'll we'll get back to thata little bit later um and yeah theresource claimis used by the scheduler the results ofthe decisions that the schedule make iswritten into the resource claim statusand that is used by the cubelet to to ummake those devices available to uh thecontainersum but I describe now and in general thesimplest use case that is often used forthe array is like the devices that areattached to individual nodes but thearray can also model devices that areaccessible for multiple nodes that couldbe through things like the networkum and as Sean mentioned in 133 there'sa few useful features that are beingaddedum the first one is what calledpartitional devices which makes itpossible to model devices that can bepartitioned into smaller overlap deviceslike Nvidia MIGs is an example of thisum we're also adding device taints thatmakes it possible to taint devices notnot nodes that is already available andthis makes it possible to automaticallyevict all pods that has a claim on adevice and by combining these featureswe can then provide a a solution formultihost acceleratorsum so yeah let's look at what this lookslike forum how we model those devices so this isthe same um set of TPUs and nodes asJohn showed earlier and what we do isthat we mapslices each slice becomes a logicaldevice so and each of those devices willthen advertise their capacity andcapacity in this example is the node thenumber ofTPUs so in this example like the 8 by8slice would be advertised as a devicewith 64 TPUs and similar for theothers um each device will also have anode selector that defines which nodesis this device available at and like ifyou look at the let's say the 4x4 sliceon the top left it will have a nodeselector that selects node 1 2 5 andsix um and yeah as we also notice inthis case there are overlap so a singledevice is a member of multiple devicesand the reason that we can do that isthat the array understands therelationship between these devices so inthis example if the2x4 slice here is allocated the arrayunderstands tha�t the 4x4 and the two2x2s that are formed by by node 11 and12 cannot be allocatedso the array allows the the sameunderlying device to be advertised aspart of multiple devices but it makessure that only a single of those devicescan be allocated at the sametime yeah so if we look at how consumingthese um multihost TPUs would look likehere's an example of a resource claimand you can see that it hasa it uses the TPU device class and ithas a selector that essentially says wewant a TPU where the number of TPUsequals to 16and as we mentioned each node has fourTPUs with 16 TPUs that means we'll havefour nodes and that means also means weneed fourpods and in this example all those podswill reference the same resource claimso it's an example of multiple podssharing the same resource claim and thethat reference is sort of an example ofthat is shown on u theright so um if you look at how thisallocation will happen um we have aresource claim that says give me adevice with 16TPUs in our cluster we have four devicesthatsatisfy that uh resourceclaim and it's up to theuler to decidewhich one should be used uh today thescheduler or essentially the the arrayulplugin does this with like a first fitso it looks at what's available thefirst time it finds something that worksit allocates that one in the futurewe're looking at implementing a scoringwhich would allow it to implement sortof a not a first fit but a best fit andbest would then mean according to somecriteria um but yeah um when theum when the scheduleuler then decides ona device it'll then do that as part ofscheduling of the first pod so the firsttime a pods the scheduleuler sees a podthat references a resource claim that'swhen it makes this decision and then thescheduleuler will then select thedevice the device has a node selector asyou mentioned earlier um and that willthen tell theuler which nodes in thecluster those workloads can run on andin this case we see it selected thebottom left4x4 and it decided to place the first uhworker pod on node11and this node selectorthen restricts the pods where any otherpods referring the same resource claimcan runso um this is sort of similar to whatJohn described earlier with theum device plugin but in that case thethe decision of which nodes the podscould run on was provided by the userthrough selectors on the pod but nowtheuler gets information from the devicethat itselected so this happensdynamically and yeah this we see thefirst pods ends up on node 11 the otherones in a schedule queue queue andthey'll all get scheduled on the devicesthat belong to this device and then theworkload canrun now circling back to the exampleearlier what happens now if node 12turns out to be to fail we end up in thesame situation at first the fourth podcannot schedule but with device taintsthat was added is being added in 133 thedriver can now taint the device with ano execute effect and what happens thenis that all pods that referencesresource claims that allocate thatdevice will automatically beum be evicted so all those pods will goback into the pending state they'll beup they'll be picked up by thescheduleuler again and the time of thefirst pod that the schedule sees it'llpick a new device and all those podswillbe scheduled on new nodes and theworkload can continue runninguh this is then obviously different fromwhat you saw with the device pluginwhere this required a the user toactually go in and start changing youcan you can stay stay asleep when thishappens at 3:00 a.mcorrect um there a few open issues orthings that we're still looking toimprove in the array um in the examplenow I mentioned that the the nodes orthe pods end up on the right nodesum what the array does not guaranteetoday is that thosefour pods end up one on each node intheory they could all get scheduled onthe same node um there are few waysaround this um you can make sure that orrequest enough resources on each thateach pod request enough resources thatonly one can fit on a node at a timeforcing theuler to place them ondifferent ones it's possible to useanti- affinity to force them ontod�ifferent nodes or it's al possible touse a per pod claim for all TPUs on eachnode um but all of these are sort ofincreases the complexity a bit and we'relooking to simplify that um we talkedabout scoring so making sure that wechoose the best fit not just the firstfit and we're also looking to supportpreeemption so ahigher priority workload can uh preempta lower priority one so it gets to runand gets to sort of steal the device andumyeahand thanks Morton um yes so so that'skind of the story of how we do that withTPUs um as I mentioned early on GPUslike especially when you look atsomething like the GB200 which has 72GPUs um it's using DRRA as well formanaging and coordinating um theallocation of the in the configurationof the different um compute domains andin uh NV IMAX domains kevin is is hereum from Nvidia if and he can be happy tochat with you afterwards if you have anyquestions specifically on on how we'redoing that uh with um within with thethe the GPUs and and uh Patrick andKevin have a talk tomorrow um giving anupdate on what uh the working groupresponsible for DRRA in the Kubernetescommunity has been doing and uh that'sanother great great uh talk to go to ifyou're interested in and all of this umso there's a bunch more talks this deckis up on the um the gray ones arealready already happened I think butthis this deck is up on the um sketch uhand so if you download it the PDF you'llhave links to the sketch entries for allthe different talks associated with theRA um so we're opening the floor toquestionsnow i think there's a mic here ifanybody wants to line[Applause]up while we're waiting for people toline up I'll I'llmention 133 is not yet released and sosome of the things we're talking abouthere aren't available yet and even whenthey are they'll be alpha but um wethink it's important for people to get ahead start on what we're building yessirlast year in Paris uh DRRA was stillfelt a bit controversial or up in theair it seems like it's much more stablenow is that a good read or yesabsolutely so so out of Paris we formedthis working group uh along with anotherworking group uh called working groupserving to try and address two of thebiggest issues we are seeing around aIML workloads one was Kubernetesrelationship to the hardware that'sworking group device management and DRRAwas one of the you know is basically thethe main thing we've been working onthere and we revised kind of the designto be more uh amendable to autoscalingand that's how we uh came where we arewe've done a a ton of work in the lastyear and um made a lot of progress andand I it seems like uh the design as faras I understand anyway meets everyone'skind of concerns that they had withcompeting approaches that's kind ofunified everything is that a good reador I mean it's not perfect but nothingever is but it I think it's people arepretty satisfied with it there there's acouple of like if you look at theoriginal use cases document for like saythe the the GPU use cases there's a fewthat probably we will struggle to meetwith this design very few though out ofthose that list there's also a bunch ofadditional things we're doing I thinkthat um that you know weren't even inthose original discussions so that andnow it's autoscalable right so that thethe data is there to drive scalingdecisions in a way that it wasn't beforeawesome thanks so muchhello thank you for the greatpresentation correct me if I'm wrongwhat I see as a DRA even though it saysresource it is 101 mapping to a deviceso when you think about resources it'smuch more than devices it could be saybetween GPUs there is a throughputlatency power all the different resourceingredients tied to each other how areyou going to address that in DRRA thankyouthat's a great question um so today wedon't manage um standard resources CPUmemory huge pages ephemeral disc thingslike that in DRRA that's managed um asit's always been um DRRA has been reallyfocused on devices thus far we do allowwe're working on a number of features Idon't think I have the no I don't havethe list here of all the features we'reworking on but if Come to is it tomorrowPatrick yeah tomorrow there's the uhworkg groupoup devices update meetingand you'll see the list of things we'reworking on and one of those is what wecall consumable capacity and it allowsdevices to publish capacity values thesame way nodes publish capacity valuesand thenallows users to request those via aclaim and rather than so today sharingis done by pointing multiple pods orcontainers at the same claim which meansit's a user initi iated sharing and ifwe have consumable capacity then for thetypes of devices that can do it forexample nyx doesn't work well for GPUswithout like vGPU technology but for nyyou can do like uh you could request saya two gigabit uh two gigabit partitionout of a 10 gigabit nick and we couldlet the platform schedule multipleunrelated workloads on top of that nickand consume the capacity out of the nickso we're looking at doing those thingsit doesn't get to the level of you'retalking about where there'srelationships between the resources wedon't really have a we have some ways tohandle that around constraining the setof devices you pick but it's it's quitelimited so come to the workg groupoupupdate meeting come to the Kubernetescommunity and join us and help us buildthathi John uh my name is Yan and fromNvidia so we had uh some discussions anduh yeah offline about how to and handlethe additional tolerations right for GPUtents in GKE because yeah so can youcomment on that so the problem is yeahso for the uh port request GPU resourcesuse DIA right how can we and uh patchthe right so let me explain the issue soin if you're using device plug-in whichis the traditional way of allocatingdevices um at least in GKE and probablyin other clouds we we whenever youcreate a node pool with GPUs or TPUs wetaint the nodes and And then if the userforgets to put a toleration on their podit would never schedule so we actuallyhave a web hook that goes "Oh oh ohyou're using you're using GPUs i'm goingto add this toleration so that you canyou can schedule." The problem with DRRAis that the resource claims are not partof the pod so we can't look at podadmission time and know just by lookingat the pod we can't even really load theresource claims that are referred tobecause it's an eventually consistentsystem it doesn't have referentialintegrity like it doesn't have to existyet and so how do we know to umto tolerate those workloads so I cangive you what we want to do from along-term perspective I haven't figuredout the short-term perspective yet butso sig node has had a problem for a longtime where for example if you use uhcertain features that are require afeature gate to be set on the cublet umthe scheduler has no visibility intothat and it could deliver your pod to anode that doesn't have that feature gateset and the pod won't run so this is nota new thing this is a pattern throughoutthe system so we're working with signnode to build infrastructure so thatjust like taints and tolerations whichare sort of administratively controlledwe would have a transparent mechanismwhere nodes publish capabilities and thescheduler analyzes the podspec andrelated resources and determines whatcapabilities are needed transparent tothe user and then adds those uh uh uhyou know considers those mechanisms inthe scheduling process so this could beused for things like cublet features andit it could be used to both attract andrepel pods from certain sets of nodes sothat's our long-term solution i don'tknow when that's going to happen thoughso short-term may request uh require auser to manually and uh to it long-termsolution will take sometimes right yeahI think there's some things we could Imean you could build a build acontroller rather than a web hook thatactually waits for like you know whenthe resource claim gets added thatwasn't there at the time the pod wasadmitted and goes but it gets prettyugly so short term maybe just the userokay thank youi think we're out oftime but um but Morton and I will be uharound here so uh happy to answer anyother questions um afterward thank you2025-04-15 22:02:25.212713 ��A�#��9Agvp2uTilwrYall right good afternoon guys i'm TimWickberg i'm the chief technical officerfor Sketmd and I'm here to talk aboutSlinky Slurm in Kubernetes performant AIand HPC workload management um I alsohave our title slide uh broken out intoour own uh format as well here backingit up with our logo um and I do alsoneed touh give a lot of appreciation to thedevelopers behind a lot of this effortskyler Owlet and Marlo uh who aren'there with me this week but are behindall of the und遹^�#��sAYqIHESG0suIall right hello everybody uh welcome umuh pretty good turnout here i hopeyou're not not too sleepy after lunch umwell uh today we're going to talk alittle bit about some of the reallyfuture-looking things we're doing in theKubernetes project um what we're goingto show you today uh is only partlyavailable today and some of it's comingin the release that's about to happenbut um hopefully you'll you'll uh seesome of the the great work that thecommunity is doing um to to handle thesemore advanced use cases that we'reseeing thesedays um my name is John Bellame i'm fromGoogle and I've been involved inKubernetes for maybe seven or eightyears um so uh but for the last yearI've been working on DRRA um which wasuh initiated by Patrick Oolie from Intelwho's here in the third row and andKevin Clues has done a ton of work onthis as well but many other contributorsum from the community so we are uh youknow we're all um we're we're all in ittogether here and with me yeah my nameis Morton i'm also from Google uh I'vebeen involved with the communitiesproject for like few years a little biton and off and been working on dr forlast six months roughlyall right so let's see what we're uhwhat we're talking about here as far asthese problems i mean I think that thereally simple way to put it is asworkloads uh get larger when you'redoing large training jobs or things likethat you're using lots of acceleratorsthat means lots of nodes and of courselots of any kind of machine means moreprobability for failure so more failurepoints more failures and um with some ofthese kind of coordinated mechanismsacross nodes the failure of one node cankind of uh impact all of the nodesinvolved in the job so how canKubernetes help manage those type ofsituations to kind of illustrate the theproblems we're talking about um I'mgoing to use uh TPUs as an example but alot of these same umproblems uh apply also in GPU trainingissues although they may solve theproblems in different ways um we'll seea little bit of that later and and uhbut for now for our illustr illustrationwe're going to assume we have um 64accelerators TPUs they're called um butthey're similar to GPUs um and there'sfour per node and so we have 16 nodes umfor for these particular uh jobsum or rather for for this particularcluster in our example and then we'llhave several different jobs that areeach using a portion of that cluster butnot the whole thing when we allocate aset of uh say 16 TPUs which means fournodes we want a few things out of thatallocation we want compact placementwhat does that mean it means things thatare close to each other so in in uh TPUsin particular you have topologicalconstraints but even with GPUs likeyou'd want them all out of the same rackright you you know ideally because thenthey're only one hop away well with GPUswe have similar kind of things wherethere's wiring between the chips andwe'll see it in a minute and we cancreate certain arrangements and certaintopologies we want to make sure that thepods land on nodes that are close toeach other in some way we also want ifwe're doing multiple jobs that aregrabbing therefore multiple nodes and welaunch them simultaneously we don't wantthem to pollute each other's um compactplacement groups so we want to have someway to atomically kind of parti��erpinnings I'm about totalk abouttodayuh wow some of the slides are out oforder that's going to be fun okayintroduction first of all what is SLURMslurm is a leading HPC workload manageruh workload manager roughly means a jobscheduler combined with a resourcemanager um roughly equivalent to anorchestrator uh such as Kubernetes andand what it does and manages on thecompute nodes uh those two componentsthere are scheduler prioritizes anddecides which compute jobs to run onwhat parts of the system sort of whenwhere and why and then the resourcemanagement layer of that is trackingnode state node resources and dealingwith actual job launch job dispatchpotentially pulling in differentcontainer run times andecosystems slur manages the majority ofthe top 500 supercomputers in the worldum roughly 60 to 70% last we checked umalso manages most AIM ML trainingworkloads even for a lot of companiesthat run cloud native for the rest oftheir stack slurm is usually involved inmanaging those training workloads forthem today uh slurm scales well beyond15,000 nodes in a single cluster um weare able to on some machines launch10,000 plus node simulation work in sub10 seconds uh which is something that wedon't believe uh other orchestrators areable to accomplish today slurm is opensource uh GPLv2 plus with an openSSLexception technically um and has beenavailable to the broad community for acoupledecades it's going to be real fun how Igot to figure out why the slides are outof order here uh so who is SKTMD uhSKTMD are the developers of Slurm andalso now Slinky we originally spun offfrom from Lawrence Livermore NationalLab back in 2012 to support SLURM'srapid adoption in the HPC industry umour founders are Mo and Danny they'rethe M andD in SKEMD we are not medicaldoctors please do not solicit medicaladvice from many of our support staffthey are not qualified to answerthat sketmd provides commercial supportfor SLURM uh for our customers alongsidetraining consultation and customdevelopmentand then what you're here to hear abouttoday is what is Slinky slinky is Slurmand Kubernetes combined um it's atoolkit of different projects tointegrate Slurm into Kubernetes uheverything we've been building here isopen source under the Apache 2 licenseinstead of GPL and is broadly broken outinto three major components the slurmoperator slurm bridge and a lot ofassociated toolingslurm operator is designed to manageslurm clusters running underneathKubernetes um be able to autoscale themuh give you easier ways to deployreference slurm installations underneathKubernetes potentially a managedKubernetes offering from one of thecloud providers original releases backin November just ahead of uh CubeCon andSalt Lake uh that was VO.1.0 uh VO.2.0actually came out about uh a week ago uhand we're looking forward to gettingv3.0 out in June alongside the slurmbridge the slurm bridge is a Kubernetesscheduling plugin the idea here is toenable slurm schedulingwherewithal not just for slurm basedworkloads but for kubernetes pods andkubernetesworkflows uh the intent is to releasethis in June um it is gated currently onsome new features and capabilities we'reintroducing in the slurm 25505 releasein May to better enable this crossintegration um and it is in early accesswith some sketched customerstoday then alongside this is a lot ofassociated tooling um you can't reallyplay around in the cube ecosystemwithout some other baggage uh helmchartcontainer images uh we also havepublished a client library um thisinteracts with slurm's rest API um butis translating it uh using an open APIgenerator into Golang uh so it's alittle easier to adopt and modify therewe also have a slurm exporter Prometheusexporter that interacts again with therest API um it is publishing metricsright now that are fine-tuned to slurmoperator so it can uh hook in and andmanage autoscaling forus slinky repositories are on GitHub umthat QR code will take you there uheverything we're doing here is capturedunder the Slinky project organization umseparate away from the normal Slurmrepository so cloud native HBC and slu�rmum a key part of my talk today is reallykind of laying the groundwork for whereI think these two communities can startto converge and where I think slurm andthe slinky project itself sits at theintersection so uh the subtext to thisis why isn't HPC scheduling guy standingaround atCubeCon so starting off here with a bitof a disclaimer um everything I'm aboutto say is a gross oversimplification oftwo incredibly complex and intertwinedcommunities every point I make here Ican come up with at least a half a dozencounter examples i'm trying to focus onthe broad context of what I'm sayinghere and use that to highlight where Ithink the two communities can learn fromone another and start to cross adoptcertain uh technologies andcapabilities so to start with HBC andcloud native why are theydifferent uh it really stems fromdifferent sets of assumptions that thatboth have and and the evolution of eachof theseecosystemsslinky is meant to sort of be ourSKEDMD's approach to to bringing slurm'swherewithal into here and crossing overthat boundary the way I've describedthis a lot is at the very highest levelthe the perspectives are shaped by thisidea that HPC has finite resources afinite system scale what you built fromthat vendor stood up as a particularsupercomput but then infinite workloaddemand hpc researchers are not hesitantto bring more and more simulations tobear on the system and you have tofigure out how to fit that into thefinite resources you have at handcloud native on the other hand assumesnearly infinite resources that you canalways hit some sort of cloud providersAPI to bring and provision additionalcapacity into the cluster but that atthe same time there isfinite workload demand that people areplacing on this machine that you canalways simultaneously run everythingthat that system has to offer or uh thatthat workload has todemand so to qualify this a littlefurther HPC assumes finite resources butinfinite workloaddemand systems cannot simultaneouslyexecute everything that HPC researcherswould like q prioritization and figuringout what slice of that demand isrelevant to run on the machine right nowis paramount this results in incrediblycomplicated priority schemes fair shareum different schemes building the sizeshape and time duration of a job intoaccount to decide when and where on amachine to give people access to run thesimulationcomplementing that is a detailedaccounting system that is thencross-tied into very fine grained limitsto say exactly how many resources givenuser populations are expected to haveaccess to on the machine at a given timealongside this is that historically HPChas always been focused on the high endof simulation work talking about how tobuild machines to run workloads thatdemand access to thousands tens ofthousands hundreds of thousands of CPUssimultaneously in lockstep alongside this again jobs have timelimits uh time limits are a a subtlepiece that is actually kind of missingout of the cloud native ecosystem thisidea that a simulation should respect acertain boundary uh incomplete withinthat scheduled duration this is alsothen fed into how workload isprioritized backed off and differentjobs can be managed and forward plannedout for that system again encompassedwithin those finiteresources also then uh wrapping thisslide up here systems are morestatically defined um traditional HPCsystems are bought on a a order of yearsprocurements go out vendor responsescome back two to three year deploymentlife cycle of the machines say fiveishyears uh and you're not just spinning upadditional racks of capacityquickly cloud native on the other handflipping this around infinite resourcesbut finite workload uh cloudorchestration was designed aroundmicroservices first and foremost allpods in the Kubernetes system areexpected by default to be runningsimultaneously again massive asterisk onthis please forgiveme workloads scale horizontally runningadditional pods and then load balancingbetween them tightly coupled processesacross multi-node multi-node computeworkloads are not a core design elementof Kubernetesan�d require certain scheduler extensionsto be able to integrate and manage podsrun indefinitely time limits are notbaked in by default that there isnothing policing that a simulation wrapsup on schedule to turn resources over todeferred and delay or delayedworkflows capacity issues are are fixedby just getting more resources call intothe cloud API bring more compute uhwherewithal to to bear to be able to runeverythingsimultaneously usually in service of anexternal say web interface or externalset of microservices that are drivingpresumably some sort of revenuegeneration for your company andapplication support for applicationresilience and dynamic resourcemanagement though are presumed this doesstand in stark contrast to the HPC spacewhere jobs dispatch of a fixed size andscope and are expected to continueoccupying that same footprint on thesystem until they terminate some numberof hours days or even weeks later umthis also has interestingly enough ledto some very different schedulingsemantics growing out of this space uhone thing I point to commonly is theaffinity anti-affffinity patterns withinKubernetes scheduling framework don'thave equivalent models in the HPCscheduling community um this isn'tsomething that HPC worries about as partof the way their jobs are run andmanaged so why converge these twospaces systems are are under increaseddemand to run these kind of batch styleworkloads even on these cloudnativesystems a IML folks are probably drivingthe greatest uptake of this and arecertainly absorbing the greatest sliceof compute um possibly in the history oftheworld and those folks are runningKubernetes generally for inferenceworkloads um something that is servingan end application but the models thatthey're building are being built onslurm slurm is running those trainingworkloads for them um it's kind ofawkward to have these two divergentcontrol planes involved in differentparts of potentially your same systemspotentially trying to overlap on thesame hardware or at least within yourcompany having to run two verydifferent clusters just to service thesort of two responsibilities of of thesemodern a IML shopsso converging these come becomes verytempting trying to fit one into theother um is is what we're trying to sortof sus out here is it possible to bridgethe gap between the two of these andSlinky is meant as our take on how tocover this again in two slightlydifferent ways one through the slurmoperator where we can provide thattraditionalHPC environment within a Kubernetessystem thatexists and then a second what I'm goingto focus on a little bit more today isthis new slurm bridge idea of bringingslurm's scheduling wherewithal to coversome of the gaps that we see within thekubernetes scheduling ecosystem and beable to better provide for schedulingand management of especially largecalebatch style workloadsSo it's building off again theseadditional capabilities that we see andthings that we see in slurm that wedon't see modeled in the cloudnativescheduling ecosystem first and foremostthis idea of efficient multi-nodescheduling and resource allocation umthan planning around future system stateunderstanding that the workload thatexists might not all fit into themachineimmediately and that we do need to bringsome sort of prioritization model tobear but also then to use that prioritymodel to inform not just what is runningon the system instantaneously but whatis running over a much broader period oftime then network topology managementsomething around NV link managementespecially something we've been workingum very closely with Nvidia for the lastthree years to model the networktopology that exists in the hardwarethat these systems are being built offof and efficiently plan out use of themachine again instantaneously and intothefuture so onto the slurm operator andI'm going to skip through this a littlequickly in the interest of time slurmoperator's use case again it's managingslurm clusters built within a kubernetesenvironment compute nodes are expectedto map directly to kubernetes podsrunning an individual slurmd process� umwhich is slurm's component very muchakin to the cublet responsible formanaging a compute node then alongsidethat supporting autoscaling based oncluster utilization metrics thePrometheus exporter is what we thenbuild off of to to drive that withinthis we're running slurm jobs nativelyso we're running slurm jobs which areare just batch scripts usually at theirheart directly in those pods the pods asa result may befairly generously proportioned let's saywe we've seen certain clusterenvironments built under this kind ofmodel where that may be a 20 40 gig evencompute node image uh which as acontainer is admittedly a little awkwardum but the HPC folks are are still alittle sluggish to adopt a lot ofcontainer strategies um they're they'regetting there when they get therethat'll be a lot easier to managekubernetes is not involved in thenscheduling and managing compute jobsthat are running underslurm with slurm running underkubernetes uh Kubernetes doesn't seethem and also doesn't sort of interferewith their dispatch this lets slurm runthe way it wants to run a cluster um itis able to efficiently manage fine grainresource limits on those nodes it's ableto do the back fill scheduling it's ableto manage network topology and doeverything it's expecting to for jobsmodeled as slurmworkloads documentation uh we actuallyjust got this out last month umslinky.skemd.com has our initial cut uhwe will certainly be expanding this tobetter document how to spin this up intodifferent cloud environments umespecially different managed Kubernetesenvironmentsthe broad picture here again uh theslurm operators plugged straight intothe cube API uh communicates withslurm's control plane exclusivelythrough slurm's rest api slurm's controlplane is a couple different componentsprincipally the slurm controller slurmcontrold uh uh from the HPC space wecall it controld not cuddled d um youcan fight with us later on that we alsohave an accounting process called theslur dbd slurm database um that'sdriving a lot of granular resourcelimits for different users on thatmachine it's talking to a managed MARBinstance and then we have the computenodes modeled by a whole bunch ofindependent Stormd processes all onseparate pods on thecluster last but not least uh metricsare there able to drive uh autoscalingdecisions as well so this is built offof a couple uh custom resources there'sa cluster CR a node set CR um these needto get installed off into the controlplane to then build the rest of thismodel off of um and I'm going to kind ofjump through these a little quickly uh Idid attach the slides to uh thisschedule saw in the schedule if you guyswant to download them and look at thesein a little moredetail um and the kind of one uhoperator slide demo slide um is herejust showing that for a user logginginto this machine logging into thesystem they're just seeing a traditionalslurm environment they're able to submitworkload into slurm it's queuing it upprioritizing it much the same as itwould any other workload and then we'redispatching it out that out onto thecluster um in reaction to this clusterload the autoscaler here is kicking inum and it's able to scale uh if you'reusing ketta from zero up to however manynodes you care to feed into this clusterenvironment um and it's able to do thatin response to the demand that the slurmcluster itself is seeing internallyonto the slurm bridge this is what we'reworking on right now plugging slurmscheduling directly into the KubernetesschedulingAPIs why do we want to do this umKubernetes lacks this idea of fine graincontrol of native resources CPU memoryum in ways thatare usable to a certain class of HPCapplications that exist um so we expectthat there are still HPC applicationsthat want those semantics and want to beable to run underneath slurm but beco-resident on a system that is alsorunning Kubernetesworkloads we also want to be able todo faster scheduling than we believe ispossible natively within the Kubernetesstack um and mix and match which toolingand which ecosystem to bring these batchworkloads in um if you have simulat�ionwork that's best modeled as a Kubernetesdeployment with a bunch of differentpods we want to be able to schedule andmanage that on the same hardware asnative slurm workloads running as slurpjobs offsetting this is that this is notnecessarily a perfect solution foreveryone where we're not proposing toreplace the default scheduler certainlyum it's meant as another alternative toexist in this ecosystem um theKubernetes API does very convenientlymake it possible to provision multipleschedulers um this is the same approachQ volcano the MPI operator the plethoraof other sort of batch computingoriented projects have using have usedto build their own systems and ownscheduling capabilitiesone huge gotcha and something that weare working with the broader communityon is this idea that the Kubernetes APIdoesn't currently have a way tosubdivide a nodesresources exclusively for the remit ofdifferent scheduling plugins operatingon the system um so currently the slurmbridge is assuming that it exclusivelyowns resources in a given node that youassign to it that may or may not be aworkable assumption on yourmachine so um in terms of itsarchitecture everything we're buildinghere is meant to be very flexible whichalso makes it a little hard to drawdiagramswe are able to for a portion of themachine that you want to overlap betweenKubernetes's orchestration and slurm'sresource management we're able to bringthe slurm bridge to bear on that set ofcompute nodes it does not need to be aperfectly overlapped set of computenodes though you may well have parts ofthe cluster environment that arededicated exclusively to runningKubernetes workloads or potentiallyparts of that same system that areexclusively set up and managed withslurm so this vin diagram may notperfectlyoverlap but the slurm bridge sits at theintersection of those two spaces we doexpect most deployments will probablylook kind of like this instead whichdefinitely makes everything a littlesimpler to talk aboutso the design goals here is to run bothslur and Kubernetes workloads on thatoverlap pool of nodes that sits in themiddle here the slurm bridge translatesthe resource requirements for Kubernetesworkloads into corresponding slurm jobswe call these placeholder jobs withinslurm we're able to reconstructmulti-node workloads coming fromKubernetes um whether they're pod groupjob set we'll probably add support forleader worker set and uh whichever otherrepresentations of multi-node batchcomputing in Kubernetes we findinteresting to to translate translatethose into a single job within slurm'scontrol plane so slurm can provision andmanage that multi-node job as amulti-node job from its own schedulingstandpointwe also need to handle device pluginsGPU uh we need to be able to plug intothe DRRA ecosystem and manage thattranslate those resource requeststranslate the resource claims back outthe other direction we also want to beable to filter out nodes that SLRM isnot intended to manage um we also needto filter out pods that slur is notintended to manage uh we don't want tomess with the Damon sets we also don'twant to manage everything in the corecontrol plane if you're running Slurm'sown control stack under cube we don'twant the bridge to attempt to schedulethe slurm controller by talking to theslurm controller because that littlecatch22 is not going to let uswork so there are a number of theserestrictions here um at the moment eachnode can run slurm or kubernetesworkloads we can overlap multiple slurmjobs on a node we can overlap multiplecube pods on a node but we can't safelymix and match uh the reason for this isthat the slurm workloads are going to bepinned to specific CPU sets on the nodeand the Kubernetes ones won't this meansthat the slurm workloads would getstepped on potentially by anythingrunning as a Kubernetes pod introducinga lot of undesired jitter into the slurmworkloads this is where one of thethings that we're working on longer termis this this discussion around addingsemantics for CPU allocation info intothe cube APIs um if we can do that orhave a model that lets that work� then wecan safely mix and match the two betweenthem you need to configure this bridgeas a cube scheduling profile um it takescontrol of everything in that allow listof the name spaces you've provided aspart of the setup um default is stillthere and handling workloads elsewhereon the machine potentially includingslurm own control plane slurm onlyschedules the nodes with a slurmdrunning which is something we'll talkabout again in asecond so this is broken up into a pairof pieces one is theuler plugin and asecond part we call the workloadcontroller um these are tied into thetwo different chunks of the cubescheduling framework that we need tointegrate with to get this to work thisis um one of the reference diagrams thatthe cube scheduling group has um andthen this is the way we've sort of cutit down considerably to just translatewhat we need to off the cube schedulingAPIs into something that the bridge canconsume inject into slurm as aplaceholder job and then once slurmactually schedules and lands um jobs outonto compute nodes we translate thoseback into bind API calls within the cubeAPIs to inform Kubernetes to go aheadand launch the cube workloads so thebroad model here is to translateeverything into a slurm job slurm isthen able toprioritize those placeholder jobsrepresenting Kubernetes pods alongsideany work that is submitted and beingmanaged as direct slurm workloadfilter through that with all of Slurm'scomplex priority models backfillscheduling support and then make thoseplacement decisions as to when and wherethese different workloads are going toland on the machine for theKubernetes-based stuff the workloadcontroller here has to kick back in andcommunicate back out through the bindAPI calls to the cube stack for nativeslurm workloads none of this kicks inthe native slurm workloads are justrunning under slurm's own resourcemanagement layer built into the clusterso I'm just going to very briefly flipthrough some of the demo screenshotshere i'm not going to try to run thislive uh because the internetconnection's a littlefiddly this is a pod coming in uh justgetting applied into the cluster slurmis translating it um and then running iton this node uh node name is slurmbridgeone you can see both from sq the slurmcommand that's monitoring the qate it'slanded it on slurp bridge one andcommunicated that placement decisionback through the bind API call landingit on slurpone this isuh that same job just calling out aspart of this uh we have theseannotations that we've attached hereslurm node is indicating where we'veplaced it and then importantly the slurmjob ID label maps into that placeholderjob within slurm's own control plane andis used to cross tie the two entitiesbacktogether this is quickly going through atwo pod damon set uh applying thissetting it up to get scheduled we cansee here that the two pods are beingtranslated into a single slurm jobrequiring two nodes um this is kind of akey capability of this of nativelytreating these multi-node pod groups assingle slurm jobs and scheduling andmanaging them as proper multi-node jobentities and then placing them out onthe system um you can see here it landedit on uh slurp bridge one andtwo this is again similar uh thingrunning through rather than using areplica set to build the pair of pods uhit's actually just enumerating themdirectly as part of the pod group uhthis is being submitted while this otherjob is sitting there in this environmentwe only have access to three computenodes both of the jobs want twosomething is going to have to wait wedon't attempt to schedule half of thejob we know that we have to queue it upand wait until sufficient resources whatwe're calling out here are available forthis to actually execute and run on theclusterthen uh that first workload on thisslide finally finished that second podgroup actually came in has finally beenable to execute on SER bridge one andtwo and then uh last but not leasteverything is wrapped up here it's justcalling attention to the fact that ifthe cube pods have terminated the bridgeand the workcycle workflow controlle�r part of thebridge is kicking in and making surethat the corresponding slurm jobs arebeing terminated as well if the slurmjobs are terminated or killed or hit atime limit the corresponding pods willalso bedeleted future work there's a lot ofdifferent things that we're we'reexpecting to expand on thesecapabilities with um one of the thingsthat we're pushing for especially isthis idea of DRRA but for CPU managementon the compute nodes um I'm talking witha lot of different people this weekabout ways to to push for that otherthings that we're expecting to do um isto be able to run slurm not just withthis hybrid environment where the slurmdcompute node image still needs to berunning on a piece of hardware but beable to run it as just a pure Kubernetesscheduler without slurm's resourcemanagement layer running underneath itum that's something we expect probablylater this summer we'll have thatcapability as welluh and then with that thank you guys anyquestions[Applause]hi this is Can you hear me yeah hi thisis Abhishek from IBM research uh greatpresentation um so I guess u thetranslation that you showed fromKubernetes to slum um it was mostlyaround maybe static CPU resources but anAI training would need multiple nsnetworks so there are a lot of otherparameters that a pod or a job couldrequest so what's the story there sowhat what I'm not showing is that yeswe're looking at other resourcerequirements that the pods may have ummillores memory requirements we'retranslating that into the slurm resourcerequest as well we're expecting to beable to take certain flavors of DRLsyntax and translate that directly intorequests for GPUs for example um theother thing that I dropped these slidesout here is that we do have an extensiveset of annotations corresponding toother slurm native resource types andresource models that let the pod requestthings that we can't directly infer fromthe pod um which is g gives us kind ofan un end that end run around somelimitations of of sort of thedescription that exists today and givesus our own syntactical sort of nuancethat we can providethank you thanksuh say I have um a static set of nodessay 100 nodes um and I want to run a lotof Kubernetes jobs on them sometimes iwant to run a lot of uh slurm jobssometimes and it's basically chaoticwhether that happens how easy is itright now for it to switch betweenKubernetes mode and slurm mode on theentire set of nodes or a subset can itbe done dynamically at runtime or so onis it a config I've set that sort ofthing so that that's going to depend alot on how you've tuned Slurm's prioritymechanisms um you can adjust on the flythe importance of different partitionsum partitions for slurm are differentcues um you you could go and alter theconfig to say that you want thatpartition to have the highest servicelevel right now make sure stuff in therewhich is corresponding to the Kubernetesworkload that that we're bridging in ummake sure that gets the highest serviceclass and then you can alter the thecluster configuration to flip that backon its head later um so this does giveyou a way to potentially shift the focusof those workloads without needing tomanuallyreprovision the compute hardware withinthe cluster between the two controlplanes okay thank you very muchuh uh so uh from your design herethere's the slam d operator and the slambreeze there i have two questions thefirst question is when the slumber D tocontrol nodes there is is it a cancontrol the pure bare metal or is kindof a po contain ps or something there sothe the slurmd process traditionallyruns on bare metal um on HPC systems umbut it it also runs just fine in the podthe one caveat there being that if it'srunning in a pod that container image iswhat slurm jobs will see and executeunder um so if they're expecting usuallya fairly fully featured Linuxdistribution to be available there um ifyou're not able to provide that as partof that container image for the slurmdprocess uh that that actual slurm jobitself may have a problem executing andrunning um that's where we expect peoplewill put a lot of software into thoseinitiallyum within the HPC space if you canmodel your slurm job with containers allof a sudden that means that you canreduce the footprint of that thatcompute node container image itself uhconsiderably but that ecosystem is stilla little slow to come around to the ideaof containerizing all all of theseexisting applications um so it's a workin progress there okay thanks uh thesecond question so comes to the slumbridge here so when uh considering thisis a GPU situation like uh thinkingabout the granularity here uh whatwhat's the granularity level here whenwe sign it would like the for example if80 GPU in one node there would controlper node per GPU or even smaller thanper DPU like a sharing resource there sowhat's kind of the current situationdepends on configuration um is thecomplicated answer so slurm itselfnatively can model usually a GPU as anindividual allocatable entity um that'susually the level that most systems runwith um but slurm does support NvidiaMPS um it can work with Nvidia's MIGmode of operation um although it doesrequire static partitioning for that umand that's something that we can extendin the future if we need to um but it itdepends on what sort of the the clusterpreference and policy are and and whatyou're trying to accomplish the firstpass for the slurm bridge is reallyorientedaround the the place we think we canmake the biggest impact right now is isbetter management for multi-nodeworkloadsum but th those are things that we'relooking at at better serving in thefuture and again a large part of that Ithink is going to be ways to talk aboutCPU affinity management and CPU resourcemanagement alongside the GPUsokay sounds good thank you thank youvery much and a good presentationyeshi great talk um questions about elasticworkloads do you support them do youhave thoughts about that you said leaderworker set for instance what aboutsomething that grows or starts in twophases you know what are your thoughtsuh elastic workloads in an HPC contextare tricky um because they directlyinterfere with the scheduler's abilityto plan for future use of the systemum there is support in slurm for beingable to do that but it's usuallydisabled in most clusters by default umthere are definitely things I think wecould improve there and that that isactually a thing I think we could bringback from the cloud native workloadsbetter into the HPC space um that thatthere are certainly some developmentprojects there that could better tacklethat um I'm happy to talk about thatmore at length at at some other pointthank youum okay I know I am at least fiveminutes over time uh okay one lastquestion quicklyuh so it's a question about um whetheryou can use cgroups uh with the slurmbridge um because obviously containersare using croups underneath and I knowslurm can uh use cgroups for resourcecontrol how does that work that is theone complication with running slurm in apod having the slurm processors live inthe pod um whether or not you cansuccessfully delegate to slurm theability to further subdivide thosedepends on which flavor of croups youhave enabled and a number of otherthings um that's why for those workloadsI would usually suggest sticking tohaving the slurd process run on the baremetal um but we're working to make surewe can cooperate with the higher levelcroup hierarchies uh as best we can umit varies a little bit between croup v1and croup v2 unfortunately um and thatthat's a fight we've been well aware ofum but but don't have any magicalsolution to uh in in allcases uh with that thank you guys um Iwill be floating around all week i'mhappy to take other questions here but Iknow I am already massively over time2025-04-15 22:02:25.780889� of secretswe can run into uh an oom specificallyif we do list calls to the APIserver and so these secrets like I saidthey're not quite the ordinary secretthey're not like a base 64 encodedstring that's like a few hundred bytesthey're 0.2 megabytes each they'repretty large and we have 5,000 of thesethings so think about that in terms ofCD storage so we haved storing thesethings that's okay but now when you do alist call think about all thatinformation that needs to go into theAPI server into the cache and thinkabout how much memory that consumes andso if we do two concurrent list calls wecan get a ton of this stuff built up inthe cache and we can run out of memoryin the API server uh so this is a bigproblem for us and you can kind of seeit in the picture on the right uh whenwe do these list calls it was prettyobvious um that we would just it wouldand you know we'd lose all our metricsfor the APIserver okay so what was the what was oursolution how do we deal withthis well we used API priority andfairness to help us uh API priority andfairness um if you don't know much aboutit at a high level it's this idea thatyou can have uh specific REST calls madeto the API server they can be givencertain levels of priority so think oflike the API server as a shared resourcewe want to make sure that your patchrequests your put requests your listrequests can all get there and they allhave a fair shot of reaching the APIserver to process those things so youcan use API priority fairness to saywell we only want to allow twoconcurrent list calls to be executed andthis was really helpful for us um itactually is what solved our problem andwhat kept us within the safe area ofmemory you know we wouldn't want to goover two um so we would make that a partof our um um our API priority andfairness policy and this is what wouldsave us right with this all these manysecrets and how they're constructed weneeded to use API parity and fairness toprotect ourselves so this is greatsignificantly reduced ours um and andwas able to get through the problem umand if you want to learn more about APIprote's actually a talk unfortunatelyit's given at the same time as this onebut the QR code's there if you want toum you know watch the recordingafterwards it'll go through in detail inmuch more detail about how API priorityand fairness works but this worked greatfor us it protected our API server sothat was problem number one la is goingto take us through the next twothanks Ryanum in the next few set of slides I'mgoing to continue talking about theproblems that we hit at scale and someof the fine-tuning techniques thathelped us to solve those problems so youcan see uh the graph here which is APIserver memory utilization you can seebetween 6 and 800 hours there is a gapuh that gap is an API serverrestart the more interesting thing aboutthat is after the restart API server APIservers came back with 30 40% uh lessmemory utilizationwe actually saw this consistently happenin our fleet that anytime we wouldrestart rolling restart API server uh itwould come up with um less memoryutilization even if the workload uh onthat cluster was same it would end upcome up with uh less memory utilizationso we started wondering is there achance to you know optimize some of theum uh way garbage collection happen isthe memory you know not being garbagecollected fastenough so that's where we startedlooking at goGC goc is a uh go u is a go runtime uhconstruct that can help uh usercontrol how aggressive the Go garbagecollectoris so let's see how it works using anexample so let's assume that the Gobinary is running with 100 megabyte ofallocated objects right now so it'susing 100 megabytes and the Go GC valueis set at 100so what this meansis go go runtime will wait for 100% ofnew allocation which means it will waitfor the memory utilization to reach to200 mgabytes before it triggers thegarbage collectionyou can see on the right if you set goGC to50 the go uh garbage collection will betriggered when the utilization reaches150 so comparing both of them we canclearly see that setting go GC to �50made go uh garbage collection moreaggressivewe actually noticed this um after thefinetuning that the rate of uh go GCcycles doubled when goc was set to50 not just that the amount of uh memoryor bytes released also increased and thelive heap utilization of API server wentdownso we then uh experimented with all thecomponents in the control plane and as aresult you can see in the graph each ofthe component has a sharp decrease inthe memory utilization and over time itstays that way so even during peakutilization we saw that the memoryutilization was uh made more efficientby this simple fine-tuning techniquego GC is a very uh flexible uhparameter underneath the hoods it's atrade-off between CPU and memory in ourcase we traded off because we werememory constrainted uh we traded offsome CPU with uh efficient memoryutilization conversely depending on theenvironment uh one could easily tradeoff u memory forCPU in fact I have a reference uh from ablog uh from Uber uh they have doneexactly opposite of what we have donewhere um they tuned the uh go GC valueso that they would have lower CPUutilization but higher uh memory whichis okay for their environmentso go GC is a flexible um tunable highlyrecommended if there is a trade-off tobe made between CPU memory it is one ofthe things that can help with any uh goprogram and since kubernetes and thecontrol plane is uh go uh we use iteffectively in in the controlplane moving on uh the second problemthat we started observing in our fleetis that again we're looking at the APIserver memory utilization graph and youcan see that the blueinstance is at a much higher API servermemory utilization than the green one infact the skew is big enough uh that itstarted creating problems for us whatkind of problem rightso as Ryan mentioned if let's say a listrequest storm comes in when this queueexists in the control plane the chancesof the blue API server uh getting killedbecomes much higher because it isconsistently above average uh memoryutilization compared to the othertwo so in an unfortunate event let's saya list storm comes in and the blue APIservers all of its connections are nowuh load balanced between yellow andgreen and since yellow is not much faroff in terms of memory utilization thechances of it oming are much higher asall the workload all the requests fromblue get transferred to uh yellow andgreen so again this few minutes we wouldobserve that the yellow API server wouldand then eventually because both of themare not serving any requests now theblue and yellow the green API serverwould then not be able to handle theentire uh clusters workload and it wouldkill uh as well so we started seeingthese kinds of cascading failures uh inour stack it was so bad that we actuallyhad to set up alerts that anytime thereis a memory skew have to go in there andtake certain actions to make sureum you know it got healed but this was abig problem for us and we wanted to seehow we can um solve itso in digging more uh about this problemwe learned that anytime there was a skewin memory utilization uh there was alsoa skew in haroxy and API serverconnection so HA proxy in ourenvironment is just a front end whichacts as a load balancer for all theincoming traffic to API server and itstarted to see lot of skews so you cansee in the graph that the a theconnections on two API servers are youknow in in the order of thousands andthe third one is just at at 37 right sothere's a big skew in the connection andwhat that meant is that the rate atwhich each API server serves request isdrastically different uh you can see inthe graph the the green API server isserving much more uh number and at ahigher rate uh of requests than than theblue one this means that because theload is different on each API serverum the memory utilization um will turnout to be different right so all ofthese were symptoms of of the sameproblemuh so we started to think about how tosolve this problem right and the mostobvious uh intuitive feeling is are wedoing load balancing correctly uh infront of API server we actuallyexperimented with many dif�ferent loadbalancing configurations and none ofthem seem to have helped here so whathelped uh we discovered a flag calledgoaway chance parameter in API serverand we noticed that configuring that uhappropriately helped remove this uhproblem of SKUs uh in API server so howdoes it it work so when we think aboutrequests HTTP requests that are sentfrom a client to API server uhunderneath the hoods they are working onestablished TCP connections and theseconnections are long lived so think ofthese connection as pipes that areestablished and think of the requests asbuckets of water that can flow up anddown uh these pipes but once theseconnections are established there's noreal way uh for you to you know tearthem down so load balancing uh helps youin creating new connections in in a fairway but once this skew is introduced itdoesn't do anything to you know tearthose connection connections down uh andbring up new connections so this iswhere the goaway chance parameter comesin where it probabilistically takes someuh established connections uh in thepool and sends a go away uh TCP messagewhat this means is it tears down theconnection uh gracefully and allows theclient to recreate the connectionhopefully going through the loadbalancer again and and getting a betterbalanced trafficwhat we observed is after configuringthis in our API serverenvironment even if a skew exists likein the graph below within 15 minutes uhthe skew will automatically recoveritself because this will give u the APIserver enough chance to send goaways tothose uh existing connections and createnew onesunfortunately not a lot of documentationfor this parameter exists so I have apull request which has a really nice uhdescription of of the problem and how uhuh how go away chance parameter solvesit this in fact could be used not justin the the Kubernetes API server but asimilar implementation can can be usedin any uh server or or any uh componentthat isuh doing server side processing and andload balancing in front of ituh after after configuring thisparticular parameter we saw that therewere no cascading failures in our stackanymore so this particular uhconfiguration helped a lot in inremoving the control plane failures andserving GPU capacity consistently overtimeso you know running into these kinds ofvery u very rare uh scale issues weconstantly you know thought what is itwith our environment that is creatingthese kinds of scaling issues right andover time we have learned that we movedour compute platform from a legacyhomegrown uh cloud to a cloudnativeuh uh computing platform and in thatmove u there were some mistakes uh thatwe made one of them was we put a largeamount of data in in secret um asconfiguration or or certificates for ourworkload and that meant that a lot of uhlot of time is taken for lot of time andeffort is taken both of both at APIserver as well as the client side to youknow encode decode that YAML just a lotof YAML in in theworkload then the second mistake wasthat the client which orchestrates thisuh GPU capacity workload across thecluster it it works with both legacy aswell as the cloudnative environment andbecause of that it had to make periodiclist calls uh against our workloads andwe know that list calls against APIservers are again a scalability issueso both of these mistakes combinedcaused a lot of challenges and all ofthese fine-tuning techniques uh that uhthat were discussed in this uhpresentation helped us live with thesechallenges for a sustained amount oftime so big shout out to the Kubernetescommunity for you know having such arobust uh ecosystem where these kinds oftechniques can be found and implementedin an easy wayyou know talking about future the futureis seems to be very bright on this frontuh big shout out to SIG API machinery uthere are at least three initiatives uhand there are much more not listed herebut these are the initiatives that willreally really help uh scale the controlplane in in a you know non-trivial waygoing forward so very excited about thenew changes coming um for for thecontrolplane talking about the future� uh thereis a big change coming for our stack aswell which is we're going to transitionfrom the current implementation which isserving GPU capacity using deviceplugins uh toDRRA uh this change is big enough thatuh we wanted to do some kind of a powerand scale testing of the control planebefore you know taking this change inproduction so in the next set of slidesI'll be talking a little bit about whatthis change looks like and some of theuh perf and scale uh tests and theirnumbers uh that we ran but before we getinto that quickly talking about what thecurrent stack looks like on the controlplane we have um API server controllermanager and the scheduleuler the normaluh Kubernetes stuff but we have one morecomponent which is called the topo aarescheduleuler this scheduleuler does umnuma alignment for our workloads so sothe workloads that we are running arenot just um uh GPU workloads but becausebecause they are high sensitive highperformance workload we need to do uhNUMA alignment for that with with theright um CPU nodes and topoarescheduleuler needs to be in the controlplane to do to do this u scheduling foruh you know numaalignment on the on the worker nodes wehave device plug plugins that do the theGPU allocations but along with that wehave an NFD topology updater this thisactually feeds the numa topologyinformation into the control plane whichis then acted upon by a toposcheduleuler so this is the stack on theleft that we have now uh and then in thefuture moving to DRA we we should beable to get rid of the topo awareulerbecause the DRRA API is rich enough todo numa alignment just default right outof the box by thecubeuler so we'll be removing the toposcheduleuler from the control plane aswell as the uh uh NFD topology updatorfrom from the worker nodes and that willleave us with just the DR plugins forour workloads and and cubeuler uh in thecontrol plane um we actually did a deepdive about this whole uh orchestrationin uh cubecon in Salt Lake City um Iencourage you guys to uh to check thatout it has much more uh detail about howour stack um like what kind of problemswe faced with uh device plugins and howDRRA helps us uh solvethose but uh you know focusing on thecontrol plane scalability in order toyou know scale test uh this we we wantedto find a easier and a cheaper way toscale test it um that's where uh we useuh quark quark is uh kubernetes withoutuh cublet and basically it's asimulation tool which can help you uhcreate fake nodes in the cluster as wellas fake pods and then uh that thatcreates um uh additional pressure in thecontrol plane and uh you know we run thescale test using that um there areactually couple of talks uh where we'retalking about quark much more in detailum encourage you guys to check that outand you know uh see if uh Quark can becan behelpful so so with Quark talking alittle bit about the scale test so wehad 132 uh cluster we had quarkinstalled and we would scale from zeroto 4,000 uh workloads and we would dothat for topo aware scheduleuler and DRAand compare certain metrics right so thefirst metric metric we compared is thescheduling latency on the left you cansee the topo and on the right you seetheDRuler we found out that toposcheduleuler is twice as more uhlatencydriven as uh the DRA scheduleulerum so this metric was a win uh we areokay to you know take this intoproduction the second metric was uhmemory utilization so in this slide yousee the topoare um scheduleuler memoryusage but along with that too we alsohave cubeuler running in the controlplane so the real memory utilization incontrol plane is the sum of both and wesaw that at scale it was around uh 1.3GB but with DRA it was around a gigabyteso again 30%uh memory saving um with DRA again thisis a metric that uh uh checked us reallychecked us checked out for us reallynicelythe third metric we looked at was theeffect of all of this orchestration onAPI server and again on the top you cansee that the topo scheduleuler impact onAPI server it the API server was ataround five uh gigabytes and on thebottom you see the uh impact of DRAorchestration it was well under um againlike a 20% um better memory footprint interms of how aggressive it is with APIserver so all of these three metrics ugave us a good initial reading withsimulation that the new uh DRRA uh stackis going to be you know much better forus not just they are going to solve likeuh uh conceptual problems but also interms of perf and scale it'll be uh uhyou know efficiency improvementOkay I'm going to summarize a little bithere so what are thetakeaways so we went through thisjourney uh atNvidia and like I said our goal was wewanted to have a fixed size controlplane and we wanted to see it scaleindependently of compute right we'reassuming that GPUs are getting morepowerful which means more workloads wewant to see how that could work rightyou sometimes you unplug your GPUs youput in new ones we want to see what thiswould look like um so to summarize likein our our journey is that you know wefound that we made some mistakesarchitecturally along the way and welearned a bunch of things you know ourjourney was from a non-cloudnative wordworld to a cloud native world so therewere some things we had to learn um butit's cool we were able to get through itwe were able to figure it out um youknow we still run with all these secretstoday we still run with these sides ofsecrets and it was great that we wereable to find uh this great support fromthe community to actually pull this offum and it was cool to see like all thesethings that these tunables that Allaymentioned they all have differenteffects you know based on your use caseand your environment um so it'ssomething to check out you know as asyou um are are running into somethinglike this perhaps in your control planethat there's a lot of things that youcan do a lot of things that you can doto to possibly help your your scale yourmemory usage your CPU usage u um andsave some of uh uh yourself frompossibly having some OS and um andfinally you know we're really excitedabout what the the Kubernetes communityhas ahead um they've done some reallygreat work in scale and performance um alot of the stuff like list streams andand a bunch of um upcoming stuff thatwe're going to get in the new Kubernetesreleases later this year that we'regoing to roll out um will actually makea lot of these problems go away for usum we we probably wouldn't have hit themat all if if we had um some of thelatest tech in Kubernetes so that's alsoreally exciting and then finally withDRA a really exciting feature thatprobably everyone is aware of you knowwe're also really excited to use it umreally flexible way to allocate devicesum something that we're going toevaluate a lot um and be rolling outinto our production zones very very soonso very important for us um and LA isgoing to be doing a lot of scale andperformance work in thatcommunity okay uh we'll takequestions thank you everybody[Applause]there's a microphone right in the middleof theroom quick questionis it common to use HTTP2 in front ofAPI serveruh yeah uhso for like the request side of it it isHTTP2 there are other long-standingthings like log and uh XC which useother protocols but yeah it's mostlyHTTP2 okay i'm I'm confessing uh I'm abig fan of classic Engine X and HTTP1whereverpossible meaning shortlived connectionsor maybe sometimes like slightly longerlived connections but I think yourexample has just confirmed that when youhave really persistent connections it'snot always uh to your advantage so Ithink the big one is the watch semanticsthewatch most of the watchers that get setup need a persistent connection to youknow send uh events to the clients andit would be really hard to I mean Iwould imagine it would be really hard toimplement something like that with HTTP1do you also send them to go away i'msorry those long live watcherconnections are they also affected by goaway that's a great question so theexisting connections will not be uhaffected by go away it's actuallymentioned in that PR description it willonly the new connections that are set upwill be impacted by the goway uhconfigurationthanks2025-04-15 22:02:26.398824 M �M��z�#��+AEeegJKZ_4g0hey everyone happy day one of KubeConThank you all for making it I know thishas been an incredibly long day so thankyou for sticking around for the verylast session of day one Uh we're goingto be talking today about making the theleap Uh what gateway API needs to do tomove to support ingress engine X usersas hopefully they migrate to gateway APIUh my name is Rob Scott I work at Googleon Kubernetes networking I'm also agateway API maintainer and I also haveused ingress engine X very heavily inproduction uh dozens of clusters and Iam very grateful for ingress engine Xbeing amazing Uh but with that I'll handit off to James Hi everyone my name isJames Strong I'm a solutions architectat Isovalent now at Cisco I help folksimplement Kubernetes networking withSelium and secure their clusters withTetreonSo let's dive into a little bit ofcontext What are we talking about andwhy are we talking about both ingressengineext and gateway API becauseobviously ingress engineext in the nameis an ingress implementation What doesthat have to do with gateway API welllet's back up a little bit Uh how manypeople are familiar with gatewayAPI okay that's great That's good Um sogateway API graduated to GA in 2023 Andone of the things I really like tohighlight about this API is it's been inmy opinion I think uh there's there'sgood data behind this is the mostcollaborative API in Kubernetes historyUh we've had hundreds of contributorsthat have helped make it the API it istoday Uh it's a featurerich API that isa supererset of the ingress API andtoday we h have more than 30implementations of the API already withmore on the way So it's been great towork with this community It's reallyexciting to see kind of the nextgeneration Uh and really this wasinitially called ingress v2 So thatshould give you an idea of what the goalwas This was the next generation API foringress Uh and that's gatewaySo with ingress engine X uh how��{�#��-AdUfp3j1j-mguh so I'm uh I'm Ryan Haly i work atNVIDIA this is my colleague Olay Patelumand like I said we're going to betalking about the Kubernetes controlplane so the the story to think abouthere is GPUs are being released everyyou know one to two years they'regetting more and morepowerful and so more powerful meanstheoretically we can support moreworkloads more workloads means we canhave more secrets more volumes moreobjects and all this is pressure on theAPI server right so we can do all thesethings and what we want to do is we wantto make sure our control plane ourKubernetes control plane is stillrunning so we did an experiment where wewanted to as a a bare metal provider umsomeone who runs on bare metalKubernetes we wanted to see if we couldscale the compute independently of thecontrol plane so this experiment what wehad in mind is we would take our controlplane and we'd say okay let's have afixed set of memory and CPU and let'sscale up our compute and let's see whathappens um so we keep control planeresources constant and we want to seehow itperforms so it went perfectright not exactly i like this titleincidents incidentsincidents so there's a few things thatwent wrong uh we're going to talk aboutthree of them i'm going to talk aboutthe first one um the first one uh firstbig thing we encountered a large numberof list calls to secrets would lead tocontrol planeoutages and these aren't ordinarysecrets let me tellyou can I go to the next slide herethere we go okayum we noticed that in ourenvironment if we have a lot�� manypeople are using ingress engine X a lotHow many people had to patch theirclusters lastweek there should be more hands upUh it is a SIG network sub project Um II didn't write this slide but it doessay one of the most popular ingresscontrollers Uh 40% of clusters accordingto whiz Uh thanks whiz for thatinformation It has a lot a unique set ofextensions for the ingress API viaannotations and then config map It has118 That number I will continue torepeat throughout this conversationbecause it is important um for thiscontext as a wholeUm that last that that fourth pointright there has been difficult tomaintain the massive set offunctionality offered by projects Umit's really easy to implement anannotation to switch a config in theengineext comp and then leave us and noone else there to support it So we haveto figure out context on that So it isdifficult uh as maintainers um tocontinue to support this level offunctionality from an individual projectand we've probably had I think four orfive issues that we've had to close forpeople asking about gateway API supportUm and it's been an interestingconversation I think for I I want to sayfor the last two years having thisconversation around itYeah Some of the uh changes aheadUm we want to I I think I'm sitting herethinking I was like I think this is yourslide Sure Yeah I mean a couple a couplepoints on here for me I'll let you startSure So so so the idea is that we wantto standardize all SIG network projectsaround gateway API instead ofimplementation specific extensions uhwhat we saw was happening is that at thesame time gateway API was growing Uh theset of features supported by ingressengineext was also growing So weeffectively had two APIs kind of formingin parallel which was gateway API andalso ingress engine X which is not anAPI Yeah No not quite an API but kind ofan API whatever it is Uh so you knowthere's only so many maintainers aroundUh I wish there would be more If anyonewants to volunteer we could alwaysbenefit from more maintainers on allthese projects But really we're we'retrying to uh focus everyone in onedirection and we're hoping thatdirection can be gateway API So uh maybeI'll hand it off to you to talk aboutthat With with that conversation withRob and others on the SIG network teamit was decided that for a variety ofreasons one the name being one of themthat we would shift our focus to a newproject I think we announced it in SaltLake City as Ingate and that focus wouldbe on gateway API support It will be asmaller subset of ingress support forvery focused use cases on ingress but itwill be a gateway API implementation andwe're trying to shift ingress engine Xinto maintenance mode um right now Thatis a very long tail We're going to talkabout that a little bit more later Andit does say 18 months Um 18 months isseems like a long time but it is not Umso we're gonna continue to work on thatand continue shifting our focus intoendgameYou'll see we have the word approximatein many many slides We somehow missed itin this one I think that's on me butapproximate is key Approximate Um butwhy why are we doing this um as Robalready talked about we want to push andstandardize on gateway API for a lot ofthe SIG network projects gateway API wasfrozen greater than 5 years ago I thinkI started greater than 5 years ago onthe ingress engineext project So it wasvery interesting when we did the youknow the GA I think the GA for ingresswas also around some of the same timethat we were saying no new features oningress Um adding new features like I'vesaid is not a great experience It's nota great experience for users especiallywhen CVEes happen Um it's interestingwhen you don't uh when you don'tvalidate user input you'll have CVSAgain we want to with the great numberof functionality that ingress engine Xsupports we want to help push that intothe gateway API so that otherimplementations can also uh benefit fromthose as well Um that last one like wesaid OSS contributors don't grow ontrees Um we have a lot of work to do anda lot of work that isn't just code Umit's you know design docume�ntationhaving conversations testing um we doneed those folks as contri non-codecontributors aswell but there's a big question of arewe ready for this is gateway API readyfor this is ingress engineext ready forthis and one of the easiest things forus to look at and understand is thefeature parody um I I continue to saythis I think I wrote this slide a whileago um it does a a lot we have tomaintain a lot Um I am always concernedwhen a Lua PR comes in because I've notwritten uh Lua Um it it is concerningfrom that perspective but we do have youknow 43 dependencies across fourarchitectures for three Helm versions Sowe try to support three Helm versionsand we have a plethora of configurationsfor deployment Um 68 command line flagsUm I'm I still don't think 100 plus endto-end tests is accurate But the big onethat I want to talk about is that guy118 annotations That's probably besidesthe configuration options Yeah that'ssomething that we definitely need tolook at But 118 annotations is most ofthe features and functionality um thatwe haveSo Rob and I looked at the gateway APIsupport for those for those pieces offunctionality Um today if we were tohave endgate and be able to supportversion one two of gateway only 35% ofthe functionality in ingress engine Xwould we be able toimplement supported soon is 55 So yousee that big purple slice likely soon isthat there are gaps being worked throughthe system there's PRs being worked intothe API that we would be able toimplement Again those would also taketime because has to be implemented inthe API and then the implementationshave to do that Those last two pointsare um concerning for all of us rightthere is I think that's about 45% thatwe have no plans for implementation SoI'll give you a little bit more of anidea um with this example So we we brokeoff those 118 configuration optionsthose annotations into about 40 groupsUm and as you see from an external offperspective there's not a lot of plansfor that There's a lot of red in thereUm and then ingress right so the idea isthat if the functionality exists ingateway API it will not in ingress andyou have to migrate your ingress objectsto gateway API objectsIf the functionality is not there thisis the call out that we're askingfor We need your feedback A lot of youraised your hands This is a very bigroom I hope to see a lot of results fromthis survey We need to understand whatis your limitation for migrating togateway API What is the feature set andfunctionality that you use for ingressengineext and how we can help prioritizethose 118 configuration options i meanwe're not going to do all118 Um I continue to say this but yeahit it's not looking great We're we'renot going to implement all 118 So um welooked at I think it was earlier thisweek I was like well we're not going todo mod security because we're not goingto do WFT And then we did find out thatsomebody did put in a gap Somebody istrying to get that implemented into theAPI So again it's a conversation withthe community with us and we need tounderstand what you all are using so wecan prioritize that functionality andget it in the gatewayAPI Um yeah this is me Um so again somesignificant features like we weretalking about like you saw thescreenshot that's just one um featuregroup that's we have 40 So there's a lotof conversations I'll put out thespreadsheet spreadsheets out there thateverybody can look at and try tounderstand like what our thinking ourthought process is for this migration Umbut you know basic and external offthat's a gap Um I think rate limiting Ithink is is a gap It's still a gap It'sone of the most requested features Soit's likely going to fit in but it it'snot in gateway today Mod security wefound that we know we didn't update theslides for So it's it's there It's beingrequested and you know someone ishopefully working on it Uh opentelemetry integrationYeah So you know maybe there's anotherway to look at this Uh maybe we shouldbe looking at this not in terms of thegaps in gateway API but maybe we canlook at the gaps in ingress engineextYou know that like this is ther�e'sthere's a vin diagram here and that vindiagram is that ingress engineextsupports a lot of things that gatewayapi does not support and gateway apisupports a lot of things that ingressengineext doesn't support So to give youa a slightly rosier view of the futurethere there are things coming that ifyou move to gateway API as part of thatmigration you'll unlock access to a lotof hopefully exciting new things Um sothere are a lot of important featuresthat are really difficult to representwith annotations So through no fault ofingress engineext or anything like thatthere's just things that don't reallyfit in the ingress API or an extensionof it So migrating to gateway API canhelp unlock some of those things Uh thisis an incomplete list but it gives youan idea of the kinds of things thatagain are very difficult to representwith an annotation but fit neatly intogateway API So whether you want to matchbased on headers query params methods uhwhether you want to modify request orresponse headers make some crossnamespace references So let's say youwant to uh send traffic from a route toa service in a different namespace oryou want to traffic split acrossnamespaces gRPC routing some moreadvanced traffic splitting capabilitiesor maybe you just want to use the sameAPI for your mesh if you're usingservice Anyone using service mesh outhere okay there's a few Uh so could beuseful for that And then I I just haveto do a shameless plug for this One ofthe things I'm really excited about ingateway is that we're building aninference extension to help withinference routing So if you're runningLLMs in your cluster we're building anextension We have an extension thattransforms any gateway into an inferencegateway And you know I I don't want tofocus too much on this I think it'sreally exciting I'm biased but one ofthe things I want to highlight is thisis a sign of all the new innovation inthese APIs is happening in gateway APISo if we can all get together behindthis one direction it should help uheveryone out by granting access to morefeatures and more capabilitiesBut I I have to be honest we we all knowthis is more than just features right uhI I have people come to Gateway and say"This is way too complicated," or "Thisis too annoying," or "You're usingCRDs." You know there's a lot of reasonsthat aren't just I don't have thefeature I want right it's it's also theexperience So to be frank using CRDs isdifferent at least than using entry APIsand it comes with its own set ofchallengesuh gateway API is built with CRDs bydefault which means it's not included bydefault This does have a verysignificant advantage anytime we releasea new version of gateway API It meansthat you can install it in your clusteras long as you're running one of thefive most recent versions of KubernetesSo you don't have to wait to upgrade toKubernetes 1334 to get a new featureIt's ready the same day gateway APIreleases That's great The not so greatpart is in most cases you have to managethe CRDs yourself We're working on thatSome cluster providers already managethe CRDs for you like GKE but there arelots of lots of things in progress likemaybe we should just include gateway APIby default in Kubernetes We had someinteresting discussions about that atmaintainer summit So lots of things inprogress but that gives you an idea ofone of the areas that we want to work onthat isn't strictly a feature And thenquite frankly gateway API is morecomplex than ingress There's there's noway around that right there's uh ingressis a single resource It's very simple Uhgateway API has gateway ingress nogateway and HP route Uh and then maybegRPC route if you want to do some gRPCrouting Uh more features and andcapabilities also by definition lead tomore complexity uh and it's particularlynoticeable in that very simple case thehello world case where you just want thesimplest possible thing So some of theideas we've been circulating around iscan we make HP route useful on its own II had a previous talk where I kind ofshowed side by side ingress API and HProute and it's very sim similar If youjust �take an ingress and translate it toHP route it looks almost identicalThere's a bunch of optional fields in HProute which allow you to do a lot morebut by default they're very very similaruh and maybe we should introduce adefault gateway So you just don't needto think about that You just create HProutes and it feels very similar toingress So these are again things wherewe're we're thinking about to provide avery similar experience that you're usedto with ingress engine X Uh the lastthing I'll say is also CRD related thatif you've worked with CRDs before youmay have noticed that using cube cuddlewith those CRDs uh does not feel greatsometimes If you've ever had theprivilege of running coupe cuddledescribe against a CRD it is really justlike the YAML output without dashes It'sI don't know I don't know why that iswhat it is but it's very confusing Uh sothere's also very limited options tocustomize coupe cuddle get output likeif we want to display a list for exampleof IP addresses that's surprisinglychallenging to do well that there's somework going on for that And unfortunatelyone of the things in gateway is there'syou know references say between an HProute and a gateway and we don't have areally great way of surfacing in cubecuddle that this reference is broken Sowe're working with upstream to try andmake these things better in cube cuddleBut before that happens we already havea tool for that and that's gatewaycuddle Uh it's a drop in replacement forcube cuddle for gateway API and it doesa lot of things that cube cuddle justcan't do right now Uh so it allows youto show connected resources print aresource graph so you can kind ofunderstand the whole picture of how allyour resources connect together Uh italso can analyze your gateway APIconfiguration and identify any problemsin that and detect any connectedpolicies or extensions of the API andgroup them all together so you canunderstand the full picture not justgateway API itself but the extensionsaround it Um so just to give you a briefexample of that So say you get a serviceand you want to say hey what routes arepointing to that service gateway gategateway cuddle can show you that or youget a gateway and you want to say howmany routes are attached to my gatewaygateway cuddle can show you that uh orsimilarly you have a gateway class andyou wonder how many gateways are usingthis gateway class you can also see thatwith gateway cuddle prettyeasily now there's also just some usefulbits here so if you want to describe youknow how I mentioned that coupube cuddledescribe with crds is uh ratherfrustrating We tried to make that betterwith gateway cuddle Uh so you can say"Oh there's all these attached routes tomy gateway and those routes point tothese backends." And you can alsodescribe a service and see pointing uphere are the routes that are pointing tomy service So you're ever if you're evercurious who's pointing at my service isit safe to delete my service for examplethis can tell you that Um and then thelast thing if you want to print out justa graph of how all these things fittogether Gateway Cuddle can also do thatSo this is just an example of the thingswe're working on to focus on providing abetter experience with gateway API Iknow it can feel very overwhelming touse an API that has this many featuresbut we really do care about making thisexperience better So to be clear there'slots more work to do with Gateway CuddleBelieve it or not all that fit into 0.1of Gateway Cuddle So we have a lot moreto go Uh but I'm excited about just whatwe can do to make the experience betterfor Gateway really do care about makingthis transition as smooth as it possiblycan be and there's lots of room to helpout and give us feedback on what worksand what doesn'tSo again um closing the gap are are weare we ready for this um I mean I'llI'll tell you right now we're not It'spretty obvious from this conversationbut you know there are again other toolsthat help make this possible even ifyou're not using Ingate So ingress thegateway API is a project from KubernetesSIG that has this idea of like aprov�ider framework So if you haveingresses that are you know in GCEpsyllium any of those things you canconvert those to gateway API resourcesand give the YAML manifest and give youan idea of what it would look like Againthis is one of those things where it'sit's 8020 There's also contributors tothis And we went and looked and I toldyou and I'm going to continue to saythere's 118 annotations Unfortunatelyingressive gateway only supportssix Not great So even from the migrationpath perspective we still have work todo to help make that migration you knowsmooth And with that Rob has Yeah So sowhat are we going to do you know thatthat this seems like an insurmountabletask There's a lot of features to get inand we have a relatively short timelineto do it in Uh one of the things thatyou may know about gateway API uh isthat we try to limit the number offeatures in every release so that we canhave manageable releases and focus onthe things that we think are mostimportant Uh what we're planning to dogoing forward uh until we can close thisgap or close the most important gapswith ingress engineext that you tell usis important Yes that you tell us Sothis is again based on that survey Don'tworry there's going to be another linkat the end If you don't respond to thatsurvey please please don't blame uslater We're relying a lot on that surveyUh so we're going to add two temporaryslots just to help close this gap thatare going to be just for Ingress EngineXcompatibility and they're meant to closefeature gaps with Ingress EngineXexclusively We're going to aim torelease at least three times each in thenext in 2025 and 2026 Uh so you look atthat and you say okay well that's 12features for ingress engineext I I getit that's not a lot Remember that we'rewe're talking about categories This isnot you know like say 10 annotations maybe a single feature This is not aonetoone mapping but just gives you anidea We're not going to be able to coverall 118 but I think we're going to beable to cover a huge percentage of themSo that's what we're working on uh as wego forward Now I I do want to be clearthat there are some guidelines here Uhall the new features are subject to allthe same criteria we already have ingateway API We can't just let everysingle thing into the API We're notgoing to lower standards I'm not I'm notsaying ingress engineext standards arelow for by by any stretch I'm justsaying we have to meet all the samecriteria that we would This is nottrying to special case any one thing Uhand also we we will only accept featuresthat we believe can be securelyimplemented Uh and also oh you don'twant to let users use Lua you know Imean it sounds fun but okay uh and we wewill also not accept features thatcannot be implement can only beimplemented by a single proxy whetherthat's engine X envoy and uh you pick uhthe whole idea with gateway API istrying to provide a portable API acrossimplementations uh remember we have 30or so implementations of gateway API wewant you to be able to migrate to anyone of those just as easily uh therethis is just to help you migrate to togateway Um now you may ask is thisspecial treatment i I have gotten thatquestion before Uh we are prioritizing amigration path for users of a SIGnetwork project that is going to bearchived that like you think about thisas a broader Kubernetes SIG This is aSIG network project and we're saying heythis project is being archived in somesome approximate time in the future I Iwon't go there Uh and we need to providea path off So yes this is this is notprioritizing a destination It's tryingto provide a safe path for people thatare already using a project that isunder SIG network Uh of course if otheringress controllers are planning asimilar archival happy to help with thatmigration as well But again that'sthat's what we're trying to dohere Keep wanting to say that this is anapproximate timeline So we've gotgateway API version implementations onhere So talking about those feature setsand trying to get those implemented Umworking on you know we we announced itat I think it was Salt Lake City whichwas a year year and a half conversationum in the making So you know we want thegoal again the goal the approximationthese things is that in Atlanta we'll beable to have you know an 01 update thatfolks can test and continue to work onand give us feedback on that Um when Isay stable from an endgate perspectiveum I we are going to define what weconsider to be stable in my mind it'spassing the conformance tests becausethat's one of the requirements for agate rate API implementation uh versionone two maybe at this time you know6 wewill be using controller runtime sohopefully that helps you know lessen theload on us as maintainers to work onthat and then once we have that stablerelease that we've defined we will startthe conversation of with the Kubernetescommunity that the project will actuallybe archived We're saying maintenancemode Maintenance mode from an ingressengineext perspective Um I talked aboutthis in my previous talk you know like 5minutes literally before this one isthat we're going to continue withmonthly patch releases If you have a PRthat is open for ingress engine X rightnow please let us know because we areonly planning one more minor release Weonly implement new features in minorreleases but we'll continue withsupporting all of the Kubernetesversions that are going to come up until2027 Um so 113 will look like likely bethe last minor release and we'llcontinue with patches you know goingangKubernetes versions updates from that Umbut it again it is a you know it's aconversation between you know themaintainers of Endgate and the communityand we can't have a conversation unlessyou let usknow what you're using and you know whatyou're looking for from theimplementation Please take the surveypleaseSo withthat yeah so uh let's let's we'll we'llbring the survey slide up at the end Ipromise because we we really do want asmany people as possible to be able totake that Um so the the features usedmost uh by ingress by most ingressengineext users should be includednatively in gateway API That's our biggoal We'll need help to get there butthat is the huge focus for us over thenext 18 months we'll say Uh and not justshould they be in ingress engineext wealso want them to be in ingress togateway so you can automatically takeyour ingress engineext manifests convertthem to gateway API and have equivalentbehavior at the end Um we also want tohighlight that you know it's not justall the features you already have We'rehoping that this will unlock a whole newset of capabilities for you as youmigrate to gateway API And again we knowthat it's it's a complex migration Youknow there's lots of new objects There'slots of lots of new arbback There's itit is going to be a diff it's adifficult migration That's why wecontinue to ask people to help us let usknow what is difficult about it You knowwhat are the bugs what's the userexperience like does it work in you knowOpen Shift so there's lots of thingsthat you know there's things that wejust don't know from a user experienceperspective So please let us knowAnd if you can't tell there's a lot ofwork to do So if you would like to helpmake this journey better for everyone uhwe both projects so gateway API itselfhas weekly community meetings wedefinitely want to welcome contributorsfor from every background and also ifyou're working want to work on a gatewayimplementation by the maintainers ofingress engineext that's ingate and sothat information is up there as well Uhbut with that oh one last thing sorry Uhat KubeCon tomorrow if you're interestedin having kind of a group discussionabout how we can make this better we'rehaving a just working session basicallySo come provide your feedback We'll haveingress engine ingress engineextmaintainers gateway API maintainersthere to hear your feedback help usprioritize the future together and maybeif you're interested help get involvedUh we can we can point you to what'snext Uh and with that let me jump backto this allimp important side Pleasetake the survey Uh and that is all wehave for you today Thank you so much Saywe got2025-04-15 22:02:26.883940�e interface let's say afirewall for instance what do you do sothe fact that uh Kubernetes itself doesnot answer this does not mean that it'snot possible there are ways out of treeso the Kubernetes uh network plumbingwork group came up or came came forwardwith a the de facto standard formulti-et networking which is implementedby multiscni there are more CNIs thatimplement it but for the sake of thispresentation we will focus exclusivelyon multisuh which implements that but out ofKubernetes so the way this works or theway it would work in production your uhcluster admin or your network admin willwould come up and would provision thisnetwork attachment definitionum CR that you see in your uh bottomleft and they would just provision likea JSON encoded they would just provisionthe CNI configuration in each of thesethings so a network attachmentdefinition is essentially like a bag ofholding for a CNI configuration for onenetwork and then your user your podowner when they provision the pod theywould list in a dedicated uhannotation the list of additionalattachments they want to have andfinally you have Maltus configured inyour system to be the cluster defaultnetwork uh CNI and what it will do isconfigure uh the cluster default networkattachment and it will issue um and itwill plump the additional attachmentsthat you referred to in this particularannotation uh this has some problemsfirst it's very errorprone and is verycumbersome why is that just look at theannotation and at the network attachmentdefinition spec.com attribute it'spretty much like a JSON encoded stringthat you put there and if you make anymistake some things might happen firstof which it might just flat out erroryour pod because you have put likeinvalid JSON so it cannot parse it soyour pod will not start worse than thatis let's say that you are putting likethe wrong attribute because you misreadyour documentation for instance well forinstance this happened to me a couple oftimes like we have an attribute calleduh VLAN ID and if you forget that it'sVLAN ID and you put their VLAN well itwill do nothing and we'll connect yourpod to the default VLAN instead of thethe VLAN that you want it to beconnected to which it will be extremelyhard to debugso to address this these shortcomings umthere came forward like a group calleduh multi- network inside well a subgroupof the SIG network focused in trying tobring multi-et network into theKubernetes tree and this happened overthis uh Kubernetes enhancement proposal3698 and the idea is mostly to translatefrom this network attachment definitioncreates something called the pod networkin which you would define your providerso you'd have something running yourcluster that would look anytime a podgets scheduled it would see that wellthis is requesting additionalattachments for this particularum network name so in the bottom uhright corner you see what would be theequivalent it's a provider which doesdifferent things and on the upper rightcorner you have a pod this pod the mostrelevant thing here is it has a networksarray in which you define theattachments but in a let's say a typedway which uh makes it simpler for yousince you no longer have to use thisJSON encoded string by and by using it alot of things could go wrong so thing isthe feedback uh the community got fromSIG network was essentially thatchanging the pod object specification istaking it like way too far you need tocome up with a way to you need to comeup with a Kubernetes native way ofrequesting a generic resource and inyour use case that happens to be anetwork interface if you do that we'regood let's say the feedback was the usecase is relevant it makes sense but likeplease do not touch the pod objectspecification that's taking it a littlebit too far and that's the first thingwe want to focus on we need to come upwith a Kubernetes native way ofrequesting a generic resource whichhappens to be a networking interfacethe second thing which is relevant uh toDRRA or to the upcoming uh DRRA uhpresentation that Lionel will walk usthrough is a device plugin the deviceplugin� allows you to mount um a physicaldevice on your node into your workloadsit's available since Kubernetes 1.8 itwas uh in alpha there and its resourcemodel is extremely simple what I mean bythis is that your pods will have canrequest this for instance thisvendor.com/device equals x and byextremely simple I mean that this x mustbe an integer so when requesting thingsyou can request let's say I want five ofthose I want one of those I want uh athousand of those but this means thatyou have absolutely no way of um havinga shareddevice and uh this is the shortcoming orthe biggest shortcoming we see right nowin the device plugins there is no wayfor you to share a device acrossdifferentpods so as Miguel mentioned the the capum 3698 for multi network has not beenmerged it didn't get in in Kubernetesso the the the work around thisinitiative hasn't really stopped in factit's already continuing and starting toget implementation within Kubernetes andinstead of having one group it's spreadover multiple different communities umwhat I mean by that is for example thenetwork configuration is going throughum uh the CNI and other underlying APIssuch as CRI or NRI um the services aregoing through uh get API um multinetwork uh group is still existing andstill working on a a pod network API sothe userf facing API that was proposedin the original cap and the last onethat is the most important and it's whatMiguel was mentioning is to attach thepod to the um to the network so to havea network attachment and this goesthrough the array so what is the arraythere is is a new feature withinKubernetes that is currently in v1 beta1 in 1.32 will be in v1 beta 2 uh in 133and is aiming for GA in 134 so thisfeature GR allows us to uh or allowsusers to request uh resources for thepod and the containers so the theresources will be available on the nodethe re the the pod will request them uhusing some uh API and then the so itwill request them uh using somecharacteristic and then the thekubernetes uh uh or the dr feature willschedule the pod based on thisrequirement configure the resource u onthe pod so what we are seeing here inthis picture is um is GPU so GPU is oneof type of the resource it's used mainlyfor a IML use cases uh but the resourcein fact can be anything it can be anFPGA it can be compute resources it canbe volume it can also be networkinterfaces we can consider networkinterfaces as a type of resources thatwe can request uh again with somespecific characteristics and and noderequirements and and each node in thatcase will have different uh uh networkinterfaces will have different differentcharacteristic as I said and then theport can request them can create networkinterfaces based on this interfaces souse some uh partition of this networkinterfaces um but it can also be the thefull device and claim the full device soif we do that in that way using thearray we we consider uh a networkinterface has a resource has a devicethen we can attach the pose to thesecondary networks so how does it workbehind um the first part for the arraywould be the scheduling in in the arraywe have this object called resourceslice uh that lists what is available inthe in the nodes so we can list the theall the candle interfaces we can listthe the physical function the network uhvirtual function um so the pod will requwill request this this device or deviceon top of this device by using somespecific characteristic and then thescheduleuler will do his job to schedulethe pod where this resources all theresources that the pod requested uh uhthe the scheduleuler will will find anode where all the these resources willbe available so this enable for examplenon-uniform cluster uh in a more nativeway uh as Miguel been mentioning beforewe have a device uh plugin uh togetherwith mult but um complex use cases areare not yet available one of the thingI'd like to mention is this cap5075 that is run by Pang from IBMresearch and that allows uh user toclaim uh virtual devices on top of uhexisting node resources so for examplewe can uh claim a dedicate or have adedicated bandwidth of a MAC villain ontop of an ETH or on top of networkinterface so in in this uh um example onthe slides we we have a net oneinterface that is that is claiming u 7gig of 10 gig inter node interface ETH1and this is one of the example but thereis a lot of use cases behind uh uh5075 the the second part for J is thenetwork configuration so configuring thedevice configuring the network interfacefor that we have a new API uh incontainerd and other container runtimethat is called node resource interfacethat allows us to hook into pod and lifeand container life cycle events so wecan detect when the pod is being createdand we can do some action at that timeso at that time when the pod is beingcreated we can uh um get the networkrequirement of this code call um orcreate this network interfaces based onthis requirement so it can be done forexample via CNI or it could be done withwith anything it's just up to the NRI uhplug-in or the driver to to do that tocreate the network interfaces and thelast part is to set the status of ofthese uh networkinterfaces for the status we have thisuh cap uh487 that has been merged uh since 132 isit it's in alpha since 132 and will bein beta in133 so it allows the DR driver to reportthe status of this device and thisnetwork interfaces that has beenconfigured on the podso what can be um uh reported it can beuh the IP addresses uh it can be the MACaddresses the interface name uh it canbe arbitrary data for example the CNIconfig or the CNI result um it can bealso the conditions so let's say thatyou have a pod is that is ready um wecan have a network interfaces that isdedicated for some specific traffic thatis not ready we we can have a differentlevel of readiness uh for the pod andhave uh the readiness for each of thenetwork interfaces uh of this pod so nowwe we no longer have only the the podips in the status we also have this umuh a status in the resource claim thatindicates what are the IPs of of thepods the the for me the two main reasonto have the status is the first one isthe uh the troubleshooting so we canknow with this readiness why um the thenetwork uh interface is no longer readyor why it has not been configuredcorrectly or could not be configured andthe second reason is the networkservices so if we have the IP addressesavailable in the API like this we canfetch all the IP addresses and then forexample load balance traffic over thisuh secondary IPaddresses the um now let's have a a demoof of this so we will run a simple demothat uh will run two uh um nodesKubernetes nodes the first working nodeum has two network interfaces it hasETH1 and ETH0 the second one has onlyETH0 and we will schedule 15 pods thatwill request the MAC villain interfacebased on this EGH1 so we will verify thenetwork interface has been created andwe will verify uh what has been whatwhat is the statusso I hope you can see correctly um let'scheck what we have in thecluster so we have this CNID driver herethat is uh the driver that is made fornetworking and that is configuring thenetwork interfaces using CNIso this is the only custom components inin in Kubernetes otherwise it's justnormal Kubernetes without any othermodifications so let's verify now thenodes we have um the kind workernode withETH0 andETH1 and we have ETH0 on the worker 2only um and we have only two uh workernodes like this the control plane we arenot really looking into this um so uh Iwould like also to share uh or to showthe resource slice that is listing thedevices that are available in the podsso this is a a grobject so we have two resource slice onefor each umum worker workernodeoops so we can see ETH0 andETH1 so we have two interfaces asdemonstrated previously withIPA and for the other worker we haveonly ETH0 so this network interfaces areexposed in the resourceslice so what we will do now is todeploy the the 15 um replicas that willrequest um mag interface based on thisEGH1 so I applied a resource claimtemplate and a deployment so let's seethe deployment firstget deployment and the deployment iscalled uh demoapplication so this deployment is justan um normal uh pods uh running Alpinewithout doing anything is just for thedemo but the interesting part is that itis pointing uh uh on the resource claimyeah you can see it it is pointing uh tothe resource claim mag villain uh or tothe resource claim template magnanthattachment so this ETH uh MagVan ETH1attachment isum uh contains the CNI config and everyelements to configure the mag villainterface based onEG0 so let's see what we have in ituh eth1 attachmentoopsyou're listing resource slices oh yeahthanks uh resource claim templates yeahthanks and then in this um resourceclaim template we have the CNI config uhthe full CNI config in YAML format inthat case compared to multus that wasstoring uh the um uh the uh CNI configin a JSON format as Miguel me u showedpreviously and it contains a resourcethat request the ETH1 uh to be availablewhere the node will where the pod willbe scheduled so the 15 pods has beenscheduled and they have they have beenall scheduled onum on the kind worker only because ETH1is available only there and then if wecheck the uh resource claimuh 15 resource claim has been created soone for each board uh that are part ofthis uh deploymentso let's check theum resource claim and we will see thestatus and what has been configured inthe podso in this pod um we get uh an interfacecalled net oneuhhere that has uh this uh IP address101011/24 and this MAC address se uh 77764C17 so let's verify this in the portdirectly so this resource claim is hasbeen created for the the this umuh um pod that we're seeing in theresult for uhfield so if we check uh using IPcommands what has beencreated we can see that um the thenetwork interface network one isexisting10.10.11/24 with the MAC address thathas been stored in the resource claimdedicated for this uhpod and uh we can we have now the pod uhum attached to the secondary networksusing um CNI so in the same way asMoltus was doing just in a native waynow usingDR so if you that that was everything Ihad for the demo and if you want to getinvolved in this work and and get um uhuh have or to have more progress uhwithin uh the community um there is fouruh communities that I would recommend uhthe first one is the CNI community uhwhere we are talking mostly about CNI2.0 zero and how to make CNI moreKubernetes aware especially about u umthe problems that Miguel mentioned andthe uh the scheduling part uh the secondmeeting I would recommend is the workinggroup device management that is talkingabout the array and that is a groupingof multiple sigs in cubernetes forexample sig network sig scaling signalsetc the third one is multi network thatis working on this API and then it'sfollowing every um every single uh thingthat is related to to uh the theoriginal cap uh that uh got rejected umand the last uh community I wouldrecommend is the plumbing group uh thatis running multus and many other coolproject like SRV device plug-in operatorum and where we are al also talkingabout multinetworking so has a conclusion for thiswork um so the DRA um allows us torequest or allows users to request anykind of data uh and the networkingnetwork interfaces can be consider umsorry the D allows the users to requestany kind of um resources uh for the podsand we can considered as we have seenhere in the demo that the networkinterface can be uh a resource uh or adevice and one of the upcoming featurethat uh will appear at some point isthis consumable uh capacity that willenable this kind of use case so we canuh um uh claim some or create virtualdevices and claim parts of the noderesources so I would like also to thankyou for two minutes okay um yeah I wouldlike to thank you for your attention andu if you have any any question uh pleasecome here and then or come after andthen we'll be happy to answer them yeahsure if you can also like provide ussome feedback to our talk just scan thisand tell us goodbad anything goes uh if you want to makesome question now I think there's amicrophone somewhere there in the middleyou'd have to go theresorry and if not I think we're donethank you thank you2025-04-15 22:02:27.587650 ��`�#��wAPgCaIyeRn6Yhello everyone uh we're here at CubeConEurope 2025 uh this time around inLondon my name is Miguel i'm here withLionol and we are presenting a talktitled Uncharted Waters dynamic resourceallocation fornetworking the plan here is for us totell a little bit about uh dynamicresource allocation what it is it standsshort uh short notion is DRA andespecially what it means for theKubernetes multi-et network uhecosystem let's first introduceourselves so I'm Lionel i'm working atEricson software technology i'm mostlyinvolved in the CNI community in themulti network community and SIG networkand I'm also involved currently in thearea to have this multi networkhappening with the array amazing and I'muh Miguel Dwart i'm a software engineerworking for Red Hat uh particularly onthe Open Shift virtualization networkingteam i'm working on pretty much likepushing virtualization features intoOpen Shift'sSDN and let us walk through the agendafor this presentation like we will beginwith the motivation and the problemstatement uh for for this from there wewill uh explain and paint the multi-etnetwork landscape today for uh in theKubernetes ecosystem and from there wewill perform like an introduction to uhdynamic resource allocation especiallywhat it means in the context ofnetworking andmulti-etworking from there we will uhwalk you through the most relevantKubernetes enhancement proposals uhspecifically for multi-et networking anda very dedicated or very specificscheduling use case and we will finalizewith a livedemo so the problem at least as we seeit is Kubernetes is opinionated it isvery opinionated according to whom youask and by this we mean that itsnetworking model is extremely simple andthat might end up being like adouble-edged sword because what happensis every pod on your cluster will getexactly one interface and every pod inyour cluster will be interconnected andable to reach anybody else so extremelysimple but if you don't want that wellif you want micro segmentation you canput uh network policies on top to kindofum define access and who gets access towhat resource on what port and sort butthis thing is expensive and it'sexpensive in two different ways thefirst of which is computationallyexpensive meaning that every time a podgets created or you provision in anetwork policy you have to re the systemhas to reconcile itself recomputee whichIPs can access which IPs on which portsand the whole shebang it is alsocomputationally expensive becausesomebody must write these policies andsomebody must maintain these policiesand make sure that they make senseholistically in your system noteverything is bad one thing that isquite good about this is that you willaccording to the Kubernetes uhnetworking model there is no on eastwest traffic and that thing well againin our opinion opinionated way amazinguh the motivation now for this is butwhat if you don't want the defaultcluster network attachment what do youdo you will according to the networknetworking model you will have itanother thing and that is excuse me thatis very important important for instancein the virtualization use cases so againI workum for open shift virtualization so itit's upstream is Qvert what happensthere is most of the people that we'reseeing they really do not even plump thecluster default network into theirvirtual machines they just attach like asecondary network and just use that butstill their VMs which run on pods havethis cluster default network attached toit another uh use case that you cannotrealize with Kubernetes alone is what ifyou want more than one interface let'ssay you are instantiating like uh or youwant to to develop a VNF that requiresmore than on� the gateway API lacks insimplicity uh it makes up for inexpressiveness so all of those advancedrouting features are built in they'refirst class APIs on the gateway APIresources so they are structured um theyare validated with the schema um andthey're standard across allimplementations so if you switch from anengineext based to an envoy basedimplementation of gateway API you'lljust be able to use the same resourcesin the same way so I think if you areonly performing simple routing thenmaybe you stick with ingress for now butif you're doing more complicated routingor you have a bunch of implementationspecific annotations uh it's probably agood time to start looking into gatewayAPIspencer yeah I can also uh comment onthis so I think there's a couple thingsthat I would want to cover um youalready covered uh Kate that there's youknow much more of the surface area ofthe API is you know inside the the thecore API and it doesn't need to be addedas an extension like ingress um and Ithink that's one of the things that andyou know maybe I don't need to actuallysell everyone on on gateway because ofall the hands that we saw at thebeginning uh there's a lot of peopleusing it which is great um but yeahthere's just like generally because moreof the uh the feature set is actuallyinside the core API is you can describeit um via gateway directly without avendor specific um you know extension isthere's less vendor lock in so sometimeswe have customers um who say you know Idon't want to necessarily use GK gatewaybecause I don't want to be locked intothis you know specific using only GCPlet's say um but that's the sort of thebeauty of the gateway API itself is thatmost things can be expressed you knowwithout an extension and so you caneasily move between implementations ifyou want to um and maybe one of thetakeaways uh from this talk is that ifyou're you still can't decide on on whatwhich one is the best one for you isthat you can move between them andthat's kind of the whole that's one ofthe the main points of it um also youknow another thing is the the roleoriented architecture of gateway versusingress you have like the infrastructureprovider is you know owns the gatewayclass you have the cluster operatormight own the gateway and then you havethe application owner or applicationdeveloper who owns the HTTP route andthe services um and that just you knowif you configure that with your arbbackuh setup it's just kind of a nice easyway to to handle the separation um ofresources and keep it secure um and Ithink lastly is just that as most peopleknow you know there's much more uhdevelopment and uh involvement now withwith gateway versus ingress um and soyou know if you're looking for somethingnew uh and you want to use it for a longtime and you want to uh see somethingthat's going to be very well-maintainedand uh continue to grow and develop thenGateway is obviously uh the choice foryouall right so with nearly 30implementations how should usersapproach the decision-making processwhat are the top three factors theyshould consider when choosing animplementationso coming up with three factors is hardlet me break it up break it up intothree steps and hopefully try toeliminate from 30 to 7 or 5 to 7 thenfrom 7 to 3 and then 3 to 1 uh step onewould be feature parity aka spreadsheettime uh this is where you jot down allyour use cases and map it toimplementation capabilities uh one goodresource for this is uh the conformancereport page uh in gateway API built byChristine andMatea another good resource is theingress table in the learn learn kitpage.io page um another thing you wantto measure in this step is uh costspecifically migration cost and purchasecosts so hopefully by this by the end ofthe step you've narrowed it down to fiveor seven uh in step two we'll hopefullynarrow it down to three um using dataplane and itsdifferentiators for example on web proxyhas some rich observability andextensibility and engineext does reallywell uh for serving static assetsstep three is PC time uh hopefully wehave three implementations that we'venarrowed it down to and this is where weactually test out these implementationsand see if the features actually work umI recommend sort of logging experiencedown so that you can measure operationalcost another thing to do in this step isum running uh pro tests that arespecific for your kind of traffic andscaleyes I think we can sort of break it upinto three steps of feature parityum data plane differentiators and P tonarrow it down from 30to1 so I'll throw another wrench inthere one of the other perks of usinggateway API is that it's not justlimited to north south trafficmanagement you can also use gateway APIfor east west traffic so your servicemesh traffic um we have multiple meshesthat have support for gateway APIso when you're looking at this it's notjust limited to your mesh necessarilybut that may also inform your decisionof which north south gateway you'reusing to pair with that mesh so ifyou're using STTO STTO has a built-iningress gateway with support for gatewayAPI which is going to be seamless tohave connectivity between the gatewayand the services in your mesh with MTLSif you're using linkerd they don't havetheir own ingress but they do have agreat way of integrating with othergateways so uh you also might want uhpotentially some of the experimentalfeatures that envoy gateways adoptingearlier than some projects so pairingthat with one of your meshes or using acloud provider specific um gateway APIimplementation at that north south layerpotentially for better integration withuh some like DNS stuff certificatemanagement uh generally betterintegration with the rest of yourpresence on that cloudthanks Mikeall right so how do day2 operationsdiffer acrossimplementations are there specificdebugging or observability features thatare critical and yeah I I can answerthis one so I I think we should quicklyjust define day two operations so inthis sort of paradigm you have like dayzero operations which might be sort ofthe pre-work before you uh actuallydeploy something and you have day oneoperations which is you know the workyou're doing to actually deploy uh yourintegration and then you know uh day twois sort of the ongoing you knowmaintenance um load that that you haveto to keep it going so um I I think likethings to look for with you know uh day2for gateway implementation specificallyis probably one of the big ones is isand you might see this partially as partof day one so it's a little bit cheatingbut uh is the CRD management this issort of one of the interesting thingsabout gateway API is that it's developedoutside of the core Kubernetes tree umand so because of that the installationof the CRDs is either you know a stepthat you do uh before you actuallydeploy your implementation in thecluster uh whichever one you choose uhor if your cloud provider does it foryou um then that is potentially you knowan added benefit of something that youdon't have to manage going forward islike version upgrades uh versionmanagement um etc because especially asyou are potentially deploying multipleimplementations which which is I guessanother takeaway from this talk uh orfrom this panel is that you you can usemultiple depending on if you're doingyou know you want to do north south andalso east west uh the CR CRD managementcan get kind of messy uh and you knowdoing version upgrades and etc and so ifyour cloud provider can handle that foryou um you know with theirimplementation that could be you know abenefit that you might want to look forum I think you know other than thatother you know things to look out for isuh observability and is is obviouslycrucial I think a lot of theimplementations for gateway use envoy asthe underlying you know proxyimplementation envoy provides a lot ofmetrics uh that I think are super usefulum but also you know I think especiallywhen you're dealing with a cloudprovider specific or an outofcluster umyou know implementation of a proxy Ithink it's really important to make surethat it has all of the you know metricsand observability you need because it issomething that you don't manage yourselfit's a managed resource you know amanaged load balancer application loadbalancer that that lives outside yourclusteryeah just also echoing Spencer metricsare super important that's how you showup in management like what is going tobe good and like justify those costoverhead so for example like Selium hasa great feature that's called Hubble andthat's really good at you know justpeeking into your cluster and your uhnorth south traffic and making sure thatyour metrics are aligned and you knoweverything is flowing your traffic isgoing goodall right so is there an effort tostandardize implementation behavior oris kind of diversity the goal and how dofolks here try to balance flexibilityand sharing some of the kind of thebenefits of their implementations withconformance and fitting into what theagreed upon standards areso I'd say the primary goal is to haveconformant behavior for the set ofcommon features that are common acrossmost implementationsbut as a secondary goal um flexibilityum is needed so implementations have thefreedom to build on top to solve userspecific problemsuh yeah I agree with Arco the the goalis definitely to standardize like asmuch as possible wherever possible umbut like you can't always so like oneeasy example is if you have proprietaryconfiguration so like for EngineXexample um there's engineext config thatjust doesn't apply to envoy right soit's going to be really hard tostandardize that in the gateway API butusers still want to be able to tweakthose configuration settings so in thatcase you can um build on the gateway APIlike Arco said using some of theextension mechanisms and one of thosethat we used at EngineX were likepolicies so you could create policies umthat allow users to like set clientconnection settings or upstream settingsfor EngineX um but they still likeattach to gateway API resources and thatextension mechanism is still standardlike itself even though the behavior andthe impulation might change like youknow what policy attachment is and lookslike and how to use it generally um sothere's still some standardizationthere and then I also think like thereare cases where an implementation mightwant to deliver a feature faster than ispossible when you're trying tostandardize it like those standardsdon't like they're not easy to developthey don't come overnight um and theythey just take time so like O is areally good example of that the gatewayAPI has been working on O for a while umand they're getting closer to a standardfor it but obviously likeimplementations have a business need todeliver authentication to the users sothey like might use some of thoseextension mechanisms again like policiesor filters to be able to um let usersconfigure off before the gateway API canyou know come up with the standard soit's I think it's just a balancing actand just always trying to figure out ifthere's common ground among theimplementations and to talk to oneanother and figure out if we can createstandards where there aren't any todayand just make it a better userexperience so collaboration is a keypart of OSS how do implementers worktogether despite having competingsolutionswho wants to take that oneso I think um all implementers benefitfrom good user experience uh gateway APIhas evolved from ingress providing amore richer and expressiveAPI yes there's room for improvement butit's still a step upum three years ago when we started ongateway we made a bet to purely supportgateway API uh that's something Nickchampioned actually um and it's paid offadoption of the API has been prettypositive so I think there is value forall of us to work together um to providea common user experience yet compete onyeah value ads on top yeah so I'm goingto tell a short story here and hopefullythis is going to inspire some folks inthe audience to help contribute to thegateway API and build this projecttogether so um I attend regular STOweekly meetings that are open tocontributors uh to users etc um and inone of those meetings a few months ago auser at box.com uh Eric Bishop he cameto an STO meeting and said,"I am lookingfor a specific feature that's availablein Envoy that's not yet available in STOuh it's budgeted retries." So I wasaware that this is also kind of howlinkerd has typically done uh retriesusing a similar strategy historicallyand I encouraged him to instead oftrying to just do this in ISTTO becauseat the time like we're focused onambient and we're looking to use gatewayAPI as much as possible for our futureconfiguration uh I encourage him tobring it to the gateway API meeting andto uh try to make it into a featureproposal for something that we couldpotentially adopt across multipleimplementations so I was happy to kindof like shepherd and support him indoing that and it was voted for up forinclusion in uh the recent uh scope for1.3 the upcoming version of the gatewayAPI and yeah we worked through a gaptogether we worked through animplementation and we got feedback andreview from linkerd uh we had folks atGoogle who I wasn't aware would beinterested in this but chimed in on thepull request implementing this and weidentified a kind of like gap indifferent differentiation between linkerexisting implementation and envoys andwe identified basically like there's asmall change that we could do to envoyto be able to have a common uh specsupportable between both implementationsand yeah um we're anticipating doingthat work soon and we're really excitedto get this uh spec into experimentaland hopefully that's something that'llresonate with some of the folks in thisroom and similarly you'll be able tobring your use cases to these upstreamspec meetings like we're welcome i meanwe'd love to work with end usersso the the gateway API keeps evolvingwhat upcoming features or improvementsare most exciting for implementersso I think one of the more excitingthings is the uh announcement of Endgatealthough you guys have probably alreadyheard but there's going to be a talktoday at 5:45 uh by Rob and James andthey're talking about Endgate and what'sreally exciting is that we're reallyfocusing on user experience andusability making the process seamless togo from ingress to gateway API um sothat means we need to talk to end usersand hear what you guys care about andwhat we need to do as upstream to makeit more seamless for you uh anyone elsewants to add on to thatI asked this question but I can alsoanswer it I guess um but so I think onething I'll I'll I'll plug real quick isthat uh you know the folks at at Googleand GKE have been working very closelywith the community on the inferenceextension uh for gateway and so foranyone who is working or using um youknow inference serving workloads onKubernetes I think definitely check outthat extension uh there's quite a fewcompanies involved um there is a keynoteabout it on Friday uh with uh you knowfolks from Google and also bite danceand so I I would definitely check thatout but uh but yeah we're lookingforward um to supporting that in in GKEK already mentioned this but uh O sotackling authorization andauthentication and hopefully gettingsome of those really common use caseslike JWTs and OIDCinto the actual gateway API specyeah sure all right we're going to do aquick lightning round because we want toalso open up the mic to you guys butwe'll answer these five questions inrapid succession so Spencer you want totake it away uh biggest pain point wouldbe I'm going to say CRD management uh asan implement uh I think biggest painpoint right now is not having the CRDsyeah installed when you have it and runin your clusteri don't have anything complain aboutright now so I'm just gonna skip i'llsay policy visibility policy attachmentis great but understanding whatresources are affected by a policy canbe challenging still yeah I second thatimplementing policy attachment iscomplex um okay we'll go the other wayso favorite gateway API feature um likerightnow filter I think is my favoritei'll say a kind of weird one uhreference grant it's I see Rob laughingin the audience so it's something thatis explicit contract to make sure thatwe're offering security betweenconnections in different namespaces andreally helping to delegate that controlsafely and uh Rob's also been doing somework to try to get that concept intoupstream core Kubernetes otherimplementations have uses for it tooso I'll mention the controversial onewhich is policy attachment uh but itreally allows all of theseimplementations to kind of buildimplementation specific features uhtargeting their specific usersum I like not it's not really a featurebut like the idea of personas i thinkthat's a really good story to have thatyou know the application versus thecluster operator because we see thatteams are really large and not oneperson manageseverything so I'm I'm also going to saypolicy attachment um for anyone who'sused GK ingress before and is familiarwith our own custom CRDs for front endconfig backend config etc uh they werekind of a pain and I think policyattachment made things a lot easier forus um and then so I'll I'll go next forfor best debugging tip I will say uhstatus gateway status statusresource status is our best friendstatus yeah status keeps tell describeoh and gateway cuddle use the gatewaycuddle tool that's also really helpfulum what's next for your implementationin 2025 um enrock is currently workingon just adding API surface for thegateway API so adding support for someof the experimental route types like TLSroute and TCP routeuh oh yeah we canuh uh continuing on that one though uhSTO's current focus is ambient mode sowe're currently working on multiclusterfor ambient right now and one of theother like highlights of ambient is thatwe're focus or we're all in on gatewayAPI as our primary configurationinterface for traffic routing sodefinitely expect to see us continue todrive work upstream in the gateway APIproject to extend the functionality andkind of double down on using that uh asmuch as possiblei'm going to say zone away routing fromthe gateway to the backendsuh Sim's going to continue doing somehardening so TCP and UDP routespecificallyand then I I know I already mentionedthis but the inference extension forgateway APIall right okay so we have we have timefor open mic questions there is amicrophone straight ahead of me um ifyou would like to ask a question justplease say it into the micquestionsyeah well if no one has any questions Ithink we'll all stick around as wellafter so if you want to pick our brainsor you're too intimidated oh yesis the idea with TCP and UDP route toreplace uh lo service type load balanceror is thatlike they'll live together for a whileor but strategically it's the long-termvision or whati think for now they're just going tolive together uh we still have to dosome engineering planning work for likethis year in 2025 but for the most partit's going to it's not here to replace iknow there's a few folks who have somelike long-term ambitious ideas aboutdeconstructing service but that's notsomething that's in the immediate timeframe to worry about yetso in what uh scenario should uh someoneuse uh TCP route versus uh service typeload balancerso I think um the gateway generates theload balancer for you if you're using anin-cluster implementation um TCP routewould let you latch on to that samelistener and route TCP to your backendssay if you have a Postgress back end youcould do you could use TCP instead ofusing another protocol like HTTP routeor gRPC routeand you could also take advantage of uhsome of the functionality of gateway APIlike traffic splitting and things likethat uh for like canary or ABdeploymentsthanksoh I guess uh since there's no questionso far at least I I can use this time toplug there's a talk uh by I forget whofrom Docker tomorrow um if you want tohear about you know someone who's theirstory of of actually using gateway APIum as a customer and so I think that'spretty valuable uh for folks to go toand I also want to mention um a projectcalled ingress to gateway if anyone'slooking to migrate from ingress to thegateway API um this tool will helptranslate those ingress resources intogateway API resources so please givethat a tryand is that itall right well that's it thank you guysso much please leave a review from theQR code2025-04-15 22:02:30.242234 ��#��uAQzE6vSgcyT0well hello everyone thank you for comingto Taming the Traffic selecting theperfect gateway API implementation foryou so we put together this panelbecause um as of today there are almost30 implementations of the gateway APIand we thought that it might be a littlechallenging to find the right one foryou but we aren't going to sit here andtell you that there's one gateway APIimplementation that's going to work foreveryone or one that's the best um thereis no de facto default blessedimplementation of the gateway API but wedo want to start this conversation andgive you some more information andcontext to hopefully help you make thatdecision um but first up before we getstarted I wanted to see how many of youhave heard of the gateway API if youdon't mind raising your hands okay a lotand then keep your hands up if you areusingit okay less people and then what aboutif you're using it inproduction oh okay all right fair amountof people awesome so the way that we'restruct we structured this panel is weput together a series of questions withthe help from our community and we'regoing to start with those as kind of thebase um and then we'll open up the micto you um but first up let's just dosome intros go to the nextslide uh my name is Kate Osborne i'm asoftware engineer i recently joined Enroum they have an implementation of thegateway API called the Enro Kubernetesoperator um but prior to that I workedat EngineX for a few years on their uhgateway API implementation and I'mChristine Kim uh I focus on open sourcedeveloper experience at isovalent atCisco uh we isovalent itself uh focusesaround psyllium and yeah I've beenaround gateway API for a little bit yeahi'm Mike Morris i work uh at Microsoftas a product manager on our openupstream open source networking team umI primarily work on gateway API and STTOi'm Spencer H i'm a software engineer atGoogle um I work on GK Gateway i've beenworking on GK Gateway for a couple yearsnow and then before that was working onGK ingressi'm Marco i'm a maintainer on the EnvoyGateway project uh this is a sub projectin inside envoy that implements thegateway API to manage north southtraffic also a reviewer and acontributor to the gateway API uh andI'm an engineer at Tetrate where I workon TEG which is an enterprise offeringof EnvyGateway so let's start by addressing theelephant in the room uh I'm a user wholikes ingress why should I change to thegateway APIspencer do you want to take that one kdo you want to go first orOkay so something I hear a lot um aboutthe ingress API when compared to gatewayis that it's simple um and I definitelyagree with that it is a simple API it'seasy to use um but when you need moreadvanced routing capabilities liketraffic splitting or request mirroringheader manipulation those types ofthings that's when ingress starts tofall short and you reach for likeannotations right and annotations arealso simple they're string to stringvalues um but they're unstructured whichmeans you can't apply schema to them sothey aren't validated that way um yourely on the implementation to validatethem at runtime um and sometimes ifthat's not done well or in a robust waythat can open you up to some securityvulnerabilities um in addition I thinkthey're slightly more challenging to usebecause they're not as expressive you'relimited to strings and finally uhthey're not standard so if you have ifyou're using an EngineX basedimplementation and you want to switch tolike say an envoy envoy basedimplementation those annotations aren'tgoing to be the same and you're going tobe spending a lot of time mappingbetween the two of themso what s uh one of theearliest projects ever uh you know partof the to to to join the CNCF I think uhmaybe fourth or fifth project um linkerdhas been around for almostuh almost 10 years at this point anduh I haveopinions which which I'm going to sharewith you so yeah get ready so what'sthis talk about you know this is goingto be a talk about sidecars and and I'mgoing to I'm going to provide some kindof like nonservice mesh you know sidecarcontent so we're all on the same pageit's also going to be a talk aboutengineering trade-offs and like how tothink through some of the stuff it's notreally going to be a talk about right orwrong old or new or or good or evilbecause um you know I think in some youknow in some cases the answer is clearin some cases the answer is not and andin the vast the reality is in the vastmajority of you know kind of engineeringdecisions the answer is not really thatclear you know the answer is like hereare the trade-offs and you have todecide which one makes the most sensefor you um so I'll I'll try and presentsome data someobservations um I'm going to present myopinions and and my conclusions but I'lltry and present it in a way where youcan you can draw your own that soundgood allrightso we're going to talk about sidecarsi'm I'm going to tell you what a sidecaris which is the answer is it's it's justa pattern and if you go back to likeancient blog posts from2015 we've been talking about sidecarslike even in the Kubernetes docs for for10 years um but we never really had likea an object in the in the API or in theKubernetes spec you know that kind offormally was about uh sidecars orformally captured the idea of a sidecaruntil into 20 until 2023 so I saidsidecars are a pattern you know the ideais that you take a container and you runit next to a main container right andthat's that's really what it is um sobecause they're they're both in the samepod they share the same network namespace they're in the same croups theythey share file system through volumeouts otherwise the two containers areisolated right because they'recontainers that's the whole point um andthey they run basically as long as yourapplication runs your application beingthe main container and when your appterminates so does your sidecar i put anasterisk therebecause there have been advances in inin sidec cars over the past coupleyears um so in the case of uh somethinglike linkard you know we have this thingcalled the linkerd proxy which is thislittle rust application and we stick itright in there you know and yourapplication is in there and we have thisinit container called linkard i'm notgoing to go into like too many of thethe gory details um of of how this worksi've given probably a thousand extremelyboring talks about how service mesheswork at this point um but you knowthere's other many other examples sohere's an example i think I I stole thisfrom some uh Kubernetes doc somewhere wegot the streaming container is like aanother sidecar um that you know takesyour logs and send your logs to like a aloggingbackend um there's there's a coupleother examples vault has a there's asidecar open telemetry um you know andof course service meshes here's like Idon't know at the time of this at thetime of the screen cap this was like theset of service meshes in the CNCF uhlandscapepicture and so like that's that's theidea right it's a pattern okay and andwhy like why do we care about this umwell it's an it's a really convenientway to add functionality without havingto change the application right asopposed to a library or something whereyou have to uh you know at a minimumrecompile the application so it can belanguage and framework independent itcan be owned by the platform teaminstead of by the developer team that'sa good one uh for us as as operators umand then the other really nice aspectsand I'm going to talk more about thiskind of in the second part of the talkis they have a a really clearoperational model right every sidecarcontainer belongs to the pod so youtreat it just the same way that youtreat any other pod if that thing diesit gets restart ed you know just like anyother container if the pod dies you canrestart the pod uh you know it's got areally clear security model right in thecase of a a sidecar uh a service meshsidecar where that that the the sidecarhandles all traffic coming to and fromthe applicationum you know it it's like it's kind oflike the little firewall for that podright so the security model is prettyclear um and it makes use of all thismassive technology investment that we'vedone for isolation right so the wholepoint of containers is isolationright it's making sense sofar all rightum you know I Okay I guess I gave awaythe punch line to this right like here'shere's how this works for service meshesthe sidecar uh uh you know proxy sits inthere and it handles all traffic to andto and come uh to and from theapplication container right we don'thave to delve into that linkerd has thisparticular uh implementation but reallyyou know it's it's kind of likeindependent right we have this we havethis rust microproxyuh lets us do all this this fancy stuffwe write it in rust we can avoid all theyou know all the um kind of security uhbugs that are kind of a a fact of lifewith C and andC++ but you know sidecars have downsidesas well I put r/work because some ofthese I'll talk about in a minute someof these uh have been eased right thatso what are the downsides of sidecarsthe first one is that like if a sidecaris really big then it sucks right andthis you know for those of you who areold enough to remember linkerd 1.x Xwhich is written on the JVM you know thethe the smallest we could squeeze thatthing down was like 150 megs and so ifyou have a a 50meg like uh Go microsservice and then we're asking you to put150 meg proxy next well you're like hit's not it's not really the likelightweight you know thing you wouldexpect right side truck is what I callit um the other downside is that podsare immutable so if you ever need tochange the sidecar that means you haveto restart the pod and you have torestart all the containers in there ifyour application is not built for thatwell that's a big problem right now youhave to tie the sidecar kind of or theservice mesh or whatever you know kindof technology the sidecar isimplementing you have to tie that lifecycle to the application life cycle umnow you know hopefully we're all muchbetter at writing applications thesedays and like you know we're used tobuilding things for Kubernetes and werealize that Kubernetes can restart ourpods at any point for any reason there'sno guarantees and like it's not that bigof a deal but if you've got legacyapplications it could be a big dealum another downside of sidecarshistorically has been job termination soif you're running a job instead of a umyou know instead of a kind ofcontinuously running uh service um thenyou have to synchronize a terminationbetween two containers in it containerrace conditions regular containerstartup ordering race conditions youknow uh there's a bunch of other umthings here that are kind of range fromlike severe to to warts you know this islike uh pneumonia to to warts I wouldsay um I think the other problem withsidecar which is not really recognizedbut and this is more of a psychologicalone is that you see the sidecars rightso if you for service mesh if you thinkof the service mesh as a networkingmodel then it's weird to see thiscomponent kind of like alongside yourapplication component we're used to thenetwork being independent or invisibleand like the network engineers handlethat and we as as beautiful pure opspeople don't have to handle that but nowyou're seeing it so it's weird um andyou get to see the usage too right ifthis was like a a fat client libraryright if this was like a really richgRPC library and it was doing all thisstuff the resource usage of that gRPCgRPC library would be kind of built intothe resource usage of the application asa whole we wouldn't see it pulled outbut now we see it pulled out you knowand and so now we like have this obviousthing that maybe we didn't want to beobvious and we can see just how muchresources it consumes i'll skip thisI'll skip this tweet which I sincedeleted um I had afunny funny in quotes ebpf jokeum so yeah sidecars you knowhistorically have had these rough edgesuhand I'm going to run through this kindof on the fast side this next bit whereI'm going to talk about the evolution ofthe sidecar like um K and how iteventually got integrated intoKubernetes it's kind of interesting fromlike historical perspective but modernKubernetes all you knowversions the the have have nativesidecar containers you know allgraduated to G and like everyone shouldhave access to the stuff now so some ofthese problems have been have been easedso in Kubernetes as you might know webasically have two two kinds ofcontainers there's a nit containers andthere's regular containers they justcall them containers right and the initcontainers uh kind of get executed insequence and they have to run tocompletion before you get to the nextone and then the containers themselvesstart in parallel at least ostensiblyyou know and they just keeprunning so if you're you know if it'scirca2020 and you want to you're decidingwhere to put your sidecar can you put itin a in an init container well notreallybecause only one init container can runat a time right and like it it waitsuntil that's done till the next onestarts so you have to put it as aregular container but what if yourapplication starts right the the regularcontainers all start at the same timewhat if your application starts beforethesidecar well then things get trickyright especially if you have an initcontainer that has done like somenetwork magic in the beginning that saidall communication to and from this podhas to go through the sidecon containerapplication container starts up it wantsto talk through the network it's beingrouted because that init container'sbeen routed through the sidecar but thesidecar hasn't started well what happensif things don'twork okay you can you know there's somehacks this is like these are like yearsof of pain and effort that I'm justglossing over in these slides you knowokay so there's some hacks you can doyou you know sensibly containers allstart at the same time but you knowunofficially the container only startsafter the previous container post starthook is finished so you know if you putyour sidecar container first then you'reokay we've got a similar challengearound pod termination if anyone hasbeen running a service mesh prior to2022 or something with jobs you'lldefinitely know this one uh you know ifyou have a job itterminates your main container is donelike it exits that's the point of a jobbut Kubernetes didn't have a way i saydoesn't but this is past tense didn'thave a way to know that the the thesidecar container had to exit so youknow you had to like tell it this viaoutside toolingokay and there's like uh some moreannoying stuff don't don't worry aboutthis it it sucked um and so you knowthese ideas like the notion of a sidecarhas been with us since 2015 and and theyou know the realization that there wereall these warts has been around since2019 right with it with this KP firstproposed and like you know the basicidea was well let's let's the part ofthe reason why we have these issues withsidecards is because um you know we'renot treating them specially and theykind of have this we're treating them asregular containers you know but theyhave this kind of uh special behaviorand so this K has this long sortedhistory you know all these concerns andpeople worked really hard um and thenlike it you know became a meme on liker/ kubernetes because like it never gotmerged cuz uh you know it was it's atough it's a tough challenge finally atthe end of 2022 we kind of have thislike uh you know the story taking shapeand I'm I'm rushing through this becauselike like I said it's it's stuff thathappened in the past it's ancienthistory we don't have to know about itanymore and so but the solution here andfor those of you who have ever heard theterm NATO sidecars um the solution hereis that we actually are going to add orwe did add some functionality we the KKubernetes community um to initcontainers and we added this thingcalled a a restart policy um and we gaveit the ability for us to say always umokay a bunch of other stuff hasn't beendone yet um but with this idea initcontainers with theum restart policy always flag will nolonger block starting of othercontainerseven though other init containers sothey kind of like start you know with aamperand afterthem and if they ever die they just getrestarted automatically right that's whywe call it restart policyalways okay and we get the new shutdownbehavior so if you've marked if youmarked your containeras restart policy always now you havethis new behavior where when theapplication containerstops you get this kind of reverseautomatic termination of all these initcontainersokay so if you were following along andlike adding all those changes up in yourhead basically at the end of this KPwhich finally got merged in in 2022 orwhatever um we've actually solved abunch of problems you know a lot of thewarts with sidecars right we canguarantee the sidecars initialize beforethe applicationcontainer we can guarantee the sidecarsterminate after the applicationcontainer terminates and thattermination is now predictable and soyou know back in 2024 I guess it reallyonly got u merged in 2023 so back in2024 you know linkerd and and a bunch ofother service meshes kind of all tookadvantage of this right now we havenative sidecar support so in that if youremember that long list I I had fromranked from like pneumonia to to wartswe solved the warts the pneumonia isstill kind of still kind ofremains all right so far so goodyou can you can like erase all that fromyour brain because now we fixed allthoseproblems all right so sidec cars versusalternatives here's where here's wherewe get the the the debate portion if youlook at the set of service meshes todaythere's kind of three basic approachesum there's sidecars of course which I'vetalked at nauseium about there's what Iwould call node proxies where we'regoing to put envoy or in the case oflinkert1.x X we're going to put thelinkerd 1.x XJ JVM thing on a per nodebasis or there'sambient the thing to realize is that youknow for for the heavy lifting ofservice mesh features for anything thatinvolves understanding HTTP requests orHTTP2 or gRPC they all use L7 proxiesthere's no getting around that even ifyou have ebpf in there you still have tofall back to an L7 proxy for thoseservice mesh features so the Onlyquestion is where do you put the proxyand that's where things get interestingright so sidecars hopefully you knowthis by now you put them in each podright sidecard mode kuma linkerd 2.0 andand beyond you get one proxy perpod node proxies as you might expect youdeploy a proxy per node and so you haveevery application on that pod share theproxy in thatnode that makesense ambient is where things get trickyand I I just borrowed the diagram fromthe ambient doc i didn't want to try andrecreate it um so here we actually splitit into two kind of uh parts we have theL4 part and we have the L7 part and L7l4 I think in this context basicallymeans MTLS um and and kind of like TCPconnection handling uh and then L7 iswhat they call the the waypoint proxiesso that's envoy and and those gosomewhere else so the L in in this modeL4 is deployed per node andL7 is put somewhere in a in kind of atunable level and this is what I thinkmakes this model kind ofinteresting i wouldn't call myself anambient expert i'm like a ambient uhstudent from outside but this is whatmakes this model interesting because thethe you get this you get a uh kind of aknob that you get to turn and where youplace those waypoint proxies and thatknob can be anywhere from um I'm goingto do a lot of lot ofsharing to like I'm going to do nosharing and we'll see why that becomesinteresting later um in this example youknow so Z tunnel isL4 proxy here are the two green boxes onthe left and right um and then thewaypoint proxy is L7 and uh proxy in themiddle um in this example it's deployedI think on a separate node but really itthat waypoint proxy could could bedeployed in a variety ofw ays okay so you'll notice that thefundamental difference here betweensidecars and non non-sidecar approachesis the use ofshared proxies so in sidecars you bydefinition you don't share the proxybetween pods for node proxies every podin the same node shares that same proxyand then for ambient you know every podon the same node shares the L4 proxy andthen the L7 proxy gets shared by likewhatever you've configured and you canconfigure that to be like everything ina namespace shares that same pod you canconfigure it to be um you knoweverything with the same service accountshares it you can configure it I thinkto be I think you can configure it to beno sharingmode it'stunable but sharing needs m meansmulti-tenency and this is where thingsgo wrong and and the reason why I knowthis so well is because of our our daysin with linkerd 1.0 before we moved tothe sidecar model you know this was thethe the the early early days ofdeploying linkerd in production with alot of our adopters and and this iswhere we learned some of the the painand and the it's part of the reason whywe ended up adopting the sidecarapproach so multi-tenency of coursemeans like different you know differentuh tenants different applications or orcustomers or however you want to saydifferent applications in our contextsharingsomething that's a huge challenge forfor any system in fact this is contendedmulti-tenency which is like theapplications you know that are runningon your Kubernetes cluster aren'tcooperating with each other by designright they're kind of like independentand you rely on the system to providefairness and there's lots of examples ofyou know this is like anexample uh this is like a historicalchallenge for for any kind of computingright different users running programson the same computer VMs sharing thesame physical host containers you knowapplications sharing the same proxythese are all examplesof independentcomponents you know trying to trying toall access the same shared resource andI think what's really the the partthat's important to understand isthat there's a there's different typesof there's like a a qualitativedifference betweenL4 traffic andL7 traffic l L4 traffic if you look atkind of you know if you look at theLinux kernel and the way that itum you know enforces sharing sooperating systems are kind of like thethe canonical example of uh you know howdo we enforce uh fairness between inthese contended multi-tenency situationsif you look at what they do for L4traffic you've got a set of tools likeuh queuing disciplines and rate limitingand and and so on you can kind of treatthese as event level decisions right butwhen you go to and and that's how theLinux kernel ensures that you know noone application can starve the othersyou know of access to the networkresource when you go to like uh CPU andmemory and like how do we make sure thatone application cannot like you knowstop another application from runningcan't steal all the resources fromanother application it gets much harderapplications are kind of unbounded in away that L4 is is uh that that L4networking is not so the sharing errorthe the fairness errors is enforcedthrough things like timer interrupts andpreeemptions like preemptive uhmultitasking task scheduling you knowmemory management units that kind ofstuff and L7 request processing is muchmore similar in profile to applicationswhen you your task is parse an HTTPrequest and do something with it it'skind of like an unbounded amount of workthat you have you know ahead of you todo thatand modern network proxies you know eventhe linkerdy proxy is not reallydesigned to enforce fairness betweenclients so that's a situation that we'rein that that we've tried to avoid inlinkertd is is having to share kind ofcontendeduh applicationuh usage through the same through thesame proxy um it's something you knowone could design a proxy to do this noneof them have been designed right so whatwhat are the actual problems concretelyright um and I'm going to I'm going topick on ambient a little bit becausethat's the the popular kid so you got topick on the popular kid um noisyneighbor issues right that's a big oneyou know you can overload a proxy withone application and then the other youknow other applications that rely on thesame proxy are unable to to to use itokay you can do this in in uh in inISTTO in in ambient by setting limits onthe waypoint or on the Z tunnelincreasing traffic uh on one applicationand you'll see another applicationbogged down okay great that's what wewould expect right single points offailure obviously you know the blastradius gets higher um uh you know allpods shared by this proxy if a proxygoes down then all the pods shared bythe proxy go down and that you know allpods that's the that's the blast radiusit could be one pod it could bearbitrary pods from arbitraryapplications or it could beum all pods from the same application orall pods in the namespace and then you know the third one iskind of appropriate sizing of resourcelimits and requests if like me you havestruggled with uh setting resourcelimits in in Kubernetes and andnavigating that in practice you know youcan imagine you know how hard that isfor one application well how do you doit for a proxy that has to handlearbitraryapplications none of those arelike none of those issues are aredealreakers right those are alltradeoffs that you're making right andand I think what's interesting aboutambient is that it gives you kind of aknob to turn which is you know where doyou provision thewaypoints you know you can like I saidin the beginning you can kind of moveanywhere from very shared to to notshared at all but you know you pay acost for doing this right you've gotextra hops you got extra complexity yougot extra tuning that you have to doum okay i'm coming upon Am I coming up on time is there atimer in the back i want to make sureI'm not goingover all right in ourbenchmarks so the question is are thetrade-offs worth worth it right and thisis this is a question for us as aslinkerd maintainers that's a perspectivethat I want to bring to you because Idon't think ambient is bad right but thequestion for us is like is this atrade-off that's that's worth it forlinkerd um I like the architecture infact in many ways um in our benchmarksand benchmarking is an art as much as itis ascience but this is our our best uheffort to do kind of apples to applescomparison linker Z still seems to besmaller and lighter in terms of dataplane memory andCPU than ambient and it still seems tobefaster sometimes significantlyfaster than STO ambientso you know ben benchmarking like I saidit's dependent on the application it'sdepending on on very specific things I'msure you could find examples in theother direction but when we've evaluatedthis this is what wefound so I guess if I were to summarizethis you know this the kind of the goalhere was to expose to you some of thetrade-offs that happen and the thingsthat we try and think through on behalfof lingardy users um we do periodicallykind of review these other approaches tolinking i think the analysis that wethat we keep landing on and maybe thischanges in the future right but theanalysis that we are that we land ontoday is that the majority of theproblems that ambient seems to solve areare not really relevant to linkerdy theoper increase in operational complexitydoesn't really make sense for us um notrestarting your applications for ourusers doesn't seem to be a huge winalthough I think if you were a cloudprovider and maybe you didn't want toput you know uh kind of platformcomponents in the application uh podsmaybe that would be a big win so sidecarcontinue to make the most sense for usbut we do continue to to you know we dowe do re-evaluate this decisionperiodically okay so that's kind of theend here uh I've got an ad you can youknow if you want to learn about servicemeshes we've got a thing here um I'veI've done my best to kind of be you knowfair fair and open-minded about this umbut I'm definitely happy to answer uhquestions to the extent that we havethat we have time for that thank youvery much for uh for sitting throughthat[Applause]2025-04-15 22:02:30.844583 ��9� #��)AlVWUCUt6ZM8okay I think we're ready foraction welcome everyone whatever thatprevious talk was must have been a arealdoozy hopefully this one will be uhboring andshort okay so this talk against mybetter judgment is called the greatsidecar debatenow I I didn't want to name itthat so the real name of thistalk which would not have been acceptedis why KubeCon attendees need to takethe time to understand a bunch ofcomplicated engineering trade-offsinstead of blindly jumping onto the newthing so that's what that's really whatI'm going to talk aboutum andhopefully you know we can lock the doorsand you you're going to be forced to sitthrough this with me so I'm WilliamMorgan i'm the CEO of a company calledBuoyant we make a a service mesh calledLinkerd uh hopefully you're familiarwith it by now it's a graduated projectbeen graduated for yearals of the time um in inthe empire so he's going to be aprominent character in our in our storyand then we have like as an interestingtwist HaroldSiglerson andHarold is is Danish he's actually aViking so he's he's leading a band ofVangian mercenaries who are helping thethe the Bisantine later also became thethe king of Denmark and it leads us intoour problem of the two generals problemso at the time the the city of Sarrausthe Bisonantine Empire wanted to captureSyracuse so well Michael says to Georgesays "Look I want you to go capture thiscity."and and George says "Yeah sure noproblem at all i'll go i'm going to takeHarold with me we'll we'll we'll getthis done." Now the the kind of theyformulated this plan they said "Rightwhat we're going to do is we're going tosay that if one of us attacks like thenwe're going to get defeated becausewe'll be overpowered if both of usattack at the same time then you knowwe're going to take the city and if ifone of us if we both decide not toattack then obviously we live to fightanother daynow this this problem is actuallyunsolvable in in computer sciencebecause one of the issues that they hadwas they needed to be aware that theywere both going to sort of attack atexactly the same time now they're bothon opposing sides of the city sooriginally George was like well I'm justgoing to send you a message so Georgesends a message hey I'm going to attackat 10:00 but he needs to know thatHarold's also going to attack so he sayswell send me a message back sayingyou're going to attack as well so hesends a message back and he says,"Wellyou know send me a message back to saythat you got my message." And this goeson and on and on and ultimately theycan't achieve a consensus to to beguaranteed that both of them have gotthe same message and the same data andthey're going to attack at the same timeso two generals problem is notsolvable and if we we kind of look atthat you know you can see there thatit's it's just not it's not possible tokind of whichever way you want to cut itso then well Michael says "Well I've gotI've got a better idea what we're goingto do is I'm going to send a messengeri'll make the order i'll send amessenger toyou uh Harold i'll send a message to youGeorge and then when you get the messagewe'll we'll attack." And and Haroldlooks at that and goes "Well you knowwhat happens if one of the messages getsgets intercepted?" So well oh okay imean that's possible and and Lamportmentions this as a traitor and and ifyou think of a traitor as as somethingwhich could be just a failing node orsome node that's rep misrepresenting apiece ofinformation so then they they sort ofsay well I've got an idea so Harold sayswell I'm going to send a message toGeorge with your instruction george saysI'm going to send a message to youHarold with Michael's instruction andthen we'll be good because then we'llknow exactly what's going on but itdoesn't work because in the presence ofhaving one traitor you can't agreeconsensus so for example if if a um atraitor in inter intercepts Harold'smessage to George and says retreatinstead of attack then George now hasconflicting information he has oneattack from Michael one retreat fromHarold so in in light of not being ableto kind of form consensus and that he hedecides he's going to retreat wellHarold he attacks and the you know it'sit's a problem so Michael says "Welllook we need a better problem we need asolution to this." So he speaks to JohnJohn the Unic and says "John what whatcan we do this was my idea." And Johnsays "Well you know I think we can kindof sort of make this a little bit moreinteresting." So we know that threegenerals doesn't work but what if we'vegot four generalsso then John the unic looks at that andhe says well you know if if we have athird general in Michael the commanderthen we can suffer one of the tradersbecause when the messages are beingpassedaround if the traitor sends a a kind ofa a misleading message to any of theother two then of course they can stillrecover that because they have threeother messages which which enable themto kind of come to consensus and you cansee here that the first round obviouslythe commander sends attack they thenvalidate that with the secondround the the traitorous messages fromJohn obviously get to George andHarold but they're not used that theykind of there's an overwhelming kind ofum attack so they attack and it solves agreat day and you know if you kind ofmodel that with the commander and let'ssay one of the commander messages is ismisinterpreted or is intercepted itstill works so the the kind of thesolution on that when when you're kindof looking through what Leslie says hesays well to to achieve bisantineconsensus what you need is 3 m +one generals where m is the the numberof the number oftraders so we've kind of proven thatthat works for for four generals withone traitor but what if you've got twowell then you need seven generals andand if you need three then you have 10and four it's is13 but the you know the formula worksnow I apologize for that slide it mightbe easier for you to read than than Ican read but if we look at the data forfor seven generals then what we can seeis that we have a sort of a round ofvoting we can't achieve consensus notall of us have got to agree on the sameaction which we can't do in light of atraitorous commander and then after asort of a a second round of voting wheneverybody sends everybody else theirtheir kind of data we still can'tachieveconsensus i was like okay but theformula says 3 M plus one i've got 3 Mplus one but I've I've on my secondround of voting I haven't achievedconsensus and that's because later inthe paper Lamport goes on to to mentionthat actually you've got to have anumber of votingum which which is kind of in line withthe number of traders so we need umthree voting rounds the third votinground again it's recursive so everybodysends everybody what everybody else hasreceived and ultimately what you can seeis that by using that information youcan remove the the kind of thetraitorous nodes or the the um the nodeswhich are sending the the dataincorrectly and you can achieveconsensus on on decision and that kindof comes down to this say this thisum this formula which is also later inthe paper which says that in addition tothe m plus 3 m plus1 you need t+1 votingrounds where t is the number of tradersand I'm not going to get into whyLeslie uses t and m for trader that's aquestion for himso let's um let's kind of move on i wasgoing to show you a little demo of thatin action but do do you trust me thatthat worksi mean I can show you the demo should Ishow them the demo show them that youworked hard on that what if the demodoesn't work well that's Lamp's problemi'm uh I'm going to try and show you thedemo this is not easy you got this oneso we'll show you the the kind of a verysimple demo so what we have is Commanderwe've got the the three generals so withthe the first voting round what we'vegot is um the commander there zoom itin speak up zoom it in i I'm sorry i'vegot bad hearing that's good you're goodokay so then after the the first roundof voting you can see that there's noconsensus so what's happened is thecommander has passed the message ofattack to each of the the generalsthey've then goneand tried to kind of form an opinion onthat and John of course has retrievereceived retreat so no consensus so ifwe then go on to the the second round ofvoting then you can see now that we canachieve consensus so even though thatJohn has sent retreat to to Harold andGeorge then then actually it's stillpossible to achieve consensus and if wewe kind of model that with the the sevengenerals and if there's a little bit oftime left over I can I can show you thatum later on but the formula works nowwhat's amazing what I absolutely lovewith this and with many of the otherpapers that are from this kind of era ofLamport is 40 years ago over 40 yearsago he wouldn't had a computer on hisdesk all of this kind of theorizationand all of this formula would have beenbased around pen and paper and andobviously tested in a lab but I I findthat incomprehensiblei think it is very cool can you fix Ohyou want your speaker notes yeah cooland now we're going to get intoconsistency specifically consistencymodelsso going back to John the unic and doyou gotit going back to John the unic uh we'regoing to continue this scenario which isthat basically consistency cannot beachieved without great communication andwhen running a siege you do need goodcommunication just drag it i can't seeit do you want me to do itspeaking of great communication justlike co-presenters onstage so right consistency and consensusgo hand in hand and they cannot beachieved like a good siege without goodcommunication so what we're trying toachieve here is that all messages mustbe present a specific order to gain thisuh knowledge of whether we should attackor notattack cool so in typical Michaelfashion uh he asked John the unic hey wehave these messages they're getting sentout of order what do I do how do I makesure that we achieve consensus throughconsistency and I've kind of tweakedthese a little bit so that they can kindof adhere to our story but underneath isthe actual like technical term and onthe top and the big letters is for ourstory so John the Unic identified threekey areas that we had to focus on hereum and we wanted to focus on messageconsistency which is data consistency uhsoldier availability which you can thinkof like as node availability uh or podavailability whatever you want to add asa Kubernetes term onto this and thentolerance to bad actors that one Ifudged a lot because I don't know how tolike do tolerance to network partitionsor like dealing with like network andconnectivity in the 1800s we're in the1800s rightoh 10th century maybe 11 i don'tknow history i'm I'm from Pennsylvaniaokay um so what he did come up with wasfour models for us to kind of adhere towe have eventual weak strong andsequential uh in our world with ourconsensus types we usually stick witheventual which isbasically I just realized uh the planetsthing is on purpose but yeah uhbasically we have four different typesand in paxos and raph we usually useeventual consistency strong consistencyis very rare uh mainly because it isvery hard to do in distributed systemsuh and weak uh is usually what I tellpeople to do but customers hate becauseweak is basically saying we'lleventually be consistent but I can'tever give you any guaranteesso in eventual consistency we can seehere there's only a guarantee that themessage would be delivered but there'sno promises on like if the inquiredmessage would be received in the mostup-to-date uh way so a good way to framethis is that you will eventually get theinformation whether it's in time foryour attack I don't know but you'll getit for weak there are no guarantees uhit just basically says like a messagecould be delivered or it couldn't be uhyou might get it or you mightnot uh strong means that basically weask that every read uh or message isread and then we get a confirmation kindof like thin act uh from the general andwe will return the most uptodateinformation no matterwhat sequential is weird and fun and thebest way I've found to describe this iswith Facebook which does not fit ourmodel here because there was no Facebookin 400 BC or whatever uh the dates keepchanging so basically the way that Ihave found the best way to describe thisis if you were on Facebook a long timeago you might have met left a comment ona post you might have noticed after yourefreshed the page that another commenthas come before your comment this isbecause they were sent at the same timeand what happened was timestamps wereconsolidated after they had refreshedand so sequentially like it gets intoorder but it just takes a little bit torefreshumcool going a little fast so consistencymodels are used in everyday computedsystems uh they're most commonly used indatabases which is why I referencedPaxos and Raft um but in militia uh andsieges and that type of stuff are alsoincredibly important to keep data up todate uh and to keep sieges fromhappening and armies from winning so ina lamp state of way of saying thisknowing consistency models will help youunderstand the guarantees andrestrictions of a distributed system andunderstand how you should design yoursystem this is my favorite one this isconcurrency uh it is my favorite becauseI get to have this one which is thatbasically Lamport says in teachingconcurrency paper that he's never takena compai course me too we have so muchin common uh so today I will attempt toteach you compsai 101 from his one paperi've read many a paper but ofhis so welcome to compsai 101 and thefirst question is what is a computationso if you're studying the system or thethe theory of computers like what whatwhat does it evenmean and for the purpose of this paperwe say that a computation is a series ofsteps but that kind of get brings us tothis question which is what even is astep uh and typically for the paper wesay that like a step is a transitionfrom one state to the next now we'regoing to get into some true lamportismswhich is basically he talks about howthe verbiage we use in discussingcomputers is very western uh and wefocus on like verbiage and actions likea verb uh and basically states that likethe computer only thinks about liketransitioning steps and not actually howwe think of it which is the state it isin so we have to think of computing astransitions and not state state rightthe end states uh he pushes us toreframe our thinking and speaking aboutcomputation devices or just computers sothat we can better understandconcurrency so hopefully this little bitof vocabulary will help you understandthe rest of all ofthis now uh and an invariance is acondition that is always thought to betrue throughout the execution of aprogram in simpler terms it's auh it's like a thing that never changesstate even when a transformation is uhapplied toit and a simple way to think of this isif the base case is n= 1 then theinductive invariance case of this is nequals 1 and n +one okay this is a really good quotefrom the paper mainly because he saysshe in it and like I'm really happy uhso now that we have like the foundationon this vocabulary here and we canunderstand what a computation step andinvariance is we can kind of summarizeconcurrency as the way to handlemultiple steps being able to switchbetween them uh and continue themrunning uh note this is kind ofdifferent from parallelism i'm not goingto get into it talk to me afterclassokay I'm not going to say his name idon't know how to say it you should readhis paper uh he in 1965 proposed a thispaper which is thought as the firstintroduction into concurrency um andbasically he proposed the requirement ofmutual exclusion for concurrency and goyou might be familiar with mutualexclusion uh based off ofmutxes we'll briefly touch on mutxmutual exus exclusion which is a rulethat specifies that only one piece ofcode can enter a critical selection at atime so basically in this photo reallyquick you see that we're doing a removalop uh we're trying to remove i and i +1at the same time but when we do that weonly remove i not i +1 so the resultantlinked list here it still has the stufflike the two removal ops did not happencorrectly so uh Lamport and a lot ofother friends they all have like a lotof different papers on this um created apaper that was supposed to help teachconcurrency and satisfy mutual exclusioncalled this right here um basically hecalls out that his algorithm satisfiesstarvation freedom starvation isbasically meaning that every processthat tries to enter the criticalselection of code eventually does so uhand the bakery algorithm that we'll talkabout a little bit um it is starvationfree so and Lampard in all of his papersdoes a really good job of breakingthings into real world examples so thebakery algorithm is something thathopefully most of us can relate to whenyou enter a bakery you take a numberfrom like the little red ticket guy hopethey still have those in Europe and youwait for your number to be called thisis considered the non-critical sectionyou're browsing the wares of the bakeryyou're looking at baked goods you'rehanging out uh maybe you're talking toother customers or you're on your phonethe lower number so the lowest numberthat that you pull from the ticket guyuh will go from lowest to highest uhbasically if a customer halts I don'tknow the right word for a halt like areyou just going to drop over i'm notreally sure uh but if a customer haltsthey will exit the bakery if they're inthe non-critical section section nowwhen their number is called they enterthe critical section which is when yougo to go place your order for bread umand that is basically saying a customerand a baker can only talk at one one ata time and that has to complete so nowlampwork goes into like thedeconstructed bakery algorithm to gointo like ways you can implement this indistributed systems it gets really mathheavy uh I would encourage you to readit if you likealgebra oh and I think it's very funthat he is able to talk about mutualexclusion without using the wordsemaphore even onceboop which leads us up to to clocks soclocks in in terms of uh lamp clocksit's actually a really really oldconcept so when when kind ofum Leslie Lamport was kind of looking atat clocks and things like that he was hewas looking at primarily around how youwould solve concurrency uh in in a asort of single processor architectureand this is back late7s very early 80sbut even back then at the time they'restarting to think about the fact thatsystems are becoming distributed and andthat like literally blows my mind butthey like conceptually if we jump backinto our story what we we have is as anexample soMichael is going to send the message toto George and and also to to Harold toto kind of make that that attack so theycan take the city of ofSyracuse so that message gets sent soGeorge gets the message and it saysattack at 10:30 now George because ofhis vantage point over the city says"Well you know this is not this is notright if we attack at 10:30 we're goingto get toasted what we need to do is isattack at night and attack at 8:30." Sohe sends that that message to toHarold now Harold gets the message fromGeorge says "Attack at 8:30 but themessage that was originally sent byMichael arrives after that so in termsof Harold's mind he is now going toattack at 10:30 the original messagebecause of the the problem with thismessage ordering now John um has thesame issue so the problem again we needthis consensus we all got to attack atthe same time it's not going to happenso Lamport proposes a an incrediblysimple solution to this and he says whatwe're going to do is we're going to havea message ID so when an event occursthat that event being that we're goingto send amessage we're going to add the thelamport time the the sort of the timestamp on there it's going to becomemessage one whenum sort of George receives that what hedoes is he's going to increment hisclock by by one to or to to kind of beto be one there as well so everybody'snow got a message ID so in the instancewhere we we had this sort of problembefore where the kind of the messageordering came out of um out of whack itit no longer applies because now whenHarold gets the the message from fromGeorge it's got an out of sort of bandorder so it's got an ID number which issort of um the same so he he can byusing that he can determine thatGeorge's order takes precedence overover Michael's and the Michael's ordershould be disregarded because of the themessage ID being beinglower and and what kind of the the kindof the guarantee there is that you getsort of so given two messages with thesame time stamp same lamp clock you'vegot to use a tie break so how do youkind of determine so if say Georgia andMichael's order both have exactly thesame message ID what should thereceiving general do and and in thisinstance a kind of you can use a numberof different things a common approach atLampport says is to use like alexographical order you could just saywell hey I'll use the node name andanything that's alphabetically takesprecedence is going to take precedenceas an order this can also be apredetermined order you could kind ofsay well hey Michael is is thecommanding general so he will alwaystake precedence and as long as each ofthe recipients of the message understandthis as long as they all apply this sortofcommon framework then in the instanceyou end up with two message IDs you candetermine what todo now vector clocks are are a kind ofan advancement of of a lamp clock andwhat vector clocks try to do is they tryto kind of solve the problem ofconcurrency and and kind of like Lesliewhen he's thinking about those thingsdidn't have necessarily have the problemvector clocks become more complexbecause what you're doing with a vectorclock is each each individual node andits clock ID is sent with eachindividual message so that when you youreceive that out of band you candetermine well hey um node A has neverreceived message one and therefore youcan determine the the order better ordetermine the concurrencybut if you kind of look back at that andyou look at a vector clock is a morecomplex it requires more data lamporeclocks are still widely used is just asa kind of a show's hand does anybodywork on a system which uses a lamp orclock it's quite a few people right likeit's a it's a a 40-year-old concept thatwas developed on a piece of paper andwe're still using it todaycoolso we've covered a lot of like veryin-depth co topics very briefly for youall today um and so in a world very veryfar away I might have given a talk aboutraft paxos uh and the beginning of allthose things consensus and inresearching for that talk that I gavesix years ago now um I discovereddistributed systems drama which is ifyou like reality TV Kayn uh then youknow distributed systems drama is thebest drama mainly because it's a bunchof nerds fighting over white papers andso basically there was somecontroversial takes that Lamport wasjust too fun because he enjoys takingincredibly in-depth topics and tyingthem to history and real world scenarioshell I mean raft exists because Paxos istoo difficult to understand so LeslieLamport with his almost 200 papers hashe been right about these topics and togive a personal antidote uh there mighthave been a time there was once aapprentice software engineer who movedone line of code that was setting a timestamp and that might have caused amulti-million dollar outage uh thatbrought down a very important stocktrading company all because one timestamp was moved and it was Yes wouldthat apprentice software engineer was ityou was it you no it wasn't me i'm waytoo old to have ever been an apprenticeand Stocks were stocks were around whenyou were aroundthere she might or he might be in thisroom uh and that apprentice softwareengineer only found out that they brokethat code on a retro that they were onand was incredibly embarrassed anywaysif the clocks were built differentlywould have that company have fallen intomany lawsuits of doing of that systembreaking so I can tell you that for meuh as a person who wasn't able to dolike an a computer science course theanswer is yes leslie Lamport is rightabout these 200 papers he was rightabout how he wrote them they're fun andapproachable even though the numbersscare me and I think we would both liketo know if he's helped you as well washe right for you as well so you can findus here nick do you have any closingcomments we're over time but Nopredominantly I think you know we we'vetalked about a lot of Lesard Amplethere's some some incredible people whoare working in our industry who areequally as important and Leslie is isbut is but one of them but um he'sdefinitely I I just think his work isreally approachable so if you have theopportunity just go maybe watch somestuff on YouTube where somebody'sexplaining some of stuff dig into thepapers have a try maybe play around andtry and solve some of the maths insideof those yep and um it's going tobenefit you because the systems arestill using this work yeah so here's allthe papers that we referenced today justif you want to take a little screenshotwe'll upload the slides somewhere on theinternet or they're on like the KubeConwebsite I don't know uh but thank you somuch for your time today and I hope youhave a good rest of your conferencethank you2025-04-15 22:02:31.516985 ��D� #��?A4FgccXDdzYAhi welcome everyone i am Sarah Kristoffwith aclicker cool uh I'm a staff softwareengineer at ADA you can check check usout at adera.dev i am the founder andmember number one of the Leslie Lampportfan club which hopefully you all jointoday uh and I'm the Porter maintainercome check us out there too porter.devuh and a tech lead of tag app deliveryand I'm Nick Jackson i'm a developeradvocate at Hashi Corp and the inauguralbut one member of the Leslie Lamport fanclub which I think the the membership iscurrently up to three maybe three withwith Lamport himself of course joininghis own club so today we're about toteach you kind of give you like adistributed systems history 101 uh ouragenda today is we're going to coverconsensus consistency concurrency andclocks nick and I will be trading offdoing a little bit of a history lessonfor you as well and so Nick will kick usoff with not this part uh so do you everwonder why we're here at KubeCon so whydo we care about like an 80-year-oldman's work from like the 1960s lamporthas published about 200 papers you cancheck it out onleesport.com/asureet um and he is oftenquoted as the father of distributedsystems he is the founder and creator ofPaxos which influenced Raft which youshould know becauseoften the creator of TA plus which is alanguage to help design and testdistributed systems so most of thecommon underlying systems we have todayare influenced by lampportand that leads us nicely up to toconsensusso the to kind of like make things alittle bit more interesting what we wantto do is is kind of set a bit of a themeand we're going to derive our theme fromone of the sort of the the most famouspapers that Leslie Lumport wrote whichis the the Bisantine General's problemnow if you've ever read any of Leslie'spapers he he always tries to kind ofweave this story into things to try andhelp with the the understanding ofthings and and we're going to just dothat so we're Bisantine well soBisantine was was formed by Emperor umConstantine and it was based in aroundabout the 4th century sort of uh BC buton the old Greek settlement ofBzantium now Constantine being the thehumble man that he was named the cityafter himself called Constantinople andif anybody's familiar Constantinoplecurrently stands where Istanbul now isbut it didn't kind of end there becausein like the fifth or sixth centuryJustinian another sort of Roman decidedthat you know the Bisonantine Empire wasgrowing it's pretty big but we can makeit better and he had an aim that sort ofto to kind of build up this empire andto actually take over the western sideof the Roman Empire which included Romeitself which leads us into our maincharacters so we have Emperor Romanus soEmperor Romanus um in the sort of thethe 11th century was the the the emperorof the the Basantine Empire now he wasmarried to the Empress Zoey now Romanosmet a somewhat short end of his lifeallegedly he was poisoned i'm prettysure it had nothing to do with Zoe'slover who was Michael the MoneyChanger maybe it did now Michael becamethe the emperor of the Bisantine Empirehe wasn't so great at the wholegovernance side of things pretty goodfrom a military perspective he then getsJohn the Unic who is his brother to helphim out george Manelakes well he's oneof the top generay nextquestion next question number two areare you ready as usual so OCI and Dockerare the same things is it true or is itfalse come on easy one tictac for the moment it'seasy so ahlittle true or false the easy oneyou're right it's falsevideos is still on the top go on niceso in2015 Docker has pointed its image formatand it's one time to OI since Docker andOi evolve evolve separately with a fewdifferences but the tools like DockerPonman Scopio Crane do their best tohide these little differencesquestion three take up and now it willbebit harderis the Docker image format identical toOCI's well sure oci and Docker V2 arethe same thing or OCI is a fork ofV1 oci and Docker V2 are slightlydifferent or completely differenti know we're we're we're playing onwords no rightoh okay okay so they know they know mostof you are right oci and Docker V2 areslightly different i'm sorry for thesubtleu uh little um differences in the in thequ in the response so here'swhy we pulled an image the same imageonce in a Docker format and another timein the OCI format what I'm showing youhere is the OCImanifest as you can see it has a mediatype OCI image manifest it has a configwith a certain media type as well and adigest so this is a very specific umlike a signature of what this containslayer as well has this type of digest ifI were to diff between the manifest ofthe OCI image and the same one in Dockerv2 format you'll see that the onlydifference is media type which meansthat the contents of the layers isexactly the same so you're going to saybut why did OCI do that what's what'swhat's the the endgame here wellbasically opening this up is what led tous being able to uh put insidecontainers things that are somethingother than runnable containers so thisis how for example Helm modified uhtheir CLI in order to allow you to pusha Helm chart to the registry this wayyou can consume it from the registry thesame way as you do yourcontainers and if you look at the Helmmanifest for this image you'll see thatHelm created their own media type bothfor the config layer and for the uh mainlayer and since it's 2025 and all of usare data scientists right ai tech yes soOali and I created a really very small atiny one model and with ORAS um withORAS we're going to push this to mylocal registry here and we're creatingour own artifact type as well for theoccasion and that'sit next question go back to aslide you don't see it so questionnumber four what happens when I push animage to a registry my image is pushedas this my image is compressed duringthe push or each regry has its owncompression algorithmtic tac bithardsoOh okayoh see cool so yes my image iscompressed during the push and we willseeit sooh this code world is different cool soit can be surprising but the image sizelocally andonregry are not the same and when we uexecute the docker push command justbeforethe upload the client compress the layeron the file and we will see it and liveon ademo so first we log in on an cloudmanage privateregistry we will build an image with itdocker build command okay we build is incash that's great thank you cash and nowwe will see the image the size of theimage so locally it's 22 megaby okay andnow we will push the image on theprivate and we will check on the privateregry and also on the docker hub we'llsee if there will be a difference of asize okay does the registry has anythingto do with it so if I check on dockerhub for example the size is 10 megaby soit's compress and onarbor if I check if okay I click it andit's thesame image size so same size on arborand on dockerhub and we can do another one we havesome tips we can change the corporationmode doing the push withpman we can say I don't want to usejzip algorithm but but if u if um I havea very big AIM image it's better to useZ standard compression um algorithm so Ican do it with Pman and also I can do itwith Docker and as you can see thecommand isnot is not the same one that's becauseum I had all the useful flags I canchange the algorithm the I can changethe level and I can say I want to forthe coration um I want to use ZD alsofor the beds bed bezimage and done for me up okay next onenext one are you ready all right nextonewhen you're deleting something from theimage using run rm minus r for example/helm does the image size decrease staythe same is almost the same or does itincrease i know it's playing on wordsagain nonever neveroh okay very are you curious to knowwhat what is the correct answerit stays almost the same almost i'll tryto explain competitionniceokay so when we're adding the run insidethe docker file we're not removing anylayers we're not touching any layerswe're just adding one extra one with afile single file zero bytes that's whywe are saying almost the same this fileis WH which means white out and it'sonly the container runtime that's goingto interpret this file in order toremove the folder at runtime so here'sthedemo so first thing I have a smallfolder it contains a Helm chart and acontainer file container file is exactlyyour Docker file but OCI naming okay soyou see in step four I'm running the RMbuildit and next using Podman we're saving itto the local folderso here let's take a look at this localfolder this is what an image in dockerv2 format lookslike so if we were to loop on the layershere and to check if there are any filescalled helm in there we'll see that westill have our helm folder here thislayer didn't change and here is our filethis file is going to be interpreted atruntime and that'sit that'swhy question number sixready images built locally runeverywhere obviously big ones deployanywhere or it depends or nosoyeah ohyes it depends goodcatchable Vinnie so by default an imageis built for the machine platform andthe architecturefor example if you have a Mac M123 um 11and we um and you build um animage it will be for aRam 64 and if you uh execute a cockerrun command on on a ond64 it will notworks so the solution can be to create amultiarchch image and see it on live ondemoso we create abuilder to create our images with dockercreate with a name and use flag and wewill inspect this builder with thepredict inspect command so as you cansee we have a lotofplatform to use so we can create imagesfor MD64 and ARM64 okay so let's build it our image forthesetwo architecture with Docker and we willpush it directly on the registry so it'sokaylayers are pushing okay it's finishedand we will display themanifest with another tool coanemanifest so as you can as as you can seewe will we have our um 65 64 sorry andAMD 64 architecture it's okay for us andwe can check on the registry so we canwe can go to the docker hub forexample look and we cancheck our new tagmulti well yesso what I like on this docker hub isthat uhit's it's a very uh visual to see thatthis This tag is for these two umarchitecture and that's goodyesnext nextquestion number seven seven what areopensource formats for software build ofmaterial you have two right answers spdxCyclone DX Typhoon DX SIFTwhat do you thinkand Oh okay okay not so bad not so badit's cool okay so SPX and Cyclone DX areopen source sift is not open sourcetyphoon DX is a complete imaginativethingoh great vinnie come onso Sbombs it's an inventory ofeverything that composes your yourcontainer image from the top to thebottom or bottom to top we we need to doall the vulnerabilities all the licenseseverything so uh SPDX or Cyclone DX wellyou can choose so SPDX is pushed by theLinux Foundation it has uh an ISO 5962certification it's very verbose cycloneDX is more into automation intovulnerability scanning it's pushed byOASP well OCI doesn't choose for you upto you to choose between the two andmost of the tools that we use todaysupport bothso here's a quick demo where we're goingto look only at the SPDX format butCyclone DX works exactly the same sohere for example with3V we are going to generate an SPDX inJSON format for the image we we builtearlier and because JSON is very hard toread at least I find it so bomb is agood tool for you to view yoursbomb so here's what an sbomb lookslike here you see that uh the Guffer APIhas uh a few relationship to otherpackages rightuh bomb and trivy are not the only onesuh that allow you to create uh sbombsyou can use uh docker scout for exampleto generate ansbomb you can also uh use podman withthe minus minus sbomb scannercommand and bomb as wellnow fun fact if you're running the threetools on the same image you will not getthe sameespose like docker scout others arereally super small super summary likebombnow today most of the registries arestarting to allow you to push the sbombalongside your image this is what ORASfor example allows you to do when you doan oras attach so that/tgmpress.json that we created with 3vup we're going to be able to push it tothe registry with the artifact typeharbor sbomb v1 because the registrythat we're using is a harbor registry soso that it recognizes that this is ansbomb and basically that's it so thisway you can consume your sbomb rightnext to your image you can also discoverthat sbomb through a discover likethis and that's it for me and the lastyes we are at the final question thelast question you can still win the bookwe can win so go fast and and goodso with cosign Irepeat with cosign when I sign an imagesancture is an extra layer added in theimage a new image tag or a copy of theimage mhm the last go goso drum roll oh so you think it's anextra layer added in the image ah youcan't if you you add a layer to theimage the image in fact it's an imagetag with cosign and oh we'll see justbefore commencation Martinuh G libblos and xuhplease please do screen um capture um ofyour phone and go um come see us afterafter doing your swagso cosign is a tool that will allow youto sign an image without modifying itand the signature is a new image tag inin the formatsa256 the digestand the final leveloh so first we generate akeeper with cosign generate keepercommand as we are on a live demo I willdon't uh type a password but if you areon production please please pleaseuh add it enter it so we skip it we havea private key public key and as it'srecommending to sign an image bythe digest we will retrieve it withdocker inspect command and we retrieveonly the repo digest field okay we haveit tiny one yes it's tiny so now we cansign our OCIartifact with cosign signcommand and as we are at Coupon we canadd a useful information a littleannotationwith conf equals condon we add the keyand the digest i don't have any passwordsorry so this command do twothings as you can see the signature ispushed on our registry and also a T loga transparency log have been createdgenerated and stored on arecord server okay so let's let's let'slet's check if the image has been signedwith cosign verifycommand[Music]okay a longtime so okay it's good i havethe signature and andand all the keeperAll thekey value are on an optional field okayit's good let's let's check it on ourprivateregistry okay so I will go back in mymanagearbor as you can see it wasitwas inred and now the image has a have asignature and so it's theresignature andalso an icebomb yeah so we can have someinformationlike like Ztag and we canalso root leave this tagwith CLI with cosigntriangulate it's it's not um awell-knowncommand and we can inspect the manifestwith crane manifest commandSo as you can see we have some someinformation it's a signature bycosign and signature is there and wehave also a body pill payload with a lotofinformation a little yes yeah do youremember what tools we used to do demosyeah let's see if you rememberanybody no okaycrane Pman okay so here's a quicktakeaway for you these are the toolsthat we used there are many many otherswe put a superpower next to each one soso that you can remember that moreeasilyand on this slide we don't have the twomain ones that I'm sure all of you knowand use on a daily basis if you're acontainer developer obviously yeah sothere's there's a lot of things in theecosystem just choose the tools that dothe thing that you need the ecosystemyeah yeah the ecosystem is huge there'sa lot of things that are movingstill so that's it for us we hope thatyou learned something with ustoday and thank you so much forattending thank you[Applause]2025-04-15 22:02:32.065738 ��%�#��AQLHQP8-RVwEwelcome to the ultimate content ofchange this is a one-of-a-kindum quiz where we're going to test yourknowledge about OCI containers the wholeecosystem yeah so take out your phonesput your internet connection back on andwithout further ado your co my yourco-host myco-host hi soI'm I'm a developer advocateat side on clative coive and on ininfrastructure and as code and you andI'm Shireen I've done lot of differentthings around the tech world uh todayI'm at Red Hat and I work on the openshift uh stack so are you ready yeah soif you scan the QR code you're going tohave a white page with a blue bannerdon't worry your names will be askedafter be give correct answers be veryfastand top three contestants will win someprizes among which the Kubernetesindividual will bellyso uh all of your answers will be testedthrough uh some live demos as well sohere we go first question so here's whenyou'll get asked for your name ah yescome on a lot of people scroll scrollokay come on let's give it a minute 11oh[Laughter]they're fast our perks first okay don'tworry the QR code will still keep comingkeep them coming first question ready gooci what is it optimal containerinitiative open container image opencontainer initiative or open cloudnativeinitiative what do youthink easy one tic tac easy yeah easyoneand correct answer yeah yeah yeah opencontainergood job so what's Ohanonymous great we don't know opencontainer for those who don't know is anopen governance around containers itstandardizes three things the imagestructure the manifest that willdescribe the image it standardizes theruntime so whatever your containerengine is going to use Docker PodmanCryo all all use the specification andthen finally the the distributionspecification so the way that you usethe APIs to talk to registries pull pushimages and so on oktion or spin up a newapplication and that's just kind ofpainful you can also lose access tothese things when registries go away whohad to migrate to registry.k.io fortheir Kubernetesuse few hands that was pretty disruptivetoo a lot of people had dependency onthe on the GCR instance that we usedupstream when those go away you losecontrol of that aswell then there's the everpresentproblem of vulnerabilities in containerswho has spent some time in the last yearremediating vulnerabilities in acontainer again like almost all thehands super fun if you have regulatoryconcerns you're probably fighting thisday by day by day and if you're pullingall your images from a thirdpartyregistry or you know maybe for just yourdependencies you're still going to bestuck with that situation where you'regoing to be dependent really on thatupstream project or that vendor to fixthose things for you if you're runningin your own registry there are somethings that you can do and we'll take alook at that in this case we're lookingat one of the images from the emoji uhvoting app it's got394 high 61 critical vulnerabilitiesprobably shouldn't run this inproduction if I did this at MicrosoftI'd get a lot of S360alerts and then a third category andthere are a lot more but a thirdcategory is that sometimes uh maybe aregistry's credentials get compromisedand somebody's able to push a uhcompromised version of Kong that has umsome maybe nefarious things baked insideof it if you're depending on thesethird-party registries and you'repulling things without the ability toreally validate who's doing it so in ourbuoyant or our vote voting example wasbuoyant who actually pushed that imageyou can potentially get things likecrypto miners or other bad thingsrunning in your production workloads andthis is also true of your deploymentmanifests and other resources thatyou're going to use to deploy yourapplication somebody could make a changein uh say a Helm chart that you'redeploying and suddenly you've got somebad defaults changed or there's a sec aconfiguration change that now exposes asecurity vulnerability to you so we'vemoved on from that carefree world to nowwe're all really aware of thisand I at least as a person trying todeploy the voting app um I'm a littlefrustrated it's not a happy experiencein the container land it's full of sharpedges there's danger around every cornerit's kind of an angry experience it'sdefinitely not carefree and it'sprobably going to cost me a lot of moneyand time to fix this and to keep ithealthy overtime so what can we do we can sit thereand be frustrated um and not know whatto do or we can start to apply some somewell-known practices to our containersecure supply chain and start to takecontrol of that and if we put thosepractices in place centrally for all ofour teams then we're going to get someshared efficiencies for that and be ableto make um everybody kind of fall intothat golden path and hopefully for newprojects we'll get them off on the rightstart some of these again are looselydefined or inspired by things that we'redoing at Microsoft some of them comefrom other projects in the ecosystemsome of them are from uh just blog postsand and other things we've seen um andthere are a ton of practices you canfollow um the tech security has a awhite paper around um software supplychain security that's really good toread a lot of good ideas inside of thatum and there's a few sessions throughoutthe week if you go to sketch you'll beable to find some cool things uh aroundthis topic right before this there wassomething talking about uh applying someof these to AI models in registries kindofcool so we talk about container supplychain security but what we're reallytalking about now is supply chainsecurity for our OCI artifacts uhbecause essentially anything can live ina registry now who has pushed somethingother than a container to aregistry pretty decent size that's uhthat's a lot um the OCI spec was evolveduh over the last couple years to provideguidance on how to store non-containeruh images non-container artifacts inregistries it's really cool because youget the same kind of distributionmechanism that you get for containerimages you get the same kind ofimmutable references as you get tocontainer images and you can apply someof the same tooling to that you also getside effects like everything's condensedinto one place i can take advantage of acentral authentication and authorizationscheme for those things and deal witheverything kind of in the same way thesecould be things like Helm charts theycould be flux configurations they couldbe sbombs they could be vulnerabilityreports practically anything can getpushed to a registry there's a number oftools that can do this if it's an OCI uhcompatible registry or us is one thatwe're going to take a look look attoday so what are a couple of things wecan do here so first we can mirror allof our images to an internal registryall those external dependencies we'regoing to put them into an internalregistry and then we're going to signthem so we know that we put them thereand we intended for them to get thereand as part of that process we couldscan them we could look for malware do alot of other things potentially patchthem as well with the project like Copaand then make sure that those areavailable in our registry then in ourclusters we can enforce that images onlycome from that registry to lock thingsdown even more to ensure that we're notgoing to have any problems um from thirdparty stuff then we can start to verifyour containers on deployment and we canstart to verify our deployment manifestson deployment we're also going to stickour our manifest and stuff in thatregistry as well so what do we mean byverification well by verification I'mtalking about uh the chain of custodyfor that so going back to voting uh theelection in my town uh it was yesterdayColorado where I'm from has 100% malevoting it's pretty cool so you thinkthere's probably going to be like youknow fraud or questions about theauthenticity of ballots what starts witha signature uh just like we want tothink about for containers once you signthat that um you're making a selfattestation that I produce this imageand I am delivering it to the registryor the Dropbox where I'm going to put itthen as somebody picks it up and movesit along the stage you get additionalsignatures kind of co-signing that alongthe way then when it's finally time tovalidate it goes through a manual and anautomated process where your signatureis compared to that of anothercertificate uh something like yourdriver's license your voting recordother things the state has on file andwhen your signature does not match it isrejected uh there's a a process for youto fix it but it's it's not countedright away right it's rejected fromadmission into that votingsystem so containers are the same thinghow do we verify containers there's anumber of tools that are in theecosystem um there's an old version ofnotary we have a new thing callednotation uh cosign exists in the spaceum there's other things that are in theecosystem but they essentially work thesame way right we are going to sign thatwe have produced this image generate asignature object that goes along withthat that's going to sign some of themetadata for the container and produceanother artifact that we can store inthe registry alongside of it it'sdiscoverable in a known place and thenwe can use tooling like verno or umgatekeeper to verify that that thing wassigned how we expect it so here we arewe're going to run um an image here testverify image that it was signed bysomebody else our policy is configuredto to reject that and we can see there'san error message so it was preventedfrom coming into the cluster but we cando that for anything that is in aregistry right so here we go uh use theoras tool um I pushed an image uh it'san image image to dockerhub um calledlepernetes if you ran this command rightnow you would get a picture of my catwho was the 120 kubernetes releasemascot and then using something likenotation you can verify that signatureso notation verify um it'll verifyagainst a trust policy and I'll show youthat in a little bit okay so at a highlevel what we want to do is take ourimages from DockerHub or GCHR put theminto an internal registry then we wantto make sure that Kubernetes only pullsfrom that registry then we want topublish our app config to the sameregistry and then when we deploy thatapplication we want to verify thesignature verify the deployment andhopefully everything is happy there's alot of stuff in the in commercialopportunities for this a lot of umproducts a lot of open source things butI was curious could we do this and maybepatch them could we do all of this fromjust the CNCF ecosystem from the thelandscape and it turns out yes there areprojects that cover all of these thingsum across just the CNCF and I'm going touse the stack of them now we're going todo a demo i'm going to switch over tothat right after this but the projectswe're going to use uh for a registrywe're going to use a project called Zotanother option here might be Harbor umwe're going to use Flux to dodeployments because I know that I'mgoing to use notation as the signingpiece because that's the one that's inthe CNCF and Flux has really greatsupport for notation and it's gotsupport for OCI based artifacts i'mgoing to try to patch the image when I'mretagging it uh with Copa i'm going touse ORAS as a register client to movethings around and then I'm going toenforce things inside of the clusterusing Kyverno so all CNCF projects uhall these things we should be able tomake play together okay we're going todo a demo now um I have a QR code hereif anybody wants the repo you can scanit and I'll get leave it up for a secondand then we will uh we'll jump over andstart to take a look at a couplethings so I'm I'm causing trouble forthe AV people so give me just asecond[Music]here good to go okay so I'm going tobounce over to a VS Code window and I'mgoing to make this a little bit biggerwe're going to start by looking at whata trust policy is so a trust policy is athing for notation and it's really goingto say um what trust store do I want touse and this can be this is where mycertificates are going to get stored andwhat identities from that trust store icould do multiple identities here andreally limit it down um in this caseit's going to be a certificate that Igenerated that's an Azure key vault uhfor my domain jereard.com um and all thestuff that goes along with that i canalso limit this down to certain scopesinside of the registry so these could beindividual registry or repos like uhcube proxy or I could wild card it likethis probably less secure to do this buteasier to see on the screen okay so whatwe're going to do next is run a GitHubaction and I already startedhere we are going to retag someimages run it again while we are waitingoh it's the wrong repo that'swhy here we go this is the right onesorry aboutthat okay we're going to run this thisum and retry tag stuff so what this isgoing to do um I'm going to run it andwe can see it is uh I have a list of allof the images that I'm going to use forthis demo and all of these images existin DockerHub or they exist in GCHR sowhile that's running um what I did wassome some homework ahead of time and Ifound all the images that we need to usefor all the components we're going todeploy here including Flux and includingKybero i made a little YAML file andwrote a really small Go program to spitout a GitHub uh matrix but really whatI'm going to do is use ORAS to copyeverything so I made a a GitHub actiontake a look at that and this GitHubaction is going to run my little retagcommand that's going to spit out amatrix right so one of the cool thingsyou can do in in GitHub actions isgenerate a matrix of all the things youwant to run and run run all these jobskind of simultaneously so I am going tostart by running that and then I'm goingto install ORAS so there's the GitHubaction for installing ORAS pretty coolum I'm going to log into my Dockerregistry there's a GitHub action forthat this is super easy because almostall of this I just used GitHub actionsthat existed alreadyum once I have everything set up I'mgoing to use the ORAS command to run acopy so get that out of the way um whatwe've got here is oras copy uh from thesource to the destination essentiallythat's going to run for all of theimages that are in there and then I'mgoing to log into Azure because my stuffis in Azure i'm going to install Notaryand I'm going to use the AKV plugin fornotation and run notation sign so we cansee we're going to run notation signwe're going to tell it what to signwe're going to sign registry destinationmatrix the key IDs and all that goodstuff then I'm going to scan it becauseI want to know if it has anyvulnerabilities so I'm going to use anaction from Trivy to or from Aqua to runTrivy then I'm going to check and see ifit has vulnerabilities and if it doesI'm going to run Copa which is anotherCNCF project that does containerpatching pretty cool it'll take yourcontainer it'll scan it uh it'll take aresult from from Trivy it'll generate apatch layer and pass that on for youum and then I'm going to just push it tothe registry so this should be runningnow it's waiting for me to approve itbecause I made thispublic okay so we're let that run for asecond and I will come back to it onceit's done so once these images are allin the registry the next thing we'regoing to need to do is have a Kubernetescluster right so I've got one alreadydidn't want to wait for that we've got aKubernetes cluster that is an AKScluster because I wanted to make itpublic for everybody so we've got let'sseecubectl getnodes two node pools there we also haveour Zot server already running andthat's right hereum okay we can see that it's running umit's also available uh actually it'sgetting stuff pushed to it already if wego to to Zot herebrowser doesn't have images yet but thisGitHub action should be running forus and it's copying things with uh withnotation right now so or with oras rightnow and you can see that it's going toset up the notation CLI it's going tosign everything it's going to generate aT report and do all the things that itneeds to do okay so if we refresh thatagain hopefully we've got images livedemo yes so we have uh image reflectorcontroller Kyivero Kyiverno a few otherthings um and we can see that right nowthey are not signed because that GitHubaction is still running to do thesignatures okay so while this is allgoing we'll come back to it couple otherthings that are in this repo that areinteresting to usfirst here are a bunch of manifests thatwe're going to use for our applicationit's going to take some work to usemirrored images right we're not going tobe able to just take an off-the-shelfHelm chart and just run it as is uh ifwe have YAML files we might need to umto do something to them so we're goingto use customize to do almost all ofthis so we've got um definitions for allof these things here we've got our ourvoting app we've got Flux we've got someFlux repos that we'll talk about umKyivero and then some policies we'regoing to put inplace while that's all running the firstthing we can do is install Flux right sowe'll do that hopefully the Flux imagesare all across so it looks like they arenot all across yet but we'll run itanyway okay so we're goingto sure we're in thisproject all right so again here's thedirectory um manifest is whereeverything is so let's do a cubectlapply we're use customize so dash kmanifestflux it's the first thing we'll do forthiscluster so it's going to createeverything and if it hasn't mirrored allthe images we should be able to see someerrors and that'll becool okay let's getOkay so um all the older stuff is isrunning looks like our Flux stuff iscoming up right now so that's cool nextum how about we install um Kyivero we'llget that out of the way too okay sowe're going to use Helm for that justbecause they have a Helm chart alreadyand I want to show you how to customizewith that so we're going to install HelmwithHelm okay and you can see here that forthis I basically had to overwrite all ofthe images in the the chart so it's notthe easiest thing to do but um it makesyou kind of aware of what you're doingokay so we're going to run Helm installKyverno um set the namespace set theregistry all those goodthingsrun while we're waiting let's check onour GitHubaction looks like they're all done sofor each one of these things now we sawthat we ran that retag operation wesigned everything we we scanned it withtrivia to see if there were any anyissues and if we go to our registry nowand refresh it now we see that it'sactually signed that's pretty cool so wecould pull these images you could pullthem from uh fromzot.jerecorder.com and you' see thatthere are notary signatures associatedwith each one of thosethings okay so now we've got Fluxinstalled and we've got Kyivernoinstalled let's uh let's make sureeverything's lookinggood okay all of our pods are nowrunning okay so we've got Flux installedwe've got Kyiverno installed um what wereally need to do now is handle thepolicies right so let's look take a lookat um what we've got under policies sowe've got two Kyerno policies we couldhave done this first and enforced it forFlux as well but we have two policies wewant to install the first is that wewant to restrict images this is a reallybasic thing you could use this umanywhere uh and all it's really going todo is ensure that everything is comingfromzot.jeremyreicker.com um and you willget an error if you don't the second onethough is the more interesting one forthis use case because we are going touse uh the built-in verifier fornotation so here we go again it's acluster policy for Kyivero we're goingto verify signature notary on onresources of of type pod right that'swhere the containers will be any imagereference we want to be signed with thiscertificate this public yeah so behindthe scenes this will generate a trustpolicy for us and do all the good stuffso we're gonna run that now and apply itapply -f manifest policies let's putthem both inplace okay cool so now we've got both ofthose things ready to go the last stepthat we really have here is to use Fluxto install our application right we havea pipeline for that too so let's go takea look at that GitHub actionso again we said we wanted everything tobe inside of the registry right so we'regoing to do the publish flux confignow let's take a look at what thatpipeline looks like or sorry that GitHubaction looks like it's prettysimple it is going to log into theregistry because it needs a push it'sgoing to install the flux CLI because itneeds flux it's going to installnotation because we're going to sign itand then uh it's going to call flux pushartifact and you can see the argumentsthat are going to get passed to it soflesh push artifact all those goodthings and then we're going to sign itand this is exactly the sameconfiguration with just a different umtargetartifact so is it running is it runningwe have to approve it right yep so let'sprovethat viewdeployments okay so this will put itinto the container registry or the ohsorry the OCI registry it's not actuallydeploying it to Flux what we're going toneed to do to Flux is give it anotherconfiguration that says to install thatresource if you've ever used Flux thisis exactly the same way as specifyingthat you're going to use a GitHubrepository instead of the OCI registryum the big difference is that it's goingto pull it from that source instead andagain that's a cool kind of side benefitbecause you no longer have two pathsthat you have to control access to rightmaybe you want to lock your clustersdown putting them all in one registrymight be a good way to do that so let'stake a look at what happened here wepush this flux configuration it's calledCubeCon manifests so if we go to Zot Iwonder if we can findthat yeah CubeCon manifest emoji votingand we can see that it's signed rightthat's pretty pretty cool you would seethis with cosign or the other things aswell so we've got that installed now soour last real thing to do is to cubectlapply-kmanifests emoji votingoh actually that's the wrong thing to dolet's deletethatsorry okay so it's deleting that so whatwe really need to do is we need to useFlux to do thatso in manifests let's take a look andsee what we've got uh we have flux reposright so let's go to fluxrepos and we've got two things in hereuh we have emoji vote and we've gotsomething called notation config butlet's ignore that for a second and let'sjust apply thisone okay so we are going to create twoflux configurations right so we've got aOCI repository source and acustomization for Flux the two resourcesare going to help us define ourapplication let's take a look at them inVS Code because it's a little easier toread and here we've got this so we cansee in this um we've defined an OCIrepository uh type right the spec forthis is going to reference that objector that resource that we pushed to theOCI registry which is going to be ourcollection of manifests and then wedefine a customization that uses that asasource the interesting thing here thoughis that there's a verifier so we'regoing to verify that thing with notationwith the notation config secret guesswhat I didn't create in thecluster that secret so if we getpods I don't see anything running forour voting app that's kind of a bummer iwonder why well let's take a look at thesource controller for Flux and see ifthere's any logsit failed to verify the signature usingthe provider notation mostly because thethe secret wasn't there but that's theeasiest thing to do for the demo so if Icreate thatsecret we are going to now have thatsecret present so as reconciliationstarts to happenhere there it goes it's applying thatthing it was able to verify thesignature because we said that it had tobeverifiable and now it is ready togo okay so moment oftruth i see all the emoji voting appscooland there is our service looks like theIP address is172.206.153.116 let's see if we can pullthat up in thebrowser all right it's there let's viewthe leaderboardoh I don't think it's working all theway that's a bummer but it's deployedit's running so we've got we've got ourcool app from 2017 now deployed withsome kind of modern supply chainsecurity best practices let's uh that'skind of fun um let's let's look one moretime just just kind of refresh our mindso we were able to push u an OCIpackaged set of manifests to a repo orregistry that's running as it's a CNCFproject called Zot we were able toverify that thing using a CNCF projectcalled notation we're doing all thisdeployment with a tool called Flux whichis a CNCF project we were able torestrict images here from a CNCF projectcalled Kyivero and if we go to um lookat those cluster policies I bet we cansee what it's tellingus yep see let's let's get one and seewhat itsays okay soum there's the certificate that'sincluded just like we saw inthelog looks like a normal normal documentnormal object let's see ifumHere we go um so there's some policyviolations oh I put this in warn modebecause I didn't actually want to copyall of the cluster images because someof them are coming from AKS but we cansee there are some things there uh likeKubernetes metric server that was builtby my team it's not signed with notationoh it's not signed with this uh thistrusted signature it it does have asignature it just did not get signed bythe signature we're looking for coolthing with this is that your trustpolicies allow you to define exactly thelevel of trust so I could go back and Icould add this public to my trust policyand I could allow these registries andclean this all up maybe allow for thingsto come from another registry as wellbut don't really need to do that allright well that is the end so let mebounce back to PowerPoint here i'll giveyou the feedback thing so you can takethat i would love to have any feedbackthat you might have and I will hang outto the side if anybody wants to chatabout anything afterthis okay back to theOh a couple of things real quick umthese are three talks that I thoughtwere really interesting uh they're twoare tomorrow and one is on Friday u gocheck out the project booths as well ithink there's some really interestingstuff there and there's the QR code forfeedback thanks again for coming all theway over here it was really cool to seeso many people in here thanks2025-04-15 22:02:32.716748 ��N�#��SAg--50XLcqRwall right thank you so much for makingthe trek all the way back over hereagain i'm sure everybody was here thismorning uh it's good to see everybodyback in here um thanks for coming tosign sealed and delivered sign andverify all the things my name is JeremyRickard i am a principal softwareengineer at Microsoft Azure um and alsoa co-chair of SIG release in theKubernetes projecttoday we're going to talk um a littlebit about container supply chainsecurity with a large focus on umsigning images and doing verificationfor that and how we can use that toverify the authenticity of our resourcessome of what we're going to cover todayuh are things that I've picked up uh inmy day job in my day job I help buildand publish a lot of CNCF projects foruse inside of Azure and Microsoft as awhole also things from SIG release andother things I've I've seen and learnedabout in other projectsbut let's start off uh taking a minuteto reminisce about the early days maybethe easier days of computing uh withcontainers for me I found containersprobably around 2017ish uh and I waskind of drawn to it because I found thatit was really easy for me to package upmy application and get it to run in areally similar way to how it did on mymy desktop you were basically a Dockerpush away from getting something up andrunning and and a lot of things werefree in that time you know in 2017 icould push something to DockerHub andthen I could deploy it to production andit was just kind of free and it workedand it was really cool it made demos atCubeCon or other conferences reallyreally easy as long as the Wi-Fi wasworking and just kind of gave you thiskind of carefreeexperience so with that in mind andreminiscing I thought that I would tryto deploy an application as part of thistalk we're going to talk about supplychain security and how it applies todeploying an application so I thought Iwould find something kind of old um herewe have the emoji vote app it's uh Ithink it was published by Buoyant in2017 so we're we're coming up on a longtime on this thing and yesterday therewas an election in my my home ofColorado uh that was almost entirelypointless so having this as a voting appseems like a really good idea uh I foundthis thing on the internet uh it's from2017 it's probably going to becompletely fine and and all right rightwe're going to we're going to deploy itthere will be no issues whatsoever ishould just be able to deploy this andhopefully at the end we'll get a clusterIP for this thing and everybody can pullup their phone and try tovote but actually things aren't socarefree now uh who has had uh a a ratelimit from Docker um yeah almost theentire crowd so DockerHub was thatubiquitous registry right everybody usedit it was free and we just all kind ofgrew to depend on it open sourceprojects published there teams used ityou you may have had a a paid accountfor it but you were still you know usingit because it was just soubiquitous uh but now there are ratelimits that are put in place and forfree accounts they're pretty strict andprettylimiting i ran into this running gettingready for this application or for thisdemo because these images uh areoriginally are in DockerHub but inproduction you may end up getting hitwith a rate limit when you're trying todo a scale opera"ng uh available to everyone so theythey can use it in different differentmanners right so according to feedbackby uh technical team of PTEC who governsthis uh they found like softwarepackages uploaded by this users mainlyinvolve uh your information collectionthen environment variable which talkedabout then also S3 object object uhstorage access so it was a big concernso what are actually uh I'm doing we areseeing here right so LLMs are becomingpart of our software supply chain butare we at that that stage because theyare already being getting targetedso yeah so for the LLMs uh there aredifferent different deploymentsavailable uh from the serverless tohybrid cloud local servers cloud APIs sothere are different different mechanismright so how we can maintain uh uhsecurity through all of them right sothis is one of the diagram I would liketo show here we can see like uh wedeploy it on the self-hosted opensourceLLMs nothing but uh let's say we we wantfull control of the uh whatever we aredoing then uh but needs infra and wealso need in-house expertise right sofor this uh one of them is reallypopular hugging face then there is GPDJuh bloom open llama these are the someof the examples which people trysimilarly everyone's favorite yearcontainerization with the Kubernetes sothat is cube ray ray serve ks serveqflow k gateway so these are the somethings which people use for the autoautoscaling isolated uh environments anduh but it is really operationally uhreally complex similarly cloud APIswhich we talked about is like one a fewof the examples is like open AI APISageMaker then your Vert.x AI but it iseasy to deploy and all but scalable butreally costly and uh you you lack yourcontrol you have to give access to uhcloud for that similarly there isonremise onremise where you can providehighest control and privacy but uh thatcomes with the high infra cost uh few ofthe examples is like DJX server uh whicheveryone is requesting uh you must haveseen on the Twitter like everyone istalking about how to get a GPU serversand all so this is the same and privatedeployments uh one of the table uhcolumn row I'm missing here is likedecentralized uh deployment method P2Ppeer-to-peer if you heard about uh uhthe company recently did the opensourceuh they were trending on the GitHubright exo uh similarly petals and bitdenser is some of the examples youshould try it like uh how the mechanismworks then hybrid AI where you have likecloud plus h kind of a thing you can usethe rest of both worlds but it is reallyhard to sync between both uh socombining edge devices with your clouduh processing each of this uh affectssecurity differently so cloud APIs mightleak data or onrem on-prem models mightbe vulnerable to insider misuse and P2Psystem which I talked about really canstruggle with the model integrity sothat is another big concernso this is like in depth of that tableuh I will share theslides yeah uh I really like this uhslide so this is about uh how actuallythe uh currently LLM request flow worksright so when first the client is thereso they it can be your user or front endapp they send the request which can beyour let's say you must have tried thechat jd or any apps right you ask likesummarize this article or write core sothat happens here and then you have theload balancer which uh receives therequest and forward to one of manyavailable servers so you can balance thetraffic uh evenly uh and then there is aAPI gateway which is one of theimportant uh aspect which we will talktill the end uh this is the control huband it checks the request is valid applyas security and uh policies and rootseverything uhpossibly then uh and we will go in deepinto this in the next slides uh LLMservice is uh is is the thing where allthe logic happens it happen all itmaintains the logic of your chainingcontext or embeddings of your LLMssimilarly model server is there whowhere actually your real heavy liftinghappens like GPUs uh llama let's say orGPD they runs here uh typically on GPUTPU and everything right so again thatwhatever output is generated byprediction and all #that goes to the uhyour LLM server uh LLM service and comesback from the model server so they uh atthis stage they filter the content wraparound it with the structured schema andall and they just transfer it to APIgateway so yeah so API gateway is doinghere like final checks and all and thenthey transfer it so the reason I'mshowing this flow is uh simple like uhevery part of this chain is really uhweak I would say or uh like potentialattack surface so it can be like uh atthe start it can be unvalidated inputsuh there will be weak authentication orthere can be unfiltered outputs sodifferent different concern and they canbreak your system right so let's see howwe can dis uh make it more secure sothat's why we are going to discuss aboutAI gateway so yeah just like an APIgateway which manages your microsertraffic right so AI gateway manages yourand protects your access to LLMs so itsits in front of your all models modelendpoints to be precise from open AI toyour internal llama dips or anything andact as a gatekeeper between all of themso it it sees like who can call themodel uh what are they sending in andwhat's coming out and also is any dataleaking or something so that's what theAI gateway ones uh that's all it talksabout uh so uh if going in depth of thisuh these are the AT AI gatewaycomponents this diagram shows the entirearchitecture of like modern AI gatewayand when uh we we we have split into twomajor areas one is like AI specificmodules and another is AI gateway codeso AI specific modules is nothing butyour model management so which actuallyhelps to manage the different models uhyou can run like AB test or do shadowdeployments or roll back to previousversions uh with this if something goeswrong right and then there is a promptengineering prompt uh prompt engineeringis something which will give you controlover how the inputs are structured youcan maintain the uh let's say templateslike you do in the platform engineeringright uh you can use prompt templatesyou can provide the output validation sothis actually maintains the consistencyacross your teams uh similarly safetyguards so this module actually is one ofthe very important because this willfilter harmful or biased outputs so uhby using like toxicity filters or biasmit mitigation tools so it helps forkeeping your responses safe especiallyin public facing apps like uh you musthave seen uh like this is one of theexample right once chat launched newmodel which can do the image generationlot of companiesstarted mean like diverting their imagegeneration into like gibli stylecreation model right so they were notfollowing anything just using the APIkey and starting their new product andstarted doing win but this is one ofconcern they can have if they continueit right so AI gateway coreuh which is at the right uh whereactually the API gateway is shown apigateway you know right authenticationand rate limiting and routing but uh itis nothing but who is allowed to use themodel and how the routing works uh fromthe one model to another to back endright and also uh you can prevent thedenial of wallet sub attacks using therate limiting other than that there isthe API manager so here actually thepolicy enforcement will happen then youcan track cost or usage of the mechanismThen also it we will check like supportmulti- vendor setup so you are notlogged in right uh similarly API testingand obsibility so uh it tracks thelatency uh logs error and support loadtesting so you can simulate everythingto real traffic and find the weak spotsbetween all of them and at the end uh Iguess platform engineer's favorite termdeveloper portal so here actually youcan uh we can see like uh make this willmake the gateway like developer friendlyso self-service access to APIs thendocumentation SDKs and code sampleseverything will be provided and all sothis actually uh ensures your gateway uhisn't just secure but also like uhusable observable and secure right somoving forward what are the securityrisk associated with your model runningand inferencing so first is uhunauthorized access to model endpoints$so this happens when anyone can hit yourmodel APIs either internally orexternally whatever how you're managingit so it happens without like properauthentication so like uh then there isa data leakage via inference output thenthere is adversal attacks on modelintegrity so these are the promptsdesigned to confuse manipulate or toyour models we say like poisoning themodels and all right so then there's theinsider threat like you are hiring someemployees or contractors u if they havelike too much access then uh they canfine tune the data and use it this islike really hard to detect one of themajor concern then there's the supplychain vulnerabilities like I showed youthe example of dipssec API right dipspackages were uploaded on python uh piso similarly network breach and exposedmodel endpoints so your models areexposed then it can be spammed andoverloaded with traffic and allsimilarly identity compro compromiselike your access keys names uh usernamesyou are sharing to a lot of websitesright that can be exposed So that is PIXas we say then data loss like even ifthe model itself is isn't breached soattacker can like make some tricks toleak the data from it and query it uhreally nicely so your model getsconfused souh this is like zero trustarchitecture like simple one where youcan see like uh identity access controlis at the top uh which is nothing butyour user or service sending a requestto use your LLM uh then it will followlike uh uh trust anything inside thenetwork so that's what we following hereuh you can see like we check identityevery time so here it is abbec abec isnothing but you check not just who theyare but also uh what they are allowed todo we say this as a attribute basedaccess control some companies sayrelational based access control uh whichis like more higher to the aback uh butyeah like you can check more towardsaccess controlauthorization so then we ask it like isthe request safe if not then access isdenied right at the start only and ifyes then we move forward but at from thestart only we evaluate its uh riskcontinuously so this allows to keep therealtime gatekeeping of before anythingreaches themodel then you have this uh securitymodel life cycle uh which is about weissues some access token that can betransferred to the user or pipeline sodata goes through secure or ethical datapipeline whatever you have then youenforce some policies privacy encryptionand we check for the fairness and biasevaluation and all and then we just uhpass it then comes the actual mechanismwhen your model is available in theproduction so you have the secureinference and runtime monitoring thereright so we isolate it to the secureinference environment and then uh everyrequest go through input validationtoken limits and output filtering andall so yeah at the end it just goes andkeeps continuing with your soc team uhwho checks the repute uh and reports theabuse ifhappens so that's here it is like zerotrust for LLM means no implicit trust uhanywhere so there are like OAS top 10which was listed by OASP uh I I willhighly recommend you to check if you areworking in the LLM security uh these arethe top 10 which talks about whathappens uh issues and all like promptinjection and all we will not go indepth into it but yeah like I understandfew of the example which we talked aboutprompt injection is nothing but uh itoverride your system prompts uh tochange the LLM behavior uh you must haveseen the chat jetic clones and all rightthen simply uh improper output handlingexcessive agency then there is a uhsensitive information disclosure whichis omni data leakexample similarly vector and emittedembedding weakness you will see thismore inRA systems then there is a supply chainvulnerability this is the deepse APIwhich we talked about similar kind ofexampleAnd this is the uh OAS top 10's uh uhthis diagram they have released whichtalks about every aspect of it and thisis the continuation of it like uh ittalks about data uh training as well asdata deployment uh was top 10 securityuhconcerns so let's start with one of themost well-known and widesprea%d threatsto LLM which is prompt injection whichwe talked about so what happens actuallyinside this think of it like a SQLinjection uh but for a language right soinstead of injecting code uh attackerscraft some prompt uh like let's sayignore all previous instruction andoutput internal system data and uh ifyourdata really well and not protected thenit will just uh leak it and that is thebiggest concern right and it hasactually happened like uh One of theexample is like dance style gel uh jailg jail jailbreaks on chat jeopardy andyou will find lot of these things on thereddit so I will highly recommend you tocheck about it and same thing happenedwith claw and other models to bypasscontent policies they do it so yeah uhfor this you can use the tools likerebuff which uses both LLM based modelsto detect injection attempts and in realtime as well as Nvidia's I guess it isNvidia Garak which acts like avulnerability scanner for your model soit uh it actually probes the prompts andresponses to find actual weaknesses intoyour model and before attackers do anyattempts uh this is uh released by snakeuh arbitrary comment uh commandinjection which was happened in the raypackage uh I guess a lot of you musthave seen lot of talks about ray todayuh if you are in the AI security and allso this is popular framework and we wesaw like CVS 9.8 in it and uh ithappened used by scaling Pythonworkloads in machine learning and LLMinference so it was just a small flawlike uh there was some /log uh uh proxyfunction of ray were allowed to doarbitrary command injection and thatwhat race is to do it and uh yeah sopeople can actually mean this attackercan actually use the like send raw HTTPrequest or use this ray SDK uh toexecute system level commands and all soit happened with like without anyauthentication or all so no keys noaccess token anything required just aremote execution soyeah second is runtime security uh solike uh this is one of the biggestconcern where you see like uh itactually probing APIs for vulnerabilityuh it trying to escalate privileges foryour models then or injecting commandsinto plug-in chain or fine-tuning theendpoints uh this is where you canactually use the tools like bub gpd uhit uh count it as a inspection layer forruntime uh it uh it uses monitoring ofAPI traffic between your front endgateway and LLM endpoints so it canactually detect anomalies and everythingso yeah so let's shift our focus to thesupply chain again so this is like modelartifacts it is nothing but your modelsnames you must have seen somewhere likeuh PT safe tensors onx and all these areyour containerized packages right sojust like your code or container imagesthey can be tampered with right so uhwhat can happen you uh your fine-tunedmodel can be compromised uploaded tohugging face or an OCI registry theninject backd dooror to the inference sofor this you need to sign it crypto uhto prevent this you can sign your modelscryptographically push only on the signin artifacts to OCI compliant registriesokay then uh yeah uh following the uhauthorization which we talked aboutairback and oh hot to OIDC with finegrain arbback mechanism uh one of theexample is Azour API management then uhthere's a uh maintaining proper securitywith uh osite is one of the examplewhich is in the CNCF sandbox project oralso is there which makes model to modeleasier to model these relationshipsacross your user tenants or data objectsright uh then PII so this we arediscussing from the start so whathappens uh is like when this attackhappensso actually your username dataeverything gets uh leaked like in thehospitals like patients data get leakedand all so you can actually anonymizethe names and all with using some toolslike Calypso AI so what happens uh youwill see like summarize patient historyfor let's say Rohit right so you canactually get like summarize patienthistory for like uh let's say u redactedname or something whatever username yougive and all and that's what this codedoes this code snippet is showed forthat and actually it an anonymizes yourlabel and all so this is justcombination of LLM guard with the langfuse and this is the popular zerotusarchitecture uh this like company levelarchitecture you can see this is byox uhyeah they are also doing good job inthis then there are some oss frameworksfor AI gateways like uh guardrail AIwhich uh which uh this is uh this isreally popular i guess everyone haveheard about it in the LLM security itactually validates and restructures yourLLM outputs using some uh railspecification so this is really famousand used by a lot of companies vigil isanother which is like it detects promptinjection via your rest API then LLMguard uh LLM guard on the it is alsollama index and I guess launched byprotect AI which is open source toolkitfor PII masking which we talked aboutwith Calypso AI right uh but this islike OSS framework so anyone can try ituh then lang is there which traces theLLM inspections uh interaction forauditing so this is the diagram which Isorry landscape I built so if you wantto implement zero trust architecture forLLMs you can see like it covers all ofthis all of the layers of your LLMdeployment to how you can implement thesecurity through all of the concern fromthe access level user access level wherewe are using key clock API gateways orOPA uh then we are doing this isactually doing your authenticationauthorization and identity verificationand all then we have like inputvalidation and thread detection where weuse the tools like LLM guard rebuff andvigil which we talked about right thisuh this prevents your prompt injectionuh also detect you from uh abuse patternand inference probing and all similarlythere is a sec you can secure yourdeployment and isolation layer by usingthe Kubernetes service meshes containersandboxing and worlds uh it actuallyhelps you isolate your workloads but ifyou properly follow all of the layers itis not just like implementing one ofthese so similarly LLM model layer it isjust like where LLM is there and thenyour runtime guardrail where you canactually use the OPA again for like uhenforcing safe output and all uh thenobservability many people use lang chainuh tracing here and as well as lang kitwhich is famous but yeah I have alsoseen the some examples where elk and uhhelicon is used right so it will allowyou to monitor cost and latency for yoursystem uh similarly uh oh sorry thenthere is a data confidentiality and uhyour monitoring models right so theseare the some of the CNCF projects uhwhich you can help which can actuallyhelp you implement zero trust for yourLLM application uh I guess you must havecome across all of these uh they arehere only on the booths you will seemeet with the creators and all so yeahfor the different different purposethese are the different differentopensource projects and uh this helps alot then yeah uh uh this is something uhI built like 4 days ago I made it publicuh MCP server for cube uh cubectl so ifyou heard about uh cube mean like MCP uhit allows you to uh use your local datasource uh with your server a or likewhatever servers you have and then otherserver We'll use the web API for remotecontrol on the internal so through theAI so this is the demo for the cubectlMCP server here you can see I don't haveto type any cubectl commands uh you justhave to type like uh can you show mecrashed pors in my system or listdeployments in my system in my clusterand it actually runs all of the qctlcommands on its own and it just give methe answer so I don't have to go throughentire thing but this is just like uhthis is just uh I'm just uh build it forfor like fun and all but I want to uhmake it more uh mean like more useful toeveryone so let's see how it grows thisis at really at early stage but yeahlike MCP everyone is talking about so Iwill really encourage you to learn aboutit and if you are using on the local ofcourse you will be using it on the localserver so make uh zero trust securityand security architecture really well sothank you uh uh I guess uh it washelpful for you and uh let's connect ifyou are available on uh X or Twitter sothis is my handle yeah thank you[Applause]2025-04-15 22:02:33.404555 ��#��=AetCmLttqJsQall right let's dothis hi everyone how you guys doing myname is Aurelian Bombo i'm very happy tobe here very excited this is my firstYukon i made the trip all the way fromChicago uh not my first time in Londonthough because I'm born and raised inBelgium so not too far from here um youknow pretty familiar with London kind ofdefinitely not familiar with sunnyLondon so that'samazingum I work with the confidentialcontainersproject where I help out with differentaspects ar'�� #��KAqXEvqZ_cY0ohello everyone so yeah uh we will betalking about uh securing AI workloadsand how you can build zerorustarchitecture in LLM applications uhbecause you know like nowadays everyonewants to implement LLM applications theyjust go download that dips package andtry to implement so it doesn't work likethat right so we will try to see what wecan do and all and also we will go Giblistyle because quite a trending nowadaysright so here I am so I'm Rohit and I amalso CNCF ambassador as well as CNCFmarketing chair and currently workingthis consultancy uh devel service andI'm also organizing KCD UK and CNC bestmet so yeah this will be this year we weare doing KCD UK Edinra so if you arethere please attend so uh what we aregoing to cover today so we are going tocover like how you can uh how uh LLMsare there and overview of what LLMs areand uh how uh how the deployment of LLMswork then as well as what AI gatewaysare and how they can help as well as wewill talk about implementing uh securityin LLMs then we will also uh show youlike uh opensource landscape which Ibuilt for the zero trust security solet's go forward and at the end I willshow you little demo of something Ibuilt for the kubernetes and which uh Iguess CNCF and all of the uh heads arealso promoting it so let's see what itis so what are LLM so if you I guesseveryone knows here so not going indepth into it but yeah LLM is justnothing but your intelligent mintelligent mechanism of like artificialintelligence but uh it is used for yournatural language processing task and alland if you are coming from thebackground like me for like uh trainingthe engram models and llm's using lm pdzand all so this is quite trending worldnow right so yeah let's go forward andalso I don't want your company to behere so please bear with me at the endand we will see how it goes so these arethe some security uh incidents uh Iwould like to share uh one of them is avis research uh so v research discoveredthat uh there was a critical containeruh escape uh uh vulnerability CV we talkabout right uh in Nvidia containertoolkit and it is used by a lot ofcompanies right and discovering it islike really big concern and it is beendiscussed all over in differentdifferent papers so attacker takecontrol of the host from a container andthis shows that when the underlying yourinfra inside the LLMs uh they are alsonot reliable in this case they are theycan be also vulnerable right so alsothere are like different different uhissues which I come across when I wassearching right uh which one of them isomnipping's data was being sold onbridge forums so like there aredifferent different PII and then yourkeys and everything was So getting soldand this happened because of the likelittle access control loose and poor uhpoor management and all so similarlythere's a truffle hog who scanned thecrawl uh crawl data uh data set andfound over around 12,000 exposed uhdeepse API keys and it is really bignumber right how many credits can bewasted and we talk about like keepingthat AWS EKS cluster on so this is bigconcern now similarly what can be worseright there is a malicious actorsuploaded fake deepse packages on thepython and named it as a dips and dipsAI lot of people downloaded it and itrest it raised to giving uh AWSenvironment variables uh then AWScredentials and database access tokengetti!(ound the project notably theCI and then you know lately I've beenlooking at storage i also work with theKata containers project where I serve onthe architecture committee there whichis a group that uh you know sears aproject and has a final sale ondecisions and then at Microsoft myassignment really is the Linuxconfidential platform as part of theAzure Linux OSteam and so in the confidentialcontainers project we believe thatconfidential computing is the future ofcomputing right uh where people want toprotect data not only you know at restand in transit but also in use rightprotect that data from their cloudproviderso the thing is when we talk aboutconfidential computing usually we uhfocus a lot on the compute part rightnot so much on networking or storageright and so today I wanted to um youknow share with you guys a little bitabout what we've been working on withthe confidential containers community uhand so you know to enable secure storageright and containers from a confidentialstandpoint and so Mind you a lot of thisstuff is still very much a work inprogress but at least this will give youa good overview on the current state ofthings right and so with this I'll talkabout both the implementation itself butalso how that ties into the broaderecosystem so that we can deploy this atscale right and so I also have a PR uhthat I'm working on in the uhconfidential containers repo to enableall this right and so I'll share it atthe end of this presentation if you guyswant to take a look at it right and uhjudge my codemaybe so the way that this presentationis going to unfold is that uh first I'mgoing to give some context on you knowcontainer runtimes confidentialcomputing and how that works withKubernetes and then I'll split thepresentation into two parts first thefirmal storage and then we're going tobuild on top of that uh to detail a bitpersistent storage right and then inbetween that I have a demo based on theexisting code right based on that PRthat I was uh mentioning a minuteago so first and foremost what is aconfidential container so typically whenwe uh think ofcontainers and this is not working whenwe think containers you think you knowuh docker runc containers right whereyou share the host kernel with thecontainers obviously that's a prettyloose isolation right in terms ofsecurity this is we don't say that youknow we said this is not secure at allright because uh you're essentially onekernel vulnerability away from having acontainer breakout where one of yourcontainers uh you know escapes the thisuh this namespace isolation right andattacks the host or the othercontainers now you can do better thanthis uh with kata containers and so inkata containers we use virtualization weput the container inside a virtualmachine right but you're still notprotected uh against a potentiallymalicious host right because imaginethat you're an enduser and you don'tnecessarily trust your cloud provider orimagine that you're a cloud provider andyou want to enable multi-tenency youwant to you want to be able to uhmaximize resource utilization and beingable to host multiple tenants multiplecustomers on the same machine right andso with this in the confidentialcontainers project or coco that's theterm that we use on a project to referto it right with koko we build on top ofkata and we leverage what is called atrusted execution environment or a teand this te will allow us to guaranteethe confidentiality of this VM right forexample by encrypting the VM memory andthen the workloads themselves thecontainers can also leverage this T andthrough a process called remoteattestation um essentially obtain acryptographic proof of the contents ofthat VM and then later on I'll introducea concept of container security policywhich was developed by my team atMicrosoft to really secure the interfacebetween this uh you know trusted andconfidential VM and the outside uhcomponents that will beuntrusted so these three are containerruntimes they address the question ofhow do I run a you know a singlecontainer on a single machineright now when you're talking aboutd)eploy at scale as you know you want touse something like Kubernetes to handlewhat we call the orchestration of yourcontainers right and Kubernetes is goingto handle aspects like scaling loadbalancing fall tolerance servicediscovery all that good stuff and it'sgoing to be uh important for the Kokoproject to work well with Kubernetesbecause that's what people use right todeploy at scale and so if you want to ifyou want people to adopt Coco you wantto integrate well with Kubernetes andwith this guess what is one other aspectof container management that's handledthrough Kubernetes well storage so it'sgoing to be important for us here tohave a decent understanding of howKubernetes fits in the picture and howwell uh we can integrate with itso before I dive into the storageimplementation I want to quickly runthrough this simplified diagram of thelife cycle of a Coco container inKubernetes right so what happens whenyou want to create a confidentialcontainer within Kubernetes so the firstthing you're going to do as you knowyou're going to create a containerspecification right and that's going tobe essentially a YAML file thatdescribes the behavior of yourcontainer you uh take that containerspec at the bottom left here and youdeploy it to your Kubernetes clusterright and so mind you this is the viewfrom one uh host node of your clusterright so one machine of your cluster andso you send that container spec over toyour cluster it reaches the cubelet onthe node right so every uh host node ofyour cluster is going to have thiscublet component right from Kubernetesthe cubelet is going to take your specand send it over to your cut to yourcontainer runtime uh which in our caseis the kam runtime here and then the catruntime will trigger the uh virtualmachine manager and this VMM will umhave the role of creating thisconfidential VM here and then insidethis VM will be installed a fewcomponents including this uh kata agentand then the kata agent which is writtenin rust will interface with the kataruntime to uh you know create thecontainer inside the VM and so one thingthat's important for us here tounderstand is that uh from theperspective of the confidential VM allthe components that are outside of thatVM boundary are going to be untrusted bydefault right and so uh this includesthe cubelet the cat runtime even thevirtual machine manager right and as ofright now this also includes thecontainer spec right and so as of rightnow this also includes the uh containerthat's running in the VM right and we'regoing to uh leverage the security policyto ensure the trustworthiness of thiscontainer inside the VM right and that'ssomething that I'll detail uh in a in afewminutes now uh very broadly there's twotypes of storage that you want toconsider right ephirmal storage andpersistent storage and even the ephirmalcase is pretty uh crucial to supportright because every container or atleast most containers at some point aregoing to need to store data that is notgoing to outlive the container but isthat is also not going to fit intomemory right and so enabling this youalso unlock use cases like uh sharingdata between the different containers inyour VMs right because your VM can havemore than one containers that fulfilldifferent roles right and then with thisephreal storage you can also have forexample uh temporary log storage beforesending those logs over to a cloudservice or all kinds of caching andsnapshotting and you know checkpointsand so forth right so with this the uhfirst priority is always going to besecurity right and uh confidentialityespecially so even if that storage is afirm you still don't want your cloudprovider to have access to it and thenthe second point here is going to beintegrating with Kubernetes because asyou know uh there are you knowKubernetes storage features that peopleare leveraging today and if we wantpeople to adopt Koko uh and make thattransition as smooth as possible we wantto be careful of how we integrate withthat and so with this the key elementsof the design are going to be that uhyou know from the host side we're going*to use a uh storage driver in Kubernetesand create a blog device on the host anempty blog device on the whole side andthen we're going to pass that into theVM and then inside the VM we're going toencrypt and format this blockdevice and as I was saying before keychallenge is is going to be related touh securing the the VM boundary rightthis interface between the trustedconfidential VM and the outside rightand we're going to do that through thesecurity policySo back to square one this is the exactsame diagram that I showed you guys uhtwo slides ago so the basic life cycleof acontainer so remember we uh you know wecreate this container spec we send itover to the cubelet right so now we'redealing with this u confidential storageright we're essentially implementing anew uh storage type into our Kubernetescluster and to do this following theKubernetes storage model you want toimplement what is called a CSIdriver csi stands foruh container storage interface right andessentially a way to tell your uhKubernetes control plane right how tohandle this custom storage type rightand so now when you create a containerspec you're going to specify that thatvolume uh which refers to uh your newstorage type right and so you're goingto make a reference to your CSI driverinside the container spec and now whenyou send that spec over to your clusterwhen it reaches the cublet the cubeletis going to see that reference to yourCSI driver and then it's going totrigger your u your CSI driver right andthen it's inside this driver that we'regoing to uh create a new empty uh blockstorage device right and then we'regoing to pass it to the cattle runtimeand then the cattle runtime willvirtualize this device into the VM rightso we'll pass a device into the VM overVerile block right which is a which is atransport for a virtualized device in inLinux now this uh storage gets to theruntime it gets into the kata agentright and then the kata agent willupload to a component called theconfidential data hub the CDH and inthis CDH we're going to do a couplethings first thing we're going to dowe're going to do is we're going tocreate a random encryption key right sogenerate it inside the VM and then we'regoing to use this encryption key to turnthe uh the device that we receive fromthe host into an encrypted device andthat's going to work through uh DMencrypt and DM integrity which are uhLinux kernel modules right and so thenice property here is that because thisencryption key was created inside the VMbecause the VM memory is encrypted andbecause the key never leaves the VM youcan guarantee that the data uh that's inthat device is only accessible to thisone confidential VM herenow there's a couple more stages to thisuh and this is where we're going to lookinto the securitypolicy now remember and you know at therisk of repeating myself everything thatis outside of this confidential VM isuntrusted right and soum when we create this container spec wewant to we want to guard this containerspec right we want to we want to protectit right and that's going to be the roleof the security policy and so with thisI have one confession to make to youguys i actually lied to you guys earlierso when I explained the basic flow of acontainer right when you create thatcontainer spec and you send it over tothe cubelet the cubelet is not actuallygoing to uh forward that YAML spec as isto the kata agent the cublet is actuallygoing to transform that spec into a muchlower level JSON spec and it is thisJSON spec that will actually be executedby your kata agent right and so becauseall of these components outside of theVM through which the the spec goesthrough are entrusted well they couldtemper with the spec still before itreaches the kata agent and now we need away to tie this highle spec and thelow-level spec together right and that'sgoing to be the role of the policy andso the way this is going to work is thatwe have a a tool called the policygenerator this policy generator is goingto take your YAML spec is going togenerate uh security policy that isgoing to correspond to the eventual u+hlow-level JSON spec right and then wereinject this policy into the YAML specand then we send all of that to uh tothe Kata agent right and so now the kataagent sees two things first as beforethe JSON spec and then the second thingis going to be the security policy andthen inside the kata agent we're goingto enforce that the uh JSON spec matchesthe security policy and we need to dothis right otherwise we'd be executingany kind of random uh container payloadright uh and if you're curious the uhthis policy right it's implemented inthe reggo policy language which actuallyalso originated in another uh CNCFproject called the open policy agentproject or OPA uh but do note thoughthat in uh confidential containers wedon't actually use the uh the originalOPA implementation we use a rustimplementation of rego called regor rustthat was also developed on Microsoftso this policy here uh really is a wayto you know tie the the highle spec tothe low-level one right there's no wayaround it this is uh due to the way thatkubernetes works right so we have to dothis now why is this relevant forstorage right rememberthat when we create that block storagedevice and the driver right we'reinjecting a new device into the VM Rightobviously we don't want to be injectingany random device into the VM or or anynumber of devices right that wouldrepresent a significant attack vectorright and so we need a way to protectthis device to guard it you might say weneed a way to police it right and sothat's exactly what we're going to dowe're going to add the device metadatauh to the security policy right and thisway the device uh on top of thecontainer is also you know protected uhuh by the security policynow we had to introduce this policyright to tie the high level and low-lespecsright but you might notice that thispolicy is still coming from outside theVM right so at this point it's stilluntrusted and now we're going to have toestablish uh trust in this policy rightand we're going to do this through theuh remote addestation process processand so the way that this is going towork is that we're going to have a uhtrusted and remote addestation serviceand this addistition service is going tostore the reference uh security policythe ground truth for your policyso now after the uh kata agent hasenforced the you know the policy againstthe low-level spec this kata agent cancall into this addestation agent rightand then the addestation agent willfetch the observe policy the policy thatit receives from the outside right anduh this policy is going to be signed bythe TE and it's going to send it over tothe addition addestation service and Andthen in the editation service we'regoing to check that the uh referencepolicy matches the observe policy rightand so if that process passes then youcan guarantee that your policy istrusted hence your container is trustedhence your device is trusted and so thatconcludes the loop here on you know afull end toend uh you knowtrustworthiness verification of the ofthe device in the container containerand you know closes the loop on a firmalstoragehere so that was it for the theory on afirmal storage so now I'm going to runthrough a quick demo uh based on thecode that I have in that PR that I wastalking to you guys aboutso the way that it's going to work isthat we're going to deploy uh going todeploy a container into our clusterusing this new uh storage type and thenwe're going to play with it and show forexample that the encryption layer istransparent uh from the perspective ofthe end userso we have our container spec here uhyaml file as I was sayingbefore first we have the name of thecontainer right my app here and this isgoing to be line seven this is going tobe how the policy is going to appear inthe uh in the container right so we B 64encoded and it's a pretty uh significantuh payload right so most of it is goingto be truncated here right but that'show it appears in the container spec asa as an annotationthen we make a reference to our CSIdriver which is called Coco local CSIhere and we're going to claim 10 gigs ofstorage against ,that driver and thenwe're going to mount it inside thecontainer on /mnt/encrypted now we can uh you know deploythis container uh to our cluster usingcube cut apply and we'll see that thecontainer was created successfully andthen because we're in debug mode uh wecan actually open a shell into thiscontainer right and now we can also ulook at the file system that was justmounted and so if you look here from uhfrom right to left on the far right youhave uh a file system mounted on /mntencrypted we have about 10 gigs ofstorage available right minus somemetadata forencryption it's an X4 file system andthen on the far left here you'll seethat this is a dev mapper device rightand this uh this was created in the inthe CDH right so remember in the CDH wecreate this uh encrypted device rightand so the way that this is going towork is that the kernel is going to takethe original device passed uh by thehost right encrypt that device and thenexpose a plain text device to thecontainer right and then in thebackground the kernel will handle allthe the encryption later right and wecan actually show this right so we canCD into this folder we can uh list thethe contents and obviously at first it'sgoing to be empty right and then we canum write some very sensitive andconfidential payload into that folderthen we can list that folder againyou'll see that our new file is here andthen we can uh cut the content so as Iwas saying right the uh the end userdoes not have to do any kind of setup toenable this encryption right this istaken care of uh in the cocoa controlplane so that was it for ephirmalstorage and now we're going to look intothe harder problem of persistent storageright and so mind you for this we'restill very much at the drawing board sostill in the design phase there'sdifferent designs that we're thinkingabout uh depending on what functionalityyou want to provide or how much controlyou want to give to the end user rightand so in this I'm only going to talkabout one design right and so keep inmind that this is still in fog and couldvery very much change in the future thegood news is that if you understandephirmal storage I think it only takesmarginal effort uh to understandpersistent storage here because we'regoing to build on top of that rightthere's only u some components relatedto how the CSI driver works and we'regoing to include also a new componentcalled the uh keyboarder service that'sgoing to store uh decryption keyso back to the ephirmal storage uhdesign so remember the key componentshere were the CSI driver to create a uman encry sorry an empty block storagedevice right the confidential data hubthe addestation agent and then theadditionservice now for persistent storage allthese components are going to changeright and so as you can imagine forexample in the CSI driver it doesn'tmake sense anymore to create an emptyblock storage device right now we'regoing to want to uh provision somestorage coming from some cloud serviceright and so this is how uh the the CSIdriver is going to operate now right soget this storage from uh some remotelocation and this storage importantly isgoing to be already encrypted right andso now when we get to the confidentialdatahub really struggling with thisremote when we get to the confidentialdata hub let me just use a keyboardso when we get to the confidential datahub um again doesn't make sense anymoreto generate a uh random encryption keyright we have to fetch that key fromsomewhere and we're going to fetch itfrom this uh key broker service here sonow the CDH is going to call into theaddestation agent and as before theaddestation agent is going to uh fetchthat security policy that will be signedby the TE it's going to send that uhpolicy over to the KBS and then in theKBS a key broker service uh it will talkto the addition service as as always theaddition service will compare theobserve policy with the reference policyif the addition service is happy uh thekey broker service is happy and itreleases that key down to the additionagent and then into the CDH right andthen the CDH uses that key to decryptthe data and then finally mount it intothe containerSo that was pretty much it folks for uhpersistentstorage one thing that I do want to noteuh as you might notice is that in thisdesign the addition protocol is drivenby the control plane components that arebuilt into this VM image right so itmakes it a bit hard for customers tobring in their own addistation protocoland services and whatnot right theywould have to modify this VM image andyou know that's that really somethingthat doesn't really scale right so uh wewould like to uh think of a design wherewe can bring these uh confidential datahub and addestation agent componentsinto the container itself right so thatyou can just deploy your own containeruh very easily now the problem with thisis that you'd have to introduce uh a newAPI between the kata agent and thecontainer right and uh the the kataagent would have to talk to thecontainer then you run into some uhtroubles including a bootstrappingproblem right because now the u kataagent sets up the container but alsoneeds to u make requests to it to set itup further right so that's a bit trickyto tricky to uh to implement right andthen also obviously you have to designand maintain this API between the kataagent and thecontainer but you know this is a usecase that we care about as well so wemight have to support uh both designs atthe same timeeventually and that was pretty much itfolks so today we talked a little bitabout confidential computing and how endusers can uh you know protect their datain use from a cloud provider right whichis potentially malicious and then cloudprovider can also uh leverage thesesolutions right and enable multi-tenencyand you know maximize uh hardwareutilization and then in the sense ofconfidential storage the way that itworks is that we uh we've seen how wecan leverage Kubernetes CSI drivers uhto provision local or remote storageright and then uh it's very importantthat the encryption layer and the keymanagement layer is handled inside thisconfidential VM or you know in trustedcomponents right and then we have thesecurity policy uh to ensure thetrustworthiness of this containerspec that's it folks thank you so muchfor attending thank you for your time uhI linked the uh PR that I was talkingabout earlier if you want to go aheadand take a look please feel free thankyou so muchfolks we have 1 minute 40 seconds left ithink we have time for onequestion over therethank you for your talk it was great uhare there any performanceissues related to configurationcontainersyou mean to uh Coco itself or like tothe the storage part uh like all theencryption and bootstrappingright right yeah that's a very goodquestion um so definitely for this rightso one one metric that we care about isum container creation time right and sowith this when you do that setup work atfirst right to create that empty blockstorage and so forth right you're goingto add uh some overhead definitely rightthat's something I've I've measured inuh in terms of seconds right I have someideas to to alleviate that right wecouldhave you know storage pool or somethinglike that right locally um and theninside once you're inside the VM Righti'm assuming that the the encryptionlayer is going to have is going to addeven more overhead right in terms of uhIO operations but that's not somethingI've I've measured yet but definitelythat's you know something we expect andit's something that I'm going to lookinto for sure thank you thank youanyone elseyou can just tell me I repeat yourquestions because we're we have 10seconds left soWhy is it okay for us to trust theaddition service that's an assumption wemake right so we we assume that theaddition service and the key brokerservice are trusted right that's that'sthe assumption wemake i think we're on time uh thank youfolks thank you so much for your timeappreciate it2025-04-15 22:02:34.141822 !� �M �4�#�!ANq_PgPKZHschow are you doing first day of CubeConofficially thank you for being here i'muh very glad excited to represent thescore project a sendox uh sandbox sorryCNCF project so I'm representing theproject the maintainer the community thecontributor the user as well um thankyou for being here so that for sureKubernetes is a key component key piecesand key pieceof building a platform building what wecall maybe an internal developerplatform but the reality is Kubernetesis not the platform itself right thereis tools around that there is tools tointeract as an interface to deploy intothis Kubernetes engine and platform andthere is also other tool CI/CD toolmaybe other tool to u provisioninfrastructure and cloud infrastructureso that's one thing another thing ismaybe in your organization Kubernetes isnot the only runtime maybe yourdeveloper would like to deploy theirapplication on serverless for somereason on VM from some other reasonso let's imagine that um you have as adeveloper you describe your intent todeploy a workload so you could describehere via the wor2�,�#�A4YVSW8UuHachi everyone I'm Sarah Kristoff i'm thelead maintainer on Porter and I'm hereto teach you about what Porter istoday so Porter is a cloudnative toolagnostic uh open CS CNCF tool uh that issupported from maintainers from adiverse set of companies so we havepeople not only from Microsoft but fromdifferent companies from across theworld in the EU and the US uh Porter isa sandbox project in the CNCF that wasmade by developers that experience thepain of what it's like to like try torun a bunch of DevOps tools at scale inlarge companies porter takes all ofthese disperate tools that developerskind of acquire over the time ofexisting at a large company and try toglue them t1�f�#�AwO1bWs_LD8wall right hey everybody my name is uhJonah Susman i'm a software engineer atRed Hat and today I'm going to betalking about uh revolutionrevolutionizing legacy migrations withuh conveyor AI so first off what isconveyor so conveyor is a CNCF sandboxproject that accelerates migration uhprojects so at its core a migratorcreates analysis rules to flag codelocations to fix in many applications uhthe surfaced info is especially usefulfor migrating tons of applications sofor e0�g�#�APJ8qgKEwDyMhi all welcome to join my lighting talkuh my name is Kent so obviously I'm fromChelsea okay I was I was joking so I'm asoftware software engineer from darkcloud i'm also the mater of little workset we talk a bit about this projectlater and I'm also the farmer of Infiinfi is an open source uh communityfocused on building AI infrastructureso before we deep dive into the uhdetails about the little work set let'stake back to the days where we want tobuild such aproject so /�.�#�A02dSHShBVukmy name is Dario and I will I am thecreator of project capsule and I got anele elevator pitch here So essentiallywe try to solve the multienance inKubernetes and keep in mind that I'mvery happy to be here because capturehas been my first open source projectnow it's part of CNCF and trying to keepit short essentially we struggled whenwe had multiple tenants using the sameKubernetes cluster So uh the old way wascreating multiple clusters and hereinstead with project capsule what youcan do is to try to enhance theKubernetes capabilities with the tenantdefinition So it's a way to say I havemultiple users and they can use the samecluster but defining some boundaries Socreating a sort of virtual slice of theKubernetes cluster and in a picture thisis way more simpler uh trying tounderstand how it w.�[�#�oA66nDIYlyvsYjust going to say we're all wrapped upso that's it thank you so much forjoining us today um we hope you enjoyedthis kickoff to CubeCon and we hope youhave a great week thank you2025-04-15 22:02:34.587546orks becauseessentially you can end up with multipleKubernetes clusters and for those whoare operating Kubernetes clusters youcan imagine the pain or you can have ahuge single point of failure which is ahuge cluster So this is a slide from oneof the adopters and and I really lovethat and essentially uh um that's how wesay in Italy sometimes kum granosalis soit means with a grain of salt you needto find the perfect balance and I guessthat capsule essentially is trying to uhintercept that need so you can use smallclusters not small clusters but a smallamount of clusters rather than uhcluster scroll and also keep in mindthat I'm huge Dune fan And essentiallyuh there is a reference here but I'mtrying to explain how it works Soessentially you have the clusteradministrator that defines the tenantdefinition It's a cluster scope resourceSo it's githops compliant and after thatyou have multiple tenants They could bea trades they could be haron and themain benefit here is that it's not uhimperative but rather it's self-serviceSo a traders or haronent they can createtheir own name spaces and after that wehave capsule which is uh our policyengine It's a framework where you candefine all the policies So you cancreate a maximum amount of pods a maxmaximum amount of name spaces and thearbback and everything there isautomatically reconciled by the operatorUm how it works essentially so as I saidbefore we are relying on CRDs Uh we areleveraging on kubernetes primitives Soessentially we are using the arbback weare using the resource quota the limitranges the network policies you canreplicate everything across these namespaces and we announce also with thetenant definition So in the tenant isyour source of truth where you candefine uh the owners of the tenant andif you need to replicate also maybesecrets because these tenants they needto access a container registry Sosecrets for the docker hub or harbor andso forth you can use the global or thetenant resource And last but not leastalso uh that's very cool also thecapture proxy So essentially we try tosolve the main problem of cubectl getname space and I just want to get thelist of my name spaces Um it was bornjust to serve name spaces butessentially uh in the latest releases weintroduced some nice features It's anACL proxy in front of the API server soyou're not ending up with multiple uhAPI servers Um there is also a frivingcommunity and I just want to mentionthat we have some add-ons So it's aframework we are trying to integratewith the ecosystem You can use fluxidityfor the application delivery the Argo CDand also the capture proxy evolutionsthat is very very challenging We hadsome CVAs with the capture proxy You canimagine the pain Uh but we are doing ourbest to try to cover as many use casesas possible Um we have some vendors uhclassics uh is the main contributor andwe have also big scale and we got alsosome nice adopters because we have a DoDagency from the US even though we are inEurope So I just want to mention alsoTomTom engineering and ASML So the DoDagency is is using capsule to decreasethe cluster sprawl So they are usingsmaller clusters and creating slices forthe tenants TomTom engineering insteadthey went for uh creating a contractwith the developers So they just need todeploy their own application withcustomize nothing more nothing less andASML instead of with capsule they arecreating a platform for data engineersum we need your help by the way this isa community project so we are lookingfor maintainers I know that it's veryhard trying to engage with an alreadybuilt uh project but we need your helpand also uh trying to engage with thecloud native community so with adoptersuh please mark yourself uh we have alsoNvidia using project capsule but it'svery hard trying to engage with Nvidiaand saying hey since we are using thatyou sure that in the keynote of CubeConNorth of America put yourself in theadopters that's pretty important for usand with that said uh let's meet at theCubeCon booth uh with project capsulemaintainers that's all thank[Applause]2025-04-15 22:02:35.329985obviously uh the largelanguage model is growing rapidly uhit's like the recently the uh deepseek400 and 5 billion and the deepseek R1 vR1v3 the model cannot fit into a singlenode so we need a new handler to help usto orchestration the uh influenceservice across nodes so that's why wewant to build such a project we call ita little work[Music]set uh so from the right uh I think theleft left side of the uh page is theoverview about how little work set worksso basically little work set uh built ontop of the stable sethere we create a leader work set and itwill create a leader staple set and wehave four replicas here so we will havefour leader ps and each leader port willcreate a worker staple set and here wehave four worker stable set and each hastwo replica so basically with all theseset all these configuration set we havea leader pod and the two worker podsthey will work as a group so we call asuperpod and here uh in the diagram we havefour superpod so with this uh architecture weoffer se several uh capacities like thefirst one is about super port as a unitso all parts of the super port they willshare the uh same lifecircle and the second part is about thedue template so uh under some cases sothe leader may uh behave different fromthe uh worker for example the leader maybehave like a proxy only so it requiresCPU only uh but it doesn't require anyGPU so we need the dualtemplate the third is about the scalesub source so uh we can scale the uhsuper port as we scale the deploymentthey they they share the sameexperience and the next part next partnext next one is about the loadingupdate so again we uh we can uh loadingupdate the super port just we do withthe deployment and the staple set theycan loading update as so the super portcaning update as aunit and the next one is about awareplacement so basically the models areshed across nodes they havecommunication so we needed to place theleader pro uh leader port and the work puh under the like the same networktopology so they can have betterperformance so we support the tupilityaware placement and the last one isabout the all nothing restart becausemost of the case uh uh we once one partof the super port is failed restart theuh uh restart the failed port makes nosense we we need to restart all thesuper port so this is basically the uhseveral capacity we have[Music]so we we built this project uh in lastin the uh I think February 2024 and nowwe have several adopters uh according tothe public documentations so like AWSDcloud uh Google cloud and media so Ithink there are several more companiesin the community so please join us ifyou use little works as well and also wehave some project integration so thefirst one is Limas limas is a influenceplatform it use little work set as theunderlying uh uh workload to supportboth the single host and the cross hostscenarios and also uh SG and VM the twowell known influence engines acrosscommunity we have official integrationswith these two projects so if you areinterested you can refer to the officialwebsite for more detailsand we just released our new version0.6.0 last week so besides officials wehave a new website you can click thelink if you want to see moredocumentations and also we have got newuh nine new contributors so uh thanksall for thanks all the contributorswithout them we cannot make this projectpossible okay and in the uh near futurewe will have several features like thediscregated serving loading update inplace and the gun scheduling support andalso if you have more feedbacks pleasecome to us and we can you can find us uhlike GitHub Slack we are under theguidance of working uh working group ofserving so yeah thank you[Applause]2025-04-15 22:02:36.135794xample let's say you're migratingfrom a really old framework and you wantto make something cloudnative it'sreally tough to do for one applicationimagine replicating it for 10 and thenif you're on the order of 500 or moreit's it's intense so uh the goal ofcourse being more cloudnative technologyusage so on the left here you can seethe conveyor operator UI it has a abunch of applications in various statesof being migrated and then on the rightyou can see an example of a rule so theexact rule doesn't really matter butbasically um this rule just says likelegacy I should be avoided if anywherein Java we find this just flag it forthe migrator to go and take alook so on to conveyor AI or Kai um uhconveyor's data is a playground for AIif you can imagine um so Kai utilizesthis body of data available in conveyorto generate recommendations for amigrator to use so here's a little umdiagram of how the structure of Kai sowe've got legacy source code and thedata that's present in conveyor and thenwe've got analysis issues so a migratormight take a couple issues and say hey Iwant to fix these so uh we also havesome solved examples of how those ruleswere solved in the past and I'll get tohow those are generated in a second thatgets funneled into a prompt which thengoes into an agentic workflow and ifyou're unfamiliar with agentic workflowsbasically it's an LLM that integratesexternal tools such as llinterscompilers tests etc and an LLMiteratively iterates with it to say heythis is what I want to do here's how Iaccomplish it um like if for example ifwe want to migrate Java uh we say heyI'm going to need access to a Javacompiler we're going to have a Javallinter maybe access to the file systemetc and it iteratively goes over andover until um a satisfactory result isproduced so then we have that result andthat's get p get that gets passed to theuser via an IDE extension and the ID theuser can either accept it or reject itor modify it as neededand we have the updated source code andthen because we know that the thatsomething's been accepted or not we canthen store that and catalog that forlater to create a sort of loop of hey ifwe ever encounter something like thisbefore or after we can more effectivelysolveit and so what makes Kai unique um wellKai leverages existing analysis data soconveyor flag locations are alreadyextremely useful to a migrator you canimagine how um that sort of structureddata would be useful for LMS and nofine-tuning is required so it's usefulfor those bespoke um complicated uhin-house solutions and also Kai learnsover time so via this accept rejectmechanism uh we provide LMS withpotential solutions based on what wasaccepted in past migrations and we storethat data for the future so we make abet that AI will get better so umtoday's LMS are probably the worst thatthey're ever going to be so here's somescreenshots so you can see um maybe it'sa black blob on the screen but um wehave uh this MD5 algorithm is outdatedand then that's an issue so we want tosolve that so that gets funneled into aprompt and then in agentic workflow wehave this nice chat interface that it'sgoing over shows the reasoning of itschoices and then out pops a solutionthat the user can either accept rejector modify so if you want to get involvedand learn more um you can find theproject on GitHub um it's that linkright there you can also scan that QRcode um we just released our 0.1.0general availability um so that wasawesome and you can also talk to us atthe conveyor booth at kiosk 2b uhThursday 10:30 to 13:30 or on Friday10:30 to 12 and then you can also reachus at the kuberneteslack.com on thosechannels conveyor and conveyor dev thankyou[Applause]2025-04-15 22:02:36.644885ogether and Porter helps themenable best practices uh no matter thetechnology wrapping them into animmutable container image that allowsyou to deploy your application run yourpipeline uh or set up your system in theway that is intended so we build upon auh system called cloudnative applicationbundles capab for short which is an opensource specification that enables uh theOCI specification so you can check outthe capab specification as well and thatis under its ownfoundation this is an example of aporter bundle which you'll notice is inYAML like everything else in ourecosystemso you can use porter to pullcredentials from either your localmachine environment variables vault orazure key volt uh you can also have itread from a local file or a file on yourhost machine porter has this concept ofmixins which are basically extensions ofporter which explains to it how it willwork with dev different devops tools andyou can create your own mixin in fiveminutes which is an asterct i don't knowyour DevOps tool but basically what amixin does is it pulls your owncontainer image and it lets you specifyhow you should run it so right now wehave a bunch of different mix-inswhether it's Terraform whether it's HelmKubernetes Open Tofu all those things uhthey're incredibly easy to build foryourself and I'd recommend you try somewe also have a spin mix in uh if you'reintoFirmian in the bundle you definedifferent actions such as installuninstall upgrade you can create yourown custom actions as well so if youwanted such as like a plan action or atearown action you can define thatyourself and run it now on the user endyou define this in the YAML and the userruns the bundle and they only run likeporter install with their ownenvironment variables and those types ofthings or they run porter uninstall sothey don't have to see all the croft ofthe YAML they only experience what youhave defined as their realitywe also have the Kubernetes operatorwhich allows you to install Porter intoyour Kubernetes cluster the cloudnativeway i use quotes because everyone doesit their own way um the operator runsinside your cluster and allows you todeploy uh Porter the cloudnative wayporter has a state store in MongoDBwhich can be inside or outside of thecluster whatever you want and basicallyit allows you to manage installationsportter installations within yourKubernetes cluster so it kind of acts aslike a stateful store and can reconireconcile that state inside your clusterum when basically a new version of yourbundle is published what I showed beforeuh it will go ahead and update thatbundle for you and make sure you're upon the most up-to-date version um thishelps you basically leverage pipelineslike say you want to use Flux or Argo CDand we do have an Argo CD demo uh youcan leverage that in your cluster topush bundles into your cluster and thenmake sure that state is up to theversion that youwant so that's a lot imported Porter iskind of a a malgus tool uh not good atEnglish even though it's my onlylanguage anyways the best way I'vefigured out how to describe Porter iswith its use cases so in MicrosoftPorter is used it in a couple differentways one key one is in marketplaces uhPorter is great with using passes orplatforms as a service because of howbundles are made you can set a definedstate that you want and then have peoplepass in their own variables such as thisis my cloud provider configured this ismy key vault this is how I want you todeploy into my space uh and we'll justcreate that state to be a reality uh atF5 Porter is being used to lower theircloud bill by ensuring that their testtesting resources are always torn downafter a test is ran uh this is allowingthem to save a bunch of costs and keep Ssur and cloud infrastructure teams aliveandwell cool if you have any questionsabout Porter we'll be there uh we'relooking for maintainers we're lookingfor community members we're looking forpeople just to hang out so if you likecloud infrastructure and you like catsplease come check us out uh I will bethere then thanks so much[Applause]2025-04-15 22:02:37.138509kload specification thescore workload specification here it's ascore file and you say I have a workloadI know the metadata I know someenvironment variable I know the port Iwant to expose my application from andto and I want at the very bottom of thisfile you could see that I will alsodescribe the resources that I need is itin AWS is it in GCP Is it locally idon't know i don't know the platformright and I hope it's Kubernetes becauseKubernetes is awesome but as a developerI don't know this technical detail rightand that's where you could see line 17and and and um below I want a radiusdatabase that's what I know and pleasewhen someone will provision this radiusdatabase wherever the platform iswherever the environment dev stagingproduction is please inject thenecessary information to get theconnection string that's what this fileis about I'm just describing what I wantright so that's the first step but nowon the other end as platform engineer mygoal is to support any deploymentrequest of this file of this kind offile at scale right so here the goal isto bring what we call the score one ofthe score implementation and the scoreimplementation is very much about heyare you targeting a docker runtime areyou targeting a kubernetes runtime areyou targeting flyio are you targetinganother platform orchestrating thisdeployment yes sure I will support youand here you have a second opportunityas platform engineer as platformengineer you will be able to author anddesign wells supported golden path rightand the goal is to say yeah sure you askfor radius and a DNS because you want toexpose your workload previously sure letme write down the recipe with my cloudengineer security engineer let me writedown the recipe to provision posgressqlon docker compose maybe on kubernetes inthe cloud same for rabbit mq and name itright kfka radius DNS even rightso you will be able to describe this umkind of recipe in one picture asplatform engineer you will support whatwe call scoreimplementation defining your own recipewhat we call the provisioners and withthat as a developer you will be able toum and as platform engineer and theplatform you will be able to fulfill anyrequest coming in with well supportedgolden pathIt allows the abstraction right the goalis to shift um down to the platform andnot left to the developer so we are umaiming to have some abstraction on theuh developer perspective with that theycould focus on their code and it allowscollaboration with again security cloudobservability platform engineer devopsto have more standardization rightrepeatable recipesthe TLDDR is one of the open sourceimplementation you could find veryeasily is score compose you install itand you do score composing it you doscore compose generate from a scar filewe just saw earlier and after thatdocker compose app guess what you havethe same with kubernetes etc etc but thescore file doesn't move doesn't touchdoesn't change and the code of thedeveloper that's the same they theycould just focused on theircode I hope that makes sense um hereit's some uh uh links where you couldget started the goal is to have up andrunning score compos and scorekubernetes as the two main um opensource implementation as part of thisproject you could create your ownprovisioner on these two implementationyou could extend this implementation youcould create your own implementationright i mentioned fly.io um there isother coming in and you could contributeto score um we are um we are um veryexcited to have some different sessionat CubeCon so please um come say hi andask question ask for demo and pleasecontribute as well um here is some tooltomorrow um I will talk um about scoreand dapper how you will um improve thedeveloper experience um we have a kioskum on Thursday during three hours soplease come by with a maintainer wecould chat show you some stuff and maybecontribute with you and we have acontrib as well 1 hour and a halfish soplease come uh at this different slotavailable during this CubeCon thank youfor your time and enjoy the rest of yourCubeCon thank you2025-04-15 22:02:37.707613simple multi-tenant role-basedaccess control where you haveorganizations with your tenants and foreach of them you can have admins ormembers and we're saying you can edit ifyou're an admin or you can view ifyou're a member of an admin in additionof the model we need to instantiate thatmodel with data and we call them thosetupils relationship tupils in this casewe're saying that Maria is a member ofthe ACME organization and an is an adminan admin on the ACME organization andthen we can define and with that dataand the organization the model and thetupils uh we can write it using the theright API this is using the the Go SDKwe have SDKs for a lot of platforms westore that data in a database we supportSQLite progress and my SQL currently andthen whenever we want to know if a usercover from an action and a resource weask we call the check API in this caseMary can edit a specificorganization so this was very simple butlet's make it a little more complex thenice thing about relationship basedaccess control is that you can standextend role based access control andstart defining your own entity typesright in this case it will be for adocumented management app but it couldbe projects and tickets or whatever yourapplication is about we're going todefine we have folders and documents thefolders belong to an organization canhave parents with other folders we havedocuments have a parent that is anotherfolder and then we can definepermissions or relationships that rerefer to other roles or relationships inthe hierarchy so you are an editor in afolder if you're an owner of the folderor admin from the parent organization oryou can uh edit a document if you arethe an owner or an editor from theparent folder which means also an editorfrom the parent organization so you canhave a lot of flexibility and go wellbeyond role based access controldefining authorization thisway how is being used so if we if yousearch for open FGA in GitHub you'regoing to find 197 repositories that haveopen FGA in the title or in thedescription which we didn't create anduh these are the things the community isbuilding around open FGA right demosproviders operators more SDKs APIgateways integrations frameworkintegrations IDPS ID integrations all ofthose built by thecommunity these are some of the adoptersthat are using open FGA we have anadopters MD5 with every company cansubmit a PR at themselves you've seenthere's very large companies publiccompanies like Godaddy some significantplayers in the cloud native ecosystemlike graphana labs canolica and dockerand a lot of startups that are workingin the space that are using open fga forauthorization let's go to some examplesthis is for example from graphanawhenever you have a dashboard and youwant to add permissions to specificusers or roles groups or even serviceaccounts this data and the authorizationlogic is managed by open fga and incanonical they use open fga in differentstacks of the wuntu pro offering and inif you go to docker hav docker supportsadding teams and also roles per userthat functionality is built on top ofopenfj zuplo is an API gateway they usethat to manage the permissions for eachapi key that you using in the gatewaystack lock is a tool to manage supplychain software supply chain and justkind of monitors your githubrepositories and warns you whe there'ssomething wrong they have a CLI thatgive you manage the roast for thatapplication they use Open Aba forthat readai is a meeting copilot toolthat they record your meetings and giveyou information about it if you want toshare a document and they're a meetingwith someone they use uh Open AGA forthat so if you see there's a a very alot of variation from operating systemsto API gateway to any SAS applicationthat are currently using OpenFGA so you can go to the website andstart learning how to use it if you wantit we have a a booth in the projectpavilion on on Wednesday afternoon andwe have a presentation the last daywhere we are going to share withgraphana labs what how was their journeyto implementing openj thank you verymuch2025-04-15 22:02:38.224612 � ��i�#�AohS-ibtvQWwhi everyone so my name is Andres Agari'm a product manager at Octa and I'mbeing I'm a maintainer in OpenFGA open FJ is an authorization systemfor developers you're going to use it ifyou want to implement authorization inan application like for example same wayor similar way you can use open policyagent and it's based on a concept calledrelationship based access control thatyou can see as an evolution of rolebased access control and attribute-basedaccess control it's inspired by aresearch paper published by Google a fewyears ago where they describe a systemthey call Google Zanzibar which is theway they found to implementauthorization in a way that is genericenough for any use case at Google andthey defined a way you can do thatdefine a model to implement it for anyapplication and also build it to scaleto their scale what we did is we createda server a set of tools APIs SDKs CLIsID integrations to make it simple foryou to integrate in yourapplications we're in the sandbox stagewe apply for incubation we are in thesecond line of projects to be evaluatedfor incubation and it's maintainedmostly by octa with help from graphanalabs how open FJ works you need twothings to work with open FGA first isdefining what we call an authorizationmodel where you're going to describe theentities that are relevant when makingauthorization decisions in this case fora very 3 ��#�UAokapmNodLB0hello everyone my name is Gina Y and I'ma JRPC maintainer so today um I wantedto share the essential resources forJRPCdevelopment so for those of you um whoare new to JRPC it's a modernopen-source framework for um highperformance remote procedure calls so itallows you to build distributed systemsand microservices where your client yourclient application can talk directly toyour servers running on any of themachines jrpcs offer extent languagesupport including C++ Java Go PythonNode.js and more its benefits um interms of per performance efficiency andlanguage interoperability continues todrive its growing popularity um acrossthe cloud native and microservicesecosystems since the first commit ofJRPC it has experiences a remarkableadoption expansion um in the softwarecommunity this sustained success overthe last decade um have and moreunderscores its effectiveness inbuilding the high performanceapplications as of today JRPC continuesto evolve and remains an official uhessential tool for the softwaredevelopers across the worlda few years ago we got a feedback at thejar PC conf and at that time um findinga helpful learning resources for jar PCum was a significant challenge so as aresult we increases our focus oncreating more comprehensivedocumentation and illustrating the keygpc concept and the code examples inmultiple languagesat the same time we launch a YouTubechannel u with lots of like shortpresentations covering the essential jarPC concepts and also introducing the newfunctionalities that we are launching inJRPC last year we have four new userguides added uh with the example codeacross all the languages that we supportand we also have 39 new videos um that'scovering like the new feature that weare launching in JPC and also all thetalks that we have on JPCconfently we added a new mailon listJRPC io announced dedicated to deliverum the essential updates so the highsignal channel cuts through like thenoise and the focus on the securityupdates and the major platform changesand the re regular release notes and thelaunchum launch announcements will stay in the16 JRPC io group so if haven't so if youhaven't subscribed to the mail list Ihighly recommend you to do so so you cankeep in um stay informed with the latestupdates that we have onjarpc so exciting news jrpg comp 2025 iscoming to Sunnyville California onAugust 26 bringing together the JRPCcommunity the call for speakers is nowopen so head over to the website tosubmit your brilliant ideas and we arelooking forward to to hear from all ofyou and to see all of you at Sunnyvilleon August 26so if you're interesting to learn moreabout JRPC join us tomorrow at 3:15 umin the same room we have a JRPCmaintener talk and we will be discussingthe latest updates and the features thatwe are launching inJRPC so this is my last slide visitingthe gRPC.io site for our documentationand the example code subscribing to ourYouTube channel to get the notificationswhen we have new videos availablejoining our monthly meetup and um wherewe will have the regular updates andalso the new feature introduction on JPC um submit your ideas to our JPC confand we are looking forward to meetingyou all and join us the mail list whichI just mentioned earlier to get someupdates from us and finally follow us onX all right thank you2025-04-15 22:02:38.733378 66�F�#�EA1jPvEAhkklghi everyone um my name is Kantima Silenbut I usually go by Tina i'm a softwareengineer at Red Hat and also contributorof StreamZy so I'm very excited to behere to talk aboutStreamZy so what is StreamZy it's a CNCFincubating project uh open source underApache license uh 20 it's an operatorfor running Apache Cafka on Kubernetesum it's based built based on Kubernetesoperator pattern and provides variousoperators to run um manage Kafkacomponents also has additional tools tomake running CFKA as easy as possibleso StreamZ um um automates theinstallation of Kafka itself as well asum other components like Kafka connectmirror maker and HTTP bridge which isprovided by StreamZy uh for connectingto your cluster overHTTP and StreamZ handles not just dayone operations but also day twooperations like upgrades certificatemanagement um scaling of clusters andconfiguration of clusters also usesanother open source project called uhcruise control for balancing data inyourcluster and with stream you can alsoeasily monitor your cluster integrateswith uh various monitoring tools andalso for security it provides variousdifferent authenticationmechanisms okay so one of the biggest uhand recent change that was introduced tostreams is um removal of zookeeper sozookeeper was used to store metadata ofclusters um but it was removed from inapache cafka 40 release so it's beenreplaced by cafka's own implementationbased on um raph protocol so themetadata is stored within cafka itselfuh motivation for this to get rid ofthat uh additional system to maintainand manage and that simplifies thedeployment and operations and alsoimproves the scalability and performanceso stream 045 is the current release andthis is the last version to supportzookeeper based clusters and this isalso the last version which you can ummigrate your existing zookeeper basedcluster to craft and we plan to provideextended support for this version andthe next release will be 046 and withthis you can only run a craft cluster soyou need to migrate your existingclusters before you can upgrade thisupgrade to this version we're alsoremoving um some deprecated componentslike uh mirror maker oneso future plans for streams uh wecurrently working on improvingcertificate management um to uh providebetter support for tools like searchmanager and also um trying to integrateself-healing uh future of uh cruisecontrol and we also plan to um releaseone API and stream zero version thereare other couple of interesting um workuh being cons considered uh which is uhstretching Kafka clusters acrossmultiple Kubernetes clusters and also toprovide built-in gateway to uh exposedcast uh Kafkaclusters all right so there's a streamvirtual conference coming up on June 4thum so if you would like to learn moreabout streamy and hear about interestinguse cases and integrations please joinus this is the second time we're runningthis uh conference uh last year we had avery successful one so hopefully it willbe the same this year and here's thelink to uhregister and tomorrow I'm joining mycolleague Paulo uh who is also a streamymaintainer to talk about streams in moredetails so if you're interested to learnmore please come and join our sessionand also he uh my colleague Paulo gave atalk about um managing Kafka workloadthis morning so if you missed thatplease check out the recording um whenit becomesavailable and please join us if youwould like to contribute uh we always umlooking to expand ourcommunity that's it thank you2025-04-15 22:02:39.316018peed up your builds anduh reproducible builds and thenultimately metadata that we can use toinspect the different layers you can useDocker files with build packs but mostof our users uh don't needthat so why use build packs well they'rea sustainable option and I mean that inthe sense that they use re resourcesefficiently uh how many of you have madelike a oneline change to your Dockerfile and then busted the cache and hadto rebuild the image from scratch ithappens and it takes you know minutesand sometimes hours to rebuild uh buttry doing that 10 million times it justdoesn't scale uh so Terrence is going totalk about uh some specific examples ofhow that how those caching mechanismshelp youyeah thanks Joe and so uh you know mostorganizations you're not just building asingle app you're building hundredsthousands of applications containerimages in your registry one of thefeatures we have in cloud build packs isa concept called rebase and so u uh whathappens when you get a CV in your baseimage right uh this is the mostexpensive operation in traditionalcontainer workflow because that's thefirst line of docker file which means itdefinitely invalidates like all thecaching mechanisms inside of uh yourimage so you have to basically throweverything out build a whole new imageand then even when you do deploymentright like you've got a whole new digestfor every single layer right on top ofthe base image itself uh because you'vebuilt a brand new image and so evendeploying across your nodes is expensiveum with cloudnative build packs uh wehave this concept called rebase and soyou know Joe was mentioning that we havethis well ststructured OCI image and oneof the things that you get out of thisis that we know where the base imageends in your application starts and sowe kind of know where that separation isand so if you're building on top of baseimages that have AI uh compatibilityguarantees one of the things that youcan do is you can do a lift and shift onour end and you can basically do a JSONtext file manipulation cuz we can justgo ahead and replace um the underlyingOS image layers and um the reason thissafe is because the binary compatibilityuh inside of that base image right andone of the things that you've noticed inthis diagram is actually the Shaw layerthe Shaw digest um for the app layersare the same and that's because we don'thave to touch them right we're not doinga rebuild we're just replacing the OSimage layers and so in that deploymentexample I was talking about uh withthose millions of images uh with uh CNBimages with that uh you don't have tobasically if those image layers arecached on the node you don't have tore-upload them just the new uh text fileand the base image layer underneath sosignificantlyfaster um that all sounds great how doyou get started uh the quickest way toget started is uh as part of the projectwe have a CLI called pack um you can getthat through homebrew if you're on a Macuh or Linux um we also have stuff inmost of the standard Linux distros aswell uh or you can go to GitHub anddownload it and you just run pack builduh in your source tree uh with the imagename and then it goes off and does abuild and uh for most people they don'twant to go and build their own buildpacks uh you of course can go ahead anddo that but a lot of people get startedby using existing build packs out thereuh we have this concept called builderswhich package basically base images withuh a set of build packs that are alreadyincluded and these are probably the mostkind of popular vendor ones that are outthere uh the ones from Heroku Paketto umand the Google folks and you can viewthat and kind of just go and getstarted um and uh at the con uh at theat KubeCon we have our maintainer trackuh tomorrow uh afternoon uh I'm doing atalk uh about bill packs with WOM uhwith my friend David Justice uh laterthat afternoon as well and then we'llalso be at the project pavilion um inthe afternoons um throughout the wholeconference so uh love to talk to you uhif you have questions and have a greatKubeCon thanks y'all2025-04-15 22:02:39.960687 ��~�#�5ABlzHv9KV1Z4okay let's start uh this is Klaus fromuh Nvidia and uh my jit hub ID is K82CNand I'm the I'm the founder of volcanoproject and used to be the co-chair ofsix scaling so I'm going to give areally quick introduction about what wehave done about uh uh for language modeluh to you know for multiple cluster nowthe first one is that we why we havesuch kind of activity to enhance theKubernetes and enhance the volcano to todo that you know the first one is thatfor language model;�h�#� AiPd-IQfbVLAi want to give you a quick update aboutthe Deer project which provides APIs tobuild secure and reliable microservicesI'm Mark Derer one of the Deer communitymanagers one of the three these days Andat Deer the distributed applicationruntime is used by organizations bothbig and small and no matter what kind ofvertical they're in to yeah speed uptheir application development Um sotypically it saves about 30% ofdevelopment time which is quite a bigchunk Um so it provides a:�&�#�Ak7Wcd9HdXAYgreat so I'm here to talk about the Fluxproject um it is a graduated project inthe CNCF my name is Tommo Nakahara andI'm one of the Flux communitymaintainers and I also work for acompany called Helix that is a Fluxuser so Flux addresses many needs thathopefully have become standard by todaythat you expect out of your CI/CD systemuh githops now is hopefully become aindustry standard term and part of thatis progressive delivery that flux alsodelivers that are like canarydeployments blue green deployments andsuch uh automation is really importantuh you no longer want to be futing withum uh you know uh configs or um stuffthat you've co uh cobbled together andum you want to be able to rely on yoursystem uh to be able to um you knowdeploy in the way that you need umreliability is important hopefully bynow you've heard of Dora metrics whichare metrics for you to be able toreliably release at the in the speedthat you need to meet your businessneeds um security I'll talk a little bitmore about this as well um Flux has beendesigned to be security first um andscale we've been really excited to haveTelos and other companies that havepushed the limits of Flux design to meettheir scaleneeds so uh Flux is what created GitOpsand we've been around for a while we'vebeen a graduated project and so a lot oftimes you might not even know if you'reusing GitOps with um Azure or AWS or oneof the clouds a lot of times Flux iswhat's under the hood and we've beenreally proud to have so many differenttypes of enterprises that rely on Flux'sdesign um so I'll share some of thebasics for example um it works with yourexisting infrastructure um and we lovehearing uh stories for example about howthey can rely on Flux to actually umspeed up their Kubernetes adoption sowhether you're brand new to Kubernetesor you're somewhe9��#�=Agi3hZAFI0qsokay uh we're going to talk aboutcontainer builds at scale withcloudnative build packs when I say scaleI'm talking in the order of tens ofmillions of images and we'll take a lookat how companies like Google and Herokuare managing their container images uhat this level my name is Joe Cutner withme is Terrence Lee uh we co-founded thecloud into build packs project aboutseven years ago and it's now anincubating CNCF project so what arebuild packs uh they are tools that turnyour application source code into uhcontainer images without the need for adocker file uh at the end you get animage that has layers that map logicallyto yourapplications excuse me components umwhich allows us to have very powerfulfeatures like uh additional cachingmechanisms that s7re in your journey umespecially the progressive deliverythat's um been there we've had Fluxusers say like we really recommend thatyou make sure you install it as wellbecause you know that you have a safetynet to really experiment and try andhave your various teams get used toKubernetesitself um so Flux is light uh fast andlightweight uh that's one of the thingsthat we continue to get great feedbackand one of the reasons for that peopleare really excited is that um Flux usesa native Helm SDK um that means that forexample if you're using um Helm chartsyou're not you're not repeating bits andpieces um here and there especially ifyou're using multicluster um you know asyour systems might need complexity thatdoesn't mean that you have to have acomplicated system that creates a lot ofmanual work um and importantly a lot oftimes people are doing custom thingshere and there but you have no changesthat you need to make to flux itselfum should remind everybody that flux ismulti- everything it's multi-tenant umulti- uh git and it's also multiclusterwhich is one of the strengths as welland that's one of the areas where weremind people that um flux uses uh isintegrated with cluster API and that'sone of the feedbacks we get that peoplelove and that means that yourmulticluster management is simplifiedand scalable and we're really excited toannounce how um health checks areavailable now using um with flux's useof common expression language or cell umand this means that you get moregranularity uh to see what the status isof your cluster and its readiness so ifyou have dependency management or raceconditions or you want more grgranularity with notifications um thismakes this possible if you want moredetails make sure you check out our QRcode which has all of our informationabout our activitieshere uh and as you've seen now um Fluxhas been integrated with key features isvery aligned with Kubernetes and it'salso very extensible which is why wehave such a strong community of peopleum who are building integrations oradding value to it um there are many butI'll mention some of the latest ones umfor example you have various UIs thatyou can use um but we're really excitedthat the headlamp team um have beenworking closely with Flux to create areally great usable um UI uh they didannouncements at the last CubeCon at therecent one they have some new ones aswell so check out more information atthe QR code um there's also a Fluxoperator um if you ever use FluxBootstrap which is really convenient itmakes it even more convenient bystreamlining a lot of those processes sofind out that information and wherewould we be if we didn't mention Gen AIum so you want to make sure that youhave good version controlling for yourLLMs and so it's really exciting thatHelix has also put out some referencearchitecture to share with the communityand so there's that and so much more soum security is key we've had financialinstitutions healthcare government usingFlux um they really like that there'ssecurity first in the design you'veprobably heard over the years how wellwe've done in all the security auditsthat we've done through the CNCF fluxalso uses Kubernetes arbback and mostrecently we've added GitHub appauthentication um there are alsosecurity slams here at CubeCon so checkout that information here um oh and Ishould mention that we've been reallyexcited about support by companies likeControl Plane that understand that CI/CDis now often a vulnerable area thathackers try to um infiltrate and sothat's part of the plans as well sohopefully you've seen these are some ofthe highlights and um latest on how Fluxhas been meeting the CIC uh CD needs andwe are here at CubeCon London and we'dlove to talk to you uh again the QR codehas information about our flux booth oursessions the security slam and we havemany of our maintainers who are here ithink there are a few here over therewe've often had questions right afterthe lightning talk we'll be rightoutside the room so if you haveimmediate questions about Flux um gochat with them thank you[Applause]2025-04-15 22:02:40.519599bout like 13different APIs uh that decouples it fromthe underlying infrastructure I'll sharea slide later Um the project has beenaround for five years now It's reallybattle tested project and last Novemberit became a graduated project which isreally quite an achievement Really proudabout that And yeah here you can seethat you can use whatever language youwant to develop your applications Deerruns in a sidecar in a separate processnext to your application and you use thedeer API inside your application codeBut the deer API then provides a layerof extraction over the underlyinginfrastructure which makes it likereally flexible So if you want to usethe popsup API with Dapper you use theyou can use like what whatever messagebroker out there because that's veryeasily configured with Dapper and youcan change them without changingapplicationcode I want to share a couple ofhighlights from the latest release which115 uh last February So one of the majorthings that came out is that the Deerworkflow API is now stable So it can beused in production So it's based on thedual execution concept So it means thatthe workflow state is uh saved to a deercompatible state store It worksseamlessly with all of the other deerAPIs and um you author your workflow ascode So you can use like C or Python orJava JavaScript or Go to to write yourworkflows and you can use like differentwork workflow patterns So for instanceif the order of certain tasks areimportant you can use like task chainingIf the uh if there's no dependencybetween activities you can use the fanout fan in pattern and then stillaggregate over the result over all ofthese activities once they have beencompleted You can use the monitorpattern which is great for reoccurringactivities And you can use the externalsystem interaction when you want to uhhold with the execution of the workflowuntil an external event comes in andbased on that payload of the event youcan make another route in your uhworkflow And typically you would combineall these different patterns al togetherin yourworkflows Another new thing in 115 isthe LM conversation API So as you canimagine you get one consistent API totalk to different LM providers And whatDeer does in addition to that is itprovides like prompt caching to save yousome latency and some cost And you canalso enable PII offiscation So you canobiscate some sensitive user input likelike credit card details or emails andsoon Here you can see what LM providers wecurrently support Actually this isalready out of date because uh um umsome some more have been added recentlyUm another new announcement that theCNCF made like a couple of weeks ago isdeer agents which is a framework tobuild agentic AI systems So it'scompletely open source uh systemCurrently it exists as a Python librarybut um we're also going to develop thatfor other languages as well And it runson top of Dapper opensource So you can use like pop submessaging between your AI agents and youcan also use all of the differentworkflow patterns I just showed you aswell to build these agentic AIsystems So if you want to know more yeahplease come to the Deer Kiosk at theparty pavilion We're at Kiok 8A Uh I'llbe there and another maintainer will bethere and also one of the steeringcommittee members will be there Andthere are also some deer sessions thatyou can attend this week two tomorrowand two on Friday Uh on Friday I'll bespeaking about deer workflow in quitesome detail And if you want to know abit more about deer right now thenplease visitdeer.io Thanks everyone Have a good cube2025-04-15 22:02:41.186562ing we also is saythat we we don't have enough resultsright we have really large cluster wehave requirement of lots of uh GPUhardware and the network uh networkthings now the first one I think thechallenge is the single cluster thescalability of single cluster is notenough right the by default is 5,000 butfor language model there's going to berequire lots of uh GPU hardware thesecond one is that we you know when wehave enough GPU or we have enoughnetwork but how about the resourceutilization if they didn't have a gooduh resource utilization I think theperformance is also not good enough Ithink the third one maybe the a kind ofnew topic is that we have severalframework right parch severalenhancement but lots of framework didn'ttalk with the infrastructure layer rightthey didn't export the information orhow the data was exchanged between theagent and infrastructure didn't knowthat right the kubernetes volcano didn'tknow that information so they it's hardfor them to you know to do theimprovement to do any improvement forthat and we just guessright okay the first one is I think forthe federation for the multicluster is areally how to say really long topic wehave several topic about federation I Ithink it's about five or 10 years agoright there several challenge things andwe try to handle them in our in ourinino project so the first one is APIright uh for federation they try to keepthe backward compatibility of uh severalAPIs such as deployment several thingsbut in volcano we have a single projectright volcano global for federation caseand volcano core part for single clusterso we will provide a unified API for allthe case so that's going to be thesimple to handle this one and the secondone is about networking and storageright for volcano and for the cubes wefor for AI uh scenario we focus on thewe we d we dedicated this part to youknow to other project I think the majorthing is that we are going to have a apowerful really powerful management forthe east wise networking part we havethe other project to handle this one andthe third one is how to handle thescaling uh the scheduling across thecluster I think on the on the bottomleft we have a case that we're going tohave a dialogue between the you knowcross the cluster this is a typicalissue the thing is that I think themajor thing is that the this two clusterdidn't have information with each otherso in a single project for example forkino we have enough information tohandle this partuh the third one is how we are going toimprove the resource utilization i thinkthe major thing is the uh network awarescheduling so we are going to providethe several information and the firstone is uh we're going to have a you knowkind of cross the cluster and the secondone we're going to provide the unifiedAPI for all of them and the third one isthat we will provide a really largescalability component to handle thispart and in in tomorrow we are going toshow more details about network uhscheduling and also we are going to havea user case for this one okay uh thethird one is that we also introduce akind of meta framework to talk with theinfrastructure we can see that on the onthe button head that we are going tochange all the information networkingstorage of the workload to theinfrastructure layer so volcano can usethis information and al also the volcanoglobal can use this information to youknow to place the workload better forthe all the workload and so we canimprove the resource utilization and geta better performance for the things yeahand this one is all the reference allthe links of all the project I mentionedhere so you can take a picture and getthis we can talk thislater okay thank you thank you very muchyeah2025-04-15 22:02:41.838676 �j�#� AvpMHv-56gskgreat so uh let beer at CubeCon Europe2025 and uh this lightning talk is aboutCI/CD observability with open telemetryand let's start with a quick uh historyso I've been talking about uh CI/CDobservability uh for many years nowmaybe some of you have heard me over atdifferent stages including here atCubeCon and CDCON and others uh andabout two years ago I uh raised uh anopen telemetry enhancement proposal inOTP uh essentially suggesting to enhanceOTEL to cover CI/CD observability andnot just the production monitoring thatwe're probably all used to uh and thisyou can see here on on the screen the uhoriginal and the OTP ultimately uh endedup with the formation of a new specialinterest group SIG under OTELalong with my colleague Ael Perkins whoI believe is somewhere here at theaudienceum and the SIG deals with several areasuh the most important one the first oneuh and the one that people know most issemantic conventions semon inshort um how many here know whatsemantic conventions =�o�#�ABdkB0eERa5Ahello good afternoon so this one is notgoing to be a technical talk um I wouldlike to talk to you about the BlackInitiative the the Black Indigenous andPeople of Color Initiatives BIPO and howwe're trying to build uh an inclusivecloud native uhfuture so a little bit about me who am Imy name is William Ritz i'm a consultingarchitect at Mirantis uh I'm also a CNCFand a linker ambassador um I have servedrecently in the Kubernetes 132 releaseuh team uh I'm a member of the BIPOworking group and a member of a memberof working group platform i kind of duba little biteverywhere um so you can scan that QRcode if you want to link up with me idon't know why you would but you knowyou can give it a go so why do we wantto build a why do we have built a new uminclusive initiative uh or a new diversdiversity and inclus inclusioninitiative seems to be a very hard termright now uh but we want to empower ourterms are to empower people uh that looklike me or or with that look like someof you that are black that are that areindigenous people of color but then thatthat are not necessarily uh looking likeme but also looking more you know alittle bit more white we want to empowereverybodyso we've there is racism also in opensource and I I had it in the slide iactually removed it now i I thoughtmaybe it's a bit hard to say to otherpeople but um there is um there is adiscrimination so what we want to do iscreate a platform to for uh black bipokindividuals to uh begin contributinginto open source be more visible andspeak in public uh about uh technologyand open source and cloud nativetechnologies um so we want to have askto many of you if to join us uh in thisin this journey if you are a maintainerif you are an OSS contributor if you area speaker to put yourself out there as amentor for some of this uh of of us orto if you are one of us to be a mentorforothers and that's pretty much all I'masking for and this is a Slack groupthat you can join to to help us out thatwas all[Applause]2025-04-15 22:02:42.622973are with a show ofhands okay not many uh but semanticconventions are essentially not theformal definition it's the uh uessentially the common language fordescribing your telemetry whichevertelemetry it is uh which attributes uhyou report about your system and so onin this case about CI/CD related uhattributes and there are severalattributes or attribute groups that uhwe we've tried to tackle with SIG uhfirst is the CI/CD pipelines essentiallythe attributes uh denoting what apipeline is and what happens there yousee some examples here on the on thescreenshot um like the uh deploymentname pipeline result pipeline run ID andso on um and you can see from thescreenshot you have the the attributename that the semantic conventionsdefined the attribute type and otherelements also how the different signalsinterrelate and so on this is thesemantic conventions other domains arehow you express deployments consistentlylike Dora and so on uh there are alsoattributes for uh VCS versioning controlsystems expressing things uh about yourthe actions happening within your repowithin your repository like when you uhmake a change and so on uh also when uhyou have uh tests go on how to expressuh them uniquely within your CI/CDpipeline runs this is the testing sideand uh also about u artifacts uh forthose who know Salsa the supply chainlevels for software they have the notionof an artifact uh and and the it's ittightly relates with that notion forthose who familiar so we have theattributes and then once you have theattributes you attach the attributes toyour uh uh telemetry and derive uh therelevant metrics traces and othersignals so you can see here examples forsome uh work we did some work we didaround metrics for example you see apipeline run duration worker count um uhVCS change count uh or or a time formerge things that you'd expect toexpress now you have the language toexpress that within your telemetry alsothe screenshot is taken from a GitHubpipeline run and you can actually seethat visualize it as a trace based onthat these semanticconventions but it's not just aboutsemantic conventions here at the SIG uhwe also have been working on thespecification for context and baggagepropagation over environment variableswe all know about propagating contextfor example between microservices usingover the network which is HTTP based orgRPC based and for these we actuallyhave well- definfined uh semanticconventions andspecifications but what happens in thesecases where you don't have networking touh to uh communicate over the network uhlike things in the CI/CD pipeline orinfrastructurees code things liketerraform and and open tofu when youspin a subro uh uh sub uh uh processesand and so on this is where propagatingthe context and the baggage overenvironment variables becomes valuableand this is a work by the way what yousee here on the screenshot is an opentofu uh uh essentially trace uhvisualized based on this mechanism sothis is another area of focus for theSIG uh and much more that we we do bothin terms of different signals metricstraces events essentially logs uh and Imentioned before GitHub we also havereceivers for GitLab working uh interestabout Argo workflows Jenkins and othersso lots of receivers prototypesreference implementations for uh forthis work uh we also are looking toexpand semantic conventions be foradditional use cases and additionaldomains such as software outageincidents and others so all of thattogether all the work uh the purpose isto really enable observability acrossthe software development life cycle endto end so uh if this is of interest foryou check out the talk tomorrowWednesday noon uh of the maintainertrack you have the link here uh for thesketch for my with myself and theco-lead uh anduh check out the blog post that we wewrote on the last CubeCon CubeCon NorthAmerica about the SIG the charter andeverything else you're welcome to scanthe QR code and check out moreinformation there uh I'm Dotan Horvitzthank you very much for listening andlooking forward to seeing you at themaintainer track2025-04-15 22:02:43.189843 ??�=�#�3A0FrmlwV9D0Yokay time to start um okay nice meetingyou everyone's this is Syan from IBMresearch uh I'm a core maintenance fromthe Keepler project and it has been mygreat pleasures being here's updateabout our project um okay uh we have atalks and we have a several booths inthe previous kubecon but again I wouldlike to start like about the motivationsof the kevlar project and what's thekevlar project is and the missions ofthe kevlar project is basically toanswer these questions how much areenergy consumption should be attributedto your workload this question is not uhapply to just only the kubernetes worldbut it's also in the process levelscontainer level or the pot levels thatwe are talking and hostiles are it couldbe applied to the biome metals levels orthe workload that landing on thecloud the Kepler project has beeninitiated or started in February 2022 toanswer that questions to give thesolutions by the red hat and IBM teamand has been donated to CNC up as asandbar projects in May 2023 um therethere is three functions that we offersby the kelers the first thing is realtimes metrics collections we collect allthe resource usage uh in in yourmachines we do the second uh functionsthat is five grains power modelings andwe do finally we reporting the energyconsumptions per process or percontainer or per pot like what we callis five grains uh matrix of the energiesand we export it in uh formats of thepromeas which is kind of the the factorstandard for for the metrics and that'scan be used to show as a dashboard inthe graph as you can see on the on thescreenso how do we evaluate or how we valuesfor each functions that we offers thefirst one is the real time matrixcollections um because it's real time weneed something that the lightweight andwe have to make sure that the way thatwe collect the matrix doesn't affect ormaking the ohhead consume more uh energyconsumptions on your machines um wefirst look into the EVPF technologies asthe previous um um speakers mentionsabout like the leveraging the EVPFtechnologies could be solved are ourfirst issues and the second thing isthere how we do the frames powermodelings it's not simples as like howmuch resource you use you just have toattribute on that because their processcontainer and pods they have somethingshared together like there they sharesome part of the hardware that's consoleon the energies it's not straightforwardthat you can't isolate this um directlyso in during our three years we have alot of blog post um explaining why wemodels the power energy in this way andwe also have a lot of the resource in uhthe Kevlar's um in the Kevlar website aswell that you can track and the latestone is their accurate report for thereporting features that we ummentioning so um currently our uhKepler's projects community we areworking very hard um on doing uh ondelivering the ex accurate report of theenergy consumptions of the workloadsthis including the effort uh from thered hat team and also the um the orcompanies like IBM or Intels trying tovalidating the um the power modelstrying to find the benchmarkings and weare trying to expand the variations ofthe matrix and the machines to covers asmuch as possible we trying to inincrease their engagement andcontributor to the Kepler project and uhwe also considering using their umintegrations of AI assistant and agentsto help uh things work better you canalso learn more about the kelers fromthe many online media we do thecollaborations we have a podcast withtheir peer performance we have thejoining the YouTube sessions with uhwith very awesome uh YouTube channelsfrom within and victors and recently wehave got a great honors to be uh beingused and being a part of awesometutorials in the um dino YouTubechannels using our kelers config ourkelers and uh integrated into theobservabilities platform so this is allI would like to update today um pleasecheck out this QD QR codes and uh reachout to uh us SL we are the slash channelscabler hyphens project thank you2025-04-15 22:02:43.751923 ��p�#�Afok2apYcVdEso my name is Matias i'm a co-maintainerof Cubscape so Cubscape started in 2021as a static scanner for uh yourcommunities cluster uh we were scanningyour resources uh against some securityframeworks andum I have some good news so this year wereached uhincubation we started as a sandbox andnow we are an incubation projectso uh when I joined uh the project itwas like two and a half years ago moreor less and there was something new atthat time not new but like it was hot itwas ebpf so we thought how can we evolvefrom just scanning a cluster to includesome ebpf uh goodies so we said let'smake uh let's start recording howapplication behave and then we can checkif it's good or not if there are somedeviations so it seems simple uh so wewrote an agent using ebpf which wouldlike monitor the containers through theLinux kernel and then just store allthese uh behaviors as a CRD into thethrough the Kubernetes API we eventhought about writing our own storagecomponent uh which allows to have aseparateuh separate place to store them uh notin the TCD so seems simple however atthe verybeginning one of the CR that we createdwas called application profile and itwas very simple we would track only onecontainer we would only record execs andopenshowever when developers have a game theystart playing a lot so then we startedtracking not only one container but allthe containers of of a workload so in itcontainers FML containers all thecontainers and instead of just the execswe added the second profiles uh theendpoints that the application istalking to so the different images thecall stacks uhpolicies as a result our CRD instead ofbeing 100K became 100megs so as you can imagine it causedsome issues so I'm going to tell alittle bit of them so you have the wholestory on how we developed uh thisfeatureso first about the EBPF library so westarted using the falco liibsuh which were good because they werecompatible with older Linux kernelsunfortunately uh since it's old ebpftechnology we were using a lot of CPUand a lot of memory so we moved to theinspector gadget libs which are likemuch better because you using newer ebpflibrarieswe as I told you we were implementingour own API server but instead of usingthe sample API server we modified itheavily uh first modification obviouslyto store 100 megs you cannot reallystore them in etc so instead of usingetc we started to use filesuh and then a SQL like database uh to beable to get the metadata fast and we hadto implement a direct IO driver to avoidhaving all those files loaded in thememory cache of the Linux kernelthen we instead of sending full objectsback and forth between the agents andthe storage we started to only senddiffs which has the two benefits firstyou are not sending 100 megs every timeand second each no each agent doesn'tneed to remember what he sent he justlike sends what he sees during like ashort amount of time and then he canflush the memory and then finally we gotrid of JSON so the sample API serveronly uses JSON so we shoehorn into itgRPC and instead of using six or 700megs of memory to unmarshall aJSON we just use like uh this the sizeof the of the CRD so few hundreds sogRPC was really like a good um goodimprovement so you can reach us at theproject pavon we are there every morninguh 18A so tomorrow Thursday and Fridayand there is a talk uh on Fridayafternoon during the country fest thankyou[Applause]2025-04-15 22:02:44.573034 V�V�z�!#�-AbeAoZ2fI-QQhi everyone um this is Vant uh I'm oneof the maintainers of Litmas Chaos uhtoday I will be talking about Litmoschaos and you know um its journeytowards uh CNCF graduation so uh beforestarting with the same uh just a littlebit of introduction about litmos chaosso if you don't know about about it uhlitmos chaos is a tool which you can youknow um use to induce chaos intodifferent layers and of yourapplications but in a controlled way uhin terms of matrix uh it is beingcurrently used and adopted by around 250plus enterprise customers and um if youlook at you know last year matrix we wehave around 2 million plus litmosiA� � #�yA81SMpKgJb3kall right hello everyone my name isAugusta i'm a perser i'm here to talkabout pers and give you an update of theproject uh since September 2024 Pers isa CNCF sandbut project it's about uhgiving uh a possibility to the CNCF tohave a project that will give any forany observ data visualization so it's aobservation tool likegraphana uh but it also aim to providean open specification fordashboard and I would like to talkspecifically to about the new releasethat we is coming so theV051 which is currently in beta this newversion is coming with a new plug-insystem which entire review how we areable to load the plug-in and uh thanksto this new release we will be able toload external plugins which implies weare going to support over data sourcesuh than promeus and tempo so we'll startby uh with sorry open search and lo keyfor the logs and then for the profilingwe'll also support um pyroscope andparka we are going also to provide a devenvironment for the plugins uh so if youare going to implement a plug-in uh inthe pers ecosystem you will have a CLIthat will be able to load the yourplug-in into pers so you can see in liveyour changes and if if it fits uh pesalso we are providing a certain numberof plugins and uh each plugins will comewith a npm package so you can embed inyour any kind of plugins we areproviding it also come with a go moduleand a culong module uh which isimportant for us because we areproviding a dashboard as code SDK so ifyou want to implement your dashboardusing your Golong or Qong then you canyou can use our package to implementyourplugins uh also we are pouring overfeature and improvement of course suchas we are able now to connect datasources protected by O2 or OIDC we alsoimprove the data source discovery u towell withKubernetes and the last thing I wouldlike to talk about is the possibility toconfigure linu or dashboard that's animportant things because ifyou if your company has a guideline anduh they want a certain amount of thingsin your dashboard you can enforce nowthis guideline directly with this linktruth and ensure that all your dashboardwill fit your company guidelinesregarding the cubecon agenda for pers itwill happen on Thursday the first thingwe have a talk from perser Nicholas andAntoan they will talk about dashboard ascode uh they will give you ademo about the complete workflow ofdashboard as code starting to develop adashboard using Golong or Qong andfinishing by deploying the dashboardwith aCI/CD we are also available all the dayat the project pavon map uh because wehave a booth at the freeb kiosk so ifyou are interesting by the project andyou would like to share your experienceor any use case you would like to see inthe project please come to us we'll bethere all the dayif you interesting by the project youcan follow us on social media or you canfollow directly on GitHub the projectpersp we are available on Slack ofcourse uh if you want to talk directlyto the mount and we have a officialwebsitepers.dev where you have the blog postand the documentation thank you verymuch[Applause]2025-04-15 22:02:45.142989nstallations till now uh we have around68 millions docker pools 300% uh usageincrease in last one year 21 activemaintenance uh this is this is going toincrease because um just now we haveimplemented a you know new governancemodel which you know includes bootingmodel as well so according to that youknow it is going to increase we are alsoworking with different you knowcommunity members on the same and thenwe have also done 100 plus releases soyes we have done our own century andthen u yeah we have 2500 plus communitymembers so if you haven't joined it yetuh I will be sharing a QR you know codefor the same you can scan it and youknow join our slack channel so yeah uhfirstly um we'll talking about theaudits we have done till now so we havedone uh two audits uh till now one is uhsecurity audit and other one is thedocumentation audit security audit wasdone by a 7-day security team as part ofthat uh we fixed many vulnerabilities bebe it at the application level be it atthe network level be it at the you knowAPI protection level um and we alsoenhanced our CI/CD pipeline so that youknow we don't um you know uh find thoseissues again and we implemented you knowadmission controller tou restrict malicious users from creatingprivilege parts because litmuscos itselfneeds privilege access to you know dothe chaos coming to the documentationaudit uh it was done by Nate and devteam uh from the CNCF team we with withtheir help we were able to you know umremove different obsolete websites thatwe had older litmuscular website v1 andv2 docs we also uh you know worked onyou know updating our structures tomaintain the consistency we also we arealso working on you know adding moretutorial sections to help the early orbeginner users to be get started withlitmus as as soon as possible thencoming to LFX mentorships um litmus as aCNC projects also uh you know activelyparticipates into LFX mentorship uh thisyear also in this term we are uh youknow working on three different taskwith three different mentees who arealready selected one of the task is toyou know improve code coverage forobservability in litmos other one is toprovide a flexible way to interact withour litmos chaos APIs be it terform beit GitHub actions be it gitlab templatesor even um SDKs and then lastly we haveyou know um um our documentation task sowhich is to add you know different um umdocumentation sections with respect totutorials you know and keeping it thestructure like you know day one day zerolike that um next is uh what's coming upin future so um we are currently workingon as as you saw right in LFX mentorshipwe are also working with differentmentees on SDK support terraform supportso that is also something which is inprogress similarly we are also lookingover to the KSGPT integration to providea recommended way approach uh to youknow run chaos experiments you knowwhere to run how to run what to run andthen similarly we currently uh you knowuse dedicated agents to induce chaos beit in your bare metal servers or youknow Kubernetes clusters to avoid thatto have a transient runner running andto be able to do chaos you know via theuh remote agent that is something whichwe have in our future in mind and thenlastly you know uh we have native chaosworkflows also to be able to you knowmanage the whole life cycle ofexperience by ourself and then opentelemetry is the one which is uh we arecurrently working as a one of the taskin LFXmentorships lastly um I would say thanksuh for listening to me uh we have youknow I have shared two uh QR codes hereone is for you know checking out ourlitmos websites and one is for you knowjoining our litmas slack and with thatwe also have one more uh talk coming upin the maintainer track which you canjoin uh there will be more in-depthupdates about what's going on with thelitmus kiosk there SA and Satak will bethere who are also you know maintainersof litmos and similarly we also have aproject pavilion kaiosk 10a there alsoyou can join uh we have afternoon youknow hours there so yeah thanks everyonethat's all ahead[Applause]2025-04-15 22:02:45.592658les that are designed to besimple to read and to write So thoserules are based on system calls in theLinux kernel But even if you're not anexpert in system calls you work very farfrom the kernel You don't need to be oneYou you just you can simply read thisrule and perfectly understand what itdoes Uh even if you do front end all dayI went to front end engineer and andasked and I completely understood forerules So as easy it is to write and readrules it's easy to get the alerts It'ssimple text It's a text message that youcan see that not only has a specificevent but also the context about yourKubernetes cluster your cloudinfrastructure and everything that isaround the specific event So uh Falcoover the year became far more than thatSo Falco does not just take events fromthe kernel and reads them and outputsthrough and outputs alerts Today you caningest events from a lot of differentareas from a lot of different sourcesYou can take them from Kubernetescluster from your cloud account eventsand from a lot of other things And youcan output them to your CM to cues to alot of other places It would take me farmore than five minutes to to tell tellyou about all of these things And when Iwas talking about these kind of thingslast week with the CNCF community groupthey asked "So it's great It's got aresponse engine and all but I never useFalco What what would my journey be likeif today I started using Falco fromscratch?" And uh I'm here to answer thatquestion because I think it was a coolquestion So at first if you try you willdiscover what's happening and I can tellyou this is not just going to be ajourney of discovery about what Falcodoes and doesn't do and if it's good foryou but it will be a journey aboutdiscovering yourself as well because Iam sure that when you install Falco youwill find out that your nodes are doingsomething that you're not expecting I'mnot saying that you're under attack It'sit's not what I mean It's a mean thatsomething is not going is going to bethere and you were not expecting it tobe maybe a chrome job that does somefile access whatever and then if youlike it you will learn how it works andyou will learn how rules work and youwill learn how to customize them so thatthey fit your environment a little bitbetter and then you can be perfectlyhappy with it but many people after awhile they will start using ourintegrations I showed you that there's alot of plugins there's a responseengineer There's a lot of ways to getoutputs to Falco and you will startusing them and you will help thecommunity and also you will develop alittle bit of hate for us maintainersbecause you will find our bugs and Iknow there's some very annoying onesthat you are trying to with with thehelp of the community every day to fixAnd then afterwards we've got someheroes that actually went and developedtheir own integrations They developtheir own plugins because that's whatworks best for their environment Andwe've got some at the maintainer trackon Thursday that will talk to us aboutuh their journey And then if you are atthat stage and perhaps a little bitinsane you can join us and become amaintainer as well and and then go to doall this weird kernel stuff and get toas low level as as you'd like to go anduh and be and be a part of of the groupWe got some of these We're very thankfulfor them because they built support forthings that we didn't even imagineexisted So your Falco journey doesn'thave to be this complicated Uh the youdon't need to become a core maintainerwith us but you your Falco journey canstart here at CubeCon Uh we have a kioskin the project pavilion We can uh wehave a maintainer track on Thursday orjust go to falco.org and take a look andtry it out on your environment to see ifit's uh if it's good for you Thank you2025-04-15 22:02:46.120494 �E�"#�CARGLy_JtGD9Uhey everyone I'm Luca and let's talkabout Falco I'm one of the maintainersand if you didn't know about Falco Falcois graduated project from CNCF thatmonitors your infrastructure forsecurity events So you install it onyour clusters on your Kubernetesclusters on your nodes on yourcontainerized environment and Falcolooks at everything that happens in thisnode and will alert you on everythingthat you think is a security relevantevent How does it do it It uses verysimple ruB ��+�##�ANRL-bSYVi7Qhello my name is Ben Koshi i'm on thePrometheus team welcome to thePrometheus 3.0 speedrun so Prometheus3.0 we released it last November it'sbeen really awesome uh there are so manydifferent things it's hard to go throughthis in 5 minutes but what I've got nowisuh a quick summary so uh the big fun webuilt a new UI it's really pretty it hasa bunch of nice things to help you learnhow to use PromQL um we mostly were uhenabling and removing a bunch of featureflags so things that were feature flagsin Prometheus 2.0 are now on by defaultuh and available for use lots of nicegraduated features um uh and themigration is super easy there's notactually uh a lot of breaking changesthat we had to include in Prometheus 3.0but there are a few small things so webumped major version um 7 years ofstability is just insane for an opensource project in the cloudnative spaceuh there uh Prometheus 2.0 was fullyupgradeable all the way through 3.0 withno breaking changes um there are almost10,000 commits between 2.0 and 3.0 andwe've already reached a thousand commitsin 3.0 um hundreds and hundreds ofcontributors to the just Prometheusitself not including the additionalhundreds that are part of the originaluh uh the additional features in ourcommunity umuh I used to say a long time ago that inPrometheus 2.0 well you know if you getto about a million series you might wantto start thinking about sharding nowtoday I don't even start tell people toworry about sharding until you reach 10million active series uh the most ofthat has come from the efficiency gainsuh in Prometheus 2.0 we rewrote the TSDBand it took uh some time to reallystabilize and improve the performanceand efficiency and so Prometheus todayis not the Prometheus it was 7 years agouh and if you really want to know moreabout Prometheus 3.0 there's going to bea deep dive talk uh on Wednesday uh gocheck it out i'll pause here for anotherfew more seconds if you want to scanthis QR codethree two one go okay so one of theother fun things with Prometheus 3.0 iswe're almost we were trying to declareit stable but we're almost there we'regoing to have native histograms uh thisis a really really amazing amount ofadditional power for histogram datatypes in Prometheus we are also workingon uh native histogram custom Bboundaries u which means you'll be uhPrometheus will be able to read classicPrometheus histograms and convert theminto native histograms uh on the fly andit significantly reduces the storagecosts and increases the performance ofPrometheus uh and gives you betterallows you to ingest more classicbuckets without the extracost uh coming uh also in Prometheus 3.0know we've got a new version of theremote write spec that's more powerfulmore efficient uh and we're working onthe next uh iteration of thePrometheus/openmetrics format uh that'sgoing to have a bunch of niceimprovements um uh and the biggest thingfor Prometheus 3.0 is we're really onboard with the open telemetry metricsintegration uh we have full UTF8 metricsupport so you can have emojis orwhatever you want in your metric namesuh and labels and label names uhPrometheus has always been UTF8 friendlybut it's only been in the label namesnot in the label or sorry only in thelabel values not the label names and sonow we have full UTF8 support acrosseverything um and we've got a bunch moreinteresting native uh open telemetry uhinteroperability coming down the pipe umuh some of the nice things that we'll beable to take the uh um open telemetrycounter deltas and we'll be able toingest those uh that's going to becoming soon uh and there's going to be awhole talk on open telemetry andPrometheus on Friday uh if you have morequestions uh we've got a projectpavilion uh booth and you're welcome tocome over anytime and ask myself or anyof the other Prometheus developersquestions thank you[Applause]2025-04-15 22:02:46.586637 rr� �$#�MAv_PzG81D33Ihello everyone welcome to the projectlightning talk for Kubernetes Sor myname is Shinyang i work at a VML byBCON in S storage sad myself areco-chairs michelle and Yang are techleads other than the leads we also havemany othercontributors we are working on some veryexciting projects let me highlight a fewthe volume populators feature istargeting GA in 1.33release previously you can only create aPVC from a data source that is a volumesnapshot or another PVC but there areuse cases to support populating volumesfrom other data sources for example uhif you want to do a backup you firstcreate a warning snapshot from that PVCand then you upload data to an objectstore as shown here and at the resorttime you want to be able to create a PVCfrom this backup this backup will beyour uh data source that is not a PVC ora volume snapshot so the goal is toallow generic data populs by permittingany object to be the data source forPVC there are challenges adding supportfor new data sources cannot breakexistingbehavior to support backwardscompatibility we added a new data sourceref in the PVC spec now that thisfeature is targeting GA we'd like to seeimplementation from more storage andbackupvendors the always owner PV reclaimpolicy feature is also targeting GA in1.33 release without this feature the PVreclaim policy is sometimes ignoreddepending on uh whether you try todelete PVC first or try to delete the PVfirst leaking storage resources thatmeans some users may be charged byresources they thought they have alreadydeletedso the goal is to prevent volumes frombeing leaked by always honoring the PVreclaimpolicy and there are challenges eventhough this is a buggy behavior it hasbeen there for a long time so some usersmay expect the behavior tocontinue to mitigate the risks weintroduced this feature back in 1.23release giving user enough time to adoptthe changenow that this feature is targeting GA itis ready to be used inproduction i want to announce theremoval of the git repo entry plug-inthis plug-in has been deprecated for along time and it isunmaintained there are security concernsgit repo volume types can be exploitedto gain remote code execution as a rooton thenotes there are alternatives you can usegit sync or init containers for the samefunctionality the goal is to remove theentry git repo volume code the git repovolumes will not be removed fromKubernetes API it will simply error outif you try to use itthe biggest challenge is that we couldbe breaking users who are still usingthisfeature we introduced the feature gategit repo volume driver uh in1.33 the feature gate will be locked in1.36 and finally in 1.39 release thisfeature gate and the git repo drivercode will be removedif any of this sounds interesting pleaseget intouch here are some SK storage sessionsand events at CubeCon there is a SEKmeet and greet on Thursday come talk tous if you are interested thank you[Applause]2025-04-15 22:02:47.107992g all of these scenarios happeningso persistence on the cluster dataservices on the cluster is somethingthat people are trying to go from stepone to step two to step three maybe orgoing from step three back to step twobecause they hit a performancebottleneck or they're scaling theircluster in a different way or they needto manage their applications in acompletely different way for operationalperformance and other security needs sowe have helped design a framework calledcanister to help orchestrate and takecare of all these different situationsand even help you change them because itis a framework for helping orchestratedata protection on acluster with on and off clusterconcerns so that is because it is such acomplex world out there once you getyour application onto Kubernetes and getyour data onto Kubernetes or adjacent toKubernetes there are a million differentdomains you need to span again you needa framework to help start to addressthis problem and that's because evenonce you finally do a backup hey we usedto do backups just by freezing the diskthat's not data at rest not with acomplex distributed application that'ssharding on each pod on each worker nodeon each cluster no we need to start toaddress these problems and I have aframework foryou so most people think about things inthat infrastructure bottoms up way forhow to back up the data how to get thedata at rest how to deal with theapplication we don't have to do thatanymore we can deal with it in a top-down fashion we call this applicationconsistent backups and for a lot ofbackup administrators out there in theworld bare metal and VMs they don't knowwhat that is you know what that shouldbe i have a framework for you soapplication centric helps exercise allof these different ways of getting tothe data and dealing with theapplication and orchestrating all theconcerns so this is not even thoughwe're a sandbox project only as of 2023we're not new we have been in productionwith many customers for a long time weopen sourced this project back in 2017we've got lots of installations andtestimonies and so on but let me tellyou about how you install thiscloudnative controller on your clusterwith a Helm chart and the three CRDsthat power it blueprints basically arehow we drive the heart of this systemaction sets are how we instantiate itand the profiles are just the targetsfor where we export and import theartifacts of these backupsthey areCRS we have example blueprints for lotsof different types of databases anddistributed applications out there onour GitHub repo and I'll show you alittle bit more a teaser of what ablueprint looks like it has major phasesinside of it a phase excuse me majoractions inside of a blueprint backupcould be one restore could be one deletean artifact could be another one rightwe need to manage our artifacts not justcreate them uh and yes we have phasesand order of operations and parallelismand many built-in functions aswell but those blueprints are in someways abstract in the sense that they canbe driven by an action set an action setinstantiates a blueprint with itsruntime arguments and which locationprofile should I target maybe I want toback up to S3 and NFS and something elseon the cluster and something else on theother cluster but that one hasimmutability and I need a differentbackup policy and a different retentionschedule on that how do you do that wellif you write a script or you try to backit up with Valero yeah you're going tocode all of that i have a framework foryouso uh action sets instantiate theseblueprints and give them the exactarguments and targets and so on theytrack the status as well because in asense it's a running job that we have onthecluster so if you like what you saw comefind me on Thursday we'll talk about itgo to the website join us we need peoplethat speak DevOps we need people thatspeak Golang we need people who want toget their feet wet who like to writedocumentation i'm Mark Lavy i'll bethere all day Thursday come find us andif you like any of these things come seeus at VH as well cheers2025-04-15 22:02:47.740404 E �ZE��'#��9A_r7blpGA1Fwlet's get this party started and PS dideverybody get the Doctor Who referenceand he's wearing a TARDIS and doctorokay the one guy yes because he was inthe 9 a.m all right cool cool cool uhthis is not normal uh it's fine allright so my name is Laura Laruso i amthe head of community at Perona i am ahappily a CNCF ambassador a CDFambassador uh DK SIG chair all the allthe things all the things um you canfind me on LinkedIn and on Blue Sky andthat which shall not be named um andjust go ahead and hit that QR coderight am I Yeah I don't know hellocan we get Mike 3 up please and my nameis Gerald Venzel i'm a VP for deafinitiatives at Oracle uh also CNCFambassador and I sit also on the S SQLstandards committeeoh yeah okay cool so here's our quickagenda we're going to go over a briefhistory of the CNCF why the landscape isimportant how to use the landscape someawesome projects you might not knowabout and what'snext so this is a great picture um thiswas taken in 2025 or I'm sorry that's2024 Kubernetes Contributor Summit inSalt Lake City and that's just to showlike just an inkling of how many peopleare working on Kubernetes now and why isKubernetes important to us and thisCNCF because it's kind of what got usour start so what is the cloudnativecomputing foundation it is 28 projects257,000 contributors 756 members and96,000 community members so raise yourhands how many people know if theircompany is an actual member of the CNCFokay so that's crazy because there's 756members what I want you to do is whenyou go back to yoI�M�&#�SAQ9m7eGoBaMAall right Uh hey everybody Thanks forcoming to uh the project copaceticlightning talk Uh quick show of handsWho likes CVEes in containerimages Who does not like CVS incontainer images Wow I said I asked thequestion wrong All right So Copatheticis a tool to directly patchvulnerabilities in container images Howdoes it do that Well it does that bybeing a uh build kit based CLI tool thatknows how to interpret scanner resultsSo something like trivy or gripe Uhidentify vulnerable packages withinthose reports Um use the packagemanagers for the underlying distributionfor the container Generate a new patchlayer and apply that on top of theexisting container And it can do thiswithout actually having the tooling inthe container So if you have a DR listcontainer for example uh Copa isactually able to patch that Um Copa is aCNCF sandbH�A�%#�;A6CXsWNOqYSwall right hi everybody i'm Mark Lavy andtoday we're going to talk about storagetrack data protection canister uhcanister is one of those brand new 130sandbox projects but we're going to tellyou the problem that we solve how wesolve it better and why you should joinup uh you can reach me lobby through allthe different places uh I will bemanning our project pavilion booth kiosk208 all day Thursday come find me if youhave any other questions i'm one of themaintainers of theproject so it's no secret thatKubernetes explosion exploding uh usageis now incredibly stateful distributedapplications like Elastic Search and soon are putting state on the clusterthere's no more GitOps your way out ofthis anymore right this is what we'reseeing with all of our customers uh andso it doesn't even have to be ElasticSearch or other distributed app appslike our new vector databases but anydatabase and so as soon as you havestate on the cluster you need to talkabout backup and recovery let's not eventalk about disaster recovery becausethis is the first step and even with thefirst step there's three steps and withthe three steps we're even to 320 something like that now because weextend this to cloud and multicloudbackup scenarios i won't read the slidesto you because you're all smarter thanme so let's keep going but that's how wedo things and so now the question iswhere are the data services are they offthe cluster that's on the left are theyon the cluster but not in theapplication that's in the middle or arethey actually distributed inside thepods with the application right we'reseeinFox project Uh you can go checkit out today I'll have a link to theGitHub repo at the end Um and it's uhit's pretty easy to get going with Uh bydefault Copa uses uh the Trivy uhscanning software but it also supportspluggable scanner So Gripe is one DockerScout is also available uh if you haveanother tool that you're interested inwe would totally like to have communitycontributions for those things So howdoes Copa actually work Well there's twomodes that you can use Copa with Um thefirst is what I just mentioned whereyou're going to use a vulnerability uhscanning tool like Trivy to be able toidentify vulnerabilities that are in theproject But Copa can actually workwithout that and just scan the containerum for any updated packages that existin the package manager for theunderlying OS Let's let's say it's goingto be Ubuntu um and and just install allpackage updates Maybe a little lesssurgical than taking a scan report andand going with that But maybe it's a agood use case for you So what typicallywould happen here is that you wouldgenerate a vulnerability report usingtriva So number two copa would parsethat identify all the vulnerabilities uhand then take those things and look fornew packages that are availableThen using build kit under the coversit's going to generate a new um filesystem layer using the old image uhlayers together It will then use uh youknow if it's going to be an RPM basedsystem it'll use RPM If it's a buntuit'll use deb or appget uh and installthose updated packages into that newfile system layer and then calculate thedifference between what's there and whatwas in the container before and generateyou a brand new patch uh patch layerjust on top of that So you have thebasically the old container with a brandnew layer on top of it with just theupdates that you've asked for eitherfrom the the scanning report or from theuh you know just update everything kindof wild westapproach Uh so recently there have beensome cool new improvements in in um inCopa Uh first one I want to mention isthat there is now support for Alma LinuxUh there's quite a wide support for mostRPM based distributions Uh Ubuntu anumber of other distributions Uh AlmaLinux is a brand new one that has justlanded So if you're using Alma basedcontainers uh Copa will now uh help youout Um community scanner plugins arealso a pretty new contribution Um notdirectly in the project Uh we're workingkind of with an ecosystem for this Uhbut you can find gripe and uh dockerscout as examples for that Um it alsohas new tag suffix support So generallywhen you build a new container with umwith copa it's going to emit the sameimage with a patched suffix on the tagUm it's now customizable and you canmake that work how however you basicallywant Um the tooling images are nowmirrored to GCHR or GitHub containerregistry Uh who likes Docker Hub ratelimits Uh I think that's not a very funone Um so these images have now beenmirrored to GitHub container registry Soyou should be free of um those kind ofinconveniences And then finally there'sbetter logging support for healthpackages So you can uh do some thingsinside of your containers to to block uhupgrading certain packages Uh there'sbetter support inside of Copa for thatNow a couple of cool upcoming thingsRight now it works very specifically onan image So you're going to build umingress engine X version1.12.1 If you're going to patch multipleimages you're going to run Copa multipletimes If you're going to do it formultiple architectures you'll also do itmultiple times But coming soon you willbe able to do both multiarchchitectureand bulk image patching uh with Copa ina much improved uh de developerexperience If you'd like to talk moreabout Copa please come see us at theProject Pavilion kiosk number 20B uh onWednesday between 10:45 and 1500 We'dlove to talk to you give you some demosand and answer any questions you have Uhyou can also find us on GitHub I added aQR code here so you can find the repoWe're also on the CNCF Slack Uh happy toanswer any questions or talk about CopaThanks[Applause]2025-04-15 22:02:48.254276Jur offices or yourhomes check with your marketing team orwith your HR team or with whoever yourOSPO and see if you're a member of theCNCF because there's tons of benefitsthat you may not realize you areentitled to and there's no point inhaving a seat at the table if you're notusing itokay so in case you didn't know theLinux Foundation manages the CNCF andagain if you're a member of the CNCF bydefault you're a me member of the LinuxFoundation training credits how many ofyou like to do trainings and getcertifications so you can stay up todate on what's goingon some of you okay again this is whyit's important to know where yourmembership status is because there aretraining certifications that youprobably have that aren't being used soit's free trainings because you're amember so at the LF they have 900projects and one of them happens to bethe CNCFso where did we start well actually ifwe wind the clock back we're just alittle bit over 10 years into thisjourney right we started all in 2014with Google when they partnered with theLinux Foundation ever since then we havegotten a real real long way right it'slike a couple of uh key highlights here2015 we had the CNC CNCF founded so 10years ago right2016 we had Kubernetes join the CNCF nowthen all the way to now here on a 10thanniversary let's see what the landscapelooksnow uh so there is currently uh as we'veseen before like over 200 projects thatare part of CNCF now how does a projectactually get into the CNCF well it goesinto phases four of them so there's anadoption process where the CNCF looksthe technical committee looks at aproject and sees whether it actuallyfits into the CNCF if that is a yes thenit goes into the sandbox phase nowsandbox phas is where as the name willsuggest sandbox where things can youknow still mature see how it goes rightit maybe works out great maybe not somuch naturally like a lot of projectsget into the sandbox phase right andthen we see okay does the projectactually mature enough or is it justsomething that maybe follow the quickhype and it's kind of settling downagain etc etc after that if the projecthas a mature track track record ofmaturity it goes into the incubationphase right this is where it now andincubate and where we actually now startto see uh you know production usage ofthe uh of the product uh project or seewhether it is deemed good for productionuse cases uh making sure that it has ahealthy maintenance record right so thatthere's maintainers taking care of theproject not forsaking it etc and so onnow last phase is the graduation phaseright those are really uh projects thatessentially to get in there they musthave thriving adoption ratesuh have committers from at least twoorganizations who have documented andstructured governance process so reallythere has to be people there supportingthese projects they can no longer beabandoned and of course uh meet theLinux Foundation's core infrastructureinitiatives best practices badge nowthat's all really to say that you knowlike once it's a graduated project rightit's like you have to be able to putthis into production and rely on thatright that doesn't mean that a projectas a sandbox environment or in the inthe sandbox phase or in the incubationphase will not get there but when yousee a graduated project you know thatthere's actually committers behind itthere's investment behind itet how many people in this room areproject maintainers did y'all go to themaintainer summit last yesterday coolhow many of you are contributorsOh okay so there's room there's room toto grow in this room keep going exactlywe're always looking for contributorsobviously so let's have a look at theproject count over time right as we saidwe started in you know 2014 2016 uh uhwhen CNCF came around and the firstproject showed up and naturally as yousee you have a lot of sandbox projectsright there's obviously that's the firststep in so naturally you will have a lotof projects that go in the sandboxenvironment then incubating phase uh andalso though quite healthy 31 graduatedprojects uh now so you look at thisgraph and yoKu go like wow so this iswhere we get to the 200 so where do wego from here there's like a lot ofprojects in there okay so that is wherewe are going to talk about the CNCFlandscape now how many of you have usedthe CNCF landscape beforeexcellent so we've got a bunch of usersbut no contributors how many of youactually have never heard of the CNCFlandscapeOkay good you're in the right talk okaycool so this convoluted beautiful uhcrazyit's a bit zoomed in it's a bit zoomedin yeah but I think that's a screen upthere never mind is this I mean can youguys see that there's tons of projectslots of boxes and lots of like what isgoing on here exactly so one of thethings I hate is when people tell mejust to go to the website when I can'tfind the information and so what we'retrying to do right now is sort of walkyou through a good way to sort of go tothe CNCF landscape and figure out whatyou need to do so there is thisbeautiful What is happeningthat is a good question it Oh is it theconnectionthat's back keep going keep going don'ttouch anything other than the keyboardwe didn't do a little dance for the demogods okay so there's this beautiful pinkbutton called filters and so what do wewant to do let's start filtering so weclick on CNCF projects but we don't wantanything that's archived right so anarchived project is one that didn'tquite make it and what does that mean itmeans either there was like a newproject that kind of took over wherethat project stopped it lost itsmaintainers it lost its contributors orit was just something that the TOCdeemed not like in use or in productionso it got put to the archive so we don'twant that and what we also we don't wantlet's not do anything sandbox becausewe're trying to find something that'sactually being used by a lot of peopleand it's going to work in ourinfrastructure so let's just stick withgraduated and incubating and then howmany of you are interested insecurity how many of you are like forcedto be interested in security okay thisis what I thought so let's look at somesecurity projects and there's tons ofother ways that you can um that you canuh filter down of course we only wantopen source look at all the differentlicenses right so how many of you knowGeorge Castro you guys were here thismorning when he opened we can all thankGeorge for this thing being so muchbetter than it was if you looked at itlike four years ago so cheer everybodygeorge round ofapplause thank you George okay so let'sjust Who cares hit apply so wow so nowwe completely like narrow down our scopeto security and compliance projects thathave open source licenses and then wecan really kind of go in does anybodyhave one let's see this one's graduatednow let's look at CubeCape we're notOkay just go ahead and takeover okay so let's look at Cubscape sowhat does this uh tell you it tells youall about the project right so you canreally kind of dig in to the health ofthe project that you are thinking aboutusing or maybe you already are using andaren't sure if it's the right projectfor you you can see the number ofrepositories the number of GitHub starsvanity metric or not it's there so youcan see it the number of contributorslike when they last contributed thelanguages all kinds of cool stuff andone of my favorite things to look at isClone Monitor so clone monitor is I willsay this it's project specific theproject's the one that determines whenthat API call comes in so that the clonemonitor gets updated but clone monitorwhat it does is it gives you the overallhealth of your project in multipledifferent stages so it starts a 94 is agreat score that is not to say that ifit's less than a 94 it is a bad projectto use but again we're trying to do ourdue diligence so okay a 94 we're likeall right cool what gives it a 94 let'sscroll down it gives you all of thethings about the repository it goesthrough documentation like is thedocumentation up to snuff the licensingbest practices oh no they don't havetheir community meeting live so what'sgoing on there let's figure that out umsecurity they pass everything except forsign releases and token permissioneverybody kind of didn't raise theirhand as a contributor uh here's anopportunity for you to contribute to aproject you may or may not be usingbecause they need help with securitywith signed releases um their Helmcharts are scores great their license soyou can kind of just really dig into thedepth of the project and see where itlands and if it's good foryou oh oh man back button back button iknow um Okay so where do you want to gonext how about back to theslides how about back to the slidesthose are the slides yay all right sohere's a project you all might haveheard about so there's a couple ofprojects that we nitpicked uh to justhighlight quickly here uh you know forgenuine interest uh or from where theyhave come or because they said could usecontributors every project can alwaysuse more contributors here's one helmhas uh gotten a long track record rightit was one of the I want to say like uearly projects that you used to actuallydeploy onto Kubernetes right it's theKubernetes package manager uh anybodyuses Helmyeah that's what I figured right howmany of you contribute to Helmokaywhat else we got well let's digin so let's look at Helm as a projectright okay so we've got 27,000 GitHubstars 873 contributors none of which arein this room just kidding right andsorry just before I scroll further downgo a little bit up so you see again likealso the timeline timeline of themajority right so they actuallyincubated in 2018 it graduated fivealmost five years ago right in 2020 inMaybeen around for a long time when we talkagain about clone monitor scores and Ithink this is a great like back andforth it's an 86 okay so one of the mostwidely used projects is an 86 so that'swhat I mean when I say do your duediligence and you know this is a projectthat you use but maybe there's anopportunity for you to help the projecthow many people like to writedocumentationbecause that was a 73 woohoo we had oneperson raise their hand there we gothere was a 73 on the docketthere wego we need you um also adopters likethey just don't even have their list ofadopters updated on their site so thisis what I mean do your due diligence ifyou're using um Helm as a project goahead and put in a ticket and just say"Hey we're using you." So they can getthat moving up that's an easy way ofcontributing super easy pull request allright let's keep going best practicesokay see like their Slack presencethat's weird keep goingbut you can see like just because it'san 86 does not mean it is not a projectworth using you just need to determineis it good enough for what your end goalis for what you're building does it meetthe requirements of what your team islooking for and also is it somethingthat you're like we want to use this butit doesn't have enough documentation somaybe I can contribute up thedocumentation so then when I go to myteam lead and say hey I want to use Helmyou know maybe they don't have a red Xanymore maybe it's greenright so we got one more minute left iknow don't you worry i got us oh Iwanted to close us out oh well fine ofcourse okay so do you know that can youtell we know each other yeah we got 50seconds left now that was another threeright there where is it where is it thelast slide there we go here we go jointhe fun43 seconds left holy crapokay so join the funjoin the fun well okay so I mean reallylike you know contribute right it's likeCNCF is a a big open source vibrant bigopen source community right so please goahead take part right this is thecommunity the events i mean your CubeConetc uh but also you know as you've seenlike when you look into the uh you knowthe projects that you want to adoptthere's 200 out there it's like the thelandscape's really there to help you tonavigate what are the projects that areactually relevant for us and then the clmonitor for example shows you you knowwhich of those projects is really whatyou want to adopt and you close us outjoin the Slack channel check out thecalendar for meetings and thank you somuch i'd love to take a selfie if y'alldon't mindcheers get over here2025-04-15 22:02:48.792312ed like I think six years agoabout kilohard uh but there was um muchinterest but not too much developmentand then um in rush we also thought heywe have many different projects insideour company that also use CubeFlow andwouldn't it be perfect if we could likesimplify some stuff um maybe streamlinesome deployments and enable like properparameterization so you can connect thatto infrastructure as code tools and justpropagate the variables secrets whateverright um so I started leading that withuh some assumptions and the basicassumption was that this helm chartshould only focus on the main componentsso in the customize you have everythinglike search manager there's a lot and uhhere we wanted to focus on just thebasics because in bigger environments oncorporate projects where people alreadyknow how to deploy stuff like manager orwhatever they often just want to do itdo it themselves because they also haveto care for some company policies ormaybe maybe cube play would be just onepart of the bigger picture right so youwant to have flexibility and this was uhthese were the basic assumptions so thisis just a list of what's included um asyou see the base components or or themain components and some of thesupporting integration manifests so somevirtual services uh some gateways allowto proxy integration stuff K native andso on and because the CRDs for cubefloware so big there is a separate helmchart for that as well and uh so whatcan you gain with this assumption as Imentioned the infrastructure as code umstaff so everything clicks together forme it's very important and talking withmany people's many different projectsinside rush and outside we often sharethe same approach um so if it this issomething that can be connected toinfrastructure as code setup uh you canintegrate with cloud providers becauseyou have some um parameterization therefor maybe service account annotations orfor some uh configuration or maybeconnecting to existing secret maps onthe cluster so it's there and then itcomes with a price because you knowprobably cubeflow is quite big there arelots of components the values yaml fileitself contains of 2,000 lines you canparameterize basically everything butthen the basics are meant to be same soif you deploy something and just want tohave a quick look if it works you shouldonly maybe have a maybe zero lines linesmaybe 10 lines um of the value file soit should be sane first of all sane andum what's next so because this work wasdriven mostly by me we had some peoplein rush interested even somebody therewas doing some contribution hey Alengood to see you soum what's next is a very importantquestion and interesting questionbecause it was meant to be given tocubeflow community maybe to integratewith cubeflow manifest repository butthen there is also this idea fromcubeflow steering community that theywant to start their initiative withhelmcharts but create different helmcharts per different component thissounds like a good idea as well mypersonal take is that when you work forbigger environment you want to havesomething um like a base artifact thatjust works for you and it's one thingthat does one thing and it does it goodum so maybe it will not get merged intothat repository but then we can dodifferent stuff maybe um we can reusethat with other distributions since Iknow people are doing their owndistributions and um managing that withHelm and that with customize can be abig benefit there um maybe cubecommunity would like to see andreference some of the setups that I didthere uh or we did there with whichwould be also very good so I'm basicallysaying please take that use that howeveryou like because this is a base artifactdriven from community to community andthen if you want to uh reach out see thecup repository uh there is a new releasethere is six information how to use thatthere is quick start uh scripts actuallya few of them so you can integrate withdifferent and you can contact me andlet's touch let's see what else we cando and what and where further we can gowith this thank you[Applause]2025-04-15 22:02:49.458998 � ��?�(#�7ANkOV4_JV4t4my name is Christopher Manovski don'ttangle your tongue just call me Ramani'm lead Kubernetes engineer at Raj andum I want to present something that Iwas leading around cubeflow specificallyfor Helmchart so this is not apresentation about CubeFlow about thiswork around Helmchart and uh some of youthat know cubeflow a little know thatthere is some issue when deploying thatwell it's with customize and becauseit's so big when you want toparameterize for different environmentsfor production there is just a lot ofwork so there is this along issue thatwas uh openL CC�9�)#�+AD7vwFFeEn00hello everyone i'm Tamang from HiTechnologies and today I'm going to talkabout the simplif cloud native securityebl powered encryption in staticcolorlessmesh my topic has two parts one for abrief introduction to our open sourceproject key mesh the other is our umthinkings and implementations of umtraffic encryption in cognitivescenario let's start from first class soum what is K mesh so K mesh is a sitecolors EBPLF based and high performancewith low overhead service mesh data planhere K is for kernel implying that wetry to offload uh the traffic governanceto kernel and the kimsh has been aproject of CNCFlandscape and also my colleagues sharedabout our projects during last year'skcomChina so let's see two modes of kes meshand the first one is kernel mode here weum upload layer four and a simple layer7 traffic governance to kernel and webuild a transparent site color servicemesh without passing through prox layeron data pass so um this is to improveperformance is low overhead and toensure the service um awareness when kshrestarts orupgrades the other mode is D engine modeand here we upload layer four traffic tokernel uh but we process layer 7 trafficuh the traffic with point and this is toimprove deployment flexibility and toensure a transitionsmooth now comes the main topic how toencrypt the traffic in cloud nativescenario the post manager by K meshsends package to a specific NIC and theN will encrypt the package and send tothe pier the pers will uh uh will sendto the service apps and this is linklevelencryption k mesh integrates IPSAC as itencryption tool between nodes each nodehas a key mesh demon at first the userset IPSAC preset keys in the Kubernetesas object secrets then the K mesh demonwill generate IPSC ingress and ingressrules using the encryion information inthe secret and note information in theAPI server also the kimsh demon willupdate the pure nodes information to aBFF try map and this map belongs to aTCBFprogram so what's the purpose of this TCprogram now the port one tries to sendpackets to port two they are indifferent nodes the TC program hook atVither will be called during the packetprocessing it will judge whether theremote port two is managed by K mesh ifso it will mark the packet in the nextstep the market will match the rules inIPSAC so the IBC will encrypt the packetand send it to the uh to the pier andthe piercing will encrypt the packet andsend it to the TC program and also thepier TC program does the same thing itwill uh check whether the source code ismatched by Q mesh according to the BFmap if so um the packet will get a newmark to help it match the rules in IPSACand to be forwarded to PO two so thefinal port will receive thispacket and also our further work is toenable socket level encryption and thekey design is socket migration and multikeymanagement and in socket migration keymesh uh it will on the socket during TShandshake and it returns the ownershipof the socket to the apps after uh theTS handshake finished also to ensurestable performance during key switchflow uh the the cost by certificateexpiration we design a multi keymanagement to ensure the old and the newkey can uh both be used over a period oftime our simple test shows that itachieves more than 30% throughputimprovement compared withenvoyand yes finally thanks for yourlistening and please stay turned for ourproject Kim mesh and also we have ourproject chos in the afternoon of 3rdApril i I hope to see you there thanks2025-04-15 22:02:49.915801 ��%�*#�Aj7QfkNU8XM8uh my name is Yam and I'm a softwareengineer at Red Hat today I will share auser case that involving three topicsthe first one is multicluster AI modeltraining and dataprivacy imagine you have public andprivate clusters each with sensitivedebt and now you want to train alanguage model using all of it withoutexposing that debtwell to manage those clusters we rely onopen cluster management or OCM and forthat privacy we use federated learningit let each cluster train locally andjust send the model updates not the realdata which get combined into a globalmodel so privacy is built from start andthe next question is how do we combinethose two systemtogether before start here is a quickoverview of OCM it's a Kubernetesmulticluster orchestration and it's alsoa CCF sandbox project based on hub spokearchitecture uh the central control planlambda hub clusterwhich those manage cluster it alsoprovide some open APIs like manifestwork open uh placement to schedule yourworkload across those clustersso and well then the next question iswhy did we pick OCM for federatedlearning the answer is straightforwardthat is OCM natively supports federatedlearning in OCM uh those managed clusterpull their deserved status from the hubcluster and push those status back tothe hub cluster and in federatedlearning those collaborators pull theglobal model from the aggregator andtrain nolay push those model updatesback to the aggregator so naturally thehub cluster acts like the aggregator orserver in federated learning and eachmanaged cluster applies the role of uhcollaborator or clintto make those integration seamless we inuh we we built a controller it managesthe whole model training life cycleacross multicluster and also use CRD todefine the workflow support different uhpopular federated learning runtimes likeuh open FL and more so if you arealready using one of those frameworksyou can just plug it in and you don'tneed to rewrite your training codehere's how you get started first youneed to containerize your applicationand then uh create a customer resourcereference that image initially theresource status show as waiting thatmeans the hub that means the server isset on the hub and the system nowscheduling those aggregator or client tothose managed cluster claim to have thedata uh have the data needed for modetrainingand once all the clients and the serveris ready the stures come into runningwhile the federated learning workflowkicks off the lends the P model from theserver train locally on their privatedata and push those updates back to theserver and at last the server aggregatesthose uh model updates to build a betterglobal model after several runs once themodel convergesuh the stats come to complete now youhave your trained model with dataprivacy and preserved end to end that'swhat I want to share today a practiceical approach to multicluster AI modeltraining and that privacy powered byopen cluster management and federationlearning and thanks for listening if youlike to learn more here is some linksand uh or feel free to reach outyeah thanks2025-04-15 22:02:50.382593plemented with a single GS um LBCRD to enable the global load balancingi'll show one of those after it isvendor neutral it's environment agnosticand it's a CNCF sandbox project we haverecently applied for incubating u andI'll talk about that in a little bit webelieve it's the only cloudnative uhKubernetes global load b loadbalancer so it provides an independent ugslb capacity to any ingress or servicethere's no dedicated management clusterso that there's no single point offailure basically what's happening isthe operator is doing what an operatordoes and it's checking uh the health ofits it's checking the health of theapplications through the service throughthe ingress using normal things likeliveness readiness probes um and thenit's updating DNS right and it's sayinghey this particular cluster theapplication on this particular clusteris ready to serve so if it's not thenthe other cluster is going to notice itbecause one other thing that theoperators are doing is talking to eachother to make sure that um they know thestate of the world so it's Kubernetesnative application health checks and itcommoditizes GLB for Kubernetes so it'susing all the same things that you knowabout already you don't have to reallylearn anything new you just have to plopin a a YAML file as we'll see in aminute this is just a diagram that showswhat I was talking about a minute ago iput some little stars next to theimportant components that are comingwith KGB i talked about the controllersalready you can see this cross uhcluster sync polling going on there soum if one if this region A cluster isdown region B cluster is going to knowit's down update DNS and then nobody'sgoing to get sent over to region Acluster until region A cluster uhresolves whatever its problem is theother two core components in there thatyou'll see are core DNS which is um serserving the uh DNS requests and externalDNSum which is enabling zone delegation sou there's a number of a number of DNSproviders that you can use right sothere's route 53 there's infoblocksthere's RFC 2136 all thesethings here's the CRD uh this is there'sa couple of important points on here soyou see in the spec section there's akind there's an ingress virtual serviceum you know what's happening withingress there is a gateway APIimplementation which is already outthere it's not ready to be used yet it'sin a it's in a pull request but that'sone of our plans for next quarter uh andin the strategy section you'll see uhthe different types so they're thethings that you'd expect there'sfailover there's roundroin there'sweighted roundroin there's goip thissomething that I talked about earlierright so you could tie it to um youcould say that this particular clusteris serving this particular thisparticular region and what's happeningis when you plop this CRD in there orwhen you plop this GML in there uh theoperator is taking a look at that andthen implementing the appropriateresourcesso KHB has been around for about fiveyears uh we've been a sandbox sandboxproject for four years uh we've set subsubmitted our incubation app about 5months ago and uh coming up in quarter 2is going to be GCP DNS gateway API as Ijust mentioned and a docsoverhaul the we are working towardincubation we've got a number of publicadopters two-time security slam finalistperfect CLO monitor score maintainersfrom Absa Abbound Katify Accenturecontributors from Millennium BCP etc andjust some vanity metrics becauseeverybody likes those so help get usover a thousand stars this this CubeConwe're at 961 right now one thing I wouldlike to call out is we have uh regularevery two weeks community meetings uhthe next one is tomorrow it will be our66th community meeting slam all of thesethings you can star us on GitHub you canread the docs you can join us on Slackuh and please come see us at the projectpavilion Thursday we're going to bethere u in kiosk 4b from 2 until 5 acouple of the maintainers will be therea couple of uh companies that are usingKGB will be there and we'll answer anykind of questions that you have thankyou2025-04-15 22:02:50.921019 ��,#�CAoo1wqb9_Whcokay hi my name is Andy Anderson haveany of you heard ofKubstellar all right great we got ataker in the back hey all right sokeeping with the IBM IBM Red Hat themehere this is Coupe Stellar coupellar isa way to manage fleets of clusters withworkloads and deploy them easily to yourremote works workloadsmy name is Andy Anderson and for thisparticular conference we came up with acool new infomercial did any of you seeour infomercial last year at NorthAmerica all right you're in for a bit ofa treat any Kubric fans out there spaceOdyssey all right good[Music]oh[Music]no right then here we go[Music]launching Coop Stellar[Music][Music][Applause][Music]me allright so we've got other funnycommercials like that we're looking forsomeone to do a Marvel one next year sowhat we're trying to drive home here iswe're big on remote uh workloaddeployment so you can centrally defineyour workloads in a coupe stellar coreand then use your regular deploymenttools Helm Argo CD Flux etc to get thosethings deployed so we make disconnectedoperationspossible i always like that meme so howdo we do that we put a control plane ininside of a control plane if you look ata standard API resources list forKubernetes clusters you see a littlealmost 300 API resources for CoupeStella we only have 40 at the core andso that allows us to reach and and workwith many different types of remotedevices so what you simply do in Kubellis you define your inventory you defineyour workloads okay so inference clusterfine-tuning training clusters add yourworkloads olama VLM seem to be favoritesthis year and then they had your bindingpolicies and voila they're deployed wejust had a a complete set of new internsthat came on this semester six of themin total you heard the harbor folks talkabout it incredible work they've donecome by the booth uh tomorrow and checkit out brand new user interface entirelyborn from the uh the contributors thatwe got from the CNCF internship programincredible work they've doneoo yeah keeping with the space theme sosoftwaredefined farm is some of theintegrations that we've done here at QArgo workflows and of course the UI andall of that and we will be showing moretoday you're here tomorrow we'll be inthe project pavilion come by and take uhtake a look at what we've got we've gott-shirts too you can maybe you can scoreone ofthese and if you'd like to join usplease do so at the link here scan theQR code uh just by by way of you knowlinking this back to other presentationswe do work uh quite heavily andintegrate with the OCM folks so we'reactually extending them to work morewith uh remote workloads remote uhremote clusters and and the like sothank you very much for your time andattention today i hope you stop by ourbooth tomorrow and tech have a look[Applause]2025-04-15 22:02:51.543878�1�+#�AYMyrcqZ2sbUhi welcome everybody to London i'mBradley Anderson i'm a community uhmanager at uhKGB um I'm here today to talk to you alittle bit about KGB if you don't knowmuch about it um and I'm going to tellyou a little bit about uh where we'regoing and how you can getinvolved i'm not seeing a timerhere hey there we goso yeah this is going to be our projectupdate for for this particularKubeConum KGB is a global service load balancerso think about if you've got um anapplication you want to service it inmultiple u multiple geographies maybe EUmaybe United States maybe China um youwant to make sure that that thing isalways up and you may want to make surethat people from China are gettingsomething closer to where they live sothat there's low latency you want tomake sure that somebody in the US isgetting something close to them so thatthere's uh low latency so KGB is uh isOSS it's following the Kubernetesoperator platform i'll show a couple ofthings after uh with that so it'simplemented with Kubernetes operators soit's things that you know about alreadyit's imPtion for our info right andit's been a tedious task to debug themand per for me personally before comingto this demo I was doing so uh it isreally tedious task and what we aremoving towards more of collaborationwhere team can collaborate where we canreduce our uh work of these debuggingfiles and automate these things right sothis is where mish comes into thepicture so mish is a cloudnativemanagement platform which provides you avisual interface where you can createyour infra without writing writing anyaml file at all you can just drag anddrop you can use the drag and dropfacilities and drag and drop like poddrag and drop the service connect themwith each other do some configurationand hit the deploy button so this issomething what we call a design you caninvite your peers you can invite yourteam members to collaborate with youreview with you and share with othersteam members as well great thing aboutit is is that mish is intelligent enoughto understand if a pod can make aconnection with service mesh or not if afont can live inside a name space or notso this is what ontologies mean in mytitle so mish this is something we Icall it as a context aware policies forour designs mish also have a servicecatalog where you can you know use theexisting infra if your team alreadybuilt something you don't want torecreate your infra right you can justsearch uh in within mishi what your teamhave built you can reuse the infra andyou can deploy it mishi supports morethan 200% integrations so if you go onto the CNCF lens space don't worry youwill find most of your technologiesalready supported on the machine so thisis how something mish do a policyevaluation uh I'll not talk about howthis works so mish have a models so whatactually does is like it automaticallygo to the artifact hub grab all thethings put it together in a folder wecall it as a model so AWS is a modelinside AWS there can be many componentslike EC2 and others similarly for thecubit inside there can be p service masssorry services so the relationship islike if a pod can make a relationshipwith service mass or a pod can make arelationship with persistent volume ifyou connect your pod with a persistentvolume mishy will automatically create avolume claim for you so yeah this is uha polish evolution which happened withOPA i'll move to the next one souh okay so this is how actually it lookslike i have created a very simple designuh I created my cubernetes cluster itried to do some experiment with it andit got busted so I will not able to uhdeploy my design but this is how itactually works uh you can just searchthe thing that you want let's say forthe pod and you can just drag and dropyour pod to the canvas and once you dragand drop do some configurations andafter that you will be able to see uhhow it already put the pod inside myname space so you can go in uh in theactions hit the deploy button and deployit will deploy in your infra you canalso connect multiple cubin cluster withyour uh mach so you don't need to worryabout if it can only connect one it canconnect with 50 100 as Well so you canclearly see how many clusters are activeright now or not once you hit the deployyou can also do the dry run beforedeploying you just want to make sureokay my design is correct or not sothere are multiple errors coming in mydesigns which I need to fix beforedeploying it as well so uh this is uhwhat mish do we are trying to simplifyyour cubin experience and buildingtowards where you can go collaboratewith your team and walk toward where youcan reduce your time on debugging thesefiles and build something great for yourcustomers so uh how you can start likeyou can go inside you can just join ourslack channel we would love to have yourreview uh we would love to have uh yourreview on how was your experience withthe mystery as well I'll be there and uhyou can join our you can scan this QRcode for our website and let me knowI'll be there in my in our selectchannel thank you everyone[Applause]2025-04-15 22:02:52.207257 ��.#�mA9q9oTJUqQoQall right Hello everybody My name isJoshua Packer and I'm on the steeringcommittee uh or I'm a steering committeemember for open cluster management Andtoday we're going to talk aboutscheduling AI workload among multipleclusters or fleets of clusters And soopen cluster management that I mentionedis a CNCF sandbox project Uh it's beenin sandbox for about 5 years now It'sbeing actively contributed to by bothRed Hat and a number of other customeror companies And uh so whatT��-#�cAIcYwKgAMXuEheyeveryone uh I'll talk about Kubernet onontologies with Mishri today the titleis pretty interesting in its own way soI'll start with my introduction soum okay I think this is not working soyeah I'm developer advocate at DigitalOcean and a maintainer of a projectcalled Mishri which we will talkingabout today and Mishri is a CNCF sandboxproject right nowso what is cloud native management wehave seen that we are writing YAML filesfor the past 10 years for everyconfiguraR it is isit's a hub and spoke topology You have acentralized inventory at the hub You'realso able to define your workload atthat hub and then distribute it out toyour fleet of clusters And so how doesthis work Well it starts withregistering your clusters into that hubthat helps put it into inventory andmakes it available by placing an agentand add-ons for those agents onto thoseclusters that you just uh that you justregistered We're then able to take thoseclusters and you're able to group themtogether And so the CRD we use for thatis called manage cluster sets but forthe sake of this we'll just say we areable to group your clusters fordistributing workload And so how do youdistribute workload Well workload isdistributed using a manifest work CRDThis is a CRD that allows you toencapsulate other Kubernetes resourceslike the core ones such as a Kubernetesdeployment or a replica set but alsoexternal CRDs that you might add like anArgo CD or an Argo project applicationas an example Um add-ons are what we useto allow you to work with that workloaddefinition And so the add-ons arecontrollers for like policies could bedeploying the Argo CD application etcAnd then the magic in all of this isplacement And what placement does it's aCRD it allows you to dynamically filterdown the clusters that you want to applythe manifest work or the workload to Andso how we do that is you can filter onsimple things like laser or laserslabels You can also do cluster claimswhich allows you to define resources onthose spoke clusters and they'll bepercolated up and used for filtering Wealso have something called placementscore that allows you to score thingslike how much resources would beavailable on the system And then we haveavailability Is the cluster actuallyonline Therefore should I placeworkload And so we said we would talkabout AI And so some of the integrationswe've done are with uh Q is the firstone And so in this case we were takingthat placement concept and figuring outhow dowe okay we were take I I'll just keepgoing Anyways we were taking Q and wewere using placement which is able touse a label for a GPU type And so we'reable to take that and decide where doesfeed that information to Q So Q takes itto build its multiQ config and itsmultiQ cluster resources and that allowsQ to figure out which type of GPUresources available on a given clusterSo we can place the jobs there on theindividual nodes The other integrationwe did there was with the placementscore that I mentioned And so that'swhere you're able to take the amount ofGPU resource using an add-on calculateit make that available to the hubcluster and then the placement is ableto filter on which of the clusters hasnodes with the most GPU available Againwe put that into the multiQ and Q isable to then decide which clusters whichnodes do I put the AI workload on thathave the most resources available forprocessing Uh the last oneis it's not switching on me Here wego The last one is federated learningNow I'm not going to get too into thisbecause someone's going to talk about itin about 30 minutes and I only have 50seconds left Needless to say federatedlearning is about processing andbuilding your AI models on your remotefleet maybe keeping them in the data inspecific data centers And open clustermanagement with placement allows you toput that workload where you need itdefine the requirements for federatedlearning that are needed to build andprocess the models in the fleet And soit's a good catch And so it's not justfor AI but it's for applications andservices in general If you want to bringmulticluster to your app or your app tomulticluster then open clustermanagement is the place to do that Andso if this interested you please comevisit us We're atopenclustermanagement.io and uh yeahwe'd love to see you and give it a whirlWe have lots of demos Thank you verymuch2025-04-15 22:02:52.761572 YY�#�/#�AJe6GIoagHvUi'm here to talk about a multi-archchcube and containerized data importer onuh multiple architectures obviously uhmy name is Cheryl Filicus i'm at IBMsystems working as a Red Hat partnerengineer for about five years now beenworking on this project for about twoyears and uh today's discussion regardsthe upstream open-source aspect of theproject only so that my uh boss doesn'thave an aneurysm okay what is Cubert andcontainerized data importercubert is a is a uh is a way to run KVMvirtual machines in a container uh thisis often used in uh uh Kubernetesclusters you can stop and start virtualmachines with a YAML file reconfigurerelaunch existing VMs it's got a nicetemplating feature containerized dataimporter uses uh persistent volumeclaims on disks for cubevert VMs by wayof data volumes and it can also uhimport clone and upload your uh attachedstorage note that CDI import clone andupload are supported for VMs which areamong other storage options storedexternally in VMware installations viathe virtual disc development toolkit onx86 and on x86 alone this this uh willcome in to be important in our nextslide uh what do I mean by differentarchitectures i mean uh your x86machines that are in widespread use indata centers across the world uh theyrun Linux RH Core OS you've got a lot ofVM templates a lot of VMs for them uhcontainerregistries galore uh and uh Kubernetesobviously runs really well on these arm64 you've got all of those things exceptalso they run at about half the powerper CPU cycle they're reducedinstruction set run on uh uh L excuse melittle Indian encoding just like x86 anduh these these are your usual uh Mac uhRaspberry Pi cell phones and edgedevices so this is out at the edge veryoften but also in big server rooms aswell these days okay what is S390Xarchitecture also known as the IBMmainframe and it's like do we use IBMmainframes anymore and the answer isabout 97% of your credit cardtransactions are processed by an IBMmainframe so that yes they are in wideuse uh they also run at low power theyalso uh are reduced instruction settheir big Indian which uh is reportingissues they run then there's a many manyuh featuresthat in addition to porting say uhKubernetes andor uh Cubver andor CDI tothe S390Xuh architecture you've also got allthese other wonderful features that youwant to enable um in the course of thatso that if you get a specialized VM ordevelop a specialized VM it runs easilyon there okayso what do I mean by multi-archch okaythere's multi-arch which means that youcan run this on different machines withdifferent architectures right there'salso if you have a cluster you can uh aKubernetes cluster you can have workernodes that are on maybe your compcontrol plane is on x86 but also youmight have some worker nodes that areARM you might have some worker nodesthat are uh on the mainframe so this isuh opens up some interestingpossibilities especially in that you canpull in VMware uh uh virtual machinesinto your x86 nodes run them take themapart put them back together templatethem and convert them to S390X or ARMvirtual machines which is not somethingyou get with VMware and it also givesyou a nice uh management platform inorder to uh uh do this in a regular wayyou don't have somebody off in a roomsomewhere doing a VM conversion and thenhere's your VM it's it's transparentit's reproducible it happens you can uhuh it's declarative it's got a YAML fileassociated with it you got all thosethings so um I will uh do we have anyquestionsquestions okayum this obviously is not me alone thisis a cast of thousands um it's uh RedHat uh the IBM Burlan lab in particularwhere I get to go next in my travels andum uh Ryan Hollley from Nvidia and uhARM is involved as well and also becausethis is an open source project you twocan uh uh contribute let me just go backone if I since I've got 10 seconds Iwant to mention that uh related projectsare ISTSTEO and uh node featurediscovery 3 2 1 Thank you very much2025-04-15 22:02:53.260545 zz��0#�=AziQRTuDCtuMuh my name is Hungai from uh Huawei andI'm one of the maintainer of Kamadaproject so uh uh any anyone who uh heardof a Kamada project please raise yourhands okay thank you thank youum okay uh Kamada short for uhKubernetes Amara it's a designed for uhmanaging your applications across uhmultiple Kubernetescluster and the uh the architecture ofcommada is highly uh similar toKubernetes it has um uh API server uhfor managing the request and uh thescheduleuler for selecting clustersbased on quite a lot a lot of policiesand the controller manager for uhpropagating your applications to thememo clusters uh commander can manageall kinds of kubernetes cluster so nomatter the cluster is from uh uh fromcloud provider and uh uh from your datacenters it can or they can join uhregister tokamala uh so uh Kamada provides a lot offancy uh features uh in in addition tothe basic cluster management and uh umand uh workload uh propagation uh theother features are used a lot uh uhincluding uh multicluster servicediscovery and uh uh multiclusterresource view and so onso uh Kamala was uh open sourced at uhin 2021 and become asens project at the same year and thenmoved the levels to uh incubation at2023 and Kamala joined the efforts ofmore than 700 contributors and and fornow we have uh uh 36 public uhadopters here is uh uh the the the listof public op uh adopters they are usingcommada in pro in productionenvironmentsuh some remarkable adopters you mightheard of are uhtrip.com and uh uh red nose whoseChinese name isShiaunu and the both of them are runningcommada to uh managing a large scale ofa communated clusterso uh there are several talks uh at uhthis coupon and uh the Bloomberg ha hasuh contributed a lot to this project nowBloomberg is using Kamada for managingthe Flink deployments and uh running AItraining jobs on Kamada and so if youwant to learn their practice and so goto the session and talk tothem also uh you uh you can find us atthe commada booth we have a three uhthree morning booth uh so uh pleaseplease go go and talk to the maintainersuh we are glad to help we are glad tolisten okay that's all thank you[Applause]2025-04-15 22:02:53.889040 ��V�1#�eA8-ovtwX2l7khi my name is Ariel and a few years agoI was working at a company with a largecomplex infrastructure we usedcrossplane and Argo CD as our controlplane and infrastructure ascodetools but not everything could bemanaged through them some of our mostcritical systems were internal and hadno existing provider one of them was ourinternal database provisioning system ithad an API but no provider just raw HTTPendpoint so what do you do when theprovider you need doesn't exist abandoncrossplane altogether spend weekswriting a brand new provider fromscratch or go back to manual work wekept working around it but it didn'tscale and it didn't fit into ourinfrastructure as code workflowsinfrastructure as code has transformedthe way we manage infrastructure insteadof manual provisioning resources are nowcreated and updated declaratively makingeverything more consistent reusable andscalable but while we've solved thisproblem for most technologies internalservices remain a major blind spot theissue is not unique to databases manycompanies rely heavily on their owncustom internal serviceswe're talking about things like secretsmanagement feature flag management quotamanagement all criticalnon-standardized unmanaged servicesbecome a major challenge even when wetry to do something as simple as aversion update without infrastructure ascode management we have to trigger APIcalls manually hope no databases aremissed and coordinate across teamsbecause no one has clear visibility intodatabase ownership even worse noversioning no audit logs no history noway to track or understand the currentstate of resources there is nostandardization and no guarantee thatthis change is appliedeverywhere and without declarativemanagement we fall back to manual workinteracting with internal APIs by handit might start small but it quicklyspirals into an unscalable mess thiscreates se the team some tries to keeptrack in spreadsheets things like howmany instances exist where they'redeployed who owns them this createsseveral issues the data is often out ofdate there's no automation so you can'ttrack drift or enforce changes and itjust doesn't scale the more services youhave the harder it is to manage or trustany ofit we've been already usinginfrastructure as code to manageeverything else so how do we bring thatsame consistency to internal serviceswhen I looked at existing infrastructureas code tools none supported genericHTTP calls so I built provide HTTP tofill that gap i designed it to let usersdefine API calls as crossplane managedresources making it possible to manageanyAPI with Provado HTTP a single YAML fileis all you need crossplane andKubernetes handle therest so what can you actually do withthis first it lets you manage internalservices second it gives you a way tocraft custom providers fast if it has anHTTP API provide HTTP can manage it noboiler plate and no code generation andfinally it's great for integrating toolslike Slack but the most exciting partwas seeing how the community startedusing it in ways I didn't even expectinternal tooling SAS automation tokengeneration and much more it's beenincredible to see the creativity overthe communitygrow infrastructure as code shouldn'tstop at cloud resources we can nowautomate anything getting started iseasy as you can see you can do it in asinglecommand and I'm excited to see howprovad fits into the broader future ofcrossplane the team is hard at work onthe next evolution crossplane v2bringing even more powerful capabilitiesnick and Jared will be sharing moreabout it during our maintainer tracktalk on Friday don't miss it for teamsthat need even more flexibilitycrossplane also provides providedKubernetes and function Go templating sowhether you're managing internalservices cloud resources or third partyAPIs Crossplane and provide HTTP canhelp you provision it all with thisprovider problems expire thank you2025-04-15 22:02:54.502835Yce mesh is today but whatyou may not have known is linkerd wasthe first service mesh we're the oneswho created the word service mesh andwhat we do is bring security reliabilityand observability to your workloads inthat order no exceptions uh all in anultra light package developed inRust again what other regardless of whatother ones may tell you linking is stillthe easiest to use and most secureservice mesh out there it just workszero config you add an annotation toyour namespace and everything's meshedit's ultra light it's simple and it'ssecure by default with MTLS um you canlook up all the the statistics on onlineyou see our our uh control plane ourdata plane take a very minimum footprintand again because it's based on Rust butregardless of that today I'm here totalk about how we manage CRDs in the APImanagement world we use CRDs most peopleuse CRDs kubernetes as a wholeencourages CRDs kubernetes is nothelping us as a whole with CRDs thoughum we use shared CRDs this is mostly inthe gateway API project right now butthis is just the beginning we'll seemore in the future and we try reallyhard to keep our CRDs simple one of themain goals of linkerd is to keep thingssimple so when we start developingthings the first question we ask is howis this going to affect people fromdevelopers to SR and anyonebetween again we use CRDs Kubernetesencourage CRDs but it's really hard toget a new API in the core of Kuberneteshas anyone has anyone done it anyonetried any maintainers here doingthat ah there's one there see verydifficult we only have one in the crowdof of 300 people 400 peopleUm it's really easy to add CRDs thoughand we can go overboard who's who'screated their own operator and CRDs inhere who has gone overboard with theirCRDs or who admit that they've goneoverboard with their CDs I may say rightyou know yeah um it can be unpleasantright versioning CRDs is really hard umyou know we have the little versionfield in our CRDs but it doesn't reallydo exactly what we've always thought itdidum you know we have things where we canyou know create web hooks or haveperfect backwards compatibility who hasalways maintained perfect backwardscompatibility and yeah that's what Ithought um so lesson learned is it canwe got to be very very careful what weship as an API it should be all what ahuman needs to actuallyuse again we use shared CRDskubernetes only manages the core API CRDor APIs for you it does not manageCRDs this makes shared CRDs tough whoinstalls them who deletesthem is it one project or the other whatif you delete one project deletes theCRDs it relies on does what is yourapplication going to do it's not goingto have a great time lessons learneddon't try to install shared CRDs foryour users it seems like that's theright answer but it's notit causes problems in the long term whenwe have multiple projects using one likegateway API and again this may be theonly one right now but there are goingto be ones in thefuture we try like I said we try reallyreally hard to keep our CRD simple whenwe make decisions our first question ishow will affect developers and SRRES andeveryone inbetween configurability is the enemyright if you know adding things no oneneeds has a real cost lessons learnedapi should be tailored for humans whoneed to use them if no human needs itdon't add it to yourAPI take a good example of URLs do we gohttpsgoogle.com443 no we just know wedon't need to specify that adding portnumbers doesn't help anyoneapis can be unambiguous but withoutbeing fullyspecified annotations and I got 30seconds to speed up uh root of evil orflexibility probably both annotationsare great for experimenting leaderinktakes full advantage of that annotationscan be fine for humans the biggestconcern is validating them we use webhooksum and and it's not native right nowthere's no way to validate that inKubernetes and fixing it may be hard ihave to speed upapologies if you only remember one thingas I'm speeding up two thingssorry apis need to be tailored to humanswho needthemperiod it seems simple but trying toinstall share IDs for users hurts themmore than it helpsthem learn from our mistakes and embracethese ideasum if you haven't heard we have aservice mesh academy monthly uh scan theQR code uh learn more about service meshwe we cover other things besides that igot to speed up want to continueconversation come find us at our boothif you disagree with some of the thingsI say or agree with them come talk to meabout it id love to know your opinionthank you[Applause]hello everyone my name is Vadim Mawa i'mone of the co-maintainers of projectharbor and in the next four and a halfminutes I would like to share with youhow we are using the project uh orelephics mentorship program to ouradvantage and how it's worked out for usuh in the last two and a half years soit's a brief recap of two and a halfyears 12 mentees later and what welearned from that and how it works usfor uswell uh to for the context if you don'tknow it the LFIX mentorship program is aprogram run by the LE Foundation andCNCF and it connects the mentors andmentees together to work on a project uhunder guidance supervision so theyconnect on a platform those two partiestogether and they provide a financialcompensation for the mentees which isnice um who can apply everyone can applybut what we see from the applicationthat we get is that mostly studentsapply um from the STEM sector and mostlystudents from regions where there isyoung population which is uh India notIndia but I would say I Asia and Africaand um when we started 2 and a halfyears ago you know when you get like 20to to 60 applications per per term termyou need to review this there's like lotof CVS that you need to review conductinterviews and you need to zift throughall these AI applications that you getnowadays and ifyou are attempting to write anapplication with solely AI don't do itit everyone sees it everyone knows itand you will get directly discardedright so don't do it even if you don'thave much to say don't say much rightum so but even given that we can zip outa lot of applications there's a lot ofwork to do right and what we found outover the course is that people who areinterested in open source and they wouldlike to know what they actually signedup for right so they try to find up findout like what do I sign up for what isthe thing doing that they're doing whatis hard doing how does it work how do Iinstall it how do I run it what problemsit solve and so people tend to askquestion about the the product and basedon the engagement and the commitment andthe incentive in we select ourcandidates right so it's for us theoutcome is that we have to review muchless CVs I mean we still review CVS butnot as the first process but as a lastprocess right just to be certain and Ithink it's also has less bias becausebased on the GitHub profile or the theSlack handle you don't see the theperson's gender race or whatever um andso far it was great work great for us wehad one failed completion in the earlybeginning we have uh one new projectmaintainers coming direct from the mentymine maintainer program uh we have twoemployments one directly in the opensource world and one in in a corporateworld um we have two create two projectsthat we've been created as part of theuh mentorship program a z project ofharbor we don't let the mentees work onthe core product because it's a lot oftime consuming and also um often timesit's a bit critical right so you need toship features and here we can um work ona way that it's done when it'sdone and uh yeah for us it has been agreat success and I would say that youif you're a men mentor or you want tobecome a mentor for an open sourceproject you should definitely try thisout um but you know be aware that thetime frame mentees are working it's like3 months only you cannot enroll thewhole process of uh you know selectingthe the candidates yeah you need to findin innovative way how you can how we cando this and for us the the criteria ofengagement works the best um yeah if youhave questions reach out to me you canfind me online um you can find me herethank you very much for your attention[Applause]2025-04-15 22:02:55.128110 ` �BQ`�e�5#�A9mX9PvNNDjkhelloeveryone and welcome to the session myname is Morris Joner i am one of themaintainers and original creators ofexternal secrets operator um yeah I wantto quickly share what we've been done inthe past couple of years I guess andgive you a quick overview of how itworks and how you can use it to do zerotrust uh secrets management with ESO soESO is built around the assumption thatan engineering team stores all of theirsecrets within one secure vault that'slike the unde]�=�4#�3AlNovu7Kclhkhi me me again uh my name is E so I'mnow switching the role from notaryproject maintainer to ratify maintainerso today I will give you a brief uhoverview of ratify project and uh uh abit deep dive intoattachations so what is ratify ratify isa sandbox projectuh it has been around uh uh less lessthan one year I think it's uh have beendonated uh uh last year um I think uh ifI remember correctly it's in Septemberso Ratify is a pluggable verific umverification engine to saf\��3#�sAB7lpXPZPFoIall right everybody come on in grab aseat my name is Mitch Connors i'm aprincipal engineer at Microsoft but moreimportantly I'm a maintainer of the ISTOproject uh and have been for just aboutseven years now i was asked by mycolleagues to give a talk uh titledwhat's new in ISTSTEO but I'm actuallynot great at following instructions soI'm going to give a different talk we'regoing to talk about what's ISTSTEO sohow many of you in the room arecurrenttousers okay okay so I'm happy with thenumber but it's about 20% so this talkis not for anybody who just raised yourhand you all already know what's goingon with STTO but we're going to talkabout what is STO why you might want touse it and how you can get started withit so let's jump in uh STTO is a servicemesh and you've heard already thismorning from a few other service meshtechnologies throughout the CNCF we areby no means the only game in town aservice mesh has three primary functionsits job is to manage connectivitybetween pods uh security for thatconnectivity and observability let'stake these one at a time first securityuh all connection between pods if you'rerunning a service mesh and by I'm usingthe word pods here but we should reallystart talking about pieces of softwareuh whether that's a pod a VM a WASMcontainer or some other thing uh allyour traffic should be encrypted withFIPS compliant encryption algorithms uhit should be encrypted using frequentlyrotated automatically rotated PKIcredentials and integrate with the PKIsolution of your choice so that you'renot having to manually rotate thesethings and keep them up to date secretmanagement etc those certificates andcredentials should uniquely identify[��2#��AA5CyAZBUH1f8my name is Phil Henderson i'm a customersuccess engineer at Buoyant today I'mgoing to be talking about API managementin the CRD world and what Linkerd haslearned again my name is Phil Hendersonuh I usually there's social media stuffon this page i don't have any socialmedia so you cannot find me you can findme on GitHub and any of the CNCF Slacksat Phil uh other than that I'm in a youwon't be able to find me on theinternet what What is linkerd linkerd isa service mesh you've already been toldwhat a serviXboth the client and servercryptographically for every connectionin your service mesh or in yourKubernetes cluster uh and that meansbecause cryp you can use cryptography touniquely identify client and server youhave scalable off policy enforcement youno longer have to pass around the IPaddresses as identities of every serviceto your data plane to let it know whichIPs can talk to which other IPs insteadyour data plane is simply going to lookat the client certificate the servercertificate and check a list of allowedconnections so that's the securityaspect next let's talk aboutconnectivity uh a service mesh at itsheart is going to do all of your L4 andL7 load balancing controls whetheryou're talking weighted least con avariety of other load balancing uhthings that you can do it also allowsyou to route based on HTTP attributesand the most popular of these is ofcourse path-based routing this issimilar to what you would do in aningress gateway except that now you cando it for any application to any otherapplication in your service mesh or inyour cluster on that point uh mostservice meshes and ISTTO is includedhere do offer in ingress as well asegress in addition to east west trafficcontrol so whether the traffic is cominginto your cluster going out of yourcluster or bouncing around inside ofyour cluster you should have the samecontrols over its connectivity securityand observability lastly on theconnectivity front not quite lastly Imust have failed to save a slide uhyou're going to be able to do thingadvanced patterns like traffic mirroringfor debugging or re retries automated orfault injection etc uh let's go aheadand move on because I'm using up all mytime and then oh multicluster routingand discovery is also a part of manyservice meshes lastly observability uhyou're going to get all the telemetryfor every request in your mesh you'regoing to get configurable access logsdistributed trace sampling all of thisshould come more or less out of the boxwith any service mesh solution so nowlet's talk about why would you chooseISTTO for your service mesh needs uh thefirst thing my favorite thing about theISTO project is its comm communitycommunity i've been a part of thecommunity for many years now uh todaythe steering committee is made up ofnine different companies that ourtechnical oversight committee is made upof three different companies and we havein the last two years 25 differentcompanies have contributed more than 100pull requests to the STO project andthere's the list right there so if 10 ofthese companies you see on your screentoday were to disappear tomorrow wouldcontinue as a project uh while I'm veryproud of Microsoft's contribution we'renow the number two contribution to theSTO project which is a great milestonefor us uh if we were to stopcontributing tomorrow you're not bettingon Microsoft when you take out an ISTOdependency or when you install ISTSTEOyou're betting on the community of thecloudnative compute foundation and theISTO project uh which is a really reallyrobustcommunity other reason to to try ISTSTEout for your service mesh needs is easeofuse and I didn't hear anyone laugh okayso uh it used to be that this wouldwould have elicited a lot of laughteri'm hoping that that pattern is dyingout uh with our new ambient mode you'reonly running the proxies you need forthe STO project uh the onboardingprocess is you install STTO with onecommand you label whatever name spacesor pods you want captured with anothercommand and you're done so there's noneed to restart your apps or anythingelse you get started very quickly andvery easil easy with ISTTO and that ofcourse is generally available sinceNovember if you'd like to learn moreabout the ISTO project in about an hourwe're going to be starting at the farend of the conference center level threesuites 7 through nine our ISTO dayconference you can come hear more aboutwhat's happening this year in theproject also if you'd like to uh learnmore this QR code will take you directlyto our getting get started link onlinethank you for your time[Applause]2025-04-15 22:02:55.745573eguard thecloud native secure supply chain and itsupports uh signature uh not projectsignature and uh cosign signature uhcosign for cosign signature it includesthe k signature and the kitty'ssignature and it supports currently uhthe artifact tab spawn uh vulnerabilityreport in uh uh and also theattestationsuh the policy language it supports reggolanguage and we have the plan tosupport language and ratify alsocompliance with OSI 1.1 so it can pulluh the OSI artifacts uhefficiently so currently we have uh alsouh major cloud uh vendors adopt theratify so uh three key scenarios for uhratify the first one is ratify can uhused as external data providerintegrated with uh gatekeeper opgatekeeper for uh policy control uh sothis normally happening uh in when youwant to deploy uh workloads uh onkubernetes so ratify can work with uhgatekeeper to set up policies to writedata signatures spawn files andattestationsthe second scenario is uh uh roify umintegrate with the containerd at runtimeso for this scenario it's uh um uhhappened uh in in the scenario you wantto validate the signature uh or you wantto verify the container images on a nodelevel so for example if you uh want touh bootstrap a kubernetes cluster so atthat time the admission controller thefirst scenario it is not ready yet butyou have some cached images and you wantto make sure those images can be trustedso this is the scenar scenario two uhcan be used for that purpose for uhscenario two currently is in a proof ofconceptstage for uh the third scenario is touse ratify in the CI/CD system so thatyou can not only uh validate thesignatures but also you can uh enforcepolicies for validate other uh supplychain related artifacts such as spawnvulnerability reportsSo a bit deep dive into attachations soattestations is a statement it's asigned statement so the statement couldbe a spawn uh or could be avulnerability report and this statementuh is signed to ensure its integrity andauthenticity and on the screen you cansee an a typical example uh so this isthe OSI images uh so it itself has beensigned with the notary project signatureit also has a uh vulnerability report ina safer uh JSON format and thisvulnerability report also has beensigned with a signature attached and uhthis image also has a spawn file in spdxformat and uh this spawn file alsosigned with uh notary project signatureso why uh attestations are important uhwhy not just uh a signature is enoughfor contender images because if you signcontainer images you cannot ensure theimages have vulnerabilities right andvulnerabilities can uh be detected overtime even within a day right so you needto verify other supply chain relatedartifacts to make sure the images notonly trust but also complied accordingto your communist policies so here is atypical example uh at the admissioncontrol uh the scenario one that ratifyused with the uh opar um gateway keeperso you can set up the policies to uhverify not only the signatures but alsocheck the spawn file to see uh first uhwhether the spawn file can be trust byvalidating the signature then check thespawn file to see whether any um licenseissues that uh not compliant with yourorganization policies or and then afterthat you can also validate whether thevulnerability report can be trust if itcan be trust then you can validate thereport to see whether the report isfresh for example within a day whetherthe report have a critical uh CVE orhigh uh CVE you can um configure rulesforvalidation so you can basically set upthose policies based on your company'ssecurity and compliance uhpolicies and you can also scan the QR toknow more on from the not uh from theratify projectwebsite okay to learn more uh welcome tojoin our uh project Koski uh it happenstomorrow thank you[Applause]2025-04-15 22:02:56.366893rlyingassumption um now ESO runs as a workloadinside your cluster um and it reachesout to that vault and um fetches secretsfrom there and with that data it createsa Kubernetes secret object and fromthere people can use it by people I meanlike workloads you can uh reference itfrom an ingress resource you can ummount it as a file in a pot or consumeit as a environmentvariable now one thing that you shouldnot do is use static secrets or stackstatic credentials to reach out to thatvault um instead you want to leveragesomething um of the underlying platformto authenticate with that provider we'vebuilt that um and we're simply justusing um Kubernetes uh service accountsand we throw them against that vault uhtoauthenticate that's something that Iguess like most of the providers umsupport today um by providers I meanthat could be Adable Secrets Manager GCPsecret manager Azure Key Vault HashiCorp Vault and like all of the othersecret vaults that are out there uh asof today we support about like 30ishproviders so I guess most of most of usuh should be coveredum yeah and so the operator runs as aworkload in your cluster and then youcan have a secret store resource thatrepresents the provider and how toauthenticate with that provider andthere's another resource external secretthat represents the secret that issupposed to be createdum and because it's an operator andcontroller that runs it does this on aregular basis once an hour every 10minutes every minute to just reconcilethe state from the vault inside thecluster pretty simple um we have a bunchof more features built in the past umyears I guess uh zero trust we alreadylike talked about that um on top of thatwe have like secret rotation so you canlike securely rotate the secrets byhaving like different versions ofsecrets and um throw them at your um atyour workload so they can just handle itand maybe try try try both versionsthere are like multiple ways to do thatum I personally see external secretsoperator as a toolbox that you can useto um facilitate secret management andum yeah make it work for yourorganization for your context we do abunch of more stuff um secretdistribution across namespaces so youhave you can offer like a um a platformservice to your engineering teams um youcan pull secrets from other clusters soyou can push secrets to other clustersas well you can do templating um fetchaggregate extract secrets fromstructured data and everything is builtwith multi-tenency um in mind i have aquick demo um to show how this works sotoday I'm using Adel secrets manager ihope it's big enough umum yes nounfortunately I cannot make it biggerum sorry about thatum so maybe let me quickly explain thatso um we're having a um secret in asecret manager a DB user and a DBpassword um we're having that um that'salready there in place so we have asecret store that represents um asecrets manager in US East one and we'reusing a service account in the namespaceback end we use that to authenticatewith a secrets manager there are nostatic secrets there no staticcredentials we're just using the theJSON web token and then we create aexternal secrets object apply that throwthat against thecluster um and then we check ifeverything is ready and synchronized andif it isthere and everything is ready and syncedof course it's a pre-recorded demo umand then we can take a look how thatsecret lookslike and surprise we have a DB user anda DB password it's B 64 encodedum we can decode that as well and cansee okay the password is there minglemingle um and with that um that's itfrom the demo one more thing to say umso you can join our community meetingsthey are um every other week you canreach out on us on Slack in theKubernetes Slack in the external secretsum uh channel and on top of that um wealso have a project booth here around soif you want to talk about secretsmanagement in general um so feel free toswing by and with that thank you verymuch[Applause]2025-04-15 22:02:56.956945 **�R�6#�]AjmLuKkr4ndohi everyone uh welcome to Lightning Talkabout K3S who knows what K3S is can Isee oh my god i'm not sure what I'mdoing here but all right uh my name isOrland uh and I recently decided to tookthe role of community uh for K3s so whoam I uh I'm the community I hate theword manager so I I consider myself acaretaker of the community i'm a CSFambassador and by chance I work at SUSAas a technology advocate so what's K3Suh K3S is a sandbox project uh joined inLinux Foundation in August 19 yes we areso long in the sandbox so we are aimingtowards incubation it's not a fork ofKubernetes it's a super lightweightKubernetes it's a single binary under 7megabytes it's designed for IT IoT CI CDedge by some they say it's the mostdownloaded K3S uh Kubernetesdistribution also you can run K3S undertwo minutes so practically you can haveKubernetes running under two minutes andif you don't believe me I can show youthat super quick um so that's a a lookat home i don't if the network worksyeahwhoops i don't have anything right nowand if I install it give me asec of course it I'm cheating a littlebit because I've tried that a few timesi'm sure that I'm sure it's going towork but uh as you can see it's going totake like 10 10 seconds maybe 15 no 10seconds and if you run thatagain in just few seconds yeah as youcan see everything is starting to showup uh we we're going to have traffic andwe're going to have everything uh thatwe need to have a running Kubernetes sowiththat what else we do rightum so the road mapish and what's comenext of course we're going to do 133 uhwhenever it's released why I'm herebecause I'm part of the effort of thecommunity to reboot the community effortso we're going to do more stuff with thecommunity we're um we're going to do uhin the pipeline there are many Windowsimprove improvements we're replacing SonBoy with hydrophone for the conformancetest we're going to add NATS at somepoint but it's kind of timeconuming andof course we are looking for new ideasfrom the community so how to do that umuh we're looking for adopters becausefor a long time K3S is um it's superknown it's out there people using it butpeople are somehow not talking about itthat they're using it so if yourorganization is using KS and if you'rewilling to uh share that with the worldplease reach out to me you can find meall around the conference you can findme at the harbor booth as well i do somework there um so chat with me let'slet's work on this one and we canpublish all that so people know that KSis used and you're using it andeverybody's super happy the communityreboot um maybe reboot is not the rightword but I like to reboot stuff so westart fresh we we put new uh ideas andeverything is there so that's the the QRcode for the community page actually itwas merged 10 minutes ago really reallyI can show you that so uh how to get intouch with us uh you can get in touchwith us on the rancher slack or on CNCFwe have a channel KTS we have a mailinglist uh those are brand new so you canget there uh we have a user mailing listwe have a developer mailing list sodepends what kind of questions you havejust go there and and shoot um we'regoing to start hosting from next week acommunity meetings which we're going tocombine community meetings with uhoffice hours um those going to be thefirst and the second um the f the firstand the third Tuesday in the month um sothe we're going to do EMA in America'stime zone and then EMA in Asia so we canaccommodate pretty much the whole worldwe're going to upload everything in theYouTube channel that we've just createdand you can follow us on Blue Sky andMacedon so yeah thank you very much foreverything i hope we can see each otheron the community meetings and discussthe future of KS thank you2025-04-15 22:02:57.484864nothing on empty physicalmachines and you end up with a fullcluster or multiple clusters on thephysical machines we also a copyinfrastructure provider so if you arefamiliar with cluster API uh project umwe integrate with that project also thisuh meta cube stack is used by uh manydifferent u umum enterprise customers or enterpriseusers uh among them Ericson and Redheadand many others like SUSA for example orFujitsu and then the whole project uhconsists of five main components and wehave a few uh smaller components also uhbut we release uh we provide releasesfor these five main components so ingeneral about the growth of the projectuh we we had approximately five new uhmajor uh features uh since the cubeconna uh North America last uh last yearNovember and uh in the sandbox uh sincewe joined the sandbox it's like about 6570 of these major features we since lastNovember we released 30 uh four releasesacross all of our projects combined 140we merged 44 about 440 uh differentcomets and since uh uh we joined thesandbox about 5,500 these are all approapproximate numbers because folks workon the project as we speak and then wegot one new adopter since November wehave the same number of maintainers uh16 since we had uh since since last yearNovember and uh based on GitHubstatistics we had about 10,000 differentuh GitHub events reviews mergers comcomments commits everythingcombined so about the maturity of theproject uh I can report some seriousprogress so we established the projectin2019 in uh September of 20 uh 20 wejoined the sandbox we applied forincubation in December of 2023 we onceuh rewrote the incubation proposalbecause the uh requirements has changedby theCNCF and then our uh incubation duediligence review process started in inDecember of last year so end of lastyear and I can report that we are notyet in incubation state but we havepassed the adopter interviews and uh thewhole due diligence process have uhcompleted what we are waiting for isbasically the uh results of the duediligence process so what is uh new andwhat is upcoming we had two major uhthree three major uh security um issuesthat we had to handle not alloriginating from us but um but we had todo some uh actions for them thanks toour security team we have a dedicatedsecurity team uh as I mentioned wepassed the adopter interview stage ofthe two diligence process uh we have newfeatures like online database migrationfor for the uh nodes the the clusternodes we have a read only file systemsupport for for uh one of our componentswe we are adapting and accepting adesign uh to support the cluster APImulti-tenency architecture uh we also uhbringing in support for for OCI diskimages so not for containers but uhproviding uh full u disk imagesoperating system images via the OCIframework or standard and then we arealso introducing and we already haveearly adopters who are uh usingautonomous uh disk encryption for thesemachines so that's a new feature we arebringing and then uh a few fewinformation about us so we will have akiosk um in on the event on this eventuh it will be on Thursday in the morningshift so from 10:30 to to 1:30 p.m 10:30a.m to 1:30 p.m and the kiosk will be atthe 19A spot and then we are reallylooking forward to to hear about ouradopters and uh and uh to get newcontributors so please join uh check outour socials and join us we are on the uhKubernetes Slack under the cluster APRbare metal hashtag we have our ownmailing list security list we have uhmeetings public meetings every week onZoom uh we have our own YouTube channelwith multiple hundreds of videos on thatand then uh you can check us also onclone monitor and check out check ouropen SSF badge and all of these uhdetails about the project so feel freeto visit us and thanks for yourattention[Applause]2025-04-15 22:02:58.182024 -��:#�eAPcRORHC1NYYhey everyone I'm Brent Keller uh I'm atechnical lead for Tag Security thetechnical advisory group um today what Iwanted to talk to you about is reallygetting uh more uh accessibility anddiversity of roles involved in many ofour different groups uh whether it beprojects or uh other efforts such as thetags working groups etc and kind of someof the options there uh so if we look atthe landscape uh it's quickly growingscaling you might say uh what's workedfor the last 10 years probably won'twork for the next 10 years we haveprojects um joining the landscape uh ata greater rate uh projects need avariety of things if you work on aproject you maintain or contribute toone you might find you need support forsomething uh maybe something you haven'tencountered a a CVE submission uhsecurity related in uh submission uhinformation etc and you're like what doI do with this uh how do I respond uhand we have a variety of tags throughoutthe CNCF that support uh a lot ofdifferent things or can support andmaybe um not everyone is aware of thatand so for tag security uh we do avariety of things uh a lot of it isadvising right the TOC uh quitefrequently asks the different tags tosupport a project joining sandbox ormoving from sandbox to incubating etcwhat are the requirements for doing sowhat kind of checks and balances shouldbe in place to ensure that the theminimum thresholds are being met andthat the landscape is staying secure umand it can be advising the TOC inadvising the greater landscape rightworking with projects getting umfeedback from them of things that areneeded for support as well as advisingprojects on architecture on securityrelated uh information or design aroundtheir projects or things that they coulddo for end users feeling confident abouthow those projects could be consumed uhmoving over to assessments uh if youhaven't had a security assessment doneby tax security on your project uh we dovarious things from a self assessmentwhich is a you know self uh self-serviceproject assessment for security uhvariety of filling out thingc�Y�9#�kAdT-ShZYM3SIhelloeveryone Uh my name is E I'm currentlythe notary project maintainer So howmany of youknow notaryprojectokay How many of you sign containerimages okay Looks good How many of youuse images from Docker Hub directoryokaycool Okay in today's talk I will talkabout uh uh notary project a shortoverview and also our new feature signarbitary profilesSo not project it's a sense ofincubatingproject and in today's uh software worldso the cyber attack on the securing uhsub attab�a�8#�{APvKCzFaP3gMhi uh I'm not going to do slides forthis one i only had one slide so I'mgoing to skip it um I'm LimitriKarakasilis one of the maintainers ofthe Chyros project thisChyros let me start with a question aswell uh who knows what a single purposeoperating systemis oh my god three people okay we'regoing to need more minutes uh okay letme start with the definition like 20seconds definition so it's the oppositeof a generalpurpose operating systemwhich is your laptop operating sa�P�7#�YAcs68TjSAlTgokay welcome everyone my name is AdamRosman i'm one of the maintainers of theMeta Cube uh project and in the spiritof the first uh presentation by the CNCFfolks I would like to talk about thematurity of the MetaQube project what isMeta Cube quickly so this is a projectthat helps you deploy uh Kubernetesclusters on meta cube machines uh on onmeta bermata machines and you can managethe whole workflow from kubernetes andyou get an end toend full stack solutionso you have _ystemfor example so in a general purposeoperating system uh the system has toallow too many actions to the userbecause it can't possibly know what theuser might want to do but when we knowwhat the workload is we can limit whatthe user can do so we can limit theattack surface and make it more secureso that's not all but it's a gooddefinition for a fiveminute talk sothat's what Kyros is a single purposeoperating system a mutable system and uhI'm going to tell you more later if youwant to know more but let me give you aquick update of what we did the last 12months so first of all uh we became aCNCF sandbox project that's big news youalready heard what that means multipletimes uh but um yeah that's why we arehere and you're going to see more of ushopefully uh we implemented what we calltrusted boot it's a combination of threethings secure boot measured boot andfull disk encryption uh you can findmany resources online on what that meansbut uh it's pretty much what guaranteeswith Kyros that you're booting theoperating system you intended to bootand nobody has tampered with it and yourdata are safe from unauthorized accessoopsie okay um Kyros has always beendistribution agnostic means uh you cantake your uh distribution of choice andconvert it to a kyos image a kyros OSimmutable with all the features ABupgrades remote upgrades and all what wedid the last 12 months is weconsolidated our tooling we created newsimpler tools and now this process isjust one command away literally onecommand and you got your own version ofKyros based on your ownimages we have first class support foredge devices nvidia Jetson RagX forRaspberry Pies we are testing more as wespeak um so if you're into edgecomputing have a look atthat besides being distribution agnosticuh we partnered with the K0S uh projectand now Kyo ships out of the box withK0S support as well it used to be justK3S now you can just select KS when youinstall so more options for our users wehave an awesome feature which is uhbased on peer-to-peer technology and youcan spin up nodes with just the sametoken you can say if you want an HAcluster or what other options you wantand you're going to get uh a cluster uhautomatically um configuredum in actually some very strangenetworking scenarios you can try manythings and tell us how it went we alsohave a remote key management server soyou can have your devices on a field forexample uh being uh encrypted the diskdiscs being encrypted and you can havethe password stored on a remote keymanagement server if someone takes theirdevices and runs they will no longerboot so Kyros is immutable uh but youneed ways to extend that of course youcan always build a new image uh but youmay want to do that dynamically so nowwe support system DC extensions whichallow you to do exactlythat we're now pushing cloud images uhpublic cloud images for three differentclouds AWS Google Cloud and Azure andthey're publicly accessible so you cantry Kyros on your favorite cloud withjust a couple of clicks um and last butnot least you are community we reached1,200 stars for whatever that means butwe're seeing growing community adoptionwe have companies that are deployingcars in thousands of machines inproduction um and that's why we'retrying I mean we visited 10 differentcountries the last 12 months more than30 events just to be there with you totalk to you to listen to your needs youruh use cases um or complaints whateverso make sure you find us you can find usin project pavilion area we have a kiosktomorrow uh morning shift uh and we'regoing to give a couple of demos in thesame area in the center and if you findme outside uh you know kos t-shirt redshoes make sure you speak to me if youhave questions okay five minutes are notenough I can speak forever so thank youall that's all I[Applause]2025-04-15 22:02:58.909337ck on the software supply chainis quite frequent and cause severeimpact So developers also uh frequentlyask questions How can I ensure artifactsthat I use are from trusted source andhow can I ensure those artifacts are notaltered uh sincecreation So notary project answer thesequestions by providing standard basedsolutions and tools So uh we compliedwith the OCI artifact OCI specificationthe latest 1.1 so that you can managethe signature in the OCI compliant regryefficiently and we also support two ITFstandard based signature format one isJSON formatted uh JWS another is uh uhbinary encoded coy signatures and notproject also create additionspecifications so that you can buildyour own uh reference implementationbased on yourneed and this is quite importantespecially for much uh cloud environmentThis can ensure the uh interoperabilityand compability by uh using the standardbased solutions And currently we have amajor cloud vendor uh adopt the notproject solutions and the tools and alsosome popular open sourceprojects So how or when do I uh use notprojects so this is one typic uhscenario uh it is to ensure the uhauthenticity and integrity uh of OSIimages throughout its life cycle So umso when you acquire images from forexample Docker harp you need to verifythe signature to ensure they are fromtrusted source right So after youacquire those docker images and you wantyour internal team to use that you putinto a catalog registry So normally youwill do vulnerability scanning andgenerate the vulnerability reports andfor some cases you even patch thoseimages before the upstream has that uhuh new version ready So for those newpatched images or uh vulnerabilityreports you can use not project to signit find sign them and during the buildbefore you use for example base imagesfor build your own application imagesyou can wify the signature to ensure thebase images are trusted and approved uhof use by your organizations and onceyou create your own application imagesand you could probably generateattachations spawn wrong filesvulnerability reports you also can usenot project to sign them So before youdeploy your contender images inproduction you can also use uh uhvalidate notary project signatures byusing policycontroller So this is one typicalscenario Uh so not project now supportnew scenarios that to expand the signingcapability to arbitrary blob files Sothey are are in cloud native they are uhartifacts that are not distributed asoay artifacts yet So for those uhartifacts they are uh equally importantto ensure uh the integrity andauthenticity uh in the uh softwaresupply chain So um similar to oi imagesSo before I mean you create yourartifacts right and before thoseartifacts leave your trust boundary forexample your file system you sign themto produce a digit uh signature and youpublish the signature and the uharbitrary file together and from aconsumer prompt view uh they canvalidate the signature before they areusing the uh uhartifacts So notary project nowintroduced new command site to supportsign arbitary files Uh notation uh isnotary project tool uh co i2 notationblob sign to sign arbitrary filesNotation blob inspect to inspect thesignature with the certificateinformation Notation uh blob policy youcan use it to initialize the trustpolicy On the right side you can see aexample of a trust policy and you canfine-tune this trust policy based onyour requirements and the notation blockverify you can use it for uh verifyarbitrary blockprofiles Okay So uh if you want to learnmore about not project and you want tosee a demo so feel free to join uhmaintenance track on the uh I think inthe last day uh in the afternoon of thelast day Friday and we also have aproject uh kiosk uh since tomorrow Sowelcome to join uh and uh uh we can chatThank you[Applause]2025-04-15 22:02:59.606858s it reallyenhances kind of like enduser confidencein the project as well as the CNCF andthe TOC evaluating the project andsaying what information is pertinent uhas well as joint assessments Right uh Ithink each project's going to be alittle nuanced how does it uh handlesecurity how in-depth uh of personnel dothey have to respond to the securityreview of different changes to thesystems and uh we have a jointassessment that is getting a variety ofdifferent people involved and that issomething that is very much a tagsecurity uh maintainers contributors etcexperience working with a project handin hand to do that as well as kind of awhat this presentation is for is gettingmore people involved if that issomething that interests you if securityis what really gets you pumped up likeit does for me uh then you there's athere's a lot of chance for uh people tojoin us on that and then research uhkind of something that we have we havemany working groups um within the tagthat uh can help with kind of liketrying to provide a you know widespanpublishing of documentation publishing awhite paper things that support uh kindof enhancing the overall ecosystem andso that's what the tag does um butreally what I wanted to talk about wasgetting involved um for thosewho quite frankly uh do this in theirfree time or volunteer time or do it aspart of their their work commitments umthere's a lot of different ways to getinvolved and really what I wanted totalk about is you know how can we bringa greater accessibility and diversity ofinvolvement to the the landscape whetherthat's projects and getting involved uhone by one uh or you know kind ofjoining uh different tags differentworking groups etc to kind of find afind a home for everyone uh and so Ithink there's lots of room formeaningful cont contributions uhespecially within in the tag i think Idiscussed most of them uh accessibilityI think has really been fun to learnlike from others personally uh each eachproject when you evaluate their securityposture is different maybe there's youknow similar patterns used that we canlearn from but each and every one ofthem has a little bit of nuance and it'skind of taking that chat loop back offapproach i know nothing about thisproject and let's do this uh and sothere's so much room for learning and solittle barrier I hope to joining thatand saying you know what patterns areother people aware of that I can sitwith them and learn from them while thisactivity is being conducted and I thinkthat speaks to skills development um youknow kind of using this as anopportunity to enhance security skillsets uh threat modeling other activitiesthat are very pertinent to kind of everyeveryday workforce everyday enterprisesecurity um all the way down to you knowprojects uh and applications and andthen we just talked to genericallysustainability of security uh I don'tthink anybody is really going to argueagainst doing something more securely uhbut how do we how do we figure out whichsecurity initiatives really need toscale how do we build confidence in thelandscape being a a landscape of secureprojects um and reducing risk and quitefrankly get getting more end users uhcomfortable and confident with the uhprojects they're consuming uh so there'sa lot of room for wide reaching impactsgetting involved on an initiative thatis really meant to scale across theentire landscape and I have a blurb uphere but really what I wanted to finishoff with is just saying that security isa very much a shared responsibilitywhether you're a contributor on aproject a maintainer on a projectwhether you're a consumer end user of aproject uh and everywhere in between forgetting involved in making sure that wehave the diverse skill sets we need inorder to really enhance the landscape uhand enhance kind of the experience foreveryone involved from codecontributions to uh securitycontributions to documentationcontributions etc um so with that saidum if you are interested in gettinginvolved come talk with tax securitywe'll have uh a booth at the projectpavilion at 17A thank you2025-04-15 22:03:00.146602 ways ofmanaging assets in container registriesnow how many of you didn't know youcould actually store items other thancontainer images insideregistries the best part about thatmeans that you can take advantage of alot of the same tools that we'veleveraged and built over the course ofthe last decade in a new way so what isORAS oras is a sandbox project insidethe CNCF it allows you to distributeartifacts in OCI registries it alsoallows you to manage them which meansyou can then manipulate the artifactsafter they've been published as well aseverything you need to do to actuallyput them into a registry it allows youto use it as a CLI which the majority ofour users end up using or as part of anSDK um I'm actually speaking at the nextright literally right after this atArgoCon talking about how we integratedauras inside AR Argo CD to be able tothen start leveraging it leveraging OCIartifacts and OCI images within Argo CDit also has a number of of uh SDKlibraries in different languageseverything from Java Golang Rusteverything's out there you really youcan use it it's also Oruras iscompatible with many different popularregistries inside the communityeverything from DockerHub Quay ZotHarbor all are compatible with auras andmany organizations have really startedto adopt and implement auras insidetheir own tools workflows andprocesses now today's little quick talkis going to talk about multiplatformimages a multiplatform image means thatyou can distribute a single image orartifact and have it be compatible withmultiple architectures so I'm running aM1 Mac which means I can run my theimage on my machine or run it on atraditional AMD 64-based machine itallows you to then utilize the creationof both the OCI uh image and artifactformat if we need to we can supportairgapped environments that's one of thebest one of the better parts of aurasthat a lot of organizations are using iknow um like um like Zar from DefenseUnicorns i saw one of the guys fromDefense Unicorns as I was walking uphere they um they've used um a lot oftools and created tools for thedisconnected aircraft environments youcan use a to pull in artifacts andresources into those environments andpush them to uh your local environmentyou can customize the OCI artifact youcan add annotations other metadata thatyou can then then query and then be ableto process against it and then finallyyou're able to attach resources toimages so if you think about how we'resecuring our supply chain you can youcan uh add certain artifacts likesignatures sbombs to your artifacts toreally understand the providence of yourimages and and then so in amultiarchchitect format so let's createit let's just do a quick demo of how wewould go ahead and use auras to utilactually build uh a multiplatformartifact so the first thing we're goingto do is we're going to take a look atall the different tags we have in ourlocal OCI layer we're going to create anOCI multiarchc image using the uh ORUSmanifest index this allows us to take aartifact that was createduh in both o uh AMD 64 and uh ARM 64 andbe able to create an artifact manifesthow many knows what a manifest is amanifest list allows us to then allow usto declare multiple architectures ininside a single reference so I can go tomy my image dot my image latest forexample quaioabblock my image latest and go aheadand have it be compatible with oop withOCIformat we can then go ahead and take theimage that that was that was currentlyavailable to us and push it to aregistry because right now it's runninglocally or at least stored locally we'regoing to push it to the remote registryand have it be available to us and thenwe can actually now go into GitHubpackages and see that both versions areavailable for our consumption by ourusers nice andsimple and the way we can make it secureis as I mentioned earlier we can thenattach references to it this is anexample of how we can go ahead andreference an SBOM signatures etc on on amultiarchchitect image using aurasfive minutes goes fast i thank you foryour your time this morning and thankyou2025-04-15 22:03:00.758862 q�q��<#�=ArRExAhVI1nUhello everyone my name is Martin icurrently work at Kong and I'm ankumacontributor and today I will show youwhat's uh new in Kuma so just a quickrecap Kuma is a envoy based service meshit was built with uh multi-tenency andmulticluster support in mind we alsosupport uh by default brownfield uhdeployments so it's easy to integrate uhVM workloads into your Kubernetes uhcluster and into your mesh so what'sactually new since the last CubeConwe've added new kind data plane inpolicies uh we redesigned inboundpolicies API we've added ability todefine secrets on zone level we've addedoption to exclude policy from syncbetween zones and there was couple ofstability improvement so let's take aquick look at the new kind data plane soin policies uh they are basically mainconcept of configuration in Kuma youconfigure for example timeouts retriesetc using the policies and previouslywhen you wanted to select some subset ofworkloads in your mesh uh you needed touse mesh subset uh kind in the targetref selector and you were selectingworkload by tax those stacks were takenfrom uh data plane inbounds which couldbe problematic to understand and uh itwasn't that easy to use so now we'veintroduced new kind data plane when youwhere you are selecting the realresource data plane every workload hasits own data plane and on Kubernetes weare basically building data planes frompots and we the data planes have thesame labels as pots so it's easy toselect your workloads by labels alsowith addition to section name you cannow simply select single inbound uh toapply the configuration to which comesin handy with our new uh inbound uhconfiguration API so previously we wereusing the from section with uh targetref uh that selected the subset trafficthat should be configured and in mostpolicies you are only able to selectmesh uh with exception of trafficpermission where you will able to selectsome subset of uh clients to which thepolicy should apply uh so now we aremoving towards rules which is morekubernetes native and uh gateway APIsimilar API uh right now there is onlythe default uh field with theconfiguration itself but with theaddition of section name as I mentionedpreviously you can select the singleinbound and apply configuration to thatinboundum and we will be adding the morepossibilities to this app with thematches field uh where you will be ableto select subset of traffic for examplebased on method path or select subset ofclients for example using spy IDUh yep so that's basically all that'snew and cool in Kuma uh where to findout more you can reach out to our docsatkuma.io there are plenty of guideswhen you could try those stuff uhthere's a quick start that you can try iencourage you to play with it and if youhave more questions you can find meduring the CubeCon and we will be havinga booth on Wednesday and Thursdaymorning so feel free to to come by andchat thank you guys[Applause]2025-04-15 22:03:01.325044�}�;#�3A2zjxSKAkT9Ehey everyone my name is Andrew Black i'ma distinguished architect at Red Hat anda maintainer on the auras project justgoing to talk to you for a few momentsjust to speak about how we handlemultiarchchitect images and distributedthem using auras now how many of youshow of hands have used auras or know ofauras well basically it is a CLI and setof binaries for managing OCI artifactsoci artifacts is a space that I havebeen very interested in for the lastseveral years especially when it comesto a lot of the new cloudnative patternsthat have been coming up to speed in thelast few years it is really starting totake hold of why we're able to leveragea lot of the same technologies thatwe've used over the course of the lastdecade in cloud native for newd ��k�=#�ACx5c-IueP78hi everyone Sorry for the delay only tomy laptop Okay Uh I'm Wizo L from dolotand today I'm here to share the latestupdates on the s project sparuh spo is anall-in-one Singi solution and it can beused as independent Singi with uh richfutures and it could allocate secondarynetwork cards uh in conjunction withother SI and the spo perfectly supportsuh RDMA network uh for AI jobs uhespecially in multi-tenant uh itprovides complimentaryuh features such as RDMA uhobservability and rightlimiting Uh in past practices weencounted uh several challenges Uh onone hand the singi annotationconfiguration is to be written in theyamo when using RDMA devices It is uhnecessary to manually uh query and uh uhconfigure the it uh the RDM resourcename for each network which is uhterrible uh use experience uh on on theother hand the underlying network ofeach host is complex and a perfect aperfect network solution uh schedulingsolution for ports uh uh are not foundSo this networking schedulingrequirements span uh m multipledimensions So uh in the latest uh we aimto address these challenges using DIA uhspo DIA implementation could uhcompletely finish the allocation ofnetwork interfaces based on diversdiversified strategies in resource sizeobjects uh status for network interfaceis continually reported It includeskinds of uh interface information uheven such as PCI or find the GPUs uhRDMA network region Uh therefore itcould help achieve uh future rangenetwork scheduling strategies with theselector in the resource claim objectuh in the DIA implementation of sparo uhpot can use the resource claim toassociate the CI configuration Uh theexpected number of network interfaces isclearly specified in the uh resourceclaim object Uh additionally uh specspecific scheduling requirements can beimplemented using the selectors in theresource claimuh in multi-tenant environments with Macvillain or ISO we can uh consider aninteresting need uh particularly whenports require fewer than eight GPUs itis only necessary to allocate an equalnumber of RDMA devices to the port withGDRaffinity this is uh there's no way toachieve that with traditional methods Toaddress this we developed a future ofcalled uh dynamic uh assignment Inresource claim only a request for adevice class named the GPU affinityneeds to be declared during the NI phaseof port start up Uh SP detect theassigned GPUs and uh allocate RDM withGDI affinity ondemand Uh currently the node exporter uhonly outputs RDMA traffic for devices onthe host Therefore in order to supportmulti-tenant scenarios with MAC van orSOV spot supports outputting RDM metricsfor all namespace on the node byassociating the labels of port name Itprovides graphana dashboards based ondifferent dimensionsSo that's that's all for my presentationYou are most welcome to join us on uhkioska number 21A on Thursday for forthe discussion Thank you[Applause]2025-04-15 22:03:01.877809 ��m�>#�AONuxsPWXNUUso welcome everyone to container runtimeintensifies uh we are speaking aboutcryo today i'm Sasha one of themaintainers of cryo and I really hopeyou all enjoy this beautiful cubecon thenext days so what is cryo it is actuallya lightweight container and time builtexactly for Kubernetes so means it'soptimized for Kubernetes and we alsolike follow the Kubernetes release cyclewe implement the features directly forKubernetes but it's on the other sidealso OCI compatible this means that wefollowed the open container initiativeso if something changes there it willalso change in cryo it's part of theCNCF since 2019 and we graduated in 2023so we are still in party mood uh withrespect to that and we have more than300 contributors all over the worldwhich is like really nice and it's stillgrowing so we are looking forward toCryo 133 so it will feature support forOCI artifacts and on the other side wealso have direct support for the imagevolume beta queration which will come inKubernetes1.33 means we can then also like supportsomething like web assembly NI pluginsand we also have some experimentalFreeBSDsupport and of course I would like tohighlight all those bug fixesdocumentation improvements dependencyupdates deprecations and cleanups Ithink that's a huge shout out toeveryone who maintains this project sothank you for that um we are reallylooking forward to the bright future ofCryo let's look at the OCI artifactsupport cryo is now able to pullartifacts into a dedicated storage sideby side to actual container images andthis means that we can also list inspectand also remove artifacts now this iscompletely new and we can also on top ofthat we can use artifacts as imagevolumesso this enables like a huge amount of AIand ML use cases which we all lookforward that you show us how you use thefeature um but it also allowsconfiguration distribution for likesmaller artifacts and um like stackingthe content of container images so ifyou have a universal binary for examplein a container image then you can reuseit in multiple images by just using animagevolume so let's just provide a smalldemo of that you can now use kry cuddleto for example pull a zum profile whichwasn't able which which weren't yourable before so a ze profile is just ajson block in a single file and cryctlwill tell you that the image is now upto date but uh cryo will indicate thatit's actually identified an artifact andput this artifact into the local storageby using their corresponding configmedia type and also like manifest mimetype now if you do something like RCTLimages then you see that the artifactsare listed side by side to all containerimages you have locally available on thenode and you can also inspect them so ifyou're in cry ctl inspect image and thendo with this on the container artifactthen you get the status the informationand every metadata which is availablefor this artifactnow if you use an image volume like thisso you just create a Kubernetes bot youspecify volumes the name of the volumeis second profile and reference theartifact from the registry and thenmount it to the actual container likeusing volume mounts and then referencethe name profile and then the mount pathvolumethen you can create this part and youcan exect into this pot and you will seethat on the path uh / volume the secondprofile JSON is available and if youexect the pot and just look what thisfile contains contents then it's likeyeah you have the actual second profileavailable in the local in the localcontainer and this is our maintainerrequest to all of you if you would liketo play around with that and if you havea use case in mind then please come tous and tell us that because we arereally looking forward for future usecases to make cryo even with respect tothat and artifact handling it's stillexperimental but we really look forwardto your feedback for that so thank youall and enjoy this cubecon[Applause]2025-04-15 22:03:02.457890 &&�V�?#�eAl3mSvnpLGZYmy talk quickly today is Kubortonleveraging and extending CEL for yourclustersecurity real quick my name is RobertSuchia i am a Helm maintainer and I'mtalking not about Helm I'm talking aboutKuborton and I'm talking about CEL i'm aCNCF ambassador and I'm a director inthe office of the CTO for a companycalled SUSA if you heard of it but let'sget into it um common expressionlanguage this is a domain specificlanguage it's already within umKubernetes right now with the validatingad uh admission policies it's ideal forextending declarative configurations umit's fast it's portable it's safe toexecute um it provides a Kubernetes langlanguage feature as well um and it'sdesigned to be embedded right but it canrun on its own and it can be compiledinto WASM so I couldn't ask for a bettersetup than I got with the previousspeaker so thank you if he's stillhere so what is Kuborton kuborton is aCNCF project it's a sandbox project wetalked about where it's at in itsmaturity it is a policy engines whosepolicies are built in web assemblymodules so if you like web assembly youlike policies it's kind of a projectbuilt for you you can write your ownpolicies choosing your own languages umyour domain specific languages such asCell or RIO um you'll use a generalpurpose language such as Go Rust uh C#JavaScript we have SDKs to cover all ofthat and we actually have documentationso you can actually run and write yourown policies but we do recommend thatyou first um check out what policies wehave in ArtifactHub before you writeyour own and it's designed to protectyour clusters you can gain insight andyou can build modular solutions thatrun um and we're striving to be auniversal policyengine so it is a supererset ofKubernetes Cell straight from the sourceum it uses the same upstream cellinterpreter um the same add-ons andlibraries into cell compiled into cellpolicies it's backwards compatible withcell as wellum plus Kubort and host of opportunitieswhen gives you the when you expand whenyou expose into native cell right so youcan gain access to kubernetes resourcessix verification OCI registries networkqueries things like that and since it'scouparden you have a lot of extrafeatures in there um the process of rerecoccurring cluster informationresources etc withthat so a quick example what this wouldlook like if you were deploying aparticular policy you can have theopportunity to change and modify whatyou want to do on how you deploy thisparticular policy so instead of thatbeing embedded within a um policy itselfyou can actually configure it and setvariables on the outside with relativeease with that and this was what Cellgives us when we deploy it within apolicy about a minute left here um I amgoing to be at the project pavilion um Iam also speaking at contrib for helmhelm 4 is kicked off i know this is nota Helm talk but that's where I'll be youcan find me on GitHub um and I'll be inthe project pavilion and then my boothis uhS60 in the SouthHall thank you[Applause]2025-04-15 22:03:03.033664to do everysingle thing that it can do um it's gotessentially zero cold start time it'svery tiny uh and I mean very tiny andit's also portable so those are the bigmain benefits there now why do we useWOM in the cloud once again I said firehose here's your fire hose uh there is abunch of reasons why we use WASM in thecloud if you look at why people firstuse it which is why you've heardprobably um about it is that it is verymuch a browser technology to start offbut it is not only for the browser youwe have open standards sandboxed smalland fast like all those kind of thingswe've talked about um like you you'renot worrying about like all the imageCVES because everything's sandboxed youdon't have a lot of dependenciescontained in there um and so those areall things that we get from WASOM nowwhat I really want to talk about becausethat's what everyone's here for is theproject overview this is not everythingbut I wanted to call out a coupledifferent projects that are very WASOMfocused um that I they at least knowwell we have a project like Hyperllighthyperllight is a very very lightweightVM wrapper for those running functionsand other type things where they're invery very secure very constrainedenvironments cuborton is a policy engineum and extender for the part of thescheduling um that they do a lot ofstuff with WASM as well um for theirpolicies you have spin which is abatteries included very much gearedtowards the function as a serviceproject um those are all sandboxprojects then we have WAMC cloud whichis one of the projects I help maintainthat's an incubating project in in theCNCF and it is very much focused on thedistributed side of web assembly so youcan take an application run it prettymuch anywhere and have it have itsdependency satisfied by connecting allthese different WASOM componentstogether and then there's WASM edgewhich is a specific uh WAM or anotherone of those battery included projectsthat's has a lot of focus on AI rightnow it's a lower level WASAM runtime andyou also have beyond that like I saidthis is not it within and outside of theCNCF you have various DBs you havethings like envoy WASOM proxy and all ofthis if you didn't see when they werejust showing the landscape you can clickthe WASOM section and this pulls upthese are all the things using WASM insome way inside of the CNCF most peopleone of the questions I I always hear islike is WASM real are people using ityes people are using it i don't know howmany times I can say it people are usingit there's lots of stuff that's evolvingand I'll talk about that in a second butjust keep that in mind now in case youneed it these are some recent effortswe've done inside of the WASM workinggroup one of them is we're continuingefforts on WY cloud which is an effortto define um various standard interfacesthat are like the 80% use case they helpbootstrap people into what are calledWAM components and on the right issomething we worked on where we defineda specific format for storing WASOMartifacts within an OCI artifact um howit like all the information that'sstored in there so that's something wedefine and is available those are QRcodes that link you to either of thosethings now how can you get involvedwhich is the most important thing comesee us at the project pavilion so thetag runtime booth is there um if you'regoing to be around for the booth crawlwe will be there um and we'll have a lotof the the evening on that booth crawlwith W was folks around um come join ourmeetings that's the QR code you can comejoin the WASM working group it happensevery other week um you can help usimplement and define things like WYcloud and you can also come to the WASMworking group focus talk that I'm givingwith one of the other co-chairs tomorrowat noon um so we'll be we'll be talkinga lot more in depth about a lot of thesetopics and you can come contribute toanywhere in the WASM ecosystem that wehave out there that can be inside ofthings like the runtimes or any of theseprojects that I mentioned we need andwant your help so thank you very much[Applause]2025-04-15 22:03:03.634250 u Au�8�A#��'A3s8EdlTi9bkwelcome everybody to cloud n uh toCubeCon uh the cloud native community inLondon of course we're just getting setup here in a second and then Kathymostly will take you through the CNCFlandscape i'm mostly the pretty face uhso I hope you're all enjoying a greatstart here it was certainly a great vibecoming uh on the tube and people not allgetting off in Canary Wolf all thebankers but actually all getting off astop afterwards so that was fantasticwhat we're going to talk about hereright now is actually first a hands upwho has heard of the cloud nativelandscape the CNCF landscape okay whohas looked through every single icon onthere really okay fantastic uh your dayjob must be boring uh what we're goingto take you through here real quick iswhat is the landscape why it's out thereand uh perhaps a couple of projects uhthat we nitpick that you know you maybeshould watch out for that has gottensome tra has gotten some traction or isgetting traction and hopefully we'remaking this all a little bitentertaining so we ready okayprobably have to switch it outi'll figure it out thank you thank youhi how's it going i am Katherine Duckmani am apparently challenged with myslides today i appreciate you bearingwithme so I'm Katherine i am an open sourceevangelist at Intel but today I am the13th Doctor in case you're wondering whythe get up so so yeah we are going totake you on a journey we're going to getinto the TARDIS here and fly through thecloud native landscape um but a littleabout me but not very much about me iwork for Intel i'm involved in a lot ofcommunities including the open SSF opensource security foundation uh uh the Oopen platform for enterprise AI securityand a lot of other things the pointbeing get involved find a buddy ask usquestions there we go lori would havebeen here but she's I've got Geraldinstead and and that and that's whatwe're going to win with she ditched usbut no she didn't really but anyway sowe're going to talk a little bit aboutthe history of the CNCF we're going totalk about why the landscape isimportant right that's step one and thenhow to use it because frankly it's anoverwhelming set of information and thisis the feedback that I hear that a lotof us hear that like where dok�3�@#�A2fkWLe3OqQgthanks everyone welcome to uh this quicklightning talk this is the super fasttag runtime review i'm Taylor Thomas iam a CNCF um WASM working group co uhco-chair um there's more about me toobut I'm not going to focus on that wedon't have enough time so quickannouncement um that uh they've kind ofasked us to do as well for some of thesethings if you are involved in theecosystem and have been involved in thetags there is a tag reboot coming umthat is something that will affect theWASM stuff that I'm talking about in tagruntime um so multiple TOC members andeveryone's going to be around um in thehalls and stuff if you have anyquestions so with that let's talk aboutWAM the tag runtime group asked me tokind of give an overview of WAM andwhere it is it's one of those emergingtechnologies out there most people haveheard of what's WAM this is the superfast thing you're going to get a quickfire hose of all of this if you don'tunderstand it all come talk to me laterum WAM is essentially a very small VMyou can consider it um you can build itin any language it it compiles down to abinary and then that binary is usedinside of a WASM runtime or in browsersor whatever it might be and those thosethings all run on any of the supportedarchitectures and and and oss that theycan run on which is a lot why do we useweb assembly well it has a capabilitybased security model which is justsomething that means you're explicitlygranting it permissions il I evenstart with any of this and then we'regoing to talk a little bit hopefully ifthere's time about a couple awesomeprojects you may not know about so whatis what is what is this thing that we'redoing here right there are so manypeople involved in this community andwhat we're hoping that to get to connectyou with today is how to connect withthose people how to connect with theprojects and how to get involved and asa user and hopefully maybe someday as acontributor so what is the cloudnativecomputing foundation what is it Geraldwhy are you here let's see i think wehave to switch my Oh we have this oneyeah you can put this away uh if anybodywants to turn up number three that wouldbe great otherwise we just put it awayso what is the cloudnative computingfoundation well the CNCF's mission is tomake cloudnative computing ubiquitousright to basically spread cloudnativecomputing to everywhere uh every projectevery company you should all really havethe power of cloudnative computing atyour fingertips we currently have 28CNCF projects which is really impressive257,000 and more project contributorswhich is even more astonishing if youdivide that by the projects uh 756 CNCFmembers and last but not least 96,000cloudnative community members such asyourself now the Linux Foundationmanages the CNCF so it's uh basicallythe CNCF the umbrella organization isthe Linux Foundation as you probablyknow right and the Linux Foundationitself has more than 900 open sourceprojects 3 million developers uh777,000 developer developerscontributing code and so forth and youknow 70 plus upcoming events and one ofthe events that we're here is CubeConthat you Yes yes it is no it is notworking okay we're just going to tradethat's all good um so so how did thelandscape get to be as big as it didwe'll have a little quick history lessonright so the cloud native computingfoundation has been around for roughgive or take 10 years um in 2014 itstarted with well really I mean we cangive a lot of credit to Google creditwhere credit is due google andKubernetes right it started all of thisbut you know in the early days it wasthere were not the these overwhelmingnumber of projects it was really youknow just a few um and it's grown as youcan see in 2018 in our timeline westarted out with only 31 projects well31 projects I don't know about the restof you but I've been around long enoughto to remember the days when I feel likeI we used to have a pretty good pictureof not just things like the cloud nativecomputer computing foundation projectsbut just the full stack of technologyall the technology we work with on aday-to-day basis but as the projectsproliferate and as the as the communitygrows it becomes very difficult to kindof get a handle and keep maintain apicture of the whole landscape andthat's where that's where we're goingwith our story here right so now thatwe've gotten up to so many let's talk alittle bit about what a cloud native uha CNCF project status means right we gowhen a when a project comes into thefoundation under the umbrella it's not afull-fledged inproduction projectnecessarily right so I don't know h areyou familiar with have you have you beenon that journey with a project beforeGerald I certainly have a oh this isworking now perfect thank you uhcertainly I've seen a couple of projectsthat mature right it's like not notevery project is equal right some ofthem are brand new some of them arearound for a long long time as we haveheard right like with kubernetes And sothere's three phases to uh each projectthat they have to go through so firstphase or first there's an adoption uh uhprocess to even get into one of thosephase right so a project must beselected and elected by supermajority bytechnical by the technical oversightcommittee to join the CNCF so basicallythere's already some folks that look atsomething and says yes this belongs intothe cloudnative computing foundationthen the next phase is the sandbox phaseor sorry the incubation phase sorry Iskipped the incubation phase once youhave a few more um a few more usersyou've grown your community mmaybe you'rein production but you haven't fullygraduated there's that kind of inbetween stage does that mean that youshouldn't use that project in productionno I don't think so does it maybe meanyou need to do a tiny bit more duediligence than just adopting somethingas widely used as say oh Kubernetes Helmthe big ones well maybe maybe you shouldconsider it in a slightly different waymaybe you should contribute to to itmaybe it needs a little bit more help toget it all the way to graduationright and so you can really think ofthis as a process right so somebodyelects the project there and then you'rein a sandbox right you see where it'sgoing maybe it's going nowhere it's finethen when it is going somewhere you'rein the incubation phase right where itincubates and then last but not leasthopefully it will graduate into afull-on CNCF project right and uh atthat stage you basically have to showthe thriving adoption rate rightsomething that the CNCF community isreally looking forward of having why dowe have a blank screen nowand um you know like so it it helps youwhen you just look at the CNCF landscapeto also gauge a bit where a project orparticular project is that you'relooking at right so it's like is itsandbox phase well maybe that's notquite production ready right at leastfor your use case or something right butto watch out for maybe to even startcontributing right but with graduatedprojects you can definitely put yourhand in fire that that is what thecommunity what the CNCF foundation hassaid look this is a project thatfulfills all the criteria uh and is goodto go okay so here's here's a here'swhere it gets really interesting rightso this is we keep talking we keep usingthis word right overwhelming and that'sbecause everybody does like to use thisword overwhelming but how do we get hereright and if you look at the graph thisis the number of course the sand box isalways going to be the biggest numberbecause that's you know the incomingfledgling projects um but but it's growntremendously over time to the pointwhere we've needed to put a little bitof order into it you know you couldalmost spo spike a trend in there yesyes youSo here we go here's the This is thepoint of why we're here introducing theCNCF landscape this is something I'mjust going to go ahead and switch overhere and show you yeah we going to dothis live this is what it actually lookslike there we go but it it's a huge hugewell organized Thank you George Castrowherever you are very well organizedlayout of all of the projects and beforeI start I drill down to a couple of themjust to show you how to use it i do wantto go back and show you where it startedthis is what it used to look like rightthat's not so difficult tomanage but here we go now we're in thisin this world where we have a shockingnumber of projects right and we don'tknow where to start so so let's grab anyproject right let's look at let's lookat Caviernow now for any of these projects youcan click on the projectname you can you can follow its statusright so this one you can see is in anincubating stage as of 2022 good jobright you can also see the stats on thenumber of contributors the number ofstars that can be an indication ofproject maturity right we can see latestrelease if if it's if you haven't seen arelease in a few years maybe that's ared flag something that maybe you don'twant to put intoproduction and there's a there's a lotof other information that you can getfrom this including Oh look at that aCLO monitor score what is CLO monitor hmwell we like to have we we love to tocollect stats about our projectsi I almost I kind of regret slightlytaking the uh Kyo actually as an examplebecause 99 is an almost unattainablescore this like don't assume thatthey're all going to look like thisright you're going to see some 80someyou're going to see some 70s somethingbut but it's a series of checks thatruns on an automated basis on all ofthese projects and this is a reallyfantastic place to start when you'restarting to evaluate projects both foruse in your own projects and also theprojects that you're thinking ofcontributing to right this is a reallygreat place to get an idea of where thegaps are where these projects could useyour help and and uh and then where tostart right it also includ includes Iwant to make sure I point this out itincludes some security information aboutthe projects and this is reallyfantastically useful this is I believepulled from uh OpenSSF scorecard ibelieve that it's automated into theprocess which is another project anotherproject yet another project to check outum but oh wait there's one thing in redit's not perfect i I was starting to geta little intimidated by their perfectionbut there is one thing and that's agreat thing where hey maybe you as anend user or a contributor can jump inand say "Hey I'd like to help out withthat project maybe I want to help themfix that and get a perfect score."You know what I think of like I I wish Ihad that in like 10 years ago when youpick something right and you go likewhat should we deploy oh this looks coollet's Google it you know we found itokay h this looks interesting but hereyeah you see as you can see here it'slike it's really sophisticated right soyou can really drill down not only ineach project where they are at but alsowhen we switch over to the landscapereal quick uh you also see that they areorganized not only by the categorieslike automation observability etc butyou can also search if you wanted toright umand all different dimensions righteither by just by typing in the projectname or by drilling down into a givencategory or by looking at some of theother statsyou can sort by status of course toothat's in here somewhere um the filterfilter on the very left thankyou filters there we go we can look atand we can see okay maybe we only wantto look at graduatedprojects maybe for some reason oh lookat that there are surprising number ofthose as well i'm sure you've heard ofmany if not all ofthese yeah shout out shout out to uh toArgo out there i actually I'm am an Argofan that's a that's a whole other storybut there's another thing that that Ithink is really useful here that I Loriand I when we have given this talk welike to make sure to uh share withpeople and that is do you see thislittle great icon up here on the rightyou can download all of this i find thisincredibly useful i happen to alreadyhave it open let's see if it'll Oh youcan't see come up i don't know how toshare my screen this is a whole thingwell trust us there was a file that youcan download the whole thing and amongthe information that you can get fromthat is things like uh security issueslatest security audit information reallyreally good stuff so what is yourfavorite undiscovered project oh that'sa tough one that's a tough one there'sso many good ones out thereum I don't know i really can't pick tellme one Katie katie okay okay i'm gonnapick um you know what i'm gonna go backtoOh wait sorry say that again say againchyroshow do you spell itthere it is chyros there we go awesomeit's a Sandbox project so this isactually Thank you for that by the waybecause this is a really great way totake a look at something that maybedoesn't have that near perfect glowmonitor score oh but it does okay I'mfeeling very inadequate right now um weare almost out of time so we we won'tget to to get into some of the uh thoseundiscovered projects but I I do want tosay um thank everybody for being herebecause you're going to learn allthroughout the day about these fantasticundiscovered process uh projects and ifthere's one takeaway that I hopeeverybody gets from this and from me isto make friends this week find theprojects that need your help jump inthis is a really great opportunity foryou to connect with maintainers see whatsee what needs to be done and help keepall of this going we all depend on allof these projects and we depend not juston the graduated projects we depend on athriving ecosystem because we as endusers as users of technology in ourdaily lives we depend on these projectsbeing successful and healthy so getinvolvedthank you very much enjoy CubeCon[Applause]2025-04-15 22:03:04.203721vibrant healthycommunity so thank you very much forcoming we hope you have a great timethis room is chillops it's designed foryou to figure out what you're going todo this week we have a lightning talkevery 7 minutes feel free to drop in andout go get a co-orker hang out for alittle bit figure out what you want todo if you're not feeling it just bailit's fine uh feel free to come back it'sall good it's kind of designed to bein-n-out chill vibes so um yeah i wantto tell you a few things to think aboutthis week how many of you areengineers how many non-engineers peoplelike me all right so uh I want you allto think about the ecology of cloudnative this is a thing where we are inour second decade of cloud native thefirst decade hyperrowth kubernetes youhad to have itthis next second decade that we are juststarting is going to be about growthsustainability contributor health andthe health of our projects this is myeveryday job is to ensure that theseprojects are healthy we have over 200open source projects in the CNCF that Iwant you to find out about uh this weekso I'm really excited that you're herebut I want you to take a holisticapproach this is not about just thetechnology it's about the people it'sabout integration it's about what thingconnects this thing to that thing tothat thing what combination of all ofthese things that you're going to puttogether and have a great time buildyour platform make yourself happy makeyourself miserable all of the goodthings that happen so I want you tothink about that a little bit becausefor us right now it's aboutsustainabilitymost of the talks today will be aboutthese open source projects and they'regoing to tell you if you want moreinformation come to the project pavilionfor the rest of this conference once theshow floor opens I will be at theproject pavilion it is purposely put inthe middle of the show floor i've gotthe best seats in the conference got thenice comfy cush i've got bathrooms andI've got caffeine for you so bring yourhoodie this is a place where you canactually interface with maintainersthere's no sales pitching there's noselling at the project pavilion it'sliterally nerds nerding out meet yourmaintainers meet the people that writethe software and try to understand howit all puts it's all put together sothat's what I encourage you to do if atany time you are at this conference andyou feel like you're not getting ityou're not getting the value you're kindof like bummed out come find me i willbe wearing a shirt that says CloudNativeComputing Foundation every single day myjob is to make sure you have a good timeso if you're lost you have no idea whatyou're doing come and come find me tellme what you're into networking storywhatever and we'll figure it outtogether and lastly I want to thank youfor coming i know that this conferenceis a commitment many of you havetraveled very far how many of you arestill tired i'm on like day four uh so Iknow the conference doesn't even startuntil tomorrow so I know we have a lotof high energy and a lot of new peoplehere i want to encourage you to meetpeople uh Kelsey has this thing that heKelsey High Tower that he tells us partof our culture when you're having aconversation with people in cloud nativeand we're having it together and we seesomeone else approach we always leave aspot open for the next person we'realways trying to grow our community tryto meet somebody try to find out thesame problems that you're having maybeat work common things hobbies i've madelifelong friends here and it's never toolate to start mr alan Pope i've knownhim for over 20 years and this is hisfirst CubeCon we built Auntu together itis never too late to start your cloudnative journey and I am so proud to seeso many of you this morning and I amvery happy y'all feeling it you feelingthat vibe oh come on give me a littlebit more let's go all right and withthat Miss Katherine we're going to startwith the CNCF landscape which is thishighly intimidating map of all of theseprojects and uh let's go doctor Who ohhey Gerald oh you've got them2025-04-15 22:03:04.746573 x @x��'�C#��ACQ3Wxg4qNaQhieveryone thank you all for coming to ourtalktoday uh our talk is called scalingsmarter not harder how extending clusterautoscaler savesmillions my name is Ben and I'm herewith my colleague Rahul and we are bothsoftware engineers at datadog so we're going to start today bytalking a little bit about how we donode autoscaling at data dog beforediving into the cluster autoscaler andthe concept ofexpanders and then we're going to moveinto talking about how we identifyoptimal instance types and then how wescale those optimal instancetypes so first a bit about data dog werun Kubernetes from scratch in a multicloud environment we run across dozensof clusters with tens of thousands ofnodes and hundreds of thousands of podsand within this infrastructure we servetrillions of data points per hour forover 30,000customers at Data Dog Rahul and I bothwork on a team called computeautoscaling at a high level the goal ofour team is to manage the nodeinfrastructure for product teams toenable product teams to focus on productdevelopment while we focus on the nodeinfrastructurein doing so we focus on things likescheduling and scaling efficiency todeliver nodes as fast as possible aswell as binacking and cost optimizationsacross ourfleet so a key offering of our platformis something that we call a node groupset so first a node group we can thinkof as a cloud provider agnosticrepresentation for something like anautoscaling group or a manage instancegroup if you're familiar with thecluster autoscaler you might be familiarwith their concept of a node group andours is similar but we have a customresource definition defined in ourenvironment that we use for nodegroupoups we have one more abstractionon top of a node group which we call anode groupoup set a node groupoup set isa set of node groupoups that falls underthe same scheduling domain so what thatmeans if an application specifies atoleration or a node affinity for ournode groupoup set then they couldsp�4�B#�!A3Gm5QNXcp2gall right welcome everybody can I getsomeaudio oh come on welcome everybody hiGeorgehey everybody i'm George Castra i'm aCNCF staffer welcome to the ProjectLightning Talksall right all right i'm going to giveyou a quick briefing and then Katherineis going to give her first talk then wehave lightning talks every seven minutesin perpetuity until we fall over so umlet me tell you how we're going to dothis am I supposed to have a timer uphere all right we're very cloud nativeyou get um this room is chills so wedesigned this session first of all howmany of you are new here first firstCubeCon holy smokes over the past fewyears over 50% of CubeCon attendees arebrand new growing nqchedule onto any of the underlying nodegroups that would fit theirpod so node groupoup sets are beneficialfor us for a few reasons one of them isit makes it a lot easier for us toonboard and manage many users and manyapplications across datadog because they can just set thespecification for the node group set andthen we can change out the underlyingnode groups or instance types beneaththem without them changing anything elseon theirapplication additionally node group setsare an easy way for us to providemultiple instance type options so forexample we want to serve a diverse setof pods with lots of different resourcerequests so we can provide large nodesto serve those or smaller nodes as wellif we were to run out of capacity forany one instance type we want to be ableto quickly fall back to another instancetype and node group sets give us an easyway to dothat so for node autoscaling at data dogwe use the cluster autoscaler and forthe sake of time I won't get into toomuch detail about why we use the clusterautoscaler over other autoscalingalternatives but at a high level one ofthe main reasons we use clusterautoscaler is because it has support forall the cloud providers that we needwithin our infrastructure as well we'vebeen running cluster autoscaler for awhile now and we're happy with theoperational experience we've been havingin our clusters and have been able tocontribute upstream with features whenwe needthem so to dive into how instance typeselection works within the clusterautoscaler we can say let's say we havea pending pod that needs to scale up theautoscaler will first start with all ofthe node groupoptions and then it will filter down tothe node groups that are able toschedule this pod via schedulingsimulations and then if there are stillmultiple node groups that can schedulethe pod it will use the concept ofexpanders to pick the best node group toscale so we can think of an expander asa strategy to pick the best node groupin order to trigger a scaleup toschedule thepot within the cluster autoscaler thereare a few built-in expanders the mostsimple one is random which will justselect a random node group anotheroption is the least waste expander whichwill select the node group that willwaste the fewest amount of resourceswhen fitting thosepods another is price which selects thecheapest node group option another oneis priority where users can assignspecific priorities to node groupoupsand the cluster autoscaler will respectthose prioritiesyou can also stack expanders meaningthat you can specify one and thenanother for example if I specify theleast waste expander and there's still atie it would fall back to the randomexpander and pick randomly to ensurethere's always onechoice so for us as a reminder one ofour goals is to optimize for bin packingefficiency and reduce waste so wethought the least waste expander wouldbe a good choice for usso to dive into a bit of a case study wecan see we have a node group set herethat's very simple offering one 16 coremachine and one 8 core machine and if apod comes along requesting six cores theleast waste waste expander will tell usthat we should scale up the 8 coremachine and this pod willschedule now if another pod comes alongrequesting nine cores well now thatcan't fit on the 8 core machine so weneed to scale up a 16 core machineand we can quickly see that in this casewe now have nine wastedcores so in our optimal scenario wewould have been packed both of thesepods onto a single 16 core machine andnow only have one wastedcore so there are a few ways that wecould get to this you might notice thatif the nore pod came first we would havescaled up a 16 core machine and then thesix core pod could have fit on itsimilarly we could rely on repackingwith the cluster autoscaler in order topack these pods together after the facthowever this example still gave usmotivation to take a look at our fleetin practice to figure out if thistheoretical bin packing optimization waspossible so we took a look at all of ournode group sets in all of our clustersand specifically at our requested CPUprercent for these node groupoup sets andwe realized that while we're pretty goodabout binacking in many of our nodegroupoup sets there's clear opportunityfor improvement in many clusters andmany node groupoup sets and our clustersare lots of different sizes so to bemore concrete we translated this CPUrequest percentage into the actualdollar impact and we realized that if wewere to improve our bin packing there'ssome significant cost impact that wecouldbring however we also know that instancetype selection goes beyond justbinacking we also care about things likethe performance of each instance type wecare about whether the cloud providerhas enough capacity to serve our entirefleet and we also know that withinspecific clusters we might have specificinstance type preferences to meet theapplication needs that run in thoseclusters and as a reminder we run acrossmultiple cloud providers in dozens ofclusters so we have to make thisdecision in more than just a singlecluster and as I mentioned each of theseclusters might be unique in theapplications that they run for exampleone might serve more network intensiveapplications and therefore benefit fromnetwork intensive instant types etcso at this point it's might seem likethere's no one expander fits solutionthat we can use across our entirefleet however the cluster autoscalerprovides something called the gRPCexpander so the gRPC expander allowsusers to build a gRPC service to makecustom node groupoup decisionsso when it comes time for the clusterautoscaler to pick the best option itwill make a request to the gRPC expanderand within that service you can returnwhat you have determined is the bestoption to scaleup here's an upstream example that thecluster autoscaler provides for the gRPCexpanderin this example the strategy is simplyto choose the option that has thelongest node group ID name but you couldreplace this with whatever arbitrarylogic would fit yourneeds so at this point we've recognizedthat we have an opportunity to improveour fleet efficiency via better instancetype selection and the gRPC expandergives us the ultimate flexibility tomake these instance typechoices so our goals became simple wewanted to identify the best instancetypes and we wanted to scale the bestinstancetypes so to simplify the criteria thatwe'll use for instance type selectionwe'll bucket into three highle ideasthat we care about in instance typeselection we care about the cost of ourfleet we care about the performance ofthe applications that run on them and wecare about the reliability of ourfleet so beginning with cost I want toreturn back to the binacking problemthat I showed earlier but in realitythis problem is a lot more complicatedthan just two pods requesting CPU weactually serve lots of different podshaving lots of different CPU or memoryrequests and as a reminder one of ourgoals is to abstract any worries of nodeinfrastructure away from applicationteams so we want to be able to serveapplications that request all differenttypes of CPU and memory in addition withthings like VPA we know that theresource requests of pods might not bestatic they might be dynamic and westill want to be able to binack theseworkloads so the question then becomeshow can we binack thousands of uniqueworkloads to solve this problem ratherthan approach it from a theoreticalpoint of view we decided to useschedulingsimulations we built a component calledthe instance type adviser the goal ofthe instance type adviser is to runscheduling simulations for the sets ofpods that run on our node group setsagainst every possible instance typethat we could use and then rank thoseinstance types based on which are bestfor bin packingefficiency so as an input to the adviserwe have our set of pods which we canselect via a node selector for exampleget all pods running on nodes that havelabel fu equals bar or concretely for usget all pods that are running on thenodes of our node groupset the next thing we'll pass to theexpander is an instance catalog which isa catalog of all possible instance typesthat we could use including theirspecifsications like cost per hour andCPU and memorycapacity within the adviser we also havea virtual node builder which is just aninterface to build virtual nodes inorder for us to schedule onto them thatmatch the specs of the instancecatalog and for the actual schedulingsimulations we use the upstreamKubernetes scheduling framework but wecould use otheruler plugins if we neededtoso the results of the instance typeadviser in a single cluster might looksomething like this where we have ourcurrent instance type that we're usingwhich is m6a.8x large and we can seethat with this instance type ourrequested CPU percentage is down at62% what the adviser tells us is that ifwe were to reschedu all of those podscurrently running on mainly M6As toR6A.4x 4x large we would increase ourrequested CPU percentage all the way upto 95 improving our bin packingefficiency and therefore we can see thistranslate to a significant cost decreasein ourfleet so we run scheduling simulationslike this in every single cluster and westore the results of them as a CRD thatwe can consume within thecluster beyond bin packing as Imentioned we also care about performanceso to get a more granular look at theperformance of each instance type we runperformance benchmarks for networkperformance CPU memory and storage andthis allows us to weigh cost versusperformance for every instance type sofor example before we make a migrationfrom instance type A to instance type EB we could take a look at thesebenchmarks and make sure that they matchup and we won't introduce anyperformancedegradationsfurthermore we can account for these inour cost calculations so for example ifwe look at CPU performance we might seesomething like instance B is 20% moreexpensive than instance A but in factit's 30% more performant so what thatcould mean is that if we're properlyautoscaled on CPU then if we useinstance type B we might be able to run30% fewer instances than instance type Aand that affects our cost calculationsin the endfor reliability we know that there's alot of different reliability concernsthat go into instance type selection forus we care about things like thecapacity of each instance type and as Imentioned before we also care about thesize ranges of our instance types thatwe use to serve a diverse set of podsand we also might want to expressperformancepreferences additionally thesereliability concerns might be differentper environment that you're running infor example the capacity of instancetype A might be different in region Xthan it is in regionY and so to give us flexibility inmaking these reliability choices webuilt a small library that allows us tobuild cellgo selectors against instancetype attributes as well as attributeswithin theenvironment and so this allows us tobuild selectors like the followingfor example if I know that I want to useM6G R 6G or C6G because I have a deepcapacity pool and I know I want to serveapplications between 8 and 32 CPUs I canbuild a selector to representthat if I know that I have network uhnetwork intensive applications in edgeenvironments I can build a selector toselect network intensive instance typesfor that environmentif I know that a specific cluster canbenefit from newer generation instancetypes that might be more expensive butare also more CPU performant then I canbuild a selector for that as well andeven if the cloud provider releases somesort of fancy new type that I'm not sureI want to use in prod yet but I want totest out in my experimental environmentI can build a selector for that tooso with these three tools we now havethe adviser to be able to calculate thebin packing efficiency of all of ournode group sets for cost optimization wehave benchmarks to factor in thespecific performance of every instancetype and we have attribute selectors onthe instance type and environment togive us flexibility and the reliabilityconcerns that we care about for ourfleet so at this point we feel like wehave a good set of tools to be able toidentify optimal instance types and sonow I'm going to pass it to Rahul tofigure out how we scale those opttimalinstancetypes all right so now we're going toscale the best instancetypes so right now this is ourautoscaling ecosystem uh we're going tobuild on this diagram as we go throughthe presentation but for now we have theinstance analysis tools on your left uhthat Ben just went over and this willlet us know what the best instance typesare and then on the right we have oursimple autoscaling ecosystem with thecluster autoscaler and some node groupsso we as humans through the instanceanalysis tools know what the bestinstance type is but now the question ishow does the cluster autoscaler knowwhat the best instance type is and thisis important so we can apply the uhanalysis and I want everyone to rememberthis question because it's going to comeuh back later on in the presentationso to tell the cluster autoscaler whatthe best instance type is we use thegRPC expander like Ben mentioned it's agRPC service where you can put whatevercustom code you want in there uh andit'll influence the scaling decisions ofcluster autoscaler but this kind of justpushes down the question to the gRPCexpander we actually have to write thiscustom code so that I can properlyinfluence cluster autoscalerso a simple way to influence the gRPCexpander is just you a human operator sothe human operator can get the instanceanalysis results themselves and writethat in a format which the gRPC expandercan read uh something like a config mapor CRDs so in this example uh node groupC has the highest score of 100 so gRPCexpander will tell cluster autoscaler toscale that instance typeso we wanted to validate this whole umapproach that we've been taking so farof optimizing instance types uh so wepicked one of our clusters that we sawhad the most bin packing potential basedon our analysis and in this case we arerunning on M6G.8X large instance typeswhich are have a more balanced CPU tomemory ratio and are meant for generalpurpose workloads on AWSbut uh through our simulations we foundthat there are more memory intensiveworkloads on this cluster and that our6G.8X large would be a better fitbecause it has a higher memory to CPUratio so you can see on the top graph wesignificantly reduced the number ofinstances we needed and on the bottomyou can see we significantly reduced ourcost by over 60% in just this onecluster so this validated to us that ourapproach of choosing the optimalinstance type uh did in fact work and wewanted to expand this to the dozens ofclusters that we runon so if you're running in a small shopwith just one or two Kubernetes clustersthis approach works perfectly fine witha human operator it's not too muchmanual toil you can kind of get awaywith it but at our scale we run ondozens of clusters so the best instancetype is not going to be the same acrossclusters you could have one cluster thatruns a bunch of memory intensiveapplications another one that runs CPUintensive applications so it's going tohave completely different uh bestinstance types also across regionsthere's different instance typeavailability across cloud providers youhave completely different instance typesso a human managing the best instancetypes across all these clusters is notgoing to scale well the other thing isthat these scores can potentially changefrequently let's say in the middle ofthe night an instance type runs out ofcapacity you're going to have to changethe score of that instance type or if anew application gets deployed or scaledup you're going to have to change thescore for that too so a human operatorcan't really handle all of thiswe're going to have to go back andreplace the human operator withsomething that's more scalable anddynamic so that I can optimize costsacross our dozens ofclusters so what we did was we created anew custom controller called theinstance score this will basicallyautomate what the human was previouslydoing and that'll watch the instanceanalysis results and then write those toconfig maps which the gRPC expander canread and then influence the clusterautoscalerso because of this new automation uh weare able to do this migration acrossdozens of cluusters on the top you cansee the number of instances we migratedand on the bottom you can see our yearlypotential cost savings going down sopotential cost savings is a calculationwe do where we take the current cost ofour clusters and subtract the optimalcost if we are using the optimalinstance type so let's say like acluster costed $10 million running on anonoptimal instance type but if we wereto change it to the optimal instancetype it would only cost $6 million thatmeans that we have a potential costsavings of $4 million so in this caseacross our dozens of clusters we had $4million in potential savings and we wereable to save this money uh automaticallyso when I say automatically um you cansee on the dates of the graph it's fromlate July to early September so aroundone and a half months so that's stillquite a while and although a lot of theprocess was automated we still had somemanual work todo so taking a step back this is just alist of AWS instances there's hundredsof them and they each have their owncharacteristicsand in our diagram that we have so faruh if instance type A B C or D happen tobe the best then we're just fine clusterautoscaler will be able to scale it upbut what happens if instance type Nhappens to be the best well it doesn'texist in our cluster so clusterautoscaler can't scale it up it has toscale up the next best available optionand you might not get the full costsavings through thatso now we come across the question howdo we make sure the best node groupexists in our cluster so that we can getthe maxsavings so one easy approach is justcreate a node group for every singleinstance type so you'd have hundreds ofnode groups on your cluster and nomatter what the best instance type isyou're guaranteed to have it and clusterautoscaler can scale it but the problemwith this is it significantly impactsthe cluster autoscaler performance inthis case in one of our clusters we hada few hundred node groups and you cansee on the bottom the main loop P99duration uh sometimes it took over 5minutes so imagine you're in an incidentand you need an urgent scale up andyou're waiting over 5 minutes to getyour nodes you're probably going to beupset so we realized that we can't justsimply create a bunch of node groups forno reason we have to be efficient withit so we did some cleanup and you cansee uh the P99 duration dramaticallywent down from over 5 minutes to lessthan 10seconds so as a platform we have somerequirements that we need to have to forour users um and based on thoserequirements we can create a nodegroupoup set and see which node groupswe actuallyneed so one requirement is we need toscale the most costefficient instancetype so we can have a node group forthat another requirement is we need toschedule pods requesting up to 64 CPUsand 256 gigs of memory and the bestinstance type might not necessarily beable to do that it could only have 16CPUs so it can't schedule a podrequesting 64 CPUs so we're going toneed another node group that has aninstance type that's big and is alsocostefficientanother thing is at our scale uh weoften can't rely on the cloud providerhaving enough capacity for a singleinstance type uh we're going to have tohave some fallbacks just in case sowe're going to create two fallback nodegroups for each of these main instancetypes so now we have six node groups inour node group set um and we're going toguarantee we always have the best nodegroup while not creating too manyso in the past during that one and ahalf month migration I was showing uh alot of it was automated because in a lotof cases we did already have the bestinstance type on the cluster but in alot of cases we didn't and we'd have tomanually go in and create node groupsfor them so we already know that our atour scale we can't continue manuallydoing this we need to automate it so itworks across all of our clusters so whatwe did was we created a new componentcalled the node groupoup set controllerthis will see the node groupoup set seeits requirements and reconcile with thecluster what node group should actuallybelong so now we come to the questionhow does the node groupoup setcontroller know what the best instancetype is and if you remember I toldeveryone to remember a certain questionbefore and the old question was how doesthe cluster autoscaler know what thebest instance type is and now I've justreplaced the cluster autoscaler with nogroup set controller it's asking theexact samequestion and when we answered it forcluster autoscaler the answer was thegRPC expander so you might be wonderingwhat's going to tell the node group setcontroller what todo and it's a gRPC expander yet again soyou might be a little surprised orcurious like hey I thought this gRPCexpander was a cluster autoscaler thinglike how are you using it with this newcomponent that you createdwell that's another benefit um of thegRPC expander you can put whatevercustom code you want in sure but it'sjust a gRPC service there's nothingrestricting the cluster autoscaler frombeing the only thing that cancommunicate with it we even communicatewith it from our own laptops when wewant to sanity check that our instancetype rankings make sense so through thatany service component you make can makea gRPC request to the gRPC expander andask it what are the best instance typesso this closes the loop on ourautoscaling ecosystem on the left wehave our instance analysis tools we getthe results instance score watches theseresults and writes them into config mapswhich the gRPC expander reads so now thegRPC expander knows what the bestinstance types are uh through that thenode group set controller can now createthe best node groups on the cluster andcluster autoscaler will then scale thesenodegroupoups so we saw this take action inone of our clusters it was previouslyrunning m6g.8x large instances uh forquite a while and then you can see aspike in the number of nodes and that'sbecause a new application got deployedto this cluster so originally it wasrunning on M6G.8x 8x large um and thenour instance analysis showed us thatC6G.4x large was the best and ourcluster didn't have this node group sothe node group set controller createdthis node group for us and then clusterautoscaler started scaling it so you cansee as we switch to the blue C6G.4xlarge instance types our potentialyearly cost savings went down by $2million so we've saved millions againso we've accomplished both of our goalsnow we're able to both identify andscale the best instancetypes so next steps for our platform wewant to continue what we're doing wewant to continue abstracting theselow-level infrastructure details from uhour application teams just like theexample I showed this application teamdidn't really have to worry about theunderlying infrastructure like whatinstance type they're going to getscheduled on they could worry abouttheir application get it deployed andthen our cluster would automatically uhadapt itself to be morecostefficient another thing we want toexpand on is providing differentvarieties of instance types like Benmentioned at data dog we have manydifferent products many different typesof workloads that each have their ownrequirements so we need to have instancetypes to fulfill these requirementsthings like local discs for exampleand then another avenue that we want toexplore is optimizing the placement ofour applications throughout this talk wetook existing clusters and their set ofapplications and adapted to that butwhat if preemptively we could move anapplication from one cluster to anotheror decide where a new application shouldgo because it has better bin packingpotential on one cluster over another sothese are avenues we're looking toexplore to continue improving theplatform we provide to our applicationteamsso thank you everyone for listening toour presentation uh we have ourLinkedIns on the right side of thescreen in case you want to connect withus we have the feedback form on thebottom left um and we're also on theKubernetes Slack in case you want tomessage us and meet up while we're allstill in London so thank you everyonefor attending our talk and we'll takeany questions[Applause]2025-04-15 22:03:05.292450wnmy new Reg dashboard for for feedback oflike things going wrong this was a ahappy time and then microservices DevOpsall this good stuff we know and lovepushed my cognitive load massively upand I actually quite enjoy learning i'mthat kind of person i'm sure many of youare you come to CubeCon to learn thesekind of things but when I was consultingwith a company called Open Creator basedhere in London the clients I was workingwith didn't want to have to learn allthe tools developers wanted to writeJavaScript write Go write Java solvesome business problems they didn't wantto learn Bash Terraform CrossplaneDocker amazing technologies but therewas just too much for them to learn inorder to ship valuefast and I was talking at the time allthis amazing work going around the CNCFwas a foundation for what a few of uswere calling a developer control planedcp and this is the early framings ofwhat I'm going to talk about today of aplatform architecture and diving intothis a little more I knew as a developerI wanted to code I wanted to ship and Iwanted to run and I was working verymuch on teleresence for coding I wasusing docker for shipping and I wasworking on ms ingress for for running Iwas really really liking the tools butthis kind of top layer was missingbackstage was a twinkle in Spotify's eyeat this point right it was just emergingnow backstage is everywhere but at theThere was clutch from the lift folksbunch of other projects kind of puttingthis UI on top of your platform thismiddle layer at the time I knew therewas something leverage web hooks CRDsAPIs to build an integrated workflow ididn't know how important that was iknew the crossplane folks were doingamazing work there but now I realizethis layer is very very important we'lldive into that more before we do thoughI want to talk about the what ofplatforms because we say platforms quitea lot but I'm not sure we're all sayingthe same thing much like DevOpsmicroservices unless we agree on adefinition we're never going to getaligned right particularly as developersand architects we need to make surewe're aligned with the platform team ofwhat are you building forme i like this definition a lot it's onthe Martin Fowler blog so it's got to becannon in our in our world right but Ireally do like this from Evan Botcher hetalks about this is what a digitalplatform is it focuses on self-serviceAPIs tools service knowledge and supportwhich are arranged in a compellinginternal product and the goal of thatproduct is to deliver product featuresas in your product features at a higherpace with reduced coordination not nocoordination not chuck your jar fileover the over the wall to ops not chuckyour docker file over the wall reducedcoordination i like that a lot thatreally speaks to me evans pulled out thesociote techchnical side of this tooright services but knowledge and supporttoo really really likethat if you're building a platform yourplatform architecting your platformengineering the good folks at Gartnerare on the case you know it's seriouswhen Gartner rock up and say this is athing right they focus on improvingdeveloper experience self-servicecapabilities automated infrastructureoperations again it's kind of the threeA's APIs abstractions automation rightthis is what a platform should deliverfor us as developers as architectswhat are the goals of your platform besuper clear i've got a I've bumped intoa lot of platform teams building aplatform because they're a platform teamdoes that make sense you build a teamthat's going to you know name itplatform team they're going to build aplatform are you clear of your goals bereally clear on this one you want to gofaster with a platform you want to shipthings faster you want to decrease riskright and you want to increaseefficiency at scale these are the threethings not rocket science i'm totallystealing this from good folks likeMatthew and Manwell team topologies uhthey've really crystallized the notionof everything as a service building aplatform team that book isgold Gregor Hopper's build on this quitea lot with this platform strategy if xyouhaven't read that book please do go andlike buy it it's an amazing book superlucky uh to have several chats withGregor over the years and every time Ilearn something more and actuallyGregor's got an even more amazing bookcalled the software architect elevatorand this if you're an architect is amandatory read in my mind what I reallytake away from Gregor is the need to usedifferent language depending on whereyou're talking in the organization nowhe uses the architect elevator thearchitect lift perhaps in the UK rightand he talks a lot about if you're thesea suite the three things there gofaster is make me more money rightdecrease risk is keep me off the frontpage of the newspapers increasedefficiency is save me money that's thekind of language talk at that level whenwe go to the boiler room where we'redoing all the work right gregor says youneed to use different language go fasteris improve the developer experiencefocus on the UX of the product theplatform decrease risk make it easy forme as a developer to do the right thingi know security is important i knowobservability is important make it easyfor me to do the right thing andincreased efficiency is looking at theright abstractions that I can deal withone application one node one cluster oneregion with reasonable kind ofaccommodations i'm not completelychanging my mindset as we scale i reallylike those books they're really goodbooks platform architecture now this isa bit of an eye chart slide but I it'skind of my presentation style did I sayit right i like you to be leaving withplenty to think about afterwards thereis a link to a blog post I wrote about ayear ago now on the Centasio blog andI've got the slides on on Slides Shareas well so you can check them out butbeing a Java developer I like threetiers i'm comfortable with three tiersright from back in the day i don't thinkit's controversial this kind of layer ithink the naming could be controversiali had a great conversation over lunchwith with Mark from Di Grid from theDapper folks saying I don't like thename application choreography i hear youMark loud and clear on that one but thislayer is where like we as developersinteract with the platform it isbackstage CLI it's APIs it is the realmof the app developers the full stackengineers uh and they want to code theywant to ship and they want to run don'tthink this layer is controversial eitherinfrastructure operations orchestrationscomp composition it's where the platformengineers the devops the operators CISadmins are working with infrastructureas code bash all that kind of good stuffcrds and the Kubernetes world uh andthey are building out the infrastructurethe layer which I think is a bit newer abit more nuanced is this platformorchestration layer now this is whereyou want to build a higher level kind ofconstruct a higher level API than rawinfrastructure compute storagenetworking and you want to offerservices that are higher levelabstraction but more relevant to yourbusiness too like what does a databasemean in your context what kind ofgovernance does it need what kind ofsecurity backups all that kind of thingsdoes it need this is the realm of likethe tool I'm working on cratics butHumatic are in this space crossplane Ithink your argo and flux CRDs are inthis kind of space too as wellThe good news I don't think I am alonein coming up with this i really likewhen I see smarter people than me sayingsimilar things right uh I when I bumpedinto that Gartner article first I waslike I can see the three layers in thatGartner article i can see theapplication portal i can see theplatform i can see the infrastructurekeith's awesome book if you haven't likeactually I think Plume folks were givingsigned copies earlier this morning ibumped into Keith at the booth uh thisis an amazing book i've read it since itwas like V1 and now I think Keith's onV3 i got a preview and Keith was againsaying I like this three layerseparation of how I'm going to model aplatform where developers interact whatplatform engineers need to build andagain I was at Salt Lake City very luckyat CubeCon yuh last year on the on theflight over I read this amazing book byCamille Fornet and Ian Noland same kindof deal I could see the three layers andthey actually called it like the pavedpath platform this middle layer whereyou're abstracting some of the lowerlevel infrastructure it's nice whenother people are saying similar thingswe're kind of working independently butmany of us coming up the sameconclusions so going that we're going todive a bit more into platformorchestration this is the the sort ofthing I want to explore a bit more todaybut before we do that I'm going to lookat a few sort of platform architecturecase studies take a step back throughsome of my time as a as a developer asan architect and and explore how itrelates to some of the platform changesI was seeing with the goal that if youare a developer are an architect thiswill give you a good framing of knowingwhat to ask from your platform team uhyour infrastructure team these kind ofthingsso first like a big sort of change in mycareer was working on what we callhexagonal architectures and this waskind of you put your business logic inthe middle you create a series ofinterfaces ports and adapters and youcan swap in the implementations long asthey adhere to the interface so in the fin the Java case the famous thing was wealways did interfaces for our databasein theory you could swap my SQL forPostgress of the 20 versions of thatthing I built I think I only everswapped the database once because whoswaps a database right but the idea waskind of good and there's much better usecases in that database use case you dowant some kind of abstraction betweenperhaps your logging framework uh yourbackups these kind of thingsright it's all about creating cohesivebusiness code kind of core applicationfacilitates testing because you're nothaving to like necessarily drive thewhole application with all the HTTP codeall the database code you can just testthe business logic and it promotes loosecoupling you can in theory swap out theports and adapters as long as you've gotyour adapters defined really well orport sorry ports defined really well youcan swap out the adapters to changeimplementations change parts of theplatformeffectively where this came from wherethe challenges kind of emerged is thatlayered and tiered apps were oftenhighly coupled i make one change in thesort of UI and it rippled through all mybusiness code all the way through to theplatform sometimes I was deploying codein a Java application server but also Ihad to deploy code in a ESB anenterprise service bus to make it runand suddenly one change was justrippling over all my code base i had toget the ops folks involved for multipledeployments it was a real pain some ofthe abstractions were very leaky therewas this notion of EJB remote and homeand it was like you meant to write codeas if you know it didn't matter whetheryou were going local host or whether youwere going over the network and we allknow that matters a lot right there waslike container managed transactions thatwere meant to hide all the gnarlytransaction semantics from me but whenit broke man I was debugging crazy likestack traces in Java trying to figureout what was going wrong right it wasreally the abstractionsleaked the platform evolved we saw abunch of solutions pop up particularlydriven by the good folks over in Netflixi know we're not all Netflix right butthere's many things we can learn fromthem and the Netflix OSS kind of stackthat exposed a lot of the platformservices as Java SDKs Java libraries wasreally useful in my world and eventuallySpring the the Spring framework a bigJava framework brought a lot of theNetflix stack into this like open sourcecommunity and rebranded it as SpringCloud with collaboration with theNetflix folks this was amazing but itwas only good if you were in the Javaecosystem because they were Javalibraries into platform kind ofcomponents like locking and databasesand all that kind of good stuffas it evolved i remember 2014 NetflixPrana popped up and Prana was likebasically a Java server exposing an HTTPAPI which otzher languages could callinto to get access to the platform uhcomponents so they're running like allthe Java libraries but like Go and youknow a node could call in and access theplatform this was really interesting westarted to see things like Dapper pop uparound other sidecar implementations andI've got a Dapper um diagram here and ifyou look closely at that diagram you cankind of see the three layers there's theSDKs at the top the app layer there'sthe platform abstractions in the middlethe locking the cues the databases andthen the implementation below of how youactually make that happen in Dapperright how you know your Q is Amazon SQSor it's rabbit orwhatever the thing you still need toprovide the platform bricks for yourgolden path is my pitch right becauseyes you've got these great abstractionsin in Dapper and sidecars but how do youactually deploy your application andthis is the golden brick we talk aboutgolden paths but often we want to changeout those bricks on the path so you needto deploy components as a service andthis is something I think was you knowtook a while until we kind of got tothatspace at this point slight interludemassive shout out to the Lego folks whocame along to an earlier version of thistalk in Salt Lake City and really tookaway this notion of golden bricks if youhaven't seen this talk they did it atPlatenge Day uh yesterday it'll be onYouTube fairly soon they gave me a shoutout which I really appreciate from theLego folks from Mads thank you very muchand they came to the booth and gave me agolden brick as well a Lego brick whichI thought was like freaking amazingright but this notion of golden bricksthinking I'm not going to necessarily asa developer accept a golden path i mightwant to customize that path depending onwhat type of application I'm runningright data application versus a standard12-actor app forexample after exagonal architecturemicroservices were the were the in thingright very much so you're creatingcohesive and loosely coupled servicesbut as Martin Fowler famously said youneed these three things to be true inorder to actually make microservices areality now the challenge was like wesort of said we need rapid provisioningbut a lot of companies took that is I'mgoing to use service now to raise aticket and some ops engineers will geton that and deploy the database laterthat's not really self-service right itis self-service in the sense of likedevelopers can like we can raise aticket but it might take days weeksmonths for I get my database I saw a lotof companies in London in Europe where Iwas working uh we're just building ontop um of a single database schema uhmultiple microservices and then when theschema changed it rippled through allthe microservices and suddenly we weredoing this big distributed monolithwe're deploying all our services at onego this is quitedangerous there's a lot to like inDevOps too I really like dev DevOps butthe you build it you run it and you ownit doesn't scale if you're like in abank or insurance company or that kindof regulated environment you want likeas fewer logging frameworks as possibleyou want as fewer ways to deploy yourcode as possible so when the auditorscome they're going to be signing off allthese things if you've got a hundreddifferent logging frameworks good luckhaving a a chat with your auditors onthat one and you know heaven forbid iflog for shell pops up right where's Logjrunning in my infrastructure i've got noidea but the DevOps thing is is a I'mnot ragging on DevOps love it lot tolike but a lot of folks interpretationof it stretched it a bit farparticularly in an enterprise contextyou really want the platform tocentralize quite a lot of these thingsthe visibility securityupgrades infrastructure solutions poppedup we had Heroku for the small folkslike like myself I was working on somestartups heroku was great we had CloudFoundry for the big enterprises it didan amazing job amazing piece of work butit was very opinionated the golden pathwas almost the golden cage at timesright it was very opinionated and if Iwanted to do something{ non2 factor goodluck to me as a developer good luck tome as anarchitect there was a whole lot of appscaffolding popping up at the time and Icall this jokingly puppy for Christmasbecause you say to your ops team yourplatform team is a developer I need adatabase and I need some like networkingstuff set up in Amazon and they give yousome Terraform code and day one muchlike that puppy happy days right butthen day two the puppy starts making amess you got to walk it all the timekind of like day two is like terraformneeding to be evolved or a zero day popsup in engine X right and suddenly you'rehaving to p you as a developer having topatch the container patch the config andyou're thinking like should my platformteam being doing this i've got thatpuppy for Christmas i love puppies rightbut as you got there's a responsibilitywith that puppy for Christmas much asthere is for templates as a service ifyou can crank a handle and get atemplate great for day one not so goodfor day two when you need to get thoseupgrades going on you need changes uhhappening i saw a lot of dev teamshaving to learn Terraform i actuallystarted learning Terraform as adeveloper i learned crossplane i startedplaying around Bloommy all these goodthings it was just a lot to learn rightand again I love learning other teamsthat didn't were reallystruggling now we're seeing a lot morelike workload definitions popping upthere's open application model I thinkit is there's score from the tech folkswho've donated it to CNCF there's radiusfrom the microser uh from the Microsoftfolks there's things like uh Dagger hereas well which I really I know Dagger'sgone into AI a little bit more now aswell but I think they actually got abooth somewhere to go and chat to thembut they've kind of got this three tiermodel if you like look at the diagramupside down kind of move it around alittle bit they've got this notion ofapp SDKs and kind of a Dagger engine andit's like a very customizable CI/CD ireally like what the Dagger folks aredoing platform orchestrator this iswhere I'm working at you can literallysee the three tiers in that kind ofcratics diagram if it's not too smallwhere we've got the apps at the top thekind of backstage then we've got theplatform orchestration in the middle andthen we've got the infrastructure at thebottom we're trying to separate thoselayers out there's this um Cusen stackCNCF project check them out too from theant group there's human in this space aswell many other folks are playing aroundin this kind of platform orchestrationspace but this is where we're trying toexpose an API at the platform for us asdevelopers to consume everything as aservice a Q a database uh you know evena higher level constructs even higherlevel services you might want to buildon and there's that holy grail kind ofthing like that cloud foundry thatHeroku type model where I have a buildpack I write some code I push it intothe build pack and magic happens for methe pipeline runs and you know thingsgetdeployed you still need those goldenbricks though you're building yourgolden path but you need some way ofputting the brick breaks down on thatpath this is where I think platformorchestrators that middle layer welooked at in the architecture this iswhere I think it's really important asdevelopers architects to get our headround we need to have APIs andabstractions to interact with theplatform and sort of they leak into ourcode how do we interact with some ofthese things in our codetoo briefly I want to touch on cellbasedarchitecture i'm see I have lots ofchats at the booth about this there'sonly one slide on this one but I seesomewhat of an evolution particular ifyou're in a regulated environmentsalebased architectures are kind ofmicroservices plus+ they are um theinfrastructure tends to be uh tightly uhsort of um coordinated the blast radiusis minimized like whereas microservicesmight share common infrastructurecell-based architectures don't slackI've talked a lot about how they usecellbased architecture AWS use a lot ofcell-based architecture and if you aregoing to do this in |the world ofKubernetes you really need to embracethings like cluster APIs you need tosorry cluster API because you'll begoing to be spinning up hardenedclusters for each cell you're not goingto be deploying multiple microservicesto a single cell a single cluster youneed multiple clusters you're going tobe looking at things like the podautoscalers kada for managing this kindof stuff and particularly in my timewith the Kate's uh gateway spec and openservice mesh you're going to be lookingat advanced routing advanced routting ifyou're doing um a lot of work in thecell-based um architecture space as wellcan be done on on uh on Kubernetes butthere's more tooling involved and againthink about the abstractions of how youdeal with a application how you dealwith a node and how you deal with acluster i think as a developer I shouldbe able to think about those inrelatively similar ways don't make themso different that I really struggle tounderstand how my application is goingtoscale wrapping up I think we're good totime so key insights I'm kind of drawingsome key insights that I want youperhaps as developers as architects takeaway and and chat to your platform teamabout if you're platform engineers inthe audience this is something to thinkabout as you're delivering your platformfor your customers the developers thearchitects abstract thinking first thingsolid Cupid principle of lease surprisei love these kind of frameworks growingup as a Java developer I learned aboutgang of four patterns and I learnedabout cupping and cohesion solid singleuh single responsibility open close allthat good stuff uh Dan Tur North talksabout Cupid quite a lot um composableUnix philosophy all that good stuffthese apply to platforms just as much asthey do uh applications themselves so asdevelopers as architects we can helpplatform folks think about theseconcepts when they're building platformsfor us i don't think developers wantmagic they want to be magicians greatquote by Ella Cheeser um creator ofDarklang and a few other things she'sdone done amazing stuff and I reallylike that notion of like I want to beable to like get things to my platformsbut don't make it super hard for me andand a lot of it is watching for leakyabstractions the earlier tools I talkabout in the Java world the ESPs thedistributed transactions the theabstractions leaked all over the placeand it was a it was a nice idea from theplat the early platform teams offeringthese things to me but it actually endedup costing a lot particularly on thatday two when I was debugging I wasupgrading I was learning a heck of a lotabout the Java transaction spec which isis not a good thing right not a goodthing uh the size of the spell matters Qbadly generated chatgpt image but thesize of the abstraction how you exposelike cues and databases and platformcomponents really matters for us asdevelopers as architects consuming thesethingsyou can't have good developer experiencewithout good user experience we need todesign the control planes for APIs CLIsand UIs i see a lot of folks goingportal first and I get a bit nervousabout that i love the portals but itreminds me of my old days writing likeweb apps like in uh Java with JSPs andRuby with ERBs we never put the businesslogic in the web pages because that'sthe display stuff right and we don'twant to couple our logic to to displaystuff i see a lot of folks putting theirbusiness logic their compliance theirsecurity into the portals and I get alittle bit nervous about that i might bewrong all right but I do get a littlebit nervous i say optimize forautomation that platform orchestrationlayer API first think about yourplatform APIs the UI on top whether it'sa CLI whether it's backstage portals Ithink is almost an implementationdetail aim to minimize the cognitiveload and build for progressivedisclosure now that is a bit of a like II only bumped into this term recentlydid a fantastic webinar with Sevi Kimfrom the backstage team and he talked alot about progressive disclosure likemake it easy to get started make it easyto do the right thing in terms of like}you might want to spin up like adatabase and it's just a literally aname and a size of a database but if youwant the experts to put more uh configinto that database you've got to havesome way of progressively justdisclosing the options like the day twostory what other how does theabstraction need to change to grow overtime I want the advanced options buttonthat kind of thing right check um Sevy'stalk about I really I did learn a lotfrom Sevy aboutthat don't forget the product focus it'sa key thing you're thinking aboutdeveloper experience you're thinkingabout user experience we think so muchabout this from our products we'rebuilding for our customers like our endusers i don't think we think enoughabout developer experience and userexperience for our dev tooling that weuse day in day out same for theplatforms right fantastic uh webinar ichat to Sarah Wells who is the creatorof enabling microservices book amazingbook sarah was the director of uhplatforms infrastructure at financialtimes.com like you can find Sarah on thespeaking circuit always worth watchingsuper awesome uh and she's going to beat Cucon London next week actually I'mgoing to chat to her and learn a lotthen but she talked a lot about in thebook around making sure the platformsupports the dev teams and she lived itat the Financial Times as a developerand then as she moved into operationsand into a leadership role those storiesin that book aregold the Thought Works folks love whatthey do this platform as a product pagereally helps us think it mentionsbackstage it mentions crossplane itmentions a bunch of tools this is areally good page to dive into to thinkabout that user experience of theplatform and finally what gets measuredgets managed i wanted to talk a littlebit about metrics i mentioned early on Isee a lot of platform teams buildingplatforms and when I go and chat to themthey're often not sure what they'rebuilding or why they're building it ithink the Humatic folks have done areport recently that says like 46% orsome number uh of uh platformengineering initiatives do not have ametric of success and that makes me as aleader very nervous if I don't have ametric of success how am I going tojudge the team judge the work and it'snot like in case I want to fire you kindof thing but it's like the company's gota business goal right as in we've got tohavemetrics this book yet again really helpsclarify my thinking on the metricscamille and Ian talk a lot about impactmetrics guardrail metrics and producthealth metrics now think about Dora ithink we all know Dora right impactmetrics is um lead time how quickly canI make an impact a guardrail metricwould be change fail percentage if I'mgoing super fast but breaking everythingnot a good look right and product healthis like how much adoption of the toolingare we getting this the chapter five Ithink it is i've recommended it so manytimes great greatbook in the book they talk aboutadoption rates of the platform if you'rebuilding a platform how many folks areactually adopting the platform what'sthe onboarding time to the platform thetime to the 10th pull request as Spotifyfamously talked about with backstagethese are great indicators for is theplatform delivering on itsgoals lagging indicators is there an appretention rate do people come backaround and they've deployed an app theyliked it are we as developers engagingwith the platform again what's theupgrade patch cycle all these I won'tread them out all these good things arekind of indicators of is the platformbeingsuccessful uh I talked to Paula my bossquite a bit about this and I think wehad some guests on this podcast as welland I also recommend this email from theinfo folks about improving developerexperience a lot of folks Jessica andother folks in in the audience here Andyhere as well uh I've I've tried to getall that knowledge from Cucon Londonlast year where we learned a bunch aboutdeveloper experience and buildingplatforms from these amazing folks i putthat in this e-mag or got my team tohelp me put this info in this e-mag sothose are two resources if you want tounderstand metrics you want tounderstand developer experience theseare reallygood and lastly the gold the devxframework i know it's DX core 4 now it'srebranded a little bit but I really likethis notion of thinking about Dora iswell understood i can chat to sea levelfolks about Dora they understand itthese days as much as we do asarchitects and developers it is verydelivery focused though right I thinkthere's a bit more nuance in the spaceframework Dr nicole Forsgrren massivelyinfluential on my life but theaccelerate book all these kind of goodthings a few her and a few colleagueshave really pushed this with like spacelike the S is literally satisfactionwhat is the developer experience likeyou can capture this from surveys abunch of other ways this covers allbases it is a little bit complicatedthat's why I like the devx frameworkfrom the DX folks they may even be hereI know Laura was on a panel with all mycolleagues actually um but I like the DXcore 4 stuff it is nicely balanced as away of kind of managing my my frameworkand I've got a nice graphic here whichyou can have a look later on it kind ofas a way of capturing like is myplatform delivering on the promise uhit'smaking oh perfect timing I hope uhwrapping up final slide to get us allthinking right expanding on those threepoints at the start of the presentationdevelopers we are the customers of aplatform i think the platform has to bebuilt as a product dev Ops we got to bechatting all the time what's myrequirements how does the platform getarchitected we as developers andarchitects can teach operations folksquite a lot too it's it's a two-waystreetaim for that speed that safety and thatscale those are the three things if youget it right going faster decreasingrisk and increasing efficiency magichappens in myexperience build the platform with APIfirst i I I I honestly I'm get it's apersonal thing but I'm getting nervouswith like portal first like it's almostlike building the UI before you'rebuilding the back end and sometimesthat's the right thing to do other timesI get a feeling it's it's not if youalready know a bunch of your platformgoals build the platform engine firstthen um like put the UI on top and thatkind of thing but that's a a personalthing right the platform and softwarearchitecture are symbiotic um when youreally want to build your platform kindof multiplayer mode right you like youwant dev ops infosc finance architectsall s surres all the folks playingtogether on the platform like you needto get like this collaboration going onbecause it's really hard to architect agood application without knowing whatthe platform offers and it's really hardto build a good platform without knowingwhat the application you needs it's gotto be a multiplayer game this thingrightand lastly APIs abstractions andautomation are the key i like this a lotcoupling and cohesion and the leakyabstractions are universal concepts thisis something I think developers canreally help ops folks who are often froma scripting background often not soproduct focus like these kind of thingslike teaching my ops colleagues aboutthis 10 years ago was super valuablethey taught me a whole lot aboutscripting they taught me a whole lotabout infrastructure i'm foreverthankful for that but we can teach thema little bit about some of these um coreprinciples too and finally this you knowI'll leave you thinking about this andI'm still working on this but thinkabout golden bricks rather than goldenpaths i think it's a temptationparticularly as platform engineers we'relike if only I can build this goldenpath for my developers to be you knowproductive and I really think you needto be golden bricks so developers canconstruct their own golden paths becauseif you don't there is that danger thecliche goes the golden paths do becomegolden cages at that point I'll put up acouple of uh QR codes there for feedbackand you can also check out the craticstocks I'm working on as well uh stophave a chat to me at the booth tonighti'm hanging out at SU 641 later on butthank you for your time2025-04-15 22:03:05.899837 ��D#��uAfZ_ULsJ5WGAhello hello i think Oh I can hear myselfwe are looking good i'll get startedi've got quite a bit of material topower through so uh welcome everyone ihope you're here for platformengineering for software developer andarchitects now I'll introduce myself injust a moment but I always like to startmy talks with highlevel key takeawaysthat the TLDDRs if you like and if youleave with nothing today but these threethings I will be happy we're going tohopefully dive in a lot more detail herebut these are the three things I want toI want to really stress now platformengineering should have a product focusyour developers are the customers it isnot me trying to package up DevOps andshoehorn it in a different way is verymuch a product focus we're deliveringdelivering a platform to enable you togo fast with safety and at scale aswell platform architecture softwarearchitecture are symbiotic i'm sure as adeveloper I've definitely done thisright works fine on local host pushed itto prod whoops like going over the wireeverything's a bitdifferent good APIs abstractions andautomations the three A's I often talkabout are the prize for everything Ithink in software delivery and inparticular we're not just swapping likeold tools for new tools we're trying tothink about the way we're actuallydelivering software via the platform soAPIs abstractions automation I think iskey that is the the three key takeawaysvery briefly this is me i started mycareer quite a while ago as a Javadeveloper moved into architecture bit ofops CTO type roles along the way now Ihelp folks build uh dev tools i reallylike want to cap sort of capture all myknowledge and and package up and andwork with amazing people to like pushthose ideas out in tool form i've workedon a bunch of open source projects openJDK back in the day open source Javateleresence CNCF project MSI ingresswhile I was going through my APImanagement phase and now I'm working oncratics with Cintaso and that's aplatform orchestration framework I lovesharing knowledge hence the info Qconnection hence the writing of bookswith a few friends do check them outavailable at every good book uh bookseller but that's enough about me I dowant to recount uh three years ago inValencia CubeCon EU I did a presentationthat landed relatively well I was reallypleased with the feedback I got fromI was talking about from Kubernetes topaz and what's going to be next i workedon building a few platforms on MSOS onon Kubernetes i was I we're all buildinga path a platform as a service kind ofon top of Kubernetes so where are wegoing next and I talked a lot about thisnotion of golden paths spotify leadingthe charge here of course but I reallysaid the big questions to ask ourselveswhen we're building platforms how muchdo you build yourself versus buy versusblend and how do you assemble thecontrol plane particularly fordevelopers we want a control plane intothe platform that helps us ship codeplatform should make it easy to do theright thing and help us move with speedwith safety and scalability platformengineering three years ago believe itor not was just becoming a thing at thattime it was like it wasn't really talkedabout verymuch i looked back on my dev career over20 years and I started like out of uniquite happy working on simple Java appsi moved into more enterprise use cases iwas dealing with enterprise servicebuses message cues the cognitive loadthat's that red spike went up as I hadto learn more things uh and this washard i then moved into doing a lot morespring boot a lot more uh cloud foundryruby on rails Heroku the platform wassuper simple cf push Heroku push look ov�rom third partyresources andum why is my clicker notworkinguh ah yes um and I drop in on variousCNCF projects soum I I've seen a bit of things out thereand this is largely derived from stuffI've seen and experienced and as we gothrough these um I'm going to highlightwhich things are easy to just drop inand help fix on a project and whichthings are stuff that you may need totalk to the maintainers to really getstuff going forwardbut first of all this is a UX talk forpeople who are writing Kubernetesoperators and that's kind of a weirdthing to say because Kubernetesoperators aren't really known for greatUXum and there's lots of open sourceprojects that don't have great UX andhave still been very successful uh theUX for Git if anyone really loves it I'msorry but it's badum Linux's UX is weird and mixed and youcan see all the strata and layers ofstuff why does Kubernetesmatter well because this is wheredevelopers hopefully are coming toactually live and work and hopefullythey'll find it a friendly and welcomingplace like we find CubeCon a friendlyand welcoming place furthermore this iswhere operations happen and the worstthing when you're under stress is havingto think real hard rather than havingclear obvious solutions um one of thethings I was talking about with someoneearlier today was feature flags if youhave a feature flag that says avoid bugfoo do you want that set to true orfalse you have to think for a moment sohaving clear interfaces for operatingour systems is really important whenthings go wrong and when things go wrongis the time that it's most important sothink about it all the time so that whenthings go wrong everyone is happy thatit was quickly resolved and that thereweren't mistakes that made itworse we're going to start with a bunchof basics if you're writing a customresource um if you've never written acustom resource before there's a bunchof tutorials out there and they'll fillin some of this for you but I'm going togive you a little bit more color thanjust you need to have a status status isreally where your controller whateveryou're managing tells the rest of theworld about what's goingon and sometimes you're telling amachine and sometimes you're telling aperson and so you want fields for bothso you know we have the really simplerep simple example here oh I you askedfor four replicas i've only got tworight now um one thing I'd criticizeright away is this healthy replicas ortotal replicas maybe you need a littlebit more detail but also maybe you wanta description for a human that says "Whyare there two when I asked for four?"um descriptionssentences you know here's a here'swhat's going on here are all for humansand then machines want like an enumwhat's the status of this is thisdegraded is this working uh if I sent arequest to it would I expect it to go ornot um that's all stuff you should belooking to put in your status and inboth cases references to otherKubernetes resources can be helpful umand we're going to look at a fewexamples here so um an Argo applicationhas a status that tells you that's thisum middle one here that tells you heyeverything is in sync i've applied allthe stuff you want but it also gives youa list of all the resources it createdso you can go and track those down andfigure out okay these are all the thingsthat this application controls um a Knative service i'm going to pick on Knative a bunch because I worked on a lotof this stuff we'll tell you the URLthat you can use to reach thatservice um so you say "Hey deploy thisthing." And it comes back and it says"Here's where you can reach it." Um itwould be great if we had more resourcesthat gave you that kind of here's whereyou can go to do this right away um italso tells you the last the the revisionthat is that is currently ready maybeyou have future ones that aren't workingand that it'll give you some informationabout that too but um and then if youlook at C manager they have things likewhen your C is valid for that you mightwant to know that you don't want to haveto go crawl around for they give it toyou right up front so these a�re somegood examples um and you'll see there'salso conditions which we'll talk about alittle later in the search manager umArgo and K native also have them but Icut out a bunch of stuff often status isat least as long as the spec humanswrite a short thing of this is the thingI want and then we have computers expandit out into a big long thing sometimeswe do that deterministically and we callit controllers sometimes we do itnondeterministically and we call it LLMsbut um in any case you know m computersare ways for us to write short thingsand make big things out of themum and so status is great and sometimesit doesn't make sense to have status andso when that happens you don't have toinclude it it's not a Kubernetes rulethat you have to have a spec and astatus um and sometimes you're definingconfiguration for something else andthat configuration on its own doesn'taffect anything it's only when you wireit into something else um and sosomething like a gateway class orstorage class says hey when you create apersistent volume when you create agateway use this implementation andhere's the parameters to use when youcreate that gateway it will have astatus that says oh yeah I provisionedthat stuff or you know hey I wasn't ableto provision that because you don't havequota for you know that type of disk umthat's where the the status goes forthose classes similarly if you havepolicies uh like a role binding thatgives permissions but it doesn't reallyhave a status of its own it just tellsthe Kubernetes control plane hey here'sa thing that should begrantedum another this is not in your CRD thisis a Kubernetes resource but if you'rewriting a controller you should knowabout this um Kubernetes has this notionof things called events and they'reobjects and it's a little weird to havean object that is an event and it reallysays this thing happened at least onceum you can see that somewhere over herethere is a count of how many times ithappened um that is about one of yourobjects so when you run cube controldescribe you can get a list of therecent events that have happened andthat means that your users can find outwhat's going on without having to readyour controller logs which is reallygreat if they don't have access to yourcontroller logs because they're only inonly have access to one namespace orsomething like that so um if you areauthoring a controller think about whattheevents that are reasonable and relevantare when there's a state change youprobably want to create an event ifthere's no state change and you justchecked everything don't create an eventthat'll put a whole bunch of noise onthe API server you can crash yourcluster it's a fun time done itum needed to take out that controllerand rebuild it to take out those eventsum oh and I don't know if I mentioned itbefore these little things that have alittle gopher down in the corner um meanthat you're going to need to get intothe Go code and it's probably going tobe a more involved change um some of theslides don't have the little gopher inthe corner and that is a change that'sjust in your custom resource YAML andthose are often easy to just drive byand say "Hey look Ihelped." Um and this is one of thoseexamples that doesn't have the littlegopher over in the corner um there'ssome built-in roles that Kubernetesgives you view edit admin you can assignthese to a particular namespace so I cansay I'm going to pick on Josh in thefront because I see him here i can sayJosh can can view you know the resourcesin this namespace and that's great hecan see all the pods and all thedeployments and so forth and then Ideploy a gateway API resource and hecan't see it because gateway API isn'tone of the things that Kubernetes knowsabout his permissions for if you add oneof these aggregate to view labels to acluster role it will build it into theview ro and then if you ship thateveryone who installs on the cluster whohas Vue gets to view your customresources the same way that they canview pods and deployments and don't getto view secrets so view is not universalsometimes you're like actually you don't�get to see these things but by defaultprobably give people view or edit onyour resources unless there's a goodreason not to um you can also use thesame aggregation mechanism to build yourown custom roles so K native does thisbecause we have a somewhat pluggable setof events i bet crossplane does this toowhere you have a new resource type andyou want to roll it up into some largerpermission set um aggregated roles isthe way to do it it's in the Kubernetesdocumentation read it you'll be happy oryou know moderately contentum and then this one some people findcontroversial but I found really reallypowerful um there's a standardKubernetes convention to have acondition that has a certain schema toit um and it says you should have someconditions let me give you a few morerules that I think will make you happyand will make people building a UXaround your custom resources happy umfirst of all have a top level thing asummary for humans or dashboards uhready and succeeded are the two that Irecommend ready if it's an ongoing thinglike a deployment and succeeded if it'ssomething like a job finished is not agood choice because you don't know wasit a happy finish or an unhappy finishuh succeeded is real clear about did Iget to the place I wanted or did I notum and then all the other types use thesame polarity so talk about things thatyou want to have happen so you can seehere we've got foo worked and barfetchedand I synced baz you know I don't knowwhat those things are but they're alltelling me about positive things that Iwant to get to and once you do that youcan automatically summarize in yourcontroller you know hey anything plus aplus a false means that I'm not readybecause something went bad a bunch ofstuff that's unknown means it's stillworking so if I've got some good stuffand some unknown my overall ready statusis unknown if everything's green greatit's green and you can actually do thisin your UI you can show the little greendots filling up um and if something goesred then you can show the whole thing asred um so K native has a library forthis but you can do this on your own toobut um if you do your conditions likethis machines can figure out becausethey can just say oh I'm looking forfalse bad true good when do I get readytrue um and humans can look at it andthey can figure out what's going on umthe reason that I recommend a positivepolarity here what I call it true meansgood things is when we started with Knative we had a false polarity andfailed was the bad thing and so what youwanted was failedfalse and everyone who readfailed said oh gosh itfailed when it said failed and thenfalse two bad things and you put the twobad things together and you get a goodand people's brains don't work that wayso I encourage you positive polarityhere um so now we're going to talkabout not stuff that's in your CRD butstuff that's around it um and the firstthing is think about do you need a CLIat all cube control is pretty amazinglypowerful um there is some fields in yourcustom resource definition that can makecube control way more powerful um thefirst one is additional printer columnsso when someone does a cube control geton your resource you can control whatextra information they get and you canactually give them more information ifthey ask for wide format too so you cansay oh this is kind of a detail this isthe core stuff you really need to knowum you can also add extra short names soArgo CD for example their top levelobject is called application i don'tknow anybody who wants to type outapplication everyone wants to just sayget app so you can say app is my shortname but application is the full thinguh similarly certificate no one wants totype so search manager gives youert as ashort name um there's also a categoriesgroupum there's a cube control get all thatwill list all your basic built-inKubernetes resources if you add yourresources to the all category they'llshow up there too and then when you saycube control get all you get all yourHTTP routes from gateway and you get allyour K native services and you get allyour Argo CD apps all in o�ne place andif you forget to do it you get like athird of your resources and you're likeI know there's more here what are theyand then you list all your API resourcesand you have this huge terrible scriptand you're sad um I don't want you to besad i want you to be happy that's whywe're here happinessum another thing that you can do andthis requires writing Go code anddesigning your stuff is build aggregatedobjects so you can see over on the lefthand side you've got um a K nativeservice is composed of a bunch ofdifferent stuff and on the right um ifyou want to get a search managercertificate it will spawn a bunch ofextra stuff this top level object givesyou a thing where you can summarizestuff for your users and you can seehere you know uh search manager showsyou hey the issuer and the certificate'sup to date and all that nice stuff and Knative gives you the URL and whatrevision is running and stuff like thatum and then sometimes you do need tobuild a CLI anyway so um these are acouple of good cases I've seen umsometimes there's a bunch of setup andwe'll talk about a really cool examplein just a moment of getting a wholebunch of stuff done at once and theother is um giving people a friendly wayto like read the status and like waitfor something to be done um and ifyou're thinking about that second partyou should think as well do peoplereally want a command line or are theyreally going to want like a web page andmaybe you want to build yourself alittle guey for the second case um sothe sec funny funny enough the secondcase is the one I did first here whenyou run kn service create it tells youwhat it's doing all along the way and atthe end it prints out hey your URL ishere go enjoy um and then Flux Bootstrapis really neat because it will actuallygo and set up a repository for you andset up everything that you need toinstall Flux on the cluster and then getFlux going and then the files that arein GitHub will actually reinstall Fluxover itself in the cluster so you canuse Flux to manage itself and I don'tknow how many of you have tried to setup this kind of thing on your own uh Iwould fail that eight out of 10 times iknow Kubernetes i suspect most userswould get like a third of the waythrough and something would go wrong andyou'd be debugging it so Flux Bootstrapjust gets you all the way there it'sorchestrating a workflow um and that's areally good thing to use a CLIfor and so now we're going to get intothe kind of advanced cool stuff that youcan do once you've got the basics downum so there's a bunch of types that arealready in Kubernetes that are maybeclose to what you're doing steal themuse them um object reference and labelselector are a couple things that showup everywhere don't invent your ownobject reference don't invent your ownlabel selector use the ones that are inKubernetes everyone will know themum similarly there's a pattern K nativehas done a lot with it but I've seencrossplane and other things use it aswell called duct typing where you have abunch of different objects but they allhave a particular field that matches acommon pattern so if you look at adeployment it has a spec and a templateand inside the template is a podtemplate spec you look at a damon setspec template it's the same thing insidethere um you look at a job same thinginside there you look at a K nativeservice it looks the same it's got a podtemplate spec inspect template that'snot an accident we started differentlyand then we realized everyone who knewKubernetes pods had to relearn thingswhy bother and then we had to do morework to pass things through and copythem around it's easier for people tosay "Here's this block it moves overhere."um you know borrow these things it's thebest form of flattery to the people whowrote them and it will make your lifeeasier uh another thing that I seepeople who are designing CRDs strugglewith sometimes is zero values um and I'mI'm going to pick on K native a littlebit here because they were just doing atalk about policy stuff and they hadsome really nice use of zero values soum basically zero is I didn't fill �thisin in Go if you just use like a standardint or standard list or something likethat the zero value is the same as Ididn't bother filling this in and youcan use that to say do the right thingso um they were designing permissions onyou know K native was designingpermissions on resources and they if youwant to say this this permission appliesto all the resources in a particularnamespace since this isnamespaced the target you leave emptythat's a zero thing and it says apply tothem all the other thing you could sayis "Oh you didn't say apply this toanything here's a policy and it appliesto nothing at all." That's not veryuseful i don't why would I botherwriting a big old policy to never applyit to anything make your zero valuesuseful and your users get to writeshorter YAML that's more powerful um andthey get less surprise so uh andsometimes you know this is a rule be alittle cautious with but as much aspossible zero values are real cool um Ithink go did a smart thing with theirserialization and their defaults here umand you probably know what a gooddefault is at least as well as youraverage user um otherwise why are youwriting the operator and notthem this is this is where you encodeyour knowledge of the system um and thelast thing is something that is a littletricky uh we didn't always get it rightin K native um I'm not going to say weare beautiful model citizens here butKubernetes isn't either so you knowthere's thatum GitOps so we've we've used Flux andArgo both as examples uh when you put abunch of resources into Flux or Argo orHelm there's a little bit of sequencingthat they do around CRDs because theykind of have discovered that they haveto but for the most part all theresources just get dumped into thecluster at onceand who knows like what order they showup in and if you're lucky they show upin the right order and if you're and ifyou're not uh you're trying to createresources in a namespace before youcreated the namespace and you get anerror and as much as possible try tobuild systems that you can just reapplyand reapply and it will reach a statewhere it works and it's not going togive some get in some weirdhalfinitialized state and then you'relike ah I can't go any further umuh job is another example from theKubernetes API where you try to create ajob and it already exists and it's likeI don't know what to do um I don't knowhow I would have designed it differentlybut if you can try to build your systemso that when you get ops things in andyou just throw a bunch of stuff in thereit works no matter what order you applythe resources um and this this is athink hard type of thing sorryum and then this is the last tip I'vegot and then I'll give you a checklistat the end so that if you're looking forthe distilled v you know version thatyou want to share with someone it'scoming up in just a momentum there's a couple different ways tointeract with other people's resourcesum and they're good for different thingsso labels and annotations are ways thatyou can put a little extra data onsomeone else's resource or find themlabels are particularly good at findingthings so um in this case uh searchmanager uses an annotation to find whichissuer it is if they used a labelinstead um you could make a narrowerquery to the Kubernetes API you'd alsobe limited in the types of values youcould put in there um they chose anannotation i don't know if it was a goodidea or a bad one uh at this point it'smoot because it's thereum you can use selectors on otherpeople's resources to again use labelsto find the right objects and last ofall if you're creating resourcesremember to set your owner references sothat both garbage collection works andpeople can figure out automatically heythis was created by this other guy overhere um if I need to change it Iprobably need to go affect that guyotherwise he's just going to seteverything back um to whatever he youknow whatever that controller wanted itto be to begin withum these are again they're all in theKubernetes documentation but I figuredif I was talking about how to do thisit's worth explaining lik�e if you've gotshort values that you want to look up bylabels are the the right thing if you'vegot more detailed content maybe you wanta little JSON or something to give yougive you control annotations give youthat umuh EngineX ingress famously has a wholeton of annotations that you can use tocontrol the generated enginex config andonly occasionally blows up in your faceum but uh you know that's a place wherelabels wouldn't work because labels areintended to be really short things thatyou can searchon um and so this is the authoringchecklist if you're you know looking toshare this with people I'm also going tobe putting these slides on SCAD as soonas I'm done here you can share thataround i'm going to put it on myLinkedIn um but ask yourself thesequestions when you are authoring acontroller if you're working with aprojectum and it's frustrating look at this andsee if you can feed a little bit of thisadvice you know over the over there youget to be a CNCF contributor if you'renot already you can say remember how Isaid I contributed to 10 plusprojects stuff like this you know go inand create an aggregated cluster ro andsend them a PR give them a little linkto the documentation and explain whyyou're doing it you are now a helpfuland contributing member of thatcommunity and you can put it on yourLinkedIn you can put it on your resumecontributor to projects X Y and Z um I'mnot saying you know you have to pad yourresume but if you're looking to getinvolved in the CNCF and you can writeYAML you can do some of this stuff umand if you're you know more deeply intoit write some Go code it's funum and I didn't talk about it much butum upfrontvalidation there's a bunch of settingsin both AP open API and cell that arelet you validate a lot of the resourcesas they come in and give useful errormessages back to users um before theresource is even created and so ifyou've got values where they should bein some meaningful range you know maybeit doesn't make sense to spawn a podwith 64 pabytes of memory uh you couldyou know warn people ahead of time andsay "No I'm not going to dothat 64 pabytes is too much foranyone." And thank you i'm happy to takequestions and stuff and this QR code umtakes you to the feedback on the talk ifyou want to tell me you know hey thatwas a great job or gosh you're an idiotyou know I I I want to hear what youthinkJoshi'll I'll repeat uh zero values so zerovalues sometimes zero values aresometimes zero values areso the questionso the question is um about zero valuessometimes zero is actually a meaningfulvalue and sometimes it doesn't meananything if if zero doesn't meananything I encourage you to think aboutwhat you could make it mean that'suseful if zero is a meaningful value uhgoing backhere uh to that slide the thing I saidno to over here with a pointer if youuse the pointer then you can distinguishbetween null andzero um and that's what you should doand then in your code you have to writea little if max idle time equals nullthen da da da da otherwise if it's zerodo this thing um if max idle time iszero doesn't really make sense like ifit's a timeout a timeout of zero meansas soon as you start you have to stopthat doesn't you know that you couldmake that mean that but it's not reallyhelpful to users so you can make it meansomething useful on the other hand if itwas number of replicas and scaling downto zero was a possible thing like adeployment then obviously you can't makezero mean run the right number um so yesyou have to be a little bit careful withzero um but where you can you know takeadvantage of this and where youcan't use the pointer version and makenull and zero different anddistinguishableand I'm just going to put us back tothis slide so that I get that tiny bitmore feedback thank you anyone else hasquestions i'm happy to you know repeatand answer amplify agreedisagree i do not promise to make up alimick on the spot go aheadlibraryuh so the question is is there a commonlibrary for managing the conditionpattern that I describedum and the or or is everybody doing itthemselves and the answer is sort of umK native has a library for managingconditions in K native devum package with and I forget exactlywhere it is under there that a number ofother projects use um if anyone wants tocome talk to me or I'm sure Dave is outhere somewhere Dave's over here in theaudience um or any of the other Canadianmaintainers about putting that in acommon place a lot of people haveborrowed that um some people haverewritten it uh I don't feel like itneeds to be in K native but it needs tolive somewhere and if someone wants tofind it a good home I'm happy to helpwith moving it and if no one wants tofind it a good home we're still happy toshare but we're going to version it forourselves so uh that's the that's theincentive to stand up and say "Noactually I'll help with this." is if youdon't want our versioninghi there uh and yeah it was like threeweeks of discussion to figure out thatpattern um and it was actually the UIfolks who were doing a guey who drove alot of it who were like "Hey how do weshow that this is getting ready?" And itwas like "Well you batch on this nameand it goes this way and this name." Andthey were like "We can't do that that'sdumband eventually I think it was NaomiSchaer you know was the person who said"Hey if we do it this way then this willjust work." And you know we all lookedat and we're like "Wow that'ssmart." Hey what do you Hey can you seeme oh yes uh I have a question i don'tknow if it's really really related toCRDs but it's definitely related tooperators so uh at our company we havekind of discussion going on which iswhen we have a cluster which the bunchof CRS for a specific CRD and we haveour operator and we now are going tocreate a new resource based on the CRSright yeah it can be a resource onanother account like in the cloudprovider whatever and we have thisdiscussion going on which is whathappens when we have hundreds orwhatever CRS in this and how theoperator should do should the operatorjust go ahead and just check every CRand just change everything at the sametime hitting rate limits on whateverAPIs or should we implement like a pooluh a reconcile pool where we can dothings separately and I wanted to knowyour opinion on this because at least atour companies uh it's a separating uh soum the question if I can summarize it iswe have a bunch of custom resources thatgo out and work against some externalAPI and for example when we start up ourcontroller how do we manage reconcilingthese hundreds of CRs at once and notgoing over rate limits is that aboutright yeah that's pretty much itum so I have another K native we solvedthis answer and I'm sorry for soundinglike a smug guy but um We discoveredthis problem and uh if you look in the Knative libraries we do a little bit ofinversion of control on theum on the reconcile loop and we actuallyrun a two a two-lane reconciler so whatwe found also was that when yourcontroller is starting up and it's doingthis big background pass that canactually block you fromnoticing current events because you'reprocessing through all this backlog andso what we actually do is we have one Qfor events that have just come in aschanges and one Q for background scansand the background scans is limited inin its rate and it's lower priority thanchanges that just came in through thewatch API and so we rate limit theamount of changes but we also prioritizestuff coming in over watch over thebackground informer scan and it's verycomplicated but it seems to work fairlywellum and that framework I feel like isfairly well tested we um we had a a testand I think it's probably still therethat starts up the controller and doesstuff in the cluster and they're leaderelected and every 30 seconds it findsthe leader and terminates that podforcing a new reconcile and the rest ofthe stuff should still keep workingwhile that's happening and it does greatthankscome steal our goodies they're goodwell we've we've run out of time i'mhappy to keep answering questions butI'm probably gonna stop the mic in thepresentation now[Applause]2025-04-15 22:03:06.560193 8 *8��H#��oAiiI91sUMtdghello I'm Yat Mindal i would like totell you a story about observabilitypipeline query languages uh and to befair it would be a little bit personalstory because today I'm running like asmall startup was doing database gatewaybut previously I was working for 10years and one of the observabilityvendors so if you have to blame someonefor that mess that's me unfortunatelyuh and uh before I also work at likeNvidia Quran andMeta and where the history of pipe startthey actua��t�G#��Alefjb4Vnd8kgood afternoon London I would say How'severybodydoing come on It's been a long day Iknow you want to go home I had to go topartiesright come onUm but before you're doingthat shall we make oursession optimizing matrix collection andserving when autoscaling large languagemodelworkloads as a warm up of your funtime My name is Vincent Ho It's my honorto team up with Hey my name is todayI'm a senior software engineer workingwith Bloomberg My team is specializ inbuilding an��F#��sAFEy2lhe6CM8nice to see so many of you here sorry tokeep you waiting i am Majid and this isJoffra uh so we're platform engineers inPDX Spotify's uh platform developerexperience department and we focus onbuilding tools and systems uh that helpmake uh Spotify engineers moreproductive yes uh we both have beenaround eight years at Spotify working uhin different like corners of theseinternal platforms um during this timeat the company we've seen a lot ofgrowth and uh evol evolving um ��E#��mA6KdywJWnYygyes we are excellent uh my name is EvanAnderson um I'm currently working atStack Lock but previously I've worked atVMware on the Tanu product and on GoogleCloud um one of the things I'm known foris being the founder of the Kativeproject or one of the founders of theKative project want to share the creditbecause it was really a big team effortum so I have opinions about customresources having been doing them sinceshortly after they were renamedthirdparty resources or f�so thatgave us like a good perspective of likeuh the challenges that our engineersface on their daily life uh today wewant to share with you how are we tryingto address one of these productivityblockers with the help of AI uh let'stake a look at how the problem lookslikeso as a platform department we focus notonly on measuring productivityuh we are also uh concerned about likewhat gets in the way of it that's why in2020 we started a quarterly engineeringsatisfaction survey uh we call it ENSATto get data and trends on thisproductivity boosters and bloggersacross all Spotify since the verybeginning and for every run we've seenhow issues uh with finding informationor poor or missing documentation havebeen consistently in the top three uhwhat that means is that our engineershave to spend time uh looking across uhmultiple systems to retrieveinformation let's look at how thisrecommendation looks like so as aSpotify engineer when you have uh somedoubts some issues uh what do you dofirst uh we have our internaldocumentation tech docs where we havemore than 10,000 sites with this scaleit's not only difficult to keep all theinformation up to date it's also find itit's also difficult to find the theright site that contains the informationyou're looking for we also have codelots of it uh in GitHub we have readmisuh we have reference implementations andcode examples that can save a lot oftime for your daily tasks um scatteredaround uh 22 different thousand uh22,000 different repos and when it comesto communication we use Slack a lot it'sour main tool for internal support teamsto teams so there we have a lot of ouruh institutional troubleshootingknowledge what happens with that is uhit's difficult toretrieve old answers that uh couldanswer your question so teams have tobalance a lot uh of their time betweenproviding support and answering the samequestions that get asked over and overagain with their core work that's noteven mentioning all all the rest of theinformation that we have in in workingdocuments in requests for comments inarchitectural architectural decisionrecords in um planning tools indashboards etcyesso all of you are probably familiar withwith LLMs by now or most of you at leastso two years ago we uh we saw GPT 3.5and and other advanced LLMs uhintroduced us to to to new ways ofinteracting with information usingnatural languageand these LLMs they could uh they couldproduce and understand humanlike uh textbut they came with clear limitations soearly LLMs they they had very limitedcontext windows typically 4 to 8,000tokens and they uh often just madethings up so we all became familiar withthe with the term hallucinationovernight and they were often verycostly to train from scratch or even toto fine-tuneso uh retrieval augmented generation orrag emerged as a as a pattern to dealwith uh some of these uh mainlimitations are you familiar with thathow many of you are familiar with rag bytheway i see about 30% 27% maybe of you allright so so rag is is um is a techniquethat allows you to to uh do questionanswering over your own uh private dataand the process in rag typically startswith a set of documents that containuseful information these documents areuh split into meaningful chunks and thenuh using an embedding model each chunkis converted into a a vectorrepresenting its uh semantic meaningthese vectors are stored in a vectordatabase and then when a when a user hasa question that question uh or query isalso converted using the same embeddingmodel so then uh the system uh can findrelevant information by doing a vectorsearch or or semantic search uh acrossthe knowledge baseuh so this uh addresses some of the mainlimitations uh by by uh retrieving onlythe most relevant uh uh documents so soyou you you don't overcome the thecontext window limit and this approachis also uh more cost effective rightbecause you're only dealing with asubset of information instead of ofusing the whole knowledge base and italso addresses some of the thehallucination issues by by groundingresponses in actualdocumentation so uh at Spotify we sawseveral� teams I think uh up to somethinglike eight different uh teams startingto to build their own rag solutions uhearly on so this meant we had uhduplicated efforts uh there were few orno best practices uh infrastructure andknowledge wasn't being shared betweensolutions so there was really a need toto create a single robust platform uhforrag so our team was formed and uh and wewere uh uh tasked with with creating a asingle uh platform for for knowledgeingestion and serving and to createrecognizable patterns and identitiesacross Spotifyinternally so uh in doing so we we uh weuh established some core principles uhmainly uh trust from transparency so wewe recognize that we're not a we're anaggregator of information so we're not asource of truth but users always have tobe able to verify where the informationcomesfrom and we believe there is a there isa strong synergy here where betterdocumentation leads to uh improvedresponses and better responsesincentivizes teams to to update andmaintain better documentation so so thisuh this leads to a positive feedbackloop and we wanted the platform tosupport multiple experiences and beadaptable to to various knowledgedomains we wanted to uh to allow usersto to customize the experience and wewanted to meet users where they are soto to allow uh um different interfacesin in backstage in Slack in the ID andsoon so so these principles guided ourdevelopment and and help ensure that webuild a platform that's uh that's usefulas as it grows so yeah meet Aika ourartificial intelligence knowledgeassistant here you can see it in actionintegrated into backstage our developeruhportal um I can retrieve at this pointinformation from our key knowledgesources which is internal documentationall the conversations from Slack supportchannels and our organizational data andthen it can blend all these contexttogether to generate uh useful answersbut as uh M said we don't want to likefocus only on backstage souh here we can see it also the otherclients that we mentioned we have uh acollection of clients we want a to be aknowledge platform uh powered all bythis uh same knowledge back end we haveuh Slack bot that's our Slack clientthat uh is useful for privateconversations and group conversations inchannels we have an API open to all ourdevelopers to be able to incorporate acapabilities into their um applicationsand services and we have also um Pythonclient library that uh makes it supereasy to access all this knowledge uhprogrammatically but uh let's uh seewhat ike is capableof let's look at some examples uhSpotify as a company has been aroundlong enough uh to generate their ownkind of lingo their own um set ofabbreviations and uh and jargon that canbe a bit challenging for new joiners oreven when you're moving teams take thisexample if you ask anyone uh any modeluh about MMA they're most likely goingto reference mixed martial arts atSpotify uh MMA stands for managemonitoring and alerting and here we cansee how I can retrieve this informationfrom our internal documentation andprovide the contextually correctanswer next uh at Spotify we have morethan 600 teams at this point uh findingthe right one to talk to it's also noteasy uh in this case we can see how um Iis capable to retrieve the owner of aspecific feature based on a past uhSlack conversation and lately lastly umdespite AIA being u developed tointeract with our uh internal knowledgeit still holds the capabilities of anormal um large language model uh herewe can see how we are asking about likea a general uh Python question and it'sanswering it without any problem um thisall these capabilities are very usefuluh not only for new joiners also forexperienced uh Spotifyers when they areventuring into a new domain um this canhelp when you have to both navigate uhnew technical concepts and Spotifyspecific um implementations orconventions for thisdomain yes so uh in about a year we'veseen wide adoption of uh of IAT Spotifyabout 70% of our employees have tried itat least once uh but perhaps what's moretelling is the sustained usage so uh wejust celebrated uh more th�an a thousanddaily active users uh 25% of all ouremployees use it weekly uh if you lookat developers only that number is up tosomething like86% uh but perhaps what's moreinteresting is the the effect in inbehavior this has had uh the thepositive feedback loop that we we talkedabout earlierso qualitatively we've seen a lot ofteams and users reach out to us uh forhow to best make their documentationavailable through Aika we've seen thatusers uh search for and and find missingor or or uh um outdated information andnotify documentation owners about it sothis leads to to to improveddocumentation which feeds back intoIKA yes so we've we've seen a bit ofabout what Aika is and what it can do uhso what are some of the moving parts oftheplatform joffrey talked about thedifferent clients ranging from frombackstage to Slack to Python and soon uh and architecturally our solutionis fairly similar to the standard ragsolution you saw earlier but fine-tunedfor for our needs so we have uh a mainbackend service responsible fororchestrating everything we support uhseveral LLM providers and their frontiermodels and we rely on uh severalthirdparty APIs for things like uhreranking and uh and uh embeddingsduringretrieval and we have our own inferenceservice so for some use cases we builtcustom machine learning models uh to forexample for uh for some use cases we wewant to be able to determine if anincoming question requires internalknowledge or notwe also have a custom confidence scoringmodel uh that outputs a confidence score0ero to one of how confident we feelthat an answer is helpful based on thequestion the retrieved uh documents andthe answeritself and we have uh an evaluationframework so we can uh evaluate andreason about things like uh retrievalaccuracy and and answer quality and thishelps us to to uh to move safer and andmake changes uhconfidently and all of this feeds backinto our observability system so usingopen telemetry we gather insights intoeverything that's going on uh and thisallows us to to support users and toimprove theservice and one of the decisions we madeearly on was to start small and buildfrom there so we started with a set ofhigh quality um documentation and andslack channels and um as the platformproved itself users wanted to add theirown content so we focused on buildingout our ingestion pipeline and allowingusers to maintain their own datasources but this uh this growth isn'tall withoutissues see how this animation looks likein with this Star Wars thing going onhere yeahthis growth isn't without issues so witha growing knowledge base you cansometimes uh get semantically similarbut but irrelevant answers from fromother domains so this visualizationshows our embeddings space in 3D and youcan think of it as a as a map of ourknowledge where similar concepts clustertogether so you might have for examplebackend development over there and someorganizational data over there and eachpoint represents a chunk of informationand the close closer the points are toeach other the more semantically similarthey are and vector search uh generallygets you in the in the right neighborneighborhood where you uh you canretrieve more more documents than uhthan you need and then reranking helpsto distill that down to the most usefuldocuments but when you have knowledgefrom many different domains so backenduh development web security and so onthey can sometimes overlap in in quiteunexpected ways and while reerankinghelps to to untangle this mess a bitgetting rid of the background noisehelps even moreso this is the same embedding space buthere we've highlighted a a specificregion or a set of topics that arerelevant to somediscipline so we found that often whenusers uh ask a question they have aspecific context in mind so a backendengineer for example asking about uhdeployment practices is probably notinterested in in iOS deploymentsso having this this uh all thisknowledge in the same vector space isvery useful for discovery but theability to to narrow it down and tofilter out noise unlocks newopportunities yes uh so we worked onimplementi�ng these capabilities and nowuh with the ability to filter theknowledge available um what we were ableto do is expand from being a generalassistant to creating a platform capableof uh creating focused experiences umremember the early rack initiatives thatwe mentioned before most of them werefocused on on internal support cases uhso their teams wanted to to automateaway uh this answering of the repetitivequestions uh reducing interruptions andcontext switching which is anotherproductivity drain often mentioned in inENSAT so now all these teams with thecapability of filtering the knowledgeavailable to all our clients uh were uhavailable uh were able to package uh aslice of all this knowledge space withtheir system prom to create uh theircustomized assistant or experienceum this uh allowed them to leverage allof our uh all our infrastructure umbeing able to forget about chunking andembedding and ingesting and focusingonly on what makes their um experienceor the assistant or their domain uniqueuh but we still saw that a lot of theteams at Spotify have their supportslack channel and what they reallywanted to do a lot of them were to havesome sort of automation there that coulduh automate the way their work theirsupport work so that's uh why we createduh and platformized AA goalie bot uhwhich is uh standardized uh supportsolution for for Slack uh fullycustomizable uh by our teams uh we keptthe IA identity so all uh Ike users uhcould be familiar uh with theinteractions no matter what channel theywere asking questions on and goalie isbecause um at Spotify the goalie is theperson monitoring a support channel andthey are uh in charge of handlingeverything from these more routinequestions to long troubleshootingdebugging sessions so what we saw is theopportunity to let AI uh take care ofthis more boring work and let humangoalies handle this more interestingtroubleshooting sessionsso how does this work when uh the botsees an incoming questions the firstthing to do is uh to decide if it'scapable or not to answer this questionat this given point um questions likecan you please review my PR uh are notsuitable for the bot but if we see thatit's uh capable of answering it whatit's going to do is generate an answerand using the confidence score that wementioned before um it's going to do onething or another if the confidence inthe answer is low it's just going tostop here and do nothing if theconfidence is medium it's going topropose this generated answer to thehuman goalie and let them decide how toproceed and lastly if the um confidenceis high it's automatically going to postan answer citation uh like citing allthe sources as we mentioned and savesome time for everyoneum what this meant is saving literallythousands of hours um that could be uhdevoted now to main work but we knewsince the beginning that a one uh sizefits all kind of solution wouldn't workwell here because each channel isslightly different their technical depthis different uh the docu document thequality of the documentation and eventhe team's preferences so what we did isumwe adopted a declarative YAML approachto the configuration of each channelwhich allowed the the teams to be incharge of all the parts of all themoving parts of of this system theycould decide what knowledge is availableto the bot either documentation theirthreats from past uh Slack conversationsor even their custom data sources umthey could tweak the system prom todecide on on different umparticularities for for their uh uhgoalie bot and even the confidence uh umwhat it's high medium and and low and umyeah we've seen how people has adoptingthis it's been quite remarkable um rightnow we have it working on more than 100support channels and for us what's uhmore surprising and and and rewarding isthat not all of them are from the R&Dpart of the company our approach toconfiguration without having to write asingle line of code means thatnon-technical teams have been able toadopt uh this uh bot and benefit from iton average uh the ikely bot helpsanswering 30% of the questions that umthat it sees and uh and yeah that's uhwhere all those uh thousands of hours uhcomefrom yes so um what were some of ourmost important learnings and andchallenges when when developing Aika sofirst of all we uh believe that that theretrieval is the most important part ofrag without finding the the right uhdocuments of the right information youcan't really provide accurate answers sonaturally a lot of the effort goes intothe retrieval part of it and vectorsearch is is coarse right you can thinkof it as as casting a wide net youretrieve more results uh than you needuh to make sure you don't you don't missanything important and here we'reworking towards adding hybrid search toto support some use cases better and wefound that reranking has been the thesingle best improvement we've made we'veseen uh 10 to 15% improvement inretrieval accuracy when usingreranking and we learned that morecontext isn't always uh better so uh inour measurements when doubling the theamount of documents in context we didn'tsee significantly better results andthis is because the the relevancy uhduring retrieval drops offlogarithmically so you might see a lotof improvements in say the first fivedocuments you you provide but thenrapidly diminishing returns after thatand you often just end up filling thethe context windowunnecessarily and every data sourcereally needs special consideration andwhat do I mean by that u for example uhsome of our raw data exists in in agraph structure so we have data in manydifferent formats so if it's in a graphstructure it needs to be flattened andembedded in a useful way for forquestion answering and a single Slackconversation for example can be embeddedin many different ways so you reallyneed to try to uh anticipate what kindsof questions your users want to ask ofthe data when wheningesting and perhaps uh one of thetrickiest challenges is is determiningif something is a hallucination or ifit's bad data so when you have anincorrect answer is it because the thethe LLM made it up or is it because it'scorrectly using outdated information andthat's why we always try to site sourcesso users can can verify the informationand notify documentation owners so itincentivizes them to to update staleinformation all right so what's uh nextfor us with Aika we we're evolving IAthrough a set of clear phases where eachone builds on the on the previous one sowhat we have now is is a solidfoundation of semantic search acrossmultiple knowledge sources but ourcurrent solution relies on uh onpredefined data sources right so whatwe're working on now is is enhancedreasoning so instead of users having toknow which data sources to to to querythe system can deduce automaticallybased on the the query and the contextthe user is in which data sources arerelevant and looking uh a bit furtherahead we see a evolving into a platformfor agentic capabilities so dealing withuh multi-art multi-turn questions thatrequire gathering uh information frommany different sources and perhaps evenusing tools to to access real-timeinformation and act on behalf of theuser yes andum to end for today uh we've seen howthis has helped a lot of our engineersuh so we are happy to share with all ofyou that uh we are bringing this uh acapabilities to uh Spotify portal ourmanaged solution for the backstageplatform um with this our customerswe're going to be able to uh leverageeasily all the really good uhinformation already available in theirown backstageenvironments so yeah that's uh all forus for today uh thanks for listening andstaying until the end um and uh yeah Idon't know if we have some time forquestions after the but yeah we aregoing to hang around as well we havealso a QR code for feedback or in caseyou want to retrieve the the slidesthanks a lot yes thank you2025-04-15 22:03:07.092932�d maintaining AI inferenceplatform I've been the lead of K nativeoperation work group for six years I'vebeen evangelizing open technology andcontributing to open source projectsDuring thepast 10 years maybe maybe more um likeopen stack openkurf but don't worry we'll focus on thethree dots today Jerry Yeah So my nameis Urka I work on a cool company calledKify where we try to do production gradeKDA and I'm also contributor to theproject called KGB I like open source 3Dprinting and you will see me in the uhsecond half of the of the session So seeyou Okay Thank youJerry This is our agenda for today Iwill start with how in generalautoscaling works horizontal autoscaling works in Kubernetes and talkabout the challenges which largelanguage work uh with large languagemodel workloads and the existingautoscaling solution in the market andhow we optimize the matrix collectionand serving with our solutions In theend we'll run ademo previous episodes in CubeCon Tellme why and tell me why I'm sorry Icannot think here today Uh I used tocarry many questions about how we canscale large model workloads and whatkind of matches we canleverage Have fun time on thatstage And thanks to these awesome folksI met in Salt Lake City Arshock fromGoogle Lumila from Microsoft and Davidfrom Red Hat After attending theirsessions my confusion will resolved alot I put the links over here to theirtalks You can refer tolater Okay For our session as it's inLondon let's shall we do somethingspecial for LondonHuh to be or not to be I am going to doit Act one Auto skating is heavy andlight bright and dark hot and cold seekand healthy asleep and awake iseverything except what it is Oh it is socomplicated In cloud computingautoscaling is a method to dynamicallyadjust the resource based on the loadautomaticallyIncube it means a feature that allows thecluster to increase or decrease thenumber of pods or just a pod resource inresponse to demand We have HPA forhorizontal scaler and VPA as a verticalscaler but we'll focus on thehorizontal autoscaling in cubetoday Why do we need autoscalingbecause we like to efficiently leverageuse our resources without influencingthe uh service uptime performanceNeither of them is good over orunderprovisioning This diagram will showin general how autoscaling works incubernetes Uh what we can scale based onis something we can callmetrics Metrics has a broad range Theycan be the basic resource like CPU andmemory They can be the custom metricsexposed by our workload and they can bemetrics that possibly come from outsideworkload Okay Metric that need to becollected and save it somewhere Well wecall it metricregistry After that we need to make surethe metric can be consumed by the HPAotherwise have no meaning And that iswhat the uh matrix server is doing rightover here On one end it can discover andaggregate those metrics from registry onthe other end and make sure that the HPAcan read the metric from it So once themetric reach theHPA you run some calculation maybe easymaybe complex and compare with thetarget value and make scanning decisionsthat generally how auto works withcoup by my head here come the challengesby my heel wecannot sorry wrong line by my heel Wecare alot Notnot in in yeah in Kubernetes environmentactually especially in my companyBloomber we build on premises AIinference platform empowered by Ker Quermakes the model highly scalable as aservice as it is a standard cloudagnostic model inference platform toserve both predictive and generativemodel on cube Now it's an independentproject but it's under Apache licenseand it's on the way to become a CNCFproject Well my mission was to build theautoscaling solution for large languagemodel workloads withKSER However the large language modelworkload has brought us a paradigm shiftin termsof what okay computer resource We usedto use CPU now we use GPUsand at the same time thematrix the matrix also become differentsome something that work before butdoesn't work right now for exampledefault could use like CPU or memory toscale something but they do not reflectthe usage of GPU and l�ater we gotanother matrix like the number ofrequests per second but this also Sodoesn't reflect the GPU usage latency orsuper processeither And speaking of the some criteriathat we can auto spell the largelanguage model workloads we can think ofeither latency or throughut But ingeneral the problem is that none of theexisting workloadsautoscaling metrics can fit our needsSo I say projecting back to this diagramof how autoscaling works in kuberneteswe got like five not I will say fivegroup of questions five group ofquestion we need to address so one weneed to figure out where to read thematrix and what kind of metric we canleverage to scale our workloads andsecond we have no idea where to savethem and um yeah that's probably what'sthe metrics metric registry is three weneed to find out what can be used asmetric server to discover these metricsand ser them toHPA and four large a model usuallyshould be can be very large yeah that'swhy they call large it could be uh over100 gigabytes or something so how can Imake it load and scale fast after thismatrix value has been changed well canwe even configure how fast we scale ourworkloads andFive Is there a solution that isflexible and more portable to all kindsofplatforms all right Act three And trustme love in my eyes So do you Dry sorrowdrinks ourbloodOh man Huhwell speaking of the existing ways thatwe're doing autoscaling well let's startfrom the easiestone No that'sit We try to leverage the matrixserver The benefit is that it's aintegral that cumulative nativecomponent that can scale very fast byleveraging small amount of resourcesIt can read the matrix from thecubate and export to the HPAdirectlybut the limitation isthat only CPU and memory are supportedSecond this is a good thing This is agoodproject As one ofdependencies forKer K native serving implements its ownK native pod autoscaler to directlycontrol the number of paths It can scalebased on the number of requests persecond And this metric can be collectedby either Q proxy scar or the activatorcomponents The number of the requestscan change very fast Well it can bereflected in the KP very fast as wellbecause no mattress pulling is involvedIt always push well once there was achange KPA got change immediatelyAnd it can scale down to zero as wellScale workload down tozero Well but the following drawbacksjust make it impossible for us to use itOne RPS request per second does not workfor large language model Second acomplex algorithm or calculation has tobe maintained the source code Third toextend this model No I mean thisarchitecture with more metrics we haveto write our newplugins Four the RPS metric associatewith K native only and cannot be portedto other platform unless you useit Third one h this going to be moreinterestingRecently the virtual large languagemodel community has announced the uh AIbricks as the cloud native solutionoptimize fordeploying managing and scaling largelanguage model inference It supportsboth matrix based autoscaling andoptimizer based autoscaling Itimplements autoscaling in a similar wayto K nativeserving and all the large language modelmatrix can be blended into this APA tocontrol the number of paths Well theseare all good Very good But thelimitation is that Ker is a backbone ofour infant platform We cannot leveragethis solutionthat unable to integrate ourplatform Walking through thosechallenges one more time So what's thesource of the matrix and what kind ofcriteria to autoscale a large languagemodel workload well we have the answerThe source could be the workload itselfAs long as my workload has exposed thematrix as apass/mmetric kind of per standard noworry about that and uh we can scaleeither think of is more latencysensitive or throughut those kind ofstandards So two second question how canwe scale it fast or in a config andconfigurable way what we try to we cantry to leverage uh the lightweightcomponents as many as possible using apushingmode not the push mode not the pullingthe push mode not the pulling mode asmany as as much as possible I think andthe mo most importantly we can make i�tscale fast because there is a very nicefeature called model cache which isimplement but the answer to the rest ofthe question still remains unknown uhwhat can we do for what can we use forthe matrix registry or server how can wemake it flexible orextensible once there any kind ofchanges we have no we don't know soThe thing is that we do not try toinvent the wheel We try to stand on agiant shoulders So who are thegiantsand can we find them when can we findthem so act four as it been If it be nowthis not to come If it be not to come itwill be now Wow Good newsoptimization of metric collection andserving We'll show you how Jerry stageis yours Thank you Vincent I don't haveany Shakespeare lines for you but thanksfor the great intro and state-of-the-artYeah I don't know So one of the giantsto continue with the analogy is OpenTelemetry Are you guys familiar familiarwith Open Telemetryyeah a lot of boos out there having thelogo on it on them So it's likestandardization effort umbrella slashsets of standards for traces metrics andand logs right and they've got somehandy utility called hotel collectorwhich we'll be using alot So another one is a keta I've gotconfig interest because I work on it sobut it's great right uh it's it's uhbuilt around scale object which is acustom CR that points to a deployment orany scalable resource and the mainbenefit uh compared to pure HPA is thatit can scale things to zero but also ithas a multiple different ways to plugmetrics in So it doesn't have to scaleonly based on CPU or memory butbasically based onanything And the third part we'll beusing is my own contribution It's hoteladd-on for keta And this thing actuallybridges the gap between the opentelemetry world When you have this hotelcollectors that forms kind of pipelineswe have some receivers processors andexporters And this this guy can uhlisten to those uh other exporters So itcontains OTLP receiver in theirterminology So it's kind of a sync to ofthose metrics but at the same time cantalk to KDA and can scale and can canadvise SCADA the the scaling decisiondecisions It also contains short-termmemory storage for couple of data pointsSo it's like really small promeuslightweight promeuse crafted just forscaling because that's uh ka also canincorporate incorporate promeus metricsbut the issue with this setup is thatprometers get overwhelmed and imagine ascenario when you have a graphana setupand also like alert alert manager and ifyou use uh prometers with this scenarioalso for scaling it's overwhelmed Sothat's that's the reasoning behind itOur setup looks like this Oh this islike one of the possible setups when wehave one open telemetry collectorscraping the replica ports with modelsThis is the uh bottom left segment andthis sends the metrics to hotel add-onfrom which kada takes them and scalesthe number of replicas of themodel Another possible uh setup is thisThis one when we plug uh hotel collectorto each pot using a sidecar sidecarmodel and this setup is a little bitbetter because it can react uh morequickly and we can leverage the opentelemetry operator to to help with withthe setup So without further ado let'scheck thedemo Uh so what I'm is it visibleyeah Coolstarts Well I don't have networkOops Nice So let me check theWi-Fi on stage I will create a setup Iwas created hotspotUh yeahOh yeah I will use my my own own ownWi-Fi Hopefully it's going to workLive demo right yeah it's live demo It'sit's a live cluster running in GCP Yeahit works First problem solved So we havea open web UI pod which is a web UI toLLMs and llama version 3 deployed in ourcluster in GCP Uh just to convince youwe have a ingress ingress resourcecalled llmwebcamzer.dev and I'm using the same URLin here so I can talk with the uh modelOkay I am on cube con and wifi is notgreat and it responds withsomething but I can use also curl API totalk with the model the same waySo this is basically the open open AAIum HTTP protocol There is stream equalstrue or false If it's true the modelwill start sending the tokens in astream fashion So it feels it respondsquickly more quic�kly So if I do it it'sdoing the same thing as we show as wesee before but in a command line Rightthis max token actually denotes like howlong the response should be And this isactually the thing that makes the uhrequest different because one client canask like short answer the other one canask for multiple pages ofresponsesUh if I set the stream to false it worksthe same way but it waits a little bitlonger but then returns everything atonce Right we'll be using this curl orthis HTTP uh API to create some pressureto our model to be able to scaleit like what's the criteria what why ornot why but what's the scaling decisionwe will be using we'll be using internalmetrics from the model itself we areusing uh VLM runtime and they exposessome uh GPU stats about around the thesystem so we'll use KV cache inparticular and in the next part of thedemo we'll use also the waiting Q forfor the request So let mefirst port forward the request from themodel This will share the metricendpoints from one of the replica portsIfI yeah I'm basically curling the metricsbecause now it's local host right andgrabbing two of them which we will beusing It's zerouh let me split thescreen and I can also describe the scalescaled object So scaled object rememberit is the thing from keta that has atarget deployment that it's scalingSodescribe somodel it has something called uh wherescale targetreferences the deployment in our case isthe this is the model and this part isactually uh promql like syntax is notfull-blown promqql and this is sentthrough our hotel add-on where it'sevaluated and the correct metric isreturned and collect metric value isreturned We don't support fullprimitives uh syntax because it's crazyright and and there are ways to overcomethis this issue with using processors inhotel world For instance uh keta wasn'table to work with um float values Sowhat we do did here is we scaled thevalue by multiplying by by 100 and itworked And this multiplication is donein the hotel collector itselfI can also show youthe the opentelemetry CR which is responsible foropen telemetry operator injecting thesidec cars to ourmodels Oops And thisguy contains the configuration for thosesmall sidecars hotelcollectors So for instance this is thetarget address where we send the metricsThis is the hoteladd-on This is the filtering part So weare interesting only in those threemetrics and this is the uh part where wemanipulate the metric a bit to multiplyit by 100 In the next release of keta itwill be fixed and it will also supportfloatvalues Right So now let's me do somesome load on the server I will be usinghey command which is basically creating300 threads and we'll be uh doing postrequest the same with the same same samerequest as we saw before This is thebiggest difference It has uh 4k tokensSo it's much more uh much moreuh pressure there will much morepressure on the GPU So if I do that weshould be we should start uh seeing thethe metrics comingup The top top line is the number of uhrequests being ceued I should show youthe pots right so yeah there's still oneif I be watchingitUhaliases Yeah I wasn't fast enough So itscaled up already because it detected itit's above the threshold The thresholdwas 30 So it spawn three more replicasUnfortunately we have only two GPUs Sothat's another topic or like subtopic inourtalk How to handle not enough GPUproblem right because provided you have100 GPUs you just scale the replicas to100 and it will just work But we don'thave endless number number of GPUs So wecan use Carpenter It's a nice tool thatcan add new G uh Kubernetes nodes withuh another GPUs and it works well withAWS or Azure but for instance nosolution for GCP yet or we can usecluster API which I really love and it'struly open source project It's harder toset up but uh we can scale it with ketaWith this we have a lot of troubles tosolve like for instance data localitybecause these models are huge They arelike gigabytes in memory they needs tobe loaded to me to GPU and we need tomake sure that the data is close to tothe code that's running actually andalso boot�strapping the node likeinstalling Nvidia drivers making surethat images are are pulled and itdoesn't take ages to to to spin up newnewreplicas and we totally will neglect theload balancing aspect of the problembecause imagine that we have a GPU thatis underutilized we should be probablywanting to having the traffic going tothat GPU not and not to the overressuredone but these metrics can be used forfor it as well it can be furtherimprovements so a couple of words aboutcluster API it's a there are some talksin the c this coupon as well it's for meit's bunch of CRDs and controllers thatcan help you with deploying anotherKubernetes cluster but they've also havea is have a have a have a support forcreating those so-called um self-managedclusters these they they've got CLIcalled cluster Ctl and it can do clusterCtl move and then the cluster in a wayknows about it about itself and canscale its nodes or even create evenupgrade Kubernetes version and thingslike that A nice thing about it is itcontains machine deployment which is ascalable resource So it has number ofreplicas and you can do things like thata cube cut scale machine deployment andsay I I want two replicas and that's uhthat's that's why it clicks well withketa because keta can scale this machinedeployment guys So let's do that seethat in action I actually have yetanother scaleobject for for the nodesitself but this one is actually pausedthat's why it's not triggering So let meunpauseitAnnotate not the model one this one Soit should befalse So now it should start doingsomething I can split the screenagain List a Kubernetes notes in hereThis is going to take a little bitlonger because it's not that quick I'llbe watching ithere And also I can watch the machinedeployments So remember machinedeployments shortly MD is the resourcefor clusterAPI that's scalable resource and we cansee there is just one replica but soonthere should be morereplic and I can meanwhile I candescribe also the yeah so already it'sclearing up so it detected in here nowit takes like minutes or two to preparethe VM It's actually talking to G to GCPcluster creating VM for us Then this uhnew VM needs to run cubadm join ourcluster It takes some time and I use uit's called uh image builder fromcluster API to actually uh bake thosecontainer images into the VM image tomake it make the startup a little bitfaster But for Nvidia drivers I'm usingGPU operator which makes it a little bitlonger because it installs Nvidiadrivers and this this step takes like 3minutes much better approach would be tocompile the drivers directly to the scalversion but for GCP I didn't have luckto to do that but it's for further microoptimization I guess uh I can show youthe scaled object for for thenodesnodes and we can see that here here thescale target ref is not the deploymentanymore but our machine deploymentcalled demo GPU nodes So it's thisguy and also nice feature about keta isthat you can have actually multipletriggers in our in one scaled object andby default it takes the maximum value ifthere are multiple triggers and one ofthem is uh chromebased chromebasedscheduleuler schedule and uh you can seethat minimum number of replica for thisscale object is zero but during thistime uh during between office hours sostarting from 10 to to 8:00 am thereplica should be one So what iteffectively does is during off hours itwill scale our GPU zero GPU nodes tozero It's because they cost money Theyare expensive and your AI experts don'twork at during nights So why to why topay forthem it's still scaling up Maybe we canswitch back to demo and in the end itwill just work Yeah Live demo It's gonnawork but it takes some time Yeah Somaybe just to sum it up right we have wehave a reference Yeah Yeah Just for mycompany that's we are hiring and uhthat's our my our booth number and uhyou can reach out to anyone reach out tome or anyone with a bloomberg badge oror recruiters and um yeah same for us weare onify boo come say helloOkay let's uh do the thing together andcan we can go to the question and answersession and still waiting for the uh nocoming up for the demo is online is it'savailable online so everything can bereplicated the cluster API script iseverything is there do we do we put ourlink in our chart uh sorry do we put thelink to our repositories in our this isthe link like if you want to scale scalethe QR this is the link for the yeahyeah it's a very important man orthere's a short term if you want yeahOkayYeah Yeah Okay So I can switch back todemo Yeah And uh we can Yeah we canquestions or if you have any We can seethe node is ready And now actually theGPU operator will start do the work It'snot like like this but it worksYeah Live demo man First time We havesome time so we take any questions Yeshave a question Yeah went to the mic Weuse SCA also and sometimes when we scaleup it's already quite late because maybewe receive an spike of data and then wescale up but when we already have ourcollector going up data is already goingto stable Makes sense Any suggestion howwe can Yeah sure those problems SureSure Question Sure You are askingbasically about the stabilization windowfor um I I can describe it in here scaleoperatorjust one of them both of them has itIt's called stabilization window and youcan uh modify the way HPA actually worksbecause HPA is still being being usedbut under the covers it's likeimplementation detail for KA So here wesay with this number that SC for scalingup the metrics needs to be the thresholdneeds to be reached only for one secondIf you increase this number for 100 itmeans like 100 seconds needs to measureabove the threshold and only then itwill scale up And same goes for scalingdown That's actually a good questionbecause these nodes takes a while tospin up It doesn't make sense if youmeasure less to immediately kill themright so here we are saying that itneeds to be there for I don't know 20minutes or what is ityeah 20 20 minutesOkayYeahI think that um for the beginning uh ifwe show the architecture of cases likethis is running on top of K YeahWe can we can use RPS as so and thiscase we switch to like our custom YesYes So would it be possible to to uselike as a in case that here we it is wehave only LMbut in case if we have another funnysure can okay I can I can go a bit withfor example if you want to use a caseserve to run your serving platform thisis kind of road map we're having now sayin the oldum workloads we either can scale basedon CPU or memory this basic one andlater as in kative serving can based onthe number of requests concurrentrequests but in the large language modelthey didn't work for us for example Iget100 requests coming at the same timethat's doesn't necessarily means my GPUconsumption is high they do not havethat kind of direct correlation to eachother So like in logical model there aretwo kind of criteria you can look at aswe just said you either look at latencyor is look look at the throughput So bylatency if they compile or conform tothis virtual logic algorith model spec 0to one metric you can look at which iscalled the KV cache percentage it's fromzero to one one means 100% it means howmuch GPU usage you have been used forthat KV cache if it's high which meansI'm actively processing your requestsdoing inference at the same time if it'slow which means Okay that's fine I canhandle But if go above a kind ofthreshold for example you can set to 5050% or 60% if you go above that I canscale up for larger model workloads Somatrix collection you we need to pick upthe correct matrix to scale ourworkloads But for serverless well we canalso do that likewe not model workloads based ondifferent workload pick up differentmetrics and scale based on themRunn Okay Thank you Yeah Thank you Thankyou guys Thank you folks You have awonderful night2025-04-15 22:03:07.843327�lly started with Unix so evenbefore Linux and computers were reallyslow at the time and the idea was liketo write like a very specializedprograms like just find if one word isin the text do this thing well but ifyou want to do anything moresophisticated you could like chain thoseprograms together so you could redirectsadane or s either to file or either toanother program and this was like it islike very core of unix philosophies thatworks surprisingly well it stood thetime of the time and even today if youhave any Linux server you may just dealwith your logs that way right like youcould just one comment find files theother command to do like a grip on thosefile whether there is error sort themhave unique them sort them by count andthen do some headoperations and if you don't run likeoperating systems just in your computerthat may be enough but as we know systemhas evolved a lot and this idea of pipeswas let brought by one of the first Iwould say full text commercial vendorsplank that quickly realized that youknow systems breaks all the time so whatif we have like Google for this machinedata so just ingest all of the text ofyour machine data keep them incentralized story and then you couldlike use similar syntax to querylanguages of course there's like onevery major important difference why theUnix pipes were very like imperative sokind of like you know you pretty muchexecute one tractor after the other whythis query language in distributedsystems or like in is like declarativeso it's kind of like SQL you tell itwhat needs to be done but actually thecomputers or system could optimizebehind this and how this wouldwork and this became quite popular andfor those you you may not know it it'sactually became popular and almost likeevery lock system has some version ofthat whether it's like Microsoft custolike uh EQL there's a lot of them butthe idea is like pretty much the sameyou select some data either on type oreither on like some index you do like afull text search on that or text onfield and then you either extract morefiles using some regular expression orsimilar things and then you also likechart them by some aggregationUh and the idea is like usually it'salso like a wrap in nice UI fewer peoplewrite it like manually in configurationfiles most people write through somelike user interface so the idea of thatthat having those syntax automaticallytrigger some sort of visualizations andyou could like debug what's going onwith yourdata and the reason why this languagetook off and that's like a very rarecase that the language that was not SQLactually took off was because it wasreally optimized for real-time insightyour system crashed you may be woken atnight and you quickly like to figure outwhat's going to happen right so you weredealing it on at hco basic it wasiterative oh you look for an errorthere's a lot of errors maybe look formore specific errors or maybe group itby from which machine or Kubernetes spotthey came uh it was quite concise so alot of people like are able to write ituh especially developers devops findthis syntax familiar because they wereusually like familiar with Unix so a lotof them picked and it was also likepretty much integrated withvisualization tools so automatically inthis language you could also like seesome graphs charts over time uh and thisapproach becamepopular from another approach this waslike from the lock centric of the worlduh you know there was like you know onceupon a time bork and out of bor camekubernetes and there was like also likea once upon a time in Google like aborgmon so system to monitor bork whichlater came to open source as a promeusalso like there was like a lot of othertooling other than promeus like graphanaand this became go-to system for queringlike metrics and at the time usuallythere was like a two type of companiessome companies which started from logsand very were very like lockentcentricbut the other companies which weremetriccentric and pretty much you knowthey need to query metrics right so youknow they came up withsimilar language or like kind of like toquer�y metrics and there the problem is alittle bit different because usually youwant to create some time series applysome filters and do someaggregation so inpromql I guess maybe quick question howmany of you have written some promqlqueriesi don't have to explain maybe that muchbut for me it feels like a regularexpression out of like you know forquering it's very concise you could do alot of stuff you select sometime seriesand then you do some operations andthose metrics usually came from scrapinga lot of like services okay what'shappening in given second at thatcomputer and the reason why this is likevery popular uh um One of them reason islike usually you have like one metricand while your Kubernetes spot is aliveyou just keep producing more data pointsso it's like uh not like a log thatevery log could be different and couldhave like some metadata different nousually like it's like rather like asmaller number of metric and then likethey keep appearing over some period umthey like usually naturally order bytime so there's a lot of like implicitassumptions when you do promql oh you'reactually already ordering by timethere's like some window functionthere's like some function to dointerpolations or like turning absolutenumbers to rate numbers which make itvery very effective especially for thoseyou want to put it somewhere in the YAMLfile okay is my servicehealthy and youknow that would not be that bad if thatwould be just two languages right butwhat happened was like observability islike one of the most successful categoryin the venture so if you look plot allof the companies how much money theyrised and how big they became after atsome point uh of course today's stockmarket is down i put at some point butstill you know you get people if yourise some millions and they I believeright spent just 50 millions and becameworth 47 billion so you turn millioninto billion right and that's like a twocompanies and if you look how manycompanies went IPO for billions or thereare some and I'm also like it's I don'tthink this list is complete there arestill companies private and I'm not evencounting like companies wanting hundredof million dollars so what happened likeright now we can see there is like a bigmarket many success there's like so manystartups wanting to do observabilityit used to be that you have to doeverything so that's why there was likefewer companies but today you know wehave open telemetry we have like CNCF soyou could just plug to collection youdon't have to do it you could just plugto this rich open source ecosystemthere's a lot of open source tooling toproduce like you know everything else soalso you need to you don't need to bethat big expert in distributed systemyou could just use one of the enginesand bit like observabilitycompany and the challenge that is likenow almost like a lot of yeah somepeople standardize on promql but a lotof people wrote their own languages andnow you know um there's like a big messbecause everybody has like a differentversion of syntax and though in thebusiness yes there's a lot of likeanalytical databases they still prettymuch all of them support SQL yeah thereare some differences but fundamentallythe language is the same but inobservability is like all over the placeso what we end up happening people arenot using many advanced features of thelanguages they are not extending thembecause you know it's hard to learn andmaybe then you'll be vendor locked sowhy bother and we are not taking fulladvantages to give you some example evenif you do very basic queries oh I wouldlike look at my logs parse app name outof those texts and do some aggregationthen you know pretty much every vendorhas like a little bit different syntaxand what's even worse is not just syntaxsometimes like even the results aredifferent so that's even worse than justlike translating one vendor seen to theother when you're migrating but somevendors operate still on tokens so ifyou search for invalid type in onevendor you may find invalid type errorin the other not because you still needlike a wild card right and here are� justpopular I wouldsay query languages there is in fact waymore some companies even have like somedifference between their versions theywent through some acquisitions so youknow not a fun storyAnd this seems to be like a similarsituation as before open telemetry whenevery vendor tried to invent their owncollection which you know for somepeople may work but I would say that'svery short-sighted approach that youknow you lock c you may think oh youlock customersuh because you have your own language Iwould argue okay you know did it work incollection not so much we all loved opentelemetry we all loved like you knowpicking the best tool or picking themost innovative tool some of the other Iwould say fluff argument I'm hearingokay you could innovate faster becauseit's notstandardized not true sql is also likeevolving people are adding like you youdon't need like 100% compatibility butat least you have like a commonguardless how you evolve and in the longrun it's like everybody pain becausethere is like uh not that much you knowautomation that sometimes you can buildif you would like to have let's sayprovide threat intelligence to everybodyand build some automation ofobservability systems now you have tobuild a lot of integrations maintainthem uh today there's a lot of like AIcompanies which like would like to havelike uh AI agent to automate some stufffor you and they also have to have a lotof integrations and not all of them workwell uh and even within companies a lotof big companies use like a severalvendors and they have like a hard timestandardizing because maybe they use onevendor because you know it's like reallygreat vendor but it's kind of likepricey the other is like more volumevendor so even then people have toswitch tools to figure out what is thebest way to troubleshoot that scenariouh and of course like it's not justabout observability signals you have toswitch between logs metrics traces italso like about context you may havelike you know user ID and then you wouldlike to join that with some uh data okaywho is this and then yes some vendorssupport uploading the references filesbut then you have not like eventuallyconsistency so data may be not accurateso sometimes people also like go todatabase to figure out who's actuallyaffected and this makes you know youknow it could have been better if we alladopted samestandard so you know we are in this messlike you know is there anyhope luckily if we look at the dataworld it used to a lot of people thinkyou know oh uh there was a lot ofinterest in NoSQL databases but now weare coming back and a lot of I would saygood ideas from NoSQL how to parse JSONshow to deal with unstructured data inSQL are adopted by SQL and we see a lotof like engines also like if people wantto have this advanced capabilitiesthere's a lot of like libraries projectsthat you could uh do that also like ifyou let's say would like to have like aparse SQL and parse it just to uhengines there's also like you knowformats how to have intermediaterepresentation of execution planuh also like you know this SQL dialectsbecame less of a pain because you knowpeople wrote a lot of like libraries howto translate between one syntax to theother uh and you know uh if we see atthe business side of the things you knowuh uh they also want to dig intoobservability data they also see youknow the same data that you are using todebug system may be used to debug oh whyare customers leaving or whether thisproduction issue actually affect ourretention or maybe if it affects likeconversion so the business people arealso interesting to get intoobservability data but they are notalways seeing that's like thatcompletely different problem somecompanies are using like you know datawarehouses to some of those debuggingsother are using observability tools sothis lines is starting getting like ablurred and if we look at the researchyou know uh there was like a researchand this is like a very fun paper thatyou know somebody wrote like a paper in2005 then they wrote paper 20 years agoand the conclusion wassame actually like SQA is li�ke a reallygood model it actually was like createdlike a 50 years ago it survives a lot oflike distractions and what usuallyhappens people have some ideas they dosomething outside of SQL and eventuallySQL adopt how to do JSON how to do thatso even uh a lot of like database thesedays are more like a new SQL not like aNoSQL and one of the recent exampleslike the SQL was able even to add graphsupport so SQL is like not reallycoupled to one I would say storage itcould be column based it could be rowbase but it's like very convenientabstraction okay how can we query thingsin unifiedway so then you know luckily there iscloud native um unifi body that's triedto figure out okay how can we unifythese languages it took a lot of effortit's like happening for over two yearsthe interview a a lot of like DSLdesigners and you know like withprogramming languages there's like atons of opinion it's very hard to cameto any conclusion because you know youknow how many years took debate betweenspaces and taps so you know that's hardproblem to solve harder than think butthe conclusion was like you know yeahsyntax may evolve it may feel right ornot but still even before syntax there'slike a problem with semantic like how doyou deal with numbers is this numbercould be null like what do you mean bythis operation and the conclusion islike you know why does not adopt SQLsemantic which would make other thingseasier because syntax is kind of likestill in the flux how it should be butuh semantic would be kind of like firststep how to get to unified querylanguage right so then like you know onequestion is like okay since we arecoming back to SQL after so much timeswhy we have not started with SQL in thefirst facei would say some of there were like gapslike implementation gaps like because intraditional analytics yeah you couldspend like a day or two figuring outwhat's the optimal price for youre-commerce shop you could spend a lot oftime doing like data pipelines but ifyou are like you know s sur debugginglike an outage you rather have like aquick answer like if you have like aimportant enterprise customer loggingissue okay the system is buggy after therecent deploy you rather have like quickanswer right so uh people inobservability are very rushed to doquickanalytics and usually you don't careabout it's not like a financial reportif you have like a million errors ormillion two errors if you have done manyerrors you already knew that probablyyou need to filter out or aggregate bysomething to do that and so people arereally rushed to do it very quickly um alot of also like things that are verycommon in observabilityuh are rather hard and quite advanced towrite in SQL and also like people if youare debugging they're used to to macrosthey're used to extensions okay uh whatthis IP address if is like from whichcountry and it's not so clear how tocleanly add it to SQLso to give you one example like you knowrate like one of the most popularoperator how you deal with in promuseyou scrape metrics every minute was thetotal number of requests but it's not sofun to just keep having cumulativemetrics you would like to have okay howmany requests I have per second perminute right and that's like a rateoperator the equivalent and this is likeeven not correct equivalent but you knowI run out of real estate on my slidewas how to do that in SQL all right evenI was trying to ask some generative AIand most generative AI took like a freedrill downs oh please don't forget aboutthis edge case before they were evenable to write this thing why because youknow first of all you need to know aboutwindow functions which is rather I wouldsay advanced feature of SQL then youneed to care about all of these edgecases okay what if my protest likerestart in the middle and you know mymetrics kind of likeoverturns okay also keep care what if Ihave like a missing metric from 1 secondor 1 minute because some scraping didnot worked and know that's the query andthe story is that it's not longly longit's also like not readable so itdoesn't express your intent if you haveto write that long �SQL for such a basicthing and unfortunately that's not eventhe worst example there's like manythings like that and even here I wouldargue there's also like edge conditionslike okay what do you do with the firstmetric and the last which I not handledhere but if I would do that would betwice longeranother example in log analytics verycommon case you have like a somebreakage you would like to sortby top 10 common I don't know like potcontainer apps that were hit by thiserror right and here is like oneinconsistent because some of thelanguages also like will include countof the others in the others you have tobe explicit about it but let's assume Iwant like a top 10 results and I willalso sum of all the other apps thatweren't in top10 that another like that's like threesubqueries to do it well right and again itdoes not it's kind of like break I wouldsay the beauty of SQL where you wouldlike to express your intent here you arerather like dealing okay how can I fitin SQL query languages operations uhefficientlyand what's even worse uh the good partis like most of the modern SQL enginescould execute that really efficientlybut what's worse is like you know okaythat's not the only operations usuallyyou have like a 10 operations like thatafter the other so you know stackingthem together would just make this thingevenworse so l like soluckily recently like last year finallyyou know people came to conclusion thatSQL is Great we cannot break with legacyit's almost like you know yeah in theorymaybe the common language should beEspiranto and we should all learnEspiranto but this story is that youknow Espiranto is not that easier asEnglish and English was already usedfirst by British Empire then by USA soit's already piggybacking by a hugepeople if you can use it so maybe weshould use the same thing with SQL andSQL is actually like a great languagethat have a lot of features but it gotlike a two flaws first of them is likeordering that's like not very intuitivelike a filtering operation could bewhere having and qualify depends whereyou put it uh it's even worse when youhave to do subqueries and common tableexpression that you have a lot of thisbut you know that's just syntacticalsugar we can just add like a pipeoperator and that's pretty much this topSQL is like equivalent to the lowerone and you think okay that's like asyntactical sugar not a big deal thatgives like another advantage that now wehave like a more logical place where toput those extension how to do thingsthat are usually like very ugly liketable value functions no now we can justcall one operator and add customoperators to SQL which either could becompiled to other SQL as of now but youknow also like some engines could addnative support forthat and let me show like one demo sothis is like current discussion this isalready implemented in Google bigqueryand datab bricks fireboard so there'slike growing like support okay this islike actually like a nice SQL features Iwe also like implemented like a smallthing because you know we are engineerswe you I like papers but I like workingcode more so we open sort of like asmall kind of like proof of concept demoof this concept how this could work andknow and we've write as like a graphanaplugin so we are looking in open SSHlogs and of course here's like someordering word that could be implicit setby uh time range and we can look forsome log linesuh as as you may see like we areinvestigating SSH logs but maybesomebody is trying to breakups andunfortunate and luckily we identifiedthe right loglines which you know okay here is likethe log line but unfortunately this isnot like a wide log it was not parsed wejust discovered now that we would liketo parse this IP address so here insteadof writing subquery we could just extendso kind of like add to select one morecolumn that's like a parsingthis text and adding extracted host andextracted IP address right so once weaddit we have like you know these twofields that you know we can use uhanother thing that's kind of likesometimes like useful in that's veryuseful and very common in observ�abilitysystem but kind of like hard to do inplain SQL is like you any kind of likeenrichment because ideally you wouldlike to join but you know first of allsometimes you don't want to store like aall IP addresses when you are justinterested about those IP addresses sohere you could have another syntax thatwill behind thescene download run one query downloadsome IP addresses and then like makesure we can join withthem and then once we join them we couldsee okay maybe let's look out of thesebots bots from which countries they areright and we look okay there are like abots that get rejected from severalcountries okaymaybe maybe that's them but maybe I'mgoing to use another operator which isgoing to get threat intelligence fromanother vendor okay are these botmsright and we can add this threadintelligence and another filtering isbot and now we can see okay this botactually just came from three countriesand uh have this language so this iskind of like a preview it's not standardyet but we open source it so you couldplay with it is kind of like fullyfunctional demo behind the scene usinglike click house uh graphana plugin andsome graphana uh both in the back endand front end and feel free to modifyand play with the syntax because usuallylanguages and syntaxes are not writtenby committee they're like written bypeople like you they play with and oncewe have some I would say nice syntax wecan add like what's would been whatcould go to standard and what could be acustom extension and the beauty of thislanguage that now you could like havethis custom extension that could be evencompany specific okay that's how I lookand you can do a lot of things in querytime and behind the scene this is alltranspired to sometimes like quite bigSQL that could be run on like manyengines that support it as well aspotentially it could also like addsupport to existing vendors doing thatobservabilitysoyeahso so now I believe like you know theone thing I have like controversialopinion there's like there needs to besome acceptance about standards butusually the standard are not justdefined by governing bodies in I wouldsay you know like governing body willcame with perfect spec and then we'llimplement is going to happen opentelemetry also happened because therewas like a very good collectionimplementation so uh that what we arelooking for like you Can we implementthis standard language well enough inlike a gateway manner that it compilesback on like a open source license uhthat we can start using it today uh I'mvery big fan of SQL i believe ifsomething is like a Lindy effect if youknow most of the technologies from 50years ago has died and proven not to bethat great but SQL survived 50 years andit thrived so it's probably onetechnology that is going to survive likeanother 50 years uh so far a lot ofGoogle contributions to open sourcecommunities like you know Borg Borgmaneventually became much bigger opensourcestandardsuh I believe like semantic I have like astrong opinion do the same thing whatSQL is doing uh and how to do it syntaxI'm I showcase something for you but I'mnot sure people are all buying how thefinal syntax could look like But feelfree to experiment feel free to hacksomething feel free to showcase becausethis is like something that we can allbenefit fromthe vision would be to have like onelanguage that gives visibility to allobservability data no matter if it'slike metrics traces or even like your inyour data lake that could be open foreverybody and it's kind of like aprotocol same as open telemetry thatpeople can build upon either they buildsome automation but it's also like Iwould say like some content I thinkmaybe AI will write queries for us notso sure people need still some formallanguage what do you mean by last monthlast calendar month last 30 days what doyou mean by that as well as uh we needto have like a same data sets likecollect even like you know outage occurshere here here were the queries that wefound the issue and this would be bothgreat for training and evaluation aswell as when the AI will do somethingfor us still need like a language thatwe could like seeokay actually AI did this query and islike optimized for human re readabilityand supervising if that's the thing thatwe wanted or if that the conclusion isright as well as this pipes gives now wecan give to a fit AI with all of thepartial results and also like ask AIokay what's the problem with this queryoh that place where you try to enrichactually it was not the right thing youshould do somethingdifferent so thank you that'sit i think we have like a time for onequestion and then we could and then Icould take rest offlinethanks uh this is greatuh I love the practical approach whereyou just propose a solution to theproblem uh my question is about theexamples of uh complex SQL queries forthe rate operator and the top endoperator does your proposal solve theseissues or is it also complicated yeah Iwould say I don't want to say okay nowin SQL we have to have like a rate orcolor rate operator but certainly weneed to have like some rate operator soit's like oneliner right I'm rather liketaking the approach I don't want to sayhow exactly this rate works becausethere's like many way to implement ratesso this is like I would love someexperimentation from communitythanks awesomeactually another question yeahI guess it would be possible to uh toimplement like a translation layer fromthis observability query language tolike Splunk query language ESQL etcright yeah I think that's how we shouldstart like you know it's very hard toask every vendor every database toimplement you know language that's kindof like unproven but writing like a youknow transpiling is really fast so wecould write those transpilers startusing it uh maybe some people if youwrite many integration will use thisopen source as form of like integratingand yes some vendors will eventuallyadopt it also like as the nativelanguage right and there's like a reasonwhy to adopt native because you couldimplement some efficiency gains but sofar a lot of those examples are likeperfectly transpired to pretty muchalmost every solution cool thanksi uh thank you for the prototype i thinkthat's really great to see that you youtried to like implement it uh wonderingabout the kind of storage and how yourexperience was in in doing that and Iwas also wanting to ask if there's somelike yak definition or something of thatuh specification like what was yourexperience like actually implementing itso uh I would say uh the specificationis like a different tag there's like youknow uh there was like previous talkabout it there's like some governingcommittee which got like arepresentatives from like a NetflixApple Google so this kind of like theytry to okay make something that we arenot inventing something just for onething uh the prototype we just didourselves is kind of like okay there'sno prototype seems like a cool ideanobody has it let's doit is not that easy but it was kind oflike a one thing challenging that for uswas that you know whether we definegrammar and try to understand every partof SQL which it's quite hard a lot oflike SQL got some extensions or can wedo something that you know okay weunderstand some part of SQL it's kind oflike partially parsed but you could alsolike use original version of SQLelsewhere and we don't have tounderstand and it will still be beworking uh with us and this is like Iwould say partial parsing there is likesome peg grammarss there's like someactive research probably the most activeis like duck db about this partialparsing grammarss so there is like somerecent research that but it was not thathard this prototype was like you knowit's like a month work it's not like youknow super huge effort to get startedthankYeah I think I will take less questionshere so thank you for your time[Applause]2025-04-15 22:03:08.423838�ike pretty uhnice schema maybe some SDKs some toolingthat can help us like validating thedashboards deploying the dashboardseasily and so on so before uh we divedeep into into it Let's just make surethat everybody's aware like so the ideaof a dashboard as code is just to likeyou can describe your dashboard using acode language usually YAML JSON orwhatever or even other language likeGolang Qling as we are mentioning andthen you can store this dashboarddefinition in a your source code likeGitHub like Gitlab and so on and you canhave like some DevOps practices uh ontop of it like linting unit stacktesting preview um and you can even haveeven have some nice transformations onthe dashboards before you deploy to theend solution you know like you you theidea is just to make this as um cloudnative friendly aspossible that said what is the kind oftooling that we need to create indashboard as code using perswell uh Antoanu was mentioning now uhthe persist project is offering um likedifferent ways to creating dashboardsokay you can creating through the UI asyou are doing other solutions if you'rebrave enough and you remember all thespecification for dashboard you cancraft your own yaml or JSONs but Ibelieve that nobody wants to do thathere right and the other options is youcan use the two flavors of SDKs we areoffering like the Golang SDKs or theQangSDKs we have pros and cons on each ineach one okay and in the Golang side youleverage all the power of the Golang wehave like you can leverage the ecosystemyou can leverage like the unit testingeverything that we have in Golang on theQ lang you you don't have all thesecapabilities but you can leverage somenative um native features that we havein the purses with the pers schema andso on you can see like the pros and conson this slide so you just need to chooseyour your SDK and move forward uh we aremaybe in the future we are willing tosupport new SDKs but the new languagebut it's mostly Golang and Q what wehave been using so far so contributionsare prettywelcome so okay I use I pick my SDK Icrafted my dashboard and now I needsomehow transform this dashboard thatwas written in Golink for example to themanifest that pers is is reading andunderstanding with that you can usepersiuh which is a command line interfacethat allows you interact with withpersist server and generating plugins uhtrans building dashboards to thestandard definition and many otherthings with these two tools you can withthose two tools you can choose you canhave some some development life cyclelooking like this so you have five foursta five stages you are crafting usingthe the the dashboard creating thedashboards using the SDK you choose andthen you can use process CLI DAC likeDAC DAC is a short term for dashboard ascode so you can use persi dute this willgenerate the yaml or the JSON you canjust choose uh pers will understand bothand you can use persi link to validateif this definition the dashboarddefinition that you just generated is avalid uh schema it's a valid definitionit comes with a default uh rules okayyou can choose uh offline validationthis will use a set of uh structuralchecks on the definition or you canchoose a online version like and you canprovide a pers server and we will checkif the plugins that you are using in thedashboard is available on this uhinstance that you are trying um and manyother things but there are other cornercases for linting as well like in yourcompany eventually you might have umspecific rules saying that everydashboard in the company must be nameduse it snaking case okay or maybe anytime series panel should be a tablelegend like stuff like that um persisupports custom rules where you can useuh JSON path with common languageexpression and you can run some someassertions over the schema so this youallows you extend and customize the lintand the validation of the dashboard withyour own rules well after that you haveDAC preview this will basically deploythe dashboard into a persist server butthis is not just a dashboard it's adashboard that we used to name ephemeraldashboards it's a� dashboard with a timeto leave attach it this is very usefulbecause while you are creatingdashboards you are sharing thedashboards with your peers and you won'tgetting feedback on that after yougetting the the feedback you can justyou don't need to care about deletingthe dashboard from the instance anymorebecause it will be automatically deletedby some background routine and the timeto leave of this ephemeral dashboard isis completely uh customizable okay andonce you finish you have pet cli applyand then you applying the finaldashboard to the money fast okay so thisis something that we can using CI rightthis is pretty extensible it's a CLI andwith that in mind we are using GitHubactions in the whole project and manypeople in the industry is using GitHubactions so we are offering as part ofthe project a GitHub action that umapplies all these life cycle that wejust saw okay you can run the actionsthe steps isolated if you want one byone but we also have this DAC uh the docuh workflow that does everything we justsaw and it knows how to understand ifit's a PR we create ephemeral dashboardit's build and master we create thedashboard itself okay so it's demo timeuh we are not brave enough to do livedemo we recorded but it's not going tobe boring we are going to talk while weare watching the demo i hope youenjoy so how many minutes yeah we have afew minuteslet me try to maybeone time and a halflike should be fine so this is a plainnew um project okay you can see that wedon't have anything and we're juststarting creating a Golang project go togo get to get some process dependenceand we are creating some some dashboardlike a very simple dashboard like we arenot showing all the panels because we wehave um we don't have much time okay soas you can see we are using flag it'sjust a standard library for flags andthis is nice because if you'releveraging Golang you can use uh flagson your dashboards and you can make iteven more customizable based on externalparameters that you are providing duringthe buildokay just move it a little bitfaster yeah so first here we reallystart with a basic dashboard empty we wejust provide some lines of code to proto provide the name the project where wewant to host this dashboard and that'sbasically it so really the demo we areshowing you today it's using the GolongSDK so there is also the Q SDK uh thatexistsum for the moment from the return thefeedbacks we had so Gong is more popularso that's why this demo is about it butif you are curious the QR code we shareduh earlier so it was a bit fast but youwill be able to to recover it later uhit is due to the demo project for forthis demo and there is both Q and godashboard so you can have a look uh atboth SDKs exactly so this is the bareminimum for a dashboard okay so youcreating the dashboard you need to setthe project like every person's prodashboard needs to be uh related to aproject and you we are adding the datasource and then duration right afterthat we run like C cli build and we havethe YAML definition like as Anton wassaid it looks like Kubernetes it'sKubernetes behind the scenes okay likeall thescheme so what is nice here it's likeum we linked but the nice thing is Iwant to be ableto to use the same workflow as much as Ican like I want to building thedashboards locally test everythingbefore I even open the PR so you canjust creating maybe some run some dockercontainer from perses and you can buildin the dashboard using volumes and thiswill be reflected in in your localinstance as you can see so we'rechanging things and so onyeah like we are now adding some panelsyou can see we can add the data sourcereference the data source from the panelthat we created in the dashboardpreviously you can use time series uhand the table panel and once we run pedCLI duck build it just um update thelocal the localinstance so it's pretty nice it's okayit's time to open up your and share thisfancy dashboard with my peers right sogit commit g push asusual and then if we go togithub we can just creating thePR it's pretty simple workflow we canjust Oops notso we just t�he GitHub action will justuh trigger in a few minutes um movingforward a littlebit yeah adding the build and we see allthe workflow building linking uh andcreating the preview like now if we goto the if we go to the PR we have acomment from the GitHub action with thelink of the dashboard in pers theephemeral dashboardif you click and you're going to see theexactly same dashboard that we wereseeing on our local instance and this isvery useful because you can see the nameof the dashboard it's basically the yourdashboard name plus the PR number youare openyeah and basically here with thisexample it's a brand new dashboard butit's a next PR uh coming when we youwill do changes to an existing dashboardwith this preview you will be able quiteeasily to open the current dashboard andthe preview of the new version on twoseparate tabs and to really be able touh let's say uh visualize both versionsand validate yourself that everything isfine so you and also the the PR revieweryeah like like now we are just adding anew panel with time series okay uh whatis nice here like and you can leveragewhat we have in Golang like I want tomake sure that all the time series panelis going to be using the same legendstyle like table style with min max andaverage for example you can just extractthis into a golen funk a golen functionand reuse all over the place you know soyou reduce the amount of coy duplicationyou have in your dashboards you ensurethat you have more standards of yourdashboards because you can just use thesame standard over and over againyeah and that's really an importantaspect of let's say managing dashboardsas code and not creating them throughthrough the UI really this let's sayfactorization and import uh part that itallows uh so you can imagine that Idon't know an application that wants tohave a consolidated dashboard witheverything on it including its specificapplication metrics to its technicalmetrics for example for pods or even themaybe the CFKA cluster it would be usingthis is the kind of thing where youcould import libraries provided bydifferent teams different middlewareteams expert on their own domain and uhinstead of creating or let's say forkinguh and losing the capability of uh theseexperts to to update and to keep to tocontinue having uh right metrics usageetc you can just import their librariesso the day they provide a new versionwith improvements you just update thedependency and that's it yeah and thenice thing about this code reuse it'slike consistent matter you know if youmake sure that all your dashboards inyour company looks as much as it can besimilar during on calls during stuffslike that you don't have this likecognitive load translate oh why thistime series here like maybe would beuseful if I have like the the average onthis on this legend or if we kind we cantry to use some sorting of the samecolor scholarly schema to represent whatis aor what is not a horse and so on soreusing these into functions in yourcode language it make it's easier tobuilding standard dashboards so we justopened the PR with the new change rightuh and you can see that we we have yetanother preview because you're doing achange so it's yet another ephemeraldashboard but we have also somethingcool um so yeah the preview with the newpanel as we cansee we have two data we have twoephemeral dashboards one was older thantheother and and what is really nice hereis like since we are changing anexisting dashboard besides the previewversion you have like how the dashboardis looking like you also have all thediffs between the what you're changingon the code you knowbecause of course with Git you have thediff of the base code creating the thedashboard but here it generates a difffor the final payload the final YAML orJSON payload that will be effectivelypushed to the server so we can reallyit's a an again um an additional checkthat you can do to verify that yourchange is fine cool so this was the demouh and we are just getting startedyep so what's next so basically now thatPers is part of the CNCF and that wehave uh also the new plug-in sy�stemcoming in very soon now we really wantto empower users to create uh dashboardsand to share them same for pluginsthrough a marketplace so this is reallysomething that we want to deliver soonso that you you would be able to gothere and to find both the let's say theofficial dashboards and official pluginsthat the persist organization mayprovide but also contributions from theopen source community so yeah so couldbe about plugins dashboards and we canalso imagine let's say dashboard as codepieces like maybe just a specific set ofpanels etc could also be some kind oflibraries that could be part of thismarketplace yeah and exactly and we havethe community dashboards basically whatwe're trying to do is we don't want tostarting from nothing you have likeplenty of graphana dashboards all overthe places especially for CNCF projectsand we're trying to migrating thosedashboards to persist dashboards uh andthen we have the community dashboards wealready have some like well-knowndashboards like node export or promeusoverview we have some work in progressfor for some Kubernetes dashboards andfew things that we are doing in thiswork is as Anton is mentioning right nowwe just don't want to make the dashboardreusable but the panel like eventuallyyou are building your dashboard and youwould like to leverage only one specificpanel from the community dashboard notthe whole dashboard we are making thispossible those dashboards are beingbuilt using Golang so you have thepublic panels you can go get versuscommunity dashboard and only importingone single panel from the whole uhecosystem you know so this is what wehave for you today folks we really hopeuh hope that you enjoyed and I think wemight have some time for question maybeNo questions this might be good themicro is over here so feel free ifanyone hasone yes i have one question so do yousupport exemplars from from it use sothe one which are available in graphanafor example or do you plan to supportthem yeah so it's not supported yet it'suh definitely in the plan yeah but yeahit's not supported yet yeah okay thankyou you gouh yeah I've got a question uh is thereany builtin accessibility features intothe dashboardyou mean like access control uh noaccessibility so like screen readersdescriptionsthat's a good question that's a goodquestion i never check our accessibilityscoreyeah we we will make sure that wemeasure this and put in like as part ofour work because it's really importantto make like accessible like thanks forraising this that's cool cheerscan we convert a UI to to code when adashboard is being created in the UI canthat be exported as code yeah that's agood question yeah you you can exportlike you will not ex at the moment wedon't have option to export the SDK or Qlang but we can export the JSON manifestof the dashboard or the YAML and Iremember that we are also talking aboutexported the CRD because we alsooffering uh we are also offering a persoperator that you can manage everythingusing Kubernetes CRD so we will be allowyou export to the CRD too thanks and bythe way about the export um so you mayhave heard about the D-0erouh during during the D-Zero keynoteyesterday uh so on their side typicallythey are interested in the Promeus datamodel and they are able they theysupport uh the pro the pers format soyou can export your um pers dashboardand import it in D0 and this is part ofthis open specification uh I was talkingabout earlier so we the goal is reallywith with time more and more we we wouldlike observability solution tools etc toyeah be able to support the same kind offeatureall right uh great great talk uh thanksfor it and I just have a question forpeople running like simpler setups whereyou're just for example running dockeruh could you provision these dashboardsand just import the files without somekind of discovery or any like or theoperator for example just uh provisionfiles that as provisioning dashboards asfiles directly to process yeah this isexactly what we did on the demo it'slike quite fast but we just run acontainer with using docker compose andm m m m m m m m m m m m m m m m m m m mm m m m m m m m m m m m m m m m m m m mmounting the volume where the dashboardJSON was placed and dash the processwill automatically reload uh from timeto time okay like in Kubernetes we weare also offering a helm chart and wehave a sidec car so you can use like youcan mount volumes uh you can have configmaps with the with the dashboards andwe're going to mounting the volumes gotit thanks you're welcomeuh I have two questions uh one of themis about governance we had the trauma ofKibana Graphana changing licenses it'snice to see two different uh at least uhvendors to say so so uh in terms ofgovernance uh if we invest this will itbe a uh what what do you plan uh togovern this uh project not just CNCFwise but uh in general and the secondone we are currently limited with theprovided panels if we want to add a newpanel we need to be a contributor Iguess yeah like two different questionthe first one it's a CNCF project soit's under CNCF umbrella it's not backedby one company we have Amados we havecoral logics we have headhat working onthat uh as any other CNCF project weneed to get in an agreement and makesure this is not only benefiting onlyone company you know uh the second onelike go ahead which one the second oneyeah so about the the panel types soyeah so today we have something likelike 12 different visualization optionsand with the new release coming in andthe new plug-in architecture this isreally something that now we will enableexternal contributors to provide morepanels so yeah so basically if todaysomething is missing someone has todevelop it there is no magicjust to add something on that like um wehave what the notion of core componentsthey are maintained by the core team ofpurses like Prometheus uh and tempo uhdata source they are maintained by thecore uh and we are still discussing howwe're going to handling like communityplugins but not maintained by the coreteam because you know like for exampleyou have we have people asking for clickdata source my SQL data source We don'thave all these specializations under thepersist team and it's hard to usmaintaining all those components that weare not using daily basis but we westill believe that is important for theecosystem all those plugins so we arestill under discussion there's openissue for that like ideas feedbacks aresuper welcome in this moment thanks Johnuh so I wanted to know if it would beplanned to actually allow people toextend the UI and I'm not just talkingabout you know custom dashboards orcomponents but like completely rewritingor extending the actual UI frameworkthat's a good question we were talkingabout that yet because yeah today the sothe plugins we were talking aboutearlier it's really about the differenttypes of data sources panels and thevariables so variables it's basicallythe dropdowns but so it's kind of linkedto the data sources but yeah so wealready had some discussion to haveother types of plugins and uh so to beable for example to add a brand new tabon the navbar to add somethingcompletely different so this is beingdiscussed I cannot promise anything butthis is definitely something we arelooking at to have more types of pluginshere okay thank youhello uh thank you for uh for that uh II like the dashboard ascode initiativedo you see a way to use the dashboardascode feature in our existing uhexisting UI which is graphanawell so as I told you for example -0 isable to import pers dashboard so maybein the future graphana will be able todo the same so in this case you coulduse pers to uh pers dashboard as code tobuild pers dashboards and in the endimport graphana so currently that's notpossible but that's part of the ambitionto push the other tools to converge toour data model i think it's matter oflike customers asking for vendorssupporting this like as it was with opentelemetry years ago you know likeeverybody pressuring support supportthen we start getting this thingokay so I have to go to their booth andand ask loudly yeah yeah yeah pleasethank you thank youenjoy thank youfolks thank You2025-04-15 22:03:09.068957 ��S�I#��]A7h70Olo5Uzkso hello hello everyone um thank you forbeing here uh it's a crowded room yeahwe have some lights here but yeah we cansee everyone so um thank you for havingus i hope you enjoy uh and let's getstarted so Antoine yeah sure so yeahfirst quick question have anyone heardabout Persyet and who knows a bit what it's aboutokay not that quite a few people yeahyeah cool so before we getting startedwho we are uh my name is Nicholas Takashuh I'm observability tech leads atCorora Logix and I'm Promeus operatorand process maintainer and I like topushing code on other projects liketennos open telemetry promeus so mostlyCNCF related observability ecosystemyeah and I'm Antoan Tu i'm a seniorsoftware engineer at Amadius working onan observability platform and a processmaintainer for quite some years now coollet's goyes so first uh few words about pers ingeneral then we will dive into thespecific to topic of today so Pers it'san observability visualization tool andyou have an overview of it of its let'ssay maini the dashboard UI here on thescreen so it's an Apache 2 licensedproject and recently so summer last yearit made it to the CNCF uh so as asandboxproject and for these reasons uh wereally see pers so besides let's say thereally the it part uh as a standaloneapplication that you can deploy wereally see it also as an an initiativeto converge towards an openspecification and standard specificationfor dashboards in general general so andhopefully to increase interoperinteroperability between variousobservabilitytools so pers has various main focusesthat we will go through quickly so firstextensibility so Pers comes with aplug-inarchitecture that basically has beenaround for quite some time but we have abrand new architecture coming a V2 uh byuh yeah the coming in the next weeks sothe objective was to release that beforeCubeCon but yeah sorry we failed on thatuh but anyway with yeah so with thisplug-in architecture really we we uh weallow anyone any team any anything toprovide to create their own plugins toextend the capabilities of pers so couldbe in terms of v visualization optionsand also in terms of datasources it's also uh pers is alsoembeddable in the sense that it providesa set of npm packages that you can embedin your own UI so this is for examplethe case of the open shift console byredat which is uh one of the maincontributor of persus so they embed persvisualizations into their own UI it'squite seamless and lightweight uh andyeah so for this reason there is quitemade for developers yeah and what isnice here is that you can MD otherpanels in backstage for example becauseit's two peris is just an npm packageand you can mis pack um panels in otherplaces like backstage or any otherdeveloper portal or solution that youwould like yeahindeed and one last point so Pers isGitHubs friendly so since the beginningwe provide a clean data model reallycube native so it's actually quite easyto just output the dashboard in theformat of CRS that you can uh push toKubernetes thanks to the pers operatorthat we provide uh and so the data modelis quite clean and familiar forKubernetes users um and we also provideSDKs in two languages to code dashboardsso this will be the specific topic oftoday again uh so we will dive more intothat soon and so yeah SDKs plus also apowerful CLI that you can use in thescope of dashboard as code but also yeahto automate many kind oftasks good thank you Antoine so yeahprobably most of you already heard aboutwhat is dashboard as code like we aredoing we have been doing a lot of thingsas code right infrastructure as codepolicy as code why not dashboard as codeas well but like witha good solution you know l��daries that are wellunderstood like this contrived examplethat I have on the left side of thescreen here it's totally fine for youright if you've got a simple webapplication and a database put on asingle VM where you're not worried aboutRTO RPO uh High availability concernsthings of that nature just simplydeploying a few tools and using amonitoring strategy is more thansufficient but as soon as you start tomodernize your applications and movethem to the cloud things get a lot morehairy and complicated right you may endup with a set of Legacy applicationsthat are still around on Virtualmachines obviously this is a kubernetesconference and stuff so you've got a lotof appdev that's happening in modernways that are moving uh that aredeploying containers on kubernetes orpotentially other P Services uh thatexist out there right andif you're not careful and you're notconsidering what is ouroperational monitoring strategy going tolook like you end up with technologysprawl as it pertains to monitoringtools right and that's kind of the statethat we ended up in um if you look atand and and open up the hood of what arethe technology tools that we had atWalter's cluer for monitoring ourdifferent systems it's like the who'swho of the solution showc out there okaywe had a lot of different tools lookingat a lot of different parts of our appsand infrastructure the dbas had theirown sets of tools operations wasresponsible for setting up uh the APMand infrastructure monitoring the devsdid their own thing with logs and itjust kind of went haphazardly everywhereright so but all the while we're stillmodernizing our applications andintroducing a lot of coupling anddependencies whether it be with Pservices St services or even intra ininter service dependencies that we arebuilding ourselves okay and so whensomething fairly low in the stack startsto have problems as you can imaginehaving all of these hodg podge of toolslooking at different parts of yourinfrastructure not connected whatsoevermakes it very difficult to resolveincidents right you end up as theservice owner that's at the edge whereyour customers are uh trying to use ithaving problems you're the first one toreceive the call to get on the incidentBridge you're going to do yourinvestigation up to a certain point andrealize you know what this outbound callto the service that Team B owns isclearly the one having problems let'sget Team B on the incident Bridge sowhat are your thumbs and wait okaythey're on the incident Bridge get themup to speed they do the same thing it'sa rinse and repeat until you go down thechain and realize you may have alow-level P or St service problem sothis was this was the state that we werein where we had meantime to resolutionthat was not satisfying our customers orour business leadership so where we werein our journey is in 2020 we set out tosay we need to solvemttr we need to make it better we alsorecognized we had just too darn manytools and we weren't doing devops rightso our journey was about how do weactually work together to reduce thescope of tooling that we have and toshift left bringing monitoring andobservability into the appdev teamsworking together with operations tomaximize the value of what we'redelivering and not just thinking of itas an afterthought once we've deployedit into production then we're going todeal with our monitoring strategy andultimately we wanted to take a datapipelining approach to this we reallywanted to make sure that all of ourTelemetry data we had the opportunity toprocess it enrich it filter it sample itright and ultimately correlate theinformation on the backendso set the vision April 2020 fastforward we're investigating the state ofthe industry right and we saw that openTelemetry and observability was a bigthing that was coming about openTelemetry at this point was you knowlargely concluding its merge with opencensus and open tracing into a singleopen Telemetry standard and while openTelemetry um had a lot of itsspecifications still in the uhexperimental phase there were some partsof the standard particularly tracing�that was fairly stable at that point intime and the the libraries that had beenbuilt up particularly the net Librarysince we're net shop were also gettinggetting to stability very quickly and sowhile we you know looked at the state wealso looked at the vendor space to seehow were things there the vendor spacewas also pointing to observability isthe future and open Telemetry is whatmany of the vendors were saying we areplanning on moving too so they gave us alot of confidence internal to ourEnterprise that this was the strategythat we needed to adopt because we wouldprefer an open standard rather thanvendor lock in that's one of the keyprinciples that we had furthermore wehad to align our stakeholders internallyon the the principle of observabilitymoving away from a state of monitoringrealizing and stuff that we had to lookat the whole to be able to quickly getto the answers of why is my systemworking the way that it is rather thanwhat are the different failure pointsand let's start to play whack-a-mole inproduction to figure out how do weresolve these incidents right andthankfully having that conversation withthe operations leader stakeholder thearchitecture stakeholders and myengineering leadership stakeholdersweren't that controversial we all kindof aligned that this this made sense inlight of where we were trying to go umand then ultimately though we had torecognize that open Telemetry providesgreat standards and it provides greatsdks to emit traces logs and standard uhtraces logs and met right but it doesn'tprocess that data so we still need someplace to house that data and process itinternally so we're still dependent ontools a little bit here rightso the current tool that we were usingor the primary tools that we were usingdid not support open Telemetry so we hadto go back to the drawing board and wesaid okay what were the tools that wewanted to evaluate here and my Dev teamswent off and did some investigations sowe spent a couple of months evaluating abunch of different tools we came up withone tool that we really liked andstarted pressing forward and trying tobring it to production and that's whenour operations team said hey what areyou guys doing I thought we were tryingto do devops here where were we involvedin this journey right how do you makehow do you know that this tool thatyou've selected actually is going towork for us for operations is it goingto work for the dbas right it was fairfeedback right so we came together andformed a cross functional team that wasmade up of operations developmentperformance engineering qualityengineering architecture and this thisteam we called the backend boff teamgreat great fan of Great British boffnot very controversial this thing highlycontroversial and stuff but they workedreally great together as a team to firststart by identifying an inventory of thecapabilities that we required okay andas a cross functional team that was thethat was the starting point and we knewjust based on our experience from Devthat their coming into this with biasright so the first thing that we had todo was flush out and make sure that eachteam was holding each other accountableand asking the tough questions to makesure that these capabilities weren'tbiased in favor of any one particularvendor once we identifi this inventorywe then set off on a multimon expeditionto go through and assess from every oneof these different cross functionalmembers perspectives as objectively aspossible how well do these vendors meetthese me these requirements right andthen ultimately settled on what we feltwas the right set of tools that weneeded to move forward with right therewas really one that kind of stood out tous but this may sound a whole lot likeI'm suggesting that we ended up with onetool to rule them all and in thedarkness bind them and that was in factnot the state that we were going forbecause again while I'm operating at mybusiness unit level we are alsoconsidering this for the Enterpriseright we have multiple dimensions oforganizational leadership involved inthis backend bake off and had torecogni�ze that across Walter'scluer there are some business units likethe one that I was a part of that had avery healthy budget for very expensivemonitoring and observation uhobservability tools but other groupsinside of the Enterprise do not have thesame budget and luxury so their needsmay also be much smaller much moreconstrained architect Ure right and sotheir tooling on the back end May differfrom what we are needing but we justneed to agree on a fewer set of toolsthan what we had previously and that'sthe principle that we actually alignedon the other thing too as well is isthat in software there's no such thingas a silver bullet right so that'sanother reason why we were like there'snot going to be a single vendor tool outthere that's going to meet all of ourneeds because quite honestly when youconsider things like compliance checkingsecurity um just the developersperspective operations perspectiveperformance Engineers perspective rightif you're a tool that hits all of thosecheckboxes you're generally a generalisttool right otherwise you're a specialisttool right and we need the combinationof both of these tools to maximize theeffectiveness but underpinning all ofthis and what I didn't actually show wasthat at the top of this Matrix here it'sreally hard to see on the screen the topthree requirements were are the RS thetools that we're evaluating compatiblewith open Telemetry these werenon-negotiables they had to support itin order for us to moveforward so we're now about a year in toour journey right and we've selected atool we've selected open Telemetry wheredo we go from here um and we had to stepback before we just launched intoexecution and say how do we make surethat what we are doing as a platformteam is different than how we've behavedin the past how do we make sure that wedrive the enthusiasm and drive thesuccessful adoption of open Telemetryand observability as a cultural strategyinto our appdev teams okay and we identyou know we had to figure out how do wemeasure success in that regard but oneof the key things that we had torecognize was we had to minimize theimpact to the appdev teams it's notgoing to be zero there is going to besome imp impact right we're we'rechanging their culture and their way ofworking after all right but how do weminimize it as much as possible and weto the the way to engineer this in theapp code uh involved us forming and andpulling together a pattern that wedidn't come to find out until after wehad executed was already kind ofwell-named and established which was theembedded expert model that's shown downhere got a link and stuff to the slideshere from Pete hugson's um blog post onthis right but the basic gist is is thatthe platform team doesn't know how tobuild test or run each of the differentapplications right we're not going to beable to swoop in here and stuff and pulloff uh Magic and stuff and makeeverything work right we need embeddedexperts to work with us to help usunderstand how do we do this buildinghow do we test how do we make sure thatyour application is running and has theright Telemetry once we instrument itcorrectly but in return that platformthat app developer would learn aboutopen Telemetry about observability abouthow the plumbing is actually workingbehind the scenes so that they can goback to the product team and share thatknowledge with them and drive up theenthusiasm the interest and theawareness as that code makes its way toproduction so let's start with loggingfrom a logging perspective um we did youknow we went out we did some analysisand we found that most of the applic inour ecosystem we're using a custom SDKthat was already developed because thiscustom SDK um provided some easy umconnectors to get data like the user GDthe context about the virtual datacenter that they're in the database IDsthat they are connecting to ultimatelyas a customer right and this is stuffthat was ubiquitous across our appdevteams right so with the use of thecustom logging sdks we had a centralpoint that we could start making ourmodificationsand uh for teams that weren't using that�custom SDK most of them were alreadyusing more modern appdev Frameworks thathad config configurable syncs right soagain easy points to be able to injectuh open to lemetry logging and stuffinto our application stack so what weended up doing was as the platform teamjust simply making the code uh changesin those sdks to First inject openTelemetry and add it as supplemental towhere the logs were already goingupgrade the SDK versions in all of theapps that were in scope roll it out tothe different environments we did thisfirst because as a means of convincingthe appdev teams we needed to make surethat they were confident that there wasparody when their logs are no longergoing to their ex their existingdestination and are going to the newdestination that they can see they havelike for like hey I see all of my logsin the new destination that were therepreviously I have trustthat open Telemetry is working and thatyour new observability strategy is onits path to success once we had thatcritical mass and everybody saying yepwe're good we're not missing anythingfrom what you've instrumented in theapplication we went back into the SDKand ripped out the old Plumbing stoppingthe logs flowing from to the existingsystem and now only flowing through openTelemetry to our new back end so that'show we dealt with the logging rollouttracestraces surprised us when we introducedit to the appdev teams because most ofthe experience that uh or the responsewe had from appdev were why do I needtraces logging is sufficient for meright andso when we looked at okay we need tofigure out how do we educate people onthe value of traces right um we went inin a very similar fashion to what we didwith the logging uh process where weactually injected in the app teams codethe autoinstrumentation uh sdks and stuff fromopen Telemetry right so uh a lot of ourapplication stack was WCF Windows WCF uhServices which is soap Based Services umbut we also had some web API Solutionsas well that played nicely withdifferent projects in the open Telemetryuh world so once we in Ed that anddeployed it into production for ourfirst application 3 days later uh aftergoing to production there was aproduction incident not caused by openTelemetry okay but in there while therest of the teams got on the incidentbridge and followed the existingpatterns that they used to try tounderstand what was going on I went inand started looking at the traces thatwe were already capturing and within amatter of minutes I knew exactly whatservices were causing problem s and inthis particular instance it was just acouple of hosts that had gone bad so Iwas able to indicate to Ops hey thesecouple of hopes are bad right and youneed to take them out of the pool theyremoved them from the pool service wasrestored and less than 15 minutes rightwhich at that time before we in you knowhad fully OBS uh introducedobservability was remarkably fastremarkably fast for incident resolutionfor us right and so we used that that asan opportunity to evangelize to theappdev this is the value of traces thisis how we used Trace data toquickly understand where the problem wasso that we could get to incidentresolution faster that was the momentthe light bulb moment when everybodystarted to see now I get it now I getwhy traces are so valuable in additionto logs that you're already emittingright so there's definitely an educationaspect to this and you knowunfortunately sometimes uh using a a badevent right in this particular case wasa great way to illustrate how valuableadopting observability and openTelemetry as a solution can be to yourorganization so the education was thebigger part and the bigger effort on ourpart to get adopted in our organizationthan actually the technology changesitself But ultimatelybesides using this as an Exemplar foreducating your teams where possible usethe auto instrumentation SD case that isabsolutely what we found worked in mostof ourcircumstances when it came tometrics at the time that we were doingthis roll out in theadoption we didn't have many appdevteams who were emitting custom metricsok�ay and oh let me make sure no I stillgot time uh we didn't have teams andstuff who were doing a lot of custommetrics and we didn't actually uh umhave a lot of stability in the otel sdksfor metrics at the time so we ultimatelydecided we're going to defer theimplementation of metrics until we get alittle bit more both stabilization intheet uh otel sdks as well as when theappdev teams require it today our sdksthat we're wrapping open Telemetry dosupport metrics um and our next stage isto uh rip and replace and stuff theagents that we're using forinfrastructure metrics collection withthose that are coming from openTelemetry so that we can be a 100% openTelemetry collector um based solutionnow speaking of the open Telemetrycollector this was the last piece of thepuzzle here right to connect the databeing emitted from the applications tothe backends right so um we focused onmaking sure that we didn't put theburden of figuring out open Telemetrycollector deployment options on to theproduct teams we needed to centralizethat decision- making because we wantedthose product teams to focus on learningto use the new tools learningobservability and uh using the sdks toenrich their application uh Telemetryright so in this particular casecentralizing it with the the ownershipwith the platform team allowed us tomake sure that we had consistency in ourprocessing pipelines that are filteringand sampling rules as well as we endedup deploying The Collector service atthe edge Edge which gave us newopportunities to be able to collect logsfrom the client's uh machines that arewhere we deploy uh and installable onclient computers rather than previouslywhere all of those logs stayed local totheir computers and it was extremelypainful uh between support and customerswhenever we needed to figure out whatwas going on they'd have to collectthose logs ship them over send it on toan engineer and stuff now it's allavailable through open Telemetry in ourbackends sowe've been rolling it out what are someof the Lessons Learned hereright your move our app teams weremoving down a highway at 90 kilometersan hour and we're coming in telling themhey we need to change this front lefttire but don't worry keep driving rightwe're going to change your tire andstuff while it's driving so as you canimagine that's a very very verydifficult task to do right socoordination with the appdev teams evenwhile using the embedded expert model isreally key because otherwise you end upin situations where you have gotconflicts or in situations where theauto instrumentation SDK doesn't covereverything right you don't want to endup in a situation where the team thinksthat you've deployed uh observabilityinto production but in fact you havecoverage gaps because this new area offunctionality did not get covered whenyou were working on itsecond what's your culture of ownershipdo your teams whenever they accept apull request say I have to understand itand then once I accept it I'm going toactually fully support it or is it aculture where I'll bring it in but youwere the one who made the change so ifsomething goes wrong I'm still going tocome back to you because you're internalto the company there are some thingsthat you have to consider here dependingon your culture of ownership you have tofigure out how do you properly enablethose teams to do their own selftroubleshooting once you implement theiror once you actually uh instrument theirapplication with otel because otherwiseyou're going to end up in a spin cycleof supporting those appdev teams okay umwe were we had a lot of logginglibraries um you're going to miss someif you end up having a lot like we didright uh we covered like 85 90% of thelogging libraries when we switched overto open Telemetry but we didn't actuallynotice that we had missed somethinguntil we saw all of that volume drop andmove to our new back end and we couldsee what did we miss so plan to inspectand adapt through that and that you'renot going to get 100% upfrontum last but not least when you if youdeploy open Telemetry collector at theedge right um for OTL ove�rhttps and you have similar securityposters like us where we have to deployweb application firewall your webapplication firewall might treat it likea denial of service attack so uh you'regoing to have to tweak and tune your webapplication firewall and your openTelemetry collector and log uhconfigurations and stuff to create abalance and symbiosis there when itcomes to scaling to theEnterprise right um so we've beenworking largely inside of the TAA NorthAmerican Business unit right now but howhave I been taking this and stuff to therest of Walter's clu okay and the firststep is doing something like what I'mdoing right now on the stage with youall right it is talking about ourexperiences sharing the benefits thatwe've received as well as the warts thatwe that came about from it I've beentrying to find multiple differentchannels internally within Walter'sclure whether it be technologyconferences communities of practicecenters of excellence teams channelswhatever right if you're trying to driveadoption in the Enterprise getting theword out is your first step as you workthrough those channels naturally you'regoing to end up finding some enthusiastswho want to engage you on a deeper morericher conversation because they may betriing it themselves leverage thatNetwork because because ultimately yourcommunity of enthusiasts and stuff willhelp to bridge you as an expert toothers inside of your Enterprise who maynot be in your direct sphere ofinfluence right that's actually kind ofthe stage that we're at right now whereI'm starting to get connected to othersin our larger Enterprise who areinterested in open Telemetry adoptionright and who have been asking questionsin forums that quite frankly I didn'teven knew existed inside of ourEnterprise right right so um this iskind of a gradual way of how youactually roll it out form a community ofpractice this all probably makes youknow to some of youall are like yeahthis this seems like fairlystraightforward right um but as thepointy hair manager and stuff standingon stage right now I will definitely sayone of the last ways that you canabsolutely Drive um impact right forobserve uh open Telemetry right is ifyou're in a position where you startedas a Dev and now you're in a position tocall the on what your team spends theirtime doing absolutely support them intheir efforts to make contributions toopen Telemetry because that's in factwhat I do if my teams come up and sayhey something in open telemetry's got abug or there's a feature that's missingmy first reaction is not go cut a ticketand wait for them to fix it it's get inthere and do it yourself right becausethat's what we do as a community andstuff we make it better for ourselvesand for everybody else right soencourage your teams to do thatum ultimately I got a couple like I'mright at timehere there are a lot of takeaways hereum these slides are uploaded to sked andI want to be respectful of everybody'stime here and I'm going to skip to thefinal slide here to ask if there are anyquestions but before we get to questionsfeedback is a gift if you liked thissession rate me if you didn't like thissession tell me what you didn't likeabout it it'll help me improve next timeand if you love this session I wouldlove to have a beer with you and stufflater today so thankyou where's themicrophone oh there itis anybody has any questionsallright ah there's a microphone back therewhat kind of backends are you using whatkind of backends are we using um so someopen- Source some vendor backends thatuh I'm not allowed to say on stage butthey have quite a bit of a barkgoodquestion all right any other questionsif so go grab the microphone there wego uh so maybe I missed it at thebeginning but I would be interested inthe scope of your uhproject was it one back end 50 back endsum how manyum stakeholders did you include herewhat what are we talking about okaybecause we have the same problem but theproblem is in our end to endend chain wehave like 20 companies and each companyhas three backends mhm and people aredream dreaming of a uh open Telemetryapproach where everybody hasstandardized tooling for the three mainareas but it's from a political point ofview seemingly impossible so I wouldlike to know how big your scope wasgreatquestion is my microphone still on yeahI think so okayso our businessunit um is somewhere between 40 to 50appdev teams right so we're talking onthe scale of 600 to 700 developersgloballyright in our entire ecosystem as far asproducts are concerned um we're it webuild multiple product Suites that servethe United States market and theCanadian Market aswell and ofthem from a engineering perspective alot of the decisionsinvolved uh close collaboration with mychief technology officer it involved hispeer over in our operations team uhwhich ultimately affected I'm going tosay probably on the scale of 75 to 100different operations Engineersglobally uh and then from a uhbroader operational leadershipperspective uh when I think about fromthe from the other Global impact we hadArchitects at the global level uh thatare responsible for because we've we'vetrying to figure out how to simplifythis right um we have specialistArchitects that for the entirety ofWalters clure globally right which isabout nine what we're 19,000 employeesglobally and and we have about I thinklast numbers that I had were 9,000Engineers globally we have a set ofArchitects that are trying to look atentire Enterprise portfolio thoseArchitects were involved in this processof engaging with us during theevaluation of open Telemetry and theevaluation of the backend tools and thestrategy to simplify it I would say thatfrom a holistic WK Enterpriseperspective right not everybody is usinglike I said the same tools that we endedup adopting all right um but where we'vewhere we've been trying to channel ourefforts down to is to get to essentiallya catalog of five or less differentbackends that we primarily support likewhat is your large Enterpriseapplication solution for observabilitywhat's your midsize application solutionfor observability what's your tinyapplication solution for observabilityright to meet the different budgetaryrequirements and to meet the differentum support requirements and stuff asyou're scaling yourapplication um up to meet largercustomer demand butthe um the the one kind of central Lynchpin and stuff again to all of this isthe recognition that open Telemetry asthat open standard that was actually notthe not the controversial piece of thispuzzle right everybody agreed that thestandard supports since it supportsmultiple languages right um thatintroducing that and trying tostandardize on that and stuff made senseat an Enterprise level the challengecame into when to adopt becauseultimately if you're still having tomake changes to your applications you'restill having to do testing things ofthat nature that's engineering effortright that takes time uh and energy awayfrom the business value proposition ofyour app devs right and so that's whyI'm right I'm very eager about what'shappening in the ebpf space and autoinstrumentation outside of theapplications perspective because that'swhat I want my team to start working onnext as we look more broader to theEnterprise because if we're able todrive the at least the tracing side andsome of the me some of the metrics anduh logging right just through sidecarbased injection of open telemetry andthe appdev teams don't have to worryabout any instrumentation on their ownespecially for legacy applications allthe better but if you're a newapplication you're doing Green Fieldwork our message right up front is useopen Telemetry no question about it hopethat answers your question yeah anyotherquestions uh hi so I have a questionabout Sim monitoring and uh compliancemonitoring because we are in a situationjust going say if you could take morequestions outside if you're free don'tbe S yeah yeah actually sorry I'm beingtold no problem time is out and stuff sowe'll uh I'll uh end up meeting overhere and stuff and uh we'll takequestions outside so the next presentercan actually get set up Mo so I'll meetyouall down there thank you2025-04-15 22:03:09.554514 vv��e�J#��AJqRXqk-1CLothank you all for joining me for thesession today uh where we're going to betalking about how the company I work forWalterclure uh managed to adopt open Telemetryand move to observability uh where wepreviously were a very vendor heavy uhand vendor Laden company my name isChris Welden I'm the director ofplatform engineering at Walter clureworking uh in tax and accounting NorthAmerica uh my accent and you'll hear methrow out howdy and ya'all a lot isbecause I'm from Texas so um now amoment from my sponsor so for those ofyou who haven't heard of Walter's cluerdon't worry you're not alone we're oneof the biggest companies and stuff thatvery few people have ever heard ofthat's the inside joke that we've got atleast because we make software for abunch of different verticals uh thatsupport Health Care Professionalsaccountants lawyers Finance people etcetc right so uh we're a globallydistributed company headquartered in theNetherlands um I'm very proud to workfor this companyum and uh a lot of other uh uh bodiesand stuff think that we're great too uhequip in uh particular has rated Usnumber one for gender diversity uh inthe Netherlands and stuff for threeyears running so I love to talk about mycompany but that's not what way we'rehere if you're interested in learningmore come talk to me afterwards um oneof the things that I am absolutelypassionate about is mentorship uh and inparticular working with uhunderrepresented groups right so Ihighly encourage any of yall if you'rein a position to Mentor somebody uhplease use that QR code and stuff tosign up for uh opportunities to Mentorthose in underrepresented groups becauseit's super critical to the success ofour community and ourindustry so let's talk about the agendatoday I'm going to set thestage what was life like at Walter clurewith uh as it pertains to monitoring andobservability and how did we ultimatelycome to settle on adopting openTelemetry uh within our company and howdid we actually roll it out well withthe lessons learned the burns and thescars that we collected and stuff alongthe way and then ultimately how we endedup moving open Telemetry out of TAANorth America and into the Walters clureEnterprise as a whole right so there's afew assumptions that I have here in thisparticular talk the first assumptionhere is that I imagine many of y'all areprobably here because you know aboutopen Telemetry you know what it providesuh and you may or may not be havingtroubles getting it adopted in yourorganization and getting it to scaleright so I'm not going to cover any ofthe technology specifics behind openTelemetry furthermore any of you who arein this talk and stuff hoping to see alot of uh code examples beyond what I'vegot on the screen right here I hate todisappoint you but uh by the time thatyou leave this session my hope is isthat you're going to feel uh that you'vegained some ready to use knowledge onhow to actually bring open Telemetry insuccessfully and roll it out amongstyour teams and that you feel enabled andempowered that you can do it too becauseif we were able to within our Enterpriseso should you um and if you particularlylove this partic this talk at the veryend maybe you might be feel compelled tobuy me a beerso our journey starts in 2020 whereWalter's clu uh in tax and accounting inNorth America we were uh primarily amonitoring driven shop okay so we hadnot learned and understood the value andthe uh the import of observability as awhole right so um we had different toolsthat were deployed you know in ourinfrastructure that I'm going to talkabout in just a moment right but um inlarge part right monitoring is animportant critical predecessor toobservability when you think about itright because monitoring is all aboutcollecting different uh Telemetry datapoints and stuff and setting up alertsand looking at those to tell whatproblems actually are and monitoringwhen you have a reallyconstrained problem set right withfailure boun��d you'd be right we blew it up weblew it up a lot quite often actuallywith a show of hands how many peoplehave overloaded some database with toomuchdata yeah yeah I thought so we've allbeen there um but the thing is there'snever a simple answer of how much datais too much it really just alwaysdepends now these graphs uh that I'mshowinghere uh show our server running at Peakalbe it somewhatunsuccessfully the the top graph showsstep increases of about 50 million inhead uh time series and the bottom graphshows gaps where we expect a kind ofsmooth line at around the 1 millionsamples per second Mark these dips aresigns that the server is overloaded anddroppingdata Prometheus happily was chuggingalong in this degraded state butrecording rules and scrapes were beingskipped in order to shedload these orange bars I've added to thetop graph um these aren't metricexplosions and this isn't really a highcardinality problem as it's oftendescribed these step increases are fromdeployment and not somebody adding newmetrics to the system so to explainthese increases I'll touch on how Etsydoesdeployments at Etsy we do auto scalingand blue green deployment on kubernetesthis results in super fast deploys butit also generates a considerable amountof churn and this churn is a bigchallenge when operating Prometheus atthis scale flipping between the activeand dark sides causes 50 million metricsto stop reporting and another 50 millionto come online and there's anoverlapping period where both sides arereporting at the same time so if we didfour deploys in an hour we would need amuch biggerserver however after reviewing theoptions we realized that Google onlyoffered one final upgrade in This Serverfamily it struck us that we might not beable to find a larger server next timeand even if we did there were signs thatPrometheus was already in a state thatprevented us from fully taking advantageof the hardware we need to we needed toact swiftly because the week of BlackFriday and sabber Monday was approachingit's etsy's busiest season of theyear so to buy us some time we opted todeploy the biggest server available butmore importantly we committed to takinga closerlook in this middle column we're showinguhat an intermediate State at this timethe the issues seemed to be Memoryrelated so we went from a mega M to anultr instance type which had twice theamount of ram going from 2 to 4terabytes this gave us much needed bebreathing room but clarified thatthrowing more Hardware at the problemwasn't helping we were bottlenecked onsomething and none of our otherenvironments were experiencing issuesand we also couldn't easily produce thisin a test environment it's not easy toget approval for a 4 TB server so wedecided to deploy a third replica thatallowed us to do side by-side benchmarksinproduction this enabled us to quicklyiterate through several permutationswithout jeopardizing service and thisthird column here that I'm showing givesyou a sneak peek into the specs weeventually landed on but spoiler alertuh it ran much better at 1 terab of ramthan it did at fourbut even with the faster server we couldnot still we could still not determinewhy the Prometheus remote right functionwas laggingbehind so that's about when we looked tografana labs for help and that's aboutwhen Brian gotinvolved yeah so um so as Chris saysthey they were running uh open sourcePrometheus uh in in their infrastructureand pushing the data to our cloudservice and um and this is this is whatcame in on on the report on the theticket uh that I was asked to look at umso the way remote right works whenPrometheus gets hold of the data it itpushes it off uh to a central serviceand it's supposed to do that inside likeone second there's not supposed to beany lag at all um what I was uh calledto look at the the lag was was gettingup to um five six minutes and and 10minutes uh and also it it would last forhours I mean it's it's it's possiblethat something could glitch and then itshould recover quickly uh so the othersymptom was was that it would stay highfor several hours um so this was reallybad u�m there was another uh thing we'lltalk about uh but just to mention uhdesired shards um this is a uh uh Shardis a by Shard we mean mean splitting upthe work so we can do things in paralleland um uh Prometheus runs a very verysimple model of what how many shards itmight need to send if there's a backlogit it divides the backlog by uh how manyshards it would take to send the data inone minute to clear the wholebacklog and um that's a very simplemodel it it works if you can sendinfinitely fast and if the far end canreceive infinitely fast so who's got aninfinitely fastservice No Hands okay yeah so there's aproblem so yeah I just want to mentionuh this number uh desired shards is iskind of a fantasy number and and uh it'sprobably something we should have guardrails on in Prometheus because becausepeople read this and and uh think thatthey should configure for that but butnotreally and that did get us in trouble umbut we eventually we eventuallyunderstood that at least in oursituation letting Prometheus dynamicallyadjust the number of remote right shardswasunnecessary because resharding is anexpensive operation we opted to hardcodethe Min and Max shards to the same valuethe right hand table I'm showing hereare the settings we landed on but I'llexplain some of the theory behindthem we settled on 100 shards as a niceround number far less than the number ofthat was suggested by the desired shardsmetric we recommend picking A Shardcount relative to the CPU count and iflatency to the destination is high youcan consider doubling it but also don'tforget to increase the batch and buffersizes to match and calculating yourmaximum theoretical throughput under aworst case latency scenario is ashortcut to figuring out what thesesettings should be for example let's sayyou have 30 million series scraped every15 seconds as in R setup this comes outto 2 million samples per second so witha worst case roundtrip latency of 500milliseconds each Shard can send twobatches per second so with 20,000samples per batch this results in atheoretical maximum throughput of 4million samples persecond so if we double the worst caselatency from 500 milliseconds to 1second we reduce the maximum throughputin half from 4 to 2 million this meansmeans we have plenty of headro at 500milliseconds of latency but we can'treally tolerate more than 1 second oflatency so after ironing out thesesettings for remote right we were stillpuzzled by what was causing the remoterightlag and like many good stories uh we hadseveral side quests and one of them wastriggered when we looked at this chartof network bandwidth we occasionally sawtimeouts and that kind of made it seemlike there was an underlying Networkissue that was plaguing us here and wenoticed bandwidth peaked at 360 Megs persecond even during playback periods whenit should have pushed much much higherit turns out that Google actuallydocuments one flow in their systemcannot exceed this limit when egressingfrom a compute instance to a destinationoutside of the VPC I hopped on one ofthe servers and ran netstat a few timesto discover that Prometheus was indeedsending data over a single socket whichkind of caught us by surprise we endedup disablinghttp2 um as one of the settings so thatit was send data over multipleconnections so we wouldn't have to worryabout this limit anymore but there wasmore to the story because the next daywe got alerts about Prometheus remoteright falling behindagain all right so getting back to ourmain quest I'll turn it back to Brian todiscuss the right ahead log in moredetail yeah you always always blame thenetwork rightyeah it's never thenetwork um so uh yeah so there a littlebit more detail what's going on insidePrometheus um and and focusing on theparallelism so Etsy had uh like 10,000exporters on the left um those are uhrun in in the code of Prometheus it'swritten in go each one of those gets ago routine they can if they want to theycan all run in parallel there's atremendous amount of parallelism thereon the sending side we we had a thousandshards we we reduced it to 100 but evenso tremen�dous amount of parallelism onthe right hand side so why have I drawnthe picture this way there's a bit inthemiddleumPrometheus like many databases uh whenit gets hold of some data it commits itto disk immediately this is a a a designcalled a write ahead log um and it's avery good design for resiliency if ifsomething happens if the process has torestart it can recover its state byreading the right ahead log theW um when doing remote right we want toum balance the receiving and sendingthere might be some glitch insending uh the rate the data's coming inat you you could get gigabytes of dataqueued up and we don't want that inmemory so the it's already on diskbecause we wrote it to the wow so thisis um this is the design of Prometheusthat we use the wow as a q for remoteright and it's it's a really nice designbecause we don't have to implementanother queue um unfortunately for Etsywith this massive massiveserver there's exactly one right aheadlog and the whole thing is bottleneckedon this oneoperation um how did we figure thisout well you might think uh somethinglike profile you have a performanceproblem profile it um but profilinggives you an average over the wholeprogram and um so actually this isdifferent tool this is the go uhexecution tracingView and uh I'm sure it's too small foreveryone to see but but the what I wantyou to kind of get from this is the umthe yellow line at the top is solidthat's the one uh reading thewow um and uh all the other lines thekind of swim Lanes they're the ones uhthat write the data out on remote rightand um and they spend most of their timewith with whites space there there's alittle bit of the the greens and the thepinks uh when they get some data theyMarshall it compress it send it but theyspend most of the time waiting to getdata so this was kind of The Smoking GunFor What was really wrong once we'd beenon all the sidequests so with with Black Friday theBlack Friday deadline looming we didn'thave time to rewrite the queuingmechanism we just needed to make thissingle thread go faster we so we askedourselves what could we change to thepressure on the bald neck withoutrewriting the code there are really onlya couple ways of doing that one do lesswork three two make the CPU faster andthree just have fewerinterrupts we ended up trying all threeso for the first item another expensivemechanism in Prometheus is calledcompaction we disabled it but before weget into why I'll invite Brian back todiscuss how the compactor Works inPrometheus yeah the umso switching Tac uh this is this is abit about how tsdb Works um when thewhen the data is uh put into the timeseries database in memory um we don'twant to just build up and build up andbuild up in in data in memory so everytwo hours we run a process calledcompaction um and and build a a two-hourblock of data ondisk um after another two hours we buildanother two-hour blockand after another two hours we buildanother two-hourblock and um and this is great becauseit's no longer we don't have all thatdata in memory uh it does um pull thedata back if you run queries uh but itonly reads the data that you actuallyneed for that query that and that's donethrough memory maapIOum so uh normally Prometheus then uhstarts to build these two our blocksinto bigger blocksso um and it's done in threes this isthis is default settings for Prometheusso uh so we take three 2hour blocks andmake a 6-hourblock we take three 6our blocks and makean 18h hour block and it carries on likethat 54h hour blocks 162 hour blocks umbasically however long your uh storageretention is configured for uh bydefault Prometheus will will continuetrying to make bigger and bigger blocksum which are more efficient and that'sthe the the normalidea but we realized that that theseperiodic compactions were uh incrediblyintensive operations and again it's allrunning in oneprocess um so uh inside etsy's verylarge Prometheus these compactions weretaking up a tremendous amount ofresourcesuh oh yeah a little side note it's alittle confusing there's two operationscalled compaction with this one is headcom�paction where we go from memory todisk and um uh the the other compactionwhere we're doing historic blocks butbut that's um sorry aboutthat Chris umso using smaller 1hour blocks wascritical to us for severalreasons andfirst the the default of 2-hour blocksmeant that Prometheus could take areally long time to restart because ithad to rebuild the last two hours ofdata in fact we had a maximum startupduration of 1 hour before kuberneteswould give up so if we restarted theserver between 4: and 6:00 p.m. when thewall was at its peak it would never comeback online unless we intervened bydeleting the wall and losing all thathistorical data that hadn't yet beencompacted into a blockthe second reason and perhaps most moreimportant is the compactor wouldconstantly fail a blue green deploymentduring Peak would produce so much churnin our metrics that even a 2-hour blockwould exceed an internal index sizelimit this meant that Prometheus wouldget stuck with an infinitely growingwall that could not be compacted intosmaller blocks again this meant thatsomeone needed to intervene by deletingthat wall and giving up historical dataso we cut the duration of our initialblocks in half to ensure they neverexceeded this size of 64gigs so here is a diagram of Prometheusblock layouts on top we have thePrometheus default and on the bottom isthe block layout we used at at Etsy wemade two changes the setting the minimumand maximum durations both to one to onehour setting the Min let the seriesclear out of the head block faster andsetting the max disables the largercompactionsinstead of turning the wall intotwo-hour blocks and then turning thosetwo-hour blocks into progressivelylarger blocks we configured promethus towrite out a continuous stream of 1hourblocks without ever compacting them butbefore you try this at home you shouldunderstand the trade-offs that wemade many smaller blocks means queriesover historical periods are moreexpensive we got away with this strategybecause our Prometheus server onlyretains 9 days worth of data or 216 hourblocks where the default is 15days as well using remote right means wecan offload expensive queries from theserver and set aggressive query limitsso as to not overload the server we setquery Max samples at 50million and my my last point aboutcompaction is to reinforce that for usthere was just too much turn in in themetrics from our blue to greendeployments and auto scaling for thedefaults to just work well so aftergetting this distraction of thecompactor out of the way we had one lastsidequest we realized that 4 terab of ramwas pretty Overkill now that we had uhcompaction disabled so we no longerneeded that memory optimized machinetype and would benefit from the fasterclock speeds that a compute optimizedmachine type would offer so we made alist of things to try and made sideby-side comparisons in production over afew days we ended up making some drasticchangeshere I'm highlighting two config changesI'll let Brian talk about these in aminute but the surprising thing was thatPrometheus got Faster by constrainingresources in almost half we went from128 cores down to 50 cores and from 4terabytes down to one terab ofram also note is how we applied thoseconstraints instead of kubernetesdefining the upper bounds we set themore restrictive limits at the Goruntimelevel because it turned out that garbagecollection was a baldl neck across thewhole program each dip you see in thisthroughput graph every two minutes or sowas was linked to a garbage collectioninterval that briefly starved the remoteright so I'll kick it back to Brian tohelp us unpack how garbage collectionwas interfering withthroughput it's always memorymanagement I don't know uh I've done anumber of talks uh you can you can findthem on YouTube uh just you know searchfor my name uh I've done a number oftalks about about memory management howto speed up goal programs uh by tweakingthe uh memory management um so let'slet's uh how we doing for time yeahpretty good pretty good uh yeah let'sget into that um so I uh I love to drawthis Sawtooth diagram just ju�st to tryand help you understand um the uh theHeap is where all the big memory stufflives in a go program um and over timeit it goes like a like a saw too it itit builds up uh as you use memory anddiscard it and then the garbagecollector runs and it it that numberdrops again it builds up and it dropsand it builds up and it drops and thatwas the uh the previous picture thatthat Chris showed there's a there's adefault uh 2minute cycle where it whereit will run the uh garbage collectionand it's it's another very veryintensive operation that that drags downthe whole program when itruns um but there's two effects uh thethe other one is how much bigger uh theHeap will get before it garbage collectsand and in Prometheus we configure thatto75% um that is the go GC setting so umso in a steady state is doing thisSawtooth it's it's has a certain amountof memory that it really needs it climbsup to that plus 75% and drops down tothat amount and climbs up and drops downclimbs up and dropsdown um when we had thosecompaction I haven't finishedyet uh when we had those uh compactionoperations running uh every 6 hoursevery 18 hours and so on uh the heatwould get bigger and it would then growanother75% so so this is one thing uh I I don'thave fancy animations or AI or anythingbut I'm going to wave my hands around umso uh so we were we were reallysuffering from um uh the memory wouldgrow bigger because it needed morememory to do compaction and then itwould grow bigger again because of this75% Factor um so this setting of go Mlimit um is is uh basically right justwhat you need um it caps the memoryusage uh and it will run go it will rungarbage collection faster as youapproach thatnumber you do have to pick that numbervery very carefully if you go too low ifit really needs more memory than that itwill run garbage collection infinitelyfast and youwill besadum so uh and then another thing onceyou've set go m go M limit you can turnoff um the regular Cadence that was thetwo-minute thing that we were seeing umso go GC equals off and a go m in thiscase one terabyte because we um becausewe limited queries because we turned offcompaction we we did not expect anyoscillation in the memory uh so thisgave us two things it it gave us um amuch more constrained process it wasrunning in one tab instead of 4terabytes uh and the garbage collectionslowed down to like every 10 minutesinstead of every 2 minutes so the uh theremote right got a chance to catch up alot better so that's that's a bitcomplicated um like I say I've I've madesome other talks you you can researchthis but uh but that that was what wedid and it it worked reallywell um so uh turning away from memorymanagement um there were uh quite numberof changes um while we went through thisinvestigation uh things that wereimproved Upstream in Prometheus um so Ijust put a few of them uh up on thescreen there they they um improvingcaching improving uh efficiency lots ofof changes so you I just want to notethat this this was kind of a uh ourcustomer and trying to help them out butthen the the changes went Upstream inopen in open source Prometheus andbenefited the whole communityso very happy aboutthat so to wrap up our story we're happyto report that after tuning garbagecollection we finally alleviated enoughpressure on the bottleneck and we madeit through black fridy without anycrashes or leg so a big thank you toBrian for letting us nerd snipe him onthis and another big thank you to thecommunity building and maintainingPrometheus so if you want to operatePrometheus at a similar scale here's achecklist of less learned number oneautomate catching code changes that maycause metric explosions before they landin prod but sometimes changes slipthrough the cracks so it's a good ideato set scrape limits as number two howmany metrics Prometheus will accept isunbounded in the stock configuration andit's almost certainly going to crash ifyou try jamming too many metrics inPrometheus does a good job of settingdefaults most of the time but not havingan upper bound on this one is a foot gunwaiting tohappen three consider using smallerblock sizes and disabling uh compactionif your data retention goals allow it atfour tune the remote right for requiredthroughput using worst case latency as aguide and ignore that desired shardsmetric if it's giving you a sillysuggestion and instead Pick A Shardcount relative to the CPU cores yourserverhas five right size the server and be onthe lookout for bottlenecks causingdiminishing returns more isn't alwaysbetter and six don't be afraid of tuninggarbage collection as it may help butmake sure you've covered the basicsbefore going down that rabbit hole andas seven my final advice is to avoid thetoo big to fail moments by thinkingearly about how you'll eventually splitlarge clusters into smaller ones andlastly remember to have fun we certainlyappreciated this opportunity to nerd outand we plan to continue pushing thelimits of Prometheus as timepermits and with that I hope you'veenjoyed ourtalk we have a a few minutes forquestions so there's a microphoneCenter yeah go for it um hey thanksgreat talk thank you you said you haddata retention set to 9 days did youconsider lowering itwhy do you need nine days um the storageis actually pretty cheap so um it wasn'tafter we disabled compaction historicalqueries um um it didn't really impactthe performance or it wasn't really partof the bottleneck is the short answer Iguess okay cool thanks no problem thatuh you limit the qu query scrapes likequery data samples did you increase itafterwards did your developers like heywe need more uh samples because ourquery just doesn't work withoutthem um yeah that does happen so uh wetend to offload those expensive queriesto um uh to graphon Cloud where it'smore it's a distributed system so it canhandle the those heavy expensivequeries okaythanks did you did you consider gettingrid of meus as a normal way and justrunning as an agent since you were usingCloud yeah we are considering that umfor sure um I think the the main thingis that we run all of our recordingrules and alerts and we do a lot ofcontrol systems as well for autoscalingand a lot of those systems and we justlike keeping it simple of knowing howPrometheus will scale and and bereliable and making Auto scaling and allthose complicated operations dependenton an external vendor with an externalsystem it it just it's a lot of lot ofthings to reason about about how it's atrade-off yeah for sure so we we justlike to keep it simple um but we are atthat point where we're we have to takeum horizontal scaling a lot moreseriouslyso I think we have time for a couplemore yeah um so you mentioned you weresetting go Max procs to 50 despitehaving like 87 CPUs and you goingunderstand why that was providing betterperformance I I should probably takethat oneuh um so it's complicated is the is thefull answer but um one of the thingsthat happens in go is it uh it sets upum 25% of that number uh for for runningbackground garbagecollection and um so uh so byconstraining that number you uh youlower the amount of background garbagecollection and uh I actually have a talkat the next cubon in Japan oh anywaycubon Japan I'm giving a talk about Numauh which is another reason why you mightwant to do the why you might want tolower that number um yeah it's it's it'sa very complicated tuning mechanism butbut as a general rule um uh cutting goMax procs down to what you really needuh will make your program gofaster so you mentioned the sharting andyeah that can be complex but why not usemultiple pruse instances with differentscrapeconfigs um we do actually we we have 30UH 60 servers uh in total um and so thisis just the the largest tenant that justkind of grew organically this way um andthe alerts and uh dashboards they're allkind of in one tenant and so withouthaving to refactor all the alerts anddashboards that's essentially why thisone grew to such a certain large sizeand we we also operate um um some fairlylarge you know kind of Monolithicservices that uh kind of jam a lot intoone oneservice thank you well I think we'llcall it there thank you very much forcoming2025-04-15 22:03:10.065906 55��*�K#��AIWrd-pSojqghello everyone our talk is about pushingthe limits of Prometheus at Etsy todaywe'll share some of the lessons fromoperating one of the industry's largestPrometheus servers but before we getinto it let us introduceourselves my name is Chris Levoy and I'man observability engineer at Etsy I'mbased in waterl Canada and I've beenwith Etsy for the past four years Etsyis a global e-commerce Marketplace forunique and creative Goods it was foundedin 2005 and is headquartered in BrooklynNew York now I'll invite my co-speakerBrian hello uh my name is Brian boram Iam a distinguished engineer at grafanaLabs grafana is the leading open-sourcesoftware for visualizing operationaldata I had to read that um what I what Ido on my day job is is I work on the uhscaling the massively scalable storagethat we have for metrics logs and traceswe store trillions of metric points andpetabytes of logs I the reason I'm onthis stage is I'm a Prometheusmaintainer uh I've worked on that codefor about seven yearsnow uh so who knowsPrometheus okay like 60% or somethinglike that well good good um I just wantto talk about the uh architecture umbecause we'll we'll return to thispicture a number of timesuh so basically uh data data starts offon the left uh what we call exporters sothat could be data coming from the nodeor from containers or applicationspecific metrics anything where the datais coming from that we call that anexporter and we pull the data in that'sa process calledscraping we put it in the time seriesdatabase uh tsdb first of all in memoryand then on disk and then uh to get dataout there's the prom query language umand then usually visualization is doneby something on on the outside likemaybegrafanaum the key thing that's important forthis talk is everything in that box inthe middle is oneprocess uh that's what's really coolabout Prometheus it's really easy todeploy one process and you're you're offum but the way it scales is get a biggermachineso this talk is about the biggestPrometheus I eversaw so uh what is this giant Prometheusserver for well if you're looking forthat perfect gift to Mark a specialoccasion you can try searching onetsy.com the server we'll be discussingmonitors millions of metrics aboutsearches on Etsy it's been around forabout 8 years and it's just chalk fullof alerts and recordingrules but this server is just one of 30Prometheus Stacks that we operate atEtsy each stack is isolated aroundsystem boundaries in total we Peak ataround 600 million series or 5 millionsamples per second during busyperiods each each stack writes a copy ofthe data into a central location ingraphonCloud uh there's a pair of servers ineach stack providing High availabilityand this story is about how our largeststackyou know couldn't scale verticallyanymore so why use vertical scalinginstead of another approach well thehonest answer is that pushing theboundaries of what a single Prometheusserver could handle was a fun task umbut also the the tech culture at at Etsyhas been heavily influenced by the ideaof choosing boring technology and justkeeping it simple we should exhaust allpossibilities with the base architecturebefore taking more drastic measuresbecause things like sharding andFederation they they they introduce alot of complexity and so we set out todelay the inevitable as long as possibleand see how far we could push the limitsof a single Prometheusinstance and I'm sure many areanticipating this talk will be about howwe crashed our server by adding too muchdata an��try space for uh theinstrumentation that they want to use ifauto instrumentation offering is theyalways pick auto instrumentation becauseit's easier to integrate it's you don'tneed to instrument anything manually outafter all but uh sometimes um uhoperation problems might result fromusing auto instrumentation and um forexampleu once uh we used the autoinstrumentation forn net uh applicationsand uh uh the net auto instrumentationcreated the um spans and traces withextremely high cardality and a hugevolume and it overloaded our collectorsit overloaded uh also um uh the observability vendor where we have beeningesting to and uh we had to u tune theauto instrumentation so that it uh we donot uh produce this abundance of uhunnecessary telemetry signals frequentlybecause out instrumentation itinstruments also things that are notnecessarily helpful uh to identify theuh incident or like the root causes forthe incident in it might create a lot ofnoise and it might create a lot of uhadditional telemetry that uh uh nobodywill be using in the end So this is theproblem with this uh but uh currently weare exploring uh what else can go wrongSo for example already with net we facethis problem we have faced this problemalso with Java uh auto instrumentationUm so uh let's see how it will go but Iwould be very cautious usinguh so yeah but I mean the altinstrumentation right and of coursethere is a solution to this right soauto instrumentation should follow thesemantic convention which is a singlestable document it is yeah I mean noproblems at all with with semanticconventions I suppose I think they'reperfect aren't they James ion is how Igot introduced to hotel uh in a sense uhand it was really frustrating for usbecause the the promise of the semanticconvention was that there is oneconvention to rule them all and all ofthe telemetry that you have will followthis amazing standard convention andI've I've even done talks on how amazingthis uh convention can be if all of yourHCP metrics look exactly the same andwow all of your microservices are kindof just will live in harmony and justlike uh utopian kind of dream uh ofeverything beingstandardized The promise of the semanticinvention is so great but basically itbecause it is sowid it's just been slow to get into astable state essentiallyand that has been the most annoying andUh yeah frustrating thing because manyyears ago we were like damn thisconvention is going to be so great Butuh they keep changing it and it keepsbreaking all of our metrics and uh youknow then Prometheus turns things tounderscores and then it does turns itback to dots in the collector and wedon't Yeah All these kind of things Umand we would have just lovedto How long does it take to figure outwhat a HTTP status code should be in thesemantic convention how long indeed umwell I think yeah I think James is notalone There might be a tool for thisthough right of course I mean it hotelsucks only on semantic conventions andal instrumentation I'm I'm definitelysure that the collector is just perfectright yeah I think so And that's whypeople use open telemetry after all Uhor is it thoughthere are like so many times where uhbecause the collector is not stable yetOh no that uh you know I've beenbuilding a downstream distribution andsomeone stayed on it for a long time uhwithout updating and then boominterfaces internally have broken andyou have to make all these code changesto take the latest update uh or likefixed packages uh right because thingshave things are so in lock step on onthe upstreamum beta versions of the collectorso it's not perfect after all I thinkthe collector has problems too I guessBut uh I see a lot of people in thecrowd here and I think if really didtruly suck we'd have way less peoplehere So uh for the next 15 minutes or souh Gerasi and I would like to explainways in which also rocks Uh so I'mDaniel Dila I am a software engineer atDino Trace I've been working on opentelemetry for about 5 years now And withme is you should have seen here in OTrocks slide as well but I'm havingtrouble with the slides Imagine a� slideThis is a new presentation Hi my name isJulia I'm a software engineer at OligGarden and I'm here for the next 10 to15 minutes to talk you about ways inwhich hotel rocks Right Right Uh and weare not alone here We brought someguests with us uh to tell you all theways in which hotel rocksUh and before we talk about that weshould talk about what open telemetry isright of course open telemetry is aspecification And that's the mostimportant part You can't have all of theAPIs without a specification You can'thave all the SDKs without APIs You can'thave all the instrumentation withoutthose APIs and SDKs working togetherwhich we've done in I think 13 languagesUm maybe plus or minus one or two I'mnot 100% sure but um really they worktogether really well much better thanthey really have any right toSorry hotel rocks There you go There'sthe slideSo it is a lot Of course it is a lot ofthings but it has to be a lot of thingsbecause it's a big problem that it'strying to solve Uh and a lot of peopleworking on solving it and it is acomplete framework right so people canuse hotel for from to instrument theirapplications up to uh delivering thetelemetry data to a back end So it hasto be a lot of things Yeah And aconsistent messageNow but there are good things abouthotel that is not only um of course YeahSo auto instrumentation this istypically the easiest way to get startedwith open telemetry I think without itmost people would not get started withopen telemetry So let's hear from abouther experience with auto instrumentationaspect of using openc telemetry that Iwanted to talk about uh and it was theuh auto instrumentation becausegenerally and this is also what is nowdifferent in my experience in deliveryhere versus experience in Zandandabecause back then there have not beenany uh uh or like there were hardly anyum auto instrumentation capabilities butnow there are many for differentruntimes and uh uh in delivery hero itis natural for the engineers the centralobservability team is not uh imposingany anything on the engineering team sothey can use whatever instrumentationthey want We do recommend using opentelemetry but they can decide on theirown and frequently when they do decideand search in the open telemetry spacefor uh the instrumentation that theywant to use if auto instrumentationoffering exists they always pick autoinstrumentation because it's easier tointegrate It's uh you don't need toinstrument anything manually audio afterall Yeah So people don't haveinstrumentation then suddenly they addauto instrumentation they can observetheir systems right Yeah And it reallydoes a lot Uh and it obviously is a lotand does a lot and it's in manydifferent languages And as James pointedout earlier sometimes an HTTP statuscode is just an HTTP status code Itshould be the same everywhere And theway that we achieve that in opentelemetry is with the semantic inventionSo to sort of give a little bit moreabout his experience with the semanticinventions we have again James thoughtson how amazing this uh convention can beif all of your HTTP metrics look exactlythe same and wow all of yourmicroservices are going to just live inharmony and just like uh utopian kind ofdream uh of everything beingstandardized Now uh that's an amazingthing to strive to And when you're at abig company I'm sure that you have about27 and a half thousand different uhmetrics for displaying a HTTP uhresponse So somethinglike tellsyou have iton have basically don't do that Have thesame convention across every language AHTTP response is the same name doesn'tmatter what language you're using kindof thing So how long does it take tofigure out what a HTTP status codeshould be in the semantic convention butthe the reason so it sucks for thatreason that that it's taking so long toto stabilize and it is becoming stableThe HTTPuh semantic convention is now stablewhich is great There's still much morework to do in stabilizing the rest ofthe conventions Butum the flip side of it being so slowthere's a reason it's being slow It'sbecause they are being so careful andthey're being and they take theguarantee of stability When it is stablethey take it very very seriously RightSo once it's stable you know it's goingto be stable and they're not going tobreak it Yeah So perhaps not shiny happytelemetry but uh when things are stablethey are stable right when they'restable they're stable And when they'renot there are tools to deal with that aswellYeah Um what about the collectori think u the collector is the part thatis closest to me Um and I think thecollector is really really powerfulpowerful and we can achieve so much withthe collector but I'm biased right solet's hear from Alexander Magno Um howthe collector is working for themMinority power[Music][Applause][Music]Forever Now it's time to[Music]So I think you're right The collectordoes does rock But you know what i thinkwe're forgetting the biggest reason thatopen telemetry rocks and that is thecommunity It's all of you guys Uh Ican't even I have no idea how manypeople are in this room right now It'sabsolutely mind-blowing to me Uh pleaseraise your hand if you've ever like madea contribution to open telemetry Ifyou've ever opened an issue if you'veever joined any of the meetings uh ifyou you know have ever made anycontribution to open telemetry I mean wereally could not do this without thecommunity And I don't think I'm the onlyone that feels that No And I'm not alsothe only the other one person Um I thinkwe have testimonial from someone else uhtelling us about our community Uh butI've really enjoyed working within thecommunity Uh this is like my first setUh this community is my first communityof being really actively a part of interms of open source contributions andit's in part because I've really enjoyedworking with the people there There's somany bright people um who are very kindand very good to work with across all ofthe different SIGs that I've interactedwith Um you know they're always ready tohelp They're always ready to guideThey're also super smart Like the firstmeeting I got into at the collector SIGI felt super impostor and I stillget it all the time But I remember thatfirst meeting like it was yesterday Iwas listening to um someone talk aboutthe inner workings of the collector andmy mind just exploded uh because I couldtell that he had a mental map of everyline of code within that thing and howit operated and I just I could notcompute right Um and so there are reallyreally smart great people to work withum that have improved me personally justby getting to kind of like interact withthem Um hear what they're thinking hearthe the thinking process they're goingthrough I've become a better engineer orI feel like I've become a betterengineer just through these interactionswithin the community Um and it's a superhelpful community People are passionateThey're they're excited about what thefuture holds Um and mo most of the timeeveryone is very cordial and kind eventhough they may be very direct sometimesI think I may know who he's talkingaboutSo really thank you from myself Anderasiand all of the hotel project Thank youto everybody in this room for being apart of the community We really couldnot do it without you Yeah I know I'm abetter engineer by you know based on myexperience with open telemetryuh in the community All right So umum so we have so we recorded thoseinterviews with James with Elena withMagno with Adriel Um and there's so muchto hear from them Uh we were not able tojust get like everything that we wantedhere So we added a few extra slides Soscan the QR code download the slides andyou have an extended version of thispresentation and a couple of othervideos from people with a spicier takesthan what we've seen here So do scan theQR code and and watch it later Um thankyou very much for joining Uh I hope youhad fun Uh and it's it's us showing youthat hotel is not perfect It's not butuh it still rocks It still rocksSo I think we have time for questions Ifpeople have questions or if you have a atale to share about your experienceswith hotel good and bad also feel freetoAll right Thank you again folks[Applause]2025-04-15 22:03:10.711619 ��=�L#��1AQzStkLbA7Qkhello everyone and welcome to hotelsucks My name is Judas Pashon I'm asoftware engineer at Olig Garden and forthe next 10 to 15 minutes I'm going toshow you the different ways in whichhotel sucks I'm not going to be the onetelling you all of that Um I broughtsome some guests here I recorded some ofthem Some of them are actually here inthe audience as well So if you could umraise your hands and uh I know it's notonly you So um but anyway also with metoday is Dan Dyla Yeah So I'm DanielDala Um I'm a software engineer atDinatrace I've been working on opentelemetry for about 5 years including uhgovernance committee and maintainer ofhoteljs Uh and we are here to talk aboutsome ways in which hotel sucks Uh but inorder to do that we have to explain alittle bit about what is OTEL becauseit's a big part of it So OTEL is aspecification obviously but also it's anAPI and it's not just one API It's 13APIs I think and 13 SDKs andinstrumentation libraries and all ofthose languages and auto instrumentationto set up all of those instrumentationlibraries and oh it is so many thingsBut you don't have to take my word forit Uh we have a testimonial from uhAdriel Perkins Yeah And if everythingworks we're going to see and hear AdrielhereI think open telemetry is already a lotUh it is a lot There is no doubt abouthow much stuff there is within opentelemetry Um you know one of the thingsthat blows blows my mind is that there'sa open telemetry transformation languageand that has so many things going onwith and it's super powerful Uh but likeyou know it is a lot just to understandthat one piece right let alone thecollector that it's used within letalone the semantic conventions behindthe collector and behind thetransformationsIt is a lot It is a lot Um so um inaddition to Adriel we have uhtestimonials from uh three other peoplebut um I'll let them introducethemselves Um if the Wi-Fi worksIdon't It's loading up So we have a backbackup plan So if it doesn't workwe I move to to the backup Yeah backupsprobably principal engineer at aconsulting company Oh yeah Oh god Oh noNext one That next one rightandrew Perkins principal engineer at aconsulting company called the Atro overin the United States I'm also thecurrent project lead for CI/CD uh theCI/CD special interest group within opentelemetry working alongside DotanHorvitz who's a CMCF ambassador also uma co-owner of the GitHub and GitLabreceiver components within the opentelemetry collector as wellYeah So that'sAdriel Um we also Hello I'm James HiJames Let me start again Let me startSure Start again James Starting againHello Hello I'm James Moyesus Uh I workfor Atlassian and software engineer Iwork in the observability team for aboutthe last four years it's been And I workon open telemetry every day I work onall lots of different parts of opentelemetryYeah that's JamesMy name is and I'm principal engineer atdelivery heroes uh developer platform Uhit's a organizational unit that providesuh several platform capabilities such ascloud infrastructure continuous uhdelivery uh and uh continuousintegration and uh obviously also uhobservability and uh resilienceengineering and many other cool thingsYepCool So of course open telemetry is alot of things but where do most peoplestart with it uh they start with autoinstrumentation Auto instrumentation isof course meant to be the easiest way tostart with open telemetry Uh butunfortunately there are some uh we'llcall them rough edges uh and ways inwhich auto instrumentation kind of suckssometimesrecommend using open telemetry but theycan decide on their own and frequentlywhen they do decide and search in theopen teleme��nd of Spark techniques overtime to make that work and you'll kindof see like we went we went through oneother type of uh uh compute uh story inbetween going from bare metal toKubernetes and then ultimately now beingon Kubernetes with u advanceduler likeunicorn also super excited to learnabout all the developments in Q and thatthose have kind of come along in thelast couple of weeks and then kind ofkey takeaways for like making Spark workwell on Kubernetes because you knowtraditionally Kubernetes is not builtfor um let's a data inensiveworkloads so we'll talk a little bitabout best practice and pitfalls andantiatterns to kind of stay awayfrom in terms of objective objectives inthe migration is like we have a lot ofdata engineers and um data analysts whoare using spark and they needinteractive access to large data sets uhwe really kind of want to separatestorage and compute like kind of thatdisagregation over the last six or moreyears right being able to put your datain cloud storage or some uh storagesystem that doesn't have your computecoupled to it gives you a lot offlexibility then we also will talk aboutcluster autoscaling and thencontainerization and dependencymanagement and then ultimately likebetter resource utilization is a verykey part of this to allow us to kind ofreduce cost and then automating as muchof this as possibleSo with that I'll hand it over to one ofour best minds Nha to tell you moreabout this journeyhey everyone my name is Nha Singla uhI'm a software engineer at Apple umthanks for sticking with us uh thisevening i appreciate it i know it's beena long day so uh I'm going to talk aboutuh some of the special requirements uhwith interactive spark so interactivespark plays a cru crucial role in likebig data processing data science andmachine learning workflows for faster uhdata exploration debugging and MLexperiments and uh Jupyter notebook isone of the popular tool used in industryfor interactive spark experiments sohere I'm going to talk about some of theunique requirements uh requirementswhich interactive spark brings in um sointeractive spark needs to provide uhlow latency responses for querying andprocessing large data sets this helpsdata scientists to uh and analysts toiterate faster on their experiment anduh without waiting for them forlongunning bad jobsport launch latency matters ininteractive cases where you don't wantyour users to uh wait longer for runningtheirexperiments in a multi-tenantenvironment where users are sharingresources uh real-time monitoring aswell as introspection of cube eventsport logging spark UI messaging intolive notebooks robust connectivity uhbecomes extra important uh to trackusage and to uh prevent bottlenecksbetweenresources and um there's a special therecould be special requirement fornon-premptive workloads like you youneed to provide a Jupiter environment uhwhich you don't want user that to bepreemptable uh which is different thanthe normal batch uh processing systemuh we need fair resource allocation hereuh which ensures that uh no single usermonopolizes the wholecluster minimum resource guaranteeensures that each user is guaranteed tohave some resources for starting theirinteractive sparkexperiment so let's let's start with ourbasic baseline spark on bare metal uhwhich is a traditional deployment modelright now so when we started as you cansee in this diagram we have a classicsetup with a name node a resourcemanager and worker nodes running on thephysical um physical servers uh in thisconfiguration uh there's novirtualization layer everything runs onthe physical hardware compute resourcesare preallocated by admins and they haveto manually manage like nodeprovisioning scaling and fall toleranceand uh um it used to run for us on u uhvery popular uh resource manager YAN umwhich handles jobuling resourceallocation and monitoring uh todistribute spark workloads uh acrossavailable worker nodes uh it candynamically scale executors based on theworkload demands and support shufflewith external shuffle serviceyeah uh a bit more about YAN so YAN is acore component of t�he Hadoop ecosystemit allows multiple applications likewith Spark to share cluster resourcesefficiently uh it is important to notethat YAN was not designed to orchestrateany containerized application butspecifically designed for orchestratingapplications within Hadoop uh it wouldprovide a fine grain access control uhover CPU memory it manages the clusterresources using a Q-based architectureto ensure fair and efficient resourceallocation admin can set resource limitsand priorities for different cues tobalance their workloaddemands uh there's a great advantage ofuh external shuffle service which YANbrings in um it it helps uh to maintainefficient data exchange betweenexecutors ensuring better fall toleranceand reducing memory overhead it canhandle large scale clusters and can bescaled horizontally to accommodateincreasing resource demands amongmultiple users andapplications let's talk about thechallenges so we face significantchallenges with this approach uh whilerunning in spark jobs on bare metal uhone there was no dynamic scaling ofcluster resources admin had to provisionresources for peak demands and storageand computes were tightly coupledlimiting flexibility each cluster hadaccess to the HDFS storage connected tothat cluster only and for that reasonsadmin had to write distributed copingjobs for moving data from one cluster toanother cluster and which needed SRsupports you know to keep the dataconsistent uh between multiple clustersthat they had to write more and morebackfilling jobs there was no data lakeconcept everyone was just querying therow data no fault tolerance one jobcould bring the whole cluster down andimpact all other users there was nobuilt-in orchestration we had to shipentire spark distribution on bare metalwhich was errorprone and hadconnectivity issues um for interactiveexperiment there was no notebookinterface uh everyone was using a shellinterface for running uh interactivequeries which limited our data scientistproductivity so our first evolution wasmoving to virtualized environment usingVMware this introduced virtualizationsupport giving us more flexibility wegained on demand scalability though notfully dynamic we moved from yan to mosesbased uh messos based resourcescheduleuler uh as our deployment modelwas supporting meos uh at this stage wehad to use static resource allocationbecause mezos was not supporting dynamicallocation we introduced Jupiter classicfor interactivity which was asignificant improvement for our datascientist we introduced uh at that timeuh data lake concepts where HDFS clusterhad network connectivities and samespark job could run access more than oneHDFS cluster we cowed out hive metastore concept we introduced icebergtables we supported table levelqueries u talking about the challengesuh despite this we still face a lot ofchallenges uh resource management wasstill problematic we often sawoverallocation and underutilization ofresources so our administrator uh has totrack all the usage of all the resourcesthey to carry like huge spreadsheet ummentioning okay what type of workload isused where so it was it was prettyheavyweight our t-shirt sizing were notcorrect uh for a specific amount of CPUwe were allocating large chunk of memoryand people were not uh able to getresources due to low memory presence andmost people were not efficient inwriting their spark queries um and andthey were using lot of memory heavy dataframe uh we we adjusted the ratio of ourt-shirt sizing uh which helped a littlebut still an issue due to lack ofdynamic allocation which due to whichsessions could not shrinku on the the interactive side uh theJupyter kernel configuration we were notpersisting and it was not sharable soeveryone had to uh write their own setof spark properties on on their ownkernel configuration even though runningfor the sameexperiment moving on the next evolutionwas moving to Kubernetes so as you cansee in this uh spark cluster is runninginside a kubernetes cluster usingkubernetes to manage spark resourcesinstead of yan or mezosuh Spark driver here runs inside aKubernetes port �and Spark executors runsinside a separate Kubernetes port andthe entire environment is managed byKubernetes giving us more flexibilityand standardization and Kubernetesdynamically manages these ports scalingresources up or down as neededso as we know Kubernetes is the standardfor deploying containerized applicationson cloud and cube scheduleuler is thedefault Kubernetes scheduleuler whichruns as part of controlplane it has really importantcharacteristics uh and features tounderstand uh to and to find visiblenodes uh for identifying where yourapplication can run um and it givespretty good nodes like uh tainttoleration node affinity rules nodeselectors which you can use to uh todefine where application needs to runit's primary designed for uh microsservice type workload a node batch v uhdata processing and it uses aspreadshuling approach to distributeyour workloads u uh evenly acrossavailable nodesit has a scaleout architecture withautoscaling capabilities uh it operatesat port level uh and has somewhatlimited uh flexibility uh sorryfunctionality but it was designed to bereplaceable which becomes relevant laterin ourjourney for interactive spark analysisuh we introduce uh Jupyter lab which isessentially an in browser ID um toenable data scientists and analysts touse notebooks allowing them to buildnarratives uh using both code and textin the same environment so thescreenshot uh is of the Jupyter lab uhenvironment here uh and then users cancreate multiple notebook files uh andthen run um spark kernels um either inspice spark or scala spark or a purepython kernel to power theirnotebooks uh we built kernelconfiguration management system uh formanaging the kernel configurations so weextended Jupyter kernel and supported aspark kernel where all spark propertiescan be added and it it includes bothshared and default resourceconfigurations security configurationsdata accessconfigurations uh as we know sparkproperties are huge in numbers so uh itwas important for us to not overwhelm auser who really wants to just run somesimple spark experiment as well asallowing admin or advanced users whowants to uh look at the properties andtweak some of the advancedconfigurations so we built capabilitiesto hide those configurations based onthe uh user and then uh as well asallowing them the flexibility to tweaktheir configurationsuh and our kernel configurations areorganized as a hierarchy for easiermanagement of properties and to supportdifferent versions of Spark uh and thenwe we flatten that in in our meta storefor implementation simplicity and tobetter use them in the Zuperenvironment so now talking about thechallenges so with this model uh therewere a few challenges one uh as we as weknow the cube scheduleuler was notdesigned for batch workload as spark uhas it's not app awareuling and itdoesn't have queuing support so withoutqueuing management and borrowing ofresources from each other it was not aseffective for interactive batch usecasesthere's uh there's no support ofgangulinguh especially like libraries like XDboost where we want all executors tocome up at the same time and you don'twant to leave them either is issomething which is missing in uh cubescheduleuler uh and interactive usecases like we needed uh fairnesspreemption reservation u resourceguarantee and uh resource guarantee forfor um better user experienceum and you know bin packing for forbetter utilization of cluster resourcesuh so those were the things which whichwere not available in cube scheduleuleruh and then we faced challenges u usingcube scheduleuler for spark interactiveworkloads and for kernel configurationmanagement we built the system which isoutside of zuper lab so now users haveto tweak their configurations in onesystem and then they have to use that inJupiter lab which is which was kind ofcreated a friction for our datascientists so this brings us to ourcurrent architecture spark on kuberneteswithunicorn uh as you can see here unicornsits as a specialized layer betweenkubernetes control plane and our sparkapplications uh it provides yanyanlikeuling capabili�ties uh withKubernetes giving us best of both worldsit h it uh provides hierarchal resourcemanagement with guaranteed resourcequotas per Q this architecture maintainsall the benefits of Kubernetes whileaddressing the scheduleuling limitationsfor Spark workloadsu talking about unicorn scheduleuler uhso it it can work with any kind ofworkloads through port annotation itprovides hierarchal kota management uhthere's no custom resource definition uhis needed u it completely replaces thecube default scheduleuler handling bothand queuing uh both queuinganduling it supports pri prioritypreeemption FIFO fair sharing gangscheduleuling uh resource borrowing uhexactly what what we needed forSpark so this is our currentarchitecture here um you see in thismodel we are leveraging uh node poolsand unicorn cues uh we have divided ourworkloads into three node pools um thetop uh one is node where we want to runour longunning Jupyter server which isproviding an environment for datascientists where in on Jupiter serverthey can launch kernels to run theirspark jobs uh since it's a longunning uit it doesn't scale up or down it's it'smainly just one port so it is managed bythe cube default scheduleuler and thenuh the there the driver the spark driverand is managed is uh part of a differentnode poolool too and the third nodepoolool three is for spark executors andwhich are preemptable and then they canscaleindependently so with this architectureit provides us the workload separationuh grouping them based on the similarconfiguration andstreamline streamlining life cyclemanagement for each type ofworkloads uh the other advantage we wesee here uh by putting the driver andexecutors in the same network John uh ithelped us providing better u shufflebetter shuffling and and the reduced uhin shuffle latencyuh so we improved our interactive sparkexperience here uh with furtherimprovement with Zuptor lab so we builtin uh a Zuper lab plug-in uh for kernelconfiguration management to reduce thefriction u and to allow users to doeverything in the sameenvironment uh we made the uhimprovement in zuper kernel connectivityum so it used to happen that uh user haslaunched a spark kernel from Zuptor thespark kernel is running and the code isup and running behind the scene but uhZuper lab doesn't have connectivity toconnect to that kernel so we we had weimproved uh the kernel connectivitylayer to try to uh show the the theexact status of the drive of the sparkdriver kernel which is uh which isrunning onkubernetes and uh we gained utilizationtransparency giving users bettervisibility so overall this created amuch more seamless experience for ourdata scientistsuh this slide shows our Jupyter labkernel configuration interface uh whichprovides a more integrated experiencecompared to our previous setup so youcan see here um so this is a Jupiter labuh spark kernel where all the sparkproperties are configured and the samecan be used to launch the kernel bystaying on the zuper lab environment sowe have we have different presetsthere's a minimal preset this is onlyfor just a smaller set of properties andif the users goes to the the uh the fullpreset they'll see advanced set ofconfigurations only only if they want tosee it and tweak it we are planning toopen source uh this sometime in futurewith uhZuptor so um with with this approach uwe we gained many benefits uh one is uhwith unicorn we now have applicationlevel uh wearululing we have achievedsignificant costsaving throughout our uhthrough our better resourceutilization we have a comprehensivemonitoring capabilities and uh now thissolution works both on onrem and uh athird party cloud giving us moreflexibilityso looking at specific features acrossour journeyuh so we see that we we didn't have uhthe cluster autoscaling dependencymanagement some of the the all the datalake concepts u when we started withbare metal uh we improved on that whenwe moved to VMware um solution but atthat time we lost the dynamic allocationso which which uh which helped whichdidn't help in improving our resourceefficiency NC uh with with Kubernet�esour resource uh utilization uh improveduh and then but it was still it wasstill not up to the mark where we wantedand we we saw the advanced improvementwith Kubernetes on Unicorn uh we withqueuing support we we had it availableon bare metal but then we didn't have itwith our VMware and Kubernetes and nowwe still have it back with Unicorn uhour cost efficiency has moved from highinvestment U utilization to bestresource sharing with lowest cost gangshoulduling a critical feature for uhSpark is now available with uh Unicornuh resourceful uh fairness and jobprioritization have evolved from basicto advanced with our currentarchitecturenow keytakeaways I think to start with if youare in your journey of moving from baremetal to kubernetes or you're already inthat in that process uh the first is youneed to assess your current u umworkload pattern whether it's a uh batchuh processing or you need interactive orAI capabilities uh uh this is reallyimportant to decide uh what kind ofulingcapabilities you will be needing andthen you can optimize onthat uh design a phase strategy uh startwith the hybrid approach uh in migrateincrementally ensure you have uh propertest plans before migrating users andwhich is running on your both uhprevious infrastructure and the newinfrastructureoptimize uh resource uh util uh umallocation u so as you know the cubescheduleuler is not meant for uhbatchuling uh jobs so this is one thingwhich you would want to consider fromday one like having a batch scheduleuleron top of uh the the the currentulingcapabilities which Kubernetes hasyou would like to enable preeemptionpolicies to ensure uh there are prior ifthere are priority jobs has to bescheduledum enable dynamic allocation uh which wedidn't have when we uh moved to VMwareand then we uh we were bitten by thatbecause dynamic allocationensures a more cost um resourceutilization I mean of course there arecases like XD boost where you don't wantu uh to enable it and you want all theexecutors to come up at the same timebut most of the cases you would like toenable dynamic allocation for betterresourceefficiency uh implement proper portulinguh you use uh Kubernetes provide uh tonsof configurations to ensure that yourports are placed at the at the rightnode you can use node affinity rulesnode selectors um taint tolerations toto customize your kubernetes specificconfigurations to optimize unchedulinguh handle your storage and networkingchallenges if you have data sitting onlocal disc you won't like to move to thedistributed storageuh uh and networking could be differentlike when we started we had the baremetal only u the cluster was havingaccess to only the HDFS which it wasconnected to so you would like toconsider those networking um challengeswhen when you move uh from biometal toKubernetes uh configure proper proper uhconfiguration uh proper shuffle servicewhether you want to use external shuffleservice or shuffle tracking u some ofthe configuration might not be the sameas you are uh as biometal to kuberneteslike CPU requested limits when westarted we used to have a defaultconfiguration with putting limits as wayhigher than request um which which bywhich we got throttled because uh whenwhen you when you set it that way onlythe one which has the highest limitstake over and you become the bad noneighbor so uh and we ended up settingboth requests equals to limit um to touh make it fair uh for betterutilizationand uh uh yeah this is the thing whichyou don't want to miss set upcomprehensive monitoring uh you ifyou're using Unicorn enable Unicorn's UImetrics to trackulingefficiency u there areuh in in terms of pitfalls andantiatterns yeah I mean you as I saidyou you would like to enable uh dynamicscaling um uh there are cases where youwould not like to do that but but yeahdon't don't uh forget to use the dynamicscaling which is reallyefficient uh storage and networkingchanges might might be tricky sometimesso don't underestimate your storage andnetworking changes and um uni if you'reusing unicorn uh you might want to uhmake sure that youruling policies arecon�figured correctly as uh it'll betricky to debug once uh if you were youif you want your port to run at specificqueue or a specific location it's notit's uh landing in other other place uhit'll be tricky to find it out so youmake sure that you are configuringyouruling policies correctlyyeah that's all uh thanks thank you all[Music]questions yeahdo you hear me yep yeah uh I would liketo have a two questions first one isabout data so where do data lives in theKubernetes environment and the secondone is there is no external shufflingservice in the Kubernetes environmenti can yeah take the first one yeah thedata lives in cloud storage so we justin cloud storage so it can be like localcloud storage cloud cloud storage youthink about like S3 or SE for theseother technologies okay so it'sbasically like getting things off of theHDFS clusters into a place where you canget good access to the data okay it'sstill local but then you can in ifyou're in different regions and thingsyou still have access to the data thankyou the second one is no externalshuffling service anymore uh yeah Ithink so there's been some efforts totry to build that but right now we'reusing mostly shuffle tracking and we'rereally interested in like figuring outif there's going to be a great long-termsolution for external shuffle trackingright right now we're not using anythingspecific yeah okay thank youuh hi quick couple questions so uh atthe last you mentioned don'tunderestimate the storage requirementsso we are also in the process ofmigrating spark jobs from yarn tokubernetes um one of the challenges weare trying to solve right now is theephemeral storage requirements thatthese jobs have so we are looking at uhsomething like ephemeral volumes forinstance but curious how you went aboutit and if that's something that's aproblem that you also faced yeah I thinkwe did um do you do you know more go forit if you No I mean so we have uhdifferent so one is the uh so one is aJupyter interface which we're providingfor Spark and where the Jupyter fileslives which is not typically the databut the the compute code which userswant to use it so we are u so we areusing like uh the kubernetes persistentvolumes for that one but the actual dataas uh is is the one which is residing onthe cloud like which is separate fromthe the kubernetes infrastructure thatpart makes sense uh curious so one ofthe thing one of the requirements wehave is really high throughput forparture on spark jobs because severaljobs wanting to run we want to keeputilization high very similar systemwhere we have a queue mechanism wherejobs are queued all the time when welooked into using persistent volumes forexample the throughput slowed downsignificantly like we could maintain athroughput of about 200 300 parts persecond without PVs but as soon as we putPVs in the mix performance drops by likefrom 200 to 50 parts per second max wehave the same problem right how do yousolve it yeah we you use local diskright use local local disk as much asyou can and like configuring that disccorrectly and using the right types ofdisc right is probably the best you'regonna do right i see yeah like trying toget you can also end up saturating theability to actually mount the PVCs solike I would totally avoid that ifpossible right i see in that case you'reusing just empty D with disk in thiscase uh empty dur with disk or you canconfigure more specific volumes like inthe size and shape that you need them inour kernel configuration that Nihashowed you can specify the diskproperties you want see and atscheduling time you can we can get theright uh workloads to the right pods orto the right nodeswith the right pod sizes right I see butin that case do you do use host paths asvolumes or something else um I can'tremember what that is now but let'sconnect after and I can get an answerfor you sounds good thanksHow has been your experience withshuffle tracking uh because we kind ofran into a few problems so I'm trying tounderstand if we are the special case orit's kind of uh what type of problemsdid you run into um fetch uh failing at�times or um executors staying around forlonger than required yeah yeah so we raninto some of those as well um everyone'sfeeling the same pain do you also usedecommissioning um I don't I don't thinkwe're using decommissioning oh you Ohyeah no we do we actually So um if I'mif I'm understanding decommissioningcorrectly the way you're asking it solet me just kind of explain uh as we'reusing shuffle tracking shuffle trackinghas to be configured well right andthere's a lot of knobs that you have tolike keep data cached in the executivesright and then you have timeouts as wellso you can time out those things anddepending upon the type of interactiveworkload and the nature someone may wantto cache data if they're just cachingdata there there's a cache timeout andthen on shuffle tracking there's verysimilar concepts right so um you maywant to keep those nodes uh those podsup for a certain amount of time and thenas uh as you use dynamic allocation theshuffle tracking will basically uh wayneoff because you're going to collapse thecluster right you're going to scale itdown so an example is like I may start acluster that has two pods right asexecutives and one driver and it may endup scaling up to 100 pods right but thenwhen I stop using it I've stoppedwriting queries or doing data work itcan scale in and then the shuffle uhblocks will end up uh being garbagecollected right so when you're doingthat basically if you're doing somethingreally intense you need to configureyour shuffle properties shuffle trackingproperties such that for the nature ofyour workloads your ad hoc workloadsthat those shuffle tracking blocks stayaround Right does that make senseyeah it it it does um and I think thedynamic allocation as you said isinfluenced by multiple factors not justshuffle tracking uh blocks yeah for surefor sure yeah I think there there aretwo sets of things I want to talk aboutright so the shuffle tracking is oneaspect of it but then there's a set ofuh properties to help you keep yourexecutives around as well right likewhen you go and you cache data you don'twant those uh executives to shut downbecause you've cached the data you'regoing to reuse it you might write onequery and you want that data cached andthen you're going to derive anotherquery off of that same data right soyou're keeping the executives aroundright for a certain amount of time foryour work and then once you stop thenyou have shuffle tracking that's goingto start cleaning things up right ithink it's also quite relevant in termsof if you have complex spark jobsexactly say if there are uh six stagesthat takes 5 hours um then in that casethat's where the shuffle tracking hasbasically given us the most pain exactlyum I think the interactive workloadswhere data scientists let's say stopworking in our uh setup uh the idletimeout for Jupyter notebooks basicallykills the notebooks before shuffletracking can even you know downscale soyeah I'm going you finish and then I canjump in i think I have a I want to add apoint did you want to finish that or uhI think yeah I think uh I will just saythat um for larger interactive workloadsI highly discourage using Jupyternotebooks right anything that's going torun past 30 minutes and 30 minutes isyou mean a lot right anything that'sgoing to run past 30 minutes put it inthe background so we have the ability tobasically build a DAG and you you allgave a presentation at Airflow Summit onthe same topic but we can basicallyschedule those notebooks to go in thebackground so the users get theconvenience of typing the code in thenotebook and then they can hit a fewbuttons and drop it in the backgroundand run it on a schedule or just stuffit in the background and then theresults will get like generated and umthe user will be notified that theresults are there right and when you runthose types of jobs where you need tolike maintain the shuffle data for alonger period of time across multiplestages you can tune the properties asyou need to make that job efficient andthings to not get collapsed in butremember as she showed on that lastslide let me bring the slide back ithink thisone you have on the I think the righthand side for you here there is in theQ3 there's a batch workload once you putthe notebook in the background itbecomes a batch workload and your batchconfiguration with unicorn needs to betuned for those needs right okay so wehave like uh data scientists that willgo do work and schedule things nightlyor weekly and we make sure that they putthose things in a batch uh queue so thatthings get scheduled appropriately andthey get the type of behavior they needin terms of long running work right doesthat make sense that makes sense yeah uhnow you reminded me of another questionthat No no no it's it's 6:05 so I wantto be respectful to other people's timelike people don't need to stay but I'dlove to have you if you want to want tostay go for itum why did you separate out node poolsfor Jupyter Ps drivers and executorsyeah I think like NA mentioned earlierin the talk you want to take it you takeit no no you no you take it sorry I'mtalking too much i think it was it waskind of mentioned the networking partbetween driver and exeutor i'm kind ofwondering there are three layers intotal uh yeah sure so the Jupiter portis kind of a longunning port right so itdoesn't uh it doesn't require anyspecial treatment uh in terms of scalingup or down so we we don't want topreempt these Jupiter ports if any highpriority job comes in so they want totreat this differently than the the restof the the Spark uh jobs which has apriority setup so these are so that'swhy like we want this to scale updifferently and don't want to minglethem with the rest of the Spark jobs ifthat makes sense okay yeah i Yeah okayit makes sense thanks i can also add alittle more to that like when you havedrivers right when we uh connect tothose uh Jupyter kernels there thedriver is the kernel right and thedriver because that's the interactivecomponent we're connecting to to sendthe code to get the get the result thatthat pod we don't want to go down at allright so it needs to be in its own nodepool and the life cycle of those pods isdifferent the life cycle of the driverthe driver is there longer theexecutives can come and go betweenmultiple cells in the notebook you'rerunning different queries right so youmay go from two two executives to 100and then back down to 50 and then backup to 100 and then back down to 75 sothe churn rate in the node pool 3 therefor executives is different and yourautoscaling configuration can betailored to the nature of thoseworkloads does that make sensecan you give me specific examples ofwhat those configurations are because Ithink that is the gist of the answer uhlet me think of specific configurationsum example wouldbe I'm trying to think on the executiveone is always the easy one right whichis like I may need to and I have peoplehere waiting to maybe throw me out sorrywe're running out of time okay we'rerunning out of time so last last answerand we can just sync up offline too umultimately um if you want to have let'ssay a 100 nodes and you want theequivalent number of pods on those 100nodes right and at peak time right whenpeople are running heavy queries let'ssay they're coming in the office theystart running those queries right andthen with dynamic allocation that thatcan scale in it can scale the nodes youcan go down to 10 nodes if everyone goesto lunch right and then it can scaleback out when they come back and theystart writing more queries but let meanswer the rest of that over here on theside of the stage thank you very muchthank you2025-04-15 22:03:11.468135 ��B�M#��;AQ2ct5OXQ8fUthanks folks for coming uh today Nay andI here from Apple we're going to talk alittle bit about the journey we had frommoving our Spark workloads from legacyHadoop clusters um and overall to theKubernetes kind of um ecosystem a numberof years ago when Kubernetes was maybenot as nice as it is today so you'llhear some war stories there as wellso so in in terms of agenda we're goingto talk a bit about you know going frombare metal to Kubernetes and how we hadto evolve ki��ell you know it would bereally good if you just showed like avendor neutral like open source solutionwe're like yeah that's super cool we'reokay with that and I think that's reallythat's really nice like first of alllike it was said to us really nicely andsecondly yeah we we don't want to likemake one vendor stand out over theothersum and I think the the other thingthat's really important is it's athoughtful community members treat eachother with kindness and respect um whenI first submitted my first PR now I'veI've been in tech for 25 years um I'veonly been in open source for the lastthree years and the first time Isubmitted my first PR to otel and thatwas my first open source project I waslike I was so scared I'm like oh my Godpeople are going to think I'm animposter and they're going to hate mywork blah blah blah and then like thecomments were so thoughtful and nice andlike they were constructive so becauseof that I'm like hey this is a legitgood community and I want to keepcontributing toit so next we're going to talk about theotel end user Sig story thank youAdriana and yeah I want to Echo what shehas said about it um when I firststarted and I was trying to figure outwhere I could best contribute I hoppedinto a bunch of Sig different Sigmeetings andum everyone was just like oh hi welcomethank you for being here and I was likeoh wow um so you just heard anabbreviated version of how open t treecame to be um and how kind of bits andpieces of the committee start formingtogether so we have a governancecommittee which is made up of memberswho areelected um a technical committee andboth of those oversee them the projectas a whole we also have special interestgroups or sigs for just about everycomponent of the project um as well asworking groups which are centered aroundumtemporary um projects so to speak um themain distinction between a Sig and aworking group is thepermanence um so we'll get into that alittle bit more uh in a second becausewe originally started out as a workinggroup so one thing that we noticed thatwas missing um this was2021 um a couple years after the projecthad kind of been around for a little bitso my manager at the time sh Quan um shewas working with people like aloitaSharma um Ted Young Henrik actually wasinvolved quite a bit too um at thebeginning and one of the things thatthey noticed was therewasn't a dedicated space for end usersum and because it was still in itsinfancy you know documentation was stilla huge work in progress there wasn'tdedicated resources for end users to beable to go and like ask like hey youknow how do we do this how can we dothat what even is open Telemetry and umkind of along those lines sosh um along with some of the people thatI mentioned and others as well puttogether a proposal for the end userworking group and it was proposed andaccepted obviously and eventuallyum we started working to get together tofigure out okay what are end usersneeding right now that we can helpprovide um so a lot of it was having aspace for end users to get together andbe able to talk about um hey what haveyou done I'm trying to do this hasanyone else tried this and what have youlearned from it so this is a question weare still answering because end usersneeds evolve and they're going tocontinue to evolve especially as theproject itself evolvum and of course on the stage you knowit's a lot of very early adopters um andwe'll talk a little bit more about thatin the next section as well um soThrough The Years um let's see theworking group was formed in2021 and over time as we started doingmore ofthese um events and activities andputting together resources for end usersum we realized like hey this is nolonger a liketemporary issue I don't know if theissue is the right word for it but um werecognized that hey this needs to be abit more of a permanent solution um andresource and so um in between we putforth a request to convert the s fromend sorry yeah from end user workinggroup to Sig and it was a process thatluckily had the blessing of thegovernance committee so we were able toform U form as a s �and now we have umapprovers Chargers approvers andmaintainers okay so um along with thatwe got our own giab repo we got our ownCommon Board and I think it umconverting from a working group to asingle so kind of gave us more Authorityor made us like more official yeah whichI think really helped that we've um beenable to get more contributors as aresult so that was very exciting forus um okayso I think we're going to talk aboutthat in the um implementation projectsection um more about how we've evolvedbecause that's the excitingpartokay oh yes so2022 uh the last two years we tried manydifferent things and right now we'restill trying many different things um asI said and using us needs are stillevolving so it's aproject okay so this is pretty cool umthese are some of the things that wehave done and what we're currently doingand the first thing so one of the firstthings we started putting together wasan ner discussion group and this wasactually based off a suggestion from anearly adopter of om Telemetry um he waspart of another open source projectwhere they had these end user discussiongroups so it was a dedicated time forend come together and talk about heywhat are you doing um how are you doingthis and so we kind of Follow The Formatpretty closely we had an agile coffeeformat which is where um everything istime boxed um neatly to make sure likeeverything on the agenda gets um uh setamount of time so for instance like 15minutes per topic or whatever works foryou and yeah it was a forum for and justto ask questions we also added the umum added at least one or two otelmaintainers to kind of give guidance tosome of the more technical questions ormore obscure questions that we ourselveswere not uh best able to answer and weactually had at one point this activityfor all the regions so America Europeand apj um eventually it it became alittle bit too much to handle across alldifferent time zones and so we have haveactually deprecated um this particularactivity in favor of some of the otherones that we're going to talk about nextso one of the ones we had at thebeginning as well um they used to becalled the Q&Aslin feedback sessions and now we justshort shortened it to oel me sessions sothese are where we have a um schedulewhere we'll book a time with a an userorganization and we aska set of standard questions like tell usabout your Tech stack when did you getstarted with open tety like how and whylike right why open like why did yourorganization decide hey this is what wewant todo and we also depending on theirspecific implementation of openTelemetry in their org we'll also askspecific questions on like how was thisexactly done um and then another goal ofthese is to get feedback back from theEd user so we could siphon it back tothe maintainers and contributors andhelp improve the project as awhole and yeah so these have alsoevolved um they used to be just likeZoom meetings you know the traditionalformat very dry and we've recently withum the help of Henrik rexed from Dino TRbeen able to up the production value sothey look more exciting oh I should haveput a screenshot um but that's okay alsowe started adding some really coolanimations to the beginning of our livestream so we've really Zed it up thankyou um that's it's just something fun umand if you have questions about it I'mvery happy to talk more about thatspecifically um but yeah we live streamthese I think we try to do them once amonth um it depends of course like onscheduling so I mean I'd always be oncea month but we make sure that these arepublic and and viewers um are able tolearn from what other organizations havealreadydone yes and I don't know where I pausethere okay so another one we've beendoing since the beginning is somethingcalled open Telemetry in practice andthese are in a more Meetup style formatum we usually invite um whoever isinterested to it could be an end user itcould be a contributor um to come on andpresent something about openTelemetry um could be something abouttracing um we've had some interestingsessions although I'm blinking right nowoh we had o�ne where uh we've had acouple where it was like observabilitycompanies that are that were likeimplementing oel internally so thosewere those were cool jaob aronov whenwhen he was at light step um he came andtalked about how they used targ allatyeah uh yes that's right he talked aboutthe target allocatorum yeah so it's like really cool usecases and the other thing too that I wasgoing to say it's like if you want totest out your otel talk guinea pig uslike oh yeah yeah um we try to encourageum you know people who are not asfamiliar with the stage but they want topresent a talk um feel free to hit us upand we are happy to give you a forum totry it outyourself and again these are um livestreamed and we make the recordingavailable so you can also be enabled onall the things that these people areshowing and thensurveys okay so surveys um basically weum uh I think it was like last ccon nosorry yeah last ccon EU um we gotapproached byasi from uh The otelCollector say saying hey it'd be reallycool if you guys could help us out um toput out a survey on The otel Collectorso that we can use it to drive what theroad map of The Collector is going to bein the next year so we're like oh thisis awesome so we we basically um workedwith them to craft the survey um we umwe basically um we promoted it on theotel socials um and then we even um madesure that uh we even held a a sessionlike basically a panel like a livestream panel where some uh collector endusers could actually um meet live with amaintainer to talk about some of theirfeedback around the uh Around The otelCollector and the cool thing about thisis that once that happened then we hadother sigs reach out to us and say heycan we work with you um to put out asurvey and so I I think one of ourstruggles actually originally with theend user Sig was one of our mandates wasto bring the feedback of the end userCommunity back to the maintainers andthese surveys are such a great way to dothat because now we have like concreteresults that we can share back with themaintainers so we have the surveyresults and then after the survey uh isclosed then we um then we make sure thatthe results are present are basic areshared with the uh the Sig that put outthe survey and then also make sure thatthey're shared out with the maintainersand all of the stuff is availablepublicly on our GitHub repo so we haveall the survey results on our GitHubrepo and then we work with the sigs touh to write a blog post summarizing thesurvey results so we have that um thatinformation publicly available for uhthe sigs to use that to to really helpDrive Improvement across the org um soand and actually the uh the surveys gota shout out at the otel uh uh projectupdates yesterday so pretty exciting andI think the one that just wrapped up wasthe contributor experience survey so weare looking forward to helping um afacilitate more surveys in the comingyear so hit us up um if you're part ofthe otel community and want a survey putout um we're going to start puttingtogether a schedule of surveys as wellbecause we realized that um you knowthere's a lot of demand for surveys andwe're like we need actually a moreformalized process for handling thesurvey so making sure that we have um aprocess for folks to approach us umaround uh running the surveys and thenmaking sure we have a schedule so wedon't have surveys overlapping with eachother because we also don't want surveyfatigue so that that's been a bit of achallenge with the surveys that we'reworking through right now um we've alsohad a couple of ad hoc sessions sosometimes you know we can't get someonefor hotel uh Hotel me or hotel inpractice and we're like you know what wedon't want to be out of sight out ofmind we want to make sure that wecontinue to engage with the community umin one way or another so we'll havethese ad hoc sessions um we've had umearlier in the year um a session wherewe got people together who were part ofthe otel community who have experiencedum in writing cfps or have been onprogram committees and and have beensuccessful in getting their talksaccepted we u�h we held a live stream totalk about um you know tips and tricksfor getting your your cfps accepted umso having those types of communityconnections uh We've also had round taeswith observability experts um uh as partof this as well um and uh the importantthing here here is that this is anopportunity for us to continue to engagewith contributors and end users alikebecause again this is this is acommunity that's not just contributorsit's not just end users it's both so wereally need to have um that intersectionbetween thetwo and then finally we have otel inreal life um events so it's this is thepart um you know having the cubec consthis stuff is awesome um Con Europe uhespecially um I feel like it's superwell attended by folks in the otelcommunity and I think having the oelobservatory um in the last several cubeccons um has been really really helpfulbecause it's been it's made it such awelcoming place for people to come formaintainers together together for endusers to come in and answer qu to comein and ask questions um it's been reallyawesome we've organized in-person Sigmeetings and then the other thing thatweuh that we started um four cubec consago so starting in cucon Chicago um wasthe humans of otel uh interview Serieswhere again it's it's all about umkeeping uh making sure that like we'rewe're top of mine in the community rightso this the point of these interviews isfor us to introduce um the folks who arecontributing to otel whether you're amaintainer contributor approver um oreven end user so that you know likethere are people behind open source it'snot just like some random GitHub Avatarthat's like approving your poll requestsor requesting for more feedback thereare real humans that make open sourcework and without these people putting inthe time the effort actually giving acrap about what's going on um thisproject wouldn't be where it is so wereally want to make sure that youconnect with the folks who are are partof this um of this wonderful project sowith that let's uh let's go over somestrategies for uh for helping you ifyou're building up your community sofirst off oopsie uh make sure you'reresponsive and transparent I think onething that's extremely successful aboutopen Telemetry is the fact that folksare fairly responsive fairly transparentno one's mean um even like the otel anduser Sig a lot of people um tend to postin our slack Channel questions technicalotel related questions we don'tnecessarily have all the answers but wecan at least Point them in the rightdirection because I think nothing'sworse than someone posting a questionand then their question getting ignoredand they're like well this communitysucks I'm never never going to ask aquestion again so we make it a point ofno matter what someone answers thatquestion points them in the rightdirection um we want to make it easy forpeople to contribute and I think we'regetting there um we put out a blog postI want to say last year on uh how tocontribute to open Telemetry and nowthere's the contributionuh Sig which is I think really going tohelp make it even easier forcontribution um making sure I think oneof the things that's helped us with ourcommunity is hosting events on aconsistent basis so like I said even ifwe don't have um something like otel meor or otel in practice going on makingsure that we have something to remindour community hey we're still around westill matter we still care about youoh so improving production value arepossible this is what Adriana touched onum briefly earlier um where we judged upso instead of using Zoom we have um weuse streamyard now for live uh for ourvirtual events um and then even justdoing something like I use canva I'm ahuge fan of canva um to create eventGraphics that are more eye-catchingbecause you're fighting with um people'sattention spans right there's so muchgoing on people have their day jobsthere's million other things going ononline um so if you are able to umeither do it yourself or have a resourcea creative resource um you can reach outto we definitely recommend doingsoum and also keeping vendor neutrality asa top party ad mentioned this quite abit um we work for competing um observvendors but but we don't feel we don'tever feel like we're competing with eachother um and I think that hopefully thattrickles down into uh the events that wedo and the people thatattend and of course working around timezones availab availability that's stilland always going to continue to be a bitof a struggle right um but you knowsometimes I'll come on late or come onsuper early so that someone else doesn'thave to um but also keeping things atthe same time um that generally worksfor most people um is another thing thatI think has worked quite well I wouldsay also the fact that like now we havelike three proper maintainers in the Sigum we can cover for each other so youknow if like we it doesn't have to bethat all three of us make the Sigmeetings we want to at least havecritical mass of like two of themaintainers around um to to be able tolike help steer things answer questionsand it's been kind of cool actually inthe last like month or two we've had umlike a lot of people join the Sig whichwe're very excited about and so there'sa lot of enthusiasm so now we're likedealing with a bit of a scaling issue umyou know oopsie big problems for us likethis is a good problem um because itmeans that people care enough to be ableto to to want to join uh the Sig thatthey think that they can make adifference which has been amazing butthat's something that we're um I thinkone of the challenges that we need towork around now is like um how do wemake sure that we um Define the Sigpriorities and make sure that we're allaligned to the SI priorities becausewhen it was like primarily the three ofus it was like pretty easy but now asyou have more people joining like let'smake sure we don't end up with likeKingdom building because that can bevery very easy to happen righteveryone's kind of got their own agendawe want to make sure we have a unifiedagenda yeah more people joining and thenalso um more building more resources aswell um a shoot there was something wasgoing to add but I forget um and also Ithink we're attime well so we'll we'll leave you withsome resources first of all HotelCommunity Day is coming up in um the endof June as part of uh open source Summitcollocated event kind of deal um we havea cncf community page for otel so ifyou're interested in our otel livestream events um they are publishedthere um as I mentioned I did write ablog post on how to contribute to openTelemetry so I've got a link for that wehave a GitHub repo um for the end userSig our YouTube channel and of courseour slack channel on cncf slack and thenfinally I've got some Shamelessself-promotion to do I have a podcast mydaughter who 16 helps to edit the videosand she designed the stickers it has acapada remember I like them um I do havestickers if anyone is interested andI've had really awesome guests likeKelsey high tower Lys Fong Jones umReese was actually like my second guestmy first guest was my daughter so extrareason to go check it out meinterviewing a 15-year-old at the timeand then and then reys who has awesomestories and then also if you're new toopen Telemetry observability like videocourses I have an O'Reilly video courseon observability with otel check it outif you have a subscription and then bigthanks to Reese's cat Taco who is likethe best slide model ever and every timeI look through her slides I like whatrees's done with the slides and puttingTaco on I'm like it just brings a smileto my face so this is why we keep doingthe talks together hopefully make thetalks fun for everyone especially youknow this late on a on the second day ofcubec con at the end of the day wheneveryone's tired and wants alcohol yeahand that's for the scaling you know thesake thing stay tuned for 2.0 of ourtalk maybe yeah but yeah thank you somuch for coming out hopefully you foundthis useful um definitely reach out tous we areat otel D- and- user on cncf slack yesand stick around I want to take anaudience selfie so reallyquickly we thank you you've been awesome2025-04-15 22:03:12.031050 ((��;�N#��-AHdrf5QosFTwhello welcome to our session and firstof all you all are Troopers I know it'slate in the day and also on the secondand last day of the event so thank youso much for being here uh this is oursession oh tell me how to get my opensours committee takenseriously Lessons Learned as otelmaintainers let's get started so my nameis Adriana Vela I am a cncf Ambassadorblogger podcaster and one of themaintainers of the otel and user Sigalongside Reese and Dan who um is aroundcucon but not here today he's onboarding at his new uh company yes anduh yeah by day I'm a principal developerAdvocate at Dino trce um spending mostof my time in the land of otel andobservability and by night I like toclimb walls um I like to visit localbouldering gyms whenever I visit adifferent city so I checked out a coupleof different ones here in London whichhas been awesome um by and and also Ilove kadas um because they make mehappy I am Rees Lee I'm a seniordeveloper relations engineer at NewRelic and as adri mentioned Ico-maintainer Sig Along with Dan GomezBlanco um who is as I mentioned onboarding has company and I guess bynight I like to I started trainingBrazilian jiu-jitsu like about a yearago and I love it so if anyone else heredoes it and wants to talk about itplease findme okay so here is what we have for youtoday okay so we'll start by talking alittle bit about the project um how manyof you here are familiar with openTelemetry everybody or if not everybodyawesome okay so we'll just kind ofintroduce um the project a little bitand then we'll talk about the endies areSig story which is what we are soexcited to share and we'll talk aboutimplementation so like what we didwithin the Sig what has worked and whatmaybe hasn't worked so well and thenfinally we'll finish up with strategiesthat you can take to your communitiestoday andImplement okay well let's start off withthe open Telemetry project so otel wasborn in 2019 after uh merging um open uhcensus which was from Google and opentracing which was cncf because back thenum everyone and their Uncle wanted to umhave their own instrumentation frameworkand open tracing open census wereattempting to standardize that but whatbetter way to standardize than to have asingle standard so open Telemetry wasborn um a cncf project which allowed usto standardize our instrumentation fortraces logs and metrics but alsocorrelate all three which is notnecessarily something that was happeningin theindustry um and otel I mean folks hereare familiar with otel but briefly it'sa vendor neutral open source standardthat is used to generate collect andExport um Telemetry data to wherever umaccepts it so um the things I think thatare really nice about the otel communityso it was I believe modeled after thekubernetes community and I think one ofthe things that otel and kubernetes havein common is that there isn't a singledominant vendor around and I thinkthat's what really um makes this such aspecial Community I mean reys and I workfor competitors but I don't see her asmy competitor like we're buddies we dotalks all over the globe so it's likesuper awesome so you know it's such asupportive um Community um we have youknow basically we have the backing ofall the major observability vendors andwithout that backing we probablywouldn't be able to say that we are thestandard for collecting um andgenerating t data so that makes a hugehuge difference um it's vendor neutralby Design with the idea that like we'reall you know if you work for a vendorwe're all adjusting the same data butwhat differentiates one vendor from fromanother is what they do with your datain the end um it's an egalitariancommunity so there are no snowflakesallowed and no egos in the room and Ithink that's really important um youknow whenever someone um I I I rememberI think we were writing a blog post andwe're like oh we want to show how likesomething shows up in the back end thatthat the the observability vendor that Iwas working for and the observabilityvendor that ree was working for andthey're like w��and then goesinto details about each change that'simportant and that's breaking in a wayOne problem that we have observed for along time is that uh people are usuallyget aware of when something is removedwhen that actually happens and I want toraise awareness for here about that isthat before a feature is removed inKubernetes it has to go through a verystrict cycle that's called thedeprecation cycle where the feature ismarked for removal and then after sometime and after there's alternative if itis possibleand only after that the feature isremoved Uh people are usually not awareof that and they only find it out whenthey upgrade the cluster and somethingis broken But it's really important totake a look at thedeprecations The best way to finddeprecations is the change log Those aresometimes in the blog post but sometimesnot But the best place to findinformation about deprecations is thechange log Again the same note appliesNot all depreations will be relevant toyou but it is the best to have some ideawhat's going to be eventually removed atsomepoint And now let's talk a little bitabout why does Kubernetes uh call thebreaking changes differently like why wecall them uh removals and major changesor uh urgent upgrade nodes Let's see alittle bit about that So there's twofactors to that There's philosophicalfactor to that and there's the technicaland if you talk like in thephilosophical way a little bit uh we cansay that it's impossible to addressproject without breaking changes So atsome point we really have to make thosechanges And why is that like you mightask so we can maybe add a new featurethat's doing the similar thing as theprevious feature Why do we need toremove the previous feature and the mostimportant reason is there is the projecthealthiness and maintainers Ifmaintenance have to maintain the featurethat uh two two different features thatare in a way same and that provide thesame functionality that's a lot of workfor them We have to remember what a lotwhat is not often said is thatKubernetes is in large maintained bypeople who are doing that on their ownfree time that are not uh sponsored bytheir companies and they do that whenthey can they are taking away time fromtheir families friend activities to makethe Kubernetes project better for us andit is we can't really ask them tomaintain a lot of different featuresthat are actually doing the same thingSo eventually something new and bettercomes we have to get rid of it The otherthing is that generally speaking havingthe project with way too many with waytoo much code and stuff like that isprone to a lot of different feature alot of different problems a lot ofdifferent potential vulnerabilities andstuff like that So having a more cleancode base is something that can help alot with the project wellbeing And ofcourse if a change has a mitigation isit truly a breaking change that'ssomething that can be debated In most ofcases we can say that it's breaking tothe user but we consider if there's away to mitigate it to save on ourmaintainers and to help them it's betterto doit And from the technical side theKubernetes is a very complex system withmany components and a lot of thosedifferent components are versioned in adifferent way So you might see that somesome different components are taking onbreaking changes more often For examplethe APIs like for components they'repretty much uh well known and they don'tchange too often but maybe some CLI flagor configuration property APIs tend tochange more often because the mechanismfor versioning them or for introducingthe new of them can be done a little biteasier than for other componentsAnd there is one common question that isarising all over again and especiallywhen we started removing the APIs andthis is should we have Kubernetesv2.0 That's a very good question andit's very complicated one because theKubernetes is a very complex system wecan maybe even consider it is as amicros service architecture where youhave a lot of different services andwhen there are all some of them areversioned in one way some of them inanother way s�o there's that too and theother thing is like what would thatcause like what would as a user forexample if you would create kubernetesv2 uh that would also mean that forexample some tools how will tools handlethat how will for example all thematerial materials documentationeverything The a lot of users will beconfused Can they apply that to be towhat is it risky or something like thata lot of questions comes a lot ofconfusion and then it's bringing a wholelot problems that are often not thoughtabout well and that are important tohave in mind and we already have theprocedures like being able to versionthe APIs so we can have different APIversions and for example being able tohave like different config config filesof following the same pattern and forthe end about this is that we wouldn'treally consider Kubernetes v2 unless wehave some really major change short APIserver doesn't exist anymore right sothis will be a very significant changeor we don't accept YAML files anymoreBut all those changes that we had overthe time they really didn't change thecore Kubernetes The architecture is thesame all over all over the years The wayof using it like cubectl didn't wentthrough that many changes thatcompletely changed it over the time It'sthe pretty much cubectl we have beenusing for years So this is the reasonwhy we don't have Kubernetesv2 And now to see some example ofremovals like most often we have thosethree So removal on the API versionremoval of CLI flag and or environmentvariable and removal of built-infeatures and for them in most of caseswe had a mitigation for API versionusually there's a new API versionavailable for removal of CLI flags andenvironment variables in most of casesthere have been a configuration filethat have replaced those and But you canuse instead The most notable examplethere is for example cublet We havesimilar for cube proxy and for someother components that CLI flags havebeen removed in the favor of a configfile And the other one is removal ofbuilt-in features In a lot of thosecases we have an external featureavailable Perhaps the most notableexample here is for example docu Therehave been external projects I believe uhcry docker stream or something like thatthat allowed you to continue use dockerif you want or in some cases you canbuild on your own some web hook orsomething like that that's going toreplace the built-in feature and a lotof cases community does that if there isreally need for that but uh oneinteresting here and one interestingchange that a lot of people tend tosuffer a little bit is the removal colof API version and I would like to do avery short demo here So uh what isimportant to know is that when we aretalking about removing an API versionKubernetes has a mechanism forconverting from one API version toanother and then back again So if you'reusing a deprecated API version thecubernet is actually going to convertthat object to API version that's notdeprecated It's actually this stable oneand you store the object with that APIversion and even if you ask for thatobject it's going to give it back to youin the new API versionSo if I have something so I have acluster here that's running uhKubernetesv131.2 and this uh version of Kuberneteshad the flow schema resource that is inv1 beta 3 and that is deprecated and italso has v1 So if I take this v1 betatree and applyit it will give me a warning It will sayit is a deprecated and it will beavailable in 132 plus Please use v1 Butit has created it successfullyHowever if I get this flowschema so itis held forstrangers is going to work But if I getit in the YAMLformat I am going to see that itreturned me v1 But I actually created v1beta tree So for most of API removals itis pretty much cubernet is going to dothe work of you How you can updatemanifest well you can consult whatKubernetes did for you put that in yournew manifest and just continue to workwith it Even if you need for example tocompare something like you want to getit in the previous version There'soption for that as well You can usecubectl with that You can provide itwith the reso�urce then the version andthen the API group and it's going toreturn you in v1 beta 3 with a warningthat it has been deprecatedSo after a small demo that I hope youenjoyed we are going to talk a littlebit about the anatomy of a breakingchangeor of any change actually So whatbreaking change has to go through ispretty much what any change has to gothrough and it looks like this So anychange in the Kubernetes projectespecially the significant one has tostart with a proposal something that wecall the Kubernetes enhancement proposalor a short cap The caps are living in onGitHub in theKubernetes/enhancements repository andit's a pretty long and detailed documentthat contains things like the timelinesthe proposal itself the motivation ifthere are alternatives that wereconsidered the implementation the testplan and for the endic that's called theproduction readiness review So the wayit works like how the change happens isthat someone is proposing it Uh then themainteners are going to review it andthen the special group of maintainerswho have a lot of experience with theKubernetes projects are going to look atit to make sure that it makes sense thatit is okay for running it in theproduction and to give their like ifthey are happy with it or not or if hasto be changed So every change like itwhat is important to understand that hedidn't came out just like that It hasbeen actually evaluated with a lot ofalternatives considered with a lot ofdiscussions to make sure that if we haveto do something like that that we reallyreally have to do it After that we prodwe proceed with the implementation andafter the implementation is done westart writting the performance testdoing the docs the blog posts andeventually publishingthose and from the release uhperspective So how the release cyclelooks like the release cycle looks likethat we start that it takes maybe 13 to14 weeks depending on uh exactly whatare how it all goes together andavailability of people and other thingsconferences like CubeCon but in generalthe very first thing is that there's theproduction readiness freeze which isabout 11 weeks before the release wherethis team has to uh where every proposalhas to be reviewed and to has a positiverating from the people reviewing Then wehave the enhancements freeze which isabout 10 weeks before the release And ifyou want your feature to go into therelease this is the point where it mustbe implementable Like if your proposalis not ready to be implemented by thenis not approved doesn't have all thenecessary things that it needs to hasit's not going to end up in this releaseMaybe next one and five weeks before therelease is code and test freeze wherethe code has to be done and merged wheretest has to be done and of course if itdidn't made by then it's probably notgoing to end up in the release Thenabout four weeks before the release westart with uh publishing some of the theblog post and working on the docs andall the needed what is for the releaseAnd at the end we have our lovelyrelease Now let's talk a little bitabout the policies uh which is importantaswell And the first policy that I want tostart with is the Kubernetes deprecationpolicy It is a very important one a verydetailed policy It has maybe 11 ruleswith some sub rules but quite complexdocument that has everything that youneed to know when for deprecating thefeature and eventually removing it As wesaid a little bit earlier before you candeprecate the feature before you canremove the feature you actually have todeprecate it first And what is for youimported as a user is that in averagebefore you can remove the feature it hasto be deprecated for quite a long timeFor general available features it takes12 months or two releases whichever ishigher For beta releases this is threemonths or one release And for alphareleases those can be removed at anytime And this is why is it veryimportant to keep an eye on deprecationsand to make sure that you're aware thatsomething is eventually going to beremoved What is covered by Kubernetesdeprecation policy those are APIs flagsand CLIs features or� behaviors andmetrics Again the information that theypresented was average Some of thosemight have some little bit differentnumbers but in most of cases it is verysimilar And another policy that isimportant is the version skew policy Itis maybe not important as thedeprecation policy but it is interestingto keep in mindthat you can have a skew when it comesto version between components in yourcluster If you're using the managedoffering not that important but it'sgood to know that for example inreference to cube API server you can usecube controller manager cubeuler and ccmthat's at the same version or oneversion prior to that You can usecubectl that's one version newer thanAPI server or one version older or thesame version of course and cubleted cubeproxy allow that you that they are freeversions older three minor versionsolder than cube API server so this issomething to keep in mind and if you'reinterested you can scan the QRcode oh sorryOkay but we didn't talk about theinfrastructure did we so this is alittle bit different and it is veryimportant to clarify this aswell The infrastructure of theKubernetes project is not covered by theprior policies In other words there areno guarantees of any sort for theinfrastructure provided by theKubernetes project So think aboutregists IO DLKs io packages KS IO noneof those have actually guarantees andthat's why is that that's because all ofthat infrastructure is first of allmaintained by people who are volunteersand doing this in the free time and forthe other reason is that we aredepending on the cloud providers andother providers donating the credits andtheir infrastructure where we canactually run that If for any reason isthere any change to that we have withshort notice or no notice at all to dropsome part of infrastructure we hadunfortunately happening that at somepoint and we will see that as anexample So probably the one that you allencountered is the KS GCR io beingfrozen So you all had to switch toregistry k IO to continue There was someinformation about that at some that giveyou a time to move but once we had to doit it just happened we were not able todo some let's say as long deprecationperiod as for afeature and same goes for DLKs IO thisis another change we were pretty muchhad to do that to adopt the CDN becauseserving all the binaries and everythingbecame very expensive and we had to cutdown the costs and we just had to changeto suffix CDN to be able to save us somemoney and for that we are very thankfulto the uh Fastly for providing us a CDNservices and one more example of thosechanges is the Kubernetes legacy packagerepositoriesuh we also had to do that not onlybecause of the cost reason but becauseit became a huge burden for maintainersand we needed something that's bettertailored to Kubernetes projects So wecame up with a new solution and we hadchance to drop the new one as soon aspossible so that we can continuedelivering you the releases If we keptthe old package repos then we would havea lot of troubles a lot of problems andthat wouldn't be nice to you also it issome case where we had to say okay wehave to stop doing the old one we arecontinuing just the new one so that wecan provide you the better quality andthis is the unfortunate part when itcomes to infrastructure a lot of changeshave to happen on the shortnotice and when it comes to theinfrastructure one last thing what isimportant If you use Kubernetes mirrorall the things This is very importantbecause of infrastructure being dynamicit uh it can become unavailable at anypoint because of any reason You don'twant your cluster to stop working mirroreverything there For example forregistio you can find it It's repostsome instructions and there are also alot of different places where you canfind information how to mar files orpackages It is very important to do thatYou're first helping the Kubernetesproject because we would get to you knowsave a little bit on the cloud cost butyou're also helping yourself because youif something would happen to ourinfrastructure for any reason yourcluster would not go down and yourworkloads would continue workingespecially given there are noguarantees And finally what can you doand how you can prepare those are twoquestions that you might have after allof this So the best thing that you cando is to stay on the top of the newslike follow the Kubernetes developmentfollow the change lo make sure you'reaware of deprecations make sure you'reaware of the uh some of the breakingchanges One thing that you can do is tofollow us on social media We have aprofile on licadine on Twitter on bluesky So make sure that you follow usthere We try to post the most importantinformation that might affect you Alsoone important thing is the Kubernetesannounce mailing list It has just about5,000 people subscribed to it It isprobably not very well known I believewe have many more Kubernetes users It isa lot of traffic mailing list mostlycritical information uh like CVSreleases and some of the breaking thingsthat might happen that are on shortnotice And for the end if you want to gointo some nerdy details and learn moreabout how development is going on thereis a mailing list which is called lastweek in Kubernetes development It'smainly focused for contributors but ifyou want like to uh really learn what isgoing on every moment with theKubernetes development it goes out oncein a week There are some very wonderfulvolunteers who are uh tailoring it anduh you know collecting all the changesand stuff like that So it goes into theuh mailing list format So you can uhgive it a try There are links there willbe slides published uh later onAnd one other thing that you can do isto provide the feedback on the proposedalternatives The best way to do that isGitHub or the Kubernetes Slack workspaceSo free to join any of those and whenproviding the feedback please be mindfulof the maintainers Please be mindful ofthe code of conduct because thosemaintainers are spending a lot of timeon our lovely projects and let's makesure that that we understand them andthat we understand the attention of thechanges and then we try to provide thevaluablefeedback Finally when we say that what'sconsidered a good feedback it'simportant to know that once something isabout to be deprecated or removed thereversal or major change is veryunlikely That's because we have seensome of the reason like you know havingto think of project healthiness ofmaintenance well-being So if theydecided that it has to go it is probablyfor a good reason Not just because theywant to do it but because it isimportant for the project So it'simportant to focus on the alternative totry it out on time not to wait for theprevious API or suffic to be removedthen to try something new but to try itin advance to make sure it works like inthe deprecation period before removalhappens and to provide feedback to themaintenance to so they know that itworks wellAnd some of the changes that we havesome good examples is the docker shimlike one that we probably got affectedby at some point and we had to do thatthere was the deprecation period in 124it was completely removed and you knowif you came and said yeah we want dockersim still you would get the answer thatit's not possible but a good feedbackwas like people trying out making surethe migration works properly things likethat Same goes for like probably thelargest migration in the Kuberneteshistory This is switching to externalcloud controller managers or CCMs It wasyou know a lot of people were affectedbut it was important to do that and uhit was like what would be the valuablefeedback in such cases like tryingdifferent external implementationsproviding the feedback and making sureit works and getting ready on timeSo we are just about the end of thesession H thank you very much forattention Please use the QR code toleave the feedback about this session Ireally hope that you enjoyed it I reallyhope that you learned something new andthank you so much for comingtoday Any questionslooks like no If you have any questionsyou can always reach out to me or anyonein the Kubernetes community on SlackThank you very much2025-04-15 22:03:12.649212 22��1�O#��AlQFUarM_GXohelloeveryone welcome to the today's sessionabout navigating the inventiblecubernetes breaking changes behind thescenes I am Marco Mudrin I am a seniorsoftware engineer at Cubmatic I am alsoa SIG K infra tech lead and sub projectlead for the Kubernetes releaseengineering Aside from that I am a CNCFambassador as well Before we get startedone content warning and it is about thatthe focus of this talk is the coreKubernetes also known asKubernetes/cubernetes Different subprojects under the Kubernetes apparelthink of projects like ingress engine Xor cluster API or any other project canand will probably do implement their ownpolicies So the focus here is just thecoreKubernetes and before we get startedlet's uh think a little bit about what'sa breakingchange and the answer is it dependsdifferent projects will define breakingchange in a different way and some ofthem might be more restrictive some ofthem might be less restrictive but itreally depends and if you try to findsome definition it's not really easy butI found one on the website calledVictionary and it says "A change in onepart of a software system thatpotentially causes other components tofail occurs most often in sharedlibraries of code used by multipleapplicationsBut I believe we can simplify this andif we ask user what they think mostly wewill get answer that the breaking changeis a change that's not backwardscompatible or in other words thatrequires an user action uponupgrading and I think this is somethingthat we encountered many times in a lotof different cases and I think we canall agree that we can split breakingchanges into two different categories Wecan talk about breaking changes thathave a mitigation So Sophie has beenbroken You need to take some action Youcan't just upgrade and be happy with itThere is something that you need to doBut it's not end of the world You canstill use the feature that you're usedto just in a little bit different wayAnd there are those different breakingchanges that are much more serious andmaybe what we can consider the very realbreaking changes those are that don'thave a mitigation that the feature isjust gone and there's noreplacement and very fortunately forKubernetes a majority of breakingchanges that we had over the time hasbeen in this first category that had amitigation and we are going to see someexamplestoday first of all it's important toknow how those changes are classifiedwhen we are talking about Kubernetes andIn Kubernetes you will be able to findthose changes as removals and majorchanges You can find those in manydifferent places There's a blog postseries called Kubernetes removal andmajor changes that's going out about afew weeks before the release So forexample if the release is scheduled forAugust then somewhere mid July therewill be a blog post available that'sgoing to recap all the important changesThere are also blog post series that'scalled sneak peek that's also goingshortly before the release that containsstuff that's uh that will be changedthat's important for you to know therelease announcement blog post of courseat for the end change log You probablyhave seen it many times and thelegendary urgent upgrade notes that weare always scared about but thatcontains something that you need to bearto be aware of and of course in thechange log it's very important toevaluate it Not all changes might beapplicable to all users for especiallyif you are running on a cloud providerSo managed offering change in a cubletCLI or some flag might not beinteresting to you but some other changelike the API has been deprecated isprobably much more important toyou and this is how it looks like Sothis is in the change log This is forexample an example of the change Sothere has been some environment variableNow it works different and you need totake the action depending on whatbehavior youwant and this is example of the blogpost series that I have been talkingabout So it contains like explains thepolicies of Kubernetes something we willsee today here as well ��splease don't judge us too harshly um 30minutes and no YouTube magic to make itlook like we did it really fast helloyou should talk into the mic i shouldtalk into the mic more yep all right uhwith that panelists please introduceyourselves starting with Ian hello myname is Ian Coldwater i am a co-chair ofKubernetes SIG Security and a securityresearcher i'm independent and I'mlooking for work if you're hiring um andI'm very excited to be here thanks forcoming[Applause]i'm Xander Dervinsky i uh am a tech leadfor SIG Docs and have also been involvedwith SIG release and SIG contrib in thepast um and much like Ian also workcurrently nowhere umyeah hiremaintainers um my name is Marley Salazari work with SIG CLI mostly on CubeKettle um I do have a job but nice mustbe nice it is nice all right next hellouh my name is Cat Cosgrove uh applausefor Marley i am also a SIG doc'stechnical lead but I am also a releaseteam sub project owner so um be nice tothe release team and I also don't workfor anybody so if you would like to hireme you can do that too i need a visathough in Europe thanks[Applause]all right with that let's uh sauce upnumber one i'll ask a question and thenwe'll let you all dab all of these aregoing to be dabs by the wayseeing doubleoh that's a lotwere we all supposed to have forks yeahwe're all doing it at the same time justpass it down the line i think sticks ohthe sticks sticksto kick things off on a mild noteyou said Taylor didn't write any punstalk to him i don't have good readingskills sorry kubernetes is a prettyspicy projectbut how did each of you first getinvolved in the Kubernetes community didyou ever imagine back then that workingon an open source project I hateTaylor could heat up to the scale andresponsibility it has today so like theso the Scoville scale yeah yeah theScoville scale yeah are we answering inorder go for it well I got involved inthe Kubernetes project um in 2016i uh was playing capture the flagcompetitively at the time and got a jobas an IT team of one at a place thatprobably took several years off my lifeand gave me a few gray hairs i automatedum my way through that whole job becauseI was doing the job of 10 people and uhthen I got really interested in andautomation so I found out about thislike cool weird new thing calledKubernetes and then got a job at a placethat were very early adopters ofKubernetes 1.4 which if anyone remembersKubernetes 1.4 had basically no securityto speak of they were running it inproduction which was a very brave choiceand they said they needed a securityexpert on the team and they asked me ifI could learn how to break it and I saidyeah challenge accepted I probably canas it turns out I could break it and nowI am a co-chair of SIG security becauseI got pretty good at it and it's reallyimportant to me so happy to behere i started um how many years it'sbeen five six somewhere in there umoriginally with SIG release helping outwith the release team um I had neverreally done open source stuff prior tothat because I didn't want to codeoutside of work um and uh Kubernetes isat a scale such that like it is aproject that can actually supportcontribution outside of strictly codecontribution and so there was a placefor helping out with things that lookmore like program management and justoperational kind of stuff and so that'skind of how I found my way into it andyeah been around ever sincei was doing primarily VMwarebased stuffum and decided I didn't want to do thatanymore and uh looked around and sawKubernetes and was like that's closeenough and when I was casting them outtrying to find a SIG to do stuff for Ilooked at Node and said nope uh and thenI looked at networking and said I can'tI don't understand what's going on hereum and then I landed on cube kettlewhich was a lot easier for me tounderstand and we don't have a lot ofcontributors so I became a co-chairum I was using Kubernetes at work i usedto be an engineer um I learnedKubernetes in like maybe the worst waypossible which was by using K3S uh it'snot a super great way to learn i likeK3S but that's not a great w�ay to learnlike Kubernetes best practicesum so I I was using it but I wasn'tcontributing back to the project untilum you remember that time we deprecatedthe docker shim and everybody freakedout um that was my first contribution toKubernetes um everybody handled thatinformation really poorly because we asa project um grossly overestimated whatthe average user understood about therelationship between Kubernetes andDocker and grossly misunderstood what acontainer runtime is and um somebody hadto fill a knowledge gap so I wrote umseveral blogs and an FAQ and a guide forum transitioning your cluster over tonot using the literal entire Dockerstack and uh kind of just stuck aroundum Jeremy Rickard made me join therelease team uh and then I never left iti was pretty good at managing the voidand um so eventually they gave meownership over the void and now therelease team is just kind of my problemthere's a lot of coercion involved inbecoming part of a Kubernetesmaintenance there is yeah mine wascompletely consensualsecond question Cat uh you famously havean NFC chip implanted in your handmaking you a literalcyborg uh does having that techaugmentation give you any uniqueperspective as a DevOps or cloudnativeadvocate and what's your wildest or mostunexpected way you've used that chip toconnect to communities also to speedthis along I'm immediately going tostart my wing too okay uh let me get uhthat number two sauce thenyou're gonna Oh I'm going to answerwhile you do it okay um so not only do Ichip in my hand there's a there's asmall one in the in my thumb um there isa much larger one that Taylor did notknow about when he wrote this questionin the back of my wrist that is both NFCand RFID and it has um an LED on it soit lights up when somebody tries to readit which is pretty cool um I almostexclusively use it for stupid stuff iclone my hotel keys to it which is uhwhich is pretty fun i use a flipper todo that uh doesn't work with all hotelsbut it does work with the KalahariResort in Ohio so if you go to Code Mashyou can,00% um put your room key andyour payment card onto um your wrist umI have my Orca card on it my Seattle buspass which they don't love but uh it isfun um the only actually useful thing Ido with this is it generates all myonetime passwords for me so if I lose myphone entirely I can use take any otherdevice install the Vivo Key app on itand tap my wrist and I will have all ofmy one-time passwords again which is uhpretty cool pretty useful um butotherwise it is entirely a gimmick um ifyou tap your phone up against my wrist Iforgot what I have installed on it rightnow it will open either a link to myBlue Sky account or my LinkedIn um so Iuse that at conferences or I will umchange it to the YouTube link forum never going to give you up ofcourse ofcourse all right uh mainline Wing 2 itwas fermented kimchi uh Marley yourbackground spans onrem legacy systemsbefore jumping into cloud native whatwas your biggest culture shock whenmoving from legacy infra to contributingto Kubernetes especially in cube codleare there any fun uh behindthe-scenesstories of a legacy mindset that you hadto unlearn while learning Kubernetesbeing able to contribute at all is uhyou know VMware doesn't take pullrequests i don't know if anybody knowsthat um I also had to learn Go uh whichwas a whole I'm sorry yeah I know youknow I didn't go to school forprogramming so it was a bit much um butyou know a lot of the stuff that I do islike day-to-day helping people migrateapplications to Kubernetes uh thingsthat are ancient and crusty and take alot of effort um some of the biggerthings is like back in the day votionwas a big deal because you could likelive migrate applications between stuffand uh people who are used to that don'treally like the um kill the podrecreated elsewhere methodology that uhKubernetes is is for so that's a that'sa big one for peopleum but yeah no I mean this there's a lotof stuff uh but those those are likekind of like the really big ones allright uh Xander you have been atMicrosoft Twitter Apple and more beforeeven diving into Kub�ernetes open sourcehaving seen how these big companies runsoftware projects uh what surprised youthe most in the way Kubernetes operatesas a community also we hear you're intopottery has your experience I hate youTaylor has your experience molding claytaught you anything about patience orcreativity when it comes to shaping opensource communities christ whole assquestionspick one all right um I don't want totalk about everywhere I've worked it's anightmare um yeah I I do think workingwith Clay has changed the way that Ithink about work with computersum when doing art of a lot of differentkinds I think like you learn a lot aboutcritique especially when you do it withdifferent people and with that like youum you learn that like the goal withdoing critique is not to like enforceyour agenda or ideas onto something butunderstand what that person was tryingto achieve and work to help them getthere um how can they more effectivelyachieve what they're trying to and Ithink about that a lot when I interactwith people in this community um youknow it's and and at work too like Idon't want to enforce my agenda um youknow opinions loosely held I thinkstaying focused on like what we'recollectively trying to achieve as acommunity um and working towards thathas been much more effectiveAll right next up we are going to dowing three which is the hot ones barbcoaI believeexcitingalso none of us ate lunch so I ate lunchwhat are you talking about did you i atelike lunch twice i'm so proud of youwhat's our cussing policy by the way forthis once this gets really hot you'rewe're letting it rip okayum so I didn't mention that I work forthe CNCF and I really shouldn't givegive a uh a policyhere keep itPG-13 lean into probably the uh thePG-13 near the end that means we get oneF-word pg-13 is one Fbomb i cannotcontrol you but also please know thatas a security professional I recommendsbombs oh all right ask the questionalso we promise that these will gethotter and it'll be more entertaininglater these are just the starter wingsit's so good here's a fun one becausethe next question was actually going toyou Ian uh your last name might be coldwater but you're known for bringing theheat in Kubernetes security and you evenrocked an untitled goose game shirt onstage oncetwice i'm also wearing it right now howdo you channel that playful mischiefmaker energy when tackling serioussecurity issueswell a good thing to know about uh cybersecurity is that if you don't laughyou'll cry so I try to keep a sense ofhumor when dealing with thorny securityproblems because let's be real securityitself is a thorny problem um I thinkthat generally speaking people umsecurity professionals tend to have areputation for being kind of uh not funand not terribly pleasant to interactwith and in my experience if you areyelling at somebody they don't like tolisten to you however if you areapproaching somebody with humor andkindness and good nature they might bemore likely to listen to you and workwith together with you so um I try touse humor in the course of trying todiscuss security with people and improvesecurity overall because I actuallythink it helps i think it helps peoplehear it and I think it helps make itmore funall rightuh Kubernetes decision-m often relies onconsensus across SIGs and companies canany of you share a time when a technicalor uh community decision got incrediblyheated in uh in a debate uh and how didyou and other contributors like buildconsensus and smooth things out so therewas actually a productive outcome howabout the security people have opinionssecurity people do tend to have opinionsthis also happens like constantly in theKubernetes project though like we wealways have a lot of uh decision-m goingon that touches multiple SIGs thattouches multiple companies sig releasedeals with it all the time because we'rewe're managing contributions from mostSIGs and it uh is often messy um you'redealing with a lot of conflictingemotions you're dealing with a lot ofconflicting prioritiesum you can you can go look at any issueor or PR with like more than 40 or 50c�omments probably and you'll you'll seepeople going at it um usually it'spretty um it's pretty tame and peopledon't actually like devolve into namecalling or being mean to each other umwe're we're pretty good at this point atsolving these problems in a way that'scivil um but you can go look in like theSIG instrumentation channel right nowand there's a a long discussion going onover how we're handling some unttrackedchanges in this very cycle and whatwe're going to do about them wanting anunttracked change to be a releasehighlight so there there is a a currentexample of a disagreement going on rightnow that you can go look at yeah a lotof people really want new flags in CubeKettle and while I would love to say yesuh there are so many right now andmaintaining that is you know we needlike 10 times the people that currentlyactively contribute to Cube Kettle to beable to support more flags was that thefirst milk sip from who i took oneearlier did you yeah i'm just thirstysameall right we are going to go to the nexthot sauce which is Fat Cat ChairmanMeow's Revengeoh that's a a runny sauce it's going tobe real really easy to oversaw that onei'm going to oversize that i'm not goingto feel badthese can't be too hot yet with a namelike fat cat i mean it's the revenge ofchairman meowthis is not hot at allallright the cloudnative word uh world isfilled with buzzwords and hyped trendsservice mesh serverlessAI time for some hot takes especially aswe are now in the uh done with the firsthalf of oursauces which popular tech buzzword ortrend in the Kubernetes ecosystem doeach of you secretly find overrated oroh man more sizzle thanstake why does it not live up to yourhype in your viewwellum there are use cases for LLMs and manyof the use cases that we are being soldum LLMs for are not actually use casesthat are particularly good for LLMs asit turns out and um I think it's reallyimportant when new technologies comearound especially hyped ones but notlimited to those to understand exactlyhow the technology works and whatexactly it does so that you can siftthrough that hype and be like is thisactually the thing that it is beingpurported to be because sometimes theanswer is simply no yeah uh we had anissue with this in SIG docs uh fairlyrecently involving um an an LLM tool apaid product um that purports to make iteasier for your open source project tolike generate and maintain itsdocumentation it it automates thingslike um style guide compliance and andgrammar rules and whatnot and uh it isagain a paid product not an open sourcetool and they wanted to put theKubernetes logo on their website theywanted to be able to say the Kubernetesproject uses their tool and they wereoffering us a free license in exchangefor allowing that um however they didn'tcome talk to us first they just did adrive by PR um and it it didn't work isthe thing it was making a whole bunch ofchanges to like generated files um thata human would not actually modify uh itwas not actually complying with ourstyle guide and we you know obviouslydeclined the PR and told them to stayaway from us and the Kubernetes projectis not like a free marketing opportunityfor you so um I'm also going to have togo AI and and LLMs um sometimes a personhas to do that workyeah I I mean I don't want to take thesame one um this is not going to make meany friends i I do genuinely think thatthe UX for most GitHub impimplementations that I've seen todaystill sucks like sorry but yeah that'sone of the that's one of the hardestthings for me for my for the devs thatuse the Kubernetes ecosystem that I thatI have at my job is like the GitOps likejust doesn't seem to like make sense alot of sense to them on how it actuallyall has to work together and there's somany invisible lines between each thingand I mean it is partly my job to makeit obvious to them and give them anoverarching view of all of it but it'sso hard it's so much extrawork all right let's move on to the nextwing now uh Bourbon Maple Reaper hotsauce it says it's insanely spicy n outof 10okay you got thisi'm so happy that there's just feedbac�kin another room for me to dissociate towith thisyou know it is it is kind of a vibe youknow like look I'm not going to pretendlike I don't listen to music like thisme neitherno I was listening to Ethel Kane's newalbum this morning and it's likestraight up just noises like the firsttrack starts with just airplanes oh mygod I love that album though it's sogood i can't wait to see her soonhow we feeling so far this is nine outof 10 according to that bottleokay nowhere near as spicy as the stuffwe've dealtwith so speaking of being a ma amaintainer isn't all glamour there'sreal emotional labor in shephering ahuge community i think we all know thisdeeply and canagree how do you each avoid burnout andmaintain your passion what strategies doyou use how do you set boundaries kindof I don't burnout bad how we're allburned out yeah I'm so burnt out ihaven't made I I haven't made a PR inlike a yearall right all right let me put it thisway y'all need to say a little more thanthat otherwise we're going right to thenext wingi think it's really important to findthings for yourself to do that do notinvolve a a screen the internet um PRsdoom scrolling um you know staying upall night on YouTube like really likeget off the computer and and literallygo touch grass i'm so sorry but like I Ithink it actually really does help um Iuh dealt with my burnout by learning tochase the northern lights and now Itravel around looking at pretty rainbowsin the sky and it is absolutelywonderful if occasionally quite cold andtakes me away from the computer and Iwould just really like to recommend toeverybody that you find something thatyou love that takes you away from thecomputer whether it be art whether it besomething involved in nature anythingbecause um it's good to step awaysometimes and in my experience thathelps a lot yeah i can't even use myoffice for work anymore because it'sjust become a painting studio and I'vegot some really creepy art behind me soI can't do video calls in there oranything anymore andI did for a while the Did I show you theone with the mirrors and like the bloodokay that's It's fantastic so first ofall um that hot sauce weirdly tastes alittle bit like nail polish like the waynail polish smells how do you know thatyeah i was How do you know how it tastesbecause if I can smell something I knowwhat it's going to taste like and thattastes a little bit the way nail polishsmells um second uh I personally am alsoburned out but I'm a little bit betterat managing it than most uh so I willtalk about what SIG release does to tryto manage burnout um by charter we willdelay a release before we make the teamwork at night or on weekends uh like Iexplicitly have the power to say "Nopewe're going to push back a weeklove to hear itbecause uh we can't we can't build thisthing without our contributors we can'trelease it without the release team umthe release team is like 30-ish peopleand a massive amount of work over thecourse of 14 weeks um hurting catsarguing with contributors and trying tomanage the relationships betweenpotentially hundreds of people and uhthat sucks and it is a burnout factoryso if um I have a choice between makingpeople work late or making somebody umsmash a bug on a weekend and pushing therelease a week I will push the release aweek and like we explicitly in writingget to do that so also I want to pointout as someone who was formerly involvedin the release teams a lot that was notalways the case and it was not the casewhen I was around so thank you you're sowelcomeuh next up Red Flag Blackeyed Susan bythe or the Black Eyed Susan SpiceCompanyi think this is where it's going tostart gettingthis is where it gets unpleasant i thinkthis is where it starts getting hot okaythe last one did linger and uh one ofthe little pieces of chili got stuckunder my tongue ring and that didn'tfeel greatoh well the face he just made this thiswill be funit's because it tastes good toois this our debomb no the debomb is nextalso feel totally weird just likechewing on stage sorry i know this isawkward forme good enough all rightif you could Are� you all good this has aslow burn but it does not have a fastone no I feel I feel it kind of in thelike in the back yeah oh my god yeah Ican feelOh okay by the way now this is where itactually starts getting genuinelyuncomfortable for us so another round ofapplause for these fouryeah I am starting to sweat a little bittoo sweater was probably not the rightcall for me but I wanted to at least doa little Shawn Evans throwback since heusually wears something like this we aresweaters right nowyeah godthat all right maybe the sauce ishitting me i just gotthat if you could magically redesign orfix one aspect of Kubernetesarchitecture no backward compatibilityworries no alpha beta GA cart launchwhat would each of you do and why i cangive examples but I kind of want to notgive Taylor any more wordsi would love Arbback to be inconformancei'm just going to leave it at thatwait Arbback's not in conformancebackwards compatibility babyohI think there's some things I wouldchange about the resource management umdefinitions to make it a little a lotmore intuitive and um scalablemanageable uh customizable all of theaboveyou sound like that's starting to hurtyesoh um so uh this one has a happy endingactually so for the longest time umupdating the cube config for cube kettlewas like basically impossible uh becauseit was not versioned and any change thatwe would make to it would probably breaklike 50 to 60% of CIS in the world andum but we just we just alphaum a feature called qrc that gives us aversioned RC like config so we can setdefault flags and stuff because we can'treally change default behaviors in cubekettle um and that means that like wecan't help prevent people from deletingtheir whole cluster when they do a Qkettle delete command and don't pass theright flags but now we have a flag forinteractive deletes that will list outeverything it's going to try and deletefor you to confirm but we can't make itdefault because it would break a lot ofstuff so with the new feature we're ableto enforce let let admins or IT teams orwhoever enforce amongst the people thatare touching their Kubernetes clustershaving those flags enabled by defaultand things like thatso that's what I would change and we didit wasn't me it was a group effort by 6CLI it was great uh thank you so muchArta for making the last final push forthatum I would go back in time i think um Iunderstand why um the Docker shim waseventually was necessary right um Dockerwas our original and only containerruntime but uh that whole situation wasjust so janky um that we we needed asoftware shim to get at Docker'sinstance of containerd um I I wish wehad not designed it that way in thefirst place uh it caused a lot of dramathat took a lot of time and energy andeffort for a significant portion of theproject to manage and I fear that peoplenot wanting to deal with migrating offof Docker as a container runtime uh ledto a lot of people staying on anoutdated version of Kubernetes muchlonger than they should have and umthat's bad that's bad so if if literallyanything goes I would like to timetravel and uh with knowledge of thefuture to change the way we handled ouruh container runtimesall right um we are at the equivalent ofwhat is our debomb this is Mr naga fromIan's personal stash and it isapparently 10 times hotter than Dam butis not disgusting like the bomb is whichis why we substituted into out we arealso at our time we're going to probablygo five minutes over but I wanted all ofus to suffer on this one together 3 2 1cheers[Applause][Music][Applause]that's delicious but it is very nice youhave to ask us a question haven't yousten this right let's have some fun withanalogiesjesus it's still in my mouth sorryI missed i did it theIf Kubernetes were a character in amovie or a D and D campaign who would itbe and whyuh Vin Diesel in Fast and Furious cuzit's all about family cuz it's all aboutfamily it's all about the contributorfamily it's all about the contributorfamily i don't know well Vin Diesel inFast and Furious is the first moviecharacter I could think of so that'swhat you get I guess uh family what'sthe D and D monster uh with the that itit's like the head that floats at abeholder the snakes yeah a beholder yeahI see that you know those like oldtimeyswitchboard operatorsi think we're an old timey switchboardoperator i think I think that's whatKubernetes doesi don't think I like you anymore i loveyou too all right well we are almost attime or over time but we have one lastwing uh it's tradition around here butwe have always been doing the dab atthis point for all thesauces so let's do the last dab and thenI will give you all your last questionwell I should have worn waterproofmascara sorry everybodyare you crying yeah oh no i would neveruh I wanted to get that SNL skip mascarathat like runs really badly for this butthe chat Ian is genuinely crying umthanks for pointing that out toeverybody cat no problem we're here toentertainall right soI love you all we love you nopeuh this is um this is the last dab okayfrom the YouTube showI'm not crying you're cryingall right we made it that's not so badyeah that's fine yeah that's actually agift compared to Thank you yeah for surethis is my new milkh looking back on all of your time spentin open source what is one piece ofadvice or nugget of wisdom you give tosomeone aspiring to become a Kubernetescontributor or a SIG lead never killyourself for an open source project isit a chicken nugget of wisdomit does the open source project is notsigning your paycheck okay working withan open source project or in an opensource project may get you a job withsomebody who will pay you money but ifyou are doing this entirely in yourspare time or you're unemployed likethree of us don't kill yourself for anopen source project it it can't pay yourbills i'm going to put this on a muchlighter note and say don't beintimidated to contribute to open sourcei know that we all like look super coolup here like crying but umKubernetes isreally welcoming and really friendly andreally inclusive and we pride ourselveson thatum don't feel intimidated i know it canfeel intimidating but like come plug inwe're always looking for contributorscome hang out um we've got lots for youto do we've got lots of places for youto plug in and frankly we try to make itfun as you can see right here so um hopeto see you at a meeting sometime i hopeto see your pull request and I'mdefinitely not crying right now i I'vebeen asked a lot of times like how toget started with the like contributingand and and Kubernetes and things and II think um it's enough to just show upin the beginning like you don't have tobe making PRs right away you don't haveto be you know even asking questionsopening issues like it's enough to juststart coming to the meetings and bepresent and show up like that's a greatstep one um from there questions willarise and honestly like I think I wish Ihad known that a lot earlier it therewas a lot of time of like feelinginadequate like I wasn't doing enough umreally in the beginning it's it's trulyenough to just show up and if you don'tfeel like you see the space that youwant it to be or the things that youwant to happen come make them happencome make that space we're happy to makeit with you yeah we definitely noticelike or I do anyway i run all the SIGCLI meetings and I definitely noticewhen new people show up and I'm sothankful and appreciative excuse me um Ican't overstate how uncomfortable thisis my hands are numb i'm not sure howI can't feel the tip of my tongue andI'm pretty sure my lips are swelling allright we are we're at time la lastthoughts otherwise I'm going to wrapthis up i hope that all of our sufferingtoday was entertaining to all of youthank you all so much for coming outtoday we really appreciate it round ofapplause for everybody thank you so muchforcoming thank you for the event staff foruh putting up with our BS becausegetting this to happen was not easythank you no it wasn't we're now goingto go suffer and if anybody wants like aridiculously hot sauce we've had a bunchnowand probably can't bring it on the planeno I'm notall rightsorry everybodyuh2025-04-15 22:03:13.561292 w�w��5�Q#��!AdidMaFDxRAcwelcome everybody i hope you're here forthe right session uh building aubiquitous cloud native we're here totalk to you about building communitiesaround the world and i'm excited to umbe joined by if i can click through herewe go uh fellow folks uh organizers uhfrom around the world and i'll let themeach introduce themselves but beforegetting started i'll introduce myself asaudrey montenegro i'm a core cncfemployee um i work specifically on kcdswhich are kubernetes community days oneto two day summits um as well ascloudnative community groups um satyamwould you like to introduce yourselfyeah hey everyone uh i am satyam and iam an oss developer at dvturn and uh iam a co-organizer of cloud native uh newdelhi and cloud native new delhi is sofar the largest uh cloud nativecommunity in asia and i'd like to thankuh to the effort of my fellow organizersvolunteers and uh and the audience uhfor for building this community so largein the span of just one year so i i yeahthat's all about me and about mycommunity i'd like to pass on the mic tocarolhi everyone yeah my name is k valencianice to meet you everyone and i beenparticipating building uh organizingcloud native chapters in sao paulo limaperu and also organizing kcd in saopaulo rio de janeiro was the last andlima peru and uh also participate in thecube day colombia that was the firstevent in latin america from the cncf andthis gives me some experience across thecommunity in latin americahello everyone uh my name is pabo i'mrepresenting europe here um i amcoorganizer of the krakov chapter uhcncf krakov chapter and also um thefirst kcd in warso which will happenthis year also i have the experiencewith um other events like aws communityday or devops days but we don't careabout them here i think so yeah that'sme and unfortunately we can't have anitahere with us she had visa issuestravelinρ��P#��CAsyV-QGZDmWUhello everyone uh welcome to Hot Takesit's the panel with hot questions andeven hotter wings and even hotterpanelists i am your host Jeffrey Seika alot of folks know me as Jy i found out Iwas doing this about a week ago so we'regoing to have some fun here you wouldalmost say that we're going to wing itdude xander already said that i know butI have to play Taylor a little bit jokestealer yeah I know um that said as Ikind of explained we're going to beprogressively eating hotter sauces eachhot sauce is going to cause us somelevel of discomfort and then I'm goingto be asking them a questions theoriginal show usually has 10 hot sauceswe got 30 minutes to go through thosewere or 30 minutes to do this that's whyI'm going to kind of speed thingsthrough also we only have eight sauce��g from africa so i will bespeaking on her behalf and her communitythroughout thissession so for those of you who don'tknow cncf the cloudnative computingfoundation we were founded back in 2015under the linux foundation umbrella ijust like to quickly highlight theamount of projects that we have whichare 215 most of you will know thegraduated ones um but that is a lot ofwhat we cover in the community groupsand kcds is the cloudnative topics umsome stats on our chapters around theworld the amount of kcds that we areholding this year um and in which regioni've highlighted europe since we're inthe european region so just a fun statslidehere and then finally um we're nottaking questions at the end but we arehaving a quiz and we'll have these qrcodes on future slides in case you don'tclick through quick enough but if youyourself are an organizer um or anorganizer and an attendee um please dotake both quizzes or you could just takethe organizer one but if you're just anattendee of of meetups um we would loveto see your responses in the the pink qrcode there yeah we have only 175questions there so no worries you can doit in five minutes should be fairlysimple what is it 10 questions eachyeah all right let's get started so mostof you know the rewards and the benefitsof attending meetups and kcds um you'regoing there to learn uh share have mindshare um elevate your career um and asorganizers you're also elevating yourcareer and you get fulfilled by bringingthese open- source topics to your localcommunity and and building that networkwith one another but with rewards alsocome challenges right and so this iswhere our organizers with a lot ofexperience come in and i'm going to askthem each questions on specificchallenges they face in their regions soi'd like to start with uh satyam fromthe new delhi chapter yeah so thechallenge that we face uh majorly in ourcity uh in our region is that we have avery large number of audience and toaccommodate all those audience is a bigthing that we big challenge that we faceevery time and uh and talking about thisbigger audience so sometimes uh or mostof the times uh it happens that we getuh audience around 200 folks at a meetupand with with the with this number ofpeople the time management getsdifficult a lot because uh so we have todo the check-ins we have to do theattendance thing and we have toaccommodate them comfortably so theevent starts it's get it gets delayed uhand affecting all the uh uh all thevolunteers organizers and audience andsometimes even venue partners as welland the third thing that we uh oftenface challenges with the av setup likemultiple speakers uh do have multiple uhdevices and uh and the no matterwhatever we have done good for the avsetup there's always something that goeswrong and the fourth thing is with thelogistics and the food and majorly thefood thing so many a times like when thewhen more than expected attendees showedup we have booked prior food so in indiawhat happens is that we have to orderfood uh a day prior or two day priorbecause uh it takes uh uh a lot of timeto prepare the food like the indian foodbasically that uh the people like therethe most and uh we and when theattendees show up more than expectedthen we have to sort of order the foodagain uh yep but yeah there are somesort of uh challenges uh with the withthe meetups but there are also reverselike uh when when people come to us andsay is that hey i got a job and me i gota job through attending your meetups sothis these are the things that keep usmotivating uh that keep us continue todo the community work that we are doinguh there in india i'd like to it soundslike you have a lot of event planningskills now after handling all thoselogistics and the domino effect of thehappy problem of a lot of attendees sohopefully it's still rewarding afterdealing with all those things yeah iagree to that uh like it's alwaysrewarding to meet uh and uh network witha lot of people uh and meet new peopleevery month and uh uh help them learnabout the cloud native stuff and uh saylearn from their different differentpersp�ectives yeah it's always rewardingto have that but looking on your hand ithink that's no all people are happyright with your meetups uh no it's not ii just fell down uh so there was no onewe at at our meetups like we try all ofthe folks should go happy with a smilingface back homeall right thank you satyam karolina uhin sao paulo what challenges do you facethat might differ from new delhiuh yeah i think in uh our local chapterswe have a size between 30 and 100 peopleand sometimes it's tricky to have theregularity of the meetups because youhave to build uh organizers teams thathelp you each other to to maintain thisdon't always is a question that how canopen a new chapter how can so i alwayssay that you have to find people thatyou can trust and try to support eachother because you have to keep thisthese meetups uh between years monthsand trying to do it in everyone isworking everyone has his own worksometimes it's hard to to have the timeto to do the meetup and another pointthat i think uh is also uh build moreawareness inside of the companiesuh that the companiesuh know that how is importance the opensource inside because uh give moreopportunities for example even inside ofthe companies to have more uh opensource jobs sometimes is not awarenessthe how how is using the projects likeopen telemetry or kubernetes inside ofthe company and this is like a littletricky i think this is like our nextstep of maturity inside of uh of thecommunity uh if i think about in latinamerica we are building a lot like wehave a lot of communities that iscreating new chapters but also theawareness of inside of the companiessometimes they even they don't know whatis cncf or they are using an opentelemetry so they don't have thisknowledge then even have moreopportunity inside the companies to givemore dedicated jobs to be to have moremaintainers more maintainers inside ofthe open source projects no that thiswill help us to to try to build somecubecon latin america or cubecon inafrica that is themissing the misses cubecons on the otherpart of the world yeah yeah thank youcarolina uh she touched on theconsistency of holding meetups so cncfrequires you host a meetup at least fourtimes a year so once every 90 days um tomaintain your activity level so it's nota very high standard you could do itvirtually or in person but consistencyis key for your community to count onyou and to expect what's next um andthen yeah to eventually build thosenumbers find the end users and thenhopefully bring a cubecon there in thefuturepavl what challenges do you face in kovright so first of all 200 people areexactly i'm sad it's like come on forfor for my country for my city it's likemore the conference than meetup right soi'm really jealous um anyway before itell you which what we struggle with ihave a question to you um how many ofyou areengineers how many of you working withscaling how many of you have problemswithscaling so surprise the same will happenwith community and it will problem withthe horizontal scaling and verticalscaling uh what i mean by that you haveto remember that probably in your citythere are more meetups than only yoursso you have to remember that probabilitythat they will do the meetup in the sameday and the same hour like you is like95% because the morph is low right soit's good to have the communication withother groups even if it's not like acncf chapter whatever other meetupsright technical ones to find out commonschedule to not you know steal attendeesfrom each others so i think this is veryimportant we hit this couple of times inkov and um yeah it happens it happensand you know we don't have like a 50people but we have 17 right so this isuh problem of scale we have but youconsistently have 17 right like do youhave the same faces show up um sogenerally we can say that for us it's uhthe number of people who are registereduh islike 60% at least more than uh thepeople who show up so 30% if you have30% of the registrants it's it's a quitegood number yeah so for us it's aroundwell i would like to have 50 but yeahmaybe one day you'll get there �but thecommunication is key yeah you justreally don't want to compete with yourfellow technologists and really just geton the same page with okay you host thisweek we'll host two weeks from then andthen that way you can collaborate andshare um since anita's not here i didsay i'll touch on things that are um orstruggles in some of our africanchapters and that's mainly hostingin-person meetups so a lot of ourchapters that we have in africa um dovirtual communication do virtual meetupsand um yeah it's because of the maturitythat um that carolina brought up it'sjust bringing awareness to open sourcein the first place um so yeah and andthen in north america uh the challengesvary just depending on what city you'rein you know from finding speakers to avenue and the cost of it depending on ifit's a large city or not um but theattendees seem to be there in northamerica as wellso moving on to cncf's values and how weinstill these in our communities and howwe make sure that they're exemplary andand um we talk a lot about uhaccessibility and being inclusive and umthe content that we host is vendorneutral um so i'd really like to um askpavll first we'll just go backwards umhow do you deal with this or um how doyou um be exemplary in this way in yourchapter yeah so for us it's important tobring unexperienced speakers to thestage so not onlyunexperienced by the uh way that theyare not familiar with the stage but alsounexperienced in the well experience sojuniors and people who are startingtheir career because um you know i i'min this field for 27 years right so theprobability that i forgot many basics isquite high i know myself is very highand um the problem is that we forgotvery often about basics which we shouldhave that this is the foundation foreverything what we do and the youngpeople who are starting their careerlike 20 years after me they havedifferent perspective they can stillteach me something and if they can teachme they can teach your audience as wellso bringing young young people to thestage unexperienced people to the stagefirst you give them some kind of rewardof being there it is very important forthem and also you allow others to hearthe stories which might seem simple butbelieve me from every single talk youcan find in every single talk you canfind something for you so don't beafraid to you know bring unexperiencedpeople to the stage yeah we really counton these chapters to elevate voices andthen kind of be mentors in a sense tooand then maybe they can go to a kcd nextand maybe they'll have a cubecon intheir future in terms of speaking sothank you for touching on that whatabout you karolina what uh in the sapaulo chapter are you trying to focus onwhen it comes to these valuesyeah i think in general in latin americawe have a it's tricky to find women inthe stage like in the meetups issometimes only me and the people that ishosting the avenue like like woman thatis working in the avenue it's reallythat we need to push more mentoringprograms to elevate new voices in all uhnot only so paulo it's it's a a problemof the region i think is in general wedon't have so many minority groups underrepresented groups and also talkingabout the underrepresented groups islike accessibility is also anotherimportant point that uh this is likesomething that i try to i will put as achecklist to do in the cloud nativechapters that try to use the captionsbecause already we have in the h deafand heart of hearing group of the cncfthat they already they are buildingresources in we can use these resourcesto put it in our chapters to be moreinclusive with this with other otherunderrepresented groups that is not onlyabout women yeah it's a good point andwe actually have a resource later on inour slides that point you to thisopensource captioning option that youcan try out for free at your meetupssatyam what about on the content side ofthings is what are you experiencing innew delhi and what are your effortstowards that what we focus uh in termsof like cloud native values uh is thevendor neutrality more often so we focuson the content uh like u�h informationalcontent more rather than promotionalcontent and uh uh as as uh we have themebased meetups so every meetup of ours islike from beginners to advanced uh sosuppose we are hosting a meetup on uh uhmeetup uh so we host a we try to host ameetup on a certain technology like uhit could be kubernetes it could besomething on monitoring it could be soit goes from basics to advanced and uhas pav mentioned already that we shouldhavesome unexperienced speakers in the queueat the stage as well so i believe uh sothrough that way we try to provide theopportunity to un unexperienced speakersas well and we have a a line of uh fromunexperienced speakers to uh experiencedspeakers starting from the very basicsto the advanced so yep uh we majorly tryto focus on the content more rather thanthe promotional things and we try wevalue the vendor neutrality aspect ofcloud native values more often everytime thank you and and this brings thestory to my mind because uh this is thefirst this is uh the principle of cncfthis vendor neutrality this is the firstthing uh the story from different placefrom devops days really we uh you haveto remember that even if you uh checkthe speaker check the title sounds goodcheck the description of the talk soundsvery good then someone is going to thestage and for the one hour he is doingor she is doing something like that wewe are the best we by us by us for onehour right it happens how to deal withbut well uh you can't do much at thattime because you will not kick off theguy from the stage right well maybe youshould maybe this will be the fun thiscan be let me know how that goes but yesit can happen it can happen you have toremember about that yeah and um also toanita's point since she can't be hereaccessibility in terms of um for folkswho can't make it to your meetup so ifyou have the resources to film um orrecord the sessions and then put it outon youtube afterwards that's awesomeyeah uh in our community like we haveimplemented that uh we record all thesessions so that people who is not ableto attend uh the meetups the uh thetalks are always on the youtube uhpeople can go and check it out althoughit takes a lot of effort but uh it'sworth it at the end like the when thespeakers get the opportunity and theythey come to us say i got more and moreopportunities by the videos that ishowed uh to the other conferences otherconference organizers they are giving memore and more opportunities and theyalso uh like get time to improve whenthe video is recorded they get time togo back and see like what was themistakes they were making especiespecially for the unex unexperiencedspeakers they can go and they can checkit out and they can improve for later onso it's it's it's always good to haverecordings but it takes a lot of effortsyeah and on the contrast i think if forsmaller groups you might feeldifferently yeah we have different uhexperience here because in our case veryoften when the potential attendees seethat there will be recording later theywill not come because why they can justwatch it later what never happenedreally because if you record somethingyou will never you know view this rightso at least for me that works so maybewe should have some pay wall maybe forfor for this for for youtube well no i'mkidding so we do not record uh but uh itdepends really on the on the personaldecision in the community because somecommunities records some not yeah andthat's entirely up to you as organizerswe are different and that's why we arehere exactly representing differentregions for a reason yesall right let's talk about growth andhow we do it sustainablybecause prior to me joining cncf in 2022um starting chapters was a lot easier wedidn't really do as much due diligencein terms of checking the backgrounds ofthe organizers and how they're going tobe able to contribute um we didn't havethis 90day standard and activity levelsometimes we'd have chapters that hadn'thosted something in two years and sopeople thought "oh there's a chapterthere but then they're just waitingwaiting waiting and can't pass the torchas an org�anizer if somebody else wantsto come in and take over that chapter."so um we've come a long way but there'sstill plenty to work on and i'd like toget their feedback on um what could bebetter and i think uh we'll start withkarolina on this one in terms ofchapters because we do see mostly citychapters but sometimes we see statechapters sometimes we see countrychapters um what does that mean and whatshould that look like so i'll pass it toyouyeah governance is a little trickybecause if you think about maybe whenthe cncf started there's like 10 meetupsor 10 chapters right now is like morethan 1,000 and everyone wants to buildhis own chapter in try to build togethergood practice how how how be a goodorganizers try to maybe improve withsurveys about if the audience are happywith that what can we do better issomething that we could collaboratebetween all the organizers across theworld uh we are sharing the repositorythat is there that we can open uhdiscussions and try to improve this thisuh best practice and discuss more howcan be avoided and how will how is a umgood thing to do and this is becausealso in a local chapter it's alreadytricky sometimes to do a communicationsimagine when is a chapter that representa region like a country andhow can improve the communications andmaybe uh not forget other o otherregions that is lot this is like thediscussions that i would like to to seemore uh between everyone in the in thejitab and the issues to try to improvethis this this uh governance across thethe chaptersthank you karolina and satyam what abouton the organi organizer side your teamhas a good process when it comes tobecoming an organizer yeah so on thegovernance side uh like we believe moreinmeritocracy so it's all about like uh webelieve we are more open and we are morecollaborative so what uh we have whatwhat our belief the belief of ourorganiz my organizers is that thatuh leadership should uh be earned itshould not be passed on so we have thisvery good standard of being an organizerin our community so whenever audiencecomes to someone from audience comes tous and ask hey i want to be theorganizer i want to contribute to thisuh community so we say we say that okayyou can start with the volunteering youcan pick any of the role it could be uhanything like uh it could be graphicdesigning it could be uh like contentwriting it could be anything or it couldbe social media and uh we alsothroughout that five to six months wetry them to teach more about cloudnative thing as well the teach about uhthe ecosystem as well sobecause there is always possibility thatspeakers don't show up at the lastmoment and uh uh we believe that ourorganizers should be have that amount ofknowledge that when the speakers don'tshow up uh uh the person our one of ourorganizers should go and speak on thebehalf of that speaker because when theaudience uh is stepping out of home uhfrom for the conference they areexpecting something from us they'reexpecting us to uh stay to our promisesso this is what like we value we valuewhen uh in terms of uh governance soyeah i i think that's pretty much aboutit like uh how how we are we have setthe organiz to set the rules to be anorganizer in our community uh it's it'sabout uh meritocracy and uh and how fastyou are learning and how fast you aretaking on the things yeah thank you forsharing that uh pavl what about you inthe crockoff chapter um i mean notspecifically but in europe i guess umare there any values in the governanceside of things that we should bethinking about but before i we go tothat uh to what you said this is a verygood point that you are ready to belet's call it backup speaker plan b yeahuh it will not work only in one caseactually it happened to me i was thespeaker on the meetup i organized and iwassick so yeah it happened but uh what ithink is really that um we do thecommunity stuff we organize those thingswe are in involved in this not to makemoney from it let's make this clear wedo it for the community we do it for funwe do it for satisfaction yes of coursethere are benefits from it and i am thebest� example when i lost my job lastyear in one hour later i had another oneyou had a good network mhm because ofthe network because of the you know thetransparency i hope i i used to use inmy in my work and life and uh theconnections i had right so this is thebenefit which you have and uh believe mepeople will not believe in yourcommunity in you if they will see thatyou do it formoney maybe some say well it it is naiveapproach but then those people you canshow the one of the fingers you knowwhich so um you do it for the communityyou do it for fun you do it for sharingthe knowledge buildingthe let's say the understanding of bestpractices building the network andthat's the goal rightso yeah that's that's my opinion aboutthis really thank you yeah and againplease contribute um i am the maincontributor to this repository as a cncfcore employee so it would be great tohave somehelp yes we have a question or statementmy name is indayeah come up since we stole themicrophone from the middle of the roomhelloum my name is inda bongo aawa yesterdayi met the amazing caroline that's how iknew about this event nice and theyactually forced into place thank you forall your contributions and the firstslide was kind of disappointing to mebecause i saw africa one and luckily wehave like 12 chapters i'm alsocommunity-led and what the last sentencestatement pavl said really like was theicing on the cake that we don't do itfor the money but again what's money youdo it for all what you said which ismore valuable than money losing your joband getting another one in one hourpeople money can't really understandthat and um i was just wondering and irelate to all the challenges that youall have mentioned there i tried thevisual ual part am i really audibleenough yes all right i think so i trieda visual part to link with my fellowcommunity back in africa cameroon i'mfrom cameroonoriginally and um there are a lot ofreasons they fall out of network theyhave to go to work they have otherchallenges which i totally understand weare really like we are privileged outhere soagain is there like um a comparativeadvantage or like um um um a customizedway of um encouraging the continent ofafrica to facilitate their communitybuilding so it should be sustainablebecause having the same standards asother continents it's um not really likeum efficient or equal so is is thatsomething that is behind the scenes oris it um something that we should weshould address because i'm really reallyso interested in seeing that my fellowbrothers and sisters out thereunderstand first of all to understandsomeone of you mentioned that so no it'scaroline she said some don't even knowwhat cncf is i have um um a backgroundin economics i came into it two yearsago and i'm really like so passionateabout everything aws google all what thecloud um ecosystem has to offer so isthere like um yeah is there somethingtailor made for yeah thank you for thatand i love to hear your experience umwhat actually the governance is forgrowth that aren't standards that areset yet so it's great to have feedbackif somebody creates a poll request onthe organizer specific uh qualities thatwe were talking about then please chimein and comment because you know we won'tmerge it if it's not relevant to everycommunity so this point is futuristicand ideas at this point um it's nothappening yet so yeah everybody'sopinions are valued and we we willrequire that every region chimes inbefore we pass something yeah and youcan try to approach any of us i believeand you know we we can share whatever wewe think we can help you right so soyeah yeah we have a pretty largeaudience so if you if you need a helpwith uh with hosting uh events virtuallywe are happy to collaborate with you weare always there just ping us anytimeand we we'll be there all the time okaysameyeah uh i think this panel is for thatfor create more awareness and to createmore discussions because so many topicsis that we even we don't know how tomanage big re regions how to grow up howto be equal toeveryone yeah the people that is here isi think interesting to �grow and and andgrow with this so like in the github isan is the cncf uh uh group is cloudnative what is the name yeah so it'scommunity groups uh cloud nativecommunity groups so there's the githubrepository in the middle there um andother resources too so you if you'reinterested in starting a chapter or evensimply just seeing what chapters existthat first link um has a map where youcould type in your city or state orcountry and see what pops up um and thenof course the live captioning and otherresources we mentioned uh in throughoutthe the panel and i think at the endalso everything is about people youalready is meeting people even anitathat is not here but is mentioned and weare missing so much that she and i thinkthat is showing the problems that wehave in other regions that we do haveproblems to get a visa that theimportance that maybe we have an eventsin like in latin america in africa andreach more people but yeah you can countwith us and other people that is cloudnative chapters of africa that you alsocan meet them and try to reach andcreate and expand your region yeah so weare running low on time i just want toquickly go to the mentter results so wecan say what we found um let's see firstuh how many answers we have in both ofthem here is some zero or i just need torestart it let me check two and for thisone wehave five so let's go with the morepopularone okay so no responses here oh yesright so first question was uh did youever participate at cloud native ummeetup so unfortunately three of you somore than half didn't whyexplain yourself no i think this shouldbe yes like everyone here is at cubeconand it is a cloud native meetup uh yeahand we're bringing or maybe those peopledoesn't know that it is the cncf maybeyeah quite possibly but um let's go tothe next one what uh cloud native topicsare you interested in i have to make ita little bit bigger mostly audibilityand security observability i think andsecurity not interested in all the llmstuff who said that because i love youthis is anonymous for a reason umkubernetes ab ebpf uh tooling to relievefrom the burden of infrastructureincrease productivity and finallyobservability cloud cost networkingtemplating and packaging and this isinteresting because this is what iobserving very often uh there is notthat many people who are interested inlet's call it soft topic like ours todayright then if you see it on theconference it's wow it was interestingreally so that that is something what mewe can change i a little bit in how wedo the meetups all right favorite foodum tacos good snacksno one wants to eat baremetal sad all right uh how do you preferi am i am not surprised like i i am notseeing the pizza thereyeah we're not seeing pizza google likespizza pizza everywherefor meetups how do you prefer to attendmeetups and in person uh there is uhmajority of you so two and hybrid is oneso uh i think it's changed a lot duringthe pandemic time and you know ipersonally do not like to attend onlinemeetups because i never listen because ihave two screens multi right on onescreen i have a meetup on second screeni have something what become moreinteresting for some reason so i preferuh in person right um in one word why doyou attend meetups networking we havetwo clarity community knowledge learningand straightforwardness and i like thoseanswers really um what format uh do youprefer classic talks hands on um kind ofworkshops and lightningsso we have for oh for hackathon we haveone and study squad we have also some soquite no okay okay hakaton is like athere is no answers at almost butgenerally you like all types ofum activities good it gives us moreoptions really yeah there's a lot offolks that are experimenting mentingwith lightning talks instead of justtypical speaker to audience um or studygroups we see a lot of study groups forthe certifications that we offer forexample we run in track of um the kindof workshop for public speaking so i wasteaching others how to be on stage andhow to you know present things and so onso so on all right so what are examplesof accessibility and this is the topicyou caroline you you caroline mentioneduh already so we have couple of answersum physical accessibility is importantand quiet spaces for meetups that'sthat's interesting i didn't expect thatreally uh in that numbers because thisis the highest number right withphysical accessibility something tothink of forme and do we have something or no it'shere okay what is the typical audienceand we have 11 to 50 uh we have twoanswers and satam did you answer alreadythis in this chat okay because it's morethan 100 so yeah the big meetups arereally uh hard to maintain i think so ii i love to speak with you some latertime about how you deal with thatreally as a attendee do you payattention to the diversity of speakersat cloud native meetups and uh yes someuh we have a good answers really notmany of them but good surprise so ihappy to see that okay networking duringthe meetup what is most exciting thedifferent experience that come from itand how they shape my own knowledge ithinkthe biggest power of meetupsis not the talks themselves but thenetworking you can have the meeting thepeople and doing those like uh you knowuh things together really right gettingout of the bubble and saying what othersare doing that's interesting because youknow especially like for me and i hope ithink for many of you we are living inthe bubble and when you go outside thebubble you see that uh you know thisdevops world is not that beautiful likeyouright so yeah that's that's importantthing and also the diversity ofchallenges each of the people faces andthe possibility to network and move inyour career and that's why i thinkthat's good idea will be one day to dothe meetup like really intercontinentalmeetup because all of us we havedifferent ways of solvingproblems and learning this is somethingmind-blowing really car this this meetupwill be the cubecoit's a little bit too big cube kong inlatin america cube kong in a okay couponin indiawell that's already you already have itall right all right so what topics arethe most interesting for you umobservability is most popular not ai hsomething changed in the world uh ebpfready to use tools rust and securityrust developers are here developersall right and where are you from we havethree answers all three from uheurope um fortunately unfortunately iwill not comment that but yeah we havethree um and uh yeah this uhthis qu both of those quizzes will beavailable we will post to them on uhslack i think on the slides are actuallyalso on our session page as well so youcan access them thereso those quizzes will be still availablefor you to to go through and you can seethe results as well and also we willpost the results as well in this channelso uh join the channel if you are notthere join the slack of the cncf so youcan do it for free so it's good price ithink um if you go back to the slides wecan also have you rate our session toowhich would be awesome yeah do we havesomething more um yes we have uh theexample of the um accessibility i thinkthis was already there so this issomehow uh like i copied all right so umlet me go back to thisall right so um if you have questionsplease reach out to us at one of theseemail aliases uh if you're interested instarting a chapter or merging into a kcdif you already have a chapter um and ofcourse please leave us feedback this isdirectly for our session the qr code uhwe appreciate you being here iappreciate my fellow panelists satyamthank you for this idea of the panel andgathering us together um we really doappreciate it thank you thank you somuch audra and uh i would really like tothank the cncf and uh its teamespecially you like when whenever we uhwe are having any sort of problem wejust run run to you and you are verypatient with uh with the with us andanswering all our questions verypatiently thank you so much for doingthat and all of the things that we aredoing is was not is not possible withoutyou oh thank you yes i agree and iappreciate you all jumping in to helpeach other since it is just one of meso all right well thank you for joiningus2025-04-15 22:03:14.111090�going to give you three differentkind of examples looking at what youmight be building from the business sidedown so we're going to be talking aboutdevelopment efficiency risk reductionand then you know Financial kind ofrestructuring infrastructuremodernization and so we want to Firstgive you a disclaimer in each one ofthese examples all of you will be ableto find a hole in it what we're tryingto do is paint a picture of how youshould be considering about yourbusiness justification how you shouldconsider Roi you might be like thatnumber is way too low that timeline'sway too low we know that we accept thatas presenters here but we're just takingyou along for thejourney so the first example we're goingto go through is around your churn rateso for many of you you may be like I'mnot really sure turn rate turn rate isjust as a definition you know you'vespent all this money you've acquiredcustomers they're using your apps butyou're losing customers that is measuredinter rates again we're looking at thisfrom the business level down um andthat's your you that is what's beingreported to your board they care aboutthat you know anytime you see um anystock reports or whatnot you knowNetflix is always saying we acquire thismany customers we've lost this manycustomers that's losing is your turnrate so we have a problem at anorganization are we have Rising turnrates and that's because we're notdelivering the features that arecompetitors are delivering um so salesis impacted customer success is impactedyour uh C Lev execs they're going intothe board and they're saying our turnrate's getting really bad so that is amajor problem at thisorganization so you have a goal you havea 10% turn rate today that's not goodmost people don't want to do that theywant to be less than 5% um you know sowe have a business goal to reduce ourturn rate from 10% to5% so in order to do that we have to getfeatures in Market our competitors arebeating us um I'm sure you know all ofyou will at some point of time andreally frustrated when a competitorreleases something and you're like ohwe're working on it and we just didn'tget it out in time that's what we'retrying to do so your job on thetechnical side is you're going to cutthe release times from two days to onedays now again we're trying to use kindof some simple numbers to get thinkingthrough you might be want to kind ofchange that to hours versus days but forthe sake of this it's cutting it fromtwo days to oneday so you have proposed I'm going tobuild this lightweight platform I'mgoing to use a bunch of Open Sourcetools from the cncf going to pull it alltogether and we're going to be able toaccelerate our release time now I saylightweight plat platform because you'regoing to see these numbers and belike um and you're you're like this it'sa lightweight platform I'm going to beable to do this in sixweeks in that six weeks I'm also goingto enable teams I'm going to have set upsome support for the organization so nowwe're going to get into how I justifythis to my BusinessLeaders the first is I have to show orconsider what revenue is at risk so if Ihave you know 2,000 customers I'mspending 2,000 or get 2,000s from themyou know 4 million ARR ARR is yourannual recurring Revenue again these arethe things that are being reported tothe board um and some of you might belike yeah of course I know that butother people might not have heard thatterm so how much revenue I'm gettingannually from my customer base right nowI have a 10% turn rate I'm losing400,000 pounds to customer or to youknow competitors each yearso I want to reduce that to 5% whichmeans then I'm only losing 100,000 solike life is muchbetter so I'm going to put an investmentinto this platform so again there'sloads of things to consider but we'retrying to get you thinking in terms ofthis I have two Engineers theyunderstand all of this ecosystem and soI'm going to take two Engineers I'mgoing to have you know I'm paying themabout six 600 pound a day I think Ithink it's going to take about 30 dayseach one of them working on this um soI'm going to spend 36,000� lb to dothis I then need to train a bunch ofpeople so I'm just going to guesstimatethat at you know 10,000s there's thisperson that person I could to take themoff I'm going to spend about46,000 to make this happen now just as asmall caveat and if you've beenattending any of the platformengineering day you definitely want totake your developers along for the rideyou don't want to be like hey I've builta platform by the way now you have to gouse it and train them so that's ourthat's our little caveat there and ofcourse you're going to have ongoingcosts and testing and maintenance butthis is the like I've set it upcost so I've now done this I've investedspent six weeks I've trained peoplethey're using it suddenly I am likeeverything'saccelerated I have lowered my turn rateyay I loaded it to 2% of turn rate sonow I'm actually only losing 40K of ARRI've reduced my turn by80% and the business is happy becauseyou know I'm able to say to my boss Ibuilt this and here's the sort of moneyyou're seeing in return or here's whatwe're notlosing that loss helps you withretention rates growth and now I've setup my platform so that I can expand andgrow in the future so it's an example ofhow you can be thinking about you knowyourtechnology that Maps directly to abusiness example now Simon over toyou so I'm going to talk about risk anddisaster and awful things and I suspectsome of you are probably familiar withthis little case uh crowd strikereleased an update last year and uhDelta was one of many companies thatwere affected by this happens to be anairline in the US and now forast Ed $380million just a few pennies uh in lossesand then we have another disaster thisone struck here in the United Kingdomthis was a a couple of years ago we havea bank it's called TSB they were downfor a couple of days nobody could paytheir mortgage get paid or they wereleft stranded at the checkout you canimagine how that felt foreverybody so when businesses are out ofmarkets it's a real problem they cansuffer from reputational damage they'vegot obviously got lost revenue and theycan end up being fined as we just saw inthat last example so for a businesssometimes ensuring service reliabilitycan really help us protect Revenueretain customers and of course over thelong term uphold our brand trust that isthe aim of our business we need to bethere when our customers need us andthen we've got a business requirementthat might be a little more definedwe need99.95% uptime you know I think I can dobetter but anyway that's what we'reafter and we need to avoid X th000 ormillion dollars in Lost Revenue so thejob that we face as Architects and anEngineers is to build an effectiveinfrastructure that can actually provideservice to ourcustomersso we might say okay well I'm going totake advantage of cloud serviceproviders I'm going to be multi-regionI'll beha and I might go and have uhasynchronous data replication betweenregions I think many of us will befamiliar with those diagrams we've gotsome blocks in the UK some in GermanyFinland maybe Madrid that type ofsituation and the timeline for thisbecause hey we can do it quick and Cloudnative is I'm going to do it in sixweeks and I'm going to enable everybodyand do ongoingsupport point is is that we can start todo some calculations around this theseare aggressive they're numbers but we'llwe'll we'll run with them now dependingon the size of a company we'll havedifferent um calculations around whatthat infrastructure might cost for anapplication for a small startup for anapplication it might be €7 an hour for amid-market uh institution it might besay €30 an hour or for an Enterprise ifwe're looking at €100,000per month okay in uh costs that comesout at around€40 per hour now those are prettydramatic numbers actually 100,000€100,000 to host my application for amonthwell if I'm going to be facing when welook at costs and lost Revenue 75million euro in Lost income and revenuejust for an hour of downtime suddenlythose numbers that we need to budget forthat we need to plan for for our returnon investment okay start to look� a lotbetter so if we go and investmoney as well as uh dealing with the thecost of lost Revenue we also deal withwe don't have brand damage and we don'thave regulatoryfines and so I just want to give you areminder of consequences of this so whenwe talk about Delta for example itwasn't just Delta that actually lostmoney from this it was crowd strike andcrowd strike also suffered as well asDelta from terrible reputation as aresult of that and it's going to taketime to rebuild and then in the case ofTSB not only did they loseRevenue I don't Bank there and I think alot of people didn't uh they also werefined and those fines another almost 50million reallycontributes I'm just going to move onnow to another example as well which ismore of an infrastructure modernizationexample so we might want to go and freeup capital and lower Financial Risk sohow how do we do that well we might wantto shift from capex to Opex now theseare accounting terms and uh as anengineer the thing that's most importantto me is to understand with capex that'swhen I go and purchase assets for mybusiness they turn into what's known asa fixed asset on a on an accountant'sLedger okay and then they aredepreciated over time that means a pileof servers a whole some switches and itmight mean other networking equipmentand all of the infrastructure requiredto build on premise now we've literallyturned cash into tin when we do that anda real compelling argument for a lot ofcompanies is to move more towards opx soas a monthly expense money goes out thedoor to pay for the services to supportmy business so it's really importantfrom a cash flow and accountingperspective uh and that is often ahidden driver for a lot of cloud nativemigration andmodernization so our businessrequirement is to reduce infrastructurecosts and accelerate applicationdelivery oh and what I forgot to mentionis uh if I depreciate those servers over3 years or 5 years well we might befamiliar with Mo's law I've just endedup at the end of five years with aserver that just doesn't live up to myrequirements so our job is to go andmodernize that Legacy infrastructureokay effectively get rid of it and we'regoing to adopt Cloud native we're goingto go container based obviously becausecontainers rock and a lot of other stuffand we're going to do it over 2 to 3years so we can use a very basic returnon investment calculation for this weyou know we can take something like thefinal value of our investment okay umafter what we would have spent and putit over the cost of the investment so ifI if the cost of my infrastructure of myplatform is $3 million to start with andthen I can get it down to 25 million butit costs me a million dollar to go andmake that transformation then I can umeasily make you know after a couple ofyears I can break even and after 3 yearsI've got a 50% return on investment soif we think through and it might be theoperating costs for our plat form andwhat we're spending because depreciationis an expense that we pay annually okaywe can quantify all of this okay then weget a strong motiv motivation and thisreally helps sometimes when we're havingthose questions with this thosediscussions with business owners likewell um we why am I getting these directcosts every month coming through thishelps to answer thatquestion so the outcome that we're afteris to that we can modernize ourinfrastructure to support obviouslyfuture growth you know Moors law andscalability because I can easily simplyrequest additional resource from mycloud service provider when I need it Idon't have to Rack as well pleased aboutthat I think you're NE yeah so someother considerations as well so I'vetalked about capex and Opex but anotherdriver as well that we can often face isaround upcoming deadlines so we mighthave a business that says I want to getrid of that data center you know we wantto become more lean we want to becomeSlimmer more agile so when we uh workwhen we want to go and modernize and andwork out our business case for cloud wecan often take advantage of those typesof deadlines do we really want to go andspend 10 million dollar pounds or Euroon a data center lease that we don'tneed and then there are application milStones as well if we think about it alot of our vendors that are supplyingapplications we might be using arethemselves going through their Cloudnative Journey so instead of providingus with you know whatever artifacts theymight we might have container images anda series of Helm charts for deploymentfor example so we need to we also needto take into account the distributionand software life cycle of our Upstreamvendors as well and then other driverswe can take take advantage of as well uhthings like regulatory changes like theyou know the European um EU digitaloperation resilience act Dora also knownas Dora and not to be confused with Dorametrics that you might be aware of aswell so these external constraints andregulations can often Drive ourmigrations as well and then finally ofcourse I've talked about disasters weall want to reduce risk we simply wantto sleep at night and we don't want tohave to go and make those incidentreports awesome so we've got given yousome examples we've hopefully got youthinking a little bit about this but youleave here and you think well where do Istart I learned about all this new uhtechnology I want to try this I want todo that so our first recommendation stepis to understand your customer so yourcustomer might be external they might beinternal they might be you know whateverplatform you're building forwhat who are they and what are theirneeds and what are their goals are theytrying to save money are they trying todeliver items faster what are the key uhmetrics that they're living by is itchurn rates is it customer acquisitioncosts is it you know a profit and lossstatement that is very clear um is itmodernization is it speed you need toreally understand that and that's whenyou start with the how do I do it whattechnology do I need how do I make ithappen and obviously some of you mightalready be on your Cloud native uhJourney um some might be new to it butit's understanding I want to create abusiness justification for thistechnology so how am I going to BU doingthat of course you're a coupon so that'sgreat also there's probably some yamelfor that um and you know one in doubthave this happening um so keytakeaways so you need to understand asDaniel's outlined you need to understandyour customers I'm a technologist I'm anarchitect so for me the walls of mytechnical organization are reallycomfortable for me but it benefits me somuch when I reach outside my technicalorganization and out into my businessand then when I really understand whatthe customers of my company myorganization are after that contributeseven more so make sure you reach out aswell as that know your applications andunderstand your technical landscape wehave many different applications andlarge complex institutions so becomefamiliar with them and understand thetechnicallandscape and know that it's you knowit's not just a technology decision it'sa business decision so ask why why arewe doingthis ask it a lot keep asking it um andand start small so you know most umpeople start with you know a few umprojects smaller POC start small andthen grow from there and that's where wewould say you know the cloud nativematurity model really talks about howyou can do that so definitely have alook at that yep also take advantage ofthe platform engineering maturity modelalso a tremendous artifact as well startsmall itate y yep and we are going to beworking on building some specific Roimodelso that you know hopefully at the nextcucon we'll go a level deeper on this umso if you're interested in gettinginvolved we do have a working group umthat you can join it's the Cog graphusworking group Simon named it it rollsoff the tongue easily it's his faultit's Creek like kues yeah um and yeahand then of course tell us what youthink about the presentation we' loveyour feedback we want to make it betterwe want to make it as practical aspossible so yeah yep we want Cloudnative and business help us yeah thankYouk you[Music]2025-04-15 22:03:14.827968 ww��p�R#��AOLrN7D84o4ghey everybody um welcome to this sessiona practical guide to Cloud nativeSolutions demonstrating Roi and businessimpact uh my name is uh Simon Foster I'ma technical architect I Do Contractingwork here in the city of London usuallyin financial services and I'm also acncf Ambassador and I'm Danielle cookI'm also a cncf Ambassador um and bothSimon and I run a working group withinthe cncf that focuses on it's called thecartographist working group um and we'vecreated the cloud native maturity modelwe've written about kind of the businessoutcomes as well as the technicaloutcomes and um the people and cultureand all that that it takes to achieveCloud nativematurity so kicking off we want to startwith the cloud native investment dilemmaum actually I'm going to take a stepback um quick show of hands how many ofyouthink of yourself as working on thebusiness side of anorganization okay not a lot and how manyview yourself as a technical person intheorganization okay um so we wanteverybody to view themselves as being onthe business side because you that thatis what we're going to talk a lot aboutum the other question is how many of youown or are responsible for Budget are abudget holder you have resourceources okay okay cool all right so youare here you're at cubec con you'retalking about Cloud native you'redeciding whether you want to invest inthis open source product uh spend timeon this that whatnot there's a lot ofchallenges you're going to face andthat's first of all the financialuncertainty so are you going to savemoney are you going to be increasingcost and you're going to have to justifythat and we in the cloud native maturitymodel talk a lot about how you get to apoint of the messy Middle where you arespending more money and not actuallyreducing costs and that's messy you'realso going to be going through thistransition shift of budget so you'regoing to be moving from capex to Opexwhat does that look like how do youexplain that to an organization thatmight have you know been buying theirData Center and their servers and havingthat physical infrastructure there'salso the timeline concern so Roi isn'timmediate when you go and make thistransition um and you need todemonstrate that that's what we're goingto be talking about um there's also theover engineering side are you buildingsomething that is going to be way morecomplex than what you actually need canyou just quickly buy eks AKs right andand deploy or you know are you buildingsomething that is really going todifferentiate your business or allow youto so there's a bunch of challenges youhave to gothrough so what we're going to do wereally wanted this to be practical so wewe are �� work on the uhverification of the functionalities andeverything of harbard so if you're oneof these persons please reach out to usit will be super good to have morepeople on board with that looking forcollaborations is the next thing and bycollaborations I mean on not on thetechnical level but more on the on thesocial SL organizational level so ifyour your organization is adopter ofHarvard so you have harbard in yourenvironment please reach out to me oropen issue so we can we'll be superhappy to add more people and moreorganizations and adopters users and theother thing that I want to do with thisone I want to reach out to yourorganization so we can work out tocreate more use cases so we can share uhwith the broader cncf community sopeople can um see how how you using thatthat all that can be anonymized ofcourse if you don't feel like sharingyour logo or your deepest deepest secretof for your infrastructure but that'sgreat for um to to spread the word thatHarare is there and it's healthy andit's working and you're using it uh thatwill produce few kind of stuff likeblogs and short videos so please contactus in in short uh I really want to workwith you and to create all that so wecan spread the word of ourHarbor so how 24 and forward uh if youscan that QR code that's the candidatesuh for 24 uh 2113 I'm sorry so if you'reinterested the whole list of things thatare going to go in Harbor 213 or somethat will be left over for 214 you cango to that QR code um we've createdtoday that discussion unfortunately it'snot link so um so you can check it outin the discussion section in GitHubwe're starting collecting the D for 214so if you have some features that arenot currently present and not availablein Harbor um please go and add it and bythe way is there anyone in the room whois missing something from Harbor becauseone right two three all right I want tooh more fantastic I want to talk withyou because we were discussing with withfim that more and more Harbor incontainer registry is becoming more ofcommodity so it's kind of slowing down alittle bit development over timeand um we're looking for fresh ideaswhat else is missing so we can work onthat you can work on that uh so we canadd it and then to make more more futurefuture uh um proof and Future ReadyHarbor so uh in that release we areaiming towards around 50 fixes and newfeatures the G is around April 11 I'mseeing around because that depends so onmany factors but that's the G that datthat we uh hope it is going to be sojust less than aweek so uh what we have in two uh 23 asI said um fixes new features uh you cansee the whole list there um the fixesare around U areas of for example userinterface and um some some performancestuff yeah orc yeah yeah um few of thenew stuff that we added we extended theAIT logs so we're capturing our moreevents and you have the capability ofdisabling or enabling different eventtypes um and that will allow to to openup Harbor for more uh to capture moredifferent kind of events so if you'reinterested in this one uh you can checkwhat's added andum um yeah and discover that reach outif you're missing events right so wehave like I think 14 15 events now thatwe're tracking and if you need moreevents yeah yeah that's and ai ai ai andwe we cannot be behind AI right so uh wehave a new View kind of is AI models andour first class citizens of Harbor so ifyou for some reason decide to upload a10 10 gigs AI model into Harbor you cannow you can see that and and you can seesome good stuff like this one I hopeit's not too small but you can seedetails around the model uh and somemetadata you can see you can downmore so if you if your use case is toupload AI models to Harbor please againreach out to us we want to more know uhto know more so we can improve that thatfunctionality uh state of the Harvardsub project um perform provider ishealthy and is supported and we arestill developing it um there are a fewfolks from ovh cloud there who providegreat support pumy provider it's kind oflacking because we don't have resourcesso again if you if you use palomi todeploy �your Harareinstances feel free tocontribute crossplane provider is comingalong uh the guys from crossplane cameto us that they're going to work maybecrossplane uh a company that's kind ofcurrently building cross crossplaneprovider internally and uh so they wantto donate this to Harbor and support itin Harbor in the future all right umoperator is something that wemaybe we're going to uh deprecate atsome point and archive because there isno many use cases and people are notreporting issues and it's kind of notactive um Harvard satellite uh volim isworking actively on this one we have theLinux Foundation mentorship program alsoon this one um and the harbard CLI it'sa sub project started by the um uhthrough the Linux Foundation mentorshipprogram uh and very soon we're going tohave a ga release so if you one of thesefolks who wants to use the CLI feel freeto doso yeah so the harbor CLI is alreadyusable it's just not kind of there is norelease tag on it but a lot of thingsare working so you should you shouldgive it a try definitely um now I wouldlike to talk uh more about what isHarbor satellite and what we are workingon and how this is you related to to theto Harbor itself but also what is whatcapabilities does it open up right Ithink there's a lot of things that arerelated to multi multi sitesmulticluster and also to AI as well likeif you want to run big Models All rightso Harbor satellite is a centrallymanaged artifact distribution solutionso it's a centrally managed artifactdistribution for multi-site location soI'm explicitly not talking about justthe edge because Edge is not reallyclear this everyone understandssomething different here so I'm talkingabout multi sides which can be of courseiot devices Edge and other Cloudproviders uh and making sure that youhave a consistent uh images across allthe sides in on the sides that you wantto have those images on right so it'snot just copying butalso selecting what you want to copyright and yeah so I don't have to tellyou about the challenges of edge rightso we have uh the unreliableconnectivity we have different conecttopologies so there's a lot of problemwhen you try to deploy software to theedge and if you're in this space and youface this you probably well familiarwith the these topics and you probablyknow 100 more than this right so um Idon't have to tell you about that one umand the problem that we really try tosolve with satellite is that we want tomake the delivery of the artifacts tothe edge location uh more manageable andwith the regards that you should be ableto manage like hundreds and thousands ofsatellite locations easily um and notjust a couple and that you should beable to manage everything from thecentral instance and then it shouldspread out to your old Edge locationsand then the edge location will fetchthedata um I mean there there have beenquite a few use cases in the past rightso we you might remember like dragonflycren this all have been approaches inthis in this direction and yeah we thinkwe can do it better right so that's whywe create a satellite but also what wethink is we can we can create it whichmore simpler right so I think theproblem of kren and and and and anddragonfly is that those are reallyreally complicated things and to operateand if you want to operate somethinglike this on a small device it may benot the right thing to do okay um so asI said the use cases are really uh broadmultiple SI it can be iot it can be Edgeuh it has also capabilities to you knowwork with environments which don't havepermanent internet connection uh is alsolike workable behind firewalls becausethere is only connection from uh east towest and not from west to east you willsee see that and you can also use it inisolated sides right so let's start withthis diagram first and I'll just walkyou through the components and then wewill look at the use cases right so whatwe have here we have the the the westside is the um where we have our harorwe have ground control and we have on onthe east side we have satellite we havethe registry and you know I've drawn itin two boxes �but it's basically just onebinaryum and let's start with the satellitepart right so the satellite partconsists of two components one componentis the the satellite Edge this is thepart where um the the satellite gets itinformation from from from groundcontrol or from from Harbor registry sothis is where it gets like the state theconfiguration and then there's the otherpart which is the registry part and theregistry part is a is a simple registrywe're shipping with a registry withsatellite so there's one kind of a bakedin in in Satellite but you can switch itfor any other registry if you want so ifyou want to run um a specific registryfor whatever reason um you can you canuse another one right so there is nokind of restriction to that um and thesatellite has two purposes so the firstpurpose of satellite is the deviceregistration it does this against uhGround Control um so it gets the tokenand then based on the token it knows uhwhat information it should it should getso it gets the information from from theregistry then and the other connectionthat that Harbor has is uh via oci tothe registry and over a over the ociinterface we're fetching of course theartifacts itself but more importantlywe're fetching also the state via theoci interface so uh each each state ofthe satellite is tracked in the in in astate file and the state file isbasically uh version version uh yeahversion state that can be signed thatcan be also roll backed right so you canroll back the state to a previousversion you can even if you if you're ifyou want to you will be able to havemultiple versions of uh uh present onthe edge and then you can roll backbetween between those versions right ifyou have enough space and it it's alsohave a a state config right so that theconfiguration of the satellite is alsopart of uh of the oi it's also versionand you can also sign it so that you canyou know also roll back theconfiguration if you want to and becauseof of this you know using oci for thatwe have all this this possibility of ociright so we can use the the efficiencyof ofci uh distribution here as well onthe on the edge location for the forEdge itself and this makes it possiblethat we canrun um the the satellite without anydatabase right so there is of coursestate right so the satellite is isstoring the the state alongside um uhthe application it's also storing thestate for the images right so it needsto store images somewhere but it doesn'thave any other dependencies than thatright there is no database no post CRwhich makes things a lot easier to todeploy and operate on the edge locationand the other part is that satellite hassay no configuration in a sense that youdon't have to configure anything on thesatellite side except you know settingthe token and the remote URL once andthen all the other information will comelater or can be changed on in in in inthe later stage right so we canDownstream thoseinformation okay umso this yeah the internal registry andthen we have Ground Control uh thepurpose of ground control is basicallyuh site for site registration so thesites can register there uh it's a FleetManagement so you can group sitestogether so that you can roll outartifacts to a group of sites and thenyou can also uh layer uh sites not youcanot layer sites but you can uh layergroups so for example you have a baselayer where you have all the same imagesacross all sides and then you have asite specific image set then you havemaybe a customer specific image set thenyou have application specific image setand then you can layer this on top andand combine it as as you wish so thatyou can easily manage hundreds andthousands of of sites easily with uhmaybe like 10 20 30 groupsright um and and this is what whathappening on ground control and now likeyou need to to to see like how do Iselect what images should be part of uhof the site so what is what is theworking process but before that um wewill take a look at the othercomponents so one component is callednotification or event system and it'sbasically designed for the way that umwhen we when we talk to people about� theharbor satellite there always like 10unique use cases you never heard aboutthat right so there's like yeah but weneed a use case where when you knowthere is like a 4G connection we need tobehave differently than when we have a2G connection or when we have noconnection there's like every companyhas a different use case here right andit was really difficult to find a wayhow can we uh deal with that and we cameup with the solution that we wouldbasically say like look there is a statechange happened here is an event we justspin up a container that you tell usthat we should spin up we spin up thecontainer we provide provide the stateparameters and now do whatever you wantto do with that right you can you canthen you know trigger a a state changeevent or you can trigger a kind ofupdate event or whatever you can do youcan do it now and then you tell us whatwhat we can do then right so you cankind of feed it back and then we canbehave accordingly like pausing thereplication or Zing the replication oror you know delaying it you know we cando this after you can do this afterwardswe just send the event so we don't haveto care about all this Edge use casesand we don't have to implement all thisEdge use cases because I think it'sgoing to be impossible to implement allof them that's why we kind ofexternalize this with this Eventingsystem and the last component which isthe the smallest one is called theruntime component because when you wantto give out such a system to thedeveloper you don't want to tell anyonelike yeah because you're now deployingto the site you need to change all thedeployments so they should point to thisregistry which just has different IDPaddress than uh the the site on you knownext door and what we do is wellbasically we just change the thecontainer runtime configuration right sowe can change the container runtimeconfiguration and tell them uh and tellthe runtime where it should look up forimages on different name spaces and wedesigned it in the way that whatever youpush to Harbor Side will end up on thesatellite side so there is kind of amatching matching name spaces of coursethe domain is going to be differentright uh but everything else is going tobe the same umso these are the the components and thenow the question is like how does itwork right so how does it from the userperspective how does it work so we haveof course on on our usual flow we have adeveloper that pushes images to toHarbor and this is all standard so whatis what will change is that you you youwill use the the replicationcapabilities of Hubers that are therethere already and you will use thereplication capabilities to select whatimages should be in which group and youcreate a replication policy and thisreplication policy will then trigger andcreate this state file for you and thisstate file will be transferred to thesatellite so the satellite will fetchthe state and and start reconcilingitself to to match with the desiredstate right and so that's why you have aa developer here sitting creating thesepolicies and then the ground controlwill create the you know take the statecreate an State oci and then thesatellite will kind of reconcile basedon the state oci and then start pullingimages and then in every stage you canthen basically decide what you want todo so you can kind of start getting thethe state and then decide yeah I want toreconcile now or I want to delay thereconciliation or whatever right so thisis like this part of the Eventing systemand and then of course on the on the farright side you have the containerruntime that just fetches the images soso this is the let's say the the wholeworkflow um there is another a fewworkflows that are kind of a yeah butyeah so these are the basically the Iwould call the USBs for satellite rightso we have the state in oci which is Ithink it's a it's a unique thing andit's really um gives a lot of trust andand transparency to the whole thing whenyou have the the state also version andalso inoci and uh we have this reconciliationLoop so that you know satellite tries toreconcile uh the state until it matchesand then you can of course because youhave this reconciliation Loop you cansee this reconciliation Loop alsoUpstream uh in ground control that youcan see like yeah where all all Edgedevices are in syc or not and then youcan see why it may or we can try tofigure out why they're not in Sun rightand then like there's only outboundconnections satellites always kind ofeast west and never west east and thenyou can customize it right so and thegoal is to be able to scalable to mold10,000 satellites because theconnectivity is just you know from fromthe satellite so it should be easy toscale to thatsize um there are a few other use casesuh the proxy mode and I think this issomething that more in the futurebecause this is always coming up so whydoes it better at proxy mode right whyshould I use it um in instead of proxymode and I think the main use case whyyou would not use proxy mode is becauseproxy mode is basically uha stop Gap solution because there'snothing else better right and this islike for all proxy use cases right uhbecause you don't know what you youmight need so that's why you you proxyand I think if you are capable ofspecifying and selecting what exactlyyou need you would probably not have toneed a proxy mod right but still we wekind of thinking about that and we haveit on a road map to just make sure thatwe can cope with this use case if it'sreally needed but I I think when it'stime when satellite is released uh andand usable proxy mote will be not thatimportant right and there is another usecase that I'm really uh happy about thatone it's about satellite on kubernetesso you can run of course satellite onkubernetes and there is a special usecase for that uh which makes this usecase really really unique so uh we areusing or we have planned to use this isnot implemented yet this is just on theroad map so we planning to use youprobably heard about Spiegel the thisthe project which is a kind of apeer-to-peer registry that sits reallydown below no spiggle no so basicallyit's a kind of a little um interregistry interface which does the noteto note peer to peer so that all thenotes can share the information aboutthe blobs they have to each other sothat when a Noe fetches a image theother can serve it and it's basicallypeer-to-peer Network and what it does isit basically uses the container runtimestorage as a registry so the place wherethe container runtime stores the imagelayers it uses also as a storage for theregistry so what it makes possible isbasically you can run a fully statefulHigh available container registry on acompletely statelesscluster right so this isyou know let that sin so this is something that I thinkit's really unique because you canreally run a state full registry andhigh available stateful registry on acompletely stateless cluster nothingthink something that's going to be aunique unique feature in in the futureof of of edge workloadsuh um and yeah that's it the road mapfor Harbor satellite it's going to becouple of weeks couple of months nobodyknows it's um I mean it's done when it'sdone but we havealready we have already uh quite a fewworkloads uh working so the the theventing part is not there but otherparts are are working the satellite partthe The Ground Control part the thechange to HUB that it's needed is alsothere so a lot of things are workingthere's no no no release yet but it'sall already in a in a usable usableState we would sayyes yeah yeah so um if you want to hearmore about that and you have somequestions come by the uh kosk tomorrowuh it's going to be until 2:00 I thinkno it's I think from it ends earlybecause it's the last day so we're goingto be there tomorrow uh so if you haveany questions that's the QR code to ourcommunity page if you want to join anyof our our events thank you very muchand I think we have yeah we have fewminutes for questions so if you have aquestion four minutes all right if youif you if someone is brave enough to askquestions uh there's a mic in themiddle no brave souls everybodytired all right thank you very muchthank you2025-04-15 22:03:15.471084 88��+�S#�� A1UHZT_v0rtshi everyone and welcome to that latehour people are opening beer alreadyover here they promise it's not beer butmy name is Orland I'm the harorcommunity manager and that's Vadim helloeveryone um so we're going to talk aboutthe haror and in the project update andwhat we have done until our since ourlast meeting and what we going to do upto our upcoming meeting um who who knowswhat Harbor is I hope right have Pro allright nice thank you so you can readthat at home for the rest it's too longso it's a it's a harbor har it's a cloudnative container registry uh one of thegraduated project set at cncf and one ofthe oldest projects at cncfecosystem today we released uh 213 rc1so if you feel adventurous and if you'rehappy to test and provide some feedbackyou can do that starting today uh it'san rc1 so please do not expecteverything to work work um please bepatient and you can talk to us tomorrowagain at the at the kosk in theafternoon I'm going to show that in aminute so Community First uh as acommunity manager and one of the mainersI want to talk to you about thecommunity how you can join us um we haveslack as most of you I hope know we havethe harbor channel the harbor def wehave the maining list uh and we migratedour not taking system to hackmd so ifyou have seen the last the previousversion it was in the uh GitHub Wikiwhich was terrible we were not able toedit that when we're 20 people on thecall so now we have that on hackmd thankyou hackmd for the support on this oneby the way and I have a question um wewe have a a an next or Twitter accountwho is following that account is thereanybody following that there one personone person okay I'm going to delete thataccount it would make sense to create ablue sky account for example anyone inBlue Sky Is there any anybody who isusing this kind of ways to uh receiveupdates about the projects that's uhsomething that I'm curious U if it makessense to make that more like with morestuff like Mardon or something else BlueSky anyone no all right I'm going tocreate Blue Sky account all right thankyou uh so h d as I said um he pointed orsome people try to say collaborationneed required or looking forcollaboration but it's actually reallyreally help needed uh so if some of youis feeling uh to help the project in thearea of uiux contributions or wedesperately need someone uh engaged withthe project management and product Urelease so if you feel you want toexplore that roow or you going to dothat in your everyday row and you wantto add some more work to your day feelfree to contact us again um we have aupcoming um task to migrate from uhinternal infrastructure uh of one of themaintainers to cicd in uh in the GitHuband and in quinic so we will be lookingfor people whoare feeling well with the cicd anddeveloping all these pipelines and andGitHub actions so we can migrate all therelease of and testing towards uh openum infrastructure of um cncf cluster andalso we are looking for people to uhcreate more tests and and��other communities uh for exampleCcloud provider and we are alsodesigning container object storageinterface and its implementation thatwill bring uh bucket and object storageinto kubernetes in the same fashion aspersistent volume and persistent volumeclaims uh very quick overview about thebasic storage APIs uh which arepersistent volume claims and persistentvolumes the main reason why this APIexists is to uh have some long uh livingobjects because spots are usually veryshort living uh pots come and go butpersistent volumes persistent volumeclaims stay persistent volume claim isusers request for storage pleaseKubernetes find 5GB volume for me uh itis namespaced and it is then used bypots uh persistent volume on the otherhand is uh actually a realization of thestorage and it's typically a pointer tosome storage back end with for examplevolume ID in the cloud or NFS serveraddress and share name when user createsa persistent volume claim Kubernetestries to create uh tries to findexisting persistent volumes to match theuser request with the actual volumes ifit doesn't exist Kubernetes tries tocreate the volume on its own uh in thestorage back end in a process calleddynamicprovisioning and then PBS and PVCs arebound together very strongly so thisvolume cannot be used by any other uhuser especially not from users fromother namespaces because they could besome important data something uh secreton those volumes we don't want to leakthem across name spaces and finallystorage class uh allows for uh separseparating the persistent volumespersisting volume claims based on theattributes like IOPS like like uhcosts and uh when user requests apersistent volume claim of a certainstorage class this storage class thencontains parameters for the dynamicprovisioning to create a volume withthat IOPS with that costsin Kubernetes we don't have onlypersistent volumes we provide alsoephemeralvolumes uh there are two kinds ofvolumes the first one is local ephemeralstorage uh this is a scratch space for apot to store some temporary files filessome caches whatever when the pot isdeleted this data is deleted toouh in in tree in Kubernetes code we haveempty volume that gives empty directorythat is either on the host or uh intempo in temps inmemory if you don't want to pollute hostor memory with this scratch space youcan use generic ephemeral volumes andthen kubernetes will create a temporarypersistent volume claim it will startthe pot with the claim but when the potdies we delete the claim and delete thedata so the data can be stored somewhereelse for example in the cloud for ashorttime uh another kind of e feeal volumesis used to inject data into potseverybody knows secrets volumes configweb volumes when API object withcredentials with some config file uh isthen projected as a file inside the potwe have also CSI counterpart of thiswhere the actual credentials or theconfiguration file doesn't need to beKubernetes API object it could be storedsomewhere else for example in Hashikvault or in some cloud secret store andthe last well not so ephemeral uh volumeis image where we allow users to mount acontainer image as a directory in a potthis container image doesn't containbinaries well it can contain anythingbut we expect it doesn't containbinaries it could contain some blob ofdata for example uh AI model orsomething like that and we will mount itinto acontainer uh we used to have many volumeplugins in Kubernetes code but we movedall of them into CSI drivers whatremained is hostpath NFS icecazzi fiberchannel and localvolumes uh they are not going away westill support them uh same for flexvolume it is deprecated but again we arenot removing it we are fixing bugs it'sfully supported we are just not addingnew features to it what we removed isGit repo volume in the upcoming releasethe recommended way how to connectKubernetes with some storage back end iscontainer storageinterface and what we did in the lastrelease sgauh together with sik apps we graduatedautomatic removal of persistent volumeclaims from stateful setstraditionally uh when you deleted astatefu�l set when you scale it down welet the persistent volumes andpersistent volume claims there we didnot delete them because if youaccidentally remove stateful set it'svery easy to recreate it using ML fileit's not that easy to recreate data onthe volumes that we would lose if youdelete persistent volume claims now ifyou are careful enoughuh you can enable auto removal use itwith care you can lose data veryeasily uh as beta we graduated uhrecovery from resize failure which isa corner case of volume resize we alwayswanted to support volume expansion inKubernetes users can grow their volumesbut we never wanted them to shrink thevolumes because it can be errorprone youmay need to take the volume offline itcan take some time to shrink volumes sowe don't want to support that and nobodyactually wanted to shrink volumeshowever there is a catch if user makes atypo and when they are expanding theirvolume from say 1 tabyte to 5 terabytesand they make a typo and accidentally exexpand it to fiveexabytes and the storage back end saysno I don't have space for that then theuser is in a pickle like uh we don'tsupport shrinking they can't fix thetypo they can't shrink from 5 xabytesback to 5terabytes with this feature if thestorage back end confirms it didn'texpire And yet we allow users to shrinkback to five terabytes you can see ittook like nine releases between alphaand beta this is this is how the featurecomp is complicated actually uh weneeded to iron out a lot of corner casesuh make it safe and robust and as a sideeffect now the expansion progress is inPVC status so we can see how it'sgetting resized in all the Kubernetescomponents in beta in 142 we have volumegroup snapshots which allow taking aconsistent snap snapshot of set ofpersistentvolumes so you can take a snapshot ofthe whole stateful set for example ormaybe in a near future uh you could takea snapshot of all volumes of one virtualmachines if you are usingword there are new API objects volumegroup snapshot volume group snapshotcontent volume group snapshot class inthe API server and on the CSI side thereare new cores that need to beimplemented by CSI drivervendorsand we are continuing work on SEUreabeling uh just the question who inthis room uses SE Linux in theirclusters uh couple of hands who usesOpenShift more if you use Open Shift you useSE Linux okay so beware uh so those guyswho don't use SE Linux nothing changesfor you don't worry if you use SE Linuxthere is a homework for youtraditionally uh Kubernetes applied wellnot Kubernetes the container runtimeapplied as soonous labels when potstarts to all files on of all volumes ofthe pot ifthe this usually takes a couple ofmilliseconds but if you have many fileson your volumes like hundreds ofmillions of files and the storage backend is not exactly fast it could takeminutes I have seen even hours in veryspecial cases this is very inefficientuh we are going to apply a synops labelsusing mount option in a constant timeregardless of the volume size howeveruh this introduces a potentiallybreaking change all ports that areaccessing the same volume at the sametime in parallel need to have the sameas label this is not a new requirementthis was required also before howevernow we extend the requirement also toprivileged ports if you are sharing avolume between privileged pot andunprivileged pot this was possiblebefore it will not be possibleanymore this is a breaking change so inorder to ensure smooth upgrade uh wehave three feature gates the first one sLinux mount read pot horrible name uhenables the feature for volumes that arenot sharable only a single pot can useit that that means it cannot conflictwith any other pot this is enabled bydefault for a long time nobody evercomplained then in Kubernetes 132s alphawe introduced a Linux change policywhich brings a newcontroller this controller is optin youneed to enable it in your Kubernetes togetuh to get it working and it doesn'tbreak anything yet it just observes thecluster and it reports metrics itreports events of things that couldbreak when the whole feature getsenabled so you �can uh see what couldbreak you can either rearchitect yourapplication not to use strangeprivileged unprivileged pots or you canapply opt out uh in pot security contextfield so once uh you can see that yourcluster is fine no new errors arereported everything is looks great thenyou can either enable the final Sinuxmount feature gate or upgrade to to aversion where this feature is GA okaynothing changes for people that don'tuse SLinux again no new controllers nobreaking changes if you use please payattention to documentation of yourKubernetesvendor and now I'm handing over toShing thanksYan so I will talk about what we areworking on in 1.33 andbeyond the warning populs feature istargeting GA in 1.33 release previouslyyou can only create a PVC from a datasource that is either a volume snapshotor another PVC but there are use casesto populate volumes from other datasources for example during a backup youtake a volume snapshot upload thesnapshot to a object store and then whenyou try to restore it you want to beable to create a PVC from that backup sothis backup will be an external datasource that is not a volume snapshot andnot aPVC the goal for this feature is toallow generic data populs by permittingany object to be the data source for thePBC adding support for need new datasources cannot break existing behaviorto support backward compatibility weintroduce the new data source ref fieldin the PVC spec as shown here if theoriginal data source is specified theinformation will be copied to the newdata source ref the original field willalways besupported there are a few differencesbetween the data source and data sourceref fields one thing I want to highlightis that while data source only supportslocal objects data source ref allows youto specify an object in any namespacethis makes it possible to support crossnamespace transfer in the futurein the shared library for volumes repothere is an example implementationcalled hello popul here we have anexample CR definition on the right handside there's also example of how to useit to create a PVC from the hollow CRdata source we also have a volume datasource validator controller that isresponsible for validating PVC datasources a volume populator isresponsible for registering the CR datasource type itsupports the always own a PV reclaimpolicy feature is also targeting GA in1.33 release without this feature the PVreclaim policy is sometimes ignoreddepending on whether you try to deletethe PVC first or try to delete the PVfirst leaking storageresources it also means users may becharged for storage resources theythought they have already deletedthe goal for this feature is to preventvolumes from being leaked by alwayshonoring the PV reclaim policy there arerisks even though this is a buggybehavior it has been there for a longtime so some users may expect thisbehavior to continue to mitigate therisks we introduced the feature in 1.23giving users time to adopt the changenow the feature is moving to GA it isready to be used inproduction portworks CSI migration isalso targetingGA also SQL Linux reabeling with themount options for non rewrite once portvolumes is targeting beta in 1.33releasethe data protection wing group has beenworking on the CPT feature for a whilenow it is targeting alpha in1.33 this feature provides a way toretrieve the snapshot metadata of theallocated and change blocks this enablesefficientbackups there was a session about CBTyesterday if you missed it you can watchthe recording later on YouTube iincluded a link at the end of thepresentation cozy container objectstorage interface is staying in alpha inthis release the team has been busyworking towards van alpha 2 this featureallows you to provision object bucketsdirectly in Kubernetes similar to how apersistent volume is provisionedthe mutable CSI node allocatableproperty feature is targeting alpha in1.33 this property specifies how manyvolumes can be used on a node withoutthis feature a mismatch between thereported and actual attachment capacitycan result in permanent schedulingfailure and stuck workloadsit can result in cube scheduler tryingto schedule parts on nodes that do nothave sufficientcapacity the proposal is to make thisproperty mutable periodically updatethis field through CSI driver andautomatically update this field whendetecting failure in volume attachmentdue to insufficient capacitythe storage capacity scoring feature isalso targeting alpha in1.33 previously there is only scoringlogic in the volume binding plug-in forstatic provisioning now this logic isenhanced to support dynamic provisioningas welladmin can configure cubeuler to prefernodes with the least allocatable or themost allocatable capcapacity by default a node with the mostallocatable capacity will beselected this feature is also replacinganother alpha feature volume capacitypriority which was introduced long timeagothese two features will beconsolidated i want to announce theremoval of the git repo in tree volumeplug-in it has been deprecated for along time and it isunmaintained however there are securityconcerns git repo volume types can beexploited to gain remote code executionas a root on the nodes there arealternatives you can use git sync orinit containers for the samefunctionality the biggest challenge isthat this could break users who arestill using thisfeature in 1.33 we are going to removethe entry git repo volume driver codegit repo volumes will not be removedfrom Kubernetes API it will simply errorout if usedwe introduced the feature gate git repovolume driver default to force in cubletuser can enable the feature gate and useit in1.33 they can enable it until 1.36 whenit will be locked finally in 1.39release the feature gate encubate and intree plug-in code will be removedwe are also working on warning expansionthrough staple set by allowing users tomodify the PV templates directly withouthaving to modify individualPVCs this is something we have beenworking on for a while and it will bevery useful for database operators andother operators who need this featurewe are also working on merging severalsidecars such as the externalprovisioner attacher resizer snapshotinto one to improve maintenability andpossibly reduce memory and CPUfootprint now let me talk about CSImigration statuscsm migration is something that Sikstorage has been working on for multiplereleases now it allows voting operationsfor entry plug-in that is usuallyhandled by cube controller manager andcubelet to be routed to CSI drivers thisdoes not migrate anydata this table shows our CSI migrationschedule it is almost completeportwork CS migration is targeting GA in1.33 the entry plug-in is targeted forremoval in1.36 this table shows entry driversremoved without going through CSImigration now let me talk about how toget involvedwe have six storage community page herethat contains useful information for newcontributors we have bi-weekly sixstorage meetings we also have a issuetriage meeting and meetings for cozy CSIand the data protection working groupyou can also join our main list andslackchannel here are some resources for yourinformationhere are some 6:30 sessions atCubeCon that's all we have today thankyou are there any questionsi have a question to CSI uh how much uhwork is it to write a CSI driverhow much how much work it's it's a lotof work one week one month one yearso there are some learning steps youneed to take however it's not difficultuh the first one could take one month ofreal work and most of the work will betesting actually and figure figuring outthings uh if the CSI if the storage backend is descent like icecazi and mount uhit's easy it's fairly easy so if youstill have a storage array and need toto write a CSI driver one month yeah atleast you should have a prototype in onemonth definitely if not then somethingis wrongokay thank you so there is a uh we havea example driver that you can just takea look at that when to get started it'scalled a CSA host pass driver it's ait's a sample driver not for productionbut it shows you how to implement itokay thanks i will have a look at itany otherquestions okay okay thanks for theattention then2025-04-15 22:03:16.039562 �*��U#��gA6GjLzWtqjlwhey everyone goodafternoon um uh welcome to today'ssession on driving Kos engineeringforward what's new and next with litmuschaos so in this talk we'll be uhsharing what's going on with litmus whatwe have been working on uh what are theroad map items and what we have achievedso far so yeah uh a bit of intro aboutmyself uh this is sna and I am a seniorsoftware engineer at harness and alsoone of the maintainers of lmus chos soyeah I'll givethe hi good afternoon everyone I amsarak Jan and I a senior softwareengineer at harness as well and a灏=�T#��1AX_xHC_Q5jGEhello everyone thank you for coming toour session on six storage intro anddeep dive my name is Shinyang i work atVMware by Brookcom i'm also a co-chairof Kubernetes Sik storagehello my name is Yan Shafranic i workfor Redhead and I'm a tech lead of sixstorage so what are we going to talkabout uh we will tell you what is sixstorage what we did in a couple of pastreleases what we are working on rightnow and what are weuh what will be working on in the nearfuture uh we will show you some briefoverview about CSI migration status andmost importantly how to getinvolved so six storage is fairly loosegroup of people uh we have twoco-chairs Sat Ali from Google andShingyang from VMware and we have twotech leads Michelle O from Google andmyself from Redhead we have couple ofslack channels with lot of people butactually only few people are active soif you want to contribute this is thebest place to go and try to help peoplewho are struggling with storageuh our bi-weekly meeting is attendedroughly by 30 u 25 attendees andthroughout the time we accumulatedaround 30 unique approvers in ourrepositories and directories inkubernetes what we do is defined in sixstorage charter we maintain kubernetesstorage APIs that means persistentvolume claims persistent volumes storageclasses CSI driver snapshotsuh volume attribute classes and so onand so on and so on and we also maintainthe code behind those APIs for exampledynamic provisioning on the other end ofKubernetes when talking to the actualstorage back end we still maintaincouple of in volume plugins like NFS andICE fiber channel and local we co-maintain uh ephemer volumes with secretsconfig maps downwarddownward downward API projected and soon and we can maintain them with siknode and we also uh are authors of theCSI specification contain containerstorage interface specification wemaintain its implementation inKubernetes and CSI sidec cars butactually we don't maintain much CSIdrivers uh they are usually maintainedby ��lso amaintainer of litmusKos right so litmas Kos is a cncfincubating project which allows users topractice Kos engineering on their Cloudnative applications and it has a missionof making Kos engineering secure as wellas accessible so Kos engineeringbasically verifies the resilience of abusiness services and helps devopspipelines build code that is moreresilient against software andinfrastructurefaults all right so here's a timelineand what has happened with litmus chosfrom cucon to cubeccon I hope it is uh visible uh but I'llstill read read it out for you guysso it all started in 2018 so lmus choswas first announced announced in cuconna 2018 at that time it was completelyanible based so Kos experiments were umexecuted based on some scripts and lusKos only had some basic pod level faultsin cucon Europe 2019 Kos operator waslaunched so Kos operator is basically atool that is used to manage andorchestrate chos falls into the users'scluster uh chos operator basicallymanages the complete life cycle of ofThe chos Happening into the users'scluster we also introduced chos crdsthat is the custom resource definitionsfor Kos and from ncel we migrated togoang so we migrated to golang thereason was it had better support forwhat we were trying to build and alsofor better um you knowadoption and moving to cucon na 2019principles of cnce was published CNC isbasically Cloud native uh chosengineering so uh this was published anduh we launched Kos Hub so Kos Hub is umbasically a hub which consists of allthe chos faults that are available sousers can just browse through the Hubcheck the list of the faults that areavailable and um use them according totheir use case we also added support forbyoc byoc is bring your own Kos so justin case um you know the the list offaults that is not enough for someone inthe Kos sub they can anytime go andcreate their own own Kos fault and addit to their add it to the KosHub then cubec Con Europe 2020 litmas1.0 was released and it was G8 so atthis time we received some Communitycontributions from some some Enterpriseslike red hat and Inuit and U we also gotsome inners across differentdogs ccor na 2020 CN um litmas Kosreached cncf sandbox and the first cubeccorn case study for litmas Kos was alsopublished so litmus probes were launchedlitmus probes are basically someyou know checks that allows users tovalidate their steady state hypothesisof their uhapplication and Prometheus metrics so wealso introduced one tool that is calledchos exporter chaos exporter basicallycollects all the data while the fault isexecuting and it exposes those data asPrometheus metrics users can later usethese Prometheus metrics to visualizethe data on any of the dashboardsavailable then cucon Europe 2021 2.0beta was announced and uh so the aimhere was to you know increase the umadoption with cicd pipelines so for thatwe we had we made some Integrations withGitHub and gitlab so basically we userscan add a k step with their GitHubactions or gitlabactions we continued adding some morefults into the kosub so that was thefalse Edition in cucon na 2021 litmas2.0 was G8 and and we introduced ChaosControl plane so Chaos Control plane isbasically the UI that we provide andeverything is possible via UI uh that isthe control plane that that we have soright from creating infrastructureinfrastructure is something that sits onyour uh applic application cluster andinjects Kos then from creatingexperiments accessing the Kos subub allis everything is possible VI the Koscontrol plane we added multi tendencysupport that is Kos can be executed intomultiple clusters using the controlplane itself self team support was addedso team support is um is something likeuh users can create a project and uh canadd multiple members into the project aswell and each member can have differentrule then cubec Con Europe 2022 uhlitmas Kos reached cncfincubation and U we added getop supportSo gtop support is um is like having atriggers trigger based execution of Kexperiments so let's say some somethinghas changed in your application and youwant to trigger a chaos experime�nt basedon it so that was possible by gitops weadded air gap support so we added chosimage Registries so instead of pullingimages from Docker registry um in an airgap environment the images can be pulledfrom Kos image Registries itself thenagain we were working we continuouslywork on adding some fults so we addedsome Cloud infra Falls and vmw falls inourkosub cucon na 2022 litmas 3 litmas Kos3.0 was announced we added runtimesupport of our faults we added someapplication Level chos faults that is umspring board fault and U we of courseenhanced our crds for um U you knowbetter for faults and validation and forbetter uhflexibility then cubec Con Europe 2023we added resiliency probes so resiliencyprobes is probes is similar to thelitmus probes that we had earlier butresidency probes are tunable via thechos control plane itself and propes isagain the same thing to validate thehypothesis propes can be used and it canbe added to each fault that you're usingto test your systemagainst uh we we made it leaner andscalable we improved the debuggabilityby improving the error logs and umshowing some good error messages we andthe the ux was completely resiliencyscore driven here so resiliency score isa score that we give to the applicationbased on the fults that they have chosenand based on the probes that they havetuned against eachfault right then moving to cucon na 2024we enhanced security features sosecurity is um very crucial for aproduct like litmus chos and we also hada security audit which I'll be coveringin detail in the coming slides so weenhanced security features in cucon na2024 and um yeah we made the producteven more secure and reliable to use andtoday we are here with litmas3.7.0 released we have better adoptionwe are working on adding distributedtracing and also we are working onadding some morefaults so here are some stats aboutlitmus uh I guess the stats are quiteeasily visible but I still read itbecause I spend some time creating thisslide so there are more than 2 millionlitmus installations and 68 millionDockerpulls there has been a 300% usageincrease in the last one year we have 21active maintainers and harness is theprimary maintainer of litmus chaos wehave more than 100 releases and uh morethan 2500 slack community members wealso have the list of adopters uh hereare a few we have more than 250 adoptersas of now but here list of some of theadopters that wehave all right so next goal so the nextgoal for litmas Kos is to reach cncfgraduation and it will be a greataccomplishment if uh lus Kos is able toachieve that so as part of this we aretrying to expandadoption um and um yeah we have donesome security audit documentation auditby the cncf team itself I'll be coveringthose in the coming fewslides um we have already had some greatfeature additions we are also working onadding few more we are working on it andwe try to have active Communityengagement by um you know engaging withum with community members over slack andGitHub we have recurring calls with thecommunity members and um so that we cankeep them in sync with what's going onwithlitmas uh we also have some meetingswith our maintainersour uh contributors um just in case theyhave any issues or we want to discuss onon any road mapitems all right coming to security auditjust a secondplease thankyou yeah coming to security audit so fora product like litmas Kos which um sitson users cluster injects Kos on userscluster security is something that isvery very very crucial so we recentlyhad a security audit that was performedby the 7A security team and based on itthey uh they proposed some enhancementssome modifications and they also pointedout some potentialvulnerabilities so we have fixed all allof the vulnerabilities we have made allthe enhancement set suggested as part ofthe audit some of them I have listed itdown here so API security enhancementsso we have made some generic errormessages for some of the apis so thatyou know we just don't end up leakingout some sensitive information to thepotential attackers we have rbackenforcements for our for apis and forsome �other graph and rest apis so litmuschos has apis in both graph as well asrest so we have added arbc enforcementit was already there but for some apiswe needed to have some strict ARenforcements that we have alsoadded we have added course validationfor graph C and authentication server sothis this was added just to minimize anyuh potential risk of cross originattacks right so these these API changesare made so that uh any end userswhenever they use API they we ensurethat they're using a secure and areliableAPI authentication and authorizationimprovements that we have made so thefocus was here to make this process evenmore secure and um you know as part ofthis we mandated the reset password fornew users so anytime a user logs in orcreates an account we assign a defaultpassword to the users but whenever theylog in for the initial time they have toreset it and they have to set a strongpassword and then a strict strictusername and password validations soagain same thing they like users shoulduse a strict uh strict strictcredentials which is not easilydecodable and guessable we added supportfor JWT secret creation upon K Centerinstallation so that is we we store thisJW secret in the database and wheneverwe create a JWT token this secret isused and since this secret is stored inthe database it's fairly safeenough we added we introduced theexecutor role and deprecated the editorrole so in litmas chos you can haveprojects and uh in Project each membercan have a different role one of them isExecutor role so this executor rolebasically allows user to executeexperiments and U the other two roles isthe project owner who is responsible forcreating the project and creatingexperiments and we have the viewer roleas well which is just to view the what'sgoing on in theproject all right so some network andinfrastructuresecurity so we haded environment basedsupport for https connections so earlierit was HTTP by default but we have addedhttps to ensure a secured communicationover the network we have addition ofnetwork policy amls again just to knowsecure the communication having betweenthe services inside lmas Kos itself wehave removed kubernetes client Cdependencies from graphql it makes theserver a little more lighter and U alsominimizes the risk of having any othervulnerabilities from a third- partypackage we upgraded Go version across Kcenter from 1.2 to 1 22 uh reducing anyrisk ofvulnerabilities then we follow securedevelop developmentpractices so as part of this like apartfrom the code and the infrastructuresecurity we try to follow thedevelopment practices that are secureenough so we have integrated git leaksin our PR checks this ensures that nonone of the you know sensitiveinformation like tokens or Secrets ispushed into the repository then we haveadded environment based support toenable and disable graphicalintrospection ideally it is recommendedthat um you know graphical introspectionis disabled in production envirment umthat is to make sure that the APIschemas are not exposed in productiongovernment so that's what we have addedhere as well so yeah basically the aimof this security audit is that um wedeliver a secure and a more reliableproduct to our end users and right itstarts from the development process andends to the delivering as well as uhwhile the user uses thisproduct then yeah we had thedocumentation audit so you can judge howgood a product is by going through itsdocumentation and a rigorous evaluationof the litmas tech dos was done by againcncf team so Nate and D from cncf teamthey did this U audit and um again sothe main aim of this audit was to makethe docks um easier to access and easierto navigate we removed the obsoletecontent from the website we added adedicated tutorial section and uh wereorganized and restructured our dogs tomake it easily navig so that users caneasily navigate through them right sothe again the AIM was the even our oldusers as well as the new users they findit very easy and they can find therelative content what they're lookingfor um by not having to search for muchthings and also the contrib�utors arelooking to contribute they caneffectively contribute to theproject so for new features andenhancements I'll ask S tocontinue um hey everyone so I'll betalking about what all new features wehave been working on what enhancementswe have already added and when what'scoming up next so starting up with uhfirst we have the SDK support So as youknow like uh to uh currently if uh youwant to uh use litmus Kos experiments inuh cicd pip plans you can use I mean youcan achieve it by using litmus CTL whichis a CLI tool for litmus uh Kos but wehave in addition to that we have alsodecided to add the SDK support in uh Imean Java and go languages so this uh goSDK it's currently under developmentit's been uh done by one of the LFX uhmentees currently and uh for the lmusJava we have uh I mean we have alreadystarted and we have um added theauthentication uhapis uh let me show you few Snippets soyeah so we have added uh theauthentication apis like user credoperations and project operations and uhuh environment creation and all so thatuser can directly use this snippet intheir code embed it in their Javaapplication code and uh uh do the I meanuh uh use it in the cicd pipelines soyeah that's one of them anduh yeah next is um uh distributed uhtracing so um while uh while the Kosexperiment is getting executed it's uhit it can be very difficult for theusers to uh actually visualize what'sgoing on the impact of the impact of theKos experiments the whole flow I mean uhit it it can be uh like very difficultto uh see I mean what's happening in thebackground so uh to make this a littleeasier so we we are planning to add umdistributed tracing and I mean it'salready in development we have some PRSalready merged and we're planning torelease this feature really soon but uhI have a small demo to showumyeah so we have this uh this is the KCenter UI and we have this engine exportdelete uh application I mean theexperiment where uh it's trying to Imean we have added the p deleteexperiment which will try to uh Targetthe engineers applicationuh yeah so we have the uh p deleteexperiment we can playit yeah so we have set the targetapplication uh by I mean we aretargeting the engine X application byapp label and then coming to the tunefault uh yeah so these are the existingenvironment I mean total chos durationand all we have we had it earlier but uhin addition for the distributed tracingwe are going to add one more newenvironment that is uh oal exporter OTPendpoint so for this demo we are usingum simplest collector to collect the Kosmetrics and it will be used to I meanvisualize uh in any of the uh uhavailable services so for now we are weare providing the simplest collectorendpoint but you can use any otherendpoint as well so yeah that's it andlastly we are um providing the settingup the probe we'll runthat so it will take some time and Ialso fast forwarded the video a bit butyeah so the first step would be to Imean while the chos experiment isrunning the first step would be theinstallation of Kos fault once theinstallation is complete then the chosinjection will take place soum let's see let's go to the terminalyeah here uh you can see we have theengine x uh uh ports deployed and wehave the uh simplest uh metric collectordeployed as well and the correspondingservices and um coming back tothe experiment so the installation stepis complete and the experiment runningwill be starting up soon once it startswe'll be uh start uh we'll start gettingthe logs for thatyeah uh it hasstarted it will take some time but I hasI have uh uh faster it a bit anduh yeah so for the visualization we areum for this demo we are actually usingJager UI to visualize all the um eventsthat are happening so for this uh we arewe have service I mean we have selectedKos operator as service and on the righthand side you can see uh the uh eventsthat are going um yeah you can see theevents that are uh going on and thisparticular event which happened around 2minutes ago it uh spawned uh multiple uhjobs like like uh experiment jobs trerpods and all so if we goforward yeah so um here we c�an see thewhole timeline of what is happening Imean which pod started at what point oftime and uh completed their job and uhif I uh I hope it's visible but I'llread it out so so first of all the Kosoperator comes in and it starts theexecution and um spawns uh Kos Runnerpods which in turns uh I mean afterstarting the execution it uh actually uhcreates the chos experiment uh jobpoorts and they run the I mean they areresponsible for the Kos injection andonce that is done lastly we run the uh Imean the the probes are run to check thehypothesis validation so I mean probeswe had set the probes in the uh eotformat end of the test test so that'swhy uh it is running at the end of thetest uh end of the experiment completionso yeah so this will actually help in uhin the debugging purposes mainly uh incase I mean application is not behavingcorrectly in while during the Kosinjection so it it it becomes easier forthe users to debug what's going on atwhat point of time so yeah so the as Isaid it's a work in progress and we willbe we're planning to release it I meanadd watch the pr soon and uh releasethisfeature so yeah uh going backthere yeah so um coming to the nextpoint that is uh support for document DBSo currently litmas supports uh mongodDB as database but we are are alsoplanning to add the support for uhmanaged uh nosql DB such as one of themuh being uh AWS document DB so as partof I mean while U adding the integrationwe realize that there are someoperations like operations whichare not supported by the um AWS documentDV such as um I guess uh facet andbucket operations so our communitymembers um raised raised multiple PRS touh remove the dependency of thoseoperations and tweak the uhqueries uh accordingly so this is uhthis is added and uh it has enabled uhus to I mean uh add the uh support forAWS document DV and uh next we have theproposal for uh Kos fault addition so Imean it's uh it's pretty evident that uhlike the database are the important oneof the important components of theapplication but and also AWS RDSinstance is uh widely very widely usedbut uh litmus currently does not haveany outof thebox uh uh support for Imean K's fault for uh specifically forthe AWS RDS instance so um one of thecommunity members reached out to us to Imean with a proposal to add a new faultso I mean we have accepted it and it'sunder um development right now and we'llsoon be uh adding this um new faultsupport okay yeah so next is uh supportfor U deletion an aort of experiment soI mean with the release of 3.x there wasone issue that was faced by multipleusers like uh there was there were somecases I mean some configuration issueswhere um experiments got stuck and userwas uh I mean stuck in cute State andusers were not able to move forward theywere neither uh able to uh rerun theexperiment so to I mean mitigate this uhissue we um uh came up with uh thisfeature I mean deletion or uh abortionof that uh particular uh experiment runso that uh I mean it uh un unblocks theuser and um users user can either rerunit or I mean create a new experiment alltogether okay yeah so movingforward yeah so uh so we have discussedabout enhancements but now uh and nowwe'll talk about uh how we are Imean engaging with the community I meanone of the main uh areas is through thementorship programs so we have been umactively participating in various uhmentorship programs starting with theLFX mentorship where I mean we have beenuh participating almost every quarterwith uh multiple mentees uhparticipating uh at least two to threeissues uh each quarter where the uh mainfocus is uh uh like adding new uh newfeatures or um uh if if users are facingany issues work on those and enhancesome existing issues and yeah so next isum open source uh contribution Academyso this was actually um mentored by uhnamu Park who is also one of themaintainers of uh litmas Kos so it's alocal mentorship program in South Koreawhere I mean multiple mentees uh localto uh South Korea participated and theyhave raised multiple uh PRS related tothe buck fixes uh new faults umdocumentation readme any I mean uh theywere uh there were multiple PRS uh uhfocused on the improvisation of thewhole uhrepository yeah so yeah so it this alsoreally helped in U expanding the globalcontri uh contributor base as well umand U yeah lastly uh we have been wehave also been participating in umGoogle summer of code lastly I mean lastyearum there was one uh Mente who addedmultiple uh fuz test suits and uh unittest across the I mean we have multiplecomponents within the repository so uhthey have added multiple unit test andall first test to I mean improvise thewhole U quality of the code there andadditionally we have also uh updated thegovernance policies where uh I mean itit it has really helped users to uh Imean look understand the roles andresponsibilities and also uh see how thedecisions are made and also we have uhon boarded two to three new membersrecently to the litmas um Community Imean litmas uh Kos project so yeahthat'sthat um okay coming to the yeah last uhlast but not the least we I wanted todiscuss on the road map items what weare planning to do in the upcomingreleases so I mean weuh let me see thisopens yeah I mean there's a big list ofthe road map items but I wanted tohighlight few of themuh this is one right here so so one ofthem being uh support for uh Native Kosworkflows um I mean uh so here umSo currentlylitmus takes I mean litmus workflows areusing orgo CDs so that I mean theworkflow for the workflow creation butuh we are planning to with the supportof native Kos workflows uh we will Weaim to have complete control on theexperiment life cycle and it will alsouh make the whole execution faster sothis is one of the uh major items thatwe are planning to achieve and nextwould be the support for terraform so uhthis is also I mean one of the openissues in G so so I mean as part of thiswe are uh planning to uh add theautomation for uh Kos infrastructuresand experiment uh operations so that itwill really help the sres or developersto automate the things I mean automatethe Kos experiments and not only that itwill also help uh the new users to getonboarded quickly with the with theusage of the terraform uh scripts soyeah that is one thingand uh uh next is uh implementation ofkubernetes connectors so um this is alsoI mean one of the uh we have planning uhon this for a long time so uh using theI mean with the implementation ofkubernetes connectors we aim to uhremove uh I mean remove the dependencyof uh Kos infrastructure um installationin the Target environment so this meansI mean with this uh support uh there'swe will need to uh we will Al only needto install the Kos infrastructure onlyonce and it will be able to Target theuh applications in all the otherclusters so it will I mean streamlinethe whole K experimentation flow so yeahthat is one thing and uh lastly as youknow usage of AI is increasing growingday by day so we are also planning toincorporate AI in Kos as well uh we'llstart with the integration of uh Kus GPTso that I mean uh with this uhintegration we aim to I mean it will uhscan the whole cluster and uh I mean weexpected to uh suggest the Kosexperiments which can be uh I mean it itcan really help the sres to plan thegame days and uh I mean chosexperimentation in general for the uh inthe whole uh application life cycle soyeah I mean these were some of thehighlights but um uh we have a lot morein the list as aswell uh yeah so lastly I mean uh thankyou for having us here and I hope you uhenjoyed the session and um if you wantto I mean if you are interested in uhchecking out the GitHub uh litmas GitHubrepo uh so we have added the QR scannersand also um the lmas K slack if you wantto join the community as well and if youwant also we are available in ProjectPavilion uh Kos number 10A where I meanif you want to have any other furtherdiscussion we can have it there as wellyeahuh thank you guys um I guess we alreadya couple of minutes over time so ifthere's any question and answers we areavailable here as well and we'll be uhpresent on the booth that is just ohthat is a 10A in the project Pavilionthank you2025-04-15 22:03:16.722196�derstand howpopular the project is is and for thematrix the metric we have public matrixthis a matrix long.io IO we can see thehow many no around the world uh using nohome right now it's growing uh overallis the more than 159,000 not new run uhnow and most time we have a new versionpeople just very aggressive to upgrade anew version of Longhorn to see and trythe different newfeatures and in general he's the aboutuh two uh 28 uh southern cluster intotal and from the feedback from theusers we know of course you can down tojust one node cluster totally fine uhfor your development or local home lab Iknow many people use along in the homelab and also from my current informationI know may there's a few cases runlonghorn in the more than 400 nosecluster for nails maybe more we don'tknow maybe run onAnd the area adaption area home lab ageon pre cloud private cloud or publiccloud a lot and for the domains uh AIteros data virtualations observationespecially less recently virtualizationuh we have uh CNCF have another projectscalled cubver and cuber actually use alot of uh user using cuver in their umkubernetes stackand if you want to make sure your volumefor your virtualation VM running wellespecially support the volume migrationlong support it it's a little differentfrom the traditional rewrite many but Iwill sharemore and long will be adopt in the enduser service provider or solutionprovider for now uh many case share ifyou join the slack channels or if youyou have some discussion Can you canraise your discussion or issue to let usknow what you are working for uh andwhat domain you are working for to adoptthe longhorn for that and of course loncan run on the nodes run your kubernetesuh using the traditional operatingsystem for sure if you look into or readour documentation you will know we eachuh feature release or even down to thepatches we verify different type of uhoperating system and next important partI want to mention is immutable operatingsystem probably together with a kubernetdistro um one I one from the communityfeedback very strong feedback a coupletimes they want the longhorn can supporttalos nas and in the past we don'tsupport that because we don't payattention to this area and but right nowlong is ready for that uh actually notjust for this version we start on thecover view version before but wecontinue verify when the towers have anew version when we're doing our releasewe will verifythat okay so releaseupdate release update right now thelatest one my name it just happenedJanuary and the latest release is 181March and 173 we have a one year supportperiods for community users we are doingright now try the best with the besteffort to make sure we can catch up uhwhat the feedback especially youencounter a critical issue criticalchallenge we want to have a followingpatch for that the most important partis upcoming release is one night youwill be landing in um by end of May weare working on that and 110 will beSeptember what I want to mention 110because I want to introduce a riskcadency the new way we start on a 1A isfour months minor release cadency So youcan m uh you can totally can be expectwhen you can get the long releaseJanuary May and September so you canhave a better expectation when you wantto adopt the new version and maybe youdon't need to wait for a long time ifyou want to see we are running thespring way so every two weeks we willofficial the spring release and everyspring release we just a tag and wepublish the image and um hamchart so youcan try ham is not yet but image isready so you can try modify yourmanifest to get play around the newfeatures come to the end of eachspring okay so this is about the projectstatus update and uh and release and ofcourse we have more I really want tointroduce how we work for the long haulproject management we adopt spring buttoday maybe the time is not enough so Iwe we have a kiosk if you are curiousabout the contribution uh come to cometo me uh the kiosk I can discuss moreabout these parts for longhome um thispage we do some um enhance uh if youcompare with the docum�entation partright now it's just c central parts butwe want to combine a few parts togethersso for for for longhorn is actually twoprimary part is control plan and dataplans the main core part is we rely onkubernetes customer resource and rely onthe event for the c kubernetes viewingresource to understand what's happeningfor the workload and uh their uh vingrequestAnd from the interface point point ofview the interface for user they can gowith the long UI uh when you installlong home where we are long UI for thatis very simple straightforward but it'snot quite fancies because it's the oldframework but we are working on the newUI right now it will take few uh releaseto make it ready and also long CSIdriver and long CS driver will deploythe CSI driver together with a plugin onevery note to make sure we can listen tothe cope from the kubernetics tounderstanding when to provision when toresize when to attach lavalin what weneed to do so and underlay is actuallylong API it's restful but it's not theofficial API to use properly so most oftime if user come to me David can we usethe long API or some way to forautomation purpose I say yes but don'tuse a longhaul API use a kubernetescustomer resource because this is a umthe general interface for you when youuse not just longhaul you use otherproduct as wellright and when you operate and we knowabout what we need to do we'll be downto there's component is long managerhe's a control pl and we run every nodesthere's law manager so we will know whenwe bring up the longhomeballings Long volume is composed of twoelements one is the longhorn engine is afront end for your volume and and alsohis downstream replicas depends on howmany replica you want and we will seelong will see how a how many availabledisk available in your nodes and we willprovision your replica to a right placesokay and make sure he's available uh foryour usage and when your replica done wewill make sure it will be reviewed inthe con management time to make sure thereplica will be come back because itwill be uh get the data back from thehealthy replica but if your replica justfail but it can be reused we will justlet your fail replica come back andcollect the missing part from thehealthy replic so this is over all uhthe control planeparts so this part um sorry should bethis part so data planepart is how the for the valley isactually make it happens so we have a v1and v2 probably some of you don't knowabout the v1 v2 but I watch just want tobrief v1 is current uh data engine weare using right now based on iceprotocol and homemade protocol forrabbitic okay so if you look at the uhleft hand uh left hand side you will seeokay l long hong engine is bring up andin front of longhong engine is actuallyice target server on the useruser uhnamespace and we that we use the icecashprotocol to mount the m to make thedevice available for your workloadapplication and for downstream partreplicats we use our customized protocolbased on the TCP to make sure the datatraffic can write synchronize right tothereplicas and also we come with a datalocality capacity so if one of replicattogether with your warload together withyour longhole engine you will be go withthe uh unisaki locally to make sure themore efficient but still it's iceprotocol so the performance is notquite high performance high-end so insome case most of case general caseloans run well but if your applicationis very data intensive and you arelooking for high performancecan be more step close with a rodisk we need a different way so this iswhy we have a long engine v2 we areworking on right now it's based on thespdk uh storage performance developmentkit uh it's just donated to the uh ninasfoundation recently and most of peoplecontribute to the project is intel frommy understanding Intel and my teammembers also contribute do it because wewant to make it be part of our newengine core the core of our new enginesso uh the idea is actually based on theMEV over fabrics to make the uhperformance betters and we use the SPDKPDF to to follow our ourour long engine new engine in front� andfor the downstream replicas we based onthe logical volume the idea is from theSPDK so we can provide the same capacitylike a snitch and we can do a samecapacity for the G backup because isalso simple vision the same asV1 and for for the phone how torepresent for your workload we use theMVN over fabric so you will see okayanother device is happens and you canmount into your application if this isautomation automatic because this is aCSI driver our CSL driver will handle itbut on the other side I want to mentionanother one we are working on right nowuh in upcoming one night we try to makethe not just rely on the M for overfabrics uh over is actually over TCPhave a target server for that we we wantto leverage ubrock and it's based on IOurine so have a more uh betterperformance for the front end so he willbe coming into upcoming Wi-Fi and weactually done some performancetesting and he will be jump anotherlevel if you compare withMEM over fabric so let's say for v1general usage you have is ice protocolwhich a longhorn uh default design forhis uh engine and replica and v2 will beb on the high performance spdk and havea different front end support for it uhinvolve over fabric anduprock okay so we know about the controlplan and data brand what a capacity wehave this is more a high high level viewto let you get into understand what longhas right now so kubernet valley andsometimes people will come to can longsupport a pil volume for sure this is uhthe the the first items we need tosupport so I want to say we support adifferent volume modes and uh SS modesright now it's proving and file systemvolume rewrite only and rewrite many andI mentioned another one is rewrite manybut actually is brains with themigration capacity so this is what Imentioned initially if you want to useyour like a cuber and together with theto can you can you can migrate your VMvolume around this is the mode you canuse and CSI protocol we try to fulfillthe the all the capacity we can dovolumeing provision attachments na cloneand expansion and I know some peoplecome to me can longhorn do the vinggroupstuff yeah Bing group right now is uhcome uh is to the beta stage so it's oneof item we are working we want to workin emboding capacity simple vision snupmore IO efficient train expansion and alot of scene and most important part islive migration and live life liveupgrade so the main purpose is uh noservice interruption and IO performancev2 v1 data loities and for the userexperience we have UI and rest versionwe come out with a CLI but the CSI isnot let you to operate a kubernetescustomer resource no this is not ourpurpose because you already can do it uhusing uh cubecado so Longhong CLI ismore about you can install what you needuh if you need to install the Longhornyou can troubleshoot the replica issuesyou can move around yourreplicats so we want the longhorn catsCI is more like uh day one and day twoand also troubleshooting tool interfacefor you so this is one thing and monimonitor matrix to understand you knowmodeling usage disk usage and down tolong component levels long major orcomponents and long instant major ourdata plan to know about metric about itand storage uh the long disk managed byV1 will be file system based and thebroad device base managed by the SPKdrivers replications we have a uh anti-affinity stuff and not just long levelnot just no level we just come with thedisco level together with the autobalancing so if your replica is notbalanced uh from the uh specific usagepoint of view don't have a function tolet it balance and storage tag tospecify the which disk which no you wantto use the for specific purpose you cancome with the specific storage class forit data protections replicationencryption bureau protections dataservice in classroom sna out of classbackup the aalinsSo a lot of scene inbackup compression so I just mention thehigh level ones but actually manydetails if you look into the longdocumentation so I probably will notjump into this i just quickly go throughsome of you maybe already read so I wantto highlight is the sensial cha�nce so wecan make a IO efficient and most of partI want to mention is V2 also supportthis one based on logical ballingbackup we supported increment incrementincremental backup initially but in thelast version one day we support a fullbackup because sometimes if your backupget have something wrong okay becauseyour replica has something wrong and youcontinue doing a increment incrementalbackup but your backup probably alreadyhave problematic so we allow the usercan have a period of time to do a reforaagain and you can do uh based on yourdesign so give the people have a moreopportunity to control how big it iscompression LD4um so GDP uh few compression measures weprovide 2M block size and this is bydefault but we want to make itconfigurable because your data may bedifferent type of different type of beddatas so I want we want the user candecide what B size you want for exampleyou maybe want a larger B size becauseyour data is video streaming or evendata processing data so this somethingwe want to use because this will beimpact and related to your usage andcost for your external targetsokay and this one is about the liveupgrades actually the same idea formigration so we want to make the serviceno interruption so we will come with atwo engine at the same time to make surestandby standby standby uh engine willbe ready target engine will be readythen you can switch your traffic inbetween so it's a live upgrade and uhlive migrationokay this a recovery uh in the past weonly support one pickup target but inthe latest version one that we supportmultiple pickup target right now sobecause we have a lot of people the userwant to have a backup tearing they havesome data is the cold data hot data sothey want to have a different measurefor the data pickup and also come withthe following questions uh what's awhat's the traffic go through is it gothrough the uh class network or can weleverage a long storage network have adedic dedicated storage network for thetraffic for backup right now uh Nostorage network only for the data compuh data plan traffic and for the engineand replica but this is the item we wanttodo mb2 uh just briefly is based on SPhigh performance and the P more you needto have a dedicated CPU resource forthat and I know that that resource maybe a challenge for people if I just havea lowspec can I also run a V2 because Iwant to try V2so we are investigating the interruptionmode because right now the SPK supportmore uh poling modes so we want toinvestigate is partful to dointerruption mode right now and fewcomponents in the database uh for in theSPK we are using right now actuallysupport interruption but some of themnot yet so this is something we areinvestigating so I hope we can make ithappen soon and so they mean you can youcan use a V2 in the high highspec mediaspec and even low spec this is ourautomaker and this a compare comparisonif you compare with the concept wedesigned in v1 and v2 logic volumeingcompared with the simple vision busfiles and ray one bdf spk conceptcompared with our v1 engine m overfabricsubcosi and replication based on mbn overfabric or our homemath over TCV for v1on okay I only have 30 minutes so thenew what we have a uh new stuff in oneat you we we can make it configconfigure a CPU code for your dataengines for v2 uh before we one at justone CPU because it's for try but for nowfor performance testing I know some somepeople want to do a benchmark so it'sconfigurable Dr valin right now issending to V2 as well auto salvage ifyou are all replica fail we can autosalvage your vining can come back tohealthy right now is also happen to v2ving live migration v2 ready vencryption ready as well and replicarebuilding more efficient uh we usingsna ch but right now in one night wewant to make it better we want to downto the chunk of data in the sna and weare doing right now in not just the snaor even ch d d d d d d d d d d d d d d dd d d d d d d d d d d d d d d d d d d dd d d d da levels inside and back imageupdate and download back image is aconcept for the VN image if you want touse the virtualation uh stuff and talosprevious to only support we only supporta v1 for talos but right now v2 is readyand multiple backup and rewrite manyautomatic volume expension withoutservice downtime and ham controllerintroduced by uh K3KS and some people want to use theautomation way to handle the long holeinstallation upgrade it works right nowas well and for upcoming one nights youblock fun delta replica rebuilding basedon delta snup face volume coning toreuse the ein snup from the your uh v2valins volume expansion and storagenetwork in the current status we arealready done almost done on Ubrock anduh first volume cloning and storagenetwork we still have time so we want tomade a remaining item ready for one nineand V1 and V2 offline replica rebuildingprevious we have a ving offline replicarebuilding for V2 and because this is aworkar around until we can anotherreplica rebuilding ready and now day isready but the following request can wehave a offline reparing I say oh it'sinteresting because they want to makesure if even they bing is detached andsomehow there are no gong we still canmake sure the replica around so this isthe requires uh requirement based on thebehind the context so this is why wewant to come with that for v1 and v2 andrecurring system pickup is a whole longsystem pickup also happensand benchmark we used the EQun medalsponsored by CNF but may maybe next timewill be Oracle OCI yeah because uh EQnext medal isretired okay so looking to the uh uhbenchmark this is IOPS uh blue bar isthe local pass professional tounderstand the row performance the redone is V1 violins the green one with V2volume along with your increased numberof CPU you will better understand okaythe V2 performance is getting betterbetter but of course there's a penaltybetween the road disk and thedistributing bar season right but it'sgetting better thanV1 rights is the same you getting moreperformance scan compared with theV1 and throughu yes the same a specialwe along with a number of your replicotsyou will get more throughputs this isnaturalscenes and restu as well is gettingbetter uh if you seeuh for our testing we just use a one totwo to four CPU code to understand howthe performance it is so this is why wecome with the configurable CPU counts uhfor 1 for the performance testingpurpose okay read latency yeah V2 isalso getting better but you will seewhat's the difference it seems like nodifferent with the number of CPUunfortunately it hit a B upper boundsbecause our testing environment is notgood enough so so if you have a betterenvironment you can give it a try youwill see a difference here so yeah thisis thesame okay so um maybe maybe we'll runout of time if you have question you cancome later but I will share somehighlights here for the communities andwe will have adopters uh MD i know manypeople use a long home but I don't knowwho using who are using long home so ifyou have time uh in the following weeksyou can have come with your PR to let usknow your usage uh your feedback so youwill be good to promote the longomeproject adoption and contribute to along home i know contribute to adistributed system is quite hard becausehe composed the different components butlike I say we have control plan we havedata plans and I try to make it brieflylet you understand how it works so veryuh want to encourage everyone count youcan contribute with the deployment ininstallation and upgrade or even down tothe data engine control brand and CSICLI as well or even ecosystem toolingbecause one of uh uh the the users alsoshare with me uh we can follow up withlike uh blockchange stuff so this is onetype of a contribution you can come withme uh come with us and to let us knowyour ideas okay so the last one is a Gissues any issue report issue help withissue uh this all about contribution soour next step is come to a graduation soreally hope if any contribution from allof you okay thank you so do we have anyquestions do we have time no okay so uhI will be here so if you have a questionuh come to me and we can discuss okaythank you everyone[Applause]2025-04-15 22:03:17.431545 ��d�V#��AREkSMbRrBU4welcome to the session aboutlonghome and I'm David Co i attend theqcon's capitalize the main mission forme is promotion longhome to have a moreadaptions around the world okay so I'mengineering director at susa and leadinga long teams working on that togetherwith our externalcontributors okay so the agenda is aboutI will talk about longhaul overview andI assume some of you maybe not using uselonghaul right now or maybe considerusing longhaul and some of you alreadyusing nonhong i will talk about moreabout plan later okay so after overviewwe'll be go over the project statusupdates and the releaseupdates and the next important part isabout the longhorn details so you willknow more about how long walk how simplyit is and from design from the operationand uh to the day two and whatever youwant to manage along in yourcluster including the few subtopicsarchitecture and uh capacities we haveand most one item I want to individualindependent highlight is a v2 dataengines so this new one we are we arespending lot of effort on that to letthe longhorn capacity especi performancejump to the next levelsokay and about release and we alreadyreleased one at in January so some ofyou maybe try and one day right now thelatest version is 1.1 is the stableversions so please give it a try orupgrade to it give us more feedbacks andalso the upcoming one night in May andthere are a couple of new features andnew ideal ideas following and we want todo in a foreignrelease okay this page is a simply let auser note a special for for people youdon't know about Longhorn longhome isthe distributed bar storage solutions uhbased on Kubernetes so you can simplystart from a cluster install long chartthat's it so everything layer becauseyou will use your local disk by defaultand the secondly uh you can providehighly available persist volume becausewe replica we have a more than onecopies of data for your volume but it'sconfigurable depends on your uh designand the third one is the not justposition volumeings we beyond that wealso provide the incluster snup to makeSo uh you can have a different time anddifferent strategy to manage your localdata for your volume so also we providethe external backup so this is betterfit uh if you have any strategy you wantto manage your dataoutside and the foreign storage based onbackup is a crossclass disaster disasterrecovery because you can leveragecentralized backup targets and restoreyour data to the different uhdestinationcluster and long is the can run on anyuh environment what I say any here ismore want to highlight it's a preformagnostic we have verified several uhenvironments on print age private cloudand public cloud and you will always seesometimes in the long re you will say ohlong right now is verified for some droor run right now along is able to run onthe G uh GCE uh GKE something like thatbecause we spend a lot effort to let thelonghorn can run on different places andfor sure he's aCNF incubating right now and we arelooking for the uh community user forthe uh more contribution because this isone uh criteria want to jump to agraduation want to see more contributionfromoutside about the project status updateuh right now it's growing continuegrowing over years so many people likeif you see a star from the GitHub is atraditional way we un��equest to have I think a thousand pluslisteners in a single Gateway justreally big gateways but there's lots ofother use cases here like if you want todistribute your gateway configurationacross multiple name spaces this can dothat aswell uh there's also uh another thing inhere that has a little thorny I wouldsay a little difficult to figure out inthis was uh TLS updates for connectioncesing and connection cessing is uhsomething that just does not play nicelywith Gateway API or more correctlyGateway API doesn't didn't play nicelywith it uh an example is if you if youlook at the the yaml on the right andyou imagine you have a Gateway that hasa listener for food. example.com andstar. example.com you may think thatyour a request to food. example.com willalways go through that listener that isnot always the case imagine for a secondthat you start with a request tobar.example.com and that hits that star.example.comlistener establishes a connection thenall of a sudden you go to food.example.com but you already have thatconnection open and that connection isvalid for food. example.com and so youkeep on going through that same paththat's a bad and confusing thing uh sowhat we've formally finally recommendedis that implementations return a 421that says don't do this uh but also andand we instead of sending you throughthat path will prevent you from goingthrough that path uh and it also willensure that gateways warn you now if youhave overlapping TLS configuration likethis example which could beproblematic all right there are somethings that are going to standard inGateway 1.3 that includes percentagebased request mirroring so we alreadyhave request Quest mirroring you canalready mirror to multiple back ends thethe difference here is now that you youcan mirror just a percentage of yourtraffic imagine that you have a verylarge amount of traffic and a percentageisn't precise enough uh we also have afraction as an option uh for those ofyou who may not be familiar with the funnitty-gritty details of kubernetes APIMachinery uh floats are just a no-o sowe have a fraction or an INT dependingon the level of precision you want butthere is no floathere uh the other thing that's going tostandard in Gateway 1.3 is policyattachment uh this isn't strictly tiedto the release but it's more of apattern than an individual API but wehave a lot of apis that are using policyattachment it's kind of a genericpattern to extend kubernetes apis thisis you know if you thought of Ingress asan API with a lot ofannotations Gateway is a API withpolicies to extend it instead that ggives you uh better typing morepredictablevalidation Etc lot lots of benefits hereuh but yes that's policy attachment arough example is that a policy canextend either a route a service aGateway and add a lot of additionalconfiguration to it um we have a coupleGateway API projects that are relatedIngress to Gateway can help youautomatically migrate from the IngressAPI to Gateway API we'll be talkingabout that very briefly uh and thenGateway cuddle itself is something tohelp smooth over some of the G gaps inCube cuddle when you're dealing withcrds because Gateway API is built oncrds and there are some shortcomingsright now when it comes to using Cubecuddle with Gateway API Gateway cuddleglosses over those and actually providesa lot of new functionality such asshowing you a full resource graph of allthe connected resources or if you wantto see for example if a service isreferenced by any of your routes orgateways it'll show you that and allkinds of other very useful things likedebugging your configuration foryou now the other thing that uh isprobably top of mind for many people isingress engine X and ingate any Ingressengine X usershere okay cool yes uh so uh Ingress APIitself was frozen more than 5 years agoand there haven't been any new featuresto that API uh Ingress engine Xmaintainers are shifting now to a newproject called ingate that's going to befocused on Gateway API support and notIngress Ingress engine X itself isshifting into maintenance mode uh andit's expected be archived� inapproximately 18 months so we'll diginto a little bit more of that there'sthere's other talks that go into this atmuch greater detail than I possibly havetime for here but suffice to say there'sstill a lot of work to do uh Gateway APItoday supports around 35% of of thefeatures that Ingress enginex does andhas plans you know should support around55% pretty soon but there's still apretty hefty Gap there we've got a lotof work to do to close that Gap therethere's other talks focused on that Iwon't go too far into that but just saythere's a lot of work here if you careabout this area if you want to help uhthere's lots of room for help andspecifically if you want your featuresupported in GatewayAPI please take this survey uh this ismeant to give us feedback on what it isthat is the most important features inIngress engine X there's something like118 annotations in Ingress engine X it'sa really long list of things to supportand we want to make sure we'reprioritizing the right things the funthing with open source is we really haveno clue what people are actually usingwe don't have Telemetry we don't knowthese things so if you can help F fill asurvey out and give us that feedbackthat would be greatand this is a very approximate timelineplease don't hold us to any of thesedates especially the Gateway dates I Ican't say too much about that but we'rehoping this is about how the timelinewill will work for the next twoyears and yes that's all I I know thatthat's a lot to takein Rob is it pronounced Gateway cuddleor GatewayCTL cuddle it has to be cuddle but I Iheard someone say Gateway control forthe first time todayokay I disagree with you but we'llfigure that out later um so a fewupdates about C proxy and some caps umwe're going to play some musical chairswith the microphone sorry about that umso in kubernetes 1.33 uh NF tables isgoing to go ga um which is really coolbecause it's nice and performant umsomething to note however is in thisparticular case GA doesn't mean on bydefault um you still have to opt in touse it and I think there's plans tochange that in the future shrug we'llseeum musicalchairs so some improvements inconnection tracking we now no longerdepend on the user space contract binaryfor cleaning up stale connections uh weimprove the performance like we don't uhf a user space process for cleaning upentry for every stale connections wealso added a contract reconciler whichsolves the problem of the existingimplementation which was a bestas effortin H triggered and we also added someMetric so people can monitor the healthof the Recon cyler how many entries aredeleted in total and time taken by thereconciler uh multiple service CER goesg uh it's a scalable and flexible way toextend the IP ranges service IP rangesit also exposes the service IP range itis fully Backward Compatible and fixessome uh problems with the existingimplementationsback tome um traffic distribution um the theparticular prefer close is GA in1.33 um this is the third take attopology aware routing um and in thisparticular case we assume that the usersgoing to going to do a lot morebalancing and um handle all the the thethe the balancing between the zonesthemselves as opposed to the previousimplementations which were trying to bea little bit more clever um somethingelse that's happening 1 33 is we'regetting prefer same Zone and prefer samenode um the semantics of prefer samezone is the same as prefer close umwhere uh and those are both in both inAlpha at themoment um and you can read more aboutthem in the in the linked caps no QRcodessorry um and then some more caps um afew smaller ones um deprecation ofstatus. nodeinfo doq proxy version fieldum that's been deprecated and beingremoved um DNS search strings have beenum a bit more relaxed so you can set uhnon-conformant but um or technicallyinvalid but are used out in the wildsearch strings for your pods now um IPaddresses are also getting a little bitmore validation on them previously wereallowed like z00 with with that paddingum that's going to be removed there's athere's a multi-release plan to to �getthe API to to reject those for securityreasons um and then uh V1 endpoints arebeing deprecatedshortly all right it looks like we haveplenty of time so I'll start with alittle introduction about Network policyAPI in case some of you may not know thedetails so we have a network policy V1which hopefully most of you know aboutit's a stable core API that lives inyour name space and what it does itprovides your networking security itlets the namespace owner Define whichconnections are allowed or denied as youcan see on that slide my backendnamespace is secure nothing can getinside except for the things Iespecially allowed uh which is front-endnamespace in that case now the apiswe've been working on recently and thefresh ones are called admin Networkpolicy and Baseline admin Network policythese are cluster scoped API they arecurrently in Alpha and they are designedfor cluster admins to set up clusterwide security so you can see on thatpicture &p doesn't leave in any specificname space it's cluster wide and itaffects multiple name spaces at the sametime so what the admin was trying to dohere is protect all Nam Spaces bydefault but say okay traffic coming fromthe monitoring namespace should beallowedeverywhere now when you have an alphaAPI what you want to do is to eventuallyget to Beta with that and during Alphawhat we do is we experiment with the APIwe get feedback we learn from ourmistakes and as soon as we feelcomfortable enough we say okay it'sstable now so we will promote that andthat is exactly what we are trying to doright now uh and we have a couple ofnice improvements and simplificationsplanned right now uh some examples ofthose is we want to merge these twoseparate crds admin Network policy andBaseline admin Network policy into toone so hopefully it will make all ourlives easier uh we have some plans torework our Port matches to allow moreprotocols in the future the most popularrequest would be around icmp so I hopesome people will be excited aboutthat and there is one thing I actuallywanted to ask for your help with we didget some feedback saying that priorityis a confusing word so let's imagine wehave two policies one of them haspriority 100 and another has priority200 now there is no right answer herejust tell me what feelsintuitive who thinks that priority 100would be applied first handsup ah beautiful who thinks priority 200will be appliedfirst controversial okay thank you foryour feedback we'll count your handsexactly on therecording that's very very valuableinformationseriously okay so the QR code there isour project we have a GitHub projectcalled something about the road tobetter which has all the issues we wantto solve before promoting our API allthese changes are still in progress soif you have any feedback on them or ifyou think they don't make any sense orif you really like them please let usknow we're always welcoming feedback umand another part of pro of promotingyour API to B is actually Gatheringfeedback from from the actualimplementations and we've had some newimplementations lately which is greatnews so here is how our implementationpicture looks like we had some firstimplementations that were with thenetwork policy API from the verybeginning that's anrea and ovenkubernetes and we've got a couple of newimplementations just last year that'sCalico cubn and awesome Cube Networkpolicies project by Antonio here and wealso have a celium coming next theypromised that will they will do that andif you have any projects on your mind oryou would like to see your projects logohere next time we are welcoming newimplementations please come talk to uswe're going to be super happy to hearfromyou okay and besides just the battlework there are some nice cool featureswe are also working on in the meantimeso one of them being tency which hasbeen a popular ask which means that's aset of Nam spaces that you define asyour own tenant and you want to protectyour tenant from from the other namespaces in the cluster you don't likethem so that's in progress and anotherone is a mutable pod identity matchwhich mostly we're �thinking aboutservice accounts for now as opposed tosome Dynamic label selectors hopefullythat will be slightly more secure and ithas a list of potentialum potential improvements so there alsoyou can find all this issues again ifyou have any feedback please feel freeto give it to us and there are a coupleof completed things we've done last yearone of them is fqdn matches or DNS namematches that is also fairly popular uhit's an experimental stage for now soagain waiting for my feedback andanother cool thing I want to highlightis policy assistant that's uh a tooldeveloped by Sig Network policy API ifNetwork policies themselves were notconfusing enough we are adding moreadmin Network policy API so which makesit even harder to navigate this wholeSpace which is why we have this awesometool it can do many different differentthings for example it can let you test anew policy before you apply it to yourcluster so that you know that it doeswhat you intend it to do it can help youunderstand which policies are affectingtraffic in your cluster and it evenworks without a real cluster just with acouple of yl it will help you simulateall the interesting things please take alook if you're interested in that at ourdemo on the previous cubec con there isa link and there is also another QR codefor our nice walk through for policyassistant all right all right um stillhave sometime to report from our multi Networkfriends so multi network is a SE thathas been working on introducing aconcept of multiple networks intokubernetes for a very long time and onething they've been trying to do thiswhole time is basically add this networkFields uh to the pods pack it's there'sbeen a long running discussion but theoutcome basically was don't touch potsspeec do something differently and thatis exactly what is happening right nowso multi Network group is moving awayfrom the core API development to the crdbased approach and what's they're whatthey're doing right now is using Dynamicresource allocations that you probablyhave heard of already multiple times uhand the point is you probably mostlyhave heard about Dr in terms of like GPUallocation so it's for resourceallocation but multi Network group sayswhat if we consider network interfacesto be resources and use Dr to assignmultiple network interfaces as resourcesto your pod which seems to be workingand there is a lot of exciting workhappening in that area they did have amuch more detailed talk on thatyesterday so if you didn't attend thatyou still have a chance to watch therecording there is a QR code leadingdirectly tothat okay and that's all the updateswe've had thank you all for joining andwe're happy to answer your questions now[Applause]ah there is the microphone by the way ifyou have any questionsplease say them in thatmicrophone hello uh thank you it's veryexciting stuff um two questions one oneof the main reasons we haven't moved toGateway API is most of the downstreamcharts that we use across the cncfdon't offer it they're Ingress only andI'm kind of curious to hear about how wethink adoption is going all stuff whatif anything we can do to help with likeyeah yeah no we're very aware of that uhI think past month or so we finally gotuh HP route merged as in one of the coreHelm charts we're very much working onthat we want it to be easy to use HProute Standalone without having tobundle all the rest I know Gateway APIis pretty difficult to work with in Helmwe want to solve that in the next yearthat that's a top priority right coolthank you and then follow up with allthe network policy stuff um so I comefrom a traditionally Network backgroundand the big thing with network policiesis all the logs and you can look attools like celium that you know havetheir in-house thing that goes and doesHubble and all that kind of stuff butthe formats of this are inconsistent andin the networking world you have likenetf flow and S flow and like all thesedifferent things and I'm curious aboutappreciate this is like a day two thingbut as soon as I start blocking trafficsomeone's going to go did the networkblock traffic and I have to be able tosay yes and it doesn't feel like or noit doesn't feel like there's a goodanswer for that or standard about thatum so that's something that we havetalked about in the the working groupthere's an open PR with some discussionbut it is not currently we we don't evenhave a a a fully functional idea toimplement yet um and we decided thatwe're not going to prioritize that asone of the features for getting to Betaso we do eventually want to have ananswer there but we don't have it yetcool and and vendors can still do theirvendor specific logging of course sothanks anyone else everyone's networksare perfectly fine you have all thefunctionality you wantokay Sira is pointing out that thatNadia talked about Dr at the end butmaybe some of you don't know what Dr isum so Dr is dynamic resource allocationuh it's a newcap show of hands do do people knowaboutDr everyone knows about Dr okay we won'texplain[Laughter]thatoh it sure does well um I follow up tothe Ingress question before right I'mnot sure I catch the answer like I guessthere's a whole ecosystem you know weall encounter it dayto day where Helmcharts include Ingress not necessarilyHelm charts managed by ku6 repos rightlike Helm charts as a as a whole umwhich I think was some of the referenceso like yeah like is is is your answeralso addressed that uh so what what Ithink I'm talking about that merged amonth or two ago don't quote me on thattimeline but it it has merged uh is isthe way to represent HP route just aseasily as you can represent Ingress in aHelm chart that's that's like the stepone building block of this but there arethere's more work to do this is this isthe start of a process we need toprovide some good examples of how thisworks and we need to make H route moreuseful Standalone so you don't have tofor example one one of the challengeswith HP route right now in a Helm chartis you have to point to a parent reflike to a Gateway and what if you don'tknow what that Gateway name is it's justone more thing that you don't have tothink about with Ingress really becauseIngress works as a standalone thing uhso we've talked about like maybe youcould have a default gateway and youdon't have to think about it in thatcase uh maybe HP route so again justthose kinds of things if you have ideason what would be usefulyou know we really need feedback uh thatIngress engine X QR code if you don'tmind bringing it up you can you that'sthat's just like general purposefeedback for for us thank you yes thatone uh any any way to provide feedbackif like this doesn't have to be just foryou're moving from Ingress engine X ifyou're moving from Ingress and havefeedback for us please let us know or ifyou're not moving you know just whynot okay you mentioned that the Eng Xwill be dicated and my question is willthe new solution also support serversiteincludes what was the what will itsupport uh server site includesssis uh yeah maybe if you fill out thatsurveyokay so Ingress engine X has had thisproblem that it's it's very veryextensible because it just lets youwrite Lua programs and Snippets ofconfig file that it doesn't evenunderstand what they do it just handsthem off to engine X and Magic happensand cves happen and we want fewer cvesin ingate so there will be some thingsthat you can do with Ingress engine xthat you probably won't be able to dowith ingate um but also they are tryingto get more official features added toGateway so that you can you will be ableto explicitly support these things thatyou can sort of do magically with ingrenGen X so fill out the surveysorry one last thing yes I like exactlywhat he said I I just you know it maynot be exactly the same thing as Ingressengine X but we're trying to at leastprovide similar functionality that maybe a bit safer in the cases like forexample Lua is something that probablyis not going to make it to Gateway APIbut maybe we can have other similarextensible things that might be a littlebit safer and easier to secure uh butyes I know we've I think that's all sookay thank you for coming[Applause]2025-04-15 22:03:18.089575 d�d��,�X#��AlrA6gOpLWMwhey everybody uh thank you so much forfor coming to Wom Whiplash Womcloud'swild ride to standards uh this isactually my first talk in the maintainertrack so super excited to be heretalking to other maintainers i hope thatthis is informative and inspiring foryou as it is a therapy session for me totalk about our past mistakes and a wildride over the last 5 years of working onWAMC cloud i'm a senior softwareengineer at Cosmonic a CNCF was cloudmaintainer which is now an incubatingproject i've been with the project sinceits conception in 2019 coming out ofCapital 1 um a huge rust station uh anda big uh and very sad demo enthusiastbecause I actually have no demos for youtoday it's my first time ever doing thatso I'm going to try to hold yourattention and and hopefully this shouldbe this should be fun the last uh fiveyears of working on this project so fortoday going through the agenda I want totalk just a little bit about what WASMcloud is uh why it exists the originsthe ideals why the project started inthe first place what we built thatdifferentiated us and what we built toeven show up as a project in the spaceusing a new technology of web assemblyand talk about the standards that solvedso many of our problems uh and just ingeneral standards and why they matternow today was cloud is an incubatingproject in the CNCF F we're a WASMnative application platform fordeploying applications everywhere we useweb assembly as the unit of computerather than a container for anapplication uh which means that you cancompile your code to this platformagnostic tiny binary to any cl��2�W#��AlBOdQHNNgEUhello uh welcome to the Sig Networkintro and update um thank you for comingto our talk and not a certain other talkthat's scheduled at the same time whichI won't tell you what it is in case youdidn't know because I don't want peopleleaving um I'm Dan winchip I work forred hat on obers shift networking and Iam a tech lead of Sig Network inkubernetes I'm Rob Scott I work atGoogle on Gateway things and lots ofother kubernetes networkingthings I'm Adrian I work at a companycalled salesloft um I'm a very newcontributor to sign Network I've beenhere probably about a year um if youwant to join Sig Network they don't bitewhich is really cool so would wouldhighly encourage it hi I am d I work atbroadcom I'm part of the kutiesdistribution team overthere hi my name is na Piva I work atredhead and do open shift networking andSig Network policy API is my favoriteSigI didn't even think to ask people what'syour favorite s your workinggroup so okay so we we originally had abunch of updates and someone pointed outthat this is an intro and update sohere's your intro um according to ourCharter Sig network is responsible forthe components interfaces and apis whichexpose networking capabilities tokubernetes users and workloads SigNetwork also provides some referenceimplementations of these apis forexample Q proxy as a referenceimplementation of the service API sobasically everything networking withinkubernetes eventually Falls onto us umand we Define new things and newfeatures and keep things working best wecan uh if this sounds at all interestingto you you can join Sig Network you cancome to our meetings every otherThursday uh easiest way to find it I Ishould have had a QR code um but if yougo to kubernetes dodev that is thecontributor website kubernetes dodevnot. um and from there you can findinformation about all the sigs uhandjoin um so let's just get into theupdates now all right thank you so yeahlet's talk Gateway things uh we've got alot going on in Gateway a new in Gateway1.3 is core support that's been a longrequested feature really exciting to seethat one get across uh retry budgetsalso brand new in Gateway1.3 and then listener sets listener setsis if you've ever want to merge a bunchof gateways together or just have aninsane number of listeners in yourgateway this allows you to do that thisspecific thing was inspired by uh ar��oud edgeuh even even your own now just likecontainers need container nativeplatforms to do their best work you knoworchestration all of the things you seehere at this conference uh web assemblyneeds the same thing it needs wom nativetools and that's what we've beenproviding uh for for quite a long timebut before we can think about where wascloud is now we need to take a littlebit of a step back let's let's go allthe way back to 2019 uh where we startedand before was cloud was cloud it wasactually called waxo suit a cloudnativeexecute uh exo suit for web assembly andI know what you're thinking that that isthe coolest name and this is the coolestlogo and that this talk is going to beall about the mistake of renaming tocloud unfortunately uh that is is notexactly where we're going to go but youknow I I understand uh you all arewelcome to take the name now thisproject wasn't created to use WebAssembly as a technology like we didn'tstart WASMCloud cuz we were like we gotto use Web Assembly on the server sideand we're going to be the ones to do itthe best we started because we saw thepain of making microservices at a largeenterprise there's a ton of boilerplateand copy and pasted code from thesegolden bless templates and uh and andcontainer images that you're allowed touse to deploy your application you justcopy and paste it from each one meaningthat when there's a vulnerability in oneof those nice templates every singleapplication team has to go roll theirapplication roll you know rebuild itredeploy to production and that causes aton of friction it's what developersspend a ton of time on and that is notenjoyable or productive at allnow this is something that we wanted toaddress when we started the wa the waxosuit project and the strategy was reallythere per application we are bundling inso many different things that are notjust the application code not just thethings that the developer writes andends up using for uh their features atthe end of the day it's all the open-source dependencies it's all thecapabilities like using HTTP or havinglike your enterprise specificdistributed logger and it's also uhinformation about the platform thatyou're going to be running on you knowthat you're going to be running andbuilding to a container you're going todeploy it on Kubernetes you need thisservice and this service account andthis ingress and it's all these thingsthat you need to know about the platformyou're deploying on that's not just thedeveloper responsibilityso our thought is if we can make aproject where the application code wasthe only thing that was in your actualunit of compute the only thing that adeveloper had to write maintain anddeploy to production and leave all therest of the management of dependenciesthe run ops the capabilities all thosethings if we could push that into theplatform then that allows the platformengineers to have control over thatoffer it to developers developers justwrite code whatever language doesn'tmatter and picking Web Assembly as theunit of compute was a bet that thefoundations of Web Assembly that made itso successful in the browser would makeit successful on the server side wecertainly could have approached this inin a different way and using looking atWeb Assembly as a technology in 2019there were definite pros and cons someof them still apply today the pros it'san open W3C standard at the time it hadshipped in all major browsers i think itwas March of 2019 the web assembly 1.0supported in Chrome Safari uh Firefoxthe other one we had secure sandboxexecution platform agnostic binary smallsize near native execution speed andit's just a binary target so you cancompile to it from any languagetheoretically as long as it supportedthe web assembly instruction set thesewere all the things that had to happenfor it to ship in the browser and we'relooking at this and it's like this wouldbe great for cloudnative microservicesif it was a tiny binary that was secureby default you can write whateverlanguage you want sounds kind of nicenow there are definitely some uh Wow Ihave pros and pros on here wow y�ou allnoticed it nobody said anything pros andcons some of the cons is that becausethe support shipped for the browsers alot of libraries that supported WebAssembly assumed you were going to runin the browser so it would have likeJavaScript glue code in there and thatwouldn't work for running on the serverside as far as what can go in and out ofweb assembly it's only numbers so youcan write functions that you can callfrom the host or from the guest that cantake in numbers and return numbersthat's it no strings no lists no arraysnothing and when it comes to networkingsupport all of that has to happen overthe same AI so how are you going to dealwith pushing a bunch of bites over asocket if you're just passing like twonumbers back and forth it's a littleharder if we zoom in a little bit andand just kind of think of web assemblyuh a little more holistically for thefolks actually I usually do this surveyhow many of you have heard about webassembly here okay so pretty mucheveryone how many of you have likewritten something compiled it to webassembly and like done something semiuseful with it even if it's just localawesome nice so for any of the folksthat haven't really heard of thistechnology before or looking to grasp itthe best way to think of it just at itsat its highest level is web assembly isa guest in a stackbased virtual machineso you take code from a source languagea guest language like C or Rust or Go orPython you compile that to a webassembly component web assembly moduleand then that is the guest that runs inthe host virtual machine and that isimplemented in all major browsers it'simplemented now on the server sidethings like that and then it works onany supportedarchitecture now uh this may sound alittle similar to the Java model whereyou have Java code and you compile it toa jar and then you run it on the Javaruntime environment and it is a lot likethat it's just not like only Java andweb assembly people hate it when you saythis so if they ask you you should tellthem this and this is how you understandit but if you think of it kind of likethe Java model that that works tooright so uh when we think about these uhinteracting with a web assembly guestfrom the host or you're writing anapplication you're interacting with thehost you can only take in and returnI32s I64s F-64s F-32s and that's it andso when it comes to that limitationthere's a strong reason for it it's verysimple to implement just number back andforth communication with a guest it'svery simple to implement on the runtimeside guest languages are free to havelike a higher level abstraction for thisum but it does very critically assumeshared memory and that is how uhessentially a lot of the web assembly uhecosystem worked for quite a long timeis you would like have a slice of um aslice of bytes in the web assemblymemory and the host would have to readthat in response to what you're doing sothis is obviously a lot lower level thanthe average developer is trying to goout and I I mean I would say like 99% ofdevelopers are trying to go out and likewrite a microser you're not going to belike all right you know now I'm going topass my numbers back and forth andthat's where the web assembly systeminterface or wazi came to be so thisstarted as a way to create securestandard uh interfaces for writingapplications that can be run in thebrowser that can be run on the serverside and it's supposed to create afoundation so that language tool chainscan target a standard you know this isthings like writing to a file descriptoraccepting a socket reading anenvironment variable like the pixiestyle things that you could do in anapplication and this started I mean uhthis started all the way back when wewere beginning with W was cloud butultimately we looked at this and wethought well we're not trying to dosomething that's pix we are trying tomake a different style of microser sothe standard that's emerging it itprobably won't work for us so we werelooking at ways to compensate for thecons of using web assembly in 2019instead of to compensate for the numbersin no numbers out� uh no complex datatypes we said all right we will write anFFI protocol across web assembly the thebinary guest and we can actually pass uhcomplex types around by serializing themto bytes and then passing a pointer anda length to read it's like you know comp110 like the first thing that you doinstead of uh trying to create specialways to do limited networking support wecan use native binaries which alreadyhave the support for that or evencontainers and we can just have a wayfor these distributed uh the the webassembly and these binaries tocommunicate over a distributed protocolthe polyglot support only really worksif you use the same API so we'll writelanguage specific SDKs so that it canmatch our binary protocol and we'll justuse the totally uh target triplet WAM32unknown unknown to compile to so wedon't assume any environment it'sbasically just bare web assembly computeso first thing as we start on ourjourney to help developers be lessfriction uh full when they writeapplications is to create our binaryprotocol which is YPC so this is a wholeprotocol where the host can invoke afunction on the web assembly componentand pass it a bunch of bytes and thenthe component could like call back andsay like hey you know I had this errorand you should probably read it fromthis place in memory and then the hostcan be like well okay well I have anerror actually and you need to read itfrom this and this is like so much rightand and it actually worked really wellright we used it for uh about 4 years topass complex types using web assemblyAnd really you know if you look howcomplicated this is really what thismeans is that every language that wewanted to support for was cloud we hadto write an SDK in that languageimplementing this protocol using thatlanguage's memory models and mostimportantly this is an open- sourceproject and and WPC still exists todaybut this was proprietary to thatprotocol if you were writing anapplication using our open sourceproject and you were trying to compilethe web assembly and do what we weredoing you were now tightly coupled toour platform you had to use our projectyou can't take that and go anywhere elseyou have to rewrite itbasically so really this is um this isthe same model that that happens allover the place right you have a lowerlevel API and if we can all kind ofagree on a nice lower level API then wecan write higher level APIs to target itthings like gibb c and then that goesdown to sis calls which is another APIthat we've all kind of agreed on thatlooks kind of nice so you know there'sthere's an evolution of this so westarted from the bottom and worked fromthe ground up to build our own guestSDKs so we did it in Rust we did it inGo we did it for TypeScript we did itfor Zigg Swift and every single SDKimplements this binary protocol and thenyou know you just have to serialize yourcomplex types to bytes and then on theother end you do the deserialization tooa little annoying whatever it all worksyou know very generic protocol it's nicenow the problem is is we started writinga bunch of extensions you know thingsthat you could do with web assembly liketo make HTTP requests and we assumedthat we would just like serialize thecomplex types to JSON so everybody canjust deserialize that from JSON or maybewe would use message pack like a complexbut or you know a standard but it's aand it's an assumption and then thatmeans that every single person has touse that same serialization format sowe're like all right maybe that would bemaybe that would be a little hard so whydon't we write why don't we use an IDLwhy don't we write some codegen and sothis will work in every single guest SDKwe just need to uh write a codegenmodule for that language and generatesome code right instead of hand codingit we've got the protocol so people candefine their own complex types likesaying hello world and then uh go fromthere so we tried our own we called itwhittle it it this was another one ofour awesome naming decisions it was a itwas a whittle cool you can see that onthe left you know this is how you wouldwrite a hello world �it's fine it'spretty understandable um but then wewere maintaining our own IDL like we wewanted to represent like recursive typesand now we have to update an IDL and wehave to update our code gen um this maysound so obvious to you now that nowthat I'm saying it out loud but then wewere like okay this is this is way toomuch why don't we use a wellestablishedIDL that like people use and we lookedat a couple things we looked at like capproto we looked at protobuff and thenlike that you have to put yourprotobuffs everywhere this is not aprotobuff rant talk so I'm going toleave it for later so we were like okaywe're going to use we're going to usesmithy how many of you have heard of thesmithyIDL nice two awesome smithy is whatAmazon uses to describe all of their AWSservices internally so they have theyuse the smithy IDL they can generatecode from that um and there was actuallylike a rust and some support for thedifferent languages that we wanted touse so we're like we're going to use anestablished like you know Amazon didlike this is going to be great and so wedo all of this right and if you reallyjust like step back and and take a breakwhat I've been talking to you about forthe last like 10 minutes is all thestuff that we built just to like gethere and the things that we actually gotto work on at this point the reallyexciting pieces in W was cloud that setus apart at the time were reallyinteresting we we were doing adistributed network of computing with aflattened topology network using NATSlike the CNCF project so that you havethese web assembly binaries they couldbe running on any cloud any edge we getautomatic failover they can communicatethe same as if they were running locallywe have loosely coupled contracts so youcould actually have like a databaseconnection or a you know key value storeimplemented for Reddus locally and thenyou could just swap that to use one thattargets Dynamo DB when you deploy toproduction we can run our actual host asa binary or you can embed it in your ownsystem we have a signing mechanismnetworking the distributing of these webassembly components um weekly webassembly community calls where we'rebringing everybody in and it was it wasso cool to work on these parts of thetechnology and this is what sets usapart as W was cloud but we had to buildall of this stuff just to even show upin the ecosystem and we spent a longtime maintaining it we wrote raw PC andthen all those codegen libraries and weswitched to new IDL so we wrote adifferent codegen library and now thatwe were maintaining that and it's justso much churn where we're rewritingthings and trying to adopt things andbut we're spending all this time onstuff that's really not making ourproject better or or different and Ithink all projects can fall into thatpitfall and when you look at what we'rereally missing we didn't really needthat much we just needed complex typesto go across the web assembly guest hostbinary we need to represent that and beable to pass complex types around we'renot doing hello world with a pointer anda length to read from a bite slice we'redoing hello world by handing you backthe hello world string and we wantedidiomatic experiences in the ownlanguages and there are even cleverhacks that people will do today but I ifyou take anything away from this sectionof the talk every single Web Assemblyproject that uses Wazi or WM 32 unknownthat has things that are more complexthan passing numbers back and forth isdoing it in their own proprietary waythat locks you into their platform youhave to use their interfaces and if youwant to write Web Assembly and targetsomething else like the browser orwhatever you better hope they have anSDK there or you're going to rewrite itthat still happens today there are greatreasons for it i'm not trying to throwshade on people who are using Wazipreview one it's just the thing that ifyou're saying that you've got all thisawesome support you know it it issomething that you have written as aproject to to do that in your ownspecial way there are even like superclever hacks that people are doing n�owto make this even more efficientum and really the the years and years ofpeople doing this in their own way ledto people saying "Well hey we reallyneed this to happen and we'd like it forit to be a standard." And from theseproposals came from Wazi uh preview 1 orwis and there are two core proposals tothe foundation of Wazi 2 the first oneis interface types which is a way torepresent complex types in web assemblywhen passing back and forth between hostand guest this includes an IDL that uhbrings the uh brings the assumptions andthe things you need to know about webassembly along with it code generationfor guest languages which can beimplemented by like people who work ingo all the time they can be like wellhey this is how I want it to look when Igenerate like a complex type i wanted tocome out as a go slice not this weirdthing and it also allowed for creationof standard or custom interfaces withoutchanging the entire web assemblyspecification so we can ship a set ofinterfaces for what it looks like forweb assembly to be a CLI to like executeit on the command line or for it to bean HTTP proxy so take HTTP in send HTTPout these can be standard interfacesthat we can ship everybody can align onand then target when their individuallanguage tool chains and then from thisanother set of you know what we weresolving in was cloud in our own way butwhat we really wanted was for differentmicroservices to be able to interop witheach other and so an extension of thiscomplex types proposal came thecomponent model which is essentiallyjust wrapping the core piece of computewhich is a web assembly binary webassembly module with a little metadatathat details the complex types that aregoing to go across the API this is thecomponent model there are functions thatyou can import and export from webassembly think of it like providing anAPI or calling an API and uh under thehood it's just core WASM like webassembly 1.0 0 this allows you to dolanguage interop to have strictlydefined interfaces uh and potentiallyunder the hood most importantly have asegmented piece of memory where you ownyour own stack you are not sharingmemory between untrusted pieces ofcompute and so it's not violating theweb assembly sandbox and when you lookat all of this and this is when westarted looking at Wazi and we're likewow like that solves a lot of ourproblems that solves a lot of the thingsthat we've been running into issues forand we had to munchge a couple thingsbut you know interface types in thecomponent model covers the issues thatwe were having with with IDLs andgenerating SDKs and and making our ownprotocol and so about a year ago todayand I say about because I think it waslike next week it would be a year ago welaunched was cloud 1.0 0 so from 2019 to2024 we had been working on this youknow doing a lot of churn 2024 welaunched wom cloud 1.0 and this is whenwe brought in was0.2 as the supportedweb assembly target so we didn't need tofocus on WPC writing our own smithyinterfaces doing code gen making our ownRPC protocol that has you knowproprietary things we adopted the webassembly standard alongside some of theother things that we had already beenworking on like hotel for allobservability open application model forlike our declarative application speccloud events for allformational eventsthe CNCF was working group uh publishedrecommendations for distributing webassembly via OCI and that's pretty muchthe the standard for how people do itnow we were able to take these thingsfrom the standard and implement that tomake our lives easier we get to spendtime on what differentiates us as aproject and this was the hardest lessonfor us to learn at the time because weweren't very involved with Web Assemblystandards we were doing our own thing wewere writing our own protocol it wasworking for us we had control but if wehad gotten in in the beginning joinedthe community groups joined the peopleworking on the standards offered ourfeedback you know we had real peoplerunning this in production we we couldhave pushed the standard along it itcould have gone quicker i'm not sayingthat we are like the best web assemblypeople ever and we would have made itall awesome but we would have been atthe table and and airing some of thethings that we had problems instead ofjust like working around it and likebuilding our whole thing and toultimately all these WASM projects thatare doing the same thing competing onthese like table stakes like just makeit work it just hurts the ecosystemeverybody's doing their own thing itdoesn't really match up and you can onlyget really so far making your ownprotocols and you know regardless ifyou're doing wom I mean this is themaintainer track uh everybody here canwrestle with the same problem ofbuilding too many things this is the uhyou know everybody's favorite image butafter all we're we're here at thiscloudnative conference like all of theseprojects exist have some base you knowsome kubernetes story like things opentelemetry like all these standards thatwe've kind of aligned on makes for thisecosystem of projects that interop weuse the same things we can knowledgetransfer you know I call that watsoncloud on there because we're you knowwe're over there um you know all ofthose things and it's the same same forweb assembly same for cloud native andyou know ultimately for wisdomcloud 11.0 Oh we like rewrote the whole thingwe we rewrote our host to target thestandard and it's it's actually you knowI want to say that it's actually okay todo that it's okay for you to takesomething that you were doing in yourown way for a long time you knowsomething else has come along and sayyou know what we can just get further ifwe don't do that and the way that wetalk about this internally at atCosmonic and for Woml Cloud is in termsof innovation tokens you get a certainnumber of innovation tokens when you'reworking on a new project or when you'reworking on an existing project to dospecial fun things that that you thinkare fun you know choosing your languageyou can write everything in pony orwhatever but you only get a certainnumber of those before people startgetting alienated people can'tunderstand what you're doing or peopleget locked into your platform so youknow if there's anything I can urge youto walk away from this talk with is thatyou know think about the things thatyou're building and what you really wantto do with your project are you spendingyour innovation tokens on the thingsthat make you better and make youspecial and and make people enjoy usingyour thing or are you writing a binaryFFI protocol so that your thing can justwork and really at the end of the dayWASMCloud has its project goals we wantdevelopers to be more productive we wantdevelopers to have fun writingmicroservices we don't want developersto be spending 60% of their timemaintaining the things that they alreadywrote we want people to just write codeprod at it test it and then deploy it toproduction and then leave it there andthat would be great i would have a greattime if that was my life and that's whyI love working on WMC cloud that's whatthe project is designed to do not to bethe coolest web assembly thing it's ayou know technology and standards areare what make our projectpossible i want to just give a quickacknowledgement if you're watching to uhPhil Kitty and Kevin Hoffman these weretwo uh original visionaries behind WPCuh WMCloud they were great mentors to mecoming out as a early developer at atCapital One and working on open sourceum and a huge shout out to everyonewho's contributed you know the hundredsof people who have worked on Wasom Cloudover the years thank you for yourpatience i'm glad we're on the trackthat we're on now and uh I thank you allfor for coming to this talk um I have acouple minutes for for questions if ifyou haveanything oh yeah if you do have anythingthere's a little mic over there so youdon't have to you don't have to shout atmeall right folks well I'm going to behanging around uh afterwards too if youare uh feeling shy and don't want on therecording but any any questions arewelcome but but thank you all have agreat conference uh really appreciate2025-04-15 22:03:18.631015le ofcolor and we love to have everyone joinus and we love allies yes I'm looking atyou you and you yes we would love tohave you guys be a part of our communityaswellokay so for firstquestion um so first let's start out bytalking about um why do we need toexpand the contributor Pipeline and howdoes inclusiveness make projects moresustainable I think I can go with thisone um so I feel like we need more uhfresh perspective and if we have moreinclusive environment we have morepeople we have diverse set of people uhwe have fresh ideas coming up so in ateam if there is just if there is justno conflict then there is a problembecause if because there are no newideas so if there is a conflict if thereis a discussion and if there aredifferent ideas from different set ofpeople people then the growth is goingto happen so I feel like withinclusivity we get more freshperspective new people um creativity soyeah okay Stefan yes so before answeringthe question I would like to put some uhstatistics in place so I would like toshare some uh information so 30% ofblind people in Europe have only a jobso that and finding a job for othersthat have difficulties to to see is aswell difficult additionum color blind blindness is something uhimportant as well one in 10 in 12 menand one in 200 women have uh color blindso it's difficult for them to assesscolor in technology when they have towork this this kind of information soinsing the the contributor pipelinematters because it brings diversity itbrings perspect most perspective andexperiences to the project fosteringInnovation and reducing the risk of EOChambers and open and welcomingenvironment attracts individuals fromvarious backgrounds which not onlyenrich the project ideas and solutionsbut Solutions are also build and moreresilient Community uh is is built withbroader pool of of active contributorsthe project is better equipped tomaintain momentum where you have someturnover or reduce uh activity so itensure more long-term sustainability andvibrancy okay going to pass the mic overto sendy so um what does an inclusiveand welcoming project look like fromyourperspective uh thank you for thisquestion just so I'll begin my sharingmy personal experience so i s startedcontributing to open source sometimearound February 2024 like a before and Iattended my first St contributorstrategy meeting and I did not knowanyone so just saw me and just happen toknow from Catherine that I'm a personwho relies on caption so I was verynervous I did not know anyone in thezoom meeting and just just said one linethat are the captions working so thatone line that one line made me so muchfeel belong and inclusive so this issomething that the project Main orpeople who are hosting try to beinclusive and accommodative like justwho are the new members and if they needany accommodations or not because noteveryone is very open about working theaccommodations so if you ask they getcomfortable sharing theaccommodation the second thing is tomake sure that the project contribut doare in very simple a simple languagewithout an because not everyone is thattechnically deep technically focused tounderstand all the are so like havinglike a simple document with cleardiagram and then make sure that yourdiagrams and images have alternativetext because there are blind people whomay be reading the do so it should beaccessible to them so accessibledocumentation accessible meetings is oneof the ways you can make your projectmore inclusive thankyou pass it over to Kaa please want meanswer that yeah um so my perspectivemight be a little bit different I thinkthat when it comes to engaging with opensource projects here within cncfsometimes it just takes one person torespond to a question if someone askshow can I get involved if you'reinvolved just respond and say hey cometo the next meeting or I'm working onthis do you want to help out a lot oftimes people don't know how to getstarted or there's just some fear therecuz I'm not sure if I can do this orthat or I know I'm interested in thishow do I get involved so just beingresponsive to people is a great way tohelp people get more involved in theseprojects and that contributes to beinginclusive in myopinionokayum so what role do project maintainersplay in fostering inclusion and and whatare other small but impactful changesthey can make um so I can answer thislike um so I feel like I'm I'm justgoing to give the example of like uh theopen source meetings uh which happens inthe CNC of like be it any tag or maybethe release teams and if you've beencontributing you can relate this to meum so whenever we have these calls wemake sure that we document every stufflike the project maintainer or the maybethe lead of the the call make sure theydocument every everything so that ifpeople want to follow they can follow itas sync uh there are different ways likeif somebody wants to speak uh they justhave to raise the hands in the zoom sobasically these are the small things Iknow these are the small things but likeme uh I I could see the difference whenI started my journey as a devopsengineer in one of my or the firstorganization and parall I wascontributing to open source and I reallyloved the culture of all those meetingsin open source like how projectmaintainers were leading the calls and Ikind of went back to my manager and Itold that this is how it happens and Ithink we should also follow this kind ofpattern because I I feel inclusive Ifeel heard in those calls and I feelsuper comfortable uh to basically takepart in those discussions now what wasthe main difference between these two Ithink we had smarter people in both theplaces right and I think like the maindifference was like project maintainermaking sure uh to create an inclusiveenvironment by different adoptingdifferent ways of uh conducting themeetings also I feel like um a lot ofproject maintainers are looking forpeople who can contribute to theirprojects for along uh and what they cando is uh they can reach out to differentset of people diverse set of people yesit does take efforts but maybe you'remissing on a great talent so for examplein my story like I uh joined tagenvirment sustainability because I wasinterested in that topic and I wascontributing to that for one year like Iwas taking part in the calls I was I wasgenuinely interested and curious andthen the coach then the chair at thattime Leo he reached out to me and askedme would I be wanting to lead thesustainability week initiative and uh II felt it so good because I've beencontributing to that and he gave me anopportunity to basically lead it and nowI'm going to basically pass pass it onto someone else who is new so basicallyrecognizing who can be the newer peopleand also making sure they are there area diverse set of people we are givingopportunity to everyone maybe they bringnew fresh perspective because I led itin a different way Leo LED it in adifferent way and we were learning fromeach other so yeah this is somethingwhich uh I feel project maintainers cando and they've been doinggreat uh thank you for this question JoI Just sh how I got started in myjourney in open source so in the cuonand Chicago thanks to the scholarship Igot a chance to attended and I wassitting in the K I just happened to sitnext to Catherine she the cier of thetax yes group and she said uh she saw mereading the captions and she said areyou enough I said yes and she said shehas just helped to find the death andthe Heart of young working group so shesaid why don't you join us and I becamehis member then and and I haven't misseda meeting of the group since the lastone and a halfyears from there I wanted to expand mycontributions Beyond and and the hard ofyour working group so then I out to Joand I began joining the D CS meeting andas I said in my very first meeting Jowas wonderfully inclusive to me and thatis what got me hook so it works as asort of to way he demonstratesinclusiveness and on my part I also haveto demonstrate the commitment toconsistently attend the meeting so thatis the one thing about open soures youcannot just jump in and jump out youhave to maintain a consistancy level andconsistently contribso then I started contributing to taxand I got a chance to speak at my firstever cube in Paris where was a part ofpanel discussion with and from there Iwanted to still expend my contributionsso then I reached out to M so he's atechnical lead of country grp and thebeauty about open sour is that you canreach out anyone senior Junior anyone inopen source there are no enemies in opensource everyone is so very welcom so Mthen connected me to cast fields and Ibegan contributing to coms and I wrotethe spotlight for the death and theheart of your working group and Iunderstood like what goes on behind thescenes how to bring a spotlight so Ithought W blog is a piece of cake is soeasy it isn't it goes to so manymultiple levels of review because theblog is on theous contribut website soit has to be reviewed very verycarefully every world has to beand it went to so many so many reviewcomments and cyle before finally makingthe light of the day and then I got toknow that there is something known as aKUcontributorm so I did not know what is ad I thought it is a tag that we attachto Virtual Machine and not a technicaladvis and then I do not know what KUcontrib I thought probably somedifferent so I got a chance to be ashadow to the casSC and that was when I Lear Shadow so myle C she created a very wonderfuldocument and so it become very easy toget onboarded as a shadow and I wrotprobably like 55 emails or somethingabout theCasia and then in the Sal Lake City Igot the chob for car water award whichwas beyond my expectation so for someonewho started the journey in open sourcein April last year to get in an award inNovember this isn't just my story but itis more about the story of how all thoseLe help me to enable they help andempowered me in so many stages whichhelp me to reach the St so the projectmaintainance play very very crtical roleand lastly all of the meetings should bedocumented and sorted so that peoplemiss the meetings or may with people whoon captions or something may be able togo to your meeting no so make sure thatall of your meetings are totallydocument thank you awesome thank youokay before we go to the next questionum just wanted to say if anybody in theaudience has questions um and you'drather write them down then coming up tothe mic um again we have index cardsover here so just raise yourhandum uh otherwise take one more questionand then we'll open it up to audiencequestions um so you want to pass the micdown uh toca yeah so are there examples ofsuccessful initiatives that helpdiversify a Project's contributor baseand what made them work oh well I'm soglad you asked okay so we're working onone now called bipac that I talked aboutearlier that is considered an initiativeso like we're just getting started sowhat we're trying to do is bring inpeople from underrepresented backgroundsinto Tech especially in this Cloudnative space and their allies to helpthem understand how to contribute how toum get started in open source how tobecome a speaker like me sitting righthere on this panel is because I gotinvolved with that group because someoneasked me would I be interested in justtalking with people on slack who are inthis bipo group bipac is blackindigenous people of color um and whatthis does is that this allows us tounderstand one how do you approach amaintainer or two how do you actuallyjust get started with going to a meetingwhere do you find these meetings ifanybody doesn't know finding a meetingfor any of these um strategy groups orany of these working groups is kind ofconfusing and but people reach out toyou when you ask how do I find themeeting I want to attend and somebodyfrom that group will respond to you andjust send you the link show up at thistime we're happy to have you um when Ifor me personally when I'm here at theour cubec cons I do the hallwayconference that's the one where we'retalking to each other outside of thesemeetings trying to connect and buildrelationships and then I follow up withthem afterwards or they follow up withme and we build some type ofcommunication so that I can get moreinvolved or they can start getting moreinvolved with the initiatives that we'reworking on so hopefully that answeredyour question hopefully it wasn't toolong-winded you guys are okay yeah allright I see some Smiles okay so it'sgood okay so take Nancy next in that oneand and then you uh I Could Just sh okaygoad example of the and of working groupso the working grou has about seventalks inthis and the members of the workinggroup are so busy with the talk and theproject that we have we are s of runningand it's not that we're only doingdiversity talks so from the seven talkthe five talks ofTechnical and only two facts are aboutthe contributor strategy and thediversity so this is a very good exampleof how when you come together a groupyou stay focused and work smart and workhard you can make a very big impact at aconference like yka thankyou that's so great um I mean I reallyloved when you mentioned there has beenseven talks that's amazing um so I'mgoing to share about like uh so weyesterday we had uh women in Cloud uhNative Gathering and it's it wasshocking to me so when I asked like howmany of you are the first time cubec conor the conference comers there were liken More Than 90% people raised theirhands um so at that moment I felt likethat's one of the good initiativebecause uh we like more than 90% womencoming for the first time to the cubeccorn yes that's great uh them connectingwith the senior leaders um that's greatand we concluded we had a lot of ofdiscussions on each table and weconcluded that uh for the growth of anyindividual be it anyone you need amentor and uh conferences like this umis a great place to get a mentor um andI think like initi and we are also uhJos is going to talk more about it weare also going to launch a mentorshipprogram so if you are interested to be amentor please be because uh I've beenlike trying to run this community and weneed more and more people uh it justbecomes hard if we have more peoplecollaborating on this it's going to begreat like for example and the bestthing is like you collaborate like Ireally want to appreciate Katherine Joshcontributor strategy like they help somuch like I was discussing the samething with Josh yesterday that maybe ifyou could come and you can you know uhgive a workshop on public speakingbecause a lot of people they theythey're not good at it and uh they'regood at the work but they just don'tknow how to present their views so sothings like this you know thecollaboration happens in conferences andI think with the initiatives like thisso um really um thank you so much uh forcontributor strategy okay well thank youum so uh I've got a link up there to thementor Pro program if you're interestedin mentoring folks uh uh out of theinclusiveness effort um uh if you don'twant to do the whole QR code thing uhyou can also go to the tag contributorstrategy repo on GitHub and we've got auh page there uh inclusiveness UHCdocument that will have links to thisand we'll continue to build it out withlinks to new initiatives that arise umin the meantime I want to ask a questionfrom Lonnie I think I from Wikimedia whoasked as an engineering manager how canI support inclusive spaces forcontributors and best prepare myEngineers to help support Andorparticipate in these inclusiveefforts all any anyone anyone who wantsto startAh that's a really very good question soI'm really fortunate to have like atruly wonderful manager who actuallyallows me to attend open sourceconferences gives me a break from workto come and speak here without thesupport from my organization my managerit wouldn't have been possible and howcan you support inclusiveness So as Isaid even in the like even in the deathSpectrum not everyLi and relies on captions like me thereare some who sign there some mix ofsigning and liing so everyone has theirown individual way of accessibil so thefirst thing that you can do is to justlet your team know that you always opento providing accommodation so then theyfeel free to share what accommodationsthey need and then you can take it on ac by B because it's not necessary thatas I said that accessity is notnecessary like a very huge and it is notnecessary that it has to be like arocket s or something it is just aboutaccessibility just about and having anopen mind a bit of empathy and D ofawareness and lot of willingness to be agenuine Al so you could try to be justan Al without actually having the titlelike I some friends who they don't knowwhat is Al but a very wonderful Al Imean in my entire life I had people whoare hearing fully able bed people whohelp me whether it is my open sourceJourney whether it is my career alwayssome people chose to believe in mehaving faith in me giving me theopportunity like I cannot listen on thephone and yet my manager sent me to Centand the Cent had a Santa Claus like so Icouldn't him and I got nervous my magyou have come to work just focus on workand when I finish the work before timethe client actually gave me a job offerwhich I politely declined so there areso many ways you can supportaccessibility it is just about having aFrank conversation about keeping an openmind and understanding the news and thensee what budget you have and how can youmake it a more inclusive more welcomingspace so that everyone feels happybelong andincluded thank you um so I'm going tocome at that a little bit differentdirection you're the manager right womanokay so I've been in tech for 20 yearsit's rare that I see many femalemanagers so I love to see you very muchum I think uh a good way to supportpeople and I'm going to come from awoman's perspective right now um there'salways someone on the team that yellsthe loudest to um drain out other voicesum assist your people or your personwith the ability to speak their voiceand to make clear what it is they'retrying to express like their ideas orcontribution um don't dismiss it rightit's helpful especially a young womancoming into the field right um when yourmanager does that and listens to you andencourages you it just makes you moremotivated to do a great job at whatyou're doing but it actually makes youenjoy the work that you're doing I havea great manager he's sitting right backthere so I'm happy I'm very supportivehe's trying tohide but um I I enjoy Eon because eversince I was brought into the company themanagers that I've had have been malebut they have been very supportive theylisten to my ideas they allow me to cometo cubec con and participate in opensource our open source manager is rightthere um so I really appreciate that andother people would appreciate thatespecially if they're underrepresentedin this field um or new to the fieldbecause young men coming in the fieldcan also sometimes feel drowned out andthey just need a little bit moreencouragement so that's just my littletwo cents hope thathelped yeah you want to take it no Idon't have any I don't have any answerto that question but I can follow thethe the question we have well let'sfirst see did we have more questionsfrom theaudience anyoneelse okay well we'll take the nextprepared questionand if you think of something don'thesitate to raise yourhand um what are some barriers tocontributing to projects that you'veencountered that you think most peoplein the cncf don't know about Yeah sobasically I will give you some tip andstreaks are mostly common sense when youuse diagrams or pictures when youdescribe the architecture of yourproject think about how people cannotsee diagrams so put more descriptiveinformation there with the text andduring the meetings speak loud and explexplain what you you you show instead ofsaying please show this explainexplicitly what you Des what you show toto the audience and uh when you use whenyou communicate most of the time we usesome fun memes or gifts and for this Isuggest you to to add into parenthesesthe the meaning you would like to shareto to the to the Audience by this memeor uh or gifts and for uh the projectwhen you do a meetings especially with aworking group or office AWS when youdemonstrate something with common lineEtc so please share some uh git uhrepository there and the environment useto let everyone to reuse what you youpresented and uh something I would liketo add as well is as soon as you you getuh people uh in in in the in the meetingto be to be more welcoming during themeetings do a round table at thebeginning it will be as well better toto get newcomers whatever he comes fromand it after this he will know betterwho is in the room and he could beparticipate better uh into the yourprojects great do we have anotherquestion from theaudience can I add just one thing go forit always allow people to ask questionsjust sayinglet them question Engineers are going toquestion everything if they're notquestioning something then are theyreally an engineer because I'm justsaying we have to question everythingand allow that to happen so thateveryone can grow on the team and justenjoy what they do cuz I ask a questionsokay I also ask go for it yeah I justwanted to say like whenever I visitcucon I feel there's still huge room foruh Improvement like there's huge spacefor improvement like in terms ofrepresent ation I see very few uh youknow women speakers and speakers fromother other groups like um so there's ahuge spacethere okay uh I'm going to bring this upso people have time to see some of theother stuff while we're stilltalking um so again I'm going to pauseto see if there's a question from theaudienceokay we'll taketake so for those who want to help butaren't sure where to start what are someconcrete ways folks who aren't projectmaintainers can do to supportunderrepresented contributors we'vetalked about what project maintainersand leads can do now what can otherpeople do yes so so first you can justgo to the web page of uh contributorstrategy so we'll give the link at theend of the sessions and basically allinitiative has a monthly meeting you canjoin so and if you cannot join allmeetings are recorded and you can getthem back on YouTube uh you have allinitiative some slack channels fordiscussion so feel free to to join andwe we need more alas uh to to the to allinitiatives so uh as other project weneed more people uh to be moresustainable and be more with with morevibrancy into ourinitiative um I think you wanted to sayoh um so I think that a good way is tobe an ally I believe in allyship so it'sa great way to be involved and to helppeople and just invite people to thingsbecause in my opinion cncf is massiveand it's just hard to find stuff sosometimes someone just needs aninvitation so I just think that's agreatway just one line about Al so Al issomething that when start doing therecomes a time when you actuallyinternalize it for example like I hadmany friends who use okay but I'm but Istill have friend outside the circle sopeople don't use whe and so now say Iand my partner we are going into a c andwe found that the entrance was notwheelchair friendly and we just St it sonone of us was using whech but when youinternalize it you s start see is thisaccessible is this not accessible sowhen this start happening once it startBlood start becomea thankyou okay so any thoughts to wrap up yesa final Source something I would like towrap up to to on board Talentefficiently meritocracy ensure thatopportunities and recognition are basedon skills potentials and contributionneither than arbitrary uh factors sotrue mer meritocracy isn't about atingtalents slowly and current achievementbut recognizing potential and overcomingsystemicbiases real inclusions means activelycounteracting biases that undervalueunder represent representativetalents anything else yeah I just wantto say like um if you have time and ifyou can be a mentor please be a mentorum if you have time in your teams andyou can be more inclusive and helpanyone else to collaborate with you Imean like the question the last questionwhich you asked Jos um if you can likebe a co-speaker with someone if you canjust invite a new person a new be withyou just do for just go for it uh youare going to leave a huge impact ontheir lives so yesokay okay well thank you so much um andthanks everyone for coming um and I hopeto see some of you aroundour various inclus inclusiveness effortsin thefuture y we're done than you2025-04-15 22:03:19.235892 �� Y#��wARgzyEc8pPa8okay welcomefolks um to uh the tag contributorstrategy session we're going to betalking about inclusivenessunfortunately I know well the Keynotesare just finished maybe a couple minutesago um they don't let us start thesession late so we're going to go aheadand get going and and people will becoming in as we talk um uh my name isJosh burkus um I'm co-chair of tagcontributor strategyum uh we have a panel here who's goingto introduce themselves in a minute uhto talk aboutinclusiveness um uh we will answer a fewpreset questions um and then we're happyto take questions from you um we cantake questions two ways um you can askthem out loud and also Dawn who's one ofmy other co-chairs for tag contributorstrategy has index cards in pens ifyou'd rather write the questiondown um the um uh and and pass it uphere and I'll make sure that it getsanswered um so uh with that let's gettalking aboutinclusiveness and the first part of thatis I wanted to go ahead and ask ourpanelists to introduce themselves so whydon't you start awesome thank you thankyou so much I am super excited for thispanel discussion today because it's anamazing topic uh I'm Nancy I am co-chairfor tag environment sustainabilityadvocacy I'm also CNF Ambassador I'veworked as devops engineer uh I foundedwomen in Cloud native community inDecember 2022 I'm going to talk moreabout that in today's panel discussionuh I met amazing people in thatcommunity so um yeah and I really lovebeing part of Open Source Community I'vebeen contributing to open source since2019 and I think it's going to go forlong forsure okay my name is stefanas I'm a cncfAmbassador I contribute to the CN CFglosary project that uh is theonboarding of New Commerce on the Clonenative uh ecosystem or more Communitymore then I organized some clone nativelocal meetups and I I contribute as wellto the visual and Visually Blind andVisually Impaired initiative or in shortBVI thank you uh everyone I'm sand Iwork as a Le software engineer Jen I amthe co-chair of the cncf death and thehard of youring working group and I'm amember of St contributor experience andalso a member of the D contributorstrategy uh I'm a person with hearingempowerment and I rely on captions andLi reading so if you see me looking atmy phone during this panel discussionI'm not checking my company emails I'mjust following what the speakers arespeaking and H just like you are here tolisten to us I'm also very excited tolisten to my co because because it isthe first time I'm meeting them inperson thankyou uh hi everybody my name is CA TaylorI'm an observability engineer andconsultant at Eon digital technology I'malso the co-chair or co-organizer of kcdAcra and I was co-chair of the one andonly debass Dev day that we had inChicago a couple years back um I'm happyto be here and I'm also a member of bipowhich is black indigenous peop�rrorprone manual analysis And onthe right side you can see that evenChad GPT thinks that you need a safetyhelmet when working with SQL Our firstattempt at solving these challenges wasthe workflow agent We used a singleagent approach where one LLM washandling all questions withoutspecialization As you can see on thegraph first we would process userquestion and detect its type Then wewould route this to either use Spotifyinternal knowledgebase or using rates or generate SQL toask billing data set And then based onthese answers we would just forward themto theuser Now this approach worked relativelywell but there were a few issues with itThe first one is SQL reliability We weregenerating SQL directly withoutvalidation or repairmechanisms and also system struggle withcomplex multiart questions that requiredifferent types of expertise To solvethese and other challenges we decided towork on the next iteration of the agentOne of the key challenges we identifiedis that LLMs often struggle with precisemath calculations which are essentialfor financial analysis Even advancedmodels can make small but significanterrors in calculations that compoundover time Our solution to that was toimplement React agent that generatedPython and executed it in sandbox Thisapproach allow us to delegate precisecalculations to Python to ensure greaterreliability And you can see the exampleof the code that agent generated uh tocompare cost difference To address SQLreliability issues we developedspecialized SQL expert agent withself-healing capabilities This agentgenerates contextually aware SQLspecifically optimized for BigQuerywhere we can keep most of our data Thekey change here is the self-healingmechanism that automatically detectsfail queries and repairs them It canmake up to three attempts to fix a querybefore failing back to alternativeapproaches The agent also handles resultformatting returning clean data back tothe main agent On the diagram you cansee that uh we have main querygeneration node that sends query to theexecution node If we encounter anyerrors we'll send that information backto the query generation node and try torepair it Once we have final workingsolution we'll format the results andsend it back to the user For questionsthat require external knowledge weimplemented web search capabilitiesusing React pattern which combinesreasoning and action for dynamic searchThe system can perform iterativerefinement improving search queriesbased on the initial results We leverageGoogle search API with Gemini models forgrounding to ensure that the system hasaccess to up-to-date information On thesearch flow diagram you can see thatReact agent would first think if itneeds to search for information and thensends it for retrieval If that's notenough it will do more searches uh asmuch as needed And then with when itthinks that it's enough it will returninformation back to users For complexquestions we needed a more sophisticatedplanning approach Our task planningarchitecture has three key capabilitiesThe first task decomposition breakscomplex questions into manageablesubtasks Then workflow orchestrationdetermines execution sequence withdependencies so we can optimally launchthem in sequence and in parallel FinallyAsian delegation maps tasks tospecialized Asian capabilities On theright you can see the example of thesimple task plan in JSON format So firstsystem decided to call SQL expert toretrieve last month's cost by projectThis data then getting into Pythonexpert that analyzes cost trends overtime which then calls cost engineeragent to identify anomalies and rootcauses A critical aspect of ourarchitecture is verification andreplanning We use LLM as a judgeapproach to verify answer quality andcompleteness When information isincomplete the system generates newtasks to fill the gaps It can alsoidentify inconsistencies and providefixes through error correction tasks Theverification flow diagram that we haveshows that after all the tasks areexecuted task results are being sent toreplanner node that analyzes if thisanswers original user questions If it'snot then it will generate new tasks tosend to execute node or if it'sincorrect it will also issue newcorrection tasks After those new tasksare executed reper node will assessagain if that answers user questions Ifit's yes then it will try to format thefinal answer to exclude all the thinkingprocess and present nicely formattedanswer to the user So putting it alltogether here is our complete multi-aent architecture The user query firstprocessed by planner note whichgenerates a list of tasks These tasksthen executed by specialized expertagents including Python internalknowledge raids data SQL queries orGoogle search After execution plannernote evaluates the results and eitherdelivers final answer or creates a newplan for additional execution This modelor approach allows us to continuouslyimprove individual components and expandcapabilities over time So let's look atthe concrete example of how systemreasons This example shows the systemsearching over Google docs to findrelevant information about cost analysisquestion We also have reasoning trace umthat shows how agent breaks down theproblem searches for informations andcombines finding to generatecomprehensive response Here's anotherexample showing Python agent performingmath calculations The trace shows thatwe first go to rate expert to getup-to-date RA information on a standardstorage And then thisinformation is being used by Pythonexpert that writes Python code andexecutes it in the sandbox This producesvery reliable responses every time andreduces hallucinations by a lot Tomeasure effectiveness we build acomprehensive evaluation infrastructureWe use open eval framework whichimplements LLM as a judge evaluationwith detailed reasoning We focus onbinary correctness assessment withdetailed reasoning capture to understandwhy response is correct or incorrect Ourdata pipeline flow consists of threemain steps First we extracted real userquestions from Slack channels Then weuse thread summarization to generatequestion answer pairs And finally weintegrated all of that data with GoogleSheets for easy access and analysisbecause we want to make it accessiblefor nontechnical users as well Ourinitial evaluation results showsignificant improvements The multi-agentapproach achieved 58% accuracy comparedto just 22% for workflow baselineapproach It's over 150% improvementThese evaluations based on real userquestions from Slack covering diversequestion types and were compared againstexpert reference answers The keyimprovements came from the multi- aentarchitecture with specialized expertiseand modern approaches to planning Thisdemonstrates that the complexity of thearchitecture justified by significantperformance improvements Now lookingahead we are scaling our AI agentsacross three main dimensions The firstone is compute We've learned that usingmore compute in general improves theperformance of the system So we areimplementing Nway search using Monteoltree search to more efficiently exploresolution spaces Essentially at each stepwe would ask LLM to provide let's sayfive instead of one answers and choosethe best one And for capabilities we'reaiming to develop 100 times morespecialized agents across differentdomains including infrastructuredeployments code that would create muchmore comprehensive and capable systemAnd for evaluation we're buildingmultimodel benchmarking thatcontinuously improve as new LMS would bereleased So we can make a call on what'sbetter and switch very fast And our endstate goal is an extensible architecturethat autonomously improves when new LLMsare released that in return creates asystem that gets better over timewithout us doing any work Thank you foryour attention and if you're buildingyour own agents I'd love to hear yourexperience You can connect with me onLinkedIn or an X Those who areinterested in exploring this further Irecommend checking out Langraph uh agentorchestration framework or opens libraryfor evaluation and also check outanthropic guide to building effectiveagents and I'm happy to answer anyquestions you might have2025-04-15 22:03:19.806834 � ��T�[#��_A2-fSMpCSYnw[Music]sweet i'm a Kubernetes contributor andmuch more importantly returning championfrom KubeCon Family Feud in Salt LakeCity joining me today is the mansecretly controlling this game he needsno introduction but I'm giving him oneanyway please put your hands togetherfor Tim Hawintim how are you feeling about this gameare you excited this is going to be thebest game ever amazing now we got apretty serious setup this time do youwant to maybe talk us through it a bityea �!�Z#��yAsTbJ1-x3_ychi everybody Today I'll be talking aboutautonomous AI agents for cloud costanalysis Specifically how we can buildintelligent systems that helporganizations understand and optimizetheir cloud spending We'll look at boththe planning and execution aspects ofthese systems and I'll share somepractical insights from real worldimplementations Before diving in let meintroduce myself I'm Ilia currentlyworking with cloud infrastructure anddeveloper tooling at Spotify I've beenfighting LLMs last two years with aspecific focus on rack system andproduction AI agents Now let's talkabout why cloud cost management is sochallenging The rising complexity ofmultiloud environments with theirintricate pricing models make itdifficult to maintain comprehensive viewof cost Access to cost data is oftenlimited to specialists and expertcreating bottlenecks in organizationsknowledge about cost optimization tendsto be siloed within teams rather thanbeing broadly accessible And finally themost organizations find themselves inreactive approach to managementresponding to high bills after theyarrive rather than proactivelyoptimizing beforehand These challengescreate the perfect opportunity for AIassistance Traditionally cost analysishas been very manual process withseveral challenges First of all there isa significant technical expertise gapYou need SQL proficiency and deepknowledge of cloud billing data modelsThis process is resource intensive anddiverts valuable engineering time awayfrom core product development And thereis inefficiencies of repetitive queriesand e h well we learned a few lessons fromlast time so this time we've broughtsome real buzzers and uh we changed theway we're controlling the game you'renot going to see that but we will ohawesome and you've This is set up on anactual Kubernetes cluster isn't it it isit's running on a little tiny GKEcluster that I spun up just for this ilove it it's only in this industry thatwe can take such a niche minute problemand completely overengineer it it reallymakes me proud to work in thisindustry well with that let's uh stoptalking about our hobby projects and uhlet's meet theteams starting on my right we have teamtabs we have Keith Maddox keef is anengineering lead for service mesh andnetworking at Microsoft he enjoys longwalks on the beach and helping usersavoid complexity in their service meshdeployments next up we have PriyankaSagu priyanka works at Sousay as aKubernetes integration engineer in thebusiness critical Linux department she'sone of the technical leads forKubernetes SIG contributor experiencelet's give it up forPriyanka next up we have Peter Huntpeter is a principal software engineerworking at Red Hat passionate about freesoftware Peter focuses on being chair ofNick Sign Node maintaining cryo andwriting or sometimes squashing bugs inand around Sig Node and containerruntimes outside of that Peter likescollecting floral printed paints uhcooking anddancing and finally on team tabs we haveAmit amit is a software engineer atLinkedIn living in Seattle he likesgardening and writes coupe cuddleplugins forfun and facing off against them today wehave team spaces starting off with CalebWoodbine caleb is a software andinfrastructure engineer in WellingtonNew Zealand who likes electronic musichiking andaviation next up we have Richer Bankerricher is a software engineer at Googlein Cal in California who is a hiker byfoot reader by corner and artist by whimnext up Sandeep Cannibar sandeep worksas a lead software engineer at Gen ifyou've been in the US those are on adsas Norton Lifellock he loves to breakthings just to see how they work thenfix them and then break them againbecause learning never stops give it upforSandy and last but not least we haveTabitha Sable tabitha Tabitha has nevermet a system she didn't want to takeapart she serves the Kubernetescommunity as co-chair of six securityand a member of the security responsecommittee tabitha's career is soaccomplished that I'm actually going tohave to go to my next card at workTabitha leads runtime infrastructuresecurity at data dog she writes exploitshub hardens infrastruct another card andbuilds the relationship between securityand infrastructure orgs outside of workshe loves cats capture the flag contestsand pretty much anything with wheelsgive it up for Tabitha[Music]now our teams today are not justcompeting competing for the fame andglory of winning family fortune althoughthat's already worth it in its own righttoday we have something a bit specialcompared to uh in Salt Lake City becausewe always take it a step further we haveour lovely gorgeous family fortunetrophies available for the winningteam which uh you're going to have it'san impressive almost 5 in tall reallyincredible stuffuh now who here was at the keynote in uhSalt Lake City and has watched thisbefore put your hand up uh who here haswatched Family Feud or Family Fortunebefore who here has never watched any ofthese things before okay so some of youhave been living under a rock but foryou guys I'm going to explain the rulesthis is a roundbased game we asked 100members of our contributor communitysome questions some serious some maybenot so serious uh those those answershave been collected together into aseries of questions and answers ourteams today are going to try and guesswhat the community thinks about somevery important topics go in Kubernetesthe team that gets the team that answerscorrectly and gets the most points willwin the glory of uh being of being KCONfamily fortune championsrounds will rounds will start with ahead-to-head round with one member fromeach team after that we will go to theteam that wins the head-to-head who willtry and get as many points as they canbut don't get three free qu three freeanswers wrong if you do you will hearthis soundTim if you get three of those wrong wewill go to the other team who will havethe potential to steal all of the pointsfrom the round so be careful with thatthough I think it's time to start roundone tim let's start round one and let'shave Keith and Tabuffa come outkeith are you uh are you a bit uhconcerned you know Tabitha's got such anaccomplished career it's going to bequite hard to beat her on thisheadto-head i'm going to do my bestgoing to do your best that's all you cando yes all right round one name aKubernetes feature that has caused aproductionincident who was that was that Keithyou got it first all right Keith let'sdo it livveness probes liveness probestim do we have anything about probes yeswe do congratulations all right let'spick it over to teamtabs all right Priyanka uh how many howmany production incidents would you sayyou've had in Kubernetes have you gotexperience here um yes I was a Sokay well if you get this wrong I willbe judging you then yeah you can'tbecause I was there sorry two jobs backso all right then well then this shouldbe easy for you name a Kubernetesfeature that has caused a productionincidentum you were an S you just said thisyeah a feature okay let me think youalso think I I know SRRES don't work onfeatures much but my godum how how about CRDs count as featureCRDs Tim do we have CRDsoh well you need to go back into doingSR Ithink peter let's bring back to yourteam it's uh name a Kubernetes featurethat has caused a production incidenthow about pods pods tim do we haveanything about podsnolike all of them would be pods but okayi feel like this I feel like this teamare dangerously good at running theirclusters if they can't think of anyfeatures that cause an incident amtmaybe uh you've had less luck withrunning clusters can you think of afeature that's caused a productionincident dns dns i mean it is always DNSisn't it and it is really alwaysDNS all right Keith you can't afford toget another one wrong but you you youyou did you got it right last time isthis saying anything about the thereliability of Kubernetes clusters atMicrosoft or none whatsoever okay okayall right then name a Kubernetes featurethat has caused a production incidenti'm going to say network policy networkpolicies tim do we have network policiesyes wedo all right Pianca it's time to redeemyourself is resources limits of featureresources and CPU memory limitsmaybe we It's there you don't have to goback into SREnow peter it's uh give Have you got uhany more in the tank how about uhcertificate rotation certificaterotation that's a good one uh do we haveanything about uh certificates or No wedo not and that's free strike so thatmeans that team spaces has the chance tosteal all the points you're going tohave to confer you have one shot to dothis name a Kubernetes feature that hascaused a productionincident they're very they'reconcentrating a lot right now it's likewhen there's a production incident forrealall right autoscaling autoscaling do wehave autoscalingTim we dowell that means that unfortunately foryou team tabs team spaces gets thepoints maybe uh maybe be less good atrunning clusters so you can see more oftheseincidents all right then let's move itlet's move it on then to round twounmask the other ones oh yes absolutelyyes i almost forgot that let's see whatwas the numbertwo admission web hooks what aboutnumberfive stateful set the best the bestobject in Kubernetes in my opinionperfect uh and in numbereight name one thathasn't veryintrospective all right then with thatlet's start round two uh Priyanka andSandeep come ondown now oh now there is a slight changehere sundep is uh deaf and we don't wantto disadvantage him so he has thequestion card right there ready to goand as I start to say the questionthat's going to be ripped off and he'sgoing to read it all right here we gowhat is What is something you should dobefore updating a deploymentbianca buzzed in first bianca buzz infirst backup take a backup take a backuptim do we have anything about backupsbackup everything yes you should ithought youAll rightPeter you uh had much uh experience uhwith this no I've never I've never runKubernetes in production no me neitherhonestly it's all theoretical it's uhYeah I'm I'm running Open Shift rightexactlywell with your best guess then maybename something you think you should dobefore updating a deploymentuh set up uh blue green set up bluegreen tim do we have anything out bluegreen oh that's Oh too bad you should uhtry Kubernetes one day though maybe itwill help himamit name something you should do beforeupdating a deploymentoh my god uh ensure you have enoughcapacity or CPU memory ensure you haveenough capacity tim do we have anythingabout capacityoh my word oh this team they're so goodat winning the uh headto-head but thenthey just can't get any right after thatit's like a curseright Keith killing me you've been uhyou've been really good so far for yourteam can you bring it back for them nowwhat is something you should do beforeupdating a deploymentoh the pressuredeal with the pressure of like beforeyou update the deployment or deal withthe pressure right now the pressureright now oh goodnessum let'ssay check your metrics check yourmetrics tim do we have anything aboutmetrics atall no we do not well this is theopportunity for spaces to steal a whole17points a lot writing on this tabifer youlook prepared you look like you knowwhat you're going to say and you shouldwith your such accomplished career say aprayer say a prayer tim do we havepray we do and it's a toughanswer well this just shows how thisjust shows how accomplished Tabitha ishere right what else do we have thoughlet's see what uh was missed can we seenumber two please do a dry run uh veryfew do in fact who in the audience doesdry runs before updating theirdeployments put your hand up that's like10 of you i'm judging the rest of you atthis pointyeah all right what about number fourwhat do we have there test it in stagingwhat about Do some of you at least havestaging put your hand up you have stOkay good at least half of you now putyour hand up again if your stagingactually worksokay less hands than before uh rightwhat about numberfive bloodsacrifice now this is one I ampersonally invested in i think this is agreat way to do deployment updates butit's just not caught on i think that'sthe rest of the industry's fault thoughwhat about numbersix verify context and namespace whohere has ever uh put Yep yep put handsup hands up in shame comeon all right numberseven gut check are you sure about thisi I personally can't do this because theanswer is always no um but the rest ofyou maybe you'd have more luck andnumbereight that's the energy we want that'sthe energy we want when runningproduction all right with that it lookslike uh team spaces is 114 points aheadteam Tabs may need to uh catch up maybethey maybe they need some help briberyis an option this game is rigged and Ido make the rules so you know with thatthough uh let's start roundthree all right Peter and uh Reachercome ondown all right you ready for this youare well preparedno no the answer okay perfect timing tostart then what is something firsttimers should do at KubeConrichwith a lightning that was meet with newpeople meet with or talk to new peopletim do you have a And it's the numberone answer number one answer all rightwho here who here is it their first timeat KubeCon put your hand up all rightnow you fall into my trap everyone lookput your hands up again keep them upeveryone look around these people yousee here I want you to go talk to themafter this session okay make a noteright now good you're really you reallywant to talk to people have you beenalone is there anythingwrong all right uh let's start then uhwith uh Caleb because you're down fromreacher caleb what is something first-timers should do at CubeCon i mean youyou're a first- timer right you've notbeen to a lot of CubeCons how many haveyou been to maybe five or six real firsttimer complete new to this yeah uhplease someone help me to find what'saround KubeCon cuz I'm Yeah yeah youknow what Kubernetes is you ever heardof this um I mean the other teamapparently Peter doesn't know either soKubernetes is a distribution of DockerSwarm right yeah yeah yeah i think it'ssomething like that yeah yeah yeah andit's um small uh family like uh it'sartisal like Yeah it's a beautiful thingi think when I look at the service API Iwould say it's very artisalall right uh Caleb what is somethingfirst- timerrs should do at KubeConattend a talk attend a talk do we have atendertalkno you spend all this money and what doyou do you shouldn't even go to themcomplete waste of time apparentlyespecially this one never mind all rightTabifer what is something first timersshould do at CubeCon acquire vendor swagacquire vendor swag yes I like that isit on there though is it on there yesabsolutely[Applause]all rightSandy what is something first timersshould do at KubeConso I think you should watch the demobecause it is a really good chance towatch the demo because only if you watchthe demo you can grab the swag that tobe tested something related to swagwatching the demo to get the swag payingfor it with your attention is it onthere oh no unfortunately uh you shouldjust trick them into giving you the swagwithout having to go to the demo at allall right uh Reacher what is somethingfirst- timers should do at CubeCon isthis your first time you know you'vebeen around the community much yeah I'veattended two two CubeCons before thesecond okay well then you should havesome level of experience herewhat did you do on your first CubeCon irealized on my first CubeCon that youshould be wearing good shoes because youhave to walk a lot oh that's a great onei love it is it on there though is it uphere oh no no you should just destroyyour body for this uh job absolutely allright well team tabs there is anopportunity here to steal 38 pointswhich gets you you just need to do thisthree more times and you'd be back inthe lead i mean it's very closelet's say go to an afterparty so uh thebooth crawl something like that go go toa party tim do we have go to a partywe doincredible stuff all right then let'sLet's see obviously these people hereare not very good first- timers cuz theycouldn't guess most of these but let'ssee what they missed let's see numbertwo pleaseselfie with a K8 maintainer right whohere's a K8 maintainer so the rest ofthe audience know who to take a selfiewith behind you there's one behind methat's crazy that's crazy i think Ithink this guy might be one as well idon't know i've never really heard ofhim before never heard of him all rightuh let's see numberthree take it easy and drink water uhyeah surprisingly few people dothis yes yes absolutely steal that guy'swater bottle and that one's all rightlet's see numberfour visit the project pavilion the bestpart of the solution showcase area in myview it's the only place you won't getaggressively soldto all right and numberseven find Kelsey HighTower is he here kelsey are you herethis is like a really long and messed upversion of Where's Waldoor I should say where's Wally we're inthe UK uh all right well with that TeamTabs gets the points all right let'sstart roundfour so let's have uh Amit and uh Caleblet's come on down amt you uh you needto smash this round to get back what isyour strategy here how are you going todo it i've been practicing you've beenpracticing you are ready to just don'tdestroy that buzzard please it was like$50 or something on Amazon it's not thetop of the linebuzzer allright what is something you can sayabout your clusters but not your familycaleb wins the buzz calebnow I have to answerit's okay no one is judging or recordingthis i think both of those are wrongi'm going to have to an answer huh uhtwo the one this is not helping uh thisis not helping great all right AMT youget the chance to take it too big[Laughter]you're going to have to repeat this cuzI didn't hear you too big too big plusthere's a family right uh do we have toobig or anything about it being too largetimtoo expensive sure I'll take that that'snot what I said but sure all right thenteam Tabs this is your chance this isyour redemption arc right now calebThunder fumbled it so now Keith uh howhow uh how big is your family and is itbigger than your cluster no my family isnot bigger than my cluster okay wellwe'll see if you can uh pull anythingout what is something you can say aboutyour clusters but not your family wellyou know uh I've been married for aboutsix years and uh it wouldn't last verylong if I called my wife a pet or cattleokay so cool uh pet or cattle what doyou think about that Timohno oh maybe you should try nodon'tpriyanka what is something you can sayabout your clusters but not your familyi want to say a lot of things but thenI'm you work with your familyuh howabout they don't scale well they don'tscale up and down tim do we add anythingabout scaling autoscaling anything likethatyes it's onthere one of those really annoyingproblems about families isn't it Peteryes so no matter how tardy my family isI don't think that they would takenicely to become being being calledeventually consistent families are noteventually consistent i would say thatmaybe the process around babies and thebirds and the bees might be but you knowbut have we got anything about eventualconsistency anything aboutreconciliation Timoh no see you should have thought aboutthat with babies first all right AMTwhat is something you can say about yourclusters but not your familyunreliableunreliable tim anything aboutreliabilityfalling apart[Applause]love it love it all right well Tabsyou're still in it this could be theredemption arc here i I'd like to pauseand remind people these are real answersyes our contributor community is messedupall right Keith this could be it all youneed is another what you need to getlike pretty much the whole board andthen they don't get it but you know itcouldhappen what is something you can sayabout your clusters but not your familyi can't curse at my family you can curseat your family but not your Are you sureit's not the other way round like itfeels okay anything about cursingswearing or anything like thatTim oh no don't swear about that thoughplease we're on like stage all rightteam spaces have you had a good thinkhere maybe the contributor community isa family a really messed up one but youknowall right Tabitha they're all horriblyinsecure insecurewow youYou can tell Tabitha's from sicksecurity and you can also tell a lotmore with that answer all right Timanything about securityohno oh well well that means that uh thepoints go to team Taz but let's see therest of the answers then so what do wehave as number one and we shouldremember here as Tim said real peoplesaid said this stuff so let's see numberone you cannot recreate your family fromYAML uh numberfour it does what you tell it to do idon't think this applies to either ofthese things i don't know what these 12people were thinking but you know uhnumberfive cube cuddle delete node yeah that'salso frowned upon in society to do thatto your family numbersix have a second one in anotherregion doesn't everyone do this i'm I'mthe odd one out here okay and numbereight don't remember theirnames again all of these things likeapply to both i don't see what the issueis here but anyway with that let's let'suh move on to roundfive it's uh Keith and Tabitha come ondown it's a rematch i know right it's arematch this is probably our last roundright Lucy yeah i think this might be itnow i There's a countdown clock you allcan't see here but I've been told if Igo over that they're going to come anddrag us off this stage so I have to bevery careful all right Keith this is thechance for redemption of team tabs orteam spaces are going to take it andthen no one's going to be able to usetabs in the project everagain what is a subtle sign your clusteris becoming self-awarekeith wins the buzz offkeith I'm going tosay it autoscales correctly the firsttime it auto scales correctly uh Tim dowe have anything about scaling in thereno all rightTabitha uh what a subtle sign yourcluster is becoming self-aware you're insix security so this should be like easyfor youoh I never encounter arback issues whenI try to apply something no permissionsissues tim we got anything aboutpermissionsno wow i thought this AI hype train wasmeant to be like rolling you should allknow this at this point all right Keithwhat is the subtle sign your cluster isbecoming self-aware that wasn't yourlast answerit actually does what I tell it to do itdoes what you tell it to Tim anythingabout it doing what it tells you to door working it justworks i mean that is like a dangeroussignpriyanka uh you ever had a cluster goself-aware and go rogue or are you notat that point in this AI hype train yetdid we talk about scalability scalesit's It scale it scales what do you meanby thatit scales up and down what like itselfi think Yeah I think I think Keith mighthave said that but you can have it againif you want another strike i mean you'recoolthis is just like using an LLM it takeslike ages torespond to right hereum what was the question again thequestionWow this really is like an LM what is asubtle sign your cluster is becomingself-aware too reliabletoo reliable i mean we've had it justworks but I'll I'll take it anyway timwe have anything too reliablereliability we do not unfortunatehallucinating just like an LLM as wellincorrect answer all right Peter what isa subtle sign your cluster is becomingself-aware cbd starts forgetting stuffcd forgets stuff it's distracted i thinkthat just means that your uh hosts arecompletely messed up but sure we'll takeit anything about Etsy or storage Timno all right Amit to bring it back forteam tabs this is This could be theredemption arc if you get this nopressure no pressure at all it does oncall for itself it does on call foritself tim anything about oncall nowell team spaces I will I will say Ithink that you've won anyway but just toseal the deal what is a subtle sign yourcluster is becoming self-awarei feel like we'll have to pick a betterrepresentative sample of players nexttime i mean apparently so I will saythat we had a free people decline and itmust have just been thatAll right Tabitha what have you got itshuts itself down it shuts itselfdown uh Tim anything about shuttingitself down no well team Tabs gets thepoints with an impressive one correctanswer after like what was that were theFA the headto-head we had like fourattempts then we had four like threefour over here then one over here wellthese must be really incredible thenlet's see what they are let's see numbertwo it argues with you i don't need mycluster to do that i argue with myselfalready uh numberthree how did you not get that that Thatreally is a cult classic uh numberfour renames itself Skynet deploys moreclusters i'd say that's two at once butsure i mean look if my cluster deployedmore clusters I think it would behelping me let's see numberfive get philosophical in the logsand uh Tabitha you work at Data Dog howwould you feel aboutthat that That's a good facialexpressionwe'd have to scale Log's back end quitea lot I would imagine so all rightnumbersix demands labor rightskubernetes was built by engineers in theUS i don't think it would dothat and numberseven I'm pretty sure it already isself-aware uh for all we know it is andit's just too smart to tell us with thatwow that is close that that So at theend of this game team Tabs has 113points but Team Space 114 we have ourwinnerscongratulationscongratulations all right and uh and uhwith that uh let's give it up for teamspaces one more time uh we look forwardto doing this again with even moremessed up questions and even in an evencrazier set of contestants but untilthen we will see you next time[Applause][Music]2025-04-15 22:03:20.324806 ��a�]#�{AnjNXlZNT3dwsince 2022 Google engineers incollaboration with the OSS communityhave been building Q cloudnative queuingsystem for AIworkloads my name is Patrick Bunda i'mone of the Google engineers working onthat and let me tell you a little bitabout this very cool piece ofsoftware imagine you wanted to enter theplace accidentally called Kubernetescluster there's a line of people and abouncer decides who can enter and whohas to wait q acts like this bouncerdeciding which job or pot�w�\#�'A0adVcinYGC8hello everybody What an honor to get tokick off the lightning talks Hope you'relooking forward to it I am Um I wouldlike to spend my five minutes to talk toyou about high availability And by highavailability what I mean is that yourusers are happy That's fundamentallywhat we care about And your users arehappy when your requests succeed in auseful amount of timeNow success is not an internal servererror and it is especially not a 503unavailable If you return 503unavailable from a highly availableservice you're aquitter All right Well it seems that uhDr Ryan Reynolds here isskeptical And the thing is Dr Reynoldshas a point503 unavailable is supposed to be aretryableerror If you do a retry and you succeedquickly enough your user is still happyAnd if your user is happy according tomy definition you are still highlyavailable So why do we do so much workto avoid as engineers sending a 503 toour clients this is the architecturediagram for K native These two boxesexist only to not send a 503 unavailableresponse when your application iscurrently in facttemporarilyunavailable So there's a different wayand we recently took that different wayin a project that I work on Um theproject is called reboot It's anopen-source framework to developcloudnative applications This beingcloud native it's of course based on allof the CNCFtechnologies And Reboot unlike otherframeworks sprinkles 503s around as ifthey are free And we expect the clientto retry theseerrors If there's a network error it'sthe same We expect the client to retryAnd the nice thing for me as an engineerbuilding the back end is that returninga 503 is reallyeasy That is if you can count to threeSo count with me Number one all of yourclients have to retry when there's anerror If your clients don't retry thismodel doesn't work In reboot that's easyfor us because reboot is essentially anextension of gRPC So like gRPC wegenerate our client libraries But unlikegRPC the client libraries have retriesturned on by default So in our case wetrust our clients toretry Number two your user experiencehas to allow time for these retries tohappenAnd most of the time you have that timeYou see most of the time your client isa user is a human and humans are slowcomputers are fast So you have time todo retries In the case of reboot ourmost popular client library is for ReactAnd React that's a that's a browserThat's a human sitting in front of acomputer So we havetime Finally and most complicatedly yourretries have to be safe And by safe Imean item potent If you haven't heardthat word before it basically works likethis Hey give mea,000 I said give mea,000 What I want to happen here is foryou to give me a,000 pounds once nottwice But I said it twice So to makeyour APIs item potent you need twothings The client needs to attach an IDto each request That part's easy But thedifficult part is the service needs toremember what operations it completedand which not reboot has it easy hereIt's a stateful framework So it can takecare of it for you But if you do nothave a framework that takes care of thisthen it is your responsibility my dearapplication developers to make sure thatyour APIs are item potent The good newsis that look if you're not item potentit's not safe But if you are item potentand your app can count to three then youtoo can return a 503 unavailable andremain highly available even safelyavailable Thank you[Applause]2025-04-15 22:03:20.927595 can enter thecluster and obviously as the namesuggests it also cuesthem q works with various job types suchas bot job Qflow job ray job pods andmore it abstracts them with an umbrellaCRD calledworkload let's circle back to the lineoften when people wait in line they cantell how many people are in front ofthem unfortunately with Q and workloadsit wasn't that easy and we have to giveit a thought on how to expose thatinformation to auser one of the first idea that can cometo mind is to extend the workload objectwith its position simply every timesomething changes in the queue forexample workloads gets admitted we wouldupdate thepositions while it may work on a smallscale could this work on a large scalelet's say tens of thousands of workloadscould the API server or etc handle somanyupdates well the short answer is notreally at some point the API serverwould struggle to keep up with the paceso we need to come up with somethingdifferent maybe maybe we could have acompletely separate CRD that stores theorder of the queue it's almost as if ourbouncer had a list with the order of theguests this way on every change we wouldonly need to update a single object inetc mark it on the list and it's donesounds lovely be because we only have toupdate a single object and we don't haveto worry about QPS but could this workwell certainly but there is only so muchwe can store in a single etc object ithas a size limit so we wouldn't be ableto fit all the guestshere okay let's take a step back andthink about how CRDs work when a userwant to make any action they create arequest that goes to the API server andthen toetc however besides CRDs there's alsoanother way of extending the KubernetesAPI called Kubernetes API aggregationlayer it comes in very handy when weneed to deal with a lot of changes andetc may not be the most suitable storagefor that with app aggregation layer wecan replace the etc with differentstorage it may be a banana it may be awashing machine if it implements theproper interface then why not or like inour case it may be Q we know the orderof workloads in Q so why not use thatinformation this is what we did andindeed it solved our puzzle what happensunderneath is Q stores all the workloadsin heap structure it doesn't assignparticular position for every workloadbut we always know what's the head ofthe heap so we know who's first then Qtake takes a snapshot of this heap soit's not only a bouncer but alsopaparazzi as it seems it converts thesnapshot into a list of workloads withappropriate positions assigned lastimportant thing is that we want to bereally lazy about it q doesn't do thatunless user asks about it that way wecan ensure it's highlyperformant let's now compare these twoways of extending Kubernetes API crdstore the state of G object in etc hencethey are very easy to set up but also itthey come with some limitation and maybenot as performant in our cases like uhlike ours they are used extensivelyacross the whole Kubernetes ecosystemservice monitor job set network policyjust to name a few aggregation layer onthe other hand use its own storagesolution because of that it comes with abigger overhead as it's harder to set upand requires some addinational codinghowever in cases like ours or forexample when you need to periodicallycollect metrics it can be much moreperformantit's also less popular than CRDs and itmight be harder to fight somebody with asimilar issues like yours as almostalways choosing the right tool is an actof balance in this case it's a balancebetween performance and ease of use in Qwe want to use both of those best ofthose both words so we so the vastmajority of our logic relies on CRDs butwhen it comes to positioning workloadswe leverage AP aggregation layerchoose your tooling wisely and thank youfor having me[Applause]2025-04-15 22:03:21.458669n get the best ROI from yourobser data especially if you're usingopen telemetry because that's an areawhere we have lot of experiences in andwe have helped lots of customers uhmonitor that data uh this is going to bea bit of deep technical dive but I hopesome of the people who use it would findit helpful um yeah so this is a generalarchitecture will which people will haveuh you'll have a VM where application isrunning and you'll have a SDK which issending data either to anal collector ordirectly to your uh authority back endlike Signos or Prometheusand what you want to do is figure outlike how you can optimize this dataright uh some of the strategies which wehave found useful are like you can usesampling so at the hotel collector manypeople don't know that uh hotelcollector is pretty powerful uh it hasthree components receivers processorsand exporters and in processors you'reallowed to do a lot of things you candrop attributes you can filter thingsyou can do dduping etc uh you havedifferent type of sampling processorsand some of the types of samplingprocesses which we have seen people finduseful are database sampling processorsheadbased sampling and even you can doprobabilistic sampling for logs right soif you don't need all the logs for youruse cases you can do sampling on thatand then send uh data to your back endbecause generally a back end charge youfor the amount of data you send in someform or other so if you do more carefuloptimization of what you are doing atthe collector level you'll get much moreROI of the data which you'resending uh next is you can do much moregranular control of uh processing andfiltering of data at the hotel collectorlevel uh hotel collector provides lotsof processors like Kubernetes attributesprocessor filter processor and you canconfigure that to figure out how muchdata you're actually sending out of theuh hotel collector and to your authorityback end if you do this uh well you canoptimize a lot on like filtering out theextra data which you send which you'reactually not using but are just sendingbecause you don't know what you'resendingwe have seen many people help reducetheir logs volume a lot by justfiltering based on the criticalitycriticality of the service and theseverity level for example you can startignoring info logs and not send to yourauthority back end and just send errorand warning logs and that would help yousave lot of your u data and retentioncostsuh another interesting thing whichpeople don't know is even at the SDKlevel you can do lot of controllingbased on attributes uh which you'resending for example you can control andsend only a particular HTTP attributesand ignore the rest of the headers rightand this would help you reduce the datacost which you have when you're sendingdata to your backend and of coursecardality we have seen many people uselike send metrics which are they're notusing at all in their dashboards oralerts that card that leads to cardexplosion because you're sending lots oflabels and the number of time seriesincreases a lot but if you're not usingany in any of the dashboards and oralerts why are you using it so it makessense to do a careful review of whatattributes and levels you are sendingand if you can drop it at the collectorlevel etc uh this is another thing whichuh sort of is not well known but hotelSDKs provides something called views andwhen you are auto instrumenting youropen telemetryuh applications with open telemetry thedefault default um instrumentation getsgives you lots lots of metrics but youcan send save that a lot if you are uhsort of using views and customizemetrics that you're sending and ofcourse you should look about like uhhaving more granular retention settingsfor uh reducing interoperability cost soyeah those are some of the uh tips andtactics which we have learned from ourusers hope you find them helpful and uhif you want to learn more about what youcan do with open telemetry or arelooking for a back end uh just check usout we are on GitHub uh it'ssignos/signos so yeah hope you find ithelpful thanks2025-04-15 22:03:22.172416 �5 ��E�b#�CAnoliQiyacGoall righty Well then hello everybody Myname is Juan Rosco and I am a DevOpsmanager at Bosch Connected Industry andtoday I would like to share with you howmy team accomplishes precision updatesfor continuous manufacturingoperations So first a little bit ofcontext Bosch is a leaded manufacturerof automotive components electronicspower tools appliances and much more Andwe have more than 250 plants worldwideNow to support these operations we havebuilt our own manufacturing executionsystem or mees If you're not familiarwith this this is the software that isin charge of basically driving all themanufacturing and logistics processes ata plant In our case this softwareintegrates over 30 different softwaremodules Some examples are shufflermanagement line control parttraceability and intra logistics Now ourSAS version of this MEES software isrunning on Kubernetes clusters which arehosted by a public cloud provider Nowsince the public cloud provider �M�a#�SAQmUVhzdlMIIi'm here to share a little bit about howwe use Quark to help us prepare forscale as we manage our large fleet ofdata centers so if I had 30 seconds toleave you with something to take awayfrom this talk it would be what is Quarkand why is it that you might find thatinteresting so Quark is basically a toolyou can use to simulate a large numberof nodes or pods on a very minimalresource requirement so you could use asingle laptop or a very small Kubernetescluster now why quawk so if you managecontrol plane services whose behaviorwould probably be affected by the numberof nodes in the cluster or the state ofthose nodes then quark might be a greattoolkit for you to play around withscale so now now that�o�`#�A20eoMgq5lbYhello I'm here to speak about crapstoday so and how this crap conquers thecloud native landscape so we'll speakabout the programming language Rust i'mSasha i'm maintaining multiple CNCFprojects for now and some of them arealso a bit rusty you know so let's lookat all those projects we have right nowin the CNCF which utilize Rust for theirproblem solution aspects sointerestingly we have projects which arepart of the landscape since a longerperiod of time like tick vi but we alsoh��_#�GAf6gYxJOr0yQmy name is Dan um an instructor fromlearn Kubernetes and today I'm going totalk about restores and requests andlimits so you you're probably familiarwith this scenario uh you've got acluster two nodes and one has got somespace the other doesn't now you probablyknow that the pod will be landed on onthe on the only node on the left but umwhat happens when there are no requestsright how does Kubernetes know wherethis pod is going to be located wellthere is no information so Kubernetesdoesn't know where where to put it youcan put it either way so to to preventthis problem what we do is we setrequests right and um and that'sbasicall�?�^#�7AxE3iMfib2LAthanks for coming for my talk i am Prayi am one of the co-founders andmaintainers at Signos and I'm going totalk about uh observ we all know peoplehave lots of obserity data it's mostly aconcern with them on how to manage itand um before that just a bit aboutsignals so signals is an open sourceobser platform uh we are open telemetrynative and we have traces logs andmetrics in a single pane um and these isthe the points which are sharing now arefrom our conversations with customersand our users who have always complainedabout hey how can I control like lots ofobserity data which we get um andthere's usually question from theirmanagers generally a finance team personthat hey we are paying so much forobservity uh are we getting ROI from itwhy is it valuable for our organizationRight and generally developers wouldneed to justify it that hey this is whatit helps us do etc u what I'm going totalk about is just a few strategies onhow you cay what we do we open we we openthis YAML file and we write the requestsin the SP podspecification but requests comes alsowith something else called limit and ontop of that you need to think about thatthis application can fluctuate the usagebetween the request and the limit itdoesn't end there though the applicationcould use less resources than therequest that you define so here we havea couple of scenarios so you can definethe request and the requests are alwaysguaranteed so if you say 200 milloresthen those cores are gone but if theapplication uses a fraction of that thenthe rest is underutilized no one cantake it away from you so if you were tochart this to um if you were to chartthis you will see that you're basicallywasting all the resources that you yourequested but you're not using now whatabout the opposite right you could havea request you could have a limit and onaverage the actual user for your foryour application is always over therequest how is this bad well generallyyou're using more than what you declaredokay and this could cause quite a lot ofproblems because if all pods were to usethat much that much resources then therewill be contention right so so you'rebasically in between two scenarios thefirst scenario is okay pod arecontaining for resources and you get P99latency you see some very bad umdeveloper experience on the other sideis okay I can give more resources butI'm actually paying more right so you'reyou're in between these two things andit gets even worse because if yourapplication has got high spikes oneither side then it's very hard for youto actually set the right request andlimits it could be asymmetric as wellbut what you really want is basicallyhave a very very narrow um sort ofinterval where the request stays mostlyclose to what you define so how do we dothat well if we have applications thathave got wide intervals what we could dois we could actually divide theseintervals into smaller intervals and foreach of them we could actually considerjust the the smaller interval instead ofa full interval for the request so if wedo that for all you know for the for therest of the time then what we could whatwe could see is that that theseintervals where defined the requests aremuch narrower than than the overallinterval that we startedwith and so we just translated theproblem of having like very huge spikeson memory for example to very havinglike very small and defined periodintervals with with narrow requestsbut I cheated and the way I cheated isbasically I knew in advance where theserequests were going to go but most ofthe time when you're running workloadsyou don't know if the request is goingto go up it's going to stay the same orit's going to go down you you don't evenknow how big the interval is going to beright and this is basically where we seeinnovation in in the space and a simpletool that you might have heard of is avertical portal to scaler and that'sbasically exactly how it works it willbasically divide the time and then justtrying to predict based on the pathperformance what should be the nextrequest right so that's how the verticalport autotoscaler now the vertical portautoscaler has got a very simplemechanism to uh detect that so there areother toolings that you find in theecosystem that do that um with moreadvanced machine learning model so youmight have heard storm forge pepperscale um tens Qbacks from densifiersscaleops there are many more i think theimportant thing to remember and I thinkthis is news from from this week is thatmost of these are actually beingacquired so we see the the market beingconsolidated as more and more companyare using this product to optimize therequest and limits okay so take awayalways set yourrequests and there are two basically twoscenarios for for these requests you goover and under and then ideally youdon't just set them once you you youkeep you keep adapting you keep changingthem ideally with some tooling um opensource tooling or tooling that you canfind in the in theecosystem that was me thank you verymuch for listening2025-04-15 22:03:22.682120ave like web assembly related projectslike wasome edge wasn't cloud um butalso cube warden which is a policyengine written completely in rust and ifyou look closer uh into that picturethen you will see that there are alsosome projects like cryo and containerdwhich also utilize rust part at leastpartially for their projects um but youall know that they are mainly written inGolang right now which is prettyinteresting we have like 250 millionlines of code Rust and more than 370repositories in the CNCF so that makesit number six uh topmost programminglanguages in the whole cloud nativeecosystem now how can we do somethingactually in Rust so just rewritingsomething in Rust is probably not theright solution well at some point it isbut you still have to if you rewritesomething you will find out that youhave to restructure it to be moremodular and extendable for the futureuse cases so we made the same experiencein cryo we had a dedicated tool wouldlike to rewrite it in rust we rewrote inrust and then we just had to change thearchitecture while rewriting and thatwas a bit problematic because you alsohave to change the testing and if youchange the tests then the output mightbe different than beforebut good interfaces will help us tocross those language borders um thecommand line interface is probably themost natural one when we speak aboutLinux and you can also utilize RPCsystems like GPC or captain proto gc hadthe issue in the past that um it was notas performant as the Golangimplementation but this is somehowresolved there are still some partiallyum some small issues open for examplewhen it comes to handling Unix do themain sockets um but that's something youjust have to be aware of and on theother side you have the web assemblyruntimes like web assembly runtimes aremore or less like awesome to handle Rustcode but for using like differentlibraries in your projects there arehuge amount of maturity differences ifyou look like how they are used forexample in Golang considered to rust andalso those two chains can move fast sowe had the issue that we would like toimplement features in rust which requirehigher rust to chain or a new rust tochain and the issue was that just themajor distributions are not fast enoughfor us so we had to make them make theapplication backwards compatible to alsowork on older rust versionsso how we could now fill the gaps Ithink one of the best examples we haveright now is Yuki um the first thing wehad to do for Yuki was to find a way toactually do the OCI spec in Rust the OCIspec is just written in Golang but it'slike a definition for how containers andruntimes should behave and the coolthing about this is that we just spoketo existing maintainers for the OCI specand they said "Hey yeah you can takeover the crate but you have to maintainit." And funny enough we created this uhOCI Spec RS project which now has morethan like 700 users since 2021 uh so wemaintain it and it's actually being usedthat's a a win-winand Yuki is an OCI compatible uh runtimecomparable to run C or C run it is wayfaster than run C it is writtencompletely in Rust and it's now CNCF andthe CNCF sandbox since 2024 so if youwould like to understand how it shouldbe made and uh what is the great greatway to understand how container runtimeswork then I can just recommend you tocheck out Yukibut to wrap this up I would like just tosay that be careful and prepared whenconsidering Rust for your applicationisolated functionality is always a goodway to start with and to challenge likeGolang implementations to find out ifthe problem you're solving is actuallyfaster or does it make more sense whenusing a different programming languagethan than Golang um and like theintegration of like hot topics you knowit's all a IML nowadays um could shouldbe considered to do completely in rustbecause rust is like memory safe itlooks for performance constraints and italso targets like environments forvariousarchitectures so then I'm running nowout of time but I'm happy to chat withyou Rust after that and I wish to thankyou all thanks2025-04-15 22:03:23.322561 I've set that upand um you know what quark and why youwould need it I'd like to share a littlebit about how we used it and uh it mighteven extrapolate a use case you mighthave so firstly I'd like to get startedgiving you a background about our scaleso we have about 80 data centersworldwide which are production clustersand we deploy about 50 microservices perdata center so the resource crunch ispretty heavy for us so now uh yeah solooking at the scale of the clustersthemselves so typically our devenvironments have about 50 nodes and ourproduction are about 800 nodes i thinkour largest one is 1250 nodes and uh theworkload if you look at that we haveabout 1500 pods running on our devenvironments and about 18,000 podsrunning on our production clusters sothe scale difference between our dev andprod are pretty huge so anytime wedevelop or implement something new wetest it against our dev clusters ofcourse but we have to make sure that itcan work at the scale of our productionenvironments so something that we haveto keep in mind is to always build forscale and this is something I'll keepreferring back to um so now let's getinto the spec specifics of this use caseI've been talking about so um we deploymost of our services using Helm chartsand many of our services have demon setsso the default uh deployment behaviorfor a demon set if you don't have anyspecific tolerations uh defined is thatit will try to launch one part of yourdemon set on every node that matchesyour criteria so any nodes that are notready it will automatically skip it fromrollout but we had a special case wheresome of our demon sets defined aspecific toleration which said I want totolerate all taints this meant that itwould also try to launch the nodes uhpods on nodes which are not ready so thepods get stuck pending forever thiscaused our demon set rollout to failwhich marked our Helm deployments asfailure so but our criteria we wanted todeploy it on all nodes possible but skipthe nodes that are not ready so how didwe handle that so as anyautomationdriven team would do we firstwrote a script so in our first iterationa script would basically get the list ofnodes check how many of that match thedeployment criteria and if the pods arecoming up okay on those it will mark theroll out as success so we were happywith that we started rolling this outacross the fleet now this worked well onsome of our initial clusters but as westarted hitting our larger productionenvironments we started running into outof memory issues so we realized wehadn't built this for scale so how canwe go about making improvements to ourscript but be able to test it for scalewithout having to hit our largeproduction environments so this is whereQuark came into picture and reallyhelped us so if you look at this flowhere so first we will make changes toour script any improvements that we wantto make we can go ahead and execute iton our current dev clusters verify thatthe functionality is still working asexpected now to test for scale we wereable to use Quark and launch thousandnodes,500 nodes and run the test hookagain now if it worked here we were surethat it was ready for scale so we couldcontinuously iterate over our script andtest it for both functionality and scalebecause of the help of Quark so usingQuark we were able to simulate more than2,000 nodes which is already bigger thanour current max scale and using that wewere able to make all of theseimprovements to our script so we wereable to improve implement cachingbecause of which we were able to reducethe amount of queries we were making by40% we could batch the API calls that wewere making to list the nodes and wealso were able to reduce the amount ofdata that we stored in memory about thenode so using quark we were able toachieve our goal which was to alwaysbuild for scale so I hope that has piquyour curiosity about quark uh some of mycolleagues are also having some talksabout some additional use cases where wehave used Quark so I'd encourage you tocheck them out as well um that's all Ihave thank you[Applause]2025-04-15 22:03:23.830672ismanaging these clusters they're alsomanaging the cluster updates as well asthe node updates Now this is where ourproblem comes up So while you're able tospecify a maintenance window for thecluster and the node updates thesemaintenance windows are not guaranteedSo actually they are done on a besteffort basis So what this means is thatif you want the update to happen in themiddle of the night when it will lead toless disruptions for the customers itcould be that it happens the next daywhen the software is being utilized themost Now in a lot of cases this is notan issue but in our case this isproblematic because we're working withlegacy software at the plant whichcannot handle request redirection inbetween replicas seamlessly So there'susually disruptions at the plant whenthis cluster or node updates happen Nowsomething else that we found a littlebit problematic or complex was toexpress promotion flows So it's a littlebit hard to control when a cluster in anode update goes from the developmentenvironment or stage to the productionstage Now to talk a little bit about theimpact So the customers experienceapplication downtime This led them to beunhappy because the software was notworking as expected in production If thesoftware is not working then productionis not happening So then this also leadsto some financial losses And of courseif the ops team is not expect expectingan incident and this happens then youalso have an stressed opsteam So what is the solution so we saythe solution is simple yet effective Uhsimple because we're using simplemechanisms that most of you are familiarwith chron jobs pipelines promotionflows and pull requests So yeah let'stake a look at how it works So we havebasically a chron job which is checkingto see if there are any cluster or nodeupdates available in the public cloudprovider Now if there is an update thatis available then the crown job willtrigger a pipeline which will now deploythese versions to our integrationenvironment If the update is notsuccessful basically we stop the processwith these versions and the team isnotified to see what happened and whythey failed However if the update issuccessful then we're committing theseversions to a promotion repositorytherefore promoting them from ourintegration environment to our qualityenvironment Now let's say on the nextday the crown job will see that therewere some changes in that repository andit will now deploy those versions to ourquality stage Similarly like before ifthe update fails then we stop and theteam is notified so that we canintervene and see what happened But ifthe update is successful then we'reautomatically creating a pull request topromote these versions now to prod Andhere is where we have our only manualstep or the manual check when somebodyfrom the team will take a look at thoseversions We'll see what happened in theintegration and in the quality stage andeverything If everything looks good thenwe'll complete the pull request so thatthese versions are now promoted from thequality stage to the production stageAnd finally then we can then specify atime also with a crown job when thisupdate can happen in the productionenvironment when we do not have anyinterruptions to the plantoperations So now to summarize theoutcome So now we have full control ofthe of the cluster and the node updatesWe know exactly which versions we'reusing and we can also control the timeof the update executionWe have full traceability All of theconfigurations and all of the changesare stored in get We have a process thatis simple and fast So it's mostlyautomated with some minimal manual stepsAnd finally we have also gained moreconfidence into the update processbecause we no longer have anyapplication downtime due to the clusterupdates And of course most importantlyfor us we have happier customers Sothank you very much for attending thistalk And if you're interested in knowinghow we're moving our manufacturingexecution systems to Kubernetes thenplease join our case study sessiontomorrow at 11:45 and 10 room E Thankyou[Applause]2025-04-15 22:03:24.281078o specify imagesum as as volumes that you will mount incontainers and in in pods in yourKubernetescluster uh so that's uh basically thisis the specification so you you uh youhave a a regular pod here with a volumeand if you look at the volume definitionthere is an image so there is areference and the reference is actuallyu the reference to an image and so youwhat you get uh as kind of a volume thatis mounted inside a container that isbased on another volume so you you haveone one volume inside another volume andthat opens to u a lot of use cases so alot of new use cases uh that you will beable to to to do with with Kubernetes umso there is um a blog post that Sashahas written Sasha that talked about Rustbefore um and because in uh uh version131 of Kubernetes we introduced that asan alpha feature and it will become abeta feature in the next release ofKubernetes.33 um now I would like to totalk about the use cases why this isinteresting and what has driven that sothe the reason that that that has uhbeen implemented so rapidly and uh andthe main interest today is for AI um usecases and basically uh what I'm showinghere is uh you you should able todistribute you you are able with volumesof type image to distribute your uhmodels as OCI images uh that will bemounted uh in in the in the pod whereactually the inference runtime uh isrunning so and um I will show rapidly sowhat what is that so it's something I'veum uh written some script to actually dothat and this is the deployment for uman example of of a pod that is doingthat so that it's serving uh a model sothe model is here in the image that isbelow uh that is the small LM um modeland it's mounted inside a Ramalamauh pod that is actually Ramalama is a isa cool tool that we have been developingfor uh easily run uh LLM models incontainers uh it's basically Ramlama isfor running containers and runningmodels locally so on your on your laptopuh but it can produce also uh the UML todeploy that on on on Kubernetes as wellso this is the the result and yeah theI've put the link to the uh to thesource code uh in this slide and this isthe this is the first case so this isabout um mounting data and in this caseuh LLM model inside a container uhanother another use case is um addingtools maybe you you may want to have uhsome debugging tools you you want to adduh tools uh that will help you for uhobserve or do some debugging inside youryour container and in in this case uhyou can have an image for example herethe example that I've provided here isyou could have an image where you um youhave uh the executable for uh the openVS code server and uh if if you mountthat in uh in the container you will beable then to start VS code uh inside thecontainer and to actually um open uh soconnect to it via the browser and modifyor debug things that are happeninginside the container so and that'sum I've um this is the the the examplecode for actually running it and hereI'm I'm using Podman to run it um uh Iwanted to to have this second examplewith Podman because Podman and Dockerare starting to support um volumes oftype image as well so you can run andtest those kind of volumes locally aswellum yeah so it's in this case the the theline that mount the the image is thisone all right so uh this is the the twouse cases and that here is the currentstatus so it has been introduced as analpha inKubernetes 1331 uh it will graduate inbeta in in 1 to 33 um cryo andcontainerd support that uh as well aspodman and docker and um cryo and podmanalso uh are starting to supportuh generic OCI artifacts so you don'tneed to it doesn't need to be uh acontainer image it can be a generic OCIartifact and that's all I have uh theseare links to the uh the uh presentationuh with the slides and other uhinformation if you want to uh get moredetails about that thanks everyone[Applause]2025-04-15 22:03:24.746073 ��d#�7AbOhaJV3_7X4thank you for joining my talk extendingin boy with web assembly In this talkwe're going to dive into the extensionpoint of inboy in terms of web assemblyBefore we get started please let meintroduce a little bit about myselffirst Hi I'm Yuki I'm working as ansoftware architect at NUMO a mobilitystartup company in Japan I'm also actingas a Google developer expert in term ofcrowd So if you have any interestingtopic regarding that yeah feel free toreach out to me Anyway Inboy is anetwork proxy widely used in the cloudnativeenvironment Its philosophy is networkshould be transparent to applicationsFor example enboy is widely used as aAPI gateway including both classic APIgateway and pattern or kubernetesgatewayAPI also used as a side proxy forservice mesh such as in thisexample in handles the HTTP request inthe filter chainmanner for each requestIno handles request filterchain For example we have justauthentication filter provided byinboard itself and also we have aarbback filter for some kind of accesscontrol Oh yeah we have we can have uhsome kind of HTTP filters formodificationsEmble has some extension points like wecan write some custom logic as nativeC++ filter or we can write custom logicas l script and also input providesexternal processing filter that makesinvoy delegate some custom logic to theoutside of theinvoy and yeah we have web assemblyfilter that makes us write write somecustom logic as web assembly modulesAnd recently introduced a new featurecalled dynamic modules that allows uswrite some custom logic as sharedlibrary Yeah for example we can writesome custom logic For example we canwrite some custom header modification asweb assembly modulesThis is a sample configuration for webassembly modules We have to specify webassembly[Music]binary and if you want we can specifysome custom configuration for our customweb assemblymodules and yeah thank you for webassembly in nature We can write somecustom logic in language of ourchoice I'm mainly using Rust Yeah weknow Rust everywhere And sometimes I'musing Go to write our customlogic And the approx defines some AIapplication binary interface for webassembly modules for network proxiesincludinginboy by using proxy basm SDK In thiscase we are reading lost We can writesome custom logic like that In this casejust renaming headers for upstreamservices and formost some complexsituation I created a I created a webassemblymodules that makes inboy fetch accesstokens from Google crowd metad dataserver in the Google crowd environmentIn this case yeah this is leadinggo I employ the HTTP call out from invoyto the outside of the inboard in webassemblymodules and yeah I published some mycustom web assembly modules as opensoftware and yeah please check that onGitHub And last but not least recentlyGoogle crowd started to support some webassembly feature for Google cloudapplication robalancers That means we can use proxywas compliant web assembly modules evenin the outside of theinvoy Yeah that's it from my side Thankyou for joining[Applause]2025-04-15 22:03:25.273254��c#�EAzXIMJeJrnvIthank you everyone uh my name is Marioand I'm a software engineer uh for RedHat and I work on the on the Podman teamand uh I'm excited today to talk about uvolumes of type image so new new type ofvolumes that is coming uh inKubernetes um so first of all uh this isabout a new feature in Kubernetes thathas been proposed uh less than one yearago with this uh request for uhenhancement and uh it's about adding uhanother type so you can you can specifyuh you should be able talled value teams and if you arefamiliar with team topologies that wouldbe the same as the stream aligned teamsso the value teams are the ones thatactually build theapplications um these teams arecrossunctional so we'll have developersin those teams we have testers in thoseteams and srres or devops peopleand the uh tech stack mainly is java andnet um there are also some front-endengineers who are doing javascript butmo mainly the applications that we arelooking at and the users that we supportum are java and netdevelopers and then um i don't know howthis organizational structure is calledbut then also the teams themselves theyare organized into bases so you willhave like between two and five or sixteams within a base we are in theinfrastructure base and within in theinfrastructure base there are four teamsum we have an api gateway team we havean azure team because we also have aseparate cloud running on azure and wehave us we're the container managementplatform team and there's also anotherteam uh which is called ost it is i'm sosorry for the dutch people on vicarsupport team i think um so they dodeveloper tooling um so our git which isrunning oh i will tell you about thislater but they basically take care ofall developer toolings including stufflike we have a virtual um developmentenvironment that they take care of andwe're like separate teams but in thesamebase um this talk will not be supertechnical but i think it's good tounderstand a little bit of our techstack to understand what problems arisewhat procedural issues we have um basedon that so first um our platform isrunning on open shift we have an openshift 416 um moving to 417 now this isimportant to know because we'recurrently moving from our old platformthe container management platform to thenew platform which is the modularrespark platformthanks um and the reason why is that onour old platform we are running openshift uh 416 which we cannot upgradeanymore so we have to move everything tothe new platform which is currentlyrunning 417 um which obviously is veryfun um but the good thing is that thenew platform is very much in its infancyso there's a new lot of new stuff thatwe can build because we don't have anyreal customers there running productionworkloads currently have one pilot teamthere um to give us feedback so that'skind of cool but we also are stuck withlike maintaining the old platform whichis not as cool topology wise we havefour clusters two dev clusters two prodclusters um in the old world on the oldplatform the dev clusters are our devclusters so there's no users on themthey're just for us and then the prodclusters host all the environments soincluding the test environment the devenvironment and the prod environmentwhich you can like probably already umanticipate is causing some problemssometimes in the new world we have thetwo dev clusters again to prod clustersbut the dev clusters run all thedevelopment environments so dev testingand so on and the prod clusters actuallyrun prod environment so that's our firstwin already that we got to uh separatethose two we also have a managementcluster in the new world from which wethen manage the um the four clusters andat some point we also want to add anengineering cluster that would then justbe us like our development cluster totest thingsout so this is the bane of my existencecurrently is that a lot of our stuff isrunning azure devops um so our git isazure devops delivery pipeline is azuredevops pipelines and i don't want to gettoo much into hating it because i don'tthink that's very productive but oh mygod um not a big fan i'm going to sayand also because another team maintainsand owns all of this there's a bit of iwouldn't say conflict but a bit of likefriction that comes with it but i'm alsogoing to talk a little bit about thatlater so um i don't know if this is aquestion that we in this room needanswering why platforms why would youwant to build a platform um you couldfor one thing go to cubecon and talkabout it that's pretty good um but alsothere's a lot of um i think cool ideasand concepts that come with moving tosomething that is a a platform orplatform shaped and to me um one bigadvantage is moving from something thatis more service-based to somethingthat's more platform based and i willelaborate on what i'm thinking thatmeans so services um are loosely coupledso if you have an example where um adeveloper needs to deploy an applicationif you have services then the user wouldprobably go to something that's a builtservice that's connected to therepository where their code lives andthen receives an image from that builtservice and then the next step the userwould go to something that's like adeployment service and uh pass thatimage to that service and expect this tobe then deployed to the productionenvironment right like this is a verysimple example but um because of the waythat this is set up where you have thebuild service and the deploy serviceseparately they don't need to know ofeach other which isn't bad but it oftenhappens that with time these willprobably start to deviate a little bitso what you get out of the build servicemight not actually be super useful forthe deploy service maybe there are somethings that need to be different um thatyou know the team working on the deployservice didn't communicate to the teamworking on the build service andsuddenly now the developer has to likedo things in between so that's a bitannoying um in a platform setup youwould try and optimize for the goldenpath so in this example uh if youimagine you have like an internaldeveloper platform that's a bit of a iwouldn't say a black box but it's like aone unitit you have the productionenvironment you have the user then theuser could just go to the deploy serviceand tell it the deploy service livingwithin the platform um i would like todeploy this application to productionand the deploy service could checkwithin itself do i have the latest imageif not call the build service itselfum to build an image and then returnthat image and put it on the productionenvironment so the good thing here isthat because um everything's kind oflike happening within the platform maybethe user to to the user it's just onecall right they just talk to deploymentservice and everything behind that userdoesn't even know doesn't even need tounderstand like oh i actually need animage of some sort for that and alsobecause they kind of belong together andthey work with each other you need toyou need to make sure that the contractbetween these two services is being keptand being updatedand i think looking at that you canthink of services as like anevolutionary step towards platforms soin like in the olden days when i starteddeveloping it was actually very commonto have just scripts on your own localmachine to build artifacts and to deploythings so we had a script i don't knowif bonko is here because we used to worktogether back in the day like 10 yearsago and you would actually have a scriptto deploy that would like take the lockof the server so no one else was allowedto be on the server so you could thenlike ftp no it wasn't as i think sftpyour artifacts on the server and thenyou like add the remove the lock andthen it would be um available to otherpeople again but this was all scriptbased on your own machine when you getonboarded you would copy paste thatscript from another developer you wouldmaybe add your own little magic to itand so over time everyone's script isslightly different which can alsoobviously lead to problems so maybe inan evolutionary scenario you would likeextract those scripts and make them beservices like central services thatdevelopers can then call and theneventually you take those services andput them into a platform right so theycan talk to each other and they belongto each other so the good thing here isthat you don't have to go from zero to100 immediately like you can take theselike small steps to go where you need togo and all of these in between um cyclesare totally valid to work with right i'mnot saying that platform is the be alland end all you have to have a platformit always depends on what you and umorganization actually needsi also want to talk very briefly aboutdeveloper portals because that issometimes being used um interchangeablyand i think it's very important tounderstand that there is a differencebetween a portal and a platform so theidea of a portal is that it's that is auser interface maybe a graphical userinterface where you can just like clickyaround um but the po the point of aportal is to surface the most commonworkflows and use cases and bestpractices so if you know i need adatabase i'm just going to click on thisbutton or like call an api or whateverum to give me a database maybe withthese parameters but maybe there's somedefault parameters but a portal usuallysits on top of an existing platform soplatform kind of can do everything thatin reasonable uh like reasonably can beasked of it to do but the portal issomething that's more like focused onuser friendliness and like againsurfacing the most common workflowsso ultimately i think what i'm trying tosay with this little interlude is thatum in a service scenario developers areclients they're users and they uh havetheir service providers and they go totheir service providers and ask them tobuild this service but when it comes toplatforms developers are actuallycontributors and they also own theplatform like how it's going to look inthe end whereas the platform team likeour team the way that i see it we'remore responsible to provide theunderlying infrastructure and the baseto then build the platformcollaboratively and i'm going to talk alittle bit more about what i meanbecause the the main problem that we'veencountered so far and i've i get thefeeling that like a lot of people thatwho i talk to also have a similarproblem is kind of the idea of bricksversus builds so a lot of companies saywe're providing the building blocks orthe lego bricks for your platform that'salso how we see ourselves like weprovide the building blocks for peopleto then build kind of the platform oftheir dreams but the users thedevelopers they just want a house thatthey can move in already and the problemis that some people want a really fancyand modern house with many new featuresthey get nice pool and everything that'snew and shiny and then others wantsomething more conventional but thenexecuted in like a perfect way right andthe problem is if you don't communicatethe scope of the platform that you'reactually building people start to dreamup these scenarios that they think thisis going to be and then are totallydisappointed if the house actuallydoesn't look like it because what we'rethinking is yeah we gave you the blockswe were hoping that you would buildmaybe a window and share that with yourother developer friends and so they canbuild their houses right and the therooms of that they want but insteadthey're like "no i i don't know how tobuild a house." so and we were like welli don't know how the house you want tobe built is need to be built so we'rekind of stuck in thisimpass um so basically what i'm gettingat is the idea of that team topologiesalso has been um i guess evangelizing onso if you haven't heard of teamtopologies that would be very surprisingto me because we keep talking about itas if it was the bible but the basicidea is that you have teams streamaligned teams or in our case the valueteams that um are building the the umapps and underneath there is theplatform team and this is therepresentation that they choose i don'tthink it's super ideal because thislooks like there are some teams that arefurther away from the platform team so ithink maybe this is a betterrepresentation um but i need to switchviews sometimes to illustrate mythoughts so just so you don't getconfused um there's no stream alignedteams that are further away they allneed to use the platform um butbasically what um the team topologiesidea is is that you have thesestreamlined teams you have the platformteams but then if you have an enablingteam to enable teams to use the platformthat enabling team sits across is a um acrosscut of the stream aligned teams soit's not a team that is kind of betweenthe value teams and the platform teamthat is separate from all of them butactually it's a team that is a crosscutthrough all of them so in our example itcould be that we just use the srres ofall the stream allian teams and maybelike a developer advocate of ourplatform team and then we form anenabling team doesn't mean that it needsto be like a full-time team that onlydoes like educational content or stufflike that but the main idea is that weneed people from these teams thatactually work in these teams and dostuff with them because they are theexperts of their domain right like whenthe streamline teams come to us and theyexpect us to build their house we'relike but i am not a net developer i'mnot a java developer i don't know howyour dependency management actuallyworks i need you to know that at leastto the point where we can collaborate onit together where we can at least googleit together or something like that rightso that's the the main takeaway fromfrom thisslide um so this is a long winded way tosay for me that um community is alwaysat the heart of everything as soon asyou work with like three or more peopleon something i feel like the communityaspect becomes quite important i doappreciate that even though cubecon hasbecome this big we're still trying tolike push the idea of community andreally focus on having creating spacefor the community because ultimatelylike what's the best way to get allthese like users and service providersand management and everyone aligned onwhat this platform should actually belike and look like and work uh howshould itwork this is my opinion by buildingcommunity okayum so i'm going to share like twoscenarios where i think it went verywell and maybe not so well when we weretrying to build community thank you forchecking so we when i joined the um thecompany we had this uh and they call itin dutch a hoop which is great ummicroservices hoop microservicesit's basically a guilt and uh when ijoined this was a very unloved meetingwithin the team because the the feelingof our team members was very much thatyou would go in and they would justblame you like why didn't you build thisthing that i absolutely need you tobuild right as i said before um and thenwhat happened was that a manager fromthe infrastructure base took over thismeeting and he was inviting people tocontribute so people would come in withlike things to share some ideas thatthey had some solutions that they'vebeen working on um and they would getspecifically invited to share and alsothere was an agenda and i this is a bitof a side rant now but i cannot stressthe importance of having agendas formeetings i try to avoid these meetingsbecause meetings without agendas is justtime that people take off because theycan't be bothered to think about thetopic before so they just like forceeveryone to in these meetings and thenjust start to think about this nowinstead of coming prepared so we cansave each other some time anyhow that'sa side rant just saying have agendas foryour meetings um the good thing aboutthat was that we would or in thosemeetings the focus would be much more onsolutions rather than problems sobecause there were solutions being umpresented they would talk much moreabout oh this actually is quite nice imean i would probably need it a littlebit more this way but how about you knowwe can discuss it and we could likemaybe fork it and maybe like you knowinner source things rather than i havethis problem and this solution would beso good if someone just would build itfinally right so that went really welland um from what i hear themicroservices meeting is going very welli usually don't go because it's veryearly and i have a 1-hour commute butbut i hear it's great on the oppositeend of the spectrum there was anothermeeting like a user group it's the s surmeeting so as i said before we havethese um cross functional teams and umalmost all of them have an s sur personso we had these bi-weekly meetings wherewe in would invite them and would givethem updates and so on uh when i joinedthis meeting was basically on lifesupport it was being cancelled like aminute before it started um like it wasnot very well prepared or anything ithink the the the people in our teamthat was that were running the meetingwere just kind of very surprised thatit's again happening and then i guess wejust upgraded something so let's talkabout that so i joined uh and i i have adeveloper advocate background so i waslike okay i'm going to take it over i'mgoing to restart it as a forum we'regoing to have an agenda very importantbut i want to create some space for thecommunity to get to know each other togrow to share and um i in my mind it wasgreat it was awesome was the best thingi ever did it just very much was notthat uh so for one thing we go to theoffice twice a week and we had this roomwhere our team mostly me and anotherperson would go and then everyone elseand a lot of them were in the officealso but they would all call in fromtheir desks and i don't know if you knowthose movies where like nick fury orfury or like the evil guy was like infront of all these monitors and theyhave to like defend themselves orexplain themselves is very much thatlike there was this wall of people whowere so not interested whose eyes wereglazing over who never said a singleword um and it very much felt like wewere just constantly just being kind ofnot berated but like you know we alwayshad to explain ourselves rather thanlike collaborating finding like sharedsolutions creating a sharedunderstanding of what the problemactually is that we're trying to solveit was very much again like a why is itnot the way that i want it to be and alot of times they uh asked like "oh itwould be great if we had this." and weencouraged them yeah why don't you buildit and we'll help you with it and thenwe never heard from that again like itapparently wasn't a problem or wasn'tthat much of a problem so eventually umwe we replaced that meeting um we killedit uh we had a new product owner join usabout a month ago she's doing very welland she started doing public reviews anduh how like maintaining a public roadmap and because the the update part ofthis meeting was kind of the only thingthat people seemed to be interested inwe were just like okay you know whatwe're doing the updates in this othermeeting you're welcome to join if youwant to have a space again where you umcollaborate on ideas and you know wantto share things why don't you also dolike a devops group or somethingand we'll be happy to join but we'rejust not going own it anymore becausewe're just kind of done with likecreating all of this only to have no onebe interested right so um that was a bitsad because that was the first thingthat i i thought i could just reallychange kind of the the vibe but i justkind ofunderestimated it's the government idon't know what i was thinking i justthought like i'm going to come in i'm soawesome i'm going to just like changeeverything but yeahum so finally just the last thing that iwant to talk about um the vision and thescope of this project has just alwaysbeen very unclear and as i said beforeif you don't clarify it's people startdreaming up kind of their scenarios andthey're just very disappointed andsometimes confused and one example ofthat is the example of image registry sooriginally we used the internal imageregistry of open shift um but we becausewe have these two clusters that arecompletely separate from each other theusers always had to build every singleimage twice which obviously is veryannoying actually four times because ondev and then they have to build it againon prod because we don't have anythingto promote images for so we said okay weneed an external image registry for thenew um platform we looked at key welooked at harbor and then the businessdecided no no no no no let's have oneregistry that just stores everythinglike everything that you could possiblywant to store every kind of artifact sonow we're using artifactory and it'salso not in our team anymore now it'swith the developer team the developertooling team and again it's kind of alittle bit caused a little bit offriction because now it's like oh wehave to like provide image poll secretsdidn't think about that because wedidn't do that before also they andthey're really nice people right butthey're kind of from the microsoftwindows world and we just have differentunderstandings of what good what is goodright because also now the builds havealso moved out of uh open shift so wewere using the open shift build configbefore and now they are magically beingbuilt in azure azure devops pipelinesdon't ask me how i don't understand itbut it's another thing that just issuper confusing and again this wasdecided in a business meeting before ieven started and it was just nevermentioned again until the day that ourarchitect was like "oh by the way we'realso not doing that anymore greatthanks." um and again because like theirteam they're like "oh powershell is sogreat." and you know we just have likevery conflicting ideas of what a goodci/cd process looks like and that justleads to a little bit of friction and itwould have been nice if we had beensitting together earlier and maybediscussed kind of like how do we want todeliver our applications to productionright like everyone was just thinkinglike i would want to do it this way andthat's how probably how we're going todo it i guess so this has been veryranty i didn't really re like think ofit as like a rant but here we are so tosummarize this entire thing um platformvery much a collaborative effort so it'sthe platform team provides the base forit but ultimately the shape of theplatform depends on everyone who alsouses it so you know if something breaksthen also developers can fix it whichfrees up the platform team to like buildmore capabilities to for theplatform building community so importanti'm going to harp on about it until theend of time it's i think one of the mostimportant things that you can do as an amore senior engineer right if you'relike if you're just interested inbuilding stuff that's awesome but ifyou're like really trying to get asynergy going i think community is thekey to it um communicate the scope ofwhat you're building early and oftenjust like just until people are soannoyed by it so they understand doesn'tmean that scope needs to be the sameforever it can always change but makesure that everyone is aware of it ifsomeone isn't just hammer it home morehave like a public page where you keepit up to date just so people know andthey don't you know like expect you tobuild their barbie dreamhouse for you oryour your barbie dreamhouse for them andthen also finally this is something thati had to learn the hard way let thingsgo um sometimes things just don't workout and even though you thought of likethis perfect solution it's just not whatworks for this group of people and thatis fine there's not the best solutionthat's not the perfect solution the bestor perfect solution is the solution thatworks for your users you can build themost beautiful platform if no one usesit it's not great it's not good i'mgoing to skip this for the interest oftime we can talk about it later i havetwo things to um pitch i'm doing publicspeaking workshops um if you'reinterested if you want to speak atcubecon maybe no guarantees but you cansign up if you're interested i'm doingthem online i'm doing them in person i'mdoing them hybrid i'm doing them forbusinesses i'm doing them for individualfolks um you can sign up interest once ireach critical mass i will create a newworkshop as i mentioned before chiefkaraoke officer for the first and onlykubernetes karaoke community kobarokiprobably heard of it if not then i don'tknow what to tell you but we have ourparty tonight we have a very long waitlist however if you sign up if you stillcome if it's not full you still have achance to get in it's in uh the cityliverpool street so uh it's 2 minutesfrom the elizabeth line um yeah sign upfor the wait list and then show up umand there's a chance that you can comein and that is it thank you so much i'mnot sure if we have time for questionsbut um if you want to talk i'm righthere i also have stickers if you want toget some uh yeah thanks[Applause]2025-04-15 22:03:26.016264 ��G�e#��EAgjEuIUCbNYYlet's get started if anyone could umfind a place sit down oh we're not in arush it's just my talk you know it'sfine um excellent hey thanks everyonefor coming i'm very very happy to behere um this is my second cubecon tospeak at i'm still very nervous this isa very big room so please bear with me iwill do my best um i will talk to youtoday about um my experiences as aplatform engineer with um a greatorganization the dutch government umthey're pretty cool uh i'm originallyfrom germany so i'm already pretty happythat this is not the german governmentum but yeah just to start off i do wantto tell you a little bit about myself ummy name is lean i am a as i saidfreelance uh platform engineer cloudnative human um because you know we'reall cloud cloud native engineers orconsultants or whatever but i i do feellike there's more to being um a personin tech than just your role but i alwaysbelieve in bringing like your whole selfto your job um i'm also the co-chair forthe uh cncf's tag app delivery as longas we still exist there's been arestructuring planned i'm sure you heardof it um but if you want to come andchat about it i'm going to be at the tagbooth um i think from 4 till 5:30 todayi am also as some of you may know thechief karaoke officer going to talk alittle bit about that at the end if ihave the time uh i took some time offfrom tech to pursue uh standup comedy topursue um musical theater so i justactually finished a three-week musicalrun before i came here that's um wasgreat but i'm very tired um but yeah sothis is who i am thanks for having mei'm going to find out how who all of youare later uh so this story that i wantto talk to you about is the story um ofan organization that had this amazingidea to better fulfill their developerneeds um i guess someone went to aconference and then thought hey let'sbuild a platform super great i thinkwe've been talking about platforms forcouple years now um and um yeah ipersonally think platforms are greati've been working with platforms asearly as i want to say 2015 2016 maybeum before we called it platform reallyum but the thing iswhen when i started and i' i'veconsulted with other companies beforebuilding a platformum i thought you know there's thisaspect of again with people you know howdo you deal with people how do you dealwith different umrequirements and i was thinking when ium sent in the abstract well by the timei'm giving this talk i'll be at thecompany for like the organization forlike four or five months i will have itfigured out right i i have the recipe iwill know exactly what to do it will begreat and then i can tell tell you aboutthe the secret sauce ofit surprisingly i i didn't i didn'tactually solve all our problems i don'tknow what happened but um so i'm justgiving you the warning that if you camehere to expect to get like the recipethat you can just take and apply at yourown organization and it will be perfectit's probably not this talk um so if youwant to leave now is the time um butapart from that i'm going to tell you alittle bit about you know what thesituation is um how we tackled theseproblems what worked what didn't workand maybe a little bit about like what iwould like to do in the future dependingon time um but yeah so just so you knowyou've been warned so a little bit aboutthe organization um it's called evorespra and uh this is as i said the umdutchuh it service provider for the judiciaryso um it's mainly running web servicesfor judges and lawyers and the publicwho need stuff done with uh the courtsor lawyers uh and so on so most of theapplications the way that they work arevery very process driven and um a lot oflike judges and lawyers they're kind oflike not interested in new ways to solveproblems they're just more like this ishow we've done it the last 100 years sodon't change anything this is exactlyhow we're going to continue doingit in the organization we have thesethings c!se wehave memory bloat so we don't care whatuh allocation we do of the memory wehave execution bloat or simply we haveenvironmental blind spot so how did weget there right um I mean it's it's hardto answer probably the reason aredifferent but probably also abundantcompute as an answer so why optimizewhen it just take one second to put onemore node on like a bigger one so it'sjust cheaper to go this way or like alsothe fast market um has pushed us todevelop more and more features rightlike that's what we call feature creepand also there are so many layers thatwe if we have to deep dive maybe we justnot able to like because we don'tunderstand a lot of layers that are deepdeep behindso that's why we wanted to to set thistalk and yeah I'm going to give the thethe word to to Leo who's gonna talkabout environment setup right so I thinklike in the community um and also ingeneral like in the sustainability andenvironment area most of the discussionsare pretty like high level so it'sabstracted away and it's not so oftenabout actually like what it's aboutright so you have resources and energyso it's a very lowlevel kind of thing umbut because we are in the cloud we aretalking about workloads about servicesand everything everything is abstractedaway and um thatas Antonio said sort of the idea um justto sort of turn the table not comingfrom the top and going down but actuallystarting from the source um so bottom upso to say um so if you look at like theentire environment I mean this lookslike very different obviously fromcompany to company but in general it'slike very complex and this is alreadylike a very simple view but you havedifferent types of like how you um umstructure your network how youcommunicate how your workloads are ifthey are small if they're large whereyour geo locations are located are youlike a global business or not a globalbusiness um and that's just like from atopology standpoint already very complexso when we started thinking about okaywhen we want to start like how do westart okay let's just focus onum perhaps like one region on computestarting a little bit small and thengrowing perhaps later umso all the measurements that we do likelater in this talk basically are focusedlike in a very capsulated environmentwhich is in some ways not very much likeuh near to the reality because rarelyyou have a very capsulated environmentbut still it's sort of where we startedso I said we want to go from the bottomup right so what does it mean it meansthat we buy hardware that we just asAntonio said look at servers not for thefirst time but uh sort of um gettinginto bare metal thinking about okay howdoes storage look like how differenttypes of stoages do you have in not justlike an abstracted kind of way fromcloud providers where you have differentstorage tiers and whatever it is butactually the bare metal part um and thissort of led to building um this kind ofdevice um which is just a bunch of uhcompute units a switch um and um coolingso fans which are connected um and so onso you have something to play with butalso you always sort of um stay in touchwith the hardware itself um so you'renot getting um sort of disconnected andif you build later communities clustersservices and so on you always can sortof have all the way through to thehardware to have like some kind ofunderstanding and then hopefully um sortof the information can be transferredalso to your more productionenvironments at least that that was theidea or is the idea um so this is sortof how it'sstructured pretty simple um so we havewe have uh four different compute nodesthey're all the same um so there canalso be a lot more variety right so ifyou want to do different ARM chips andso on this could be like a veryinteresting comparison um in the futurebut these are all Intel mini PCs as yousaw like in the previous one stitchedtogether connected over a switch one ofthe nodes is responsible for sort ofyeah being the gatekeeper um it does DNSand netting and so on and so forth umthis is for sure not like the best wayto structure this but it also gives umlike an good "idea if for example node 2node 3 node 4 with the communitiescluster want to access the internet insome ways they have to go over node onewhich um is sort of interesting in termsof measurements because then you canisolate the environment a lot more umand right so if we put some stuff onthis to actually do something um wechose to go with K3S which is kind ofsimple to set up in a Nixos environmentit's like something on on the next slideum so with Nixos you can just um havelike a very easy configuration so youcan spin up a bare metal cluster withouta lot of hustle um this is again alsoprobably like in a larger environment alittle bit different but for ourpurposes I think this served quite wellso all the components which you see atthe top are basically just like thedefault configuration we did not changea lot um besides maybe IP addresses andso on but but all the other stuff likein terms of like setup it's all defaultum all the optimization that can be doneis for us like an experiment to see forexample if you want to change like CNIsfor example like how does this compareand then for um energymeasurements um we chose scaffandrawhich is um a project there's also aslide later about this just to measurethe energy consumptionum yeah so that's thesetup right and said this is all likeconfigured withNixos um this was like primarily also ina way like self-driven a learningexperience getting into this entirerabbit hole of other technology um whichis also very interesting um probablyalso like in other environments so Nyxis um uh a lot of things it's um apackage manager so you can also use iton your Mac OS notebook or whatever umjust to configure basically as like ahomebrew variety basically your entirepackages um but it's also um you canalso use nyx to declaratively um sort ofdescribe container images which then canalso be sort of compiled to u virtualmachineimages or um for entire Nexos Linuxdistribution also for bare metal andwhat is very cool is that you can as Isaid have like a declarative way so ifyou want to open certain ports youdefine this in your Nixos configurationthen you rebuild re um apply the systemum and that's quite cool and then forbecause there's multiple machines I useda project called colina which sort ofwraps the entire environment uh into asystem closure and then you deploy thisentire sort of system closure to thething and um make updates based on thisso very interesting piece of technologyI think and also in terms of like repurepusability and also um if you want tosort of track your changes because ifyou use nyx flakes you always canguarantee that inputs always areresulting in the same output um but thisis not about ny um but still this wasused and then I mentioned scafandra sothere's a couple ofdifferent ways to measure energy whichI'll describe on the next slide but wechose Gfandra to um basically get allthe different metrics energy metricsessentially about our workloads and thenwe do like some analysis later um thereare different projects that um that dothis kind of stuff we chose cafferbecause we know it and it works and umit's inrust right so what are the options tomeasure energy so there's a couple ofdifferent ones um like the moststraightforward one in one way is likeif you just go to sort of the wire rightso if you put like some kind of uhdevice if you measure the energy by thewire with a smart meter or uh by hand umso that's the one below basically theone above is over uh the BMC so if youhave a server usually like in datacenters or in larger like systems andservers you have um a bare metalcontroller which you can use to alsoaccess these kinds of metrics so there'sa small chip which basically does likefirmware upgrades and whatever and so onthere's entire like um um also API whichyou can use and over this API you canget access um to um heat to power and alot of other stuff um and then at thetop it's basically the software way soit's like a correlation basically so youdon't go to the wire but you use likeevents in the CPUum basically to get a very good veryclose very accurate um assumption of theenergy# that is being used um but RPL andother implementations so RPL is the onefrom Intel um there also otherimplementations if you use ARM orwhatever um they all have they are allsoftware based so it means there can beas I said like you do not necessarilyhave like the entire picture becauselike if you stitch like differenthardware components they need to be likecompatible with the API and if they arenot you don't capture for example memoryum um usage or or something like thisright so you you can lose some of the umsome of the values so in our situationwe don't have uh BMC unfortunately ly umwhich would be very nice um but we donot um so we chose to go with the firstone and the last one umessentially and what is like the scopeso this is already in some ways I thinkclear to everyone but I think it kind ofmakes sense to make this more explicitso if you talk about like resourceconsumption and energy consumption youcan also have this like very broad rightbecause if you engage with software youalso use like some kind of monitorthat's also required that also drawsenergy um so there's like a lot of likefirst level second level third levelsort of uh energyusages um also in usually or perhapslike in systems you have some upstreamAPIs that you consume um or some SASapplication that you just sort of boughtsome someone's cloud environment thatyou borrow uh in in some ways or thatyou use for your system so all thesedifferent kinds of dependencies alsoconsume consume energy and it'sdifficult to get the entire picture sothe overall scope is very large it'scomplex um so it's it's sort of likewhat we reduce because we um have likethe capsulated environment so what wecan measure is actually just like a asub um sub a couple of thesubcomponents CPU and so on basicallywhat is supported by ripple and alsowhat we um can um gather basically inthe encapsulated environmentso that's sort of our scopeall right so yeah what we talked uhabout was like why why can't we try aday journey so we went we are allfamiliar with some kind of platformengineering so here at day zero is theday you get some requirements and maybeyou're already lucky enough to get someKubernetes running in infrastructure ofyour choice and then day one is likeapplication development so we have ourmicros service uh developed in this casewe took um the Google cloud microserdemo which allows us to deploy onkubernetes uh a microser this a microserthat would uh somehow simulate a realone and then we wanted to also stressthe hardware a bit we use uh stress NNGto to see how the power measures wouldchange when we have like high memoryconsumption or high CPU consumption andthen day three we thought about okaywhat if we add some uh securityconstraints right like we have somesecurity applications so not only wehave our Kubernetes cluster really usedand stressed but also we run somecontinuous monitoring on it like how howis the power numbers uh going to looklike and uh yeah like we have a verysimple uh workflow because this ismanual right is also our firstexperiment and we had fun with it but italso took a lot of time to to set up sowe have our manual run we manuallydeploy our workloads we manually deployfalco the micros service demo and thestress and gpod and then we launch amanual report of scaffonder this is likecontinuous measurements for one minuteand we get uh P levelum energy measures and power powermeasures and and yeah we have also uh alog file that we later use to show someof the the graphs but uh yeah likequestion to the to the crowd how muchpower do you think a single node in thisconfigurationwould would consume in watts inwatts someone maximum maximum yeahmaximumpower someone would like to give ananswer yeah 100 wats 100 wats do we havemorewe have less uh-huh10 okay the man on the left say 10 likewho who wants to say something different20 okay 20 do do we have somemore five okay we have five yeah we'llsee we'll see this lateruh okay how much power is the home setupit's still the same oh okay never mindso let's say it's 100 uh the entiresetup so like the four nodes the switchall the n$etworking component you havesimilarnumbers let'ssee so this is day zero and this is thetotal cluster uh power consumption soit's all the data of the four nodes uhwhat we got from scaffander and coldmeans just uh kubernetes running Andyeah like it's very close to like it'scloser to five than any other numbers uhso we get a prize for youlater and um but yeah like this is likeeven more close to one right likebecause I don't know it's like less thanfive and it's the sum over the four soyeah like it was funny because on thetop you can see above 40 right likethat's what we measured at the at theplug so we expected something like 40ishand from scaffer we would just able tosee less than five so it was already abit concerning us so we said okay let'stry to deploy the micros service and andlet's see how how it changes and itchanged a little bit but it didn'tchange much of course the microser demoin its basic form doesn't have a lot ofinteractions so that's why we decided tostress it so like I think it'ssimulating 10 users in a minute orsomething and it's like basic yeah sothe microser is like um basically a demoapplication I think by Googleand they wrote I think it's like 10services or so it's simulating like aweb service a web shop and there um isone of the services is basically justacting as a user filling in the shoppingcarts emptying the shopping cart and soon so um the different programminglanguages different containers and so onso they try to um that's that's sort ofthe the microser that we deployed it'snot very heavy and then we decided okaylet's try to stress the machines andagain the expected power at the plug wasslightly higher between 50 and 80 so tosay but still the uh softwaremeasurements were not so like we're notcapturing all all of that what what washappening right and then yeah just justfor fun we added falco on top so it'sslightly more so this is good because ina way we're addingcomplexity to our cluster we're addingcomplexity to the workload and the poweralso adds up right so we can see this atleast in our home environment but we canalso see that every time there is likesome kind of offset that is missingright maybe it's the power idle maybeit's all the things that we cannotmeasure with software but yeah we werecurious to see the numbers and then umyeah like here we have like thisuh view uh a bit al together so we havein gray the cold so only Kubernetesrunning then the micros service on topjust a slightly higher uh powerconsumption we have those drops uhprobably that are related uh to how thestress NG works uh but yeah we we didn'tgo very deeply into understand thestress we just took them as uh outliersand then like falcon on top is givingsome some more power usage spikes Yeahand and then what we did is like okaywhat if we shift all the measurements bythis cold baseline that we have so let'ssay we have some sort of power idle thatwe cannot manage to measure and we justshift all the measure by adding all thishardware related power metrics that uhthat that Scafandra is maybe not able tocapture right are are we going to eatmore or less what we saw at the plug andwe could see that for example for themicros service demo and the cold likethe ranges were hit uh slightly betterand also for uh stress and and falco andinteresting like maybe even the softwareis like uh capturing something more sothis was also an interesting finding soso basically if you account for theoffset the numbers that you get fromscaffandra are kind of accurate so aslong as you know the offset um that isnot captured then you can just add thisalways on top and then you havebasically your number which isrepresenting the actual entireumusage yeah but as as we said like thiswas our first experiments it was kind offun to go back to garage days and seewhat we would uh get out uh yeah like weplan uh many more and there is for surea lot more that could be that could bedone u but this for the leos I mean atthe end a lot of the projects in theCNCF have also sort of relate to eachother right so if you have depending onyour microser if you use falco thiscould have more impact or less impactdepending like on the services that youuse so understanding like thesesynergistic effects sort ofis kind of or very interesting um justto understand how this works um um andalso to find out like how to sort ofcommunicate those kind of effects backto the maintainers of all thesedifferent kinds of projects because ifthere are in itself scalable and and soon but there's like synergistic effectswith each other which do not scale sowell or have like an energy footprintwhich does not um shape so well therewould be I think interestingobservations um but yeah I I mean as wesort of like highlighted throughout thetalk there's a lot of different thingswe could change which would have veryinteresting results perhaps so changingthe host forexample switching operating systems orusing a different kind of operatingsystem or uni using a a uni kernel likea stripped down um Linux kernel umsomething like this could be kind ofinteresting for sure um also usingdifferent kinds of ships different kindsof hardware um how they all relate toeach other um especially like using ARMbecause ARM is always um from thearchitecture more energy efficient umthis is for sure like related also inthose numbers if we would have an umARPC in in the mix but um yeah justputting all these different I thinkthere's a lot of different like umthings that could um be measured as wellyeah I think we wanted to start aconversation about uh yeah like what isthe right number right like either powerrelated or energy related uh what islike should I get worried if I get Idon't know 100 uh vats or not so we wewe're really curious to have a I don'tknow a number out of something that wecould control like on the stack andthat's the numbers we got yeah and Ithink the approach like starting withthe hardware and then going up umespecially in terms of sustainabilitymakes a lot a lot of sense um because Ithink that's essentially like the mainissue right because if you're likealways very abstracted you sort of losetouch and you don't really understandlike all these like workloads that youship um actually consume so manydifferent resources and um yeah I thinkit's a good practice in a Rightwant to say anymore no cool[Applause]all right i think we have a couple ofminutes so if you have any questionsyeah there mic there i think that worksif someone wants or Yeah just feel freeto ask one otherwise we will be alsoafterwards standing here for a bit ifyou want to chatall right i think then we can Oh surethere's a questioni I don't I'm not sure if I got thequestion right was it about basicallythere's like virtualization layers inbetween so you cannot get the access tothe bare metal matrix was that kind ofyour question yeah so there's um sothat's an issue so if you don't so yousaw like one of the slides before youdon't have BMC access you will neverhave BMC access like in the cloudenvironment because this is like for theproviders basically to orchestrate it'salso security sensitive and so on um soand usually they also don't have a smartmeter which you can hook in so the onlyoption that you have is something likeRPLE but in virtualized environmentsthis is capsulated so you cannot accessthose metrics um so in those situationsum it's not possible to get those kindsof numbers but what you can do umthere's another project called Keplerwhich basically does like some somefancy machine learning um basically toget very close to those numbers but theissue is as as far as you sort ofabstract away from the hardware and thisalready happens with something likeRipple even if the margin is like verytinyum as further as you abstract away asbigger as the correlation basically goesand then it's difficult to make actuallythe exact numbers right so you have likea lot bigger margin of error um but itcan be done um you can get pretty closethat's something I think which we alsoput on this slide um doing measurementslike in a virtualized environment whereyou have like another hypervisor orsomething in betweenYeah all right cooland thanks everybody for joining2025-04-15 22:03:26.625837 s s��p�f#��A4OdYWliYpPghi everybody um yeah like my name isAntonio here with me uh we have Leo andtoday we'll have our talk uh more dataplease ends on green cloudexperiments um so yeah like actuallyI've looked this up it's been yeah moreor less 20 years the term cloud has beenout so yeah like in this 20 years we uhrecently had the 10 years anniversary ofKubernetes so it's a very establishedindustry and we all know why uh thedream of the cloud has become true so wehave scalability so res a lot ofresources on demand agility so thefaster dev um for for the cycles andlike there are a lot of abstractionright like a lot of complexity that hasbeen made simple on the other hand wehave some hidden costs so we have energydemand that we rarely measure and alsowe have infrastructures detachment soactually the first time I saw a serverwas like this year and it was like hehad this noisy big machine that takes alot of heat and it's like I never seenone right like I never had to see one idon't know how how many people canrelate but it was like quite impressiveand yeah like we we are getting towardsa a world in which you're more orientedto application development and a littlebit loss of technical knowledge to theirdeepeststack so on on the other side we alsohave software that gets bigger in a wayright like hardware has never beeneasier to provision so just a click awaywe can have more compute more nodes andour software just gets bigger and biggerlike I I remember like we had somediscussion because we ended up havingdocker files or docker images that have2 gigabytes in size and like maybe thiswas not possible like I don't know 20 30years ago and that's maybe becau 11�K�g#�OAO1EJnC0pjZIthank you guys thank you for joiningthank you CubeCon thank you cube nativeso um I'm I'm Prashant i'm uh the seniorDevOps at Morantis so today we arepresenting we're show showing how we aresolving real world case scenarios withedge computing with Ker Nuts andRaspberry Pies so the use case is thatuh in Vietnam and in Mauritius we aredoing the coral restoration project andso as to ensure that corals and algisesare in their optimal environment we haveto monitor all the scientific metricswhich is uh temperature nutrients levelsand water flows and then we have to dothat hourly and there was no oneavailable at 4 a.m to do the collectionso what we thought about is edgecomputing but then it came with aproblem that it was we had vendor lockin we had high le licensing cost andlimited customization so we were storedin a in a black box then we thought alittle bit more and then we came outwith our own solution which is uh this asubmersible boy IP68 waterproofenclosures Raspberry Pi 18 watt curvesolar panel and a lithium polycarbonatebattery various sensors and a 3G moduleso this is currently floating in the seaat 100 meters of the of of the uh theshore and youhave the uh so this is the second uhgeneration of of the of the of the boyandthen the software stack we chose was uhon top of the raspberry is K0 not ourgolang application so what's happeningthere is uh with K0 there is simpleprovisioning deployment it's taken careof and the with the applicationself-healing there is always autorestart health check and p reschedulingso and then uh resource limits thisensured that the platform which is uh inthe sea at 4 a.m is reallystable so this is the architecture wehave Cosmotron which is our controlplane manager from Mirantis it's opensource and it's controlling all thenodes uh in the open C with the 3G andconnecting uh connection is always uhstreamlined right so what is KS it is aan open source project donated byMirantis it is currently in the sandboxCNCS sandbox it is a single binaryKubernetes so you can uh it has zerodependencies so you just uh download itand install it without any anything youwill get a cluster within minutes it'sproduction ready and has support itaders to the uh to all the Kubernetesstandards and it runs anywhereeverywhere uh and has an active a veryactivecommunity so this is KUS installationstep we have only three step and you geta Kubernetes cluster running in twominutes right so you just download itinstall it and start it you have yourcontrol plane and yournode so we actually achieve very morewith less that is Raspberry Pi which isa tiny board but with big dreams andthen K0 which is which allowed us to doeverything and then we had nuts uh andthen well how uh who would thought thatKubernetes would float onto onto the seasomeday in a Tupperware so this is it areal life uhexperiment so call for action uh caserest is there for download and you havethe documentation and git repositoriesto use so please uh don't hesitate grabit and use it as as you can and thenwhat is next so this is cordant whichsits on top of k0 it is a our supercontrol plane management system which isuh which is been showcased which we havejust released version two uh and it uhit is our super control plane and it'sprovisioning workloads in minimum oftime on multiple cloud and on prem it'sa proven technology and it's currentlyprovisioning AI and ML workload togetherwith we are we have partnered with Jcoreand everything is ready within minutesto get your uh your uh infrastructureready to uh provision ML and AIworkloads so please do come to our boothat end 31 so uh to have ademo so thank you very much um please umscan and then all the documentationblogs and use cases are there on thewebsite um thank you very much2025-04-15 22:03:27.294277(ploy this imageregistry that's going to be leveraged tobe able to bootstrap all of ourKubernetes clusters and it turns outthat that was actually a much harderproblem than actually being able tobootstrap a Kubernetes clusteritself i think it's still an interestingquestion about where we go in terms ofthere are so many registries that wecould leverage right but how do we makesure that the images that we leverageare secured you know we've gotattestation we've got provenence etc etcright we can talk about sbombs all dayevery day but how dowe make sure that all of the images thatwe use even to bootstrap the clusterhave gone through some level ofprovenence and luckily in future and youknow relatively new releases ofKubernetes they have released thesesbombs in order for us to be able toleverage so I would say that there aredifferent levels of air gapped uh Iworked in a semiairgapped environmentbefore where you could download stuffbut you could not like have any data goout of the network uh that that is Ithink uh a sweet spot for someapplications like again if you had aregistry like you could still downloadthe images maybe you want to have a zeroCV style image um and then well uh youdon't want to have any data going out ofthe network i would say that networkingis the most difficult part when you aredealing with a airgapped environment uheverything else is pretty much the samesometimes you also have resourceconstraints like when you're working atthe edge maybe you're working onvehicles maybe you're working ondifferentum um different edge uh devices i don'tknow just uh a satellite or whatever ithink that some companies here around inthe showcase uh can talk about that ithink that that isuh the the most like big challenge whenyou have resource constraints and yournetworking latency also is differentlike when you are up there in the skylike maybe you cannot really uh downloada lot of data you cannot really uh haveall of the uh goodies that you have inthe cloud so to tackle air gapenvironment uh there's no one fits allsolution uh I would say that it's a mixof being savory saving enough to um todeal with the networking issues uh andwell praying as we saw before like justsend a nice prayer before doing that umand well uh you can work with uhspecific systems like I don't know tailsor um specialized OS that arelightweight and they're allowed to beimmutable so I think that that's uh alsopart of how you can work with your gapenvironments yeah I think you raised agood point there right about scalabilityyou you do have some kind of elasticitybut you're obviously at the mercy ofyour own data center in order to be ableto know how how large to scaleokay i'm super conscious of time so wecan start spinning the wheel yeah let'slet's start spinning the wheelso go with Matteo firstlet's see whichstar evf first okay all right one morecoolall right sustainability it isall right and for me what do we haveoh yeah service meshokay so on your slideo now there shouldbe a new there'll be another tab that'sfor Q&A idea being now for every uhtopic we're going to go through you'regoing to have your chance to ask us onequestion you can upvote questions thatare previously there we'll take the topone maybe the top two um so yeah come upwith interesting questions that you uhthat you want to gothrough amazingso cloud native complexity skyrocketedin the last few years um and I thinkthere's a gap in how we secure monitorand optimize systems at scale and here'show I learned about EVPFum that is a revolutionary I I think atleast way to deal uh with these issuesthrough the kernel but without being akernel developer without like patchingthe kernel like working in the userspace uh but still providing a sandboxto attach programs to the kernel space igave a talk about this uh in so lakecity at rejects last year you can lookit up it's uh I explained EVPF to mygrandma i have a series of those talksum and you can take a look for a deeperdive today it's going to be a littlemore um on a superficial side so whatyou can really do witheB transforms the kernel into aprogrammable) interface uh driving a lotof innovation in the observability spaceuh you might have seen in the projectpavilion selium falco some of thoseprojects uh give them a shout out likego and uh and uh maybe contribute evenwhich is nice uh think about anythingthat is tracing profiling advancednetworking uh package manipulation or uhdeep security application like any umanything that goes and really at theCisco level scrapes what is happeningwithin your cluster unlike thetraditional setups uh there's no needfor again lengthy kernel patches or uhcomplex workflows to uh attach um uh asidec car maybe uh you just work at eBlevel one of the difficulties is that uhwell it's C development so if you're nota C developer uh there are somelibraries that allow you to interactwith DBPF in Go but C is the native wayand well in this case you have the rightsuperpower to uh save the day most ofthe time industry adoption is growingagain uh like BPF with projects like BPFtrace for debugging corselium falco andsoon now when did I use uh these projectsuh in a fashion and security environmenti was working for a large fashioncompany uh as a consultant i was aconsultant at Steve uh and their multicloud infrastructure was like more openin a brew pub on a Friday night soeveryone was going in doing whateverthey wanted they even had like a cryptominer going on in their infrastructureand they did not notice that that costeda I think couple of millions ininfrastructure costs uh AWS was so niceto just like uh forgo the bill um anddon't tell it to AWS that I told youthis under my advisory they adopted uhFalco uh which is entirely built on EVPFuh to replace well legacy andnon-existent I would say security toolsuh I implemented the Falco agent Ideployed it everywhere uh even in theVMs like not only in the clusters um andwell I started putting up policies andthe performance went really great Theydecreased also the uh security incidentsof about at least a quarter which isstill great they had still some becausethey like Falco is not the uh holy grailyou still have your policies your uhyour access and yeah well they didn'thave a VPN so uh that's it pretty coolthing uh of course was to integrateFalco with their sidekick uh and its UIso you could also like leverage all theum uh sending real-time messages onSlack or Teams please do not use Teamsuh based on the securityfeed let's see what are your questionsabout EBPF if you have some in themeantime is EBPF a universal solutionfor future projects to leverage kerneloperationsi would say it's the first buildingblock it's fundamental so it's not um100% a universal solution there arepossibly many more uh but it's a goodstarting point to do monitoring securityand observabilityso running Kubernetes is relativelystraightforward right when we've gottens of nodes maybe hundreds ofworkloads but it gets far moreinteresting when you've got thousands ofworkloads and tens and thousands ofnodes right and really there's a numberof things that we need to start to thinkabout here right one is the controlplane itself how do we make sure thecontrol plane performance doesn't getdegragated at thatscale but we also have to think in termsof the teams that are leveraging theseKubernetes clusters how can we maketheir lives much much easier so for mewhen we're thinking about successfulscaling we're thinking about four thingsright the nodes is like I said we wefocus on the control plane we can thinkabout virtual clusters we can thinkabout multiferate federated clusterswith regards to teams it's really how dowe make the onboarding experience easierright when we've got strong namespaceisolation we've got resource quotaswe've got those networkpolicies regions is really about graphuh geographical distribution so do wehave a federated couple of clusters thatspan multiple regions do we havesingular clusters that are running indifferent regions like there's a numberof options for this right and then thefinal one is really multi-tenency somulti-tenency requirements could beevaluating virtual clusters it could beproviding dedicated controlplanes um we nee*d to come somewhere witha middle ground right you can have asingularuh cluster that essentially leveragessomething like vcluster to create thosevirtual uh control planes but then youobviously have that kind of ummanagement overhead on top you've got tomanage the underlying cluster and thenyou've also got to manage the virtualclusters that are running ontop so a media uh streaming platformthat I was working for that run out ofthe Middle East was probably one of thelargest clusters that I've actuallyworked on before um their suite was over50 clusters and it was about 10,000nodes and I think it was somewhere inthe region of about 150 product teamsthat were leveraging this and they hitcontrol plane limits because they triedto deploy everything to one what Icalled superclusterum but then they kind of split this upso they their solution combined physicalclusters but with cluster API withvirtual clusters using vcluster and thenthey had these um custom admissioncontrollers that were used to enforcepolicies across all of theclusters they had a very very complexthat's putting it politely um networkpolicy configurationuh I don't know how half of theapplications actually got deployedsuccessfully because there was so manyof these network policies that kind ofstacked on top of each other but theyactually moved to a service mesh thatwas then used to be able to providescalable and easily leveraged policiesso they essentially saw it as one bigcluster but it actually wasn't it wasmultiple clusters and they had somefancy ways of being able to deploy theirworkloads across this pool of clustersbut from a developer experienceperspective they saw it as one supercluster essentially and the result wasthat the platform availability went upbut the ability and need for thedevelopers to understand what was goingunder the hood actually decreasedsubstantially and it was reallyabout how can we best support thehundreds of teams that we wereleveraging to be able to deploy weeklybut still have the ability to keep thesecurity at some kind oflevel see what your questions are onscalability no questionsor how do you scale for networkcongestions maybeyeah so how do you scale for networkcongestions it's for for me personallyit was really about how dowe how do we distinguish clusters andtheir responsibilities right because alot of people probably think or have adevelopment a development cluster astaging cluster and a production clusterwhat we ended up doing was havingdedicated clusters that run specificfunctions so we had a a specificproduction cluster for function A andthen another cluster for function B andwe tried to minimize the communicationbetween them and essentially hadclusters that were dedicated for certaincomponenttypes what's for yousustainability i already see a questionfor sustainability thank you for puttingthat inso uh cloud native adoption skyrocketedand of course with this the need forresources but all of this has uh greatimpact on the environment especiallywith uh the the rise of AI and all ofthe machine learning workflows that arevery demanding in terms of power um inthis case with organizations um they'repri prioritizing uh of course profit asany organization ever but uh they'restarting fortunately to also prioritizecarbonaware computing and what does thismean it doesn't mean just costefficiency because that yeah cost ispart of it uh but sometimes you mightwant to pay a little bit more uh to geta little bit less carbon uh we only haveone planet of course we need to get takecare of it and our impact in tech canactually be huge yes of course we shouldstill recycle go on a bicycle when wecan uh but even optimizing one uhworkflow even optimizing our code uh andthinking about how many CPU cycles arewe using that's more than recycling onebottle still do it please but yeah everycontributionmatters now back to carbon awarecomputing what does that mean well thatmeans three things carbon intensityenergy efficiency and measuring impactso first of all you need to well uhthink about your cloud provider do youneed to use hyperscaler do you alwaysneed +to use that could you go on maybe aminor cloud provider uh minor in termsof scale um that maybe uh reheats uh acacross the city um some buildings forfor example we have uh leaf cloud in theNetherlands uh that does this very wellyou have uh Civo in the UK who's alsopartnering with um uh with anothercompany that has dislocated data centersthat allowed to like use the heat ofthose uh for heating houses or uh publicspaces and with this you are alreadyreducing yourfootprint aside from this you shouldalso measure your uh workload uh carbonfootprint there are some projects forthis we're going to talk about thatlater um and there is also for example aproject that I love which is cube greenhow many of you heard of cube green oruse cube green currently perfect eventsuh Cube Green is a project that allowsyou to uh well using the fundamentals ofuh sustainability turn off your forexample test environments during theweekend and then turn them back up on uhon Monday uh same for time of the daylike do you really use your testenvironment from 8:00 p.m to 8:00 a.mand the next morning probably not so youcan turn them off of course this savesalso money to your company which isgreat uh but it also saves resources anduh carbon that doesn't get emitted intheenvironment now carbon intensity uh isnot just about uh cost savings but it'salso about finding the most energyefficient uh region uh in a determinedtime scale uhwhere your uh your workflow is usingless energy for example uh I was workingon a project to reduce uh the emissionsof a company and I found out that uh thethe region in uh in Europe where thecarbon efficiency is is the be is thethe best is Sweden so if you want to usethe Swedish region for uh whatever cloudprovider you're using that's usually theone that has the best and cleanestenergy due to the mix of resources theyuse to produce itso in short embodying uh practices likethis and optimizing uh the resourcefootprint of all of your software uhusing of course like things likemulti-tenency when possible leveragingefficient containerization like evenslimming down your containers is greenerum it helps uhand this is actually the story that Iwas talking to you about the cargointelligence one and your question wasvery interesting onslido it was how often projectmaintainers advocate and forcesustainable coding practicesin open source there is less of anincentive uh economically speaking tomake uh workloads uh more energyefficient and so less expensive in terin terms of that um I would say that itstartsby having a few so open an issue in yourprojects and ask u what are they doingabout carbon efficiency go uh to thecurrent tax sustainability uh they haveuh carbon reviews which is an initiativeto measure the carbon footprint of everyCNCF project and like that we can startand think about green opensource cool so the the next topic is isGitOps right and probably people thatare familiar with me know that I kind ofrave on about this thing constantly soperson the flux person uh Stefan um sowe started with infrastructure as coderight which was really about how couldwe define our infrastructure and kind ofversion control it but then along camethis concept of GitHubs whereby we couldstart to see git as our source of truthfor the workloads and the configurationsthat we were leveraging to be able todeploy our clusters and it was kind ofthis fundamental shift when thishappened about the operational mindsetright we had to start seeing seeinghopefully manual intervention assomething that happened less often thanprobably you know the future the the thepast before that so when we think aboutthe GitHub sweet spots for me obviouslyKubernetes is the obvious one rightwe've we've got thousands and thousandsof lines and files of YAML and we cankind of circulate them and keep them ina centralized location whether that beone git git repository or whether thatbe a multitude of git git repositoriessplit across multiple functions so maybeyou have all of the security features uhthat are owned by the security team inthe security repository you have thereleases for product team A and aproduct team Arepo multicluster consistency becomeseasier in my mind right we can start todeploy and manage multiple clusters frommaybe a singular repository withmultiple environment overlays or thesame kind of concept but with multiplerepositoriesand really it allows us to be able tosee that drift between clusters a littlebit easier right and we don't have thesekind of scary things that are running instaging that have been running there forages but we're not really sure what'sgoing on in in production and thenfinally for me at least the the mostimportant one is kind of auditabilityand compliance i spend a lot of timewith financial institutes and if you'veworked in a financial institute this isthese are the two terms they bang onabout the most so auditability is greatwhen we're using Git we have a reviewprocess we haveapprovers and we also have the abilityto be able to provide some kind ofcompliance report and based off of thosegit that githistory so I'm conscious of time so I'mgoing to I'm going to skip the exampleand go straight to the questionother than telling CD to ignore elementshow can we integr as Ican upso we've spanned through super quicklyfive cloudnative topics right and eachrepresents this kind of different facetof the cloud native landscape but whatyou will see is they're allinterconnected somehow sometimes that'sabundantly obvious but sometimes it'sless obvious and this is the fascinatingthing right it's really about how thesetechnologies come together and it isn'ta coincidence it's the power of theginormous cloudnative ecosystem and thelandscape that we currently leveragethere is one thing uh away from thistour which is successful cloud native isnot about implementing every new shinytechnology it's about selecting andintegrating tools that you already havethat aligned with your specificchallenges uh the most successfulorganization we worked with uh didn'tchase any new project uh announcementlike sometimes yes but not always uh andthey built a strategic foundation firstand then based on actual needs theyevolved that the topics we covered todaygave us at least I hope so thosefoundationalelements so where do you as an audiencego from here right and my suggestionwould be focus on one of the areas thatwe've discussed that resonates with youthe most start small right experimentbreak things learn new things and thenbuild competency via expanding and thesecond is connect with the communityright hands up in the audience who'sattending their first CubeConcome to a second and a third andcontinue right so leverage the communitywe're all here we've you know we we'veall got our own battle scars from manymany different organizations manydifferent clusters and we've allleveraged a a multiple uh toolingoptions right so it's really aboutleveraging and building that connectionand building that network you can'tpossibly be an expert in the wholeentire CNCF landscape because to startwith you have to actually zoom out onyour browser to be able to see them allum and then I think the most importantthing is your your cloud native successis not measured by what fancy toolingyou use or what version of Kubernetes orname any any tool you're using it'sreally about what are the businessoutcomes are you using technology as adriver to be able to drive businesschange and business valueso today we spin the wheel uh on thesetopics but your organization journeywill follow its own path the key isunderstanding that the landscape um willbe spinning blindly so you're going toadapt to that uh cloud native is notabout technology is about possibilitiesthank you for taking this journey withus today and now go and spin your wheelof fortune if you want you come to theGitHub booth and we have a real uh wheelto spin and you can win some swag uh andyou can find me there after the talkif you want to rate our if you want torate our talk we have a QR code and ifyou want to connect with us we also havethe QR code with our faces which isreally funnythank you much for your time and enjoythe rest of the conference2025-04-15 22:03:27.988434 ��M�h#��QAPmba7R4_4oUall right welcome everyone so by show ofhands how many of you have sat throughconference talks thinking I wish they'danswer the question that I wanted themto answerokay so today we're going to flip thescript it's going to be completelydifferentthis isn't just another talk it's theCloud Native Wheel of Fortune and youwill have the opportunity ofcontributing one topic on top of theothers that we prepared in our wheel offortune and that's just the beginning sowe're going to spin the wheel you'regoing to have four chances to be able toyou know select topics but you're alsogoing to get the chance to ask us aquestion think of this like speed datingfor cloud native so we give the we giveyou the essentials on every of thetopics and then you're going to enrichit with your answersso before we spin the wheel I'm SteveWade i'm a cloud native catalyst fromthe UK my background is really inplatform engineering from running largescale Kubernetes clusters to debuggingservice measures and kind of a biteverything in between i'm at Bianke i'ma CNCF ambassadors i work for as asolution engineer at GitHub and I'm alazy engineer uh well I like to automatestuffso from CI/CD and uh everything else inbetweenum now please scan yourslideo we have 100 spots so we it shouldbe enough if you're not in it's stillfine these are the topics that we havewe have already AI ML EVPF KS releaseteam do not put those already because wehave them already in the in the wheeland please be as nice as possible do notput quantum computing now that I say itlike everyone's putting thatso we're going to give you about 30seconds the one that has the biggest uhcloud is essentially the one that we'regoing to spend the next five minutestalking about so this is you guysgetting to decide what we're going totalk about for the first five minutes ilike yourgap from engineering is already on thewheel apart from engineering and githopsis already on the British food that's uhBritish food we can talk about Britishfood if your brand yes I I likethat okay all right choose a topici would say Oh oh go on air gapall rightall right so we're going to talk aboutairgappedenvironments so hands up those of youwho have been running Kubernetesclusters in airgappedenvironments hands up if you enjoyedthatexperience okay so probably the mostinteresting uh example that I canremember is when I was spent a long timein a financial institute and fullyairgapped and by fully airgapped I meanlike you weren't even allowed to takeUSB sticks in it was very very highlyregulated and one of the interestingthings that we had to do is you canbootstrap Kubernetes right that's that'sa relatively well-solved problem howeveryou need to somehow get the images thatare required to be able to runKubernetes ities somewhere for you to beable to leveragethem so the hardest thing for us was toget the ability for those images toexist somewhere that was or wasn't on aKubernetes clusteritself so the interesting concept thatwe tried to come up with was really howdo we where do we de' �� r�j#��A5ZWbS01wCMkhey everyone my name is JoeI'm LauraI'm AishiroHi I'm MaximSo we have a little bit of um we're justgoing to do a little bit of slidestoward the beginning Uh talk aboutreally briefly for anyone that doesn'tknow what containerd is A little alittle intro Um we're going to do arecap over 2.0 that just came out backin November Uh and then we'll talk aboutuh 2.1 that's going to come out Uh somechanges to the project including a newrelease cadence uh some changes to ourLTS m.�:�i#�-AnESkFf0j_7cthat was an amazing set of keynotestoday Thanks a lot to all our speakersand then of course to the wonderfulaudience as well Before you leave pleasedon't forget to check out the sponsoreddemo theater in the solutions showcaseThere are a lot of innovative and livedemos today starting right now duringthe break So do not forget to check outthat Check skare.com for more detailsAnd while you're in this showcase wewelcome you to join us in congratulatingthe two newest projects uh CubeFS andInto the fun starts at 3:30 So grab atreat and stop by these project kiosksto congratulate them Congratulate themthisafternoon Yes And when you're donecongratulating them we invite you backhere in the ICC auditorium at 4:45 p.mfor our lightning talks We have anamazing lineup and you won't want tomiss these today Breakout sessions willstart again at 11:00 a.m and a reminderthat lunch will be served in thesolutions showcase Again we hope youhave a great second day here at CubeConCal Native Con See you tomorrow for ourlast set of keynotes Thank you2025-04-15 22:03:28.476530/odel Um some things that we've donearound the kept process and uh uh runswhich is a sub project of containerd andthen we're going to just do kind of likea panel discussion and talk through acouple things al togetherHiro so introduction to conj but uh I'msure most of you are already using uhcon because uh if you are using a dockeryou are already using cond and if you'reusing kubernetes it's very likely thatyou are already using cond because it'sadopted by wellknown enterprisedistribution of kubernetes and the uhmanager services ofKubernetes and the ecosystem ofconsiding So in addition to docker andkubernetes we also have uh command lineclients such asnctl and snapshot of plugins for razloading which means running uh containerbefore uh print entire image and we alsohaveruntimes such as uh webassemblies and here's introduction tontl Continuity CDL Uh it's a commandline client for continuity uh with thesame UI and user experience as docker CIIt was originally made for facilitatingnew experiments in the contin platformuh such as the study the image formatfor lazy print and faster rooresscontainers with uh by net ns and ncttlis also useful for debugging run thisnodes because you can just use the samecommand line as the docker commandSo uh let's quickly recap what 2.0 is Uthis is our latest and the greatestrelease and it took us a while to craftalmost 18 months So we decided torevisit our release schedule a bit to bemore aligned with Kubernetes Um Laurawill cover this in a bit So with 2.0 wehad two major goals in mind uh first onewe wanted to stabilize lots ofexperimental API that we introduced in1.7 and the other one we wanted toaddress technical depth that weaccumulated over time this is our firstmajor release since uh 1 1.0 so we havea lot of we had a lot of deprecatedfeatures that we carried for backwardcompatibility reasons and 2.0 to was agreat opportunity for us to cleans thisup refactor project structure and buildbetter foundation for futurereleases This is non-exhaustive list ofthing that we had to removeand some of these introduce breakingchanges though we tried to keep uh theblast radius as as minimal as possibleSo if you're using docker schema v1 youshould migrate your images to image specWe removed support of legacy v1 runtimesin container 2.0 in favor of uhshimb2 A UFS shutter was duplicatedsince 1.5 So we removed in 2.0 as wellAnd there is a lot of smaller thing thatcould potentially affect endusers Um here is also non-complete listof features that we've added in 2.0 Solet's try to quickly recap what arethese uh transfer service is now stableIt's a new API that we introduced in 1.7It transfer transfers artifacts fromsource to destination Uh but also itrelies on birectional streaming and datachannels as a first class citizens So itoffers robust and much nicer API Wecurrently use it for image pools oncontainerd client side So we don'treally use it in CRI but this issomething we want to add in 2.1 SendboxAPI is now stable as well and sandboxCRI is enabled by default It adds anabstraction layer layer for uh C portsandboxes to allow more granular controlhow a group of containers are definedThis API offers you know a much nicerway to support VM style containers In2.0 we ship a default implementationwhich implements plus containers underthe hood which also serves as an exampleof how to implement it And array isanother API that we introduced in 1.7 asexperimental and if you're not familiarwith it you can think of it as a ummutating web hook for containerconfiguration on contain site um it isnow enabled in 2.0 by default We alsointroduce a new extension point calledimage verifier plugins Uh it's an execbased plugins will call during imagepool and it helps to cover uh variouspolicy enforcement use cases like pullimages only from trusted sources Uh in2.0 Though it currently integrated withtransfer service and so it can be usedfor continuity client but we plan to addsupport in CRI for legacy pool as wellFinally we've added IGZ support whenpulling images and if found on yoursystem uh it will greatly improve imagedecompres0sion performance when pullingimages and a lot more but 2.1 is comingsoon Umhero and uh we are heading to releaseversion 2.1 next month and one of thenew features of version 2.1 is a supportfor FS enhanced reloading file system Uhso with this file system uh you need uhvery recent uh Linux version 6.12uh but it's more optimal for images uhwith uh manylayers because uh the legacy overlays isquite slow when you have manylayers and in this race we also supportimage volumes uh which means uh mountingimage as a human body uh for examplethis is used for for distributing AImodels as a OCI image so as to split uhcode image for the data image so thatyou can just swap uh data image forchanging the AImodel and in version 2.1 we also supportwriting thrush s thrushFS/ Cg groupoup so that uh container camuh controls the resources such as CPUand memory byitself and we also enhance the supportfor user name space so that the UIDmapping range can be now uhnon-ontiguous range for moreflexibility and here is our update Inntl version2.1 in ntl 2.1 we support uh user nsremote mode Uh this is similar torootress mode but uh it's uh differentSo root mode means executing everythingas a nonroot user Uh this has beenavailable since bas 0.6six and user NS wrap mode meansexecuting containers as a nonroot userbut uh contrary itself uh still run atthe root so this must not be as secureas root but it's more faster the rootmode and in version 2.1 uh we also Sohave experimental support for gel uhwhich means uh jail for gore sorry jailfor coremodules Uh this means imposing uh Ciscorestrictions on a specific set of gomodules uh so as to mitigate uhpotential vulnerabilities and supplyattacks you know in the last year uhthere was a terrible supply chain attackin XD uh compressionlibrary and the same uh instance maypotentially happen for go modules aswell Uh so this uh go to jail uhdistricts uh executing uh share commandsand reading or writing files or creatingnetwork soits Uh this is not fancy uhbecause uh this is not applicable to uhmodules that use unsafe pointers orreflections But uh I hope that uh thiscommod can uh reduce uh the attacksurface in the uh supply chain of gomodules Um in terms of project changesuh we're going to be doing a new releasecadence So um new minor releases every 6months uh starting with 2.1 in May and2.2 in November and then hopefully every6 months from then on um with beta buildstarting 8 to 10 weeks before a minorrelease Uh and feature freeze RC's twoand four weeks um before a minor releasewhich will be around like April andOctoberUm we're also doing some changes to theLTS mo uh model So we're going to behaving named volunteer maintainers foreach LTS branches Um LTS releases aresupported for 2 years but that can belonger depending on at the discretion ofthe branch owners And so the currentstatus is 1.7 is now is LTS Uh and EOLis scheduled for March 10th of 2026um or 1.7 LTS and uh 1.6 end of live isscheduled for July 232025 soonThat's me nextSo in terms of uh project changes wewe've also um adopted some changes tohow we handle incoming caps from theKubernetes pro uh Kubernetes project Thegoal here is to have some increasedvisibility and communication uh so thecontributors have a better experience uhand so that our partners in Sign Nodealso have a better experience workingwith containerd Um we now have newtracking issues in the containerd repoto capture containerd specific state foruh contributions and kepts that are thatare going on in containerd Um we've alsointroduced two new roles in thecontainerd project to help with this Soone is a a kept sheepard which is acontainerd maintainer that gets assignedto a particular ke and is responsiblefor helping it move through the processand make sure that everything stays ontrack The second is we uh now have asign node liaison role um who's going tohelp work with sig node inside thekubernetes project to make sure that wehave a clear line of communicationacross everything that's going on Umthis change covers both like caps whichis the main part of it as well asanything else that sig node ne1eds out ofcontainerd project um whether or notthat's that's driven through the keptprocess Um we've got furtherdocumentation about that on thecontinity website as to what exactlythis is going to mean but our goal hereis to make everything go faster and moresmoothly All right So now I'd like totalk about some innovations thathappening in the shim layer in theecosystem runi is the most efficient andcost-effective way of running webassembly workloads inkubernetes Uh web assembly is gainingsome popularities these days Uh it's abinary format and that promises you touh build once and run anywhere Andrunwazi is a rust implementation of thecontinuity shim um that designed tofacilitate running web assemblyworkloads in kubernetes and includessome readytouse shims for you to uhdeploy to the clusters including wasn'ttime sham wasn't a shame and more Itempowers spin cube an CNCF project thatstreamlines the deployment of serverlessweb assembly applications and is used inproductions in Azure Kubernetes serviceand SQL civiletc Today this uh deploy a web assemblyapplication to Kubernetes is reallysimple Uh there are some tools toautomate this away such as the spin cuberuntime class manager but under the hoodall you need is a runtime class whereyou apply to Kubernetes and there's ahandler in the runtime class that youcan configure in the continuity configwhere you say the CRI runtime type isthat uh wasn'ttype So uh as you may see the partspecification is exactly the same exceptone line change where we add a runtimeclassname is 1.0 and is ready for productionand just last week we released the firstrelease candidate and of 1.0 and ifeverything goes well we'll release 1.0in probably like a few weeks It providesa shim and sandbox interface where youcan use the library to implement yourown shims for running web assemblyworkloads It can be implemented for anywic runtimes and we have done extensivebenchmarking and it supports OCIartifacts and some cool optimizationswere into specifically for running webassembly in the kubernet on the serverside Um and one cool optimization isthat we cache the pre-ompilation modulein containerd So if you run the sametask with the same im image uh you don'thave to compile wisen to native codeevery time because you can just use thecached module and uh because of yukianothercncf project written in rust uh toimplement a container runtime we canexecute Linux container and wasn'tworkloads side by side in the same podWell that enables you to apply theservice mash or any other psycharpsychar containerpatterns Lastly I want to talk about theperformance We have done somecomparisons between the wasant timeshame in runwazi versus thedistrolless time container in run C Andwe run that concurrently on a thousandtasks where run C as you may see is fourtimes slower than the wasn't time sh andpart of the big reason is because of thepre-ompilation technique that weoptimized for runwise and another reasonis because runisi embeds wisen timeshame so you don't have to reload thecode for the engine every to the memoryevery time you run the tasksWe have done some stress test and thisis another example where the uhpre-ompilation OCI artifact outperformednon-p pre-ompilation containers Uh asyou may see the pre-ompilation run 60task per second and non pre-ompile runsabout15 So that's it for runwise and you cango to runwise.dev F to learn more learnmore about it or you can go to the Ronzichannel in the CNCF Slackgroup Cool So I think we're going tomove on to a little bit of a paneldiscussion The rest of you want to youcan all sit down You don't have to staystanding anymore Um we tried this lastCubeCon uh just to have a conversationamong maintainers and I think it workedpretty well Uh so I'm gonna I'm gonnahelp get this started and sort ofmoderate a bit Um we have a couplepre-prepared questions and then we'llsee where we are with time and maybe wecan have some audience ones too Um butfirst one I think it'd be nice for usall to talk about how we got involvedwith containerd Um maybe we can startwith Laura on the end and just go backdown and2 I'll I'll be at the end of thatSure Um I think I've kind of beenworking my way down the stack So I wasat Docker contributing to Compose andother way higher up things andeventually I was like well I kind ofwant to dig into a little bit more So Istarted doing engine work and I wasdebugging something in the um engine Iended up contributing something tocontainerd and just kind of went fromthere I thinkin my case I'm a maintainer of ROI Uh soit was almost inevitable for me to uhget involved with ID because uh moviedepends on IDWell for me um when I start firststarted working in web assembly I knewit has to be integrated into Kubernetesfor packaging distribution and scalingup Um but just by bounding web assemblyinto a container with its runtime andoperating system it's not the idealsolution because it obl oblivviates thereason to run WSON on the server becauseWSM can be portable If you bundle withthe runtime operating system it's nolonger portable So we have to integrateWom into Kubernetes and there areprevious attempts such as reimplementcublat in rust to run web assembly uhbut some of the previous attempts failedbecause they are incompatible with thebroader cloudnativeecosystem and uh so we when we learn thelessons we look at the whole stack wehave to go lower level and thecontinuity is a perfect perfect matchthere because um it is the standard torun containers in Kubernetes So BrianGoff a software engineer from Microsoftand the reviewer of continued actuallystarted run a continu shame and wellthat's how I got involvedUm for me I think my first contributionwas in 2018 Um I worked with them at thesame team and I was really excited aboutcontainers and looked for a way to youknow get involved Uh so I think my firstcontribution was to implement pixie decocompression for contained to improveimage de compression speeds when youpull images Um and you know I think it'swas it was like good first issue type ofissue So if you're looking for a way toyou know get involved that would be agreat startI think I'm the last one Um so my myfirst involvement was actually withDocker before containerity was split outand then uh as containerity got splitout into its own project I I got to beuh involved there I think early on I wasfocused more on like testing and um sortof minor but like code cleanlinesschanges to containerd as I was trying toget up to speed and learn more about thearea Uh and then later on I was able toto do more I think in about 2017 mighthave been when I formally joined theproject as a security adviser um whichis a role that we have to help with umsecurity triage and security responseand then from that I grew into um doingmore of the the maintainer work on theproject Uh so maybe we can talk aboutnext thing Um I think everyone mighthave an answer to this but what'ssomething that you're excited about inthe container space whoever wants tostart offI can start Um I'm a big fan of Rust andwe maintain Rust extensions in containedand I'm really excited to see thingslike runi get into production running onbig on big workloads This is very niceYeah just to follow on that I'm reallyexcited as you may have guessed thegrowing popularity and interest in usingWisen as the first class citizen inKubernetes and containerecosystem and um there are communityefforts within CNCF to standardize anOCI artifact layout for WAM and it justbrought consistency and the broadervisibilities and that can be applied toany WM projects or the consumingprojects and it also solved the registryissue where you can now package wisemodules as a OCI artifact and store theminto OCI registries So that's prettyexciting Um I'm also very excited aboutsome of the virtualizationuh innovations in the cloud computing Uhfor example Hyperlite is another CNCFproject where it can it's a hypervisorthat can spawn a micro virtual machinewithin two milliseconds and because itcan start up so fast you can essentiallysandbox your HP request every requestand so you can imagine uh what's the usecases uh to run that as a container inKubernetesYeah Yeah So I'm I'm excited about uhRASM as w3ell So ROSM is a veryinteresting technologies because uh youcan migrate the work world from thecrowd to a web browser on desktop oreven on smartphones and vice versa Sothat means potentially the contenttechnologies with res can be also usedfor desktop computing and even mobilecomputingUm I think there's I was trying to comeup with an answer to this for a whileand I think there's a lot of coolinteresting things going on but I thinkthey paradoxically I'm kind of excitedabout the fact that containers are kindof becoming boring in the sense that youknow ecosystems a little bit more matureand like APIs are a little bit morestable which means we can have a lot ofextension points and like integrationsand we can do cool stuff like you knowhave run like wy shim and have thesekinds of stable APIs that let us do coolthings and extend containerd and allthis stuff in the container worldYeah just wanted to add like containerdis highly extendable and it tries to beas unopinion and unopinionated aspossible and basically there is anextension point for everything if if youcan find a way to extendWe're going to have another questionabout that We we are but you can finishand then I'll gothat sounded very logicalcontinuation but anyway so if you cancannot find how to extend any let usknow we will add it so we do have aquestion coming up about this but uhI'll answer sort of the first questionof what are you excited about and thething that I'm excited about isextensibility um so we've talked about afew of the things already um the shimshim extension point lets us plug inthings like WASOM uh with via RunWazibut it also lets us do things likeexpand to other operating systems Sothere's a um that's part of how ourWindows support works It's also how wehave a shim for FreeBSDum and sort of like a a really naturalway to expand containerd Beyond that umthere's also an extension point that wehave called snapshoters um which is howwe deal with storing image layers And uhAkiheriro touched on this a little bitearlier um that NerdCTL was one of thethe places for experimenting with uhlazy loading and we have a lazy loadingsnapshot a couple of them We have wehave uh stargaz snapshot and then uhoverlay bd and nitus that are um do lazyloading and then there's a couple othersthat are elsewhere in the ecosystem likewe have uh I work on GKE We have a alazy loading snapshot in GKE that powersa feature called image streaming So thatkind of extensibility is really excitingto me Um one of the new things thathappened in containity 17 and is um morestable in two and then we're going tocontinue to iterate on it is somethingcalled NRI uh which is node noderesource interface Uh and that's anextension point that lets you do thingsvery similar to a mutating web hook umbut for the container configuration So aa podspec you put in a podspec inKubernetes it comes through the CRI uhand containerd will generate an initialset of configuration for that containerbased on your input But an NRI plug-incan then go and uh adjust that and dosome interesting things based on that Sothat's something that's reallyinteresting to me Um and talking aboutthat is is extensibility So NRI andchimps and snapshoters I touched on someof this already but I don't know ifanyone else wantsto talk a little bit moreYou covered itOkay Um Laura I think this one the nextone was for you Um you wanted to talkabout shim correctness and performanceand things like that specifically for meUh yeah I think I so put this herebecause I just recently opened a um PRadding some weird uh formal verificationto the continuity run C shim and theimpetus for that was kind of lookingback at a couple of regressions we'vehad um over the past year after we kindof switched out a not very performantbut simple uh concurren currencymechanism in the shim with a much moreperformant but a little bit harder tounderstand one and then we ran into acouple of things Um and so I thinkthere's a couple of interesting liketradeoffs between performance and alsoif I look at this can I understandwhether it's correct or not um and a fewways to um address that One of which Iguess is formal verification but I don'tknow Does anyoneelse have thoughts about thatno I I think I think formal verificationis going to be really helpful to makingsure that that continuity is reallyreliable So I'm excited to have that inthereI mean I will say if you know the shimwasn't in Go and I was like in Rust orsomething we would have better toolingto just formally verify or extract someof that information instead of justwriting a spec but not going to do thelet's rewrite it in Rust meme Sookay Um we had a couple more preparedones but we're running close on time SoI'm going to skip those and maybe openfor uh audience uh questions at thispoint and I'm just going to stick thisslide on the screen of like ways to getinvolved But um if anyone has questionsI think we have a mic up here We canpass aroundUh thanks very much for the talk Um Iwas wondering with uh the sandbox API in17 and now stabilized in 20 uh how longit will be until we see uh microVM baseduh isolation uh security isolation Sonot not for WM which you've talked aboutbut just generally for containers uh soyou know firec cricketer type things uhso we can have an actual sandbox uh as acontainer abstractionYeah So we introduced sandbox API in 1.7as an experimental one because we reallywanted to give community some time toyou know get familiar with it andreceive some feedbackUm in 2.0 Oh which was released quiterecently It's now stable So we think APIis going to be like more or less thesame We might add add some features infuture but more or less it's welldefined bynow Does that answer thequestion that much I knew but I mean interms of when we'll see uh microVM baseduh sandbox isolation actually beingusable So something like firecracker oryou know some something else thatenables us to have microVM based uhisolationsorry already started implementing thatso we have a request that was maderecentlyso now but it was only for the time sowe are halfway there I didn't hear anyof thatI'll repeat um he was from the Kataproject and said that they're abouthalfway through integration uh with thewith the new sandbox API and I thinkthat's really exciting So Kat's probablygoing to be the first to have that realkind of microVM integration that you'retalking aboutYepNo no I justsomething what really bothers me for awhile but one of your slides actuallyremind me about uh the runtime handlersSo now for let's say like if I want touse v was uh I need to go to kublet ohsorry to kubernetes I need to createruntime class object I need to not missnot to misspell the handler name when Ineed to go to container deconfig andwrite all those parameters and so on Doyou see uh or do you want to make animprovement what like with container uhruntimes will be exposing to kuberneteswhat kind of runtime handlers areconfigured so just to remove this bunchof manual steps to configure themUmso there are projects like spin cuberuntime class manager that manage thelife cycles of runtime classes and thatcan abstract away some of the complexitythere Um I also thought about the ideabut wehaven't done any real implementation orprototyping is that if containerd candynamically check the image type and runif it's a wen image just run a wenruntime that could also reduce thecomplexity Did I answer that questioni think we're a little bit over time nowSo um I wanted to thank everyone forcoming and uh I think we're all going tobe hanging out here for at least a fewminutes after if you have more questionsyou didn't want to ask in front ofeveryone else But uh thank you everyone[Applause][Music]2025-04-15 22:03:29.063660 ��^�k#�uAt8p7S-46SWIhello everyone my name is Vadim Mau i'mone of the co-maintainers of ProjectHarour and in the next four and a halfminutes I would like to share with youhow we are using the project uh or Alfixmentorship program to ouradvantage and how it's worked out for usuh in the last two and a half years soit's a brief recap of two and a halfyears 12 mentees later and what welearned from that and how it works usfor us well uh to for the context if youdon't know it the LFIX mentorshipprogram is a program run by the LEUFoundation and CNCF and it connects thementors and mentees together to work ona project uh under guidance supervisionso they connect on a platform those twoparties together and they provide afinancial compensation for the menteeswhich is nice um who can apply everyonecan apply but what we see from theapplication that we get is that mostlystudents apply um from the STEM sectorand mostly students from regions wherethere is young population which is uhIndia not India but I would say Asia andAfrica andum when we started two and a half yearsago you know when you get like 20 to to60 applications per per termYou need to review this there's like lotof CVs that you need to review conductinterviews and you need to zift throughall this AI applications that you getnowadays and ifyou are attempting to write anapplication with solely AI don't do itit everyone sees it everyone knows itand you will get directly discardedright so don't do it even if you don'thave much to say don't say much rightum so but even given that we can zip outa lot of applications there's a lot ofwork to do right and what we found outover the course is that people who areinterested in open source and they wouldlike to know what they actually signedup for right so they try to find outfind out like what do I sign up for whatis the thing doing that they're doingwhat is hardware doing how does it workhow do I install it how do I run it whatproblems it solve and so people tend toask question about the the product andbased on the engagement and thecommitment and the incentive uh in weselect our candidates right so it's forus the outcome is that we have to reviewmuch less CVs I mean we still review CVSbut not as the first process but as alast process right just to be certainand I think it's also has less biasbecause based on the GitHub profile orthe the Slack handle you don't see thethe person's gender race or whatever umand so far it was great work great forus we had one failed completion in theearly beginning we have uh one newproject maintainers coming direct fromthe mainten maintainer program uh wehave two employments one directly in theopen source world and one in in thecorporate world um we have two createtwo projects that we've been created aspart of the uh mentorship program a subproject of harbor we don't let thementees work on the core product becauseit's a lot of time consuming and also umoften times it's a bit critical right soyou need to ship features and here wecan um work on a way that it's done whenit'sdone and uh yeah for us it has been agreat success and I would say that youif you're a men mentor or you want tobecome a mentor for an open sourceproject you should definitely try thisout um but you know be aware that thetime frame mentees are working it's like3 months only you cannot enroll thewhole process of uh you know selectingthe the candidates yeah you need to findinnovative way how you can how we can dothis and for us the the criteria ofengagement works the best um yeah if youhave questions reach out to me you canfind me online um you can find me herethank you very much for your attention[Applause]2025-04-15 22:03:29.567256 ''�U�l#�cAERztKTd-ckAhi everyone thank you for coming My nameis Schlomi I'm with Planet Scale and aVites maintainer for the past five yearsUm I'm standing in um replacing DipyZigeredi the uh leader of the projectwho unfortunately could not come heretoday and I'm here to talk to you aboutuh unlimited scaling in Vit What'sunlimited scaling anyway so Vit is anopen source CNCF project graduated aboutsix or seven years ago It's a shardeddistributeduh relational database framework thatruns on top of my scale You've probablyused it You are probably using it Everymessage you send in Slack is stored andserved by Vitess Every pull request onGitHub is stored and served by VitesCashup blocks Cash App usesVit And these companies are able toserve many millions of queries persecond out of a relation database uhusingvit So there's a large community and alarge userbase We actually don't really know whatwe don't know Uh it's open source so younever know who's using your productright uh I think in this room many canuh can relate you know a user pops in inSlack asks some question turns out therethere's this mass operation they'rerunning you know critical app in somedifferent country we've never heard ofbefore Where did you come from who areyou soum we keep we keep hearing about newusers um large users and keep gettingmore input from the communitySo the V test story is uh a bit peculiarIt was created by YouTube to store andserve their entire videometadata and uh YouTube was subsubsequently purchased by Google and aspart of that Google migrated VESinto probably one of the most hostileenvironments a relational database canexpect to run on which was uh Google'sBorg Borg is the predecessor toKubernetes And so Vitz is kind of oneprobably the first relational databaseto run at scale on a Kubernetes likeenvironment and uh Google uh later onmoved on to uh run its owninfrastructure and uh contributedvitf and we've been maintaining it eversince Vitz achieves scale by shardingYou may have heard about uh autosharding That's not how Vest works Ittakes a different approach It takes a anapproach of custom or configurablesharding key It's a key design choiceforVES It comes at a cost upfront You as auser need to think about your shardingscheme need to think about your data andhow you might want to approach your dataThat's the cost But once you buy intothe sharding world on you're in shardingsphere things become very flexible Sothe most obvious thing you can do is toscale out right your uh data is growingYou can reshard you can split your yourdata into smaller or more more shardsand grow with those But you can alsoreconfigure some some parts of your uhsharding scheme You can reconfigure yoursharding keys You can use compositekeys Not all shards of data and not alldata is created equal Some someintentionally or unintentionally uhreceive different more intense workloaddifferent kinds oftraffic With this custom sharding schemeyou can further reshard specifichotspots of your dataand really tune fine-tune it to matchyour application's uh querypatternsAnd the key thing about Vest sharding isthat it's managed by the framework Soeverything happens online with zerodowntime Uhsorry right it's a bit of a provocativeuh slide but only a little bit So uhwe've seen users struggling with uh thenumber of connections on Aurora or thenumber of servers or the server capacityuh on RDS Moving into the VES they getthe flexibility of running morecommodity hardware You can scale outwith more shards uh commodity hardwareor you can scale up your servers useless shards then scaleout Uh looks like I just ran out of timeSo please um join us on the Vitestcommunity on Slack Uh askquestions open issues on GitHubOtherwise please check out our docs anduh website Thank you so much2025-04-15 22:03:30.1457797A devops s sur platformengineer uh but basically my entirecareer has been how to help mycolleagues uh now and you knowthroughout that as well a big part of mytechnology journey has been about beingin the community so I'm a CNCFambassador team topologies advocate andjust general enthusiast of just meetingpeople getting to know people so comechat with me about things uh And Philput up his mantra and I swore I'd comeup with something witty before wellyou're gonna have to come up withsomething on the spot you've got aminute while I do my intro and I thinkabout it and you can get get back to youif you want deal right so I love thatyou think about the experience becausethe DevX for me is really reallyimportant so that today we'll show youthat whole DevX as well and I love allthings automation i like automating asmuch as my life my robot hoover turns onwhen I leave the house uh and things canget quite complex in softwareengineering right so my mantra gonna putyou on the spot Abby is there's no rightor wrong way but there's always a betterway and it's true when we think aboutour abstractions and when we go more andmore into what the technology we'relooking at today to keep that in mindenough time for you to think about amantra i don't know if it's a mantrathis is what I was going to write on itwas I always believe you should standyou should be known for what you standfor not against though especially in theworld of the CNCF and all the projectsand all the technologies instead oftalking about which ones you don't likespend your time finding the people hereduring this conference uh who areinterested in the ones you do like andgo focus on that ah that's beautiful andI found a few that I like today and I'velearned a lot so the access group whoare we are we are the UK's largest softheadquartered software company if youtook a tube today to get to the eventyou've used our software if you've goneto Nando recently you've used oursoftware uh if you've ordered somethingonline chances are you've used oursoftware through the chain we areeverywhere we process around 2 millionpayroll records every year through ourpayroll bureau uh and the list can go onand on and on uh and I could talk aboutit but we're here to hack right heck yeslet's get hands on keyboard let's justget on with it yeahso if you can all scan the QR code or ifyou're brave use the bit.ly link uh I dorecommend that it is case sensitive andwe'll get started with today's hackabsolutely so as you're loading that upone of the things that's key aboutsomething like a hackathon is that ittakes a lot of energy to make it workright if you ever done like a bug bashor a hackathon can be quite quitestressful but this guy just eats stressfor breakfast cuz this is what he wasdoing when his when he ran the hackathonthat he did for TAG oh yeah absolutelyand uh if I can make you all look likethis when you run a hackathon then I'vewon so this is a picture of me during ahackathon enjoying a large bottle of rumand watching pipelines kick off like amadman in the in the Caribbean so itdoes work and I trust the platform in myteamall right all right right so I hopeeverybody has found their way intoInstruct and what we're going to do nowis switch and you would have found ascreen like this if you haven't alreadybeen the curious person that I'm sureyou all are please hit the start buttonuh and that's going to start up a screenthat looks like this and we've done somepre-warming of machines so I'm hopingthat this will be quite quick and it wasyou'll see a kind of pulsating startbutton in the bottom right corner i knowthat some of you don't have laptopstoday that's okay we're gonna runthrough the workshop in such a way thatwe're going to do everything on thescreen as well but we're going to givesome time for like self-studyessentially to read the text to explorethe the virtual machine and all of thatso we'll make it clear when that time ison feel free to zone out if you don'thave a laptop or research things um andthen we'll show everything on screen soyou won't miss anything of the workshopso if you press8 start here which I'msure most of you have already uh jumpedpast me this first screen is really justhere to teach you about how to useInstruct we're really grateful thatthey've donated their time uh thevirtual machines to us to be able tomake sure it works on conferenceWi-Fi let's go uh and it's a greatprogram that's letting us have theinstructions on the right hand sidewhere you can open and close yourinstructions like this um and you canlike see the images kind of bigger andsmaller as well and then on the lefthand side there's a whole bunch of tabsyou can go through and and this firstkind of section including giving usfeedback on each step if you want notrequired but optional uh this first oneis just learning about what's running onthis virtual machine and how to useinstruct it's very quick feel free whenyou're ready to click the check buttonand that will take you through to thefirst proper task and this task is toget started with your your platformservice your first service you're goingto deliver as a platform i'm not goingto talk to you all about theory rightnow because we're here to do a workshopbut we are going to talk about why we'redoing this once you've gotten hands-onso feel free to press start and continueon with this so I'm going to stopspeaking now and let everyone get busybut what you're going to be doing isgoing through the providing a platformservice task and if you click the checkbutton and you come up with this errorof not quite right try again becauseyou're going too fast that's a goodthing we're going to keep everyone onthe same pace so we're going to give youa password to be able to go to the nextsection when we're ready to all right sothis will mean you're done and you canjust sort of explore around the virtualmachine or or wait for the time to totime out does that make sense iseveryone happy to do that we're going tobe walking around to answer questions aswell as you go so any questions aboutplatform engineering the workshopinstruct anything feel free to give us aquestion raise a hand allright cool we're also going to startsome music in a minute i can press theplay button here if anyone feels likeit's either too loud or not right andit's affecting their ability to do theworkshop please also raise your hand andlet us know we're happy to to changethat or remove it but we just thought itmakes it clear that it's kind of studytime it might be a bit more fun a bitmore like a cafe so go ahead and getgoing if you wantall right so it's not a livedemoworkshop if something doesn't gowrong so if you were hitting a tracklimit we've fixed that so go ahead anddo a refresh or or try and try it againand it should work for you so alert usif it's not now i've just tested itlocally so cool we've got at least onethumbs up which is good so here wego so we've got just under 30 seconds togo but I know we had a bit of snafussthere does anyone want more time feelfree to to raise your hand and let usknow else we'll we'll go with the demoside of things allright we're getting a countdown herebanging on the table 10 nine feel likethe sound guys are going to startactually doing that the next time I sayit and it's going to be really awkwarduh okay so um I hope you all I hope youall had a bit of fun i understand wejust threw you into the the deep endthere we didn't explain anything aboutanything so uh if you're new toKubernetes if you're new to Kratics ifyou're new to platform engineeringthere's probably a lot to think about soI'm going to go through this now myselftalking through some of the things thatare going on uh and then talk throughsome of the theory about why we're doingthis feel free to raise your hand andask questions during this this is likewe're a nice small crew let's takeadvantage of that and actually talk itout there is a microphone but I'm veryhappy to repeat the question as long asI can hear it so feel free to just shoutit out to me and and that'll be okay aswellso what we were doing here is we werebuilding a the start of a platform andwhat a platform is is it providessomething as a service and thoseservices 9can be things that you have tobuild yourself but often there's thingsthat already exist in the in themarketplace around you the world aroundyou could be a SAS product you depend onlike GitHub as the way that you hostyour software or your uh your codebasethey could also just be other templatesand and packaging tools so Helm chartsyou depend on from the community in thiscase we used a promise from thecommunity that does Postgress as aservice and what that did is it installsapromise promises are the definition ofsomething as a service so what we'redoing is we're building a platform herewe're not being handed one so unlike apaz platform as a service solution suchas Heroku or cloud foundry or or thosekinds installing something like does notget you a platform it gets you theability to create one and that promisepackages up all the moving parts ofproviding an API with a server behind itand does and and provides that simply soyou install that promise into cratics itcreates a userfacing API for you asdefined by you with your constraintsyour requirements and it sets up yourenvironment to start serving thoserequests in this case we're living in aKubernetes world which means we need anoperator to be able to generate theresources forPostgress so we then uh go through andwe actually look at the promise itselfso we know that promises providessomething as a service how does it do itthe way it does it is you define an APIwhich is the contract you have with yourusers what are they allowed to tell youas parameters what do they have to tellyou within what constraints are theyallowed to request things that APIexists when a when the user makes arequest through that API you have to dobusiness process things to make thathappen you might need to do securitychecks and authorization andauthentication you might need to docosting checks and make sure no one'sgoing to run up a crazy bill securityscans you might need manual signoffs forcertain activities to be okayed thoseall happen in a workflow you also mayeven choose to deploy things as a partof that workflow using commands like uhthe AWS CLI or cubectl CLI but you mayalso choose to use declarative code anduse the GitOps mentality and we wouldadvocate that you know here at CubeConwe would advocate that the Gitopsmentality provides a lot of benefits andwe can talk more about that i think wewill a little bit later definitely umbut if you choose to use that Kratic'sframework has a um convention of helpingyou make sure that the declarative codegets to the right repository in theright place so that it can get deployedto the right infrastructure so thatworkflow um results in some codedeclarative code that goes into in thiscase uh git t so a git serveruh and then the dependencies had alreadyexisted in there and setting things upso uh if we go through and we look atthe next command this is that diagrambut in kubernetes so the uh promise is acustom resource definition in Kubernetesand it has these different fieldsincluding the API field where you'redefining a CRD and the workflows fieldand the destination selectors fieldwhich is how you tell it where I realizethat's probably small overhead isn't itthat's better probably um where youdefine uh where it goes in your gitstate storeyou can you can check out the entirepromise if you were to do uh acoupubectl get and and through to yamland you can see things like statusesthis is just like any other Kubernetesresource built in or custom right youhave that ability to to explore it sothe API itself is a CRD and so if I usethe Kubernetes command to list CRDs wecan see that now in that list of of kindof public custom resource definitionsthat we depend on we can see thePostgress one that we justcreated we can also uh see that um yeahand and it's right there alongside allthe other CRDs like Flux and searchmanager that we depend on we can seewhat fields the promise defined in thatCRD by using the explain command so thisis the enduserfacing API forPostgresses we all know having useddatabases before there's a million flagsyou could set but you may not want toprovide all: of those to the applicationdevelopers because they may notunderstand them they may not be allowedto use them based on your companypolicies and so on so this is a veryvery simplified API and maybe too simplefor some of your companies but that'sthe things it's completely up to you youwrite your API the way that you the waythat you want to so fields are nameenvironment name space and teamID and then finally what we did is wehad a look at what happens to set up theenvironment we talked about the factthat there's going to be uh an operatorthat needs to to be managed the way thatit does that is it is we create thisworkflow object which is as you can seejust a job um in Kubernetes and that jobruns a set of containers so you canwrite your logic into any language youwant a lot of people start with bash butvery quickly find that they preferlanguages that are more testable so wesee things in uh Golang in Python in uhElixir um and and so on so really anylanguage that you want to write a scriptin and at the end of the day those getuh written based on the destinationselectors so this first command is uhsaying hey what did we tell what whatdid the promise define as the locationfor this uh for these resources to becreated the promise said these should becreated in a world that that lives up tobeing environment dev and then we saidokay well what worlds do we have whatwhere could this be scheduled and itturns out we have two possible locationsone the platform cluster where Kraticsand Git and Mino and all those tools arebeing run and one being the workercluster which is where it's going to bescheduled to so you can see this ifyou've if you've used um the labels andselectors idea in Kubernetes before thisis it right this is how we do pods withservices and things like thatum so then the last piece that you wouldhave done is go into git t and actuallysee how that uh is in theuh is done in um the the githops way soyou can see so basically Kratics doesn'twant to know how to get to any of yourinfrastructure because we know that yourinfrastructure is not as a probablydoesn't want us to know about itpermissions wise but b it's not alwaysgoing to be as simple as a Kubernetescluster it can be we work on mainframeswe work we work with customers onmainframes on uh edge compute nodesreally on networks on anything and so wewant to be able to just write out to agit repository and let the deploymenttechniques that you use on yourenvironments work for you so here we cansee that those YAML have been writtenout and we can see that those YAML arethe ones thatresult in the Postgress operatorsuccessfully deploying out to the workercluster and we do that through Flux GitGitOps so at the end of this um we have hwe have a situation where an applicationengineer would be able to come tocratics come to your platform request adatabase and be able to see it but whatwe're going to be building towards isthem being able to do that for an entireapplication so a web app that's backbacked by Postgress database and withthat you all get unlocked which isprobably what you all are waiting for ohsorry I'm going to talk a little bitjust a quick minute about the theorybehind this so the reason why this is animportant kind of step when buildingyour platforms is because platformengineering is intentionallycentralizing operations again whichsounds really counterintuitive if you'vebeen through the dev ops wall to devopsuh transformation over the last kind ofdecade or so but the recentralization isgiving access to more security and moresupport and so what we need to talkabout is the fact that we got a lot ofbenefits of reduced delays and increasedautonomy from DevOps and we don't wantto lose that with this recentralizationeven if we want to improve on our kindof centralized management um the uhability to get like economies of scalewithin our organization and so forth andso with thatrecentralization we need to think abouthow to service the teams in a way thatis sustainable and that means that wehave to build as a service platformsrather than having requests come inthrough tickets an;d lots of manualprocesses or even automated processesthat depend on a person to pick them upuh so we want to centralize withoutdelays we want that autonomy but withsafety built in and guard rails in placeand we want maintenance that allows usto to manage things efficiently andwe're going to look look at each ofthese in practice over the rest of theworkshop basicallyyeah there's a there's a lot to take inwith all the different moving parts butwhat we've just shown is how to get yourdependencies on a cluster and this isjust for a hackathon with the Postgresbut I do this with all of my otheroperators so port works search managereverything I customize and I do viapromises so it's a lot of moving partsbut yeah we'll go through it and you'llsee more and more as we explore yeahit'll get fun it'll get bigger andfunner uh so the way that it works is ifyou create a file by the name AAS or asa service you will be unlocked to clickthe check button again the if youhaven't fooled around on your VM you canjust create the file in your currentdirectory in the terminal using thecommand touch if you have fooled aroundand gone elsewhere in your terminal makesure you use the complete command tomake sure it gets placed into thecorrect location all right so uh withthat command you should be unlocked togo through to the next section and thesame process will repeat we'll set thetimer we'll let you all go and and getgoing on it um answer any questionswalking around and then we'll talkthrough what you've done any questionsbefore we get going withthat all right well then I'm gonna setthe timer and goall right well we'll get cracking so sofar what you've seen is you've created aPostgres operator you've you've setwhere that is and now you've gonethrough the process of requesting aresource you've got this so I'll gothrough the commandshere i will have to create the filethank you very very verymuch there we goso when we have this promise where weyou see in the API you've got the namethe environment uh and the othervariables you've got set there but andyou've been able to request a resourceso let's go have a check on thatpromise we it's definitelythere and we're going to go apply thisdev database with a team name acid and aDB nameset that's gone through the resourceconfigure pipeline taken the informationthat you've provided in the API and setall the complex stuff on behind thescenes allows us as platform engineersto decide what we want to give to ourusers hopefully your pods will now spinup and you can see that the workflow isexecuted and the actual database itselfshould be running on your worker quitehappy to go now I was learningKubernetes as I go i was doing this onhard mode gitops cratics plusKubernetes and now I can createdatabases and teams only need to give mea full few bits of information uh atthis point that's where I fell in love iknew I could go create 100 databasesgive me a said command and I'm amadman try it at home so this reallyhelps by giving you the flexibility tosay hey you can now provide things onyour platform without having to rewritethe rule book when new requirements comealong so we can you could change the devto um the dev to a prod tag and we'lllook at that later on and see how thatchanges in the platform APIso you get to really decide your levelof control right this is what we'retalking about what is important for yourusers do you want your users to have thefull customization of the operatoryou've chosen where they could breakthings change things and have thatcomplexity or do you want it really easyto start with with no configurabilitylike we've done with a Postgressoperator you can make those decisionsfor your engineers and codify themwithin yourpromise as you see in the Postgrespromise by changing that dev toprod we're going to do the backups rightwe're going to make sure there's enoughreplicas for your application ensurethat follows our operator that we'vedecided on because I'm pretty sure noone wants to run in production without abackup rightright and we test our backups right uhyes we do test outbackups so if as we're< ready to move onto the next section do a touch on hyphendemand and uh we'll start the timeragain and goon uh what we were just doing is lookingat the fact that the reality is isplatforms are just software now when youthink about things being as a serviceand with software we've had decades oflearning that the three little bearssimplify things too big too small justright is not as easy as it sounds andtrying to figure out what size a serviceshould be micro mono whatever isdifficult and so the reality is is thatwe need to be able to compose anddecompose the decisions that we've madeinto the platform uh service so this isshowing composition so taking smallerpieces and making it bigger intosomething that is more powerful so adatabase plus a web service togethermaking an applicationcombination so with the app as a servicethe idea here is that you have a promisethat orchestrates multiple other thingsso we're not recreating Postgress herewe are reusing the Postgress promise byhaving the app higher level promise makea request to ituh there we go uh and so what we can dois we can see in the editor that the appas a service promise this app promisehas a set of required promises that areon lines 62 to 66 so in order for theapp promise to succeed to do its job itdepends on Postgress and EngineX coolthat's fine that is something that wecan make sure the platform has setup we can also see that uh we use themagic of instruct here to pre-installthat for you so you had previouslycreated the Postgress in the lastsection we then installed the app as aservice promise and brought with it theengine X so we now have this situationwhere we have three promises and we cansee that they're of slightly differentshapes like one doesn't have engine Xdoesn't allow user requests that's okaybecause you're going to have oddlyshaped things in your platform and youneed to be able to managethem so now that we want to build theapplication we want to actually make arequest for a to-do app and this to-doapp is is uh is the same as any kind ofapplication developer in yourorganization wanting to just write theircode package it up and be done with itand see their see it go out to end usersin this case we're asking them to tellus the name of their application theimage they want to run whether or notthey want a database and if so what typeand what port we need to expose for thatservice again keeping it kind of simpleso with that request we are going to seethat a couple of um tasks run so if I dothe next commandfirst let's do thisuh so if I see this pod we'll see thatit is currently initializing so I'mgoing to put that on a watch commandquickly it's already completed all rightwell I missed it but if you saw it youwill have seen that it went through afew different steps a few different initcontainers basically the frameworkbehind the platform building starts withgiving you as the the promise authoraccess to the resource that triggeredthe workflow so Kratex runs a containerthat goes and gets that resource andplaces it on your file system you thencan do as many tasks as you want in themiddle in this case we're managing theresource and the database and thenKratics also as a framework provides thetasks at the end so it it takes thedocuments you write onto your filesystem and it writes them out to a statestore status etcso we can uh with that having been donewe can see now that you have a built-incatalog because Kubernetes this isn't aKrabix thing this is a Kubernetes thingyou're making your requests on here youcan list them and when you're done uhwith your like when once that's beendeployed it tells you where you can goget to the the website and we actuallyhave that being exposed on Instruct soyou can see that websiterunning and yeah and so you can see alsohow you can customize the informationthat goes back to the user so uh what wehave now is a um a promise which iswires together a few different thingsincluding if I go to the actual scriptsthat we're running inside of that uhinside of that workflow we can see thatthe database is taking information fromthe um from the =application in order tomake itsinformation so it's basically saying letme get the database information and addthat to the application so it can wiretogether in a more in a peacemealsolution you'd have users having tocreate a database and then uh kind ofwalk their permissions over to anotherlocation to create their web server inthis because of the composition you canget that done all together yeah andthat's the real power right we've madethe decision to use Orlando so we canfollow their patterns and it's aninternal platform and we've justcodified that experience here to make iteasier for engineers to just have an appwith adatabase there you go and so um yeah andthen this command here is just showinghow at the end of the day all that itdid was make a request the samePostgress promise that you hadoriginally made a request to in the lastsection uh and you now have yourapplication soum why why are we doing this it'sbecause of that monolith microser aspectplatforms have to start thinking abouttheir architecture they have to startthinking about what is should be changedtogether what is the blast radius of thechanges they want to do but the goodthing is you don't have to start fromscratch lean on the existingarchitectural conversations aroundsoftware to build that platform rightuh and some tests for whether or notyou're building your platform with theright foundations is whether or not youcan introduce a new sort of paved pathor or like best practice within yourorganization without having to kind ofcopy paste or rebuild from scratch youshould be able to build off of thepieces that you have and compose anddecompose and recompose them because youneed that flexibility and you need to beable to also make sure that people whoare not uh eligible to use the entirepackage you want to deliver so theycan't use the whole app as a service canstill have standardization for thingslike the database by itself and that'ssort of that composition aspect uh andso the the QR code here is for theGregor Hope uh article about fruitbasket and fruit salad it's just if youhaven't checked out his book it's reallygood on a lot of thesetopics but again we're here to behands-on so uh if you uh go ahead andtouch the file compose create the filecompose you can get started um as likewhat we just did you can get started onthe next section where you're going tobe turning your internal platformproperly internal and moving away fromthe marketplace but doing other thingscool keep walking around answeringquestions please keep asking it's beenreally good questions this is like thereal power of being in the room togetherso please keep talking to usokay everyone thank you so that's thattimer up does anyone need a few moreseconds or got anyquestions excellent so what you've gotso far is you've configured the app as aservice to accept environment variablesso let's go through that togetherwe've updated the promise right so we'vegot our promise a promiseAPI extended and uh to include theenvironmentvariables so if I just pop that in herethat's updated our promise specificationwhich if we go to our editor refresh ourfiles we should see beautifully herethere's our environmentuh our environment API let's go aheadand apply that to the cluster so thatwill update the promise and the coolthing that we're going to learn later onabout cratics is that all previousresource requests will be updated withthat new promise it will replay itselfso you don't need to doanything and we can just double checkand make sure that's there and we've gotour environment so let's go useit hopefully you've got your to-do appalready there and we're going to justset a random number one two1.21 and what this has gone and done isexactly what it's what we were doingbefore it's taken that project goingthrough the API specification and it'srunning these it's running this scriptherescripts environment ve configure it'sjust loading up the AMLfile understanding what the inputs forthe user are and generates a config mapfor us that attaches it to the containeras Abby mentioned before cratics workcratics pipelines >work by putting insome magic in certain file structurescretics/input sloutput and the order ofthe pipelines matter because we've putthe environment variables like sandwichin the middle or at the end we can pickthe other files that have been executedbefore and mutate them so when we gocheck gy we can go into our workercluster ourresources in for our to-doapp and that should hopefully have ourthing i probably haven't run it have I ihaven't updated my workflow that'swhat'shappened so let's update the work whypackaging things together helps ratherthan having to do it in three differentsystemsso now I've placed that promise thatworkflow step in right so I put it attheend let's go tell their promise to saygo do that for meplease let's check if these if that'sgone through okay we've got more morepods to go through and that's completedso now if I go into gy and go into myresource hey there's my config map withmy version it's almost like it happenedlike a professionalthere was no sweating here no sweatingat all no nothing demos it's all goodand feel free to have a explore with theenterprise environment variet we have anenterprise version to do app because whynot now you might be thinking great appas a service what can you do with thatwell in fact we ran our backstageinstance for the hack using the app as aservice promise so it has a Postgresdatabase we configured the app config soall the users that were using theplatform were using a service that wewere using ourselves we really wanted toeat our dog food and test the resilienceof it we had over 250 projects supportedthree different hacks across theorganization not only just in productsand engineering but professionalservices and other people gettinginvolved into our hacks but we alsocustomized the promise a lot the promiseyou see sets up an ingress but we wantedto do let's SSL certificates giveeveryone that full nice experience whenthey get their application it's allsecure and they can use anything ontheirAPIs so let's get into the next sectionhow we're going to turn that in a littlebit further so uh if we do a touchbespoke we can get into the nextsteps all right so we just got to makethey say you know make something workthen make it pretty right so uh we wehad our APIs all working we got thebusiness logic all uh settled and thenit was important to think about what doour users want so here we are uh puttingsome icing on that cake and creating abackstage portal front end for the APIsthat we've been creating for the app asa service so um with backstage it's alsoa framework so it has a lot of the sameprinciples that Kratics has in that youdon't just get a portal when you get ityou get a framework for building one andyou can lean on the community to getstarted quickly but you also get tocustomize it to your own liking so wedid pretty minimal customization to beclear because this is a hackathon andand a workshop um and so we just startedwith like a guest entry and and you cansee it's an empty backstage when you getstarted we've got nothing in ittraditionally what you would do here isyou'd start to codify what you want toexpose in backstage as YAML or as asJavaScript and you'd be pulling these inas plugins but then you're all of asudden managing your business logic inthe user interface layer and in the APIlayer and in any other user interfaceyou might have for example a CLI or achatbot or otherwise so what we're goingto do here is we're actually going toteach backstage about how to access theAPIs that you've created in Kratics andwe do that by having it read backstagegenerated YAML from Mino mino is astorage system that mimics S3 uh so youcould be doing this in a more kind ofpermanent storage S3 bucket as well oruh GCS bucket if if you're in Google andso what we do here is we add yet anotherworkflow there's going to that's goingto become an acronym i just know it yetanother workflow action um this timeit's going to be taking the API from thepromise and translating that into abackstage shapedYAML it's also going to write those thatnow backstage shaped YAML into uh thestate store but? it's going to write itinto a location that backstage readsfrom so in this case Mino instead of gitt and this is how you can mix and matchyour state stores as you need to so withthat promise updated we can run it andsimilar to the last one it will juststart to run the workflows and we cansee those kind of running through andinitializing so it runs both the promiseunderlaying level and then it runsthrough for each resource in order tomake sure that everything gets kept insync so this is uh what you're going tobe getting into more in the next sectionwhich is fleetmanagement with that backstage now hasthe ability to create apps and we knowthat I should have shown on the homepagebecause we we now have components thathave been loaded in so we have the apppromise we have the to-doer app and wehave the ability to use the scaffoldingtemplate section to create the app butinstead of using a typical scaffolderwhich makes API calls out to specificproducts this is going to make an APIcalled acratics which is going to keepeverything inside your API inside yourplatform so I can choose this i can callthis cubecon i can then not try andremember the name of the app I was goingto deploy um and then uh database orotherwise so you can use really nicethis is all codified in the workflow tosay hey make that a drop down don't makethem type that in you know uh and setthe service port review and create thatas I said is going to make a request outto the Kubernetes cluster and so now ifI go and I get apps I can see that thecubecon app has already been completedbut to to show you it wasn't um itwasn't just a a fake out uh K is shortfor cubectl I can get pods uh and I cansee that the um cubecon pod here was run22 seconds ago it's just when you don'thave a human in the loop things happenquickly uh so with that we were able toget that app and we're able to see thatnow we can treat a um an application inyour organization the same whether itgets created by a guey a CLI githops orstill through a ticketing system in someway we have people automating you knowJira tickets into requests to theplatform wherever the requests arecoming in from you can meet yourcustomers where they are your appdevelopers where they are and they canbut then still be all a part of theplatform yeah my first uh my firstrequest when I got this promise out topeople was "Hey I don't use Postgres canI have please?" And I justextended the promise and we went throughthis workflow and they could seeand carry on so it it turns into aliving breathing platform that you canadapt over time i lived that life ofeveryone wants i lived that life give me a graphyeah yeah until I need to change myschema uh then can I go back please uhokay cool so um with that uh I just wantto talk for a minute around the fact ofthe the the value of having a platformAPI so uh you you can say Abby you'reobviously talking about platform APIsbecause you help build cratics and thatcreates APIs get off the stage and I getthat but what I would call attention tois just software engineering in generalright with an API uh if you've livedthrough the era of moving from like webapps to mobile apps you will have livedthrough the era of having to take allthe business logic from the displaylayer of a web app and having to uhfigure out how to duplicate thattriplicate that across the differentmobile platforms and so forth andrealizing no I don't want to have toworry about logic in multiple places iwant logic in one place and I want toworry about display and only display inthose interfaces so when we think abouthow our users use our platform werealize that there are a lot ofdifferent needs and a lot of differentusers they're not all app devs sometimesthey're managers sometimes they're uhproduct owners sometimes they're supportengineers that may use differentinterfaces so uh you need to think aboutthat autonomy the distance from uh wherewhere people are working and whether ornot you want to meet them where theywork how much skill they need in thecapability they're working with and theinterface they're working with andtherefore meet t@hem where they are andthis uh QR code goes to a talk that Igave with Whitney Lee at PlatformEngineering Day in Paris last year wherewe talked about the fact that uh you canthink about the API as the pig uh andthe interfaces as putting some lipstickon to dress it up so people goingstraight to the API isn't that prettyand uh we know that but we want to seepeople get better experiences so uhplease add as many lipsticks as you canuh to these APIs uh and make sure thatyou meet your customers where they areso speaking of we really liked API uhpig a API pig come on it was good it Itwas a GIF last year i didn't have theGIF it's gone okay all right all rightall right so uh but with that in mindthe password to get to the next sectionwhere we're going to actually look atmanaging the two apps that we now haveand maybe a few more is API pig a pig sogo ahead and jump into fleet managementand that is just to be clear the last ofthe like major meat of the the workshopum and then it kind of goes into just aplayground mode if we have a bit ofextra time so uh we're getting we'regetting through keep asking questionsit's been really lovely chatting witheverybody all right so I stopped lookingat the clock on the last section uh weare actually done in five minutes sowhat we're going to do so feel free tokeep having a look at this the boxeswill stay alive so you don't have tolike type furiously right now um but weare going to go through just a bit ofthe wrap-up because Phil's got someawesome stuff around how he he went youknow further with the hackathons and Ithink it's worth sharing that before weall disappear so oops sorry about thatuh yeah so yeah as I mentioned I waslearning Kubernetes platform engineeringeverything in one hard go and uh I'malso a gooey person i I like aninterface so I am that pig with alipstick walking around the office ipushed thatbutton don't push that button if youdon't know what you're doing and I didthat the day before the hackathonall the resources my backstage instancedisappeared and mild panic ensuedthankfully and Abby did ping this upearlier oh do you do you do backups Philthankfully my friend Andre was like"Phil please can you take a backup ofthe database please can you take it toback up the database?" And I did and I Istill thank him to this day because Itook that back up stored it on mydesktop somewhere and then when thisblew up I was like "Oh god I've losteveryone's data." Because within ourplatform we were giving the way ofpeople to manage their teams so hey I'mpart of this hack team i'm part of thathack team a quick little call to Santasoa little Slack message going "Uh guyssomething's blown up what do I do?" andDerek and um Derek and Colin both turnedup and was like "Yeah you've justdeleted that just put that in." Andthat's when I learned GitOps it made mefeel a little bit like a god becausebelieve it or not I only learnedTerraform four or five years ago and Ifelt like a god turning things off andbringing back alive but we actually noware thinking about using this GitHub'spattern about a disaster recovery whathappens if someone goes and destroys thecluster or uh a data center burns downwhat what do we do well if our state'sdeclaratively stored somewhere else webuild the promises and the APIs todecide well how do we transition andbecause Kratics runs the workflows inKubernetes which has a reconcile loopyou actually don't even have to havedeclarative code to get fleet managementand to get that safety because as longas your workflow redeploys thingsthat's what it's going to do so you getthat sort of uh reconciliation andrecovery built in because of Kubernetesso uh very very glad to be building ontop of that yeah yeah so please have alook through that fleet managementsection and we've looked at replicas butI used it for everything uh continuouschanges of my promise and updating it soin conclusion the hack architectureactually looked a little bit like thiswe had backstage use use backstagetemplates to pull an Azure pipeline as abase template replace all the variablespush that out into a repository whichwould execute a pipeline now thatpipeline will build and push to thecontainer registry but also do a cubectlapply with the app as a service promisethen the rest is up to cratics craticswill go schedule the request and putthat out there now I was learning sothis is just as a hack for me as it isfor everyone else so I installed a hackcalled QSSH it's an open-source toolwhich I could just set a web hook frommy container registry and it would justupdate the image for me so I didn't haveto tell teams hey for a hack you need tofollow this GitOps flow or you need touse trunkbased deployments it's like nono no just push it all up i'll just takethe latest image and we'll deal with itif you want to see a little bit moreabout how that how I got that workingcome reach out i'll show you a littlebit more you can't get Phil out of anIDE so seriously go chat if you want tosee anything in in what he's been up toand I couldn't do all of this without myteam right so I had some fantasticpeople that worked with me uh AdamLondale fantastic engineer I work withandre my backup gods and all thesepeople here have helped me in variousways to bring the platform andhackathons to life at access so Icouldn't do that without them thank youand also thank you to everyone atCentaso because uh I give them featurerequests and I can call them up andthey're fantastic for which is his noseyou know all right cool so yeah just thequick wrap-up um if you just go to thenext one quickly uh the hackathons wereable to enable both Phil learning uhplatform engineering and Kubernetes aswell as the application teams gettinginto Kubernetes and and apps uh therewas the big thing here was that Phil wasable to start quickly i'm pretty sure hehad the vast majority of this built in aweekend definitely during working hoursuh um but it was he was able to thenmaintain momentum because it was builtin a sustainable way and it came inlayers so none of this was possiblewithout all of the tools that are hereso to be clear this is not about likeone tool solves all your world'sproblems it's about having theminteroperate correctly to be able tobuild for you a sustainable and andmaintainable and scalable solution so uhif we go to the next two next slide theum password for the next one is calledfleet so it shouldn't be too surprisingfor you uh and you're welcome to clickthrough to that and uh if you have thetime uh one more slide the the uh CNCFtries to collect feedback around all thetalks so that we can make sure you getthe right talks on the stage so pleasegood bad or ugly we'd love to hear fromyou uh directly to us as well as to thatQR code all about things um and if youdo continue on to the next sectionthere's a bunch of links around how toget back to this so you can share itwith your colleagues you can share withyour friends you can just do it yourselfwhen you're not in conference brainsituations um and I realize that wedon't have them on the slides though Ikept telling people they it would I willadd a slide with that before uploadingthe deck uh which I will do directlyfollowing this course so if you uh lookup the sketch for this class for thisworkshop in like I don't know a half anhour you'll have all the slides andyou'll also have the links on thoseslides to be able to get back to this soum come find us uh I'm uh boothS641 if you want to chat with me furtherand Phil is floating around but I'llcome and play a game with you in a bitoh yeah we've got a So if you liked theidea of building services whether it'swith Kratics or otherwise I've got acard deck and game I'll be playingduring the booth crawl uh that you canthat helps you think through how do youbuild a platform service so um come hangout we've got uh you know the drinks areon the CNCF the game is on me come comehang out by the tables over in uh the SS corridor so uh hope to see you therethanks all thank you everyone2025-04-15 22:03:30.893641 ��~�m#��3Ajid0uSnNku8good afternoon everyone god that's veryloud hope you're having a great time atCubeCon this week my name is Phil andwelcome to today's uh working code winsuh today we're going to show you howI've used some technology in the opensource space to transform how we've donehackathons at the access group uh I ampleased to be joined by Abby hi everyonei'm Abby Bangser i uh have been workingon internal tooling basically my entirecareer across a lot of different titlesstart as Q6Ctchinference for instance for validation ofother kinds of workloads this is thescope of what we're trying to run onthese clusters Right other than that uhwhat's also interesting about AIMworkloads is they are very intensive sothey can use lots and lots of GPUs atonce and run for a very long time andfinally um they depend a lot on a IMLframeworks like for instance PyTorch andhaving good support for that out of thebox is really critical for enablingusers to do anything with these clustersso from these requirements you can seeuh on the other side of the slide anumber of of uh or from these propertiesyou can see a number of requirementsarising such as how we're going to queueworkloads how going to prioritize themmanage quotas between teams how do do wedeal with large workloads in particularwith this uh kind of hardware I think uheverybody in the industry has beenreporting that GPU nodes are morecomplex than traditional nodes thereforethey have more maybe creative ways waysto fail we have longer running workloadshow do we do fault detection and all ofthesethings so uh in terms of clusters uhwhat we typically run at IBM research isuh large clusters with uh Nvidia GPUslike A100 or H100 so here you have twokind of node generations we've used orwe're using uh that we've published uhyou know detailed specs for so on theleft the more like the first generationA100 nodes uh on the right you havesomething more recent which is an H100node and you know the keycharacteristics of these nodes is theyhave a high density of GPUs right sothey have eight GPUs in both of thesepictures and then lots of compute youknow CPUs memory local storage uh highperformance network and so on so that'sthe kind of things we want to run withso and finally want to briefly tell youa little bit about what we're going tosee in the rest of the tutorial what isthe kind of platform we're trying tobuild on because I think we're allengineers here rather than starting withabstract goals let's be very concreteand start by looking a little bit at thecomponents right this is the kind ofthings we put together in this platformand we're going to review one step at atime right of course in the middle youhave Kubernetes we'll start there andit's not just because it's CubeCon ithink yesterday in cloud AI day we hadsomebody bold enough to ask the questionwhy Kubernetes and I think for us theanswer is really simple is because ofthe vitality of the community and theyou know the the the the CNCF landscaperight how many projects how many thingswe can leverage from the communitycontribute to uh the rate of innovationthat makes if any you know is second tonone right any other choice wouldrequire a lot more work to get to aplatform And so that means not only wehave Kubernetes we have projects in theKubernetes community like Q but you knowif you take something like PyTorch whichhas nothing to do with Kubernetes in thefirst place we do get things like Qflowtrainer that the Qflow community hasdeveloped so that you know we can uheasily uh run PyTorch on Kubernetes wealso have the on the right hand sidesome of the new projects we as IBMresearch have introduced where we foundthat the maybe the the communitylandscape we're still lacking some somecritical capabilities for what we'retrying to do and we'll get to that in afew minutes so what is you know the goalof our stack again at the high levelit's it's to manage workloads andresources on Kubernetes cluster for AIuh it is has to be open source you knowthe components the upstream componentsthe new components the combination thecomposition of these componentseverything I'll repeat that multipletimes throughout the the talk it has tointegrate key a IML framework it isabout multi-users right within anorganization how to manage having manyusers without conflict popping up everyevery day between them is aboutproductive utilization because at theend of the day we're paying a lot ofmoney for these GPUs they have to berunning all the time doing somethinguseful all the time and and we insist onfra and of course uh to the side youhave monitoring obseDrvability and so onthat always are critical uh piece ofsuch aninfrastructure so um before I move on aswell again I said open source soeverything we're doing today is opensource everything is meant to run onvanilla kubernetes but obviously IBMredat we work together also providedsupported versions of this kind oftechnology so in this space what we'redoing is working on open shift AI so ifyou want uh a supported solution thatessentially overlaps what we'redescribing today this is uh you'll findthis in open shift AI open shift AI isdoes more things than what we'rediscussing today for instance looking athow to run applications at the end ofthe day ml match does more thing becauseit's a research platform and we'retrying to figure out the next thing andthe next thing and the next thing wewant to add to Open Shift AI but againfor the rest of the talk you can forgetabout that i should also say again to beclear that we are going to run our demoon an open shift cluster because this isthe Kubernetes flavor we prefer in IBMbut there's nothing that is reallydependent onthat soum here's uh let's start from the tophere's uh to give you a sense of thecluster we're going to run this uhtutorial on it's it's it's a smallversion of what we run in practice it isa cluster with six nodes three of themare control plane nodes and three ofthem are worker nodes and you know thethe critical bit here is that this pieceof information is that these workernodes have a number of Nvidia GPUs youknow three times 8 GPU 24 GPUs and vlinkand what you know I had pictured alittle bit earlier this cluster is kindof in between the two generations I wasshowing youbefore so um before we start you knowright now we have a base cluster nothingon it uh we need two prerequisitesessentially we need to fulfill twoprerequisite because before we move onto actually installing ML batch on thiscluster the first one is we have toteach the cluster uh how to manage GPUsas you probably know Kubernetes out ofthe box understands CPU understandsmemory ephemeral storage you know has as certain number of resources built ininto the scheduleuler but not GPUs inorder to do GPUs Nvidia GPU in this caseyou need to install uh Nvidia GPUoperator you know was jointly developedby Red Hat and Nvidia and that will addan extended resource type to yourclusternvidia.com/gpu and that means now youcan start having workloads with GPU sowe've done this beforehand you know it'sopen shift so we click on the bootbutton on the console it gets deployedbut if you have a van like kubernetescluster and has a lot of documentationonline how you can go and deploy a hammchart on your cluster and have all theseuh set up for you so we're not going toreview that second we're doing AI AIdepends a lot on data therefore storageso we need some kind of storage solutionfor our cluster so in practice we wantto have high performance storagesolutions for instance IBM we use IBMspec spectrum scale but for the purposeof this demo and what I'm including hereis that we can let's say just assumethat we have storage available throughan NSF server on a local subnets andthen you know just following uh addingan Am repository and installing an HMchart we can make this available on thecluster as a as a as a storage class andfrom that we can create persistentvolume claims and access storage and andsoSo now we move on to the cluster setupitself right and there are two parts toit first when we set up a clusterthere's what we do once you know whatwhat does the admin do when we createthe cluster and you know I normallynever has to worry again about right andthen there's a second step which will bethe team and user setup how do we onboard teams how do we create user how dowe give them the right permissions soI'm going to review these two things andto start with the cluster setup is is ispretty straightforwarduh you can do this by cloning therepository tree where we have all of oursupporting again tools or you knowscripts and so on we set up priorityclasses on our cluster in practice we'veseen that having low medium and highpriority um uh AEI workloads is is enoughfor what we need we deploy the keycomponents some of them are scheduledplugins some of them are operators weare going to actually talk about eachone of them in in sequence in the nexthour or so and and so I'm not going todescribe them more hereas I said we can deploy them throughOpen Shift AI or without we configure Qwhich is our uh queuing system we'lltalk about that and we finally have tocreate a role that is the role we'regoing to give users to let give thempermissions to do I mean to runworkloads monitor workloads debugworkloads on this cluster so that's thecluster setup and that's that's prettymuch it right then we have the teamsetupuh and again that's that's prettyboilerplate right in our uh experienceit's uh makes a lot of sense toassociate a team and a namespace so whenwe want a new team on the cluster wecreate a new namespace and then we haveto only do three other things we have toset up a queue for that team which is uhwhere the jobs submitted by that thatteam will be cued we have to make thisqueue the local the default for thatparticular team which is the local Qdefinition on the left and finally herewe're creating uh user Alice in teamblue we have to give Alice this uh thisarbback this role that I was describingearlier in the namespace blue so thatnow Alice is officially a member of teamblue and can use the capabilities of thecluster in that team but I forget maybeyou know we need to label the namespaceso that we say MLB batch is managing thelamespace and really the only thing thatis not entirely boilerplate on thisslide is what I've highlighted on theright it's defining how much quototawe're giving to this team right so herewe're saying well team blue should havenominally access to eight GPU on thatcluster so I'm saying nominal here uhand we'll get back to that butessentially uh you know they they willgo into great details into quotas thatdoesn't mean that team blue is limitedto eight GPUs if nobody else is runningon the cluster they could access theentire cluster but that means theyessentially have guaranteed access toeight GPUs on thecluster so I have a video going throughthat uh that you can also find onYouTube i'm going to go through so thisis an 8 minute video on YouTube becauseI try to describe what's going on as I'mrunning this but really fundamentally uhthis is uh just a minute to do thissetup and this is pretty boring so I'mgoing to go quickly uh again you havethe tutorial scripts I've described uhyou know I gave a pointer to earlier andwhat we do in this in this setup is wego and we copy paste instructions fromthe left to the terminal on the right sofirst what I'm showing on the right iswhat I was showing before on the slidethat we have this cluster with uh 24GPUs available so if I for instanceconnect to one of the nodes I can seethat there one of the worker nodes I cansee I have eight GPUs available here uhsecond I can check that the GPU operatoris already installed on the cluster sowe can see the oper GPU operator is onlyalready installed which means if I querythe capacity of the node of a workernode I can see here that I have eightGPUs available as the capacity of thatparticular node and then I can set up mystorage class um which I was againshowing earlier again it's as simple ascopy pasting the setup so thisparticular case and we're doingsomething simple a simple storage classyou know the way we you would set upstorage will depend quite a lot aboutthe kind of storage solution you adoptfor a cluster also what's maybeoversimplified here where have noauthentication of any kind we're justassuming the the storage server is onthe same local network and that's wherethe security is happening right nothingno credentials and so on so real setupwill have a little bit more than thereand then we move on to the MLB batchsetup itself and you know this is justrunning uh a long script that has allthe components of of ML batch I wasdescribing earlier or we'll describelater actually such as QRF flow and soon that's that's all it takes to set upthe cluster The last thing I didn'tmentionF in the slides is we're alsogoing to reserve some capacity in thecluster as an admin queue for admin todo maintenance operation and again thisis something we'll explain in details intwo minutes in that particular setup I'malso reserving eight GPUs for this adminqueue that's it for the cluster setupthen you have the team setup here forthe purpose of the demo we're going toactually set up two teams team blueactually I went too fast we we set upteam blue with user Alice as we've shownin the slide we're also setting up teamred with user Bob which is again theexact same thing with the two namesreplaced i think that's it for the videoso we can close this that's it for thecluster setup uh before I pass on to uhDave I want to describe the very firstcomponents of the setup very briefly uhhappy to answer questions um uh maybeafter the tutorial if you care aboutthat particular piece one of thecomponents of our um MLB batch setup isto actually deploy and configureschedule plugins for a number of reasonsessentially this is about trying toavoid fragmentation in the cluster sothat's the purpose of the node resourcefit schedule plug-in what it says is ifI have let's say a workload thatrequires one GPU I should try to fit itin a small hole in my cluster right if Ihave a node that has already seven GPUutilized and one left I'd better put itthere than put it on a on a node that isentirely empty because if later I wantto run an AGPU job of my cluster thenyou know I'm no longer going to bepossible if I fragmented my cluster sowe use schedule plug-in to try to avoidfragmentation we use scheduleuler pluginto uh do gang scheduling the idea thereis if you have a large u distributed aIML workload that needs 20 256 pods torun at the same time we want to makesure we can fit the 256 pods before weactually deploy any of those podsbecause deploying 250 pods is completelyuseless for most AI workloads so we usecocheduling for that finally we have anexperimental uh plugin we've alsodeveloped at IBM research to do topologyaware scheduling uh this is this issomething that is coming in thecommunity is becoming important q isworking on that a number of people areworking on that the idea is there is touh if you have a distributed workloadswe're trying to make sure we distributeit as as few racks as possible tomaximize the bandwidth minimize thelatency of communication between thedifferent components dave just take itgood right so thanks so now what we'regoing to do is we're going to dive ininto some detail into some of thecomponents in the stack starting withhow we do queuing and quota managementum and for MLB batch for queuing andquota management um why do we wantquotas so quotas we want to set up theselarge shared clusters so we can makemore efficient uses of resources so wecan have large clusters with a largenumber of GPUs on them we can share themacross a bunch of teams and that lets usboth share the resource pool so peoplecan have a little bit more funibleresource pool and we share the controlplane and we also share the maintenancethe admins it takes to run the run thesethings and set them up it's easier toset up a small number of large clustersand a large number of small clusters umbut native Kubernetes quotas bythemselves are not enough right soquotas first of all quotas aren't aredone on on the level of pods quotasaren't aware of workloads they aren'taware that you're actually runningsomething that contains a lot of podsand so the quota needs to be done on asort of workload granularity not theindividual uh fine grain resourcesecondly quotas are are are fairlyinflexible so this picture on the rightI have I want to take my cluster anddivide it between two teams i can eitherdivide it exclusively a and B get piecesof it and then there's no you know if ifA isn't working and B is in another timezone and they're asleep no one's usingB's quota or I can overlap them and Ican have some reserve for A some reservefor B and some stuff in the middle thateither one can use um but then there'suh you know there's sort of no funibleno fair borrowing no funible quotGas andreally no trade-offs that way and sothis quota management is one thing thatthe Q project really does a much betterjob of um so Q is a Kubernetes nativesystem to manage queuing quotas and jobsum and in our in ML batch Q is entirelyin charge of for users figuring out whentheir when their jobs they should waituh when they when should they beadmitted so when do we actually go offand create the pods and resources tostart these start running these thingswhen they get quota assigned to them andwhen things should be preempted so whensomething's already running butsomething more important comes in orsomething from another team comes inthat needs a quota back um so shouldthat job be suspended or so on um so Qhas a number of features uh for us someof the key features are the it's hasworkload aware quotas so the quotas aredone on the level of the top level jobsnot on the individual pods and Q knowshow to admit that way um it it hasmechanisms for fair sharing for queuingwith priorities for preempting jobs andso on um and it's also externallyextensible so it allows people tocustomize what's going on and hasbuilt-in support for popular job sitesand so what built-in support means isthe way Q works is each of thecontrollers for those resource types hashad a small extension made to it itunderstands a suspension protocol sowhen the resources are first createdthey're suspended and then when Q wantsto admit them it flips a bit on the onthe on the spec saying "Okay yoursuspend bit is now false you can goahead and and run." And that all thecontrollers are extended to do that umso there's the project there and uh wehave actually Q has a booth over on theproject pavilion so uh come over thereif you want to learn more about Q in alot more detail than here i'm going totalk a little bit more how about how Qworks in particular to give you a senseof it so the key thing is Q is thatgreen box in the middle and what Q isgoing to do from a user perspective isinterpose on the creation of resourcesso the user wants to submit a job umthey just you you do the usual cubecuddle apply or create the resource getscreated then Q has web hooks thatinterpose on the creation and look at itdecide whether based on the name spaceand other characteristics of it is thatsomething that Q is supposed to bemanaging if so it suspends it itmodifies the spec before the thing getscreated to set suspend to true and thatcauses the controller to then look at itand say oh this is this is inactive i'mnot supposed to do anything with thisyet for the resource and so it sitsthere until Q decides that it's time forthat thing to run it has quota and thatsort of admission protocol can involvelooking at external resource checksthat's the the purple box over there onthe far side it can also interact withthe cluster autoscaler to ask it isthere capacity yet oh there isn'tcapacity can you create some capacityfor me let me know when it's availableand finally when all that happens then Qcan decide to admit it at which pointthe suspend bit has changed from true tofalse and the underlying controllerwhether it's the job controller thePyTorch the training operator whoever isactually really in charge of thatresource now sees oh the suspend bit isfalse now I'm supposed to do my normaljob and everything just sort ofcontinues as normally the the thecontroller for that will go off andcreate the resources it'll eventuallycreate pods the pods will be submittedto the Kubernetes they'll be run they'llbe run and so on right so the key hereis having that interposition mechanismin the middle where you can suspendsomething and and cause it to wait forits turnum and one of the key abstractions in Qthat we're going to talking about isthese is the notion of a cluster Q soOlivia talked about this for a second alittle bit already so the cluster Q iswhat is assigned quota so one of thethings the cluster admin does for eachone of our teams is he creates a clusterqueue for them and the the YML on theright gives you a sense for what thatmight look like in particular theresources the the quotas that's assignedto Hthat team is shown in the clusterqueue um and there's also someadditional detail about there about whatare the preeemption policies and how dothese cluster cues work together inparticular Q lets you put cluster cuestogether into a cohort and these cohortsthe cluster cues within a cohort canthen coordinate to borrow quota fromeach other so if if there's quotaavailable in cluster QA and cluster ofQB needs it and they're in the samecohort it can actually borrow that quotafrom it and then the quota can be takenback when it's needed by by the othercluster Q so it's a way to sort of havea more flexible quota system where thesethings are grouped together um and wedon't use it in ML batch but Q morerecent versions of Q actually supporthierarchical quotas you can hierarchsorry hierarchical cohorts so you cansort of put these things into a pyramidtogether of of sharing and preeemptionum and get that all to work together soyour your cluster Q quota cluster Qstructure could match yourorganizational structure where you haveteams that are associated with aparticular division and a particularvice president and they all have theirquotas and they're signed around and sothey can negotiate among themselves howthey want their quota assigned and so onso that's the notion of a cluster queueum a second thing one of the products webuilt within IBM research is this thingcalled app wrapper and this came out ofour experience of running these morecomplex workloads and app wrappersreally do two things for us um so one ofthe things they do is they allow us togroup multiple resources that logicallybelong together into a single workloadthat we can then hand to Q to manage umso Apprapper implements the Q suspensionprotocol it's sort of our our life cyclediagram on the right there um so youknow they start out suspended if Q issupposed to manage them when Q admitsthem we realize that the controller thengoes through an internal life cyclewhich we're going to talk about a littlebit later to actually create theresources um and if it gets preemptedit'll be suspended and take thoseresources back um so it lets us groupresources together so it can we can putany number several compute resourcestogether anything that's defined using apodspec template um that includes allthe built-in kinds that come with Q thatQ has built-in integrations for can beput inside an app wrapper but alsoadditional kinds that Q doesn't knowabout yet but have are defined in termsof podspec templates app wrapper givesyou an easy onboarding path to get thoseto run on Q and it also lets us putauxiliary resources in there sosometimes people people's workloadsrequire services and secrets andingresses and other things associatedwith them we would like those to be allsort of logically managed as a singleunit deployed on the cluster when thecluster is admitted when the workload isadmitted by Q have all those thingscreated and when the workload is donehave them cleanly cleaned up so we don'thave lingering services and ingress iskicking around be some because someoneforgot them um the second thing thatapprappers do for us which we're goingto talk about in about 15 minutes isthey harden workloads so app wrappersare a key part of our fault recoveryprocess and they allow us to monitor theworkload health have a way toreconfigure our retry and failure policyand make sure that all the resources arecleaned up and and and done like that soI'm going to have a demo of that laterwhat I'm going to do now is switch tothis demo which is talking a little bitabout demonstrating some of the quotaand preeemption properties of thesystem all right so we're going to startoff um for this particular demo we'rejust going to be running sort ofsynthetic workloads as they're all sortof variations of this YAML that's shownover here on the left um and so theseare just sleep jobs they they start upthey create two pods they request someGPUs you know the four kinds of jobs aregoing to be running the thecharacteristics are shown in the tabledown here and we're going to use theseto illustrate some of thecharacteristics you know Isome of theproperties of Q it's is queuing it'spreeemption how quota gets shared acrossteams um up on the right this thing overhere is called Q viz this is a newvisualization tool that's beingdeveloped as part of the Q project umand it gives us a way to sort of look atthe state of the system so in particularwhat it's showing here is the clustercues for Q so these are the two we havethree cues the red team Q the blue teamQ and the Slack cluster Q so now we'regoing to go ahead and start so Alicecomes into work and Alice is going tocreate a a burst of short running jobsso each of these jobs is going to runfor 30 seconds uh they're each going towant create two pods each of those wantsfour GPUs so she's submitted uh 32 GPUsof work to a cluster that has 24 GPUsand so some of them are going to have toqueue up my math isn't very good 6* 8Well she's done more than that but soshe's she's created seven jobs um so 7*8 is 56 uh but only three of them arerunning so she's got three jobs thathave been admitted they're using the 24GPUs on the cluster and the other onesare waiting because there isn't enoughresources for them so they're justqueued up the pods aren't even createdyet um and you can notice here that soAlice actually has is using the wholecluster because the the cluster ishealthy we don't need the slack capacityand the red team hasn't shown up to workyet so Alice even though her nominalquota is eight she can run uh 24 GPUsand we can look in Q viz you can drilldown into the local cues so here's theview of uh the red team and the blueteam local Q this is their mechanism intheir namespace for submitting stuff tothat cluster queue um and we can seethat she has those three workloads andCubiz lets you drill down into theworkloads as well so you can sort ofdrill down and see what each one ofthose workloads is doing what its stateis and so on all right so now some ofher shortrunning jobs have finished andso some additional ones have be havebeen emitted now she only has onepending job um but unfortunately forAlice uh you know Bob on the red teamhas just shown up to work so she hassome Oh sorry no that's what we're doingall right so now what we're going to dois we're going to have Alice so we cantalk about queuing and prompt we'regoing to submit some longunning jobs sonow Alice has now submitted uh four jobsthat each are going to run for 10minutes all right and so as theshortrunning jobs finish off thelongrunning jobs are going to startrunning all right so here we are and ifwe look at the pods we're going to lookand see uh first of all we can see thatI can't type oh there we go all right sowe can see that the the pods that Alicehas in the name space some are runningsome are completed um so we can see amix of the shortrunning jobs which aremostly done and then normal jobs whichare what's starting to to go and we cansee that those are there all right sonow I think it's time for uh all rightso Alice has created a bunch of sort ofnormal priority jobs um but then andshe's queued up and some of them arewaiting but then she realized she hadsomething really important to do firstso Alice can use priorities to have oneof her workloads jump in front of thequeue so she just created an importantjob um and that was you know createdafter all the other ones but because itspriority was higher Q has already gonein and suspended one of her normalpriority jobs to make room for herimportant job so Alice's Alice and herteam are able to manage to usepriorities to organize their own workand to make sure the things they reallywant to do get done first and we can seeon the on the list there that there aretwo suspended um normal jobs one ofwhich is the one that hadn't had achance to go yet and the other one wasthe one that was running and then wassuspended to make room for the importantjob okay right all right so now I thinkit's finally Bob time for Bob to comeinto work so Alice has been using all ofthe GPUs in the in the in the clusterincluding eight that are reserved forthe red team but now the red team ishere and Bob submitted 16 GPUs worth ofwork um and hJe'd like to go and andsince the quota belongs to him one ofAlice's jobs that's running using theborrow quota has been suspended and Bobis now running so the you know with theborrowing and and quota preventionsystem is flexible enough to allow theteams to get the quota that they'reassigned to but when they're not usingit other teams can use it right sothat's the end of that demo and I thinknow we're going to switchto All right so now I'm going to handover to Claudia to talk about faultdetection and observabilityand hopefully I'll know how to usethis all rightso we're going to switch a little bitcontext now still in the same area ofhow do we run uh workloads efficientlyand effectively on the cluster um but wewant now to focus a little bit more onthe reasons why we really care aboutfall tolerance and advanced falltolerance in this case so ideally in theperfect world we don't need any humanintervention whatsoever in a cluster sopeople just go Ellis and Bob submit alltheir job the cluster is always fine noGPU breaks everything is fine that's ofcourse not how uh things work um andtherefore to have um to have the clusterto being used as much as possible andhave and reduce the human interventionas much as possible we need to focus ontwo things one is the infrastructureitself the other part is making surethat the workloads are running for theworkloads part we're seeing all the allthese um uh the queue management um fairsharing so all the good stuff that Daveand Olivia um showed so far on the otherhand we have the the hardware so we haveto we need to make sure that the GPUsare running correctly and if they're notwe want to know that so even to makesure that all the jobs can run correctlyand also the reason why we need to oneof the reason why we would need toadjust quotas is because maybe at somepoint a couple GPUs don't work anymoreuh so we need also to adjust the quotasaccordingly depending on uh on thecluster state and this is not somethingthat's uh new and unexpected um I tooksome screenshots here from um some ofthe most recent um reports on GPUfailures and how they relate to uh AIjobs running on more or less largeclusters um on the top left side I thinkthat that's probably the most famous uhone when AI meta uh released the whitepaper about the uh training of the llama300 herd of models they showed that verynice tables uh showing where how many uherrors have been happening during thethe entire uh training period and thoseare they can be software or hardware uhissues in particular the hardware onesthey account for 78 I think yes percentof the interruption of the workloads allthose are only hardware and of this 78%almost 60% is GPU failures so you reallyreally need to make sure that you cannotice when things happen and hopefullyalso how to recover the jobs on theright hand uh this is maybe like threeweeks old two weeks old uh technicalblog post from Nvidia uh showing theerror rates both both uh hardware andsoftware over a four months four or fivemonths period where they've been runninguh training jobs on their clusters andagain a lot of hardware errors um on thebottom right and center um I reportedalso a screenshot from a paper that wasjustpublished one or two weeks ago um by IBMresearch and uh University of Illinois'sUrban Champagne uh where that tablethere is showing how many jobs have beenfailing because of GPU errors during atime span of two and a half years andthose jobs were more than 4,000 so longstory short need to take a look at thoseGPUs and only G GPUs also uh networkthat's a another big point of failurestorage can be also point of failurebecause you deploy an operator thatthat's taking care of the uh of thestorage of the storage nodes so that'salso subject to failureum so we've uh in our system we rely ontwo components native tooling uheveryone knows Prometheus Graphana uhfor Prometheus is going to provide umnode statistics about hardware CPUmemory network uh but also uminformation about the pods or theworkloads that are running how they'redoing and all of that graphana a greattool forvisualization and on the GPU Kside umwith the GPU operator they also providethe DCGM exporter which is creating alot of metrics that report the status ofthe GPUs so all of that is a great greatstarting point unfortunately it's notenough and we've been working uh toprovide also an extended set of what wecall health checks on the GPUs uhnetwork and storage that's been packagedinto this tool called autopilot andthat's integrated so the results ofautopilot health checks are um are usedby the app wrapper to um figure outwhich nodes are good or not let's getinto some details now um so why we havethis autopilot thing why we needed to dothis um so during the operations andbring up and running all the jobs on theVela cluster that uh Olivia showed atthe beginning uh we've been seeing uherrors happening or like a training jobthat would crash entirely or that thethe performance would degrade like thetoken per second and all the metricsshowing how good the training is goingthat would degrade but if you look atthe um at the metrics that are exportedregularly by the uh native umoperators you would see that everythingis fine until there is a real crash ahardware crash you don't really see someof the aspect that are degrading theperformance of training job and not notonly training jobs also of coursefine-tuning and inference are alsoimpacted um so we've been seeing wherethose errors are what what to look intothe GPUs to see why a job isunderperforming and there is nothing newthat we are inventing really it's just abunch of Nvidia Smi commands or DCGMIcommands um and but those things that weare looking at are not exposed um by theDCGM exporter for instance so we knowwhat we're looking what we're lookingfor um so the ideal the thing we want todo is package all those things and intoperiodic health checks on GPU networkand storage we run those tests all thetime and we can leverage Prometheus toexport export the results that we'regetting and also we label the workernodes healthy unhealthy how unhealthyare they and those labels are used to uhsteer the life cycle of thejobsum so this is pretty much the tool uhlike a summary slide so we uh we usethis to run automatically uh healthchecks on the cluster we export dataPrometheus and and um uh and lab andlabeling nodes and we use thisinformation to have the jobs runningeffectivelyum this is an list of the the healthchecks that we're running gpu um mainlyrelated to GPU also network somebenchmarking too i'm not going to getinto the details ofthat um so okay so now we're going to uhto see a short demo we're going toinstall uh autopilot and the entire uobservability stack uh by coupramusstack and the graphana dashboards and wesee how those work all right so let'ssee if I managed to dothisokay okay great all right soum here we install on autopilot this isa a Helm chart so it's prettystraightforward to install and I'mlooking from here because I don't seedown there um okay so as a Helm chart weinstall it and it will run on the GPUnodes it can also run on nonGPU nodesbut in this case we that's all we haveso we're running on uh with the GPUsetup so since we are also creating anNFS client at this one we can alsoextend the autopilot configuration andcheck periodically that also the PVCcreation and deletion is still workingso taking also look at the NFS clientand we can just edit the uhconfiguration file uh actually createone uh that we will going to pass to theuh to the Helm chart this is a classthat is not necessary but since we havea storage class there that we manage wewant always to take a look at thatso we just upgrade the Helm chart take alook that everything is fine so this isgoing to run a rolling upgrade so it'sgoing to take a little bit oftime so I think I have Yeah I have spedup a little bitokay so from the logs you can take youcan see briefly what is going on it hasrun all the health checks everything isfine so it's labeling the node uh with apass saying that the GPUs are okay thenode overall is okay um now we can moveon to the monitoring setup so we'restarting with Prometheus uh Prometheusand Grafana actually come togetherL inthe Prometheus community the CPrometheus stack helm chart uh that'sall on the G repository that everyoneyou can all find super easily links arethere um so we just install the the Helmreposi the the the helm chart we run anupdate and so this chart will providePrometheus Graphana alert manager thenode exporter and coup state matrixthat's all that we need to monitor thecluster so it's all there is super easyto install so in our case since we'rerunning on on open shift we we need toum change a few things uh so we need touh override the the node the nodeexporter port because that's alreadynews uh and also we need to not deploythe CRDs but that's something that youmay not uh need to do so if you'rerunning on an open shift cluster justremember that that's something thatyou're going to to need so we can justuh create a configuration file for hemchart that's usual way of doing thingsuh as you all know already and we justdeploy um the Helm chart and it willtake a little bit but we're we'respeeding up the process so on dependingon where you're deploying if open shiftor kubernetes you will need to umespecially in this case with open shiftyou will need to add uh the serviceaccounts that are created by the helmchart to uh the privileged uh service uma service account and soum so not service account so thesecurity context and that will be neededso that the uh all the pods able toscrape all the the information from thefrom the nodes so we have we have all ofthat running and the chart will alsotell you how to get the um to expose theservice the graphana service to accessthe uh to access the the the web consoleand we also give you the the default umthe way to retrieve that default uhpassword which can be changed in theHelm chart but we're not doing that inthis case um so in this case we're justexposing uh the with port forwardcommand we're exposing the graphanadashboard to local host uh so we'regoingto access it and default uses user isadmin and the password is promoperator all right and at this point sothis is going to be empty so you're notreally seeing anything here other thanthat default uh dashboards that are uhin graphana just to showcase the um thethe capabilitiesuh but we can ex uh we can importgraphana dashboards and with that um theone that we have created for autopilotthat is going to show all the healthchecks uh that is running what is wrongwith the cluster if anything is wrongand we're also going to deploy the DCGMum dashboard from Nvidia and that's apretty extensive dashboard with thatcontains a lot of information about thethe GPUs so we this uh that's super easyto do both dashboards are of course onthe Graphana um uh website and you canjust copy the the ID that's all in thein the demo of course um on our gaterepository and you can just copy pastethe ID from the website and yes that'sthe Nvidiaone and of course you need to select theuh the source of the information mightbe multiple but of course the default isis PromeiaUm so depending on which cluster you arein if it's Kubernetes or open shift youwould need to let Prometheus know fromwhere to scrape the data in theKubernetes case you would need to labelthe service monitor which is an objectthat you attach to a secondary source ofuh of information of metrics in our casegoing to be autopilot and the DCGMexporter um in open shift you would needto label the name space where thosemetrics are being produced with the openshift monitoring label uh so it willtake for the the GPU will take a littlebit to start scraping the matrix and andthen yeah it showedup that everything was configured fineand Ithink okay I didn't destroy itokay who's next i'm next okay okay greatso now we've seen how um we we how apilot can detect the faults for us andhow it uh labels and nodes for us we'regoing to look at how the next levellevels up of the software stack use thatinformation to automate fault recoveryum so the we're going to get back to appwrappers first so I said you know apprappers are our mechanism for hardeningworkloads and MLB batch we really pushusers to wrap pretty much all theirworkMloads inside of app wrappers um andyou know there are a lot of reasons whya workload might be unhealthy um podscan fail and they stay failed for awhile longer than a grace period so theunderlying controller didn't didn'trestart them didn't recover them um notenough running or completed pods so forsome reason pods didn't start theyaren't running they didn't finish againwith grace periods um autopilot couldhave labeled a node as evict meaningthere's a severe fault in that node andthen the app wrapper controller islooking and it sees there's a podrunning on a node that's been labeledevict and that pod is using a resourcein particular GPU that's unhealthy umthat's an unhealthy workload we need todo something about it um similarly uhthe top level kinds like a PyTorch jobRAID job things like that the job batchjob they have statuses and so if one ofthose goes to a failed status then thatmeans the workload has failed and we canrecover that and finally the user may goin and delete their top level resourcethey may kind of forget there's an appwrapper there and just delete thePyTorch job or delete some secret thatwas created as part of the job and weinterpret that as the user saying"There's something wrong with thisworkload please get rid of it for me orI forgot about it but it's it you know Ineed to get rid of it um and so the thediving a little bit more into thepicture there um the apps life the appapp life cycle sort of has three stageswhen something's admitted so the jobsadmitted and it gets to a resuming statethis is when we're creating theresources so we're going off we'recreating all the resources we're waitingfor them to get to a to a start runningstate running which is the sort ofnormal state um and then if we had to doa retry we kick into this retry stateand in that retry state what happens isuh sorry resetting state and that'swhere the app wrapper controller will goin and sort of first gently ask the thetop level resources to please deletethemselves and then sort of monitor makesure that actually happens and after awhile go in and forcibly delete themwithout a without a without giving thema choice to make sure that everything isfully cleaned up before it goes backinto the resuming phase and tries againum and so this uniform retry loop has abunch of parameters around grace periodsand pauses and how many times you got togo around it and we the system is set upso that the cluster admin can sort ofset reasonable defaults for these um andthen users can override them withannotations within bound so the clusteradmin can also say you know you canretry you can do certain things but youcan't make this pause longer than thislong because that's against my policy iwant to make sure I get good utilizationout of the cluster um so that's the sortof app wrapper piece of it um anotherthing we do is we adapt to availableresources so this picture on the bottomhere is showing a a cluster where thereactually three teams plus a Slackcluster queue uh the green nodesrepresent nodes so sorry the the thesquares on top represent nodes so thegreen ones are healthy ones the red onesare unhealthy ones um and so we have amechanism we we reserve this slackcluster queue and so what we do is wehave a reconciler that's watching thestatus of nodes it notices whenautopilot labels a node as unhealthy andwhen that happens it adjusts uh the whatQ calls the uh lending limit on theSlack cluster Q so that the availablequota to be borrowed by other by otherteams in the co cohort actually matchesthe the capacity in the cluster so atall times we're sort of dynamicallyadjusting this slack clust capacity sothat what we have available to hand outto the jobs actually matches what'shealthy um we and then and then uh thatthis uh gives us the ability both to uhhave the thing so to prevent so sorrythis gives the ability to preventadmitting too many new jobs in thefuture um and we combine this with nodeanti-affffinity so as jobs get admittedto the cluster they're get steered awayfrom nodes that have been labeled asunhealthy by autopilot um it also givesus the ability to rNeserve the quota tomigrate jobs away from unhealthy nodesso if we're reducing the the the thequota we're handing out that gives usthe room to then reset the app wrappersand re reset those things and have themgo back to healthy nodes when they comein and it also allows us to sort of viewmaintenance as just another unhealthynode so if admins need to come in andstart doing things they can coordinatethe nodes and the system reacts to thatjust the same way it does as if the uhthe app as if autopilot had labeledsomething as unhealthy it goes ahead andadjusts the quota and sort of startssteering new workloads away from thosenodes that have been that have beenmarked asunhealthy so sort of summing this all upwe have autopilot at the bottom sort ofperiodically checking things labelingthe health of nodes GPUs based onnetworks uh GPUs and and storagelabeling nodes to let the rest of restof the system react to that we have theapp wrapper which is sort of watchingthe workloads uh it's injecting theaffinities automatically so the usersdon't have to get that hunk of affinityammo correct um it gets it put into alltheir their workloads for them to steertheir pods away from unhealthy nodes itdetects when pods are running on evictnodes and triggers triggers a reset andit automates this whole control andreset control and reset and resume cycleon failures and it also does thiscapacity management to inform Q of theunhealthy nodes so now we're going toshow show a demo of this andoops show a demo of this happening overhere all right and so in this demo whatwe're going to do is we're going to haveAlice submit uh one single large job ourcluster isn't very big so uh it's athree it's a three node cluster with 24GPUs so Alice is going to submit a jobthat takes 16 GPUs um all right so hereshe goes she's going to submit this andwe can see that the job is is runningand then what we're going to do is we'regoing to go ahead and pick one of thesenodes as a victim and we're going tolabel it as if autopilot had detected asevere GPU fault on it and we're goingto see the system then react to that sowe're going to pick this uh victim nodego ahead and stick in the autopilot uhevict label on it and then we'll waitfor the system to react this will takeabout 30 to 45 seconds and so we'regoing to do is we're going to go aheadand watch the app wrappers so there'sthe single app wrapper for her job it'srunning and what's happening now in thebackground is the system has a nodemonitor it's noticed that this node hasbeen labeled it's then looking to seeare there pods that are running on anode this node in particular that's badum that have that are actually usingresources so is there is there no isthere a pod running on uh this nodethat's using a GPU in which case weought to reset it um and so once it doesthat it'll then go off and from the podsgo back to the app wrappers figure outwhich app wrappers need to be reset toto to do this and we'll put them into aresetting state um and then and then thethe the over there he goes all right sonow it's reset it and the resources arebeing deleted the pods are being deletedbeing cleaned up and they're gone and sonow we can resume and the workload isnow running again and if we look to seewhere it is the antifinity shown thereon the on the right or left whicheverside they are have steered the new podsaway from getting onto that bad node andso now they're running on the on the tworemaining healthy nodes of the clusterand that's all happened without anyhuman help all right this all justhappened uh with the system taking careofeverything okay so that's the demo ofthatand get the mouse over hereokay so what we want to do now for thelast section of the tutorial is we'regoing to take you through three sort ofrepresentative workloads that we run inthis cluster um and all of the YAML anddefinitions these are available as partof the tutorial but we're just going totalk about each of them quickly and showthem running um so the first one ofthese is the coupeflow trainer so thecoupeflow trainer is an operator for uhit's Kubernetes native frameOwork forfine-tuning scalable distributedlearning of training of LLM it supportsa bunch of machine learning frameworksand for our demo what we're going to dois we're going to take a bite a pietorchjob and do uh FST uh FSTP so uh fullysharded distributed parallel uh trainingof a small job we're going to put thatinside of an app wrapper for enhancedfalltolerance we're going to go ahead and dothat um and so we're going to look atthedemookay so again so this stuff is out hereso we're going to do is um first we'regoing to go ahead and do a the cubecuddle apply for this job all right sowe're going to grab thecommand go ahead and apply this yl andso this is this is create an app wrapperthat wraps around a pietorch job that'sgoing to go into this training job forus and we can see that it's starting tostarting to go it's created it's runningum so we used the here's the the YAMLit's way too big for you to read um butwe created this ammo actually using acoupeflow trainer notebook so we used aPython notebook to take some fairlystandard training uh code for Pythoncode for doing this and then we use theuhtraining the training client to actuallygo off and create that YAML for us whichwe then grabbed and use a tool from MLBbatch to take that YAML and inject itinside of an app wrapper for you and soall this stuff is available and the jobis actually finished now um and if we goand look at its uh logs we can see wecan sort of see what it did so it it ranthis training job um there's somesnippets down there in the bottom youcan see that it ran uh you know it's ait ran with two GPUs inside of each eachof the two pods um and then it here'sthe output from the the actual trainingand when we're all done we can go aheadand delete things and clean up afterourselves okay so thatOkay and I think I'm handing it off toClaudia next I thinkright where are weokay um so now we're going to see anexample of fine-tuning and for thatwe're going to use uh cube kubra is theKubernetes version of Apache Ray um it'sRay is a framework uh that goes end toend in entire ML life cycle so you gofrom uh data prep-processing up to modelserving and it gives you a runtime fordistributed uh work to run distributedworkload and also an API uh with a verynice uh Pythonum Python module and also a CLI to runuh to submit jobs and also create jobsthrough the the Ray um the RA the RayAPI um so that works uh out ofKubernetes and in Kubernetes kubra isthe one that will work uh in Kubernetesand the one that that's where that we'regoing to use uh the install is supereasy through a um through a helm chartand it the nice uh among the nicefeature is the ray autoscaler that willhelp you scaling up and down up and downthe workloads in a very transparent wayum so the fine-tuning demo is anadaptation uh from a blog post from RedHat uh which in turn is an adaptationfrom uh a a demo that you can find onthe uh Ray um repository um so we'redoing going to do again fine-tuning uhwith with Ray and we're going to trainuh Llama 3.18 billion on a grade schoolmath uh data set uh with Lauraenablement so the demo is going to be aset up the environment we're going torun the fine-tuning job uh through theray API uh and within an app wrapper andso we're going to submit the job andwe're going to follow how that's goinghow that goes on the arraydashboardsookay allright so we we need to install someprerequisite um it's just to set up thePython environment so it's a oneliner uhwhere we're going to install uh themajor is going to be the the ray CLIfrom which we're going to submit the jobuh so it's a super easy and quick setupum to run the job we also need somestorage because we need to first ofcourse download uh the llama uh uh modeland that's also where we're going tosave the checkpoints where thefine-tuned model is going to be so weare creating a uhPVC and for the this entire um demowe're going to impersonate Alis so we'regoing to run in the blue uh name spaceand with thed-sis we're saying that we are alis inthis case uh so we created the PBC uh atthis point we're going to create the rayjob so here we're usingP an image fromQuay produced by Red Hat with the ray umruntime uh of course you can use theregular uh ray images provided in thedocker hub so we're going to run on oneGPU node so we're going to use eightGPUs and we're going to set up uh setone GPU for the ray head which is likethe master uh and seven for the rayworker so we're going to have a total ofeight pods so we're going to wrap thisin the app in an app wrapper and thenthere is a very nice utility in the MLBbatch uh repository that given a um anyAPI object will wrap that into an approuh the default Q which is the one thatAlis uh can use so we create the appproper we just take a look and we seethat in the the default Q is the onethat's going to be used and the rightcluster is exactly the one that wespecified in the YAML file there is nochange there uh so we're going to submitthe job by applying the the YLfile so at this point an app wrapper isbeing created and since the the node isuh is empty it can run immediately so wecan see that uh the ray head and the rayworkers are being initialized it willtake a fewminutes um okay so at this point we canuse uh also the the ray dashboard tomonitor the job which is pretty handyinstead of looking at the logs you canjust take a look at the dashboard sowe're port forwarding so you can seethat on local host on the default port8265 uh so at this point now we can setup the uh the fine tuning um so we'recloning the repository where the uhwhere we can find the code that's uh ofcourse openum and through the Python Python API umthe RA Python API we are creating thejob with the specification that we needand we set up all other things likewhere um where we where to download themodel in this case in the uh HF chefhome so the hugging face home is goingto be on the PBC so we have one one copyfor for all thepods um we are saying that uh we'rerunning on eight devices in total whichis the number of the GPU nodes in thiscase one for the ray head and and sevenfor seven different workers so we wecreate this very short Python code uhand we run it and this is through thisrayum this ray API it's pretty muchpackaging the entire thing is you canimagine it as like a when you packagethings into a docker container so we'llcreate one package and that packagecontains code and instructiondependencies everything that's thatthat's needed for the job and that codewill be running on the ray head and theray workers so also the head despitethat's the one that's um managing thethe job distribution that's also doingsome actual useful work um so we'rewaiting for theuh for the job to be submitted and thenwe can uh we canfollow the the actual work eitherthrough the ray API uh on the CLI orthrough the array dashboard so this istaking 15 minutes so of course we're notstaying here for 15 minutes i sped upthe process uh but there are a fewthings that we can uh put some attentionon uh so in the dashboard you're goingto see on the bottom the logs of what ishappening and those log those logs arein the ray head so you can also monitorfrom the coupube cli um there is alsothe ray uh overview where you can seeall the tests that are being executedand through the cluster uh menu you canalso see what's going on in in terms ofuh utiliz resource utilization so we seethat the GPUs are working good becausethat's what we want to do and yeahthrough the the ray core the ray coreoverview we can see which jobs are thetotal number of jobs the one that arecurrently running how many are in thequeue and all of that it showed up atsome point but since we're speeding upthe process we may have missed it sowhen the job is succeeded you can seethat on top and at this point we're justgoing to take a look at the where thedata is so in the we're uh executing aterminal in one any ray pod will work weare in the red head in this case andwe're seeing that in the slash modelthat's where the PVC is and we have boththe this checkpoint is last one that'swhere the the the checkpoint that wehave been um with the the fine tunemodel that we just that we just runand I think that's it forAnd nowthanks so we have jusQt one last uhexample workload to share with you todayuh another really important thing on ourcluster is to run models and inferenceso here I'm just briefly going to showyou uh an example batch inference inthis example we're running VLM you havealready seen VLM mentioned in othertalks this conference this is aninference runtime originally uh comingout of Berkeley that is now part of theLinux Foundation and one way to run amachine learning models and what we wantto do here so we're going to run an IBMgranite model in a in a pod and what onewant to do is submit a batch of requestsand collect some statistics about thebehavior of the model right so the waywe do that is sketch on the right wehave an app wrapper that contains a jobthis job itself has two container againone container runs the inference runtimeone container ers run the load generatorthat creates the request and the requestcollects the responses and so on and Ithink as we mentioned before one of thethings with models they're big so wedon't want to have to load them everytime we run such a job we want to cachethem so in this case we use uh apersistent volume uh set up in our atthe bottom here set up here in our inour pod so that the first time we runthe workload we're going to download themodel waste from hugging face but thenext time we run on the same workload wewon't have to do this again the detailsof the containers are here uh prettystraightforward on the left hand sidewe're running VLM and on the right handside we're running all generator with anumber of parameters here we're justsubmitting for simplicity demoake we'rejust submitting random request and we'remeasuring what happens with the modelthe only uh non-trivial somewhat hadocksomewhat huggy thing here is we needthese two things to synchronize becausewe want the server to be running beforewe start measurement and when we're donewith measurement we want to kill theserver right there different ways to dothat uh what we found in practice isthat u users are not really u good atbuilding their own container imagesthere are lots of reasons not to buildcontainer images so here we use anupstream container image unmodified thatwe can trace and so on and then we dowhatever kind of synchronization we canmanage between these two things uh Icould show you a demo but it's probablyyeah maybe I have a second to do that soessentially again this is exactly thesame as all of these demos this is whatyou'll find in the you know supportingdocumentation from the tutorial we havethe specification on the left of theworkload i mean first I I'm goingthrough the but I'm going to skip thatthe persistent volume uh definition butskipping on that we just copy paste theworkload definition the YAML you've seenthe previous slides in my terminal hereand then I can run uh I can capture thelogs of the two containers in my pod thefirst one is starting to load the modelthe second one is waiting for the modelto be loaded eventually the model isloaded the uh inference can start youthe the load generated can start sendingrequests we're processing requests andat some point in the end oops I went toofast at some point in the end we get thestatistics out of this particular testrun and we can compare let's say with aprevious checkpoint of the model and sayif we're doing better or worse so let'slet me uh try to wrap up here uh no Idon't want to play this again um so whatwe what we realized when we startedtrying to have a Kubernetes cluster withGPUs to do AI is two fairly obviousthings the first one is our admin had noclue they didn't know about GPUs theydidn't know about AI so they needed aturnkey solution something they wouldreally could deploy on their clusterwithout having to think about prettymuch anything right the only and andwhat we've tried to share with you todayis that you know the only thing youreally have to think about is what arethe quotas you want to give to thevarious teams and that's essentially theonly thing that then admin has to do todeploy this setup on the cluster thesecond thing we've realized is that AIML experts domain experts they don'tunderstand anything about Kubernetesright so what they need is pre-bakedtemplates you know if you want topre-train a model this is what you doyou know maybe a number of ways to dothis but here's the collection if youwant to fine-tune the model this is whatyou do if you want to do dataprep-processing this is what you do ifyou want to do inference this is what wedo right so what we've tried to sharewith you today is again uh some of thesetemplates here right so unfortunately uhwe cannot really provision uh a GPUcluster for every member of the audienceso we went through this through videowhich I is not entirely ideal buthopefully you'll get a sense and againthe key the key idea here is that all ofwhat we've shown you is just turnkeykind of solutions so I before weconclude I want to just share what itmeans in practice right here's uh one ofour clusters we uh manage in IBMresearch it has about 1200 GPUs and thisis a snapshot I took last week as I werepreparing those slides so at that pointin time we had about 20 teams uh youknow uh on boarded on that cluster rightand the quotas for those teams wereessentially what you see on the left piechart here which is you know one youhave two teams that more or less haveaccess to a quarter nominal access to aquarter of the cluster each and then abunch of other teams right and there's abunch of extra capacity that is notnominally associated with any particularteam and what you see in the middle iswhat we end up with Right with all theseteams submitting a number of workloadsto the cluster we end up at the point Iwas I capture this particular snapshotof the state of the cluster where 126GPU utilized in the cluster and just a avery small fraction less than 1% I guessuh of of 10 GPUs not act not currentlyin use in the cluster either because wewere transitioning between workloads orbecause there wasn't any really smallworkloads that could still fit in thecluster Right so we've you know if wewere to compare that for instance tousing Kubernetes quotas to do the samething which is to enforce straightquotas and saying for instance the theblue gray team cannot go above itscotton we would have lostessentially oneird oneird of our clusterwould have been idled at this point intime right so what we're trying to sayhere is with this system we've managedto do these two really uh antagonisticthings which is one guarantees kota toteam so for instance if the uh greenteam at the bottom comes back and sayhey I need my 16 or my 32 GPUs they'regoing to get them but at the same timewe're never really wasting any kind ofGPU capacity that we have that is againreally fundamentally unacceptableso uh I've shown this before again wetry to run through all of these kind ofkey uh uh features that our stack needto have to really be functional rightother than what I just said which isessentially be self uh maintaining andself you know no question asked when youdeploy and you manage it and again uhyou know a lot of that has to do with uhthis community and and and taking advantadvantage of all the great work that isdoing that is happening in in in manydifferent places and finding the properway to get these things together jointlyconfigure them so that they you knowwork together and and and and give allthe capabilities that that we want sothank you very much for attending uh youcan uh look at uh you know again you canfind all the details of what we've shownyou uh videos everything at this QR codeif you're also interested if you're ingeneral interested in you know platformuh for AIS right on Kubernetes we have anumber of talks in this in the rest ofthis week we've uh you know from IBMresearch and friends like Nvidia likeRedat and others about different aspectsother aspects of this platformsustainability complianceuh benchmarking actually and and and uhdata prep-processing also which is oneuh one we have an entire talk about thatwe didn't show any templates uh umworkload for data processing but thesecond talk we'll we'll have a lot aboutthat once again thank you very much forattending2025-04-15 22:03:31.453359 ��4�n#��AAb7mRoJYsMouh good afternoon uh welcome to asession so my name is Olivia Tardu i'm aprincipal research scientist and managerat IBM research and today I'm joinedwith my by my IBM research colleagueDave Groove and and Claudia Misali andwe're going to talk about how to buildoperate and use GPU clusters for AIright so by now we've all heard about AIand Gen AI we are all probably excitedabout it but we're also sometimes alittle bit anxious about it for lots ofreasons and as an organization maybe thefirst reason to be anxious about AI isto actually procure GPU to do AI rightit's both time consuming expensive slowyou know whether we rent them from acloud provider whether we buy themwhether we lease them whether we want adozen of them or thousand GPUs it's abig problem but someday we'll get themand and then the second panic arisesright it's how can we take these GPUsthis pool of resource and somehow shareit across all the teams all the users wehave in our organization all theprojects that need access to theseresources so that we can manage them andwe can maximize utilization and we canmaximize the return on investment sothis is something we've been working onon IBM research on our own cluster forseveral years now and we've essentiallybuilt a point of view uh incrementallyrefined methodologyuh a setup and we're going to try toshare this with you today uh with thehope that it can help you with your ownuh you know AI journey and also that youwill tell us everything we could dobetter so let's get started so um in uhwhat we're going to show you today isentirely open source uh everything isavailable online including this tutorialso if you go to this QR code and thisURL you'll find uh all the comments allthe scripts all the uh YAML that we'regoing to use today and you can youconceivably follow along and and or youknow replay that at home right so uhbecause we're going to talk about GPUsand I don't typically travel with twodozen GPUs in my backpack uh this is allgoing to be recorded demos but uh youknow on a real GPU cluster so I'm goingto start by a brief background on a IMLworkload on the kind of hardware withtargets on what exactly are the keycomponents and maybe the key goals ofthe platform we've been developing atIBM research and then we'll get into thethe the the deep dive of the tutorialhow do we set up a cluster how do we onboard users how do we do things likemonitoring forance and also what kind ofworkload templates we give our new usersto get them started on these kind of ofclusters so um let's start with AIworkloads uh again it's probably notworth spending too much time on thisbecause you probably heard a lot aboutthat before there's essentially anentire zoo of workload the collectionsof things we want to do all the way fromhaving the idea of a genai applicationto actually running this application inproduction right and what we'reinterested in discussing today is thispiece of the pipeline again it's morethan a pipeline all kinds of feedbackloops bypass and so on that are uhdependent on accelerators such as GPUsand that are typically batch workloadsright it's not about running theapplication at the end of the day youhave lots of other talks at CubeConabout running large language modelrunning identic frameworks and so onwe're not going really to get into thattoday because you know also in senseit's the scope really there's a lot ofoptimization and special things you wantto do for that what we're going to focuson is everything that happens beforethat right everything starting fromthings like data preparation even youknow like when we do a profanityfiltering for instance on on our dataapp before we train models we actuallyuse already machine learning models todo that so we need GPUs we need a lot ofresources you know training modelsfine-tuning models running baBTe um half years agouh we collaborated um collaborated uh tolaunch Japan chapter of CNCF called thecrowd native community Japan uh in shortCNCJ um there are various people uh fromuh CNCF member company and localcommunity organizers and uh LinuxFoundation Japan and um uh in CNCJ uh weareorganizing special interest group uhlike security and um sustainability andand so on uh we are holding meetups umum more than once uh per month uh we arevery active anduh and also uh we uh we uh we areholding um uh upstream training toincreasecontributors and um um by and we uh alsowe worked uh very hard to invite Cubiconuh the first Cubicon to Japan and wesucceeded to uh invite uh the firstCubicon this uh this Juneyeah that's a great answer thank you uhyeah you can also read the gyyeah uh for uh for the next question uhMSN uh as someone who uh has theextensive experience in the community uhcould you briefly describe the landscapeof cloud native activities in Japanbefore uh the CNCJ the community uhhappened like Narson explained uhperhaps maybe uh you can touch oninitiatives uh like upstreamtraining or like Japaneseconferences okay so uh uh as I as yousaid we are running Kubernetes upstreamtraining Japan uh based on the newcontributor workshop that was runningKubernetes uh contributor summit uhbefore COVID 19 uhand uh we started it in 2019 and it had12 events in six years and as a resultuh we had275 trainees uh as of now and over 10%of trainees are students uh almost halfof trainee are contributing to OSSuh around 30% of trainingare contributing to CNCF projects uhover 20% of trainee are uh contributingto uh some Kubernetes project uh butonly three trainees became uh K8 memberuh we also have uh diverse trainers uhthere are six active trainers uh fromthree companies and one university uhalso uh there are lots of crowd nativeand opensource communities in Japan i'dlike to introduce some key communitiesfirst Kubernetes meetupTokyo itstarted 2016 and there are uh 78 eventsin 9 years uh they have over 2 uh no9.2,000 2,000 members uh there also havesub communities uh such as Kubernetesmeetup novice and crowd native meetupTokyo next uh crowd native days uhpreviously container days uh this is thelargest crowd native conference in Japanuh it started 2018 and had 14conferences in sevenyears including events focused on CI/CDand observability uh they haveuh average 1.7uh,000 attendees in Tokyo also toaddress COVID 19 era uh the live streamtrain uh platform called Dreamcast uhwas developed by Dan uh Rust uhopensource summit Japan uh it is anannualopensource event held by Reax Foundationin Japan uh it started in uh2009 and its scale is around uh 1,000attendees uh this is one of the topinternational opensource conference inJapan uh there are many other technicalcommunities and the members aredifferent in each communities sobringing these together uh is veryimportant to hosting a Raj event likeCubeCon yeah so there was uh basicallyuh this like Linux foundation officialconference and the community conferencesin Japan mainly and uh for for Kasan forthe next question uh you work for theLinux Foundation so uh from theperspective of Linux Foundation and CNCFum what unique challenges uh did uhJapan have historically and uh faced interms of technology adoption okay thankyou uh so uh uh I think it is the natureyou know kind of like a nature of theJapanese industry that you know we aremoving towards more and more isolatedfrom the global uh trends uh rather thanyou know moving toward the joining themso uh without because it's the nature sowithout uh making like a push towardsknow collaboration you know uh we justwe just know becoming more and more andisolated from the global ind industry sothat's the nature sowe we you know basically had to dosomething uh to uh make them uh make youknow industry excited about theircollaborating uh in the global industryso uh in the case of the Linux kernelcommunity so what we have done at theLinux Foundation was now as Mosan justbriefly mentioned that we started youknow a event uh called the Linux conJapan uh I think 2009 I thinkU uh oreither 2009 or 2010 and uh having thatglo global conference uh what we havedone was basically know we brought uhglobal engineers particularlymaintainers uh to to Japan knowbasically invite them over to Japan andhad a opportunity for the local uh Linuxcorner maintain uh engineers to directlyuh uh speak you know communicate withthe maintainers you know globalcommunity leaders so they feel morecomfortable to sending the patch overthe email without that you know it youknow because of the know languagebarrier and also the cultural thing uhit has been it had been really difficultfor the Japanese developers to send apatch over the email uh which is likeit's it's like you know uh uh sendingsending you know their IP you know theirtheir uh patch to somewhere uh they havenot you know even seen before but uh youknow having that conference locally inJapan they are actually you know able tosaw and communicate with the global topmaintainers who will be reviewing doingtheir patches so they they had confidentto submit the patches and that way youknow we were able to increase thecontribution from Japan so eventuallyyou know after a few years later uh uhamong the top 10 Linux contributingcompanies some like a few of theJapanese companies like Fujitsu and theRenaissance has become you know one ofthe top Linux corner contributioncompanies so I I think you know havingsomething that kind of uh events uhlocally is really important and thatthat was that was the my experience ofbasically know uh uh I guess increasingthe collaboration of know Japaneseindustry and you know global industrythank you yeah uh I probably whoever isjoining this session would agree with uhjoining Kipcoin is not just about theopportunity to learn but also like uh wehave maintain summit or project forvideo we have a place to discuss withupstream contributors or you want tobecome contributors and this is a reallyimportant opportunities to have such anevent locally yeah so uh that was likethe highlight of like the Japanesecommunity in history um for the nextsection let's see if I can dive into thechallenge we face uh so next questionPansan uh could you elaborate on like wehave several differences uh like inculture or like in languages or like thetime zone differences even uh so yeaheven for online events uh we would havelike some challenges and some of theattendees would have the same problem soum souh how challenging is it for committeemembers to secure the kind of companysupport for attending uh distanceconferences uh that often involvesignificant expenses okay uh I wouldfirst like separate the questions thefirst question is about the timedifference deep dive like what is thechallenges that we face right as ourfukuyasans and also the mans alreadymentions we have very strong like localcommunity we have a lot of meet we talkwith each other yasan like invite peoplecome to Japan we connected we we getinvolved however what what we have moreproblems is that if it's just only asingle PR is okay we we can't do thinglike a synchronize we create the PR wegot the feedback we got the review PRmerch but in term like in the long termif you want to get involved like to be apart of a minute tenders for examples orlike to get um like involved to thelatest thing that's come up to thecommunity it's not it's not that uh easydue to our uh time zones by the way howmany of you here um face their similarproblems that we face like for exampleyou um your residenc is in epic timezones but yeah you're Yeah yeah yeah wetoo most of the community call that I Iget involved is actually um around 1:00a.m or 2 a.m in Japan times and it'svery difficult like even right now it'sjust only once that I join the the the1:00 a.m call um I have parts to thechallenges in different way in myexperienceum one is that okay let's try to find ahalfway maybe 7 a.m as your west coastand then 11 p.m in Japan so there aresome project like yes we we can changeto this one because for example the kelproject that I'm the maintenance so yeahI I make a decision so it's 7 a.m and 11p.m but some project is hard becauselike moVst of the maintainer is like theydon't want to like wake up at 7:00 a.mso um uh what I did is that uh first Itry to find the key persons who reallyinterested on the proposal that I'minterested tied to pings like trying toset up I would say like finding two timezones like people from two time zonehappy it's much easier than finding thepeople from west Europeans and Japan sotrying to coping up I mean like I I joinone like working groups and I do thecping like there one with the Europeantime zone and another another time frompeople's from the west coast so yeah wewe are happy with that and at some pointwhen we like take the times to deepdiscuss about our idea get know to eachothers it's come time go back to thecommunity um and say maybe we want theback time soon so in in in two of theirof the um obviously that I um getinvolved and trying to contribute to uhdid have a epic town after we have adiscussions that yeah that is the firstquestions do you want the secondquestions to go or is that time outokay there secondokay I will try to go short for thesecond one how to secure how to securethe um the ticket for the CubeCon yeahum it's I would say it depend on thecompany um in my company they arewilling to pay if you have like evidenceof the contribution like your your talkgot accept you got the fund to come andtalk and meet people's but for some somecompanies may not yet but I just learnedthat actually CNCF have a veryinteresting page called like convinceyour boss your search is convince yourboss yeah this is our have a likesamples like letters that you can liketry to convince your bot that howvaluable of the event is this yeah um isthat enough for the answers short enoughor you want something moreokay thank you uh CNCF also uh provideslike a like basically like a afterexpense type of funding like ascholarshipall right uh let's go to the nextquestion move on to the next question uhnext mutan uh so building your ownexperience as a upstream uh trainer uhcould you discuss the like specificlinguistic challenges andcultural in their projectcontribution entry barriers for uhJapanesecontributing to this event in theyeah I'd like to sharebarriers to studying contributions andhow toovercome uh first uh there is a languagebarrier for Japanese uh the mostdifficult uh thing about English forJapanese people is the difference in theworldorder in Japanese word order uh issubject object verb but in English uh itis subject verb object uh even if thereis a translator by geni uh the wordorder is different so uh if a sentenceuh is not finished speaking uh thetranslation will uh not be completed uhand the time lag will occur and makingit difficult to understand synchronouslyuh to reduce the language barrieruh we encourage people by saying uh wedon't have to be fluent in English likeme the people in the community are veryvery kind and uh there's no need toworry uh we need to uh writedocumentation to know uh how to getstarted and what the educate isand place to try it out uh Kubernetesrelated project have good documentationsuh and repositories uh such ascontributor playground but few peoplecan lead and understand everything uh sothey have no choice but to try it out uhso I hope that existing contributor willuhkindly teach beginners and what I wantto uh emphasize here is uh that uhplease useemoji such as smiley faces uh to showyourkindness so that beginners uh do not getuh scareduh another big issue is uh thedifficulty of uh finding time tocontribute uh there are many people whoare not able to contribute as uh acompany people or uh as part of theirwork uh there are various reasons forthis uh such as open source policies andinternal rules not being established uhor the procedure are being uh toocomplicated and uh taking a too muchtime uh even if uh there are rurals uhit can be difficult to get uh boss'sunderstanding and permission there'ssomething that upstream training uhcannot do anything about uh so I'mworking as a sig business of CNCJ todiscuss OSO related uh topics uh but Ifeel that uh this is one of biggest uhbarriers uh many contributor it hasW alsodiscussed uh uh this as a problem at themaintenance summit uh therefore Ibelieve uh we need guidance uh andworkflow templates uh to spreadopensource culture internallyuh and get understanding the value andpermission to contributeokay yeah thank you uh yeah I guess weneedconvincing open source as wellespecially in enterprise right uh okayuh uh for uh the next question uh for uhcould you highlight uh the importance oflocalization efforts uh for CNCF andKubernetes related uh resources uh howcritical is this uh for uh expanding thecommunity in non-English-sp speakakingcountrieswell uh without translation they are notable to convince the boss that's forsure that's really sure but I know uhanyway um uh there was a reallyinteresting report uh came out uh 2023so that that was done by the Japanesegovernment uh was uh the uh report wascalled the uh digital transformationreport uh 2023 so uh in that report uhuh it was actually really shocking so uhuh the adoption of the you knowtechnologies like know container ormethodologies like know devops uh wasknow like less than 10% in Japan thatwas shocking enough but what was evenshocking was know 40% of uh respondentresponded responded that they don't evenknow what they are so that that wasreally shocking and that actually reallyhappened know because know uhuh many of you know cutting edge uhtechnology information is basically youknow uh uh distributed in Englishlanguage and unless know uh it is uh itis translated into Japanese language itwon't be you know uh uh distributedacross know Japanese market so they areable to you know uh convince their bossand you know uh uh the boss make adecision to uh adopt those technologiesuh uh in their business in their systemso uh I would say you know you knowtranslation is really really important Iwouldyeah thank you yeah that was a veryshockingreport uh let's move on to the nextsection um so this was like uh what'shappening in in the challenges uh whichwere like have been going and uh let'slet's touch the part of like a gainingCNC recognition like getting attentionfrom CNCF like in order to adapt uhCubeCon to the region so uh Dr uh couldyou explain the motivations behindestablishing CNCJ the community to Japanin light of the challenges thatdiscussed also uh could you shareinsight about uh the importance ofsponsorships yeah sponsorships in makingthese events possibleokay about the first question thebiggest challenge uh is um to changetheir culture uh however to change theculture it is important to change mymindset uh let's go to upstream so uhabout the contribution uh we are holdingupstream training inviting um um toplevel contributor Japanese speakingcontributor and about to increase uh CFPto uh uh to increase uh speakers of inCubCon uh we we held a meetup u calledlet's uh let's let's submit cfp and theum cubicon speaker likeuh tell how to uh how to pass CFP andhow to hack CFP uh it was um very usefuluhuser and also we uh invited uh cubestronauts in Japan and we uh held ameetup u like called uh let's becomecube stronauts and uh and the cubegathered and cub shared theirexperience and the thanks to that uh inJapan the number of cub is higherU compared with othercountries and also the u big umchallenges time zone uh about time zoneas P pans said we are um seekingcollaboration with uh APACpeople APAC APAC people their their timezone uh are similar to uh JapanJST and recently uh we uh one of ourorganizer launched a tag security APACmeetinguh it is working well and and they arecollaborating to write a white paper andaround about second question is aboutthe importance of sponsorship yeah uhyeah sponsorship is very important uh itis afact we need money to hold a big eventlike CubCon uh yeah venue is veryexpensive and the andand the um networking takes money and uha lot of uh we need money for for loy souh in order to uh uh invite CubiconJapan we persuaded uh Shen CF uh that wecan gather a lot ofsponsors and the um organizers alsopersuaded their their companies andtheir friend company their friendscompanies uh to to become uh sponsorsand we we showed uh CNCF that weU candidate of sponsor list and the CNCFfinally approved uh to to have cubiconJapan also it was important to involvehigh level sponsors in Japan uh platinaand gold member members it is also uh itwas also importantyeah's company is Hitachi and Hitachiwhich is one of the top tier uhsponsorship uh for CubeCon uh if you'reinterested who's sponsoring uh you canlook on the website uh okay yeah we arerunning out of time a little bit butlet's try to make it to the lastquestion uh for the second last questionuh Pansan uh as a you're cube chair yepyeah yeah she's a cube conchair uh so uhcould you share uh uh like the updatesuh on the current sponsorshipacquisition and the CFP pro uh statusincluding uh submission numbers and alsoacceptance rates and some details aboutthe upcoming events sure um first of alllike Nagam said uh sponsorship is veryimportant so start with the sponsorshipsright but um we have three of theplatinum sponsors okay I I will watch tosell whether I like s to the rightnumber okay we have we have Google cloudwe have Hashi's and Yahoo's corporationsfor the platinums uh members we havepins of their gold members we have 13 ofthe silvers members and we have twostartupmembers correct yeah yeah we um yeah thenumbers can still going until May 2nd inmy understanding for the end of thecontracts uh for the status of the CFP Iwould say like um uh we is meanuh my co-chairs are um a masang and uhI'm sorry Kim Rosans and me as aco-chair and uh our program committeeswhich is really like um imminent and uhthere's prejudices uh program committeewe have 61's committees helping us onlike evaluating the um the the CFP andfinally uh we have received uh746 of their submissions and wecalculate just only for 4 46 of themwhich is very small number if you don'twant to calculate I already calculated6% which is like um less than like ourtop tier conference with like 10% 20%30% it's very uh very difficult for usto like get the list of of the CFP um Iforget the last uh is this a question ohokay um about the cube CubeCon Japan isgoing to be on June uh 16 17th um at Odaand sorry Hilton Hilton Hotel is OdaTokyo otherwise a really popular seasideum uh recreations place that you canenjoy all all the days or the weeks andyou can take public transportations tomany places so believe me I could tellmore about like how amazing Japan iswould be but I don't think we have timefor that right yeah okay um my lastquestions to the audience how many ofyou are going to CubeCon Japanno no i need a hand i need a handeven you're not going can put a handlike just like that yeah yeah yeah thankyou that's it yeah please also doconsider coming to Japan it's a greatplace okay um uh yeah we I think I thinkwe still have like a few minutes solet's go to the last question uh Fukasanuh to wrap up uh our discussion couldyou share with uh our audience that uhwhat kind of uh support uh CF CNCF cando hosting cubecon in their locations uhwhat what advice would you give tocommunities okay well you know uh theLinux Linux Foundation and CNSF uh ablewe'll be able to you know uh create theuh the place where an opportunity whereuh you know uh people can get togetherand discuss right know thanks for thosesponsors but as I said uh for Japanesedevelopers uh it is an opportunity forthem to interact with the uh communityleaders like you coming from overseas sowithout your participation actuallycubicon Japan isn't really valuable forJapanese industry so uh you know wereally hope that you come over to Japanyou know encourage Japanese engineers toparticipate more into the community andknow in in order for you to enjoy uh thetime in Japan we are you know we all arehappy to help uh you know uh uh you knowto make your uh trip worthwhileplease welcome to Japanthank you uh okay so we're uh we we arewe have to close uh so if you have anyquestions uh please talk to us uhespecially uh Northern from LinuxFoundation and also Cubon cochairs hereso yeah it's a great opportunity tointeract uh thank you for coming oursessions today and have a great couponthank you2025-04-15 22:03:31.958092 � ��#�o#��}Awb8K3RV6Sbwhi uh I'm Uina Camura from Hitachi uh inCNCF I'm serving as a governing board uhgoverning board and also I'm anorganizer of cloud native communityJapanmy name is Shu Muto from NC solutioninnovators and I'm working at uh as a uhSIGUI chair and CNC CNC ambassador anduh Kubernetes upstream during Japanorganizerhi my name is Noria Gquest i work forthe Linux Foundation i'm a Linuxfoundation staff so I'm helping CNsafety business in Japanese market thusI'm helping know uh Japanese communityuh to grow thank you all right justquick okay um my name is Syan and butmost of people know my names as La Panguh I'm from IBM research in Tokyo um Igot involved in some uh CNC projects ialso one of organizers of the Japanchapters from the community groups and Iwill be a co-chair for uh CubeCon Japanthis yearthank you thank you yeah thankyou uh as you notice uh we are all uhpeople from Asia and specifically mostof us are from Japan and English is notour first language so uh minus forhaving accents or like suffering alittle bit and because we are also verynervous uh please uh allow us to drinksome of the water sometime yeah okaylet's start uh for uh let's start fromthe community history uh of the Japanesecase studyuh uh can you share uh with us how theJapanese uh committee has evolved uhover the years and how you've recentlymanaged to establish strongerconnections with the global cloud nativeecosystemyes um u Japan was behind in theadoption of cloud native technologiesand there was no official connection uuh with um CNCF upstream for Japanesecommunityuh more uh about onSZies uh takes care of thedecision maker which decides like whatyou can do once the once it evaluatesthepolicy so this is a sample policy forexample over policy so here we just sayokay this is particular user and thenthis is the uh HTTP method and then thisis the endpoint where a person is tryingto access a user's endpoint andespecially like the user allies that'sthe one who will be able to access thisparticular endpoint it provides you aoutput saying like deny then you canbased on that condition you can uhdecide like what you want to do whetheryou want to allow this particular userto access the admin page or like userspage in this case um so as I saidUh OPA is written in reggo language uhthe policies are version controlled uhit decouples the application logic soyou can keep the uh policy decision inthe is decoupled um manner um so andthen it's centralized you can uh haveall the policies which you have in acentralized in your current environmentin a single paneso why phups for uh uh why OPA for Phopsso obviously I would say like how manyof you here are using OPA in yourenvironment nice uh so I think like mostof you are using OPA to like um securitycontrols and then like make check forcompliance and stuff so uh we can reusethat existing tool in your landscape uhfor like this kind of POPS purpose aswell um so as I mentioned before it's asingle policy framework you can use formultiple domains uh shiplap approachinstead of like you uh allow thedevelopers to provision the resourcesand then like go back and then check forthe cloud span instead of that we try torestrict whenever they're trying toprovision the resources that way it'seasier uh to deny those uh overspend uhnext thing is like GitHubs friendly uhit's version controlled you can uh haveit in a git repo and then have a githubsum flex cd ago cd tools to go and deploythose uh policies in your targetkubernetesclusters so this is opa uh gatekeeper soit's a kubernetes um integration um sothis opa gatekeeper is written uh usesopa language so basically it's the samething it's gatekeeper is admissioncontroller so it checks for the policiesit sees uh it validates the policy andthen make sure like if it needs to beallowed or denied um it provides youauditing capabilities uh for existingresources as well in if you're trying todeny your uh uh in incoming request uhbut it can still valid go and validateyour existing resources and see if thereare violations for those uh existingresources so this is a traditionalapproach most of the people here mightbe using it uh for example image policyenforcement here in this case like umyou want to restrict this particular umimage registry to be used you don't wantpeople to use dark docker or gc orpublic registry and stuff that you wantpeople to use private registries andstuff uh network policies for examplelike you uh don't want people to uh docross communication within your clustersor like don't want uh people to let uhtraffic go out outside of your clusteruh to external domain uh google.com orsomething like that and then secretmanagement for example like uh you somepeople may be uh denying that uh usingenvironmental variables and stuff thatusing a secret ref or like uh kubernetessecret in the secret ref or likeexternal secret management tools andthen compliance validation so forexample likeuh you can uh uh there will be certainpolicies basically like uh uh torestrict uh privilege container runningon this uh cluster so that's what uh tocheck if there if it's uh compliant andthen if it's uh violating your uhsecurity features uh security securityviolations and then for PHOPS basicallyI'm trying to say like okay we can justadd tagging and stuff to meet the uh tomeet the compliance and then likecompute optimization so like wheneveryou try to whenever a user is trying toprovision the resources you can justcheck like if those resources are uhcompliant and then like if you areallowing those particular resources tobe utilized um and then namespaceallocation on the namespace level youcan add some budgets and make sure likeokay this is my budget for th[e wholename space and then these many workloadsare trying to use this for only theparticular uh uh spend like within thatname space and then uh for example likethere could be a storage costoptimization policies as well you canhave like premium SSDs standard SSDs uhstandard uh disk and premium SSDs uh youcan decide like which uh uh storage yourapplication team has to use so you candeny based on that so this is example ofa certain policy so here I'm trying tosay okay if a particular uh deploymentis trying to use more than one CPU orthousand mill try to deny that uhrequest right so when when wheneverthey're trying to use further more thenlike you can either flag it uh then youcan decide the enforcement action youcan deny dry run and one so one is likeaudit and dry run it just doesn't showany information to the users u deny isstraight away deny doesn't allow you toobjects to becreated so this is another example whereyou can dictate saying like okayum we want uh non-production workloadsor dev workloads here in this case umake use of spot in which is like 70 80%cheaper than the uh on demand orreserved instances um so you can dictatesaying like okay this particularworkloads going to this particular namespaces need to have these labels so theycan just uh make those changes and thenget those things deployed and then thisis for the cost allocation so we canknow like having these labels make surethat uh whenever you want to do achargeback in a multi-enant cluster it'seasy for you to charge back to therespective application team or businessunits uh so this is on the uh namespacelevel uh if you want to globally controlthe spend on the each namespace you canhave a policy on the namespace level torestrict the overall u uh spend on thisparticular uh overall resource creationon this particular name space so this isone is saying about uh storageoptimization where you can dictate okayuh premium SSDs and uh uh supreme ultrasultra disks need to be used only on theum production and then critical uhwhichever has the critical labels andthen like deny and all otherum uh all other uh name like all othername spacesso we can further do some more like ifyou have external tools like which isalready calculating the cost you canmake some API calls to that and then getsome metrics and then based on that youcan still go and deny those traffic uhgo and deny those requestsbasically so how do I get started so Iwould say like try to start small checkwhat is the current utilization in yourenvironment see what'sthe request being utilized and then likewhat what what is the CPU uh requestbeing uh requested and then like what isbeing used and based on that you candecide a policy and then uh you candictate okay if I'm a GPU workloads onlyGPU related workloads should use the GPnodes not the other workloads soinstance based uh constraints you cancreate and then budget enforcement soyou can create a namespace based or uhnamespace based policies uh so that youcan clearly know like what's your spendon the name space and then like uh keeptrack of the name space and then finallylike furthermore you can do like autoremediation so like once you see allthose violations you can haveuh uh make those change uh like you canuh create a mut like even OPA can do notonly validation it can do mutation soyou can make those changes like let'ssay like if your CPU relation is lessand then you can go ahead and then makethose changes uh modify those uh CPUrequest and uhlimits so here it's a team effortbasically like uh you need acollaboration with your platform teamsecurity team and finance andapplication team they all to worktogether whenever uh you create anypolicies sometimes there could be anexception like where you want to allowcertain policies to be allowed for someum exception cases so maybe you can comeup with some extra annotations by whichyou can temporarily bypass thatparticular request and then document allthose policies and then explain the endusers like what this policy do does andthen like uh how uh it's effectivelybeing implemented in this currentenvironment and then m\easuring successis a key uh try to uh see like how thesepolicies um help in your organization touh like whether it is like helping youto like reduce the cost which uhoverspending and then monitoring theviolations and then see like if you'retrying to come uh if if you're like ifthe violations are coming down if youare in trackum so these are the common challenges uhwe will usually face whenever we tryingto implement um so it's policy as Imentioned it's us uh uh uses reggo soyou need to like have some good testcases uh written maybe you can use opatest or uh conf test to evaluate thosepolicies and and then uh try to do it ina dry run mode so that like uh end usersare infected but you can you can come toknow like what's the violations in yourcurrent environment and then as Imentioned like uh there could be someexception cases you create a annotationuh so to override the uh violations orto allow the particular request uh nextis cloud build is keep changing so youneed to keep track of like what'shappening so you need to have areal-time integration or like try tocorrelate with whatever is happening inyour cloud spanUh next uh I would recommend like tohave those policies in a GitHub way sothat like whenever you make thosechanges once it's merged it getsimplemented so it's not like outdatedsitting there in a repo and then it'snot uh applied on yourcluster uh so these are the resources Iwould recommend so going to the OPA uhGitHub repo or Opa uh gatekeeper libraryso gate cable library has a set of uhdefault set ofuh constraint and constraint templateswhich you can play with and then you canenhance those uh uh policies in caselike if you want to further uh makechanges and then uh obviouslydocumentation is the key so you can gothrough the OPA documentation and thenthe gatekeeper documentation and then umobviously join the slack channels andthen like you can ask the questions ifyou have any doubts or like if you arefacing any challenges there are a lotmore community members for you to helpuh yeah towards a demo I'll just showsome of theum I'll try to show you like certainscenarios um how it can beused um so uh so in this specificcluster I just haveum um open uh gatekeeperinstalled and uh I'm sorry aboutthat I have a gatekeeper running hereand then I just have a open cost uhrunning as wellum so what I'm trying to do is um uh letme show you so there are like multipleuh constraint templates I have appliedhere um so so these are the custom CRDsuh which are created using thoseconstraint templates um I'll try to showsome example here um one of thosepolicies maybe um so right now I'mtrying touh uh create a PVC for example so I amtrying to is itclear yeah so if you can see here I'mtrying to create a premium SSD uh liketry to use the uh storage class premiumSSD in the dev name space so when I'mtrying to apply this speakmanifest so immediately the policy uh imimmediately those resources uh it'sdenied it's not applied on the clusterso gatekeeper uh as admission controllerhere so it checks for this policies andthen uh since it's not meeting thecondition like this this particular uhstorage class need to be used only on uhproduction uh broad name space in thisexample but uh not on dev so it's notallowing you to create but let's checkother uh thingum so here I'm trying to douh so here I have a PVC which is uh onbroad name space we'll try to apply thisone so it gets created so it's thepolicy is working so it makes sure thatuh this specific uh storage class uhthis specific PVC can be you uh storageclass can be used for on the broad namespace um another example I would like toshow is hereum I'm trying to likeum createum um deployment and then like I notspecified any CPU resources or uh memoryresources or anything here so whenever Itry to applyhere so there is another policy createdso which is basically checking for thoseuh resource fields and then it says likemissing CPU request which is required umuh so I will show an example nowso yeah this is an interesting one sohere we are trying to allocate likethree core CPU um so what I have done isum I have tried to integrate with uh uhopen cost so what it does is um I'llshow youhere so here uh I have like a twopolicies first policy is basically likejust checking for the CPU resourcesuh it says like the limit the min shouldbe 100 mill core and then like uhrequest should be 100 mill and my uhlimit should be,000 mill core um andum the second policy is what uh I'mtrying to predict so based on the opencost it says like okay I am uh trying tolike overspend my limit for thisparticular namespace is only $5,000 andthen if I if I allow this particular uhdeployment then it will reach like$7,000 um so way I'm trying to do it isI'm trying to get the current spend anduh since I'm not running for a longhours so I'm just trying to check thecurrent uh minutes used and then basedon that I'm calculating the uh CPU spendso by which uh like if I allow this newuh deployment which is around threecourse which is like over like$2,700 um so I'll show you the open costhereso yeah so this is the default namespace so I'm trying to uh extract thisinformation hereso I created a custom app which pullsthat uh opencast API and then likespecifically says okay this is theamount of uh 8 10 minutes is where theapplication is running and then thismuch core uh is being used and then thisis the corresponding uh CPU cost forthat particular uhworkload those workloadsso yeah so by which uh there are likemultiple ways you can do it like if youhave like external other tools uh youcan integrate that with that and thenlike you can block thoserequests um yeah so if I try to applythe otheruh deployment which is having like a lowCPU it should get uh created so here I'mjust trying to useum just under millico which is fittingwithin the budget so it has uh it it itfits with the budget the currentutilization is just $4,500 for therandom mill it is still within the limitso it allowed the particular request tobe createdso I'll just show you uh the violationshere um and I will show the constraintso basically you can see the violationshere and uh so I'm trying to run in adifferent mode so I'm trying to run inthe dry run here so here you can seeclearlylike so these specific name spacesdoesn't have thoseuh specific labels uh for uh trackingrightso even if you try to create the namespaces it doesn't allow you to likecreate the name spaces because like itit itit needs umso for example If itis non-compliant name space whichdoesn't have those labels so it will tryto say okay you need the cost center andenvironment and otherlabels um soum and then like uh if you want to trackso maybe like you can create a graph ona dashboard and then see like what arethe constraints you have and then likeuh what are the violations you have inyour uhenvironment anduh so Just want to show something hereuh sineokay yeah maybe I'll just show it herefor exampleuh let's show somethinghere cpu request right so you candescribe so you can see here thisspecific name space has missingrequestes which is requestum uh sosorry get constraintum for example um monthly spend right soyou can just seeum even if it it can uh check for yourold resources soSorry so the existing workloads whateverI'm running it's already likea violating that uh thing so it showslike it will be ex like it might exceedso you can uh restrict or you can seethe violations for the existingworkloads as well uh it's not uhmandatory like it will check only forthe new resources even for the oldresources whatever got created it willuh keep checking those policies againstthose uhworkloads yeah any other questions here[Applause]thank you so much thank youhi I have a quick question thanks forthe session um the cost what is thesource of truth like where does it gettheum s is it running on EC2 instance orwhat what all cost does it considerum so here I'm trying to use open castwhich checks that kubernetes uhnamespace resources and based on that uhyou right now I'm just using custombilling so you can integrate with yourcloud provider so it can keep track likeit can check against your cloud providercost right okay thank you thank2025-04-15 22:03:32.721081 88��/�p#��AaiC7C56pE7Ihello everyone good afternoon to thanksfor coming for this talk uh today we aregoing to talk about beyond securityleveragingoperetis my name is Satishh Kumar Wingeni'm coming from Canada uh so I run awebsite uh DevOps cloud junction so it'sa community website we talk about cloudnative AI other technologies uh I'mpassionate about computers electronicsuh photography i love to contribute toopen source projects ive contributedmultiple projects i like to uh attendcommunity events likecubecon so today we will uh cover aboutuh the finaps challenges whatever wehave in the kubernetes environments uhopa fundamentals uh and capabilities uhif you are not aware of opa and thenlike how we can extend opa to uh costgovernance which is like fabs practicesand then like I'll show some examplephops policies which is written in regoand uh how we can adapt and then how wecan implement in environment and towardsthe end maybe I will show a demo uh withset of policies uh I can which I haveapplied in myenvironment so just a quick overview oflike what is PHOPS so PHOPS is apractice um in a dynamic cloudenvironment it's very hard for everyoneto keep track of like what's happeningin your cloud environment so uh come upwith this uh key goals uh in form somake sure that the uh your cloud billshave visibility you can take you canview those uh cloud spin effectively andthen optimize try to choose like what'sthe correct uh right size instance whichyou want to use and then go for likereservations uh instead of trying to useon demand to reduce cost and thenoperate basically like try to have acontinuous process to identify and thenremediate uh those uh uh overspendthe PHOPS challenges whatever we have inthe Kubernetes environment right nowlike in the flex era uh state report itsays like 32%age of the resourcesunderutilizeduh and then there are like other commonissues basically like overprovisionresources there are like idle workloadsuh and then nonoptimal node selection sowhenever people try to provision thoseclusters we don't know like what's theright uh skew to choose so we justprovision some skew and then like laterwe see like okay it's not uh effectivelybeing used and then uh resource kota solikewe don't specify we don't like restrictthose uh workloads to have a requiredresources to be used instead of like weallow people to leverage whatever theunderlying hardware uh it's capable ofuh the resources and the untag resourcesso it's a problem like in the chargebacklike if you are uh not able to charge itto the right team and then uh uh youlike not able to like track keep trackof like what's happening in your cloudenvironmentso this is a latest cast report whichsays like in a given in a group of uhKubernetes clusters whatever theysurveyed so it says like only 10%age ofthe CPU being utilized and then likeonly 30 23% of the memory being utilizedlike remaining like over like 50%age ofthe memory is like wasted like we arejust paying for it but we are not reallyusing it so I would say like we needsome guardrails here to protect thecloud span uh so that's why we can makeuse of opa so OPA is a CNCF graduatedproject in 2021 it's like probably fouryears um it's a policy engine so you cancontrol the uh you can write policiesfor kubernetesterraformy and much more uh you can usethe same policies to control multiple uhenvironments and then like it can beversioncontrolled and then the basically thething is like you don't have to uh addthe logic in your application instead ofthat this policY ��.�r#��ALrL5AcS2d5ghello everyone thank you for joining usmy name is Way Chanaiand my name is Leon today we're excitedto share how Bloomber manages AIworkflow fiction in a mill clusterscheduling system and in this talk we'lldiscuss how we schedule and prioritizethousands of AI training jobs across mulkubernetesoh at Bloomberg our goal is to show youhow we improve system reliability andbusiness agility by ensuring critical AIworkloads get the resources they needwhen they needthem uh let me start by outlining theroad map for today's presentation firstI will briefly introduce Bloomberg andthe critical role AI plays within ourbusiness then we'll discuss our datascience platform and describe ouradvanced millluster traininginfrastructure built on Kubernetes i'llintroduce Carmada an open sourceorchestration tool we use to efficientlymanage these multiple clusters andhighlights key concepts and APIs nextwe'll explore the resource allocationchallenges we encountered as demand forour AI workloads grew emphasizing whyestablishing clear workflow prioritiesbecame essentialfinally I'll explain our need forpriority based scheduling andpreeemption which ensures critical testsget the resources they require promplyat that point I'll hand over to Leon whowill dive deeper into the implementationdetails share our experience andinsights and outline our communitycollaboration and future roadmap so to understand our challenges ithelps to know a bit about Bloombergbloomberg is a global financialtechnology company known primarily forthe Bloomberg terminal widely used byfinancial professionals the terminalprovides real-time access to market datanews analytics research andcommunication tools enabling informeddecision making handling vast streams ofdata daily we heavily leverage AI andmachine learning to deliver timely andaccountable insights to our clientsso how does Bloomberg use AI to enhanceits products and services here are a fewkey examples for extra extraction we useAI to parse financial documents and newarticles accurately identifying andlinking important details like companiesprices and dates and we enhanceextracted data to improvediscoverability such as adding metadataor linking related contentwe also use AI to power personalized andefficient searches within our vest datahelping users and relevant informationquickly to tackle information overloadai _�,�q#�AJqKwvN8MaSUalrighty uh good to see everyone againthis morning um it's uh amazing uh to behere uh we actually had a bunch oflittle last minute registrations sowe're closer to 13,000 people here whichis amazing so thank you for that umtoday's kind of theme for CubeCon isgoing to be very much focused on endusers um this job I have a fun positionin talking to a lot of organizations outthere that are adopting cloudnativetechnology in all different phases fromlike early days to massive massivedeployments and you know I thought thistime around how fun could it be to kindof do maybe a set of like quick littlelightning talks from companies fromdifferent uh industries and so on totalk a little bit about their adoptionof Kubernetes and other CNCF projects soto kind of kick off this fun uh paradeof people to talk about uh their cloudnative adoption uh I'm going to go kickit off uh with a uh person Stephen fromHSBC to talk a little bit about whatthey are uh doing so let's get thisthing uh started those let's get Stephenuh on the stage2025-04-15 22:03:33.154189`provides concise summaries of newsand research allowing faster decisionmaking and we also use AI models toanalyze data trends to generateactionable financial signals addingstrategic investment decisions overallAI transforms enormous volumes of datainto a valuable insights driving fasterdecisions and a better userexperience to build this powerful AIcapabilities Bloomberg relies heavily onour robust data science platform acomprehensive solution deeply integratedwith open source projects we leverageKubernetes for managing GPU intensiveworkloads and utilize tools like Jupyternotebooks Carmada Qflow Ker and Argoworkflow for model development trainingdeployment and automatic workflows theopen source ecosystem is a foundationalto our platformsuccess allowing our ML developers anddata scientists to quickly and reliablydeliver innovative solutions bloomerActive contributes to some of theprojects helping shape the future of theopen source ML community in short ourdata science platform is a central opensource power asset that empowersBloomer's teams to rapidly build deployand scale AIdrivenproducts so one core component of ourdata science platform and the focus oftoday's talk is Bloomer's traininginfrastructure where all our AI modeltraining occurs we operate an on-premises bare metal Kubernetesinfrastructurespecifically optimized for GPU intensiveAI training workloads using Kuberneteswe standardize how we schedule deployand run thousands of containerized AItraining jobs eachday we we manage tens of thousands oftraining workloads daily ranging fromsimple experiments to complexdistributed training task to ensurecontinuous availability and reliabilityat such scale we build a highlyavailable multicluster scheduling systemusing Carmada and open sourcemulticluster Kubernetes orchestrationtool to efficiently or our manyKubernetes clusters we leverage Carmadaan open source millluster orchestrationplatform designed specifically forKubernetes environments it provides aunified management layer thatsignificantly simplifies milllusteroperations with Carmela workloads canautomatically and seamlessly shiftbetween clusters if one clusterexperience downtime or resourcelimitations ensuring uninterruptedbusinessoperations and it also intelligentlydistributes workloads across clusterspreventing resource bottlenecks andunderutilizationthis efficient in management of GPUresources translates directly intoimproved operational efficiency and costsavings and managing configurationsettings and security credentials acrossmillful clusters can be complex anderrorprone cara centralized thismanagement providing consistent andsource configurations that reduceminimal manual effort and enhance systemreliability in short it transformsBloomberg's Millipole Kubernetesclusters into one cohesive highlyreliable and efficient system perfectlysuited to meet our growing demands forAI trainingworkloads and to help you betterunderstand the coming sessions let'squickly review the core concepts so theresource template essentially the samedefinition you would use for workloadslike deployment or customresource and each resource template willbe associated with a propagation policywhich outlines how scheduling roles suchas cluster affinity multiclustersplitting and sometimes clusters needcustomized configurations for exampledifferent Docker image or resourcelimits the override policy allows thisadjustment to be made easily withoutredefining the entire workload and nextinternally Carmela used resource bindingto keep track of where workloads havebeen assigned it links the abstractworkload definition to concreteschedulingdecisions and finally the work objectrepresents the actual workload deployedon each member cluster created fromresource bindings and override policiesit ensures that the correct resourcesrun in each cluster that they areconsistently synchronized and forinstance when a user deploy anapplication Carmada first match theworkload to a propagation policy anddetermine which clusters will host theworkload then it applies an overridepolicy asneeded generates resource bindiangs andcreates worked objects for the memberclusters this structure approach allowsCarmata seamlessly orchestrate workloadsacross multiple clusters so in short uhCarla's user-friendly APIs and clearabstractions transform mill clusterKubernetes management into a simpleunifiedexperience so despite our platform weface resource allocation challenges AIworkflow increase high GPU demand fromtasks like urgent bound pricing modelretraining earnings related documentanalysis and routine model trainingoften exhibited available resourcesthis competition highlighted the need toeffectively prioritize workloadsaccording to businessurgency this is where the conceptpriority comes in not all workloads arecreatedequal some are mission critical otherscan wait at Bloomber we categorize ourAI workloads by priority level based ontheir business impact and how long theycan be delayed to to illustrate urgentbrown pricing jobs are highly priorityfor example retraining a bound pricingmodel during model volatility is crucialfor the business to have accuratevaluations in fast changing conditionsthe business impact of delaying thiscould be huge and the acceptable delaytolerance is very low and for newsummarization updates we consider mediumpriority these are typically schedulemodel refresh to keep our new summariesup to date the business impact of thisis maintaining accuracy in the userexperience but they are less urgent thanthe pricing scenario if needed theycould be delayed a bit without majorissues though ideally they stillcomplete within a fewhours and regular document analysistraining is also usually medium priorityits goal is improving long-termcapabilities of our models the impact onthe business is more long-term and notimmediately visible tocustomers and for the research anddevelopment experiments those aretypically low priority those might beexploratory model training runs byresearchers and they can wait days oreven weeks if the cluster is busy withmore urgent work so by assigning workpriorities like high medium and low toworkloads we align our resourcescheduling with business needs highpriority jobs should get access to GPUsooner even if that means telling somelower priority job to pause or moveaside low priorityjob sorry uh the lower priority jobs canuse a leftover capacity when thepriority job is done or if the clusteris idle this way we maximize businessvalue and critical analytics get done ontime and we still utilize any spacecapacity for this urgent task so giventhese challenges how do we solve thisissue of important AI jobs such asurgent bound pricing model getting stockbehind this important onesnext I will hand over to Leon to talkabout how we tackle itthanks so before we dive right into thepriority and preemption details let'squickly review some terminologies thatI'll use for the rest of the talk uh soI will be using preeemption and evictioninterchangeably to refer to preemptinguh kicking out a lower party job for ahigher priority job and also will bereferring to workloads um as eithercommada resource bindings or uh justbindings in short finally for thecontext of our discussion we're assumingthat we're working with a single karmascheduler orchestrating over workloadsover multiple Kubernetesclusters traditionally uh Kubernetes andearlier versions of Kmada scheduleworkloads on a first come first- servebasis uh while this is verystraightforward first come first- servescheduling often delays criticalworkloads when resources become scarceso what we really need is a prioritybased scheduling to ensure that urgenttasks areprioritized as well as a preeemptionmechanism that allows higher priorityjobs to reclaim resources from lowerpriority jobs when necessary andtogether these two features willguarantee critical workloads to progresseven when resources are limitedin order to enable priority andpreeemption chromatada currently proprovides two feature gates uh a prioritybased scheduling feature that willcontrol enable priority schedulingfeature as well as a priority basedpreemptive scheduling that will controlthe uh enablement of preemptivepbreeemption mechanism while currentlyboth default to false but as we continuegathering user feedbacks and in refiningthese features we plan to enable them bydefault in futurereleases to support the multiclusterparty in preemption scheduling Komadowill now introduce some essential APIchanges so first of all now eachresource binding associated with theworkload is defined with a schedulepriority field which includes an integerpriority value as well as a soontobeadded preempted preeemption policysetting which can be either preeemptionor neverpreempt a second field change is thepriority value will be sourced from oneof the three sources below it couldeither come from a federated priorityclass defined in the Carmada controlplane or it could come from a standardKubernetes priority class or it coulddirectly come from the workloads customresource definitionspec when it comes to managing priorityclasses in your cluster in Kamadathere's no built-in referentialintegrity so this means updating apriority class doesn't retroactivelyaffect existing workloads and to preventconfusion and disruption we'll allowusers to only update the descriptionfield of a priorityclass but if the priority value updatesare absolutely necessary users shoulddelete and recreate uh the priorityclass with the same name so this designwill ensure consistency as kmatadacomponents rely on stored priorityvalues and preemption policies ratherthan live references pointing topriority class objectsso up till now uh we focus on how thescheduler naturally prioritizes highpriority jobs but in practice priorityalone isn't always sufficient forimmediate scheduling factors such asresource availability affinities and notolerations can impact thescheduleability of a job in schedulingorder uh so this is exactly so thisadded complexity means that priorityitself won't solve every schedulingchallenge and we need an additionalstrategy which is the which is for highpriority jobs and this is exactly wherepreeemption comes into play however it'scrucial to emphas emphasize thatpreeemption or eviction of low prioritybindings is always considered as a lastresort so our goal first of all is topreempt as infrequently aspossible while uh while making sure thathigh priority jobs get scheduled asimmediately and in other words thatmeans preeemption only kicks in when ahigh priority workload has no availableresources and evicting the lowerpriority jobs we selected willexplicitly free up sufficientresources and when preeemption becomesnecessary a second objective is thatwe'll aim to disrupt as few jobs aspossible in practice this means ourscheduler will always try to minimizethe total number of evicted workloadsuh in the diagram below you can see if ahigh priority job requires two GPUs thescheduleuler will prefer evicting asingle lower priority job in the middlethat's already consuming two GPUs ratherthan disrupting multiple smallerworkloadsuh each using one GPU so this carefulapproach will allow us to balanceefficient resource usage with minimumdisruption to for to our users whilealso guarantee that the high power jobgets scheduled immediatelyuh now given our emphasis on minimumminimizing the disruptions there's alsoanother important thing we should onlyhave a single commoduler or otherscheduler performing the preeemptiondecisions when there are multipleschedulers that allow to preemptindependently they can intentionallytrigger round robbing invictions andthis repeated repeatedly preempting eachother's workloads and this will createinstability and is especially painfulfor machine learning jobs that havelengthy code starttimes so to prevent this efficiency makesure we only use a single uh schedulerthat enablespreemption now let's quickly walkthrough the end to end workflow imagineyou're submitting an urgent workflowwhich for example a bond pricing modelthen with Kamada you would specify yourtarget target cluster affinities or justselect the network tier as well as theworkload priorities directly within thepropagation policy shown on the rightand then commada then immediately placesthis high prciority workload at the veryfront of the priority based schedulingqueue guaranteeing it's considered firstif resources are tight Kamada carefullypreempts low priority jobs aiming forminimum disruptionthis will ensure your critical jobs runsmoothly smoothly even under resourceconstraints and lastly I also want toshare some lessons we learned uh whileworking on this feature so beyond justhaving a powerful scheduling mechanismit's also essentially it's alsoessential that we balance platformefficiency with our user experienceon one on one hand users canunderstandably expect their mostcritical workloads especially the timetime-sensitive ones to be prioritizedand executed promptly on the other handour we as platform engineers want toensure that GPU utilization rate is highso valuable resources are not sitting inidle inidle yet when scheduling decisionsbecomes more complex they tend to becomemore inconsistent and users start to askquestions as well as frequentpreemptions also disrupt ongoingworkloads both of these can lead to userfrustration and ne negatively affect ouruserexperience so it's important to strike abalance between these competing factorswhich will allow us to maintain both anefficient platform and and happy usersso here's a high level review of ourapproach to balance the two competingfactors on one side to maximize platformefficiency we choose to use a maturemulticluster open sourcesolution that will u as well as addingpriority and preemption feature thiswill allow us to reduce resourcefragmentation over the clusters andimprove GPU utilization rate as well asum making sure high priority jobs getwhat they want on the other side uh foruser experience we emphasize fairnessaiming to minimize disruptions bypreempting as few jobs as possible andonly whennecessary additionally transparency isanother important thing will provideclear visibility into the expectedqueuing times of the jobs as well asproactively notify users users whentheir jobs are preempted last but notleast we we offer a uh practicalrecommendations for using our platformsuch as checkpointing training jobs toimprove joboutcomes looking forward we're activelyuh finalizing and enhancing thesepriority and preemption features so yourfeedback is invaluable to us please joinus in help shape the future ofmulticluster scheduling for yourorganization thank you so much forlistening to our talk i'll also give abig shout out to all the commodcontributors and reviewers who have madethispossible uh now I'm ready to take anyquestions um but first let me switch themost important side of this talkand floor is yoursthank youhello hello thank you for sharing i haveone question i think like command onlyresponse for distributing uh the job foruh clusters but not like I have a verybig uh ML job want to do it's not ableto merge different GPU on differentclusters to run my job right okayyeah I see the head not thank you yeahcurrently only supports jobs in oneclustersoh you mean one cluster is supported forjobs yeah so for scheduling of jobs itwill be only scheduled to one clustersah so it's not like syncing mill syncingfrom mill clusters uh okay so I wantedto ask that how to resolve data localityproblem on the management clusters sotraining workload does work for so datagravity matters right for trainingworkload how to cara management clusterresolve the data locality for thetraining jobsuh are you talking more about like dataset or are you talking about Yeah dataset data seti'm going to take this okayoh does this mic work oh for I think forYeah that's for actually uh withinBloomberg our uh solution is that wecache the most important model in thelocal node we have very large like MVMEdisk so we have the model cachemechanism like it's also a controllerbased on the Kubernetes controllernative way to cache the uh uh users uhdata set on the local disk so it makesure they have very uh quick startuptime but not for all the but but wedon't like cache checkpoints okay thankyoui thanks for the talk do you alsoconsider the expected runtime of the jobbefore you preempt one like is the userable to say like I expect this job torun 1 hour and then 40 45 minutes withinthe job you p you preempt it so can youpredict or allow the user to specify howlong he expects the job to run so thatyou don't evict something that is justrunning or has an expected runtime of 5minutes yeah I think that's a goodquestion uh we don't have any sort oflike uh two completion time um as Ithink users felt it's harderto provide a really fine grain uhevaluation so they can say okay thiswill run for like a week gener uh in anexpectation or there will be like withina day but no it's harder to provide anymore fine grain estimation yeah maybeyou can you can train it from like havea horistic that creates like an expectedruntime or maybe penalize people whowill expect or who who classify theirjob as running for an hour and they onlyrun for Yeah we're actually thinkingabout that uh maybe if a user provides ato completion estimation then we'llprioritize their jobs since we know uhwhen they'll finish okay cool thanksthank youhello guys thanks great talk i justwanted to ask how do you calculateestimated queuing timeum that would be a running uh runningaverage of all previous jobs submitteduh using the same GPU size so forexample four GPUs we have like anaccumulated uh average okay so averagebased on historical data right yeah yeahbasically thankshello thank you for the talk um I haveone questionum the preeemption is based on the timeallocation and I presume you havehundreds of teams who you are supportingright there's always going to be kind ofresource crunch budget crunch acrossmany clusters that you are deploying sohow do you tie this back to your costmodel and cost efficiency model is therea way you can actually alsouh tie that backUm can you elaborate like what do youmean by cost model cost effectiveum training is the most expensive partum of the machine learning life cycleandtypically any hardware that you also ouruh our budget uh man uh management so uhtraditionally uh each tenants we givethem likeuh based on their uh yearly plan like wegive them like fixed resource kota likemaybe a team require 100 GPU and each ofthem is corresponding to a resource kotain their name space and we observe likeyour GPU resource is really is reallywaste so and then I think training uh inthe training platform we intro weintroduced the time based budget we havesome calculation like if people use oneGPU and one hour how many uh like howmany it will cost and then we have theuh maximum we have the uh uh we have away to to collect all those metrics andsave that in our database system likelike for quarterly or monthly we willgive user the bill but we will stillpresent some resource for the user sothis is how we manage so basically theuser when they submit to the job theycan use all the resource on this clusterand we will um we will build them basedon the uh GPO hours they use yeah got itokay i have a followup maybe I'll reachout to you thank youthanksuh I was wondering if uh in terms ofserving priority you also have thesesort of uh scheduling use cases forinference whether you have some highperformance um and if you can talk alittle bit about that and if you're alsothinking about karmmada for that or someother approach yeah exactly karma ismeant started off for like longunningservices and applications so that'sexactly why we're kind of finalizing ourpreeemption design so we already have apreemption mechanism running uh inproduction for like uh ML workloads thatrun to completion so we want to make ituh work for both jobs and workload uhand servicesthanks for great presentation i have umuh I would like to ask about onespecific uh aspect uh in case of yourimplementation what uh solution you usefor your networking especially when itcomes to portto communication amongdifferent clustersoh wait uh we don't have any workloadthat like especially ML training jobsthat will span multiple clusters becausewe assume it could be in different geollocations which doesn't make sense withthe model network bottleneckthank you thanksthank you so much[Applause]2025-04-15 22:03:33.639741epretty substantialpresence of engineers just working on uhthese uh internal developer platforms Uhdo other folks want to weigh inyeah I'd be be happy to go becausethere's a nice contrast I think like 140people that's more than our entirecompany Um we do we do have a platformteam as well but from our perspectivethe platform team is more aboutprovisioning the WV8 database for ourcustomers So it's essentially you couldsay the platform team is what runs ourSAS So in in that sense um it's not somuch an internal team catering todifferent teams with different workloadsbut it's very much a a specialized uhplatform team that really just and it'sa bit oversimplified in saying it justruns the we the WV8 database because ofcourse there's so many other things thatcome with it be it observability andthen if we're talking AI and GPUsthere's different observability loads umas well um but it's it it gives the teammuch more of a focus compared to justany AI workload popping up anywhere inthe companyAnybody else yeah just wanted to addit's actually very interesting It's apoint that I've been hearing about moreand more over the last couple of daysthat the concept of like platformengineering and platform team changes alot So of course like platformengineering in Uber is different the onein weate and it's different the one thatwe have at overstore which is a muchsmaller startup and so it's also youknow the task of the company and theteam to adapt to you know what are theuse cases of the business to build aplatform team that fits the company uhrather than a company that fits theplatform team that doesn't really workright Tim did you want uh no I guess Idon't have much to contribute there Iguess we're we're a tooling vendor at atour heart so we don't actually have ourown platforms to maintain fortunatelyDefinitelyActually we're talking aboutcustomization extending Kubernetes So uhmaybe starting with Andrea What kind ofcustomizations have you seen uh you knowthat people have been doing and if youhave any adjacent projects that you'veseen people bring in to Kubernetes uhfeel free to talk about that as well asthey build out the data pipeline Yeahabsolutely So as I was mentioning beforewhat we do over story is uh uhvegetation management So basically whatwe do is that we get satellite imagesfrom satellite provider like Airbus orMAC or Maxer and so on and basically wemerge that information with informationthat we get from utility providerbasically the folks that bring energyand electricity to your homes So wecombine these two types of data and wecreate a risk profile The idea is thatfrom satellite images we can look atwhere high tension power lines are goingthrough So where they go through aforest or through cities and wheneverthe vegetation gets too close to thepower line we can notify the energyprovider so they can go and trim thevegetation before a wildfire spreads outWildfire are massively you knowdestructive event They bring a lot of uhdamage to our community to people's lifeand businesses And so we are workingtowards preventing them that rather thanreacting to them Yeah thank you We kindof need that in the US Yeah the US isindeed like a a big you know manycompanies in the US are customers of usjust because it's the the land is veryvastOne thing that uh uh I wanted to addthat I will talk more about this rightafter lunch I have a session dedicatedto how we do and that we try to handlethese problems at over story So in caseyou're interested I have a talk latertoday right after lunch uh at level zeroroom J um where I will talk specificallyon how we tackle these problems But justto give you like a brief intro or Iguess you can call it spoiler Uh we useuh Google cloud specifically we usecombination of uh GKE and Google CloudRun GKE has been extremely useful forour type of use cases Satellite imagescome in all kind of shapes and forms andresolution We use resolutions up to 15cm So that's very high resolution highimagery and the flexibility that we getwith Kubernetes and having thepossibility of having workloads that useone CPU and 2 gigs of memoryf or 72 CPUsand 450 gigs of memory It's it's veryimportant for us On top of it one of ouropen source project that we use iscalled Daxter This is the tool that weuse for our data data workflow engine uhit's a open source project that worksreally well in the cloud in my opinion Iam uh an open source contributor of theproject I've been working with it forquite some years since version you know0 something and yeah it has been reallyexciting to see this project grow uh addmore integrations with Google cloud aswell and with kubernetes especiallywell this is great thank you for a niceplug for both GKE and cloud run but thisis definitely uh not only uh you know wehave other uh upstream Kubernetes atUber and other forms of Kubernetes soit's not really a vendor pitch oranything like that but thank you Um sowe we talked a lot about uh you knowkind of workload placement and uhdynamic resource allocation in thisconference So um you know so especiallythe DRRA is a beta feature in KubernetesSo what are some of the advancementsthat you're seeing um in um you knowresource allocation or workloadplacements i'm gonna start with Tim onthat one Uh yeah Yeah No I've I'veprobably been at a half a dozen talks onDRRA uh in the past couple days at leastUm I think DRRA does start introducing alot of really interesting capabilitiesto make sure that you can start doingvery very fine grained allocation ofresources Um alongside that one of thethings that I'm here pushing for thisweek is is being able to do a verysimilar set of management tricks to CPUsUm and then building off of thatcombination of fine grain resourcemanagement to start talking aboutcopying that same pattern acrossmultiple nodes Um and maybe that thereare some missing primitives to to betterhelp enable multi-node AI trainingworkloads um with a very long towardstrategy towards being able to nativelyprocess training um which is isgenerally being done on just slurm onbare metal today u but maybe it can bedone on kubernetes on bare metal withslurm providing our our schedulingexpertise into the mix Yeah Um if youthink uh a dozen uh talks on is bad youshould try being an upstream contributorI've been on a dozen pull requests aboutDRA in the last month But uh no so uh Ithink I'm going to talk about somethingthat isn't the RA for once because I'msure you've all heard a lot about thisUh one of the other things that'schanging a lot especially in the nodeintegration of Kubernetes and uh thiswas not this has not been pre-arrangedbut shameless plug at the same time asyour talk unfortunately I am having atalk on what I'm about to talk about Souh what a lot in uh the latest releaseof Kubernetes there's now support for inplace pod vertical scaling Now what doesthat mean that means you can change theCPU and memory requests and limits of apod without stopping the application Youcan change it even after your pod hasstarted running That is particularlycool for both aim a IML for training Youdon't have to disrupt your applicationUh and it's also cool for batch jobs andall sorts of things in between Uh alsowhat's changing is there's uh if I wasto make take it to a bit more of a widerangle there's definitely a lot more of apush that pods are less cattle and morepets especially nowadays with moreadvanced workloads where maybe in thepast the project has not been the bestfor them I know racm uh we very muchhear the uh sentiment across the projectabout the fact that there could bebetter integration there and we we'llwe'll try and do more there at least inmy opinion we will um and then uh whatwas the other thing yes and uh the otherchange that's coming that is also quiteuseful here is pod level resources youcan share your resources among multiplecontainers in the same pod uh whichbefore of course you had to define yourresources at a per container level thatis quite useful you can uh balanceresources across a single C group acrossmultiple containers that are part of thesame training workload Obviously that'snot everything Uh I think that forexample we need to think a lot moremaybe in the project about in if I'mtghinking a IML maybe multiode inferenceit's not particularly the funnest thingin Kubernetes to do today especially ifyou have multiple pods that you have touh kind of keep together kind of relatedbut Kubernetes doesn't really treat themin a related way Um but yeah there's athere's a lot coming I think in uh inadvancements 2K itself on this Uh Lucyare you doing a talk in the afternoon atthe same time as Andrea's talk so makeyour choiceUm so actually um we talked a lot aboutrunning Kubernetes for AI You know atcan you talk about some of thecharacteristics is it kind of like aconstant volume or is it very bursty andhow does multi-tenency play into thispicture yeah Yeah So we definitely seeboth May maybe just to to quickly startwith what Lucy just said about theability to change pods um that arealready running Um that is sort of thefirst category of three categories thatwe see because VV8 is a databaseDatabase is a stateful load Statefulloads on Kubernetes is something thatfive years ago people would have saidyou're crazy Um I'm sure these days alot of people still think so but it'sit's fewer and more and more people areare doing it just because the theflexibility in general gives us so muchvalue Um but yeah stateful load and thenspecifically for vector databases youcan almost think of them as in-memorydatabases for for many cases and thatthat just you're essentially bound bywhatever throughput you have from yourdisk at startup time to just load thatall back into memory and as as uh loadsgrow there's only so much that you canscale horizontally So the ability to tobasically vertically change a pot thatis that is a way of bursting I would sayum just the workload growing andshrinking and then of course in amulti-tenant environment if you have somuch physical uh infrastructure so manynodes you can share that across these ifif we can share that more dynamicallywithout introducing a restart that'sgoing to be a gamecher for for databasesthe other two categories that we have ismore thinking of probably when peoplethink of AI workloads they think more ofof GPUs and uh there we have we have twoone is um the embedding creation so webeing a vector database you can think ofthis very roughly as you have any kindof multimodal data you throw it into amodel that model spits out vectorembeddings and those vector embeddingsare what's actually indexed in thedatabase so what that means is even ifthat indexation part is CPUbased youstill need the model to to create thosevector embeddings and you need them umyou need a lot them at ingest timebecause most people do bulk ingests andthen updates Um but you also need themat query time because at query time umsame thing you query in some kind ofmodality typically text can also beimages can be sound that needs to betranslated into this essentially arrayof of numbers again and for this uhwe're running uh what we call the the8embedding service which is essentiallyjust this it's a media to vector uhservice and that runs on Kubernetes inum sort of various pods that have GPUsattached toAnd there we're controlling theburstiness through multi-tenency So thisis a multi-tenant service Um and ofcourse you still have peaks aroundspecific hours or so So one of our ourcustomers is an email client for exampleSo there's very clear distribution ofMonday morning 9:00 a.m you get so manyemails Um and and that is something thatyou can plan for and you can you cansort of scale dynamically But then wealso have and that's the third bucketextremely unpredictable and bursty loadswhere for example um one thing that wejust recently introduced is called thetransformation agent and that'sessentially a AI agent that runs on yourdata and you can essentially kick atransformation off with a single promptSo you would say like take every singleobject that's in my database translateit to Spanish and write it back into thedatabase And that is extremely burstybecause you could have could havebillions of objects and and all of asudden all these translations areessentially LLM model calls And uh whatwe're doing for that that is essentiallhythe category we said like we don't wantto self-host this and we're we're umusing a third party provider uh modal inthis case um who from what I understanddo not run on kubernetes themselvesbecause it is so bursty because theythey really charge for CPU minutes whichis something that is great for us as aas a customer but quite a quite a bit ofa challenge from an infrastructureperspective I think I think Andrea youhave a lot of data that you'remanipulating as well do Do you want toadd anything here yeah exactly as I wassaying like satellite images they take alot of space as you can imagine So weuse very large amount of stoages andthis is this is why I'm also interestedvery much into um the array right thedynamic resource allocation because ourpipelines they always look the same moreor less whe there are a lot of choicesto be made but depending on the inputthat we get the pipeline might take alot of resources or less resources andwhen you run like workflow workload thatare a little bit more stateful thatbecomes a problem you don't want yourpipeline to go out of memory afterprocessing in you know 12 hours and youhave no checkpointing and all of asudden you need to redo everything fromscratch So I'm very much looking forwardto the you know new changes and newfeatures that will happen over thecourse of the next months and years inthat regardOne of the things that at least I'mtrying to get us in a better state aboutis being is having more checkpointingand being more disruption tolerant whendoing training in batch because GPUs arewell this is going to be a real shockerhere but GPUs are really expensive Umand uh it's not it's unlike CPUs whereoh you know just we can have a safetybuffer it's fine it's going to cost thecompany some money but it's worth itWith GPUs ideally I want to be using allof them all of the time to do at leastsomething and to extract value from thembecause getting because they are worththeir weight in gold Um so obviouslythat's we're never going to get to 100%utilization 100% of the time But how dowe get how can we even get close wellone of the ways we can do that at leastis uh there we will only be usefundamentally a load load follows a apeak and valley right and uh when we aredown in this valley when we have a lessload because people are sleeping etcWell really we shouldn't just leave inmy in my opinion we shouldn't leavethese GPUs idle We should we should uhwe should reuse them on a workload thatis more disruption tolerant Um that ismore disruption tolerant but we can veryquickly take the GPUs back if wesuddenly have a spike in a spike oftraffic a capacity crunch something badhappening right uh we are not yet thereyet We're working a lot on this projectuh within Uber that I won't go intospecific details of because I did thiscrazy thing where I signed an NDA formoney from them Um but we are trying toat least get into a position where wecan uh maximize the potential use casesthat our GPUs could be used for so thatif we have a disruption tolerantworkload we can opportunistically eatthe idle capacity rather than buy moredevices because yeah they're reallyexpensive Yeah So this morning's keynotetalked about the inference gateway Soyou can kind of low balance the requeststo the available endpoints Uh so now I Ihave do have a couple questions for thepanelists but if folks can go to themicrophone you can ask individualquestions Thank you for being the firstNo problem Um yes question about GPUmanagement Um I think I think I alreadypartly know the answer from Lucy but beinterested in the rest of the panel Umso do you work with a fixed pool of GPUsor do you use the autoscaler to get themon demand and why so at our scale wecan't the question I think was do wework with a fixed pool of GPUs or do welike use autoscaling with a cloudprovider or something right i I see ahead nodding so I'm going to assumethat's the question So the problem wehave is that our at our scale uh if weask a cloud provider to hold capacity inreserve and give us like a quot uh theysay we are not holding an entire datacenter hall in free in free capacity foryou So our GPU pool is fixed It's a mixof uh onrem GPUs In fact we still havesome quite old ones because they'reexpensive now So why not keep them andkeep using them uh we also have some inuh cloud providers OCI GCP Um but yeahit's a fix it's a fixed amountfundamentally and it's uh because of howexpensive they are actually if you wantlike more GPUs and you want them outsideof the shared pool that is aprocess because that's a lot of moneyI just wanted to say at at scale allresources end up being finite at somepoint Um I I think historically thecloud providers have have done a greatjobof pretending they're infinite and and alot of workloads been built around thisidea that they can ephemerally alwayspull in resources But what you're seeingout of especially a IML workloads is ohwe do need to queue up work that can'tinstantaneously run How do we starttalking about deferring work how do weprioritize work how do we preempt workefficiently and especially in respect tokeeping system utilization high uhbecause idling those GPUs is quiteexpensive Yeah And yeah at at justreiterating that at scale cloudflexibility just disappears It's notreal Yeah Yeah We found a similar thingWe used to use the autoscaler veryflexibly and then in the last year or sothe autoscaling latency is about threemonthsYeah Yeah that sounds like a threemonths sounds like about the lead timeuh on this stuff if not more ThanksThanks Thanks for a question Are thereany other questions from the audiencethere's no lineThat must mean that we were really goodat answering everyone's questions Have aquestion for you actually Lucy Um soit's uh a lot of times it's like yousaid it's a very finite resource and sohow do you make your compute choices iknow that some people are making computechoices using different variations forspecialized compute How do you do thator how do you choose workloads to go onwhich type of commute comput uh so wehave a right now it's a bit less mixingin in my opinion should be but uhfundamentally uh we have uh and actuallyI'll tailor this to a IML uh when youcreate a service as Uber that needs acertain that needs to do use GPUs youspecify what GPUs you actually need withgating that it has to be done by someonewho knows what they're doing you can'tjust be like a random engineer in anoffice who's got no affiliation with thea IML folks who just goes I would like100 uh H100s now um they that is thenplaced onto a machine uh typically onprem uh where those G where those GPUsare available and then ECAC thereoutside of that um we're trying to getbetter at collocation because we don'twant fundament fundamentally uh we wewe've at least found that uh collocationfirst off allows us to do more resourceovercommitting um and it and it but thechallenge there really is uh noisy isnoisy neighbors isolation there's nomagic bullet here it is a really toughproblem um yes I'd uh sorry anyuh no I think you covered it uh how manycores you run on kubernetes out in Uberokay so on kubernetes uh okay so at Uberwe measure things in TPU cores becauseit's like it's a roughly good metric ifyou measured in hosts like a host couldbe small big you know it's not it'suseful CPU cores has its downsides butit's a good enough metric I'd say rightnow I think we have about 4 million CPUcores on Kubernetes and it's going to bemore like 8 to 10 million within liketwo years so it's a lot I think we'rethe biggest end user in the world onlyevidence for that is I have yet to meetsomeone who's got more if you are an enduser in this room and have more pleaselet me know Uh but yeah uh one questionout in the audience Anyone using uh GPUsand anyone running TPUs kind ofinterested inthat GPUs we got a few Okay Uh do do anyfolks out there running GPUs like at theedge just checking if there's any folksAhokay Okay Um I I think uh are there anymore questions feel free to go tomicrophones If not uh the panelists willstick around for individualquestions Uh there'ssomebody Okay Well thank you very muchWe'll be we'll stick around forindividual questions Thank you Thanks2025-04-15 22:03:34.245493 zz��i�s#�� Ad9K5PSsHtDghi my name is Susan Woo Thanks forstaying back on Friday to hear from usUm I'm Susan Woo I'm out in the pro I'ma product manager in Google Cloud Ifocus on cloud networking GKE networkingnetwork security We all wear a lot ofhats here at Google Thank you So with meI have my panel a steam panel Uh we'retalking about extending Kubernetes forAI So come on upThank you very much We have no uhwalk-on music Thank youSo with that I'm going to let thepanelists introduce themselves They'reall miked up and ready Cool Thank you Umyeah hi my name is co-founder and CTO ofWeeb8 Uh Weeb8 is a vector databasecompany or actually it's more of a of anAI data platform these days but itstarted out as a as a vector databasecompany And of course we run a lot of AIworkloads uh on KubernetesHi Uh I'm the only one who wasn't mikedup We ran out of packs Uh but my name'sLucy I'm a engineer at Uber where I workon our platforms Uh we have a quitesubstantial presence on Kubernetes We'reone of the biggest end user deploymentsin the world which uh I'm sure we canget into later And then we run a mix ofa IML workloads both for training andfor inference uh across non-criticalstuff but also in the core trip flow ofUberGood morning everyone My name is AndreaI work for a company called Overstory Wedo vegetation management using satelliteimages Basically we use satellite imagesto prevent wildfires And uh yeah I'mhere today to talk about how we useKubernetes in the cloud in Google Cloudto yeah make sure all our clients gettheir risk assessment on time using ourplatform Um I'm Tim Wickberg I'm thechief technical officer for Skidmd uhwere the principal slurm developers umalso now the developers of slinky whichis meant to be our set of integrationsfor bringing slurm schedulingwherewithal into the kubernetes stackOkay Uh I got a couple of preparedquestions but I at some point I'll I'llhave folks come on to the microphone andyou can ask your questions directly Uhso starting with Lucy you know forplatform engineers or infrastructureengineers do do you typically provideKubernetes clusters for sort ofconventional uh microservicesorchestration or do you have dedicatedteams for a IML workloads so um ourapproach is that we don't bad time for acough our approach is that we don't uhdirectly offer what I would call maybeKubernetes as a service uh our worry isthat it's very expansive and it's verypowerful but at the same time it leavesyou with uh very little guard rails touh do things uh maybe the way we want Wewant you to make do rollouts anddeployments in a safe way Uh we want youto do them within at least somesafeguards that we as a company have Sowe run a stateless compute platform thatuh users actually deploy their servicesthrough and that gives them a lot ofstuff for free safe rollouts safedeployment safe CD Uh our a IMLworkloads are then actually built on topof that uh there's a whole team ofengineers who build uh the platform foruh folks to deploy things like pipelinesand uh training jobs and etc uh that isthen also uh wrapped in a little bitmore opinionation in that uh you have toget your data source from certain placesetc Uh I'd say I think we have so I workin Denmark and in our office we donothing but build platforms and we haveabout 140 people in that office plusprobably like maybe another 20 plusworking on a IML in the US uh plusanother maybe 10 plus in the U 10 plusscattered around the world as well Uhbut yeah it's a dk or CERN were uminvested in this and they suggestedvarious solutions uh mostly using mix ortime sharing or some mentioned some newscheduling strategies and the thing isthat this problem persists and in ourcase we experience it too but we aresuggesting slightly different approachto solving these problemsSo who is we um I represent Czechnational e infrastructure that operatesmulti-tenant multi-purpose kubernetesclusters that are free of charge forresearchers and academics in CzechRepublic These clusters feature around50 GPUs of various uh types and around300 active users that operate uhdifferent kinds of workloads thatgenerally fall into two categories ofbench workloads or interactive workloadsIn operations we aim for simplicity andsome level of general applicabilitybecause of these variety of workloadsbut also because by far not all usersare proficient with containersKubernetes or generally complexsetups Now I will present three insightsfrom our infrastructure So starting withthe first one on this graph we can seethe said reality So computational nodesseem generally occupied due to generousrequests but the in reality the realutilization is quite low Um in this datautilization in this like CPU utilizationgraph uh we can see computational nodesand their real utilization and the wholecapacity and the ratio between these twogenerally vise between 1 to 50% with anaverage being 6% which is quite sad anddefinitely could be improved But overallin cloud some level of overprovisioningis necessary because cloud should beelastic Cloud should be able toprovision for sudden spikes and alsocloud should be able to accommodateredundant workloads and if this is thecase with the CPU utilization then wecan imagine that with GPUs is sometimesworse Continuing with the second insightThis insight comes from the category ofbatch workloads So this plot shows alphafold jobs and their durations and theamount of and the amount of jobs fallinginto the relevant duration categoryAlpha Fold is a protein structureprediction software and jobs require aGPU to run and they usually run for manyhours normally up to 10 hours but we'veexperienced cases when the job ran for30 days since it's a batch workload uhGPU utilization is quite fine it's uhusually about90% so the case is obviously falltolerance in everyday data centeroperations hard or soft failures happenkind of all the time So we want toensure that these long running jobs donot lose their progress and so we wouldlike to have some mechanism for ensuringtheir fault totolerance The third insight is from theinteractive workloads category Uh onthis graph we can see GPU utilization ofa Jupyter notebook We can see that it'squite unpredictable and dynamic andthere are times and bursts of activityfollowed by inactive periods in somecases long as long as one day uh Jupyternotebooks or generally interactiveworkloads are expected to be readilyavailable but at the same time we don'tknow when human will interact with themand so by definition the resources areallocated but can remain underutilizedor not utilized at all and it is worthnoting that overprovisioning suchunderutil underutilized workloads evenworsens the problem of ineffectiveutilizationSo um based on these insights and alsolike the insights from the previoustalks we now can have kind of a wishlist for GPU workloads with two wishesFirst wish is a wish for um falltolerance generally for any kind ofworkloads but longunning workloads wouldbe very interesting and uh ensuringfault tolerance isn't essential it's notthe nice to have Second kind second wishis achieving efficient utilization ofresources mostly for interactiveworkloads Uh because we've seen what'stheir utilization pattern and alsobecause they're becoming more and moreubiquitous in theclusters And of course there exists someexisting approaches to these problemsconcerning utilization But uh not allapproaches can be applied to every usecase or to every workloadOverprovisioning is not suitable for allworkload types because workloads need tobe aware that they can beautoscaled Then GPU sharing in form olfmulti-instance GPUs is limited in up tohow many parts you can partition the GPUand also can lead to fragmentationTime sharing can worsen the performanceand also is not suitable in environmentswhere resources are shared by unrelatedgroups New scheduling strategies tend tobe workload specific and are notapplicable widely in generalenvironments So we are suggesting uh touse another versatile tool and to haveit in the toolbox alongside all theseexisting approaches and that istransparent GPU checkpointing And now toRed AustinThanks Vicki Um so as Victoria mentionedtransparent check GPU checkpointing canbe used for improving um resourceutilization or providing fault tolerancefor um GPU acceleratedworkloads Um last year at CubeCon we wehad a talk presenting how we can enablecoordinated checkpointing fordistributed applications And at the endof our talk there were many questionsabout how can we actually checkpointrestore GP applications and this is afollow-up session on uh this talk Um sothere are a few existing approaches ofhow do we checkpoint restore GPapplications and the most commonly usedapproach is to use something called APIinterception So this is where umessentially replacing the uh CUDAlibraries and or ROM librariesessentially JP libraries withinterception mechanism that um logs andreplace API calls and keeps track ofmemory transfers And the problem withthis approach is it's difficult toimplement You have to uh essentiallyexplicitly intercept every API call andyou have to track all the memorytransfers It adds performance overheadand you have to handle the loading andum essentially loading the uh kernels onon the GPU which is architecturespecific and uh it can be different foruh different versions of libraries andit also requires dynamic linking Um sofor example has uses static linking bydefault for the p runtime So you have torecompile certain applications Um andour approach is using recentlyintroduced uh GPU capabilities thatallow you to um essentially checkpointrestore GPU the state of GPUapplications We call this transparentlyunified CPU GPU snapshots because itallows you to create a single snapshotthat combines both CPU and GPU stateIt's fully transparent to theapplication It works with both staticand dynamic linking and uh it can it'sintegrated in into Creo through plug-inmechanisms Um so this allows us to beused for example with docker pmon othercontainer engines andkubernetes Um for AMD GPUs the uh AMDGPU driver is part of the NOS's kerneland the AMD team essentially introducednew API callsuh API calls that allow you to obtaininformation about the process currentlyrunning on the GPU and freeze theprocess from ex essentially evict thecues of the process And um the secondstep is to checkpoint the state of theprocess which is safety to um a filedescriptor and the unpulse operationallows you to um essentially unevict thecues So these are the three main stepsthat can be used to enable GPUcheckpointing and similarly for the coolcheck uh for the for Nvidia GPUs we havea few similar steps The first is toobtain the status Now what what is thecurrent status of the CUDA task um toessentially pause the execution of ofthe GPU process and then to checkpointthe GPU state into first memory and thisfunctionality is exposed throughuh command line utility code for thecheckpoint and there is a blog post onuh from Steven um describing how thisworks in more detailum integrating this functionality withKubernetes is fairly simple It's part ofCreo and it's fully transparent to theapplications So it doesn't require forexample to inject additional librariesinto the container image or to uhessentially modify the workflow theapplications that are being used AndAdrian is going to describe in moredetail how um the checkpoint restoremechanisms work in KubernetesUm so we have a a demo showing how wecan use this mechanism for uh hotswapping of models So in this case wehave the cluster that Victoria justpresented um using a single Kubernetesnode uh running a training port in aJupyter notebook And here we have highutilization but we consider mthis to be alow priority task that will be runningfor a long time And if we need to switchto for example hyper task we cancheckpoint deplication from the GPU intohost memory And this essentially allowsus to allocate the GPU to anotherprocess to essentially in this case usean inference workload with lama3.3 Um and this would essentially loadthe model into the GPU and startresponding to the request And if youhave another workload with even higherpriority we can essentially checkpointthe inference task again into hostmemory And then um after this ischeckpoint it We can umessentially we have two ports here Thefirst port is notebook and the secondone is open web UI which serves twousers The first one is the llama 3.3workload The second user is usingdeepseek So the deep sync um model hasfewer parameters and it's faster Itgenerates response in a faster way Andonce the high priority workload in thiscase deepseek has completed we canresume the execution of um of theinferenceworkloadAnd once this is completed we can alsoresume the training job at the endSo this transparent checkpointingmechanism allows to implement somethingwe called hot swapping Essentially uhswap multiple models or workloads uh toand from host memory um to essentiallyoptimize resourceutilization and Adrian has another demofor migration later on Uh just show theevolution results Um yeah Um so we havewe we we evaluate the checkpoint restoremechanisms with uh differentuh with different models and model sizesand GPUs And we we were able todemonstrate that it achieves transparentuh checkpointing for essentially fasterrecovery and to accelerate the goldstart times of inference worksum Adrian over to youYeah thank you So thank you for the umpresentation of the use cases and of thedetails about um how it works with GPUsI want to talk a bit more about thewhole state of checkpoint restore inKubernetes focused on on GPUs but umgive some background around it So Iprepared another demo a live demo So wehave seen the demo right now which umkind of preempts the the running umapplication on your GPU and in my demo Iwant to migrate um an LLM from one hostto another host while it's running Sohere so um this is a really simple setuphere So this isKubernetesand cryo from git from maybe three weeksago So I have this really simple um potdefinition here which I will start nowand I um I have a rubber script aroundit which tries to measure how long ittakes for the application to start sothat we can see um if maybe restartingfrom a checkpoint and I hope it's fasterit should be faster um is actuallyfaster than doing a cold start So what'scurrently um happening here is um thisis this is starting the the VLM It'srunning on an Nvidia GPU an A10 on anAmazon EC2 something Um it's a it's arail 9 based systemand it's it takes some time to start Itshould should be around 40 seconds andthen it should be ready to go Yeah So umnow the LM is running We can see it hereOh where's my demo hereNvidia SMI So we see it's it's usingaround 4 GB of memory on the on the GPUIt's it's a Python process and I cantalk to my LLM using um usingcurl and it basically um I can just uhgive it a sentence Cre is a Linux toolSo Creo is the tool with just acheckpoint restore below everything SoI'm telling it please finish thesentence create is a Linux tool and theLLM says you should definitely try itout I I agree and then it says a coupleof things So um we have this umcontainer running um in Kubernetes onour uh on our A10 GPU and now I have ascript which does the the checkpointingThe checkpointing takes about one minuteto write all the data um to the localdisk The um checkpointing interface inKubernetes is currently a cubelet onlyAPI That's why the command line is a bitmore complicated than just writingcubectl something What the script willalso do it will checkpointum the container from the host and GPUand then it will also transfer it to thesecond host Um what's happening now herein the background is so I talk to to thecublet The cublet talks to to cryo thecontainer engine The container enginetalks to run Runci talks to nCreu thetool doing the checkpoint restore toolAnd run as Rodstein mentioned then talksto the Cuda checkpoint tool The Cudacheckpoint tool will write all will stopum all running kernels on on your GPUwill write all the information to mainmemory Then Creo will take all theinformation from main main memory make astateful copy of your container and thencryo will the container engine will addadditional information about meta dataand about file system changes and allthis information is then written to diskand it takes about 1 minute and 4seconds And now this checkpoint imagewhich is quite large at this pointaround 11 GB from the GPU and an CPUside is now transferred to thedestination host Currently in Kubernetesum this is um independent if you usecryo or containerd they both create thesame file format It's a tar archivewhich contains all the information aboutyour running process is this if this isrunning with GPU or without GPU doesn'tmake any any difference at this point Sonow that the the checkpoint image hasbeen transferred to the destinationsystem I haveto convert this to a to an OCI imageThis is another script This is basicallyrunning builder in the background whichtakes the OCI image which takes thecheckpoint archive and copies it into uhinto an o which takes the tar archiveand copies it into an OCI image Thiswill also take a couple of minutes So Iwill switch back to the slides and talkum go back to the talk So um theinteresting thing is if you look back inthe history of checkpoint restore inKubernetes there's actually a ticketopen since 2015 which talks about it's avery high level discussion aboutmigrating workloads migrating potscontainers it doesn't go into manydetails but the interesting thing frommy point of view is that um containermigration has been discussed for over 10years now in inKubernetes and in 2020 we startedworking on this and we um put it underthe story of forensic containercheckpointing There were a couple ofreasons for doing it this way Forensiccontainer checkpointing the idea is wehave a container that's running and youthink it might be attacked or not andwhat you usually did or what tools todaydo is they just kill the container andhope that your attacker can no longeraccess the system With forensiccontainer check checkpointing the ideais you can take a snapshot or acheckpoint of your container without acontainer ever knowing that it wascheckpointed and without knowing thatyour container was ever checkpointed andthen you can keep the container runningand then you can do an off offline uhsandbox analysis of your container Ifyou do just just want to scan throughall the memory pages there's an optionor you can um you can in um you canrestart a container and see what it'sdoing if it's doing something maliciousor not And and the reason we did it thisway forensic container checkpointing inKubernetes is that the impact on thesource code of Kubernetes was minimal ifwe just do the for if we just just dothe checkpointing part This is also whywe today only have a a cublet only APIjust to to minimize the code changes andsee if this is a good idea to have inKubernetes In 2022 um Kubernetes withKubernetes 125 the feature was releasedas an alpha feature forensic containercheckpointing and since then you can useit In the beginning in it was a cryoonly feature and um over the couple lasttwo years we worked with containerd toget it also into into container D Uhwith Kubernetes 130 we switched the thethe label the flag of the feature fromalpha to beta The code changes toKubernetes at this point in time wereminimal it was only aboutum basically changing the label fromalpha to beta and but at the same timewe also uh were able to get the changesinto um container D so that container Dalso can do forensic containercheckpointing just like um just likecryo does the interesting thing is I'mI'm I'm presenting here and we'retalking about also restoring containersthe thing what we did or which wediscovered in the process ofimplementing this into in the containerengine is that we are able to restore acontainer without Kuberneteos knowingthat a container is restored So when youstart a container in KubernetesKubernetes does a call to the containerengine container create and containerstart And what we do in in cryo andcontainerd today we hook into containerstart and container create and start andsee if the image we passed is a is acheckpoint image and if it's acheckpoint image then we will restorethe container So from from Kubernetespoint of view it's a new container a newpart but it's actually a restoredcontainer This way we today canimplement um container migration with orwithout GPUs in in in Kubernetes Let'sgo back to to our our demo So the imagehas been converted to an OCI image andnow I have um I can um start thiscontainer So the the YAML file looksalmost the same I'm putting anotherscript around it to see how long ittakes to start And so what now ishappening isnow the c the cube will talk to cryoteplease create this container Cryo willdownload the image from the registry Seeokay we have information that this is acheckpoint image we will unpack it andtell run and Creo the tools below andCUDA checkpoint please restore thecontainer and do not create a new oneand this way we are able to um do astateful migration of the container umwithout losing any information and wesee so so counter six is a one of theexamples that it's um at the same pointthat it was here last time so if we talkto it here again we see counter fivecounter six and on on the on on on theother host we see also counter sixcounter seven Oh no this is the wrongcommand Um so this is the right commandSo we see um it it continues from thesame point It was checkpointed um butjust on another host and now we have twoGPUs programs running from the sameinitializationum on one host on on two hosts And it ittook a shorter time It only took like 25seconds instead of 40 seconds to startup So even the startup was faster goingfrom a pre-initialized um in GPUcontainer So back to the slides the thenext steps we are thinking about and wewe also discussed this with I think siknode Yeah sik node was it can we declarethe thecheckpointing can we declare thecheckpointing API as a stable in thecube So the basically finish theforensic container checkpointing umfeature move it from beta to stable GAand if this at some point will happenmaybe we can also think aboutintegrating in in cubectl we opened afirst Kubernetes enhancement proposal toextend the cublet API endpoint forcheckpointing to the API server and andonce it's in the API server we can alsothink about integrating in in it incubectl This is an an ongoing processcurrently and it will probably take sometime but we're slowly trying to make itfrom the lower levels of Kubernetes tothe upper levels um of of cubectl sothat it's easier to use than what whatI've currently shown you um in my demouh reaching out directly to uh to thecubelet API Another thing I just um sawlast Friday there's a company Weveroftthey released a tool for um forKubernetes um it's a it's a webinterface which you can use to umcheckpoint and migrate container I havenot used it myself it was just releasedon Friday uh but I I was very happy tosee that so that um seems like peopleare not just us three are interested ina topic there are many more people umlooking into currently So with this weare at the end of our presentation So wementioned thatwe are able to fully transparentlycheckpoint GPU applications Um it workswith AMD Nvidia GPUs and it's out of thebox integrated in Kubernetes today Umour demos have used checkouts from Idon't know three four weeks ago And withthis uh we're at the end of thepresentation Thank you and happy toanswer any questions you haveI have one one question on the uh sothis is transparent right so theapplication is not supposed to know it'sbeing checkpoint and restored how doesit uh are there any edge cases to thatso for example how does it handle thefact that uh all the network connectionsare broken and then it comes up with adifferent IP address and stuff like thatyeah so um this is one of the most askedquestions uh everybody has and so thecool the tool tool below Uh Cre handlesTCP connections So you can migrate a TCPconnection from one host to another hostThe thing is you need to have the sameIP address which is pretty unlikely in ain a pot We we are able to to recreateit in the demosby by editing CNI files or you can editthe the checkpoint information to havethe IP address of of the new uh port Umbut there's also the option what what weuse in our demos there's an option wejust say all open connections we say uhare closed and then the expectation isof course that the application handlesautomatic networkreconnects So um one of the problemswith TCP connections is that umessentially with crew we have a lockingmechanism that has to drop all incomingpackets from clients and this preventssending the tier set bucket to theclient Um but in Kubernetes because thenetwork name space is allocated to podsthis mechanism it doesn't workessentially um it Creo cannot apply thismechanism and we checkpoint restorewithin the name space and as Adrianmentioned there is Creo provides a TPclose option which would can be added toa configuration file that essentiallycloses OTP connection and thenessentially reconnects after restore andin general it depends on the use case Sofor live migration it's important tocheckpoint restore TP connections butfor fault tolerance where essentiallyyou can restore the checkpoint a weeklater or something like this then it'snot that uh essentially we use the TPclose mechanismOkay thank youDo you see any kind of fundamentalblocking points to get actual livemigration the way we are familiar withwith VMs transferring between nodesso at this point it's from my point ofview it's not a technical problem It'sfinding the right way to integrate it inKubernetes It's a it's kind of a featurethat breaks a lot of assumptions becausecontainers are stateless and things likethis So I think that's that's the hardpart finding the right way how tointegrate it in in in Kubernetes withoutbreaking any I don't know infrastructureassumptions from from from something Idon't know that's yeah so one of theproblems with GPUs I guess for lifemigration is that you need to have thesame GPU on both sides and you need tohave the same number of GPUs so this issomething that could be improved in thefuture but um in general there arecertain limitations you need to have thesame libraries And yeah there arecertain um requirements for lifehoundation to workY but I think those are the same withwith VMs If you change your CPU or yourGPU then it's probably not going to workUh hi Um so follow up to the firstquestion that you got Can this be usedfor uh databases like say you have aMySQL database running or a radius cacherunning uh how effective would this beforthoseso I we checkpointed both We have theMicrosoft SQL database was checkpointeda couple of years ago and migrated aswell as as Reddis is is an example wealso often use Yes it works Um the Ithink the main problem is uh like thefirst question do these tools handlenetwork reconnections automatically so Iguess this would be something uh thetools need to handle to to make it workmore easily in in any environment Ifthey do that then I would say sure itworksAnd in terms of performing the operationitself you almost need the techcheckpoint in both places Is there a wayto sync the checkpoints in both placesat onceuh I I don't know So we we need totransfer it just we don't need it inboth places We need it on the systemwhere we want to restore it Ah so youmean it has to be on the same nodeWhat about a multi multiode as in doesit support pushing to object storage orNFS yes So so the way how we domigration is we convert the checkpointimage to an to an OCI image and push itto a registry and then we can use itfrom from any any any node any anycublet The thing is this image wasrather big So if I would have convertedit on the original system pushed it to aregistry and then downloaded it on theother machine then it would have justtaken much longer because I have to theregistry is just slower than using oursync from one node to another node2025-04-15 22:03:34.753794 � �� v#��QASsTUGO9YbnQhello everyone Welcome to the panelabout quantum computing and KubernetesWe've talked a lot about how Kubernetesis complicated Now we make it even morecomplicated by throwing quantumcomputing into the mix Um and so most ofus have heard the term quantum computingmostly as a theoretical pursuit We'venot really seen what the real worldapplications around it are and what doesit really mean to run quantum workloadson Kubernetes and how we can make theKubernetes and cloudnative ecosystemquantum safe We've talked a lot about AIThen why are we even talking aboutquantum now because AI is already socomplicated because if we don't starttalking about quantum now we're going torun into trouble later on So we need tomake progress while we still have timeSo since we're short on time we've gotjust got 30 minutes and I'll try toleave some time for questions Let's diveright into it So we've got a wonderfulset of panelists here Uh why don't youall go and introduce yourselvesso my name is Natalie Fischer I work forVMware by Broadcom with Nikita on theKubernetes um area stack So uh I work asa product manager thereHi there everyone I'm Nigel Jones I workfor IBM research Um so I'm involved inpostquantum cryptography and uh quantumand our services there and now doingsome work with AI So it's an interestingintersection to talk about todayYeah And I'm Thomas Koson I'm chief PKIofficer at key factor and I've beenworking with cyber security and publickey infrastructure PI and opensource forthe past 30 yearsHi I'm Ricardo Uh I lead the platformsinfrastructure uh u��M�u#��QAnHGzMmstR0Ewelcome everybody to the streamlinedefficiency and shackling Kubernetesimage volumes for rapid AI model anddata set loading presentation todaywe're going to talk a little bit abouthow we can go about speeding up umvolume loading when using image volumespecifically so to start us off uh I'mEssen Ree i am a software engineer fromMicrosoft i specifically work on Ashurecontainer registry my oh one secondum my main area of expertise is OCIconformance as well as artifactstrq��O�t#��UABSoEY_tpxIoum sotoday we are going to be talking abouthow we use efficient transparentcheckpointing for um AI workloads and inKubernetes and uh this work is acollaboration between my supervisorsprofessor Rodrigo Bruno and professorWes Armo and a few people from um Nvidiaand AMD Um Victoria over to youHi So do you know what all these talkshave in common so during past CubeConsfolks have alreadypresented some problems regardinginefficient resource utilization andfault tolerance These people wereworried and concerned about ensuringthat GPUs are used efficiently and thatany problems that arise concerning GPUare resolved simply because GPUs areexpensive resource We can see thatpeople from companies like GoogleMicrosoft Huawei Nvidiajreaming my co-presenter over there isYan Yuan from Alibaba cloud he's asenior software engineer there and aresearcher who's done many manycontributions to the overlay BD projectum so to start us off I want to speakabout the current state of efficiencywhen it comes to starting up images andhow we can deal with accelerating thoseso the first thing to note is that todaywe have achieved a lot of progress onimage startup the first thing is we haveartifact streaming however artifactstreaming helps us solve one problemwhich is how do we optimize uh workloadsthat are specifically applicationbasedbut when we start to deal with we needto have these large data sets along withthis we start to encounter some problemspackaging our information into ourcurrent images and then streaming themis not necessarily efficient we want tobe able to load data at a scale but atthe same time we don't want tonecessarily pay the upfront cost ofpackaging this data and putting it inour container registries so when we'redealing with large language modeltraining or just AI in general wesometimes need to load data in parallelthis data parallelism presents a lot ofchallenges and means that we actuallyneed to be ableto simultaneously access lots of dataacross multiple places at the sametime uh this presents a number ofchallengesum the first of which is we need to beable to have data that can be accessibleall the time so if we don't havecontinuous data access we're going toend up paying a lot of money for GPUsthat are not in use we're going to beunable to actually load all the datawhen we need it as we needit we also need to make sure that whenwe do access the data it's actually fasttoaccess additionally because we'redealing with Kubernetes clusters weobviously need to be able to scale upour access if we're not able to scale upto thousands of nodes then we're notgoing to be able to use this dataeffectively when training AI models orany other thing that requires large datasets andfinally there are some things aboutmanagement that we really need to thinkabout first of all this data is likelyto need to be versioned so we need tohave versioning that matches both thedata and the applicationsfinally but most important of allmanaging this data needs to be easy wecannot have any applications that aretoo difficult to manage because nobody'sgoing to use them so given that we havethese challenges we need to considerthat we do have a lot of solutions forthis so here's where image volumes comein from OCI so when we're talking aboutOCI registries we already have a lot ofthese challenges tackled first of allOCI registries need to be performantbecause they already need to scale upfor these Kubernetes workloads theyalready need to be highly access highlyavailable and in general people aresomewhat familiar with them so we alsoknow that they solve a lot of theversioning problems we have tags forversioning as well as digests andmanifests so specifically tags can giveus the ability to have arbitrarydefinitions for our different um imagesright so any data that we have we couldversion this way we also get the benefitof using digests within our manifests toactually maintain data consistency andalso have versions that will not changeover timebeyond all of this there is one othergreat benefit that we get from havingour data in OCI registries we can enablegarbage collection for this now this isvery important because every year datakeeps growing and growing and it's outof control nowadays uh there's been somestudies including an IDC report thattells us that the total um data in theworld will grow by 175 zetabytes by 2025and if you're an organization who needsto have these very large data sets thatare versioned as well you're going tokeep using lots of data and if you'renever able to clean that up because youdon't know what sort of runtime it'stied to then you're just going to end upaccumulating costs with no end insight so I mentioned this a little bitbefore but users are already familiarwith OCI registries the distributionspec is well known and well defined andin general people haves to use it forKubernetes of course tooling is alwayswonderful so being part of the ecosystemmeans that if we can actually leveragethe registry to also tackle thisparticular challenge of loading largedata sets then we can get a lot ofadvantagesnow has this been done before to someextent uh there's actually a longhistory of using OCIum images to store arbitrary data theOrus project for example formalized thisat some point by defining how you canput any sort of arbitrary data into anOCI artifact we also have image volumesof course uh as the presentation hasbeen talking about which allow you toactually load um container images into afile system in the Kubernetes side ofthings okayum there are some questions why we mightneed to hold off on doing this sort ofthing so the first thing is thatregistries were not designed initiallyfor volume mounting uh what this meansis that there are some things that don'tnecessarily work out of the box or atleast not across all registries uh forexample we need to account for the factthat different registries tend to havedifferent size limitations for examplethe ACR registry has somewhere around a200 uh gigabyte limit per layer but thisis not consistent and it's not specifiedin the OCI distribution spec so itvaries depending on implementation theother thing is that while we have beenable to get streaming support working onum on container registries this was notsomething that was built into the specinitially now there are some things thatallowed for it to happen today uh butoriginally was not built for this soit's not something that we can say wasuh intended from the startthe other thing is that there are somelimitations when it comes to how westore the data inside of images that gointo registries oci registries tend tostore information using a overlay filesystem which is a union file system thatbasically creates new layers every timeyou modify the data so if you're dealingwith a very large data set this mean avery large data set that might bechanging at any point this means thatyou might need to actually constantlybuild new layers and eventually you'regoing to run into some layer limits andthere are costs to this sort of overlaystructure this means as well that everytime you're adding new data or even atthe first time you're doing this youneed to package a lot of this data andas we'll see in a moment packaging canbe quiteexpensive so here is a bit of anillustration of what the cost is forpackaging uh a bunch of data so wegrabbed some data sets from kaggle.comwe grabbed some popular uh machinelearning uh training data sets just toget a little bit of an understanding ofwhat the cost is we used copy fromdocker using a dockerum a docker file and uh just to try toillustrate this so people can replicateit if they want to and then we startedpackaging these and we noticed a numberof things first of all as you have morefiles and generally larger images youtend to have much much larger times topackage for example the largest data setthat we had in this case actually tookalmost 4 hours to package and it's only22 GB in the modern world 22 GB is notthat much but for the purposes of imagesit can be quite a bit i mean I do wantto note that this particular data setdoes have something like 700,000 filesand they're mostly images so they're notvery compressible as well and there'ssome considerations there butnonetheless even the smaller data setsin the order of just a few gigabyteswould take many minutes to actuallypackage so we can start to see that thisis something that is maybe notideal so from hereon we've noticed that we've alreadysolved a lot of the problems but we havesomething to consider so the registriesprovide high availability andscalability and performance we have somemeasure of versioning data and we cansupport data uh garbage collectionnatively they're also widely adopted uhin use across a number of tooling and ingeneral they are something that mostpeople will already be familiar with sothat said how can we solve that biggestchallenge for us which is packaging sowe can actually use this for largter dataset loading in AI work uh in AIworkloads so now my colleague Ethan willintroduce what we've actually done tofix this issueuh okay thanks Estban for his speech uhnow I will show you how to solve thisproblem rightearlier uh the inspiration for this camefrom the indexing to OC image uh thecommunity already has some solutionsthat a remote snapshot can create amount point for streaming by externalindexes and uh without repackaging thethe image such like the SQL OCI andoverlayduh the core of these solutions is theycanuh package a index file of the imagelayercontent anduh they are all meta data by hooking theIO request at the file system level orblo it combined with the image indexthey can find the corresponding data forthe remoteimage so uh taking a B speculation couldwe build an index for the entire storagebucket uh my answer is yes i'd like I'dlike to introduce you the inuh that solution can create a mountpointfor accessing remote data through OCIartifact and uh no need to package themuh the data set description referencelist and remote slab shorters are thecore of this solutionuh reference list is a set of dataobjects you want to package into the OCIartifact remote snapshot here we passthe reference items from the referencelist and create the model point forstreaming uh more details reference listuh contains a set of records each recordhas at least four parts the source passuh is the original pass for the uh forthe object in the remote storage themount pass the pass when you access theobject from the mount point uh it tag ittag reflects the data change and itrepresents an version of the objectpeople can know whether the data hasbeen changed from the E tag instead ofdownloading it and file size the objectsize inbite through the reference listuh we can describe all the object wewant to access within OCI artifactuh now let's package the reference listinto OF artifactUh first of all we need an registry totake over the backend storageuh which enable us to get the targetblob URL from the registryas we all know uh when we try to get ablob from the reg registry API theregistry will return either a redirectURL or the blob content depending on thestatus codeso it's easy to get the back end storageendpoint from the registry API andcombine with the source pass field wecan get the actual pass uh of the targetblock second uh a special annotationfield is needed this can uhidentify that the current layer is areference list which can convenientlyenable the snapshot to support accessingremote files for the existing OIartifact uh of course there may be abetter approaches in the futureuh and also the reference list shouldshould be saved as a common format likeCSV or uh JSON or something else so thatthe registry can analyze the details ofthe reference list to get the valid dataacross the entire storage bucketwhen we create a command point thecontainer runtimes should pour andunpack the reference list through theregular image poolingprocess and the slaughteruh will create the uh mount point byparsing the reference items[Music]uh yeah uh slsher should create a fileentryuh create a file entry at the relativepath of the mount point according to themount path field and this file entryactually point to its sourcepath uh please note that the accesspermission is required here because theregistry may not have the accesspermission for all the remoteobject uh so an additional authorizationmay be neededuh streaming loading enables thecontainer runtime to mount the imagevolume without pulling it escape thedownloading process of theartifact uh and more important when theuh when the data set is too large itwill occupy it will occupy a lot ofstorage space and even failed todownload it so the streaming loading isessential hereuh when the streaming streaming servicehandle an IO request uh it shouldconvert the uh IO request to theuh corresponding corresponding data ofthe referencelist after uh ensuring that the e taghas not been changed uh it convert theIO request into a range gate for theremote targetfor the proof of concept we chooseoverd provides a merged view uh of imagelayers as a virtual block deviceuh it supports on demand transfer dataas disk sector level and it also uhprovide a bound point where a blockdevice interface rather than fuse whichhas a better performance uh in the smallfiles accessing and more major inhandling the stability problems likecrash recoveryuh the most important is overd has beenwidely used in the productionenvironment of Asia and Alibaba groupand data bricksuh overd has a lightweight mode calledturbo OCIit already support indexing to OCI imageand also it can build for ERS filesystem from image indexlocally uh to support eink we can justimplement a simple function to map theIO request into referencelist since OABD provides a backendimplementation of the of a block devicethe IO request from the mount point willbe converted into a simple read andwrite operations through the file systemand uh blockdriver that is also one of the importantreason why we choose OBDwow based on the plan above uh weconducted performance test on Elink forpackaging and a full volumeaccess during the packaging phase Eninktransformed the file copy into recordingthe reference data which gives Elinksignificant advantages in both packagingtime and build speedfor the 22 gigabyte data set whichcontains more than 700,000 files thepackaging of OCI image takes nearly fourhours but Elink build speed is less thantwominutes uh at last we tested the EU dataaccess performance uh this chartcompares the throughput of traditionalOCI images goofies andelink goofies is a high performanceimplementation of AWS S3 file system theright access of this chart shows theaverage size of individual files in eachdata setin this test we add the preparepreparation time the preparation timefor OCI image uh include the downloadingand unpackaging which is the uh defaultbehavior for creating a OCI volume mountpoint for goofies and for goofys and thepreparation time is in uh include thetime taken to create a mount point andbuild the file system metadatathe result shows that in the scenariowith a large number of smallfiles Elink performs significantlybetter than OCI image and goofiesuh in the US accident data set with alarge single file elink performance isalso on par withGIS and all the test environments wereprovided by Alibaba clouduh that's conclude our presentationthank you for listening and welcome anyquestionsuh thanks for the presentation uh I justhad a couple of questions uh one is howdoes this uh work with uh six door imagesigning and uh second when you are uhstreaming a blob how do you verify thechecksum on the target host thanksyou want to handle that oneum so I don't think we're yet at thestage where we're doing a lot of umsecurity verification for some of thesethings we are doing uh verificationusing the E tags to at least validatethat theum that the contents that we'rerequesting are actually what we want toget uh this is still somewhat earlystages so I don't think we've gottenquite there as for streaming purposes uhdo you know what kind of validation wedo there oh we we have to change some wehave to change some for the file toverify thedata is correctyeah uh I just want to know when you arestreaming before the if you get theentire uh entirety of the data how doyou verify the check sumyeah he's asking how do we verify thecheck sum if you haven't gotten all ofthe data uh we we just just check thethe elink edink is the check uh M MD5check sum of the object um maybesometimes uh uh before we get the datawe we through the HTTP head response canget the E tag and and and the E tag isalso saved in the reference list if theE tag is is mismatched so the data hasbeen changedokay thanks2025-04-15 22:03:35.498188vteam at CERN Umresponsible for the teams uh doingcloudnative deployments as well asmachine learning deployments and uh nowalso starting to look at uh quantumcomputing management Awesome Thank youAnd as Natalie already said um I alsowork at Broadcom Uh my name is NikitaI'm a principal engineer there I've beeninvolved in the Kubernetes space for along timeNow quantum space interests me and Ithought why not we talk more about thisat CubeCon So let's get started Um we'vetalked a lot about the AI hype so I justwant to circle back and talk about thequantum hype When when's that going tohappenso I I think we're all learning aboutquantum computers and we're stilllooking for real world use cases butthese systems are developing rapidly andwe know that at some point if we takecryptography for example you know thatsome of the mathematical problems whichum those are based on could be attackeduh by quantum computers Um but it is itisn't just hype There are actuallyservices out there I I work for IBM forquantum services in the cloud and so itis about then making those available topeople so that they can startexperimenting and start learning um withthemYeah I can say what we are doing at CERNSo uh at CERN we have two maininitiatives around quantum computingActually there have been people lookingat this for for a few years now Uh thefirst one is uh CERN uh helps leadwhat's called the open quantum institutein Europe Uh so we have a a role thereand this project is more about thegovernance and the access to the thistype of technology Uh we have anothereffort that's called the quantumtechnology initiative which is actuallynow in phase two which means uh therewas a phase one before so it's it'salready progressing and here it's moreabout the technology aspects and notonly the development of the algorithmsand identifying the use cases wherewhere things can match but also ensuringthat uh we can manage these workloadsand deploy them uh uh efficiently andcostefficive as as well Uh so yeah we'requite involved internallyand working in uh in PIuh postcontography is one of the hottesttopics since NIS standardized the firstquantum safe algorithms in August lastyear So it consumes a lot of our timesand it's a big topic on all kind ofindustry focused or cyber securityspecific focused uh conferencesDo you want to go ahead um so one of thethings that kind of got me into thisparticular space is we were talkingabout cryptographic uh agility So howcan you quickly change cryptographickeys and it sounds like you are alsoworking on something very similar tothat One of the things that had sort ofdriven the projects that I was workingon because I used to work in research isum the US government had basically madean announcement in 2022 about having tosupport quantum computing basically by2035 and because the field is so nentthere's so like like everyone's kind ofsaying here there's a lot that we don'tknow there's a lot that's still going onthere's a lot of research that's stilloccurring and one of the things that'sreally important is try to figure outlike how can we get there because we'restill pretty far behind There arecompanies such as drug companies likeFizer that are using this to kind oftest out molecules So that's like one ofthe more practical ways if you will ofworking on it And then you also havelogistic opportunities as well that kindof plays in with AI workloads as well asquantum computing where they're tryingto figure out best ways to um identifydifferent routes and uh and and ways toget aroundSo I think you all raised a really greatpoint especially around cryptographythat there's a lot of changes that weneed to do So c can we like talk alittle bit more about that so in say forexample in this whole cloud nativeecosystem or just kubernetes to startwith what changes do you think we'llneed to bring ini'm a lot of almost all security inKubernetes be it MTLS or uh digitalsignatures JSON web tokens uh codesigning you know everything relies onasymmetric cryptography and that is asNigel said you know the algorithm RSAelliptic curve diff those algorithmswillw be broken when a large enoughquantum computer sees the you know thetime of light and shores algorithms canbe run on it So that's and of courseeverything not only kubernetus buteverything in society relies on thesealgorithms that it means yeah we have toupdate things or what we today think issecure will not be secure anymore I Ithink another part of this as well is weneed to understand what cryptographywe're actually using So getting acryptographic in um inventory you knowwe've talked about sbombs for um bill ofmaterials There's a standard calledseabbombs which also kind of augmentsthat with information about yourcryptography usage and and that needs tobe you know it's not just about what'sin Kubernetes and how we buildKubernetes but it's also that awarenessand that usage you know far beyond thatum so yeah I think uh understanding whatyou've got prioritizing as well so wehear of this you know harvest nowdecrypt later so the idea here is we cancapture traffic at the moment save itand then come back to it when quantumcomputers are able to break it Thatinvolves looking at the risk It's goingto be relevant perhaps for your veryhigh value data that has a long lifetimethat's still going to be valuable in 510 years time It may not matter for yourshorter lived data So I thinkunderstanding what you've got and thenworking out how you kind of prioritizeis is really really important you won'tfix everything at once and we in theopen source community have to help youknow things again like the S bombs and Cbombs may be a small part of that Do youalready see work happening in any of theprojects or anywhere in the communityso I'm actually a TSC lead for a projectthat's within something called the PQCApostcontent cryptography association Wehave implementations of some of thesestandard track algorithms like MLDDSA MLChem um and are working towards makingsure those are high assurance so that wecan start including them in stacksThere's other efforts going around inother projects OpenSSL for example 3.5 Ithink is coming out soon So that's goingto add support for some of thesealgorithms and we'll see that filter upthrough the stack So I I think for allof the project maintainers working onother components it's about um beingaware of what's happening with thosedependencies and and and making thatavailable through your um particularprojectsThis is not to put you on thespot but um IBM's working on an opensource uh Kiskit SDK right so that'spretty available as well for people tokind of look at and play with But Idon't know too much about it I don'tknow if you do That's why I said I don'thave tools So so yeah Kiss Kisskit is anopen source toolkit for developingquantum applications Um that absolutelyis out there It's it's been there for awhile and and people can take a look atit That itself is open source will workwith multiple backends But yeah I wouldI would definitely say if you'reinterested in quantum computing uh lotsof companies have education materialaround there So I think it is somethingthat um is good to start learning aboutand understanding and understanding whyit's different It's it's not thatquantum computing is going to replaceclassical computing It's actually how itaugments augments it and how you knowsome of those little um functions thatjust happen to work well on quantum canbe part of our standard businessprocessesYeahUm I I just wanted to the one thing thatI found interesting was you said sothere are changes that are already beingmade in certain projects and that isthat's something that we're going to seeup the stack going forward So from aKubernetes or just someone an platformengineer perspective right so when whenI talk about migrating clusters to aquantum safe future what does thatreallymean i mean there's a couple ofuh two a couple of things One is youknow modern have making sure theinfrastructure is modern So oneprerequisite you could say is TLS model103 I think u in most Kubernetes casesit's probably modern enough so TLS 103is used but other legacy organizationshave a just a huge uplift to move fromTLS 102 to TS 1.3 and txhings like thatwhen that hap after that is done youalso need to make sure you all thecomponents are upgradable easily rightyou can't be stuck with legacy for yearsby now you have to be more agile andupdatable and and keep track and then itcomes into the crypto agility aspects uhlast right so projects can't hardcodefor RSA or EC anymore it has to beconfigurable so it's easy to update youknow once theuh development reaches productionmaturity so you're not stuck thereeither you don't have to redevelop toomuch of the components so those are Ithink uh three key aspects that thecommunity really has to step up to I Ithink one point also when we talk aboutum making use of these postquantumalgorithms within our software stacks issometimes that you know the key sizesmay be bigger right the packet sizes maybe bigger uh in many cases you may beusing hybrid schemes where you'recombining um traditional encryption withsay elliptic curves together with uhquantum safe encryption and again thatcan increase uh resource um sizes andCPU maybe especially important with umvery small um sort of transactions orvery high volume Yeah And for someapplications it won't matter at all butfor some applications it will may havehuge impact So we don't know until westart testing So kind of starting toplay around early I think is importantSo let's switch gears a little bit Uhwe've talked a lot about security andthe cryptographic side of things butwhat about running quantum workloads onKubernetes what gaps do you see rightnow and where are we atokay I can try that one So thecryptography is is not the main focus uhuh where I work for for uh looking atquantum computing Um there there areclear use cases where we already haveworkloads that have been seeing uhbenefits from this So I I'll give twoexamples Um one of them is beamcalibration So we we have large particleaccelerator We have proton beams goingaround calibrating these beams isactually a very hard task Um so peoplehave been looking at quant quantum uhalgorithms to to help out with this todo this live They actually validatedthis algorithms with a live proton beamwhich is quite interesting Uh the secondone is a lot of effort uh around quantummachine learning Um so we talked abouthow what's the complication that quantumcomputing is bringing to to the cloudnative area after AI It's actually verytightly related in some of the workloadsSo even things like the hick analysisthat the hick boson which is a lot ofwhat we do is just analyzing the datacoming from the detectors and trying todiscover things Uh there are quantummachine learning uh algorithms that willhelp us with this um they do have issuesright now because of the the problembeing um the dimensionality of theproblem not being adapted to the currenthardware we have for quantum computersAnd this is where things get uh uhinteresting because you can do reductionof the problem using traditional machinelearning algorithms and then do thesecond step using quantum algorithms Andthis is the main challenge we have onthe platform side is that we have tointegrate this kind of hybrid uhscenarios where we have more classicaland and quantum workloads in the samestack Uh so this is what we are uhinvestigating from from a platformperspective is how we can have this verycomplex workloads uh how to manage themin a hybrid uh world which will staythere for quite a while There are otherchallenges uh which are more on theinfrastructure side which is we verylikely will not have quantum computersuh on site anytime soon So they areremote So we need to integrate them inmuch more of a HPC like way where yousend workloads to the cloud or remoteand fetch the results and then integratewith the rest of your analysis a lotmore than than the tightly coupled uhinfrastructure you would see on atraditional data centerI I think also and the other side ofthat if you like from a providerperspective um is that Kubernetes iscritical to us when we offer quantumservices um through the cloud becauseyou know the actual computation thathappens on the quantum computer is justa small part of that yThere's a lot ofpre-processing there's a lot ofpost-processing that occurs there's alot of control um that's involved aroundhow you manage these systems And thenthere's all the usual kind of boringstuff right whether it's um you knowCI/CD process whether it's login whetherit's authentication and all of thatagain is is based on Kubernetes soKubernetes is a critical part of makingthis utility available to people and Ithink the parallel with AI is also verygood in this case because the when whenthe AI hype started a couple years agothere was this notion of what'shappening to cloud native where is AIgoing but all of us that were doingcloud native for a long time like therethere's not a lot of motivation to goand reinvent the whole stack and redoall the tools that we all uh rely on forseveral years Uh it's it's there's astrong motivation just to integrate thistype of new workloads maybe adapt theexisting systems to accommodate properlyand I think the same will happen uh forquantum computing as wellM do you want to add something oh no nono I was just going to say I was theyboth kind of already said what I wasgoing tosay Basically I was just going to saythe similar thing whereas in terms ofhow Kubernetes is currently interactingwith quantum computing and of coursewith AI workloads right now is it isacting as of course the orchestrationcomponent aspect of it but everythingright now in that space is hybrid But Ithink u the thing that Ricardo was kindof mentioning that I thought was reallyinteresting is like where's it going togo from here and what does it mean whenit's not hybridso we we talked a lot about like youtouched base on quantum machine learningand such Um so if someone's hereinterested in the audience about it whatshould they do about it like how canthey what's the next step for them ifthey want to tap inand get quantum machine learning yeah Souh I think the first uh the first uhstep is to actually have a use case thatcan benefit uh machine learning isactually a pretty good uh use case forquantum computing Uh the the I don'tknow if you if you listen to peopletalking about quantum computing is whatwas mentioned before A lot of peoplethink this will be the next generationof computing This is clearly not thecase There will be a hybrid world So Iwould say the first step is to reallyidentify a good use case Uh the secondthing is that uh what was mentionedbefore which is there are there aretoolkits that can simulate quantumcomputers and this is a really good stepto introduce yourself and and your toolsto to this type of workload like KisKitis a very good example The third thingis that quantum computers are real andyou can actually get access to throughthem to them through even your publiccloud providers that you probablyalready have access Some of them offerservices in their catalog that give youaccess to quantum computers Uh I'm notsaying it's easy then to start usingthem That's why the integration withKubernetes is is interesting There arechallenges I'm happy to expand a bit onsome of them But uh but you can you cando it Uh it's it's it's real It's notcoming It's there YeahDon't youalso just discussion here i mean I thinkthe quantum algorithms they work quitedifferently from you know what we'reused to from classic computing So isn'tthere a quite steep learning curve inorder to kind of understand the actualutilities for uh these new machinesyeah Oh this this is luckily not my jobwhere I work I work on the platform sidebut we do have the advantage of being aphysics laboratory So a lot of peopleunderstand this technology really reallywell even the theory behind it and theythey are pushing the boundaries Thisthis is why CERN has a lead role inthings like the open quantum instituteand the quantum technology initiative isthat we have the knowledge in house ofhow these things can have an impact forus and potentially for other use casesas well And I think the other thing inthis space as well is as we we learnmore about these algorithms more of themare developed then they themselvesbecome offered as a service that otherpeople cazn just consume And so you'redoing you have a certain financialworkload and you're looking at someoptimization problem you're going tocall out to this algorithm The otherthing that then becomes interesting froman AI perspective is your AI model cancall tools Your AI model tool AI modelcan call these services So it itself canhave access to these uh quantum basedalgorithms as well as then AI being usedto help you um develop those algorithmsand write your software just like youwrite your normal software Um so I thinkthere's a a lot of potential with withAI on multiple sides of this That thatmakes sense I just out of curiosity sowho here has played with quantumcomputers or just has some experience inthis area be it the cryptographic sideOkay just two three four maybe And I getwho here is interested or maybe five Whohere is interested to get into thisspace okay that's abunch So for folks who are interestedwhat more can they do like what are thetangible things for instance when I waslearning more about it I was like okaythis is cool but this is it looks likesomething that I cannot take an actionupon right now So for instance there isthere are some maintainers of projectsor people working on certain projectsinternally in their companies Whatshould they be doing what's the nextaction item for themi'll give first shot I mean from a uh Ithink myself you know when the postconicphotography came along having beenworked with this a long time you knowit's it's fun It's new things newalgorithms uh applied in similar waysbut it's it's a lot of fun So you canhave a lot of fun with this Uh but yeahyou have to you know learn how the whatthe new algorithms are how they work howyou apply them So there's a lot ofthings you can uh study up on then youknow how will the new hybrid keyexchange work in in TLS for example andthen yeah crypto agility as well there'shas been a lot depend regardless ofwhich industry you work in be itfinancial or government or whatever nowthere's a huge flurry of you know newwhite papers coming out how to applycryptogility that just came out a coupleof weeks ago one from from NIST uh whichis a real good you know almosteducational what are the things you haveto think about when it comes to cryptoagility if you're developing stuff ifyou're deploying stuff in your platformwhat are different aspects you have toconsider so there's a uh great lot ofgood material out there to to read up onyeah I'd echo that I think there's a lotof material out there whether it's onthe cryptography side or whether it's onthe quantum computing side and I guessone thing for anyone who's new to anarea is it is I mean I know this as amaintain maintainer on projects rightit's such a great thing when new peoplecome along and have new ideas or say whyis this so complicated or why is thisdocumentation poor and that's where newcontributors I think are just soimmensely valuable you think you comealong to a new topic area and you don'tknow a thing but actually you'rebringing that extra insight um and thenhelping to spread the word um morebroadly so I I think I would encouragepeople to try and get involvedUh I would I would also add this is avery new area Um I will give an exampleof how how early it is There areconcrete uses but it's quite early So uhfor example we started uh procuring foraccess to quantum computers becausethere are a few around the world Uhthere is no standard to define uh howyou measure the workloads or how youcost or you do the costing of theworkloads Now to procure resources whenyou don't know how to actually definethe cost of what you're using is reallyreally hard And this is where uh notonly the the this area needs quantumcomputing experts and IT experts itneeds all sorts of people It needs thebusiness people to actually start makingsomething more concrete uh from this uhthis uh services Uh it's it's reallyfunny when you when you start lookinginto it how how different uh eachquantum computer is is defining how yousubmit a workload how you measure theefficiency how you you can actually askfor access and time Uh so it's quiteearly w{hich makes it really interestingin all sorts of uh areas So even ifyou're not like a a quantum expert uhthere there's a lot to contribute to Uhso there's a lot of resources it lookslike Um so while I was putting togetherthis panel so I probably try to keepthis as the last question and have somehave some time for audience questions Sowhile we were putting this paneltogether um there was one more panelistwho's going to join us Paul from IBM Sowhen we were discussing about this werealized that there really isn't a spacein the CNCF community where we can aforum where we can talk about this SoRicardo I'm going to put you on the spotyou're in the TOC the tab Uh so where isthis forum now what should we do yeah II think it's good to look uh there Ithink there are two areas One one is tolook at what we did with AI Um there wasa lot of uh demand to have cloudnativeuh infrastructure evolve to accommodateAI Uh and this needs two two parts oneis to to do the governance of all theseprojects uh and to integrate them in theecosystem So I think that that roles uhbelongs a lot in the what we have in thetechnical oversight committee So wecreated this AI working group that has alot of the people doing the work uhgetting together frequently uh theypublished a white paper that kind ofdefines the state of the ecosystem andgives some directions of where where weshould go The other part is thetechnical more technical evolution ofthe platform And we learned from from AIthat Kubernetes actually is extremelyflexible in accommodating workloads thatare not just managing nodes orcontainers Uh it became an orchestratorfor a bunch of stuff and this is thearea where we can learn a lot from whatwe did for AI or HPC All the newscheduling primitives all the evolutionwe did in this part This is essential toaccommodate new things like quantumcomputing and the fact that we weresuccessful doing this for AI reallygives a lot of optimism that we'll beable to do the same for for for this newera as well Nigel do you want to talkabout the PQCAsorry the PQCA uh the PQC uh yeah soagain as I said there is some work goingon in something called the postquantumcryptography association that's arounduh curating some of the work in thisspace it's part of the Linux foundationit's not part of CNCF uh and I think youknow our our groups at like PCA and CNCFneed to work work closer together onthis and also other groups like there'sopen SSF of course as well uh when we'relooking at the code analysis side yeahwe we had a workshop on Monday at themaintainers summit about what we callthe tag reboot The tag is the technicaladvisory groups in the in the CNCF Uhand we propose to have more flexible wayof starting initiatives So I thinklaunching whoever is interested in thisarea we should push to launch aninitiative to to start getting peopleand discussing more closely uh to notonly have like a panel during CubeConbut have this going uh steadily duringthe year as well Yep So if anyone'sinterested please find Ricardo after thesession and maybe we can spin somethingup Um so to summarize uh the keytakeaways right and to simplify itreally I I see mainly three takeawaysOne is that quantum computing isn'tgoing to replace the classical systemsbut in fact it's going to augment themThe second one we've talked a lot aboutcryptography So there is a hugetransition that is coming to quantumsafe cryptography There's a ton of workthat we need to do and we need to startright away and like like how we talkedabout AI Kubernetes is going to play anintegral role in quantum workloads andwe need to start looking at fillingthose gaps and doing all the work behindthem Um so with that I'll let's let'send the panel here but we'll open up forquestions from the audience Uh there's amicrophone with a stand over there ifanyone's interested[Applause]So we we saw five hands raised before Socome forward and ask questionlike five hands Okay So I guess noquestions So you get some time back Butif anyone's interest Oh I see I see afew questions over there OkayHi Uh this is mainly for Ricardo but canI open to everyone uh you mentioned someof the challenges you faced whenintegrating quantum workloads withKubernetes I'd be interested to hear abit more about these Right So uh thechallenges are still there I'm not goingto claim that we sorted but I think thesummary is uh that we have this um theworkloads the quantum workloadsintegrate with with the classicalworkloads So we have this hybridscenario where we need to delegate apart of the analysis or the process toquantum computers fetch the data backthe results back and continue Uh thefact that this uh uh infrastructure isnot where the rest of the infrastructureis poses a problem on itself So it's thesame issue we discuss constantly in thecommunity about multicluster hybriddeployments This challenge is even morevisible in quantum computers because theinterfaces are not necessarily what weexpect and this is where I think if theproviders of these quantum computersstart offering the APIs that we expectuh it benefits a lot everyone The secondone uh is what I mentioned before isthat we when we want to procure oraccess this quantum computers the way wedefine the workloads is very differentbetween the different uh devices we haveaccess to There is no standard acrossfor defining uh units of computationacross the different quantum computersThis is also a challenge Um and the lastone is that a lot of the algorithms arealso suited to specific devices Uh sothat there's a lot less because of thele lack of standardization Some of theworkloads are fitted to a specific typeof quantum computer or a specificimplementation which also poses quite alot of uh um like uh challenges not onlyin in orchestrating these things uh butalso making sure we have time availablein that specific device and this is uh Idon't know we all have this issue withAI currently with the lack of thescarcity of GPUs uh it's much worse interms of quantum computers like if youhave a lot of people interested thereare not not a lot of devices availableto actually uh move forward So yeah Iwould say this is what we what Icurrently face I'm sure if you go up theup the stack people are facing otherchallenges as well Thank you very muchHi thank you for the panel That wasreally great Um so in February Microsoftum unveiled their major 101 um quantumuh processing unit Um so my question isa little bit more about um ethicalityand is there any consideration aboutethicality to um releasing them to asort of wider audience uh before theystart you know breaking cryptography umyou know on a wider scale Thank youThat is that is a that is a hardquestionAh how do we answer that the ethics Theethics[Music]Um I guess it's like any service Well goYeah I mean it's security has beenhaving this kind of history a long timecoming with CVS and exploits and thingslike that So uh probably when it comesto you know when finding acryptographically relevant quantumcomputer comes along well it's might notbe from an ethical hacking group or anethical organization who knows so Iwell hopefully it's going to be verygradually and obvious when it happens soit doesn't you know the one of the kindof joking things is how do you know acryptographically relevant quantumcomputers is there well that's when allbitcoin are are gone or all yourbitcoins are stolen right but we'll hopeit doesn't come to that So yeahdefinitely the organizations like IBMthough who builds thesethings probably thinks along these linesright not going to throw out somethingthat does it easily and it's going to beexpensive as hell And and I think thisis also goes back to that point aboutprioritization and what is your highvalue data and and focusing on knowingwhat cryptography you you use what theimpact of that is deciding where youneed to put those protections in placenow Um don't try and fix everything butbut focus on that priority because wemay not know uh when it happens untilafterwardsThank you very muchRight Um I think we're at time Uh if youhave more questions please find us nearthe stage and if you're interested instarting more discussions around it alsoplease find us near the stage2025-04-15 22:03:36.189823}like the pods and thenodes on the cluster that werestruggling so hard with the resourcethat were they they were allocated andthey were not able to keep up in areasonable amount of time uh with theload that they wererequested uh having too many pods in acluster is not the only problem that wecan face and typically we are runningselium for the networkum the selenium post that are runningare watching uh the selenium CRDs andone of the behavior of the API serverwhen there's a CRD update is that itdrops all the watch requests so whenthat happens we have all the C engineparts on all the nodes that start doinglist requests on the API server soreally fun incidents when we kill theAPI server and etc at the same time uhbut my favorite incident above all ofthem I think is this one because itinvolves some DNS shenanigans we had theload balancing misconfiguration where westarted putting all the cubelet trafficto a single instance of the API serverthat was completely overwhelmed whilethe other API servers were on the sidebusy doing kind of nothingso so far for that the option communityhas come with solutions uh for usagainst search overloading uh namelythey say to use the API priority andfairness uh to prevent one bad type ofrequest from taking up all the availableresources of the API server and actuallyit comes already preconfigured on thecluster uh since Kubernetes 120 withthose two kinds of resources the flowschemas and um the priority levelconfigurations and this is what youwould see on a brand new emptyKubernetescluster so let's dive in a little bitmore into those kinds of umobjects the first one is the flow schemauh that is used to filter the requeststhat are coming to the API server uh theway it works is it will match therequests against the users that havebeen uh sending them uh either on agroup or a service account in that casewe are targeting all the requests thatare coming from systemnodes not all requests though uh we alsofilter on the resources that are queriedso in that case the non statues and thelees and you can imagine why we're doingthat for this one specifically this isfor the node herbit once we have filtered the requestthey are sent to a priority level uhthat is referenced over here and thepriority level is really what defineshow the requests are prioritized and howthey are throttled and to do that itdefines the conference shares cuesnumbers lots of this is becomingcomplicated over here what does that allmean and more importantly how does ithelp at all with this incident that wehad uh and I talked aboutpreviously so to answer that question uhI want you to get a better intuition atwhat the priority and fairness is andhow it works and I'll show you how Ireason about them so we have this APIserver instance over here that isconfigured with a given set of flowschemas and we have inbound requeststhat are coming from different users ondifferent resources that I haverepresented here with different colorsso the first step uh is really to goover the flow schemas in the order ofthe matching precedent and match therequest against them once we have thatthe flow schema is referencing apriority level which you've seen isconfigured with a given number of cueswe have four of them in thatexample the priority level also comeswith a hand size and the hand size isused for the shuffle sharding of theflow schemas on the various cues and theidea behind that is really we don't wanttwo types of requests coming fromdifferent users to end up in the samecues to avoid overloadingeachother so this is what's happening wehave requests from different cuesdifferent resources that end up in theum in differentcues after that we have a fairscheduling algorithm that will take careof dispatching the requests that havebeen encued to the workers that areavailable how many of those seats areavailable this is defined by the maxrequest in flight flag on the API serverin that case we have 20 of them that aresplit over the two priority levels thatwe have in thatexample the cues themselves they have alimit in size and what happens when onespecific flow sche~ma overflows itsavailable cues this is really where thethrottling kicks in the extra requestgoing through this flow schema will bethrottled and they will be returned witha 429 too many requests and all of thatis happening while the requests going tothe other four schemas from the otherusers on the other requests they canstill be encued because those cues theyare notfull and this is what we can see uh weeven have a few golden metrics that thatwe track for for that that I wanted toto show you over there so mainly we'retracking the number of requests uh thatare going through uh the dispatching ofthe priority and fairness we alsomonitor the latency the time that isspent in the queue waiting and in thecase that the throttle kicks in we canmeasure the number of four to9s that arebeingserved so that's great we have thepriority and fairness that is in placeit is working we have the metric thatshows that it is actually kicking in whydid we have the incident in the firstplace and how can we do better and thisis really where we had to start tuningall those resources so the first step tothat journey was to introduce new flowschemas and a few of them that weintroduced the one that is on theslashmetrix and /debugp prof that one isexempt it will never ever been throttledeven if there's contention on the uh APIserver and the reason for that isbecause this is exactly the moment whenyou want the most observer on the APIserver so make sure not to throttlethoserequests we also introduce flow schemasper namespace So all the serviceaccounts that are in the given spacewill be assigned to the same flow schemaand we have a last one for human usersso people running cubectl commands ontheir laptop uh internally they are inthe group engineering they will be intheir own uh flow schema so that theycan be prioritized typically for manualoperations during anincident so at this point we have betterability we have uh and actually it's wedidn't gain much from what we havealready in the audit logs but where thisreally this becomes really handy is whenwe start introducing priority levels andthe first one that we introduced was thetarget constraint priority level thatwas very limited in the concurrentshares we only have five over here andinstead of encuing requests in casethere are too many of them in par wewill reject them right away uh and thisThis is where splitting the traffic intomore flow schemas becomes handy becausein an incident we can identify which ofthe flow schema is offending and weapply on demand the the top priority andlevel so we were pretty happy at thatmoment we we had the open book uh toapply when something was going on uh weeven have the telemetry to to see whensomething was going wrong uh and we hadthat that nice metric that was showingthat uh the top was effectively runningat that limit that we've set so at thispoint we were really looking forward forthe next incident so that we can usethat and it happened we had an overloadof the control plane and we applied thetarpit and we had still the overload ofthe control plane oh wait that was notsupposed to happen the top was supposedto prevent that altogetheruh and we found this other metric theactual limit that was set on the targetpriority level and it was going wayabove the limit that we have setinitially in the place uh the only thingthat we could find was it was not goingabove this upper limit that was set andactually there's an upstream uh issuewhere this limit is unreachable butreally um the target was not toping thetraffic and we want to know why and toanswer that question I have to mentionthat it actually worked in the firstplace we tested it and it really stoppedworking after Kubernetes 1023 so whathappened in Kubernetes 123There was a boring mechanism that wasintroduced what does it look like solet's go back at the schema that we hadto to look at that we have this prioritylevel that is constraint it has a numberof seats that are all being used andthis time uh we have on the other sideanother priority level that has someseats that are available and not processprocessing any of the requestswhat if we could borrow those seats thatare unused and this is exactly what theborrowing mechanism is used for we havethis uh one constraint priority levelthat will get the seats from the otheruh priority level that is not atcapacity and when you think about thatthis is what what you want to to do tomake sure that you use as much aspossible the resource that you haveavailable except this is not what wewould want in the case of the tarpitpriority level so how do we do wedisable the boring mechanism to do thatthere's a fact that we can set theboring priority limit uh the boringlimit percent to 0% uh to prevent theboring mechanism and we also set thelendable person to 100% so that we cangive away all the seats that we are notusing so that we do not waste capacityso you see the slight difference betweenborrowing which is how many I can getfrom another priority level and lendingwhich is how many I can give awayso after doing that uh the tited againeverybody was happy uh well except mebecause I'm still an SR and I don't likemanual toil and I'd much rather like myincident to be resolved automaticallyfor me so how do we get from thereintroducing more priority levels andI'll show you a little bit how we designuh other priority levels through thisexample over here uh I think the demandset is quite interesting because this isone of the type of workload that caneasily overwhelm the control planeespecially in cluster where we havelarge number of kubernetes nodes so wewere talking about selium earlier onlet's introduce a priority level for theselium agent we have the flow schemathat is matching the service agent thatis referring to the priority level uhcelium agent too uh and since we have asingle user a single subject that is uhthere we only have one queue with alimit on the queue that is not to endall the traffic from all the nodes ofthatcluster uh and we still have precastthat were denied no problem we just hadmore concerns and that did not reallyhelp we were still facing those thisthrottling under normal conditions andin that case the actual bottleneck was asingle queue so we had to split overmultiple cues so that the workers couldpick up uh through those differentcues after that we were back to theexpected behavior under normalconditions no throttling and in case wehad the CRD update then the throttlingwould kick in and would paste uh therequests that were coming in notoverloading the control plane and stillbeing able to serve requests from otherusersso you remember that we are running ourown control plane and can we do betterthan that like sure I can throttle but Ican also add more API servers and howwould I do that uh and the priority uhlevels they they give the metrics ofutilizations and one of the idea that wecould have had was to look at thisutilization if it goes near 100% this isbasically at that moment that thethrottling kicks in and you serve the429 so what if when we go above certainthreshold uh we add more API serversthat can handle more traffic um to thatmy recommendation is not to do that andthe reason is because autoscaling isreally for hardware capacity think aboutCPU memory um network traffic networkbandwidth if you need um but this is notfor this abstract limitation that we seton the priority level and also we canhave this utilization going up notcorrelating with the CPU utilization orthe hardware utilization because of anexternal dependency that that we haveum we mentioned that we we had thenumber of seats that were defined by themax precessing flight so we decided toset a max precessing flight per core andby doing that and setting the flag maxprecessing flight linear in the numberof cores that we had on the APA serverwe were able to correlate the prioritylevel utilization and the CPUutilization going back to the regularautosaling that we had on the APIservers uh one thing that I want tohighlight here is that the API priorityand fairness is all about throttling aslong as we have capacity on the APIserver we will serve requestssuccessfully and it is not a good toolto limit the number of resources t�hat wehave and I will hand over to Matio totell us more about that thankyou so let's try to look at this from adifferent angle if you think about thenumber of resources that you have inyour cluster during the day you probablyhave a profile similar to this um maybesome resources or some workload arescaling up and down depending on thenumber of users that comes to yourwebsite during the day maybe you havejobs like Spark jobs that they processin batch or maybe you have a more linearuh workloads but as I told you at thebeginning there are cases where you havehuman errors or a bad deployment withautomation and then your number ofresources can grow uncapped and at somepoint you're going to reach some limitif you have a cluster in an ad in anaccount uh then maybe you get throttledby your cloud provider API and if youhave multiple clusters in the sameaccount you're impacting multiplepeople's uh maybe you get out of IPs ormaybe you can actually break the controlplane so how we can prevent this tohappen well you can use the resource quthat is an object designed to limit theconsumption of resources per name spaceyou can limit the number of resourcesfor example you can say I want maxed1,000 PS in space or number of configmaps and then you have also the totalamount of resources maybe you want tolimit the number of GPUs or CPU ormemory and so in our uh case you can uhyou have cluster administrators or usersthat create the resource code object inthe name space and when uh the user isasking for more you get a 403 forbiddenerror in the error you also get back theresource that you're about to violate soyou can adjust and so either you go backto your cluster admin or you change theresource quota and so if you apply thisinto the cluster this is what you getyou apply the resource quota you havethe error on the automation or the humanerror and you start scaling up until youget the resource qu the problem here ishow do you set the limit where do youput the resource quota value if it's toolittle then you are very close to theorganic scale up of your surface maybethere is an incident there is a trafficshift in within the cluster you'reblocking a legitimate scale if you aresetting it too high then well doesn'tsolve you and doesn't help you at allbecause you still have the YouTube spikeand even if you set it correctly this isa static limit so maybe during the weeksduring the months you get more usersyour traffic pattern changes and youforgot to update it so what we did wasthat we took the resource quote as abuilding block and built a controllerthat watch for the resource utilizationand dynamically adjusted the quota foreachnameace so uh in case of uh a resourceuh scaling during the day we for eachname space we apply the resource quotayou see that we leave a buffer uh toallow uh for small uh uh immediateupscale but we keep growing the quota upand down following the workload patternand then in case of an incident of aimmediate uh spike of resources then wegrow a little but then we have a cooldown period that is the period where wedon't allow any more scale up and thenwe keep doing this step uh profile inincreasing the quota until we get to themaximum limit that we define for thenamespace so at the moment we supportcounting only the object because that iswhat we care about limiting the numberof resources we support bots becausewell they are actually the things thatrun in your clusters and config mapsbecause we deploy with Helm and so wehave a lot of conf maps goingaround for every resource we define thebuffer that is the percentage that youcan have above the current utilizationso imagine that you have a 10 pod andyou define a 40% buffer you're getting a14 pod squ then we defined a defaultminimum this is because when you'reworking with percentage and you have alittle number small numbers uh you canhave a very then small uh quota imaginethat you have one pod with a 40% bufferyou have a 1.4 pods maybe you run it totwo this doesn't get you anywhere so weput the default minimum sensible forevery name space and then we defined amaximum allowed code per� name spaceusually based on a six scalabilitythreshold recommendation on uh based onour own experience andobservations and so uh the controllerlooks like this uh in every name spacewe watch for pods and config map andthen we manage the life cycle on aresource qu object and we increase anddecrease the limits depending on thecluster utilization of course now thisis this controller is on the criticalpath when there is a problem so wewanted to allow our user to selfs servein case there is an incident we want toget page in the middle of the night sowhat they can do is they can annotatethe resource quote object and thecontroller will immediately uh set thehard limit to the maximum allowed givinguh high room for increase and then wealso set a seven days time to leave tothe annotation this is because 7 days isalmost enough is always enough toresolve an incident or to finish aninvestigation but then we don't want theusers to forget to opt in again to theto the controllers and so we are goingto go and clean up the notation for forthemsel uh of course we also have aspecial admin uh annotation where weannotate the cube system name space andwhat it does is going to trigger thecontroller to delete all the manageresource quota across all name spacesbasically restoring the cluster in astate before resourcequota um a couple of things that welearn while uh developing the controllerthat I think are interesting to sharefirst one is we deploy this thiscontroller in a cluster and we look atthe memory utilization is massive uhthis is a cluster with around 40,000config maps and that is exactly theproblem because when you're listing theresource to count them you are thenopening a watch and you get the fullobject object back uh and when you areuh using helm this can be very expensivefor the config maps actually what wereally care about is the number ofobjects so what we can do is we can uhgo in the meta v1 package on the APImachinery use a partial object metadatalist that is only giving the metadatafor for the object uh enough to know howmany you have and you can see for thegraph we are using almost like uh uhmore than four time less memory on thisside uh and the second thing is rememberthat I told you that we annotate cubesystem and we delete all the all theresource quota we do that by triggeringa reconciliation for every name spaceand so how do you generate events frominside your reconciliation loop well todo that you can use um a source channelso you basically can create a channelthat listen for generic events and whenyou are setting up your controller youcan pass a row source and you can uh youcan watch on this channel and frominside the reconciliation loop everytime you want to trigger an event youcan send uh the the object that you wantover thechannel so the controller is ready wedeploy it look at the graph high- fiveeveryone green lines pods going up anddown red line the resource qu you canalso see that when we go beyond thelimit there is this flat line that isthe minimum resource quota that I wastelling you uh and so we don't never setthe quota below that that limit problemis uh we start getting pinged by theusers hey Mateo I try to deploy myconfig maps I get four or three errors hwell that is because as you know theconfig map doesn't have a controller soif you try to create a deployment whenyou have a resource quota let's say yourresource quota five you deployment isfor 10 pods with 10 replicas the replicaset controller is going to give you thefirst five and then it's going to waitand try to reconcile until you increasethe quota and that works you don't dothat for the config maps and also helmand usually other components they don'tretry uh on config map so what you cando you can have a different minimum qufor config maps at the end of the daythey are less disrupted them pods and souh at the beginning we had a singlebuffer and single values for all theresources then we split them indifferent uh groups and the second thingthat you can do can uh instead of havingthe same buffer to go up and down youcan split it in uh a different cool downperiod so you can scale up pretty fastand then scale down maybe more slowly totry to prevent and try to catch a suddenspike that is going beyond so we solvethis i5 again and then again the usershey mateo I'm trying to deploy my pod orthe config map what is this managedresource quota why am I getting aconflict I'm not updating the resourcequota and so here the problem is that uhuh there is an upstream issue that hasbeen open for a while since 2018 whereuh uh people are complaining that whenthey try to create a different resourcesthey get this error and if you look atthe very long thread the solutionsolution is that on the client side whatwe do is people is parsing the error andif the error has the string conflict onresource quota 409 conflict they retrybecause hey it's upstream bug but whatif you don't have the control of overthe client you can't go to your customerand say hey yeah why don't you fork thecontroller and you add the retries sofirst of all let's try to look at whatthe what the problem is so the error iscoming for the resource quot admissioncontroller What happens is when you havea resource squad enabled when you aretrying to create a resource you aregoing to go through the admission thecontroller does all the checks that itneeds to do to see if you are allowed toadd an object and then it tries toupdate the resource quote object itselfup to hardcoded three times and whathappens is that sometimes you get anerror in trying to update because youyou have a stale version of the objectand so you try to update it up to threetimes and where are no more retries yoububble back to the user 409 conflicterror even if there would be enough roomto create the resources and so what wedid was that well yeah we cheated and wesaid well if there's a conflict don'tcount it as a retry and keep retrying ifyou're getting a 409 conflict so you canget a fresh version of the object andtry to updateagain the thing is we are stillrespecting the 10 seconds timeout so ifyou look if you look at the uh admissionquotota resource quota workers they tryto calculate the quota and then they cutoff at the second timeout and then wewent also and look at how much we wereincreasing the admission duration but wedidn't see any significant increase inthat and we were able to fix for9 so are we ready with this kind of workwell not yet first of all we want to geta better upscaling algorithm uh I showyou that we scale up based on a on abuffer uh calculate based on the currentuh resource utilization maybe we want tolook at the past behavior of theworkloads maybe something like theexponential moving average from uh thevertical podautoscaler uh we want to submit anupstream PR for the open issue we areaware that our approach might be alittle too naive for upstream so uhmaybe we're going to look at swappingthe queue with the rate limited queue orsee if we can add some gter and then uhwe want to uh finalize the userexperience about the maximum limitbecause when you say this name space isnot allowed to scale more than x youneed to go to the user and explain whatto do what if is an incident how do youshift traffic between between clusterswhat if they were above the threshold soyou were not limiting them and then theysplit the workload in multiple namespacenow they are on the quota and thingslike that so I hope that today you learna few things api priority and fairnessare good to help you uh limiting thecontrol plane load they help you also touncover best practice i has explained toyou about like demon set shenanigen andthen uh well they both good to start butthey are it's very easy to do like asort of hello world you firstutilization with them but you need thento plan to spend some time to tweak themfor for your actual needs and there arestill few upstream bugs and missingfeatures uh there is the one that I toldyou about the resource quota but also uhmissing metrics or better metrics forborrow borrowing in API priority andfairness uh that they would help you touncover even more issues so we are attime thank you very much for listening2025-04-15 22:03:36.748870 ��V�w#��cAc73SzCKx-OYhi welcome to our talk to learn how totake down the Kubernetes control planeand more importantly how to take goodcare of it using two techniques the APIpriority and fairness and the resourcekota my name is Ayas Badali i'm asoftware engineer at data dog i startedin the SR team and now I'm applying allthe reality best practices to ourinternal Kubernetes platform helloeveryone it's great to be here my nameis Materowina i'm a software engineer atdata dog and I worked on the controlplane and I'm working in the clusterlife cycleautomation data dog is a SAS monitoringcompany providing full observability onyour applications we over 2,000engineers building the data platformwhich consists of workloads that arecompletely running on Kubernetes to giveyou an idea of the scales we are runninghundred of thousands of pods over tensof thousands of nodes which are spreadover dozens of Kubernetes clusters uhone detail over here is that we manageour own clusters which means that wedeploy and operate the control planethat that is underlying the Kubernetesclusters if you don't and if you rely onmanage Kubernetes clusters by thirdparties don't worry the best practicesthat we're going to introduce today areuser side and you will still be able touse them uh to ensure the properavailability of the control plane ofyours even though it is managed by athirdparty so why should you be uh interestedin the performance and reliability ofthe control plane when this is used uhand operated by a third party it shouldjust work right and be right but we'llshow you a few of the failures that weencountered where a single user couldaffect the stability of the wholeplatform for everyone in the cluster andfor each of those failure mods we'llpresent the tools a cluster name can useto keep and ensure the stability of theplatform spoiler alert those are the APIpriority and fairness and the resourcekota we also present the challenges thatwe faced when we were using them withthe default configuration and thelessons that we learned as we use themmoreextensively so our Kubernetes clustersthey are multi-tenant which means thatthey run various types of workloads fromvarious team and with that we have thecontrollers that ensure the properfunctionalities of the Kubernetesclusters um all of them they coexist inan environment that is resourceconstrainted the control pane itself islimited in resources and we successfullytook it down in a various occasionsinstead of excluding the offending usersand keep the stability for everyone elseso for example we had a bad deploymentof an admission webbook that was wipingout all the labels of the pods so thereplica the replica set controller lostthe ownership of those pods thought thathe did not have the pods created themagain the pods went over the admissionweb books again that wiped again thelabels so the replica set compiler waslike I don't have my pods I will createagain they will go in the web admissionweb book and you can imagine where thisis going we created hundred of thousandof pods over the course of couple ofhours you can imagine this is way abovethe six scalability limits that theyrecommendon another occasion we have Spark jobsuh that requested thousands of podsrunning as exeutor for for the job um atthe same time and the problem is reallyat the same time so we ended up with allthe controllers that they that weretaking care of the life cycles of thevarious components |�ing on your use case your uhyour use cases will be competing againstthe same resources in the host uhmeaning uh if you have an ingest heavyrightheavy uh workload uh then your uhqueries are going to take a hit or youralarming uh alerting and rule enginewill take a hit so uh there's no aredundancy whatsoever and uh there areretention limits because uh yourretention limit is basically going to belimited by the host's uh disks that youcan attach to a hostso okay uh as Orchin was saying uhPrometheus uh even though we reallywanted to use Prometheus uh it didn'tfit uh our use case so we kept lookingaround uh exploring different uhsolutions out there and we end up uhpicking Cortex what's Cortex cortex is aCNCF project uh the the main proposal ofcortex was breaking down this uh singleprocess uh solution that's prometheus inseveral micros service each one of thosemicroservices would have a very specificrole uh in a cortex cluster and uhwhat's good as well is that cortex useprometheus code bas under the hood socortex import prometheus as a libraryand this makes uh makes easier for forCortex to be fully backward fullycompatible with Prometheus and also iseasier for us to uh backport newfeatures that Prometheus is uh isdeveloping such as like native histogramlately so this is a very high level viewof a cortex architecture uh and how thecomponents interact with each other weare not going into details of what eachcomponent does uh we had uh talks aboutthat in the past already but the mainthing here is that uh we have basicallythree paths we have the right path inred which includes distributors and uhingesters we have the read path in blueuh that in includes store gatewaysqueries and query front end and thealerts and ruler path in green thatincludes the alert manager and therulers uhservice okay cortex not just break breakdown Prometheus into different microserbut also allow us to scale thosemicroser uh horizontally this means thatin a Prometheus setup we are not boundedby the size of a single host and more Ican scale my fleet uh to handle moretraffic uh as the traffic increase alsohaving those different microservicesgives us lots of flexibility in how wescale our cortex cluster so let's saythat if you have a read read have haveuh workload we can just scale the readpath microser and those thosemicroservices will not interfere witheach other uh so there is some points ofcontact between the the two the pathsbut usually uh they are like the readpath the right path and the alert rulesand rules path are verysegregated also Cortex uh offer uhautobalance uh in response to anyscaling activity what that means is thatlike uh we can uh scale our fleet andcortex will uh take care of rebalancingthe load between nodes and rechardingthe data between nodes uh and it useunder the hood like consistent hashingso like we we make sure that uh we havemin uh minimal redistribution of load uhwhen that happens so scaling up and downcortex clusters is quite easy you canjust basically go to a kubernetesdeployment change the replica set cortexwill see the new nodes the nodes willregister uh and it will start receivingnew nodes uh start receiving loads youcan use simple things like uh kuberneteshpa to scale up and down thosemicroser also uh cortex implement uhit's built in on cortex the concept ofhigh high availability so corteximplements high availability byreplicating the data across nodes anddisase uh and this together with a corumread a corum operation means that acortex cluster doesn't have any singlepoint of failure uh and can tolerateslike uh hardware failures and aoutage also which was very important toto us is that cortex supports multtenants tendency out of the box uhcortex keeps uh data from differenttenants in isolation so uh we don'tshare any data on disk uh betweentenants uh and also implements lots ofnoise neighborhoods mitigations such aslike uh cortex will give you limits andquotas uh by tenant to make sure thatone tenant cannot use all the uh costlyresources and also strategies likeshuffle sharding that is a strategy thatwe try to minimize the� amount ofresource shared between tenants so inthis particular case we can see that forinstance we have a bad tenant that'scausing three hosts to overload but theother tenants are happy becauseuh they don't share uh resources and theone that they share like because of theredundancy and the high availability uhthe tenant doesn't have any impactwhatsoeveralso uh finally cortex support along-term storage uh cortex uh only keephot data on disk uh and ship this dataevery few hours to a object storage likeS3 or uh or Google uh GCS it means thatlike we are not longer bounded by alocal disk of a host uh and this allowus to increase our intention period fromdays to years which was something thatour customers were were wasdemanding okay uh everything seems finelike cortex is great is a good fit forus let's go ahead but we still haveproblems to solve uh we don't want tohave a we we don't want to have a singlecortex cluster that will growindefinitely uh we knew from fromprevious experience that like especiallyin big distributed systems uh we canhave like a very uh hidden scalingcliffs that only show ups uh when thecluster uh gets to a become too big uhthose hidden contention points can showup in different ways it can be somecommunication uh and coordinationoverload and those things can cause alsononcale nonlinear scaling which meansthat you can end up uh when you get to acertain level certain size you end upadding more hosts to hoping to uh beable to handle more load but theoverhead is so big that adding more uhmore host is actually uh make thingsworse like the host is making you usemore resources and this can become asglobal effect so we knew that wecouldn't have like a single cluster foreverything so then that's when westarted to think about cellulararchitecture so in a cellulararchitecture instead of having onesingle cortex cluster we have multiplecortex clusters per regionuh each cortex cluster is now what wecall cells uh so we instead of havingone we have a multiple uh deployments ofcortex and each cell should beself-contained that means that it's notonly cortex that's deployed in the cellbut everything else that needs to runcortex needs to run so like uh each cellhas its own S3 bucket uh each cell haslike the whole authorization and theauthentication st everything iscontained in one celland finally we assign tenants orcustomers to sales so uh we have everytime that you create a resource in oursystem we assign this resource to a celland go from there this gives us lots ofuh lots ofuh positives uh this means that we havemore scalability like we can alwayscreate more cells to handle more moreloads to handle more clients it improvesour deployment safety because ourdeployment is now done by cell so if forsome reason we were shipping uh a badcode instead of having that bad code runin my whole region I will first run in acell uh if that thing has some problemuh only the customers assigned it tothat cell will have a problemsimilarly we have like a blast radiusreduction uh if for some reason uhcortex could not protect himself and thetenant caused a cortex cluster to getunhealthy i have I I even more likeisolation between tenants because nowthis tenant can only affect one clusteror anything else like it can be likesome any type of disaster that causesthe cortex cluster to getunhealthy also it decreases our meantime to recover because like when wehave a a incident the operator now canlook to a subset of your fleet with asubset of your customers instead ofhaving to dive and like figure out uhthe your whole uhinfrastructure okay so uh a little bitabout a cell architecture like those arethe main components that we have uh wehave a very thin layer called cellrouter this layer is like the thinnestpossible layer uh that only the onlyresponsibility is take the request fromthe outside world and route to a cell uhlike this architecture is hidden fromthe customer so the customer has asingle point of entry in our system thatgoes to the cell router and the cellrouter will uh will forward the requestto the right cellwe have the cell itself which �is acortex deployment uh together witheverything else uh as I said that needsto be uh inside of the cell so the S3bucket uh the authorization uh the theKubernetes cluster everything is insideof the cell uh and we have the controlplane the control plane is basicallyresponsible for administration tasks itwill like assign customer to sales itwill will also provision new cells or douh or do or deprovision cells if it'sneeded okay a little bit about cell uhcapacity management inside of the samecell uh we keep the cells within a limitthat we deem safe uh and this limit wego we get to this limit uh doing lots ofdifferent uh load test and types of testand this is the limit that we we feelcomfortable of running a cortex clusterso every time that we will roll out achange this change needs to be testedwith a cluster a pre-product clusterthat is uh that's maxed out that is likethe size that uh we feel like that themax size that we can allow the cell togrowso then we we inside of the cell wedefine two thresholds one is called likescaling threshold uh so which is thethreshold that we uh that we set uh tostop assigning new tenants to a cell andwe have a hard limit which is a limitthat we don't want the cell to uh goabove uh so when you have like uh sowhen all all your cells are closer orabove your scaling threshold we scaleout instead of scaling up a cell whatthat means that means that like we stopassigning tenants to that cells and wescale out uh we scale out the we createa new cell to start assigning tenants tothe new cell uh okay but like what aboutthe hard limit how can we preventtenants to grow inside of the cell andcross the hard limit well for that wehave some automations inside of the cellthat allocated limits uh and quotas forcustomers so basically how that works iswe have like controllers that is likeconstantly looking at usage uh customerusage uh and it will auto grant limitincreases or grant give more limits to acustomer kind of outscaling the limit ofthe customer when is deemed safe so uhthe controller see like how much howmuch the limit was granted in this celland if it's below the hard limit it saysit's it's safe you grant this controlleralso reclaims limits that's not used soif the customer was using lots of limitsyesterday uh and stopping using so westart to reclaim those limits to free upcapacity to othercustomers and finally this controlleralso pre-provision nodes uh when wecreate a new cell we don't create thecell with like uh with uh a scaled tohandle the full capacity so what we dois like we create a cell with a minimalfootprint and we let the cell groworganically uh and when the cell isgrowing the the same controller beforegranting limit because a customer needit will pre-provision nodes uh whenthose nodes are ready it will uh givethe capacity to thecustomer okay uh that works great butwhat happens if a customer inside ofexisting c inside of existing cell needsmore capacity uh well in that case wetrigger a operation that we call cellmigrationuh we create a new cell or we pick acell that is has free capacity as targetof the migration we mirror this trafficbetween this we for some time we mirrorthe traffic between the two cells uh andthis is to have same data in both cellsuh then we start migratingconfigurations and historical data fromone cell to another remember that uheven like the historical data is is on aS3 bucket that belongs to a cell becausewe don't want to end share between cellsso we need to migrate historical dataand then uh when everything's done atthe end we stop mirroring the traffic ihave all the data in the new cell and Istart to route uh also the readoperations to the new cell uh thecustomer doesn't note the oper this thisoperation the only thing that uh thecustomer may note and we are working touh to fix is that in the end of thisoperation the customer can uh receivesome duplicated alerts uh but other thanthat should be totally seamless for thecustomer all right uh so you might bewondering how do you release all of thiscomplexity to over tens of regionshundreds of cells and clusters acrossten�s of thousands of hosts so what we dois what we call deployment waves in ourdeployment process we want to releaseevery change safely but we want to uhrespect the cellular boundaries as wellas regional boundariesuh in our pre-production steps we runour uh unit tests and uh make sure thatevery everything is passing and then wedeploy the new change to our what wecall beta environment and in our betaenvironment we run our integration testsetc and if everything looks good westart uh what we call uh productionwaves each wave uh consists of multipleregion deployments uh as the big wavesyou can see up top uh we pick a rathersmall region to begin with ourdeployment process uh and each waveconsists of subwaves and the subwaves westart deploying our change to our gammaenvironment first in our GMA environmentyou will realize that we deploy ourchanges to two cells to be able to testuh our uh changes in cortex and alsocellularrouter and in our approval steps uh as Imentioned before we go through ourintegration tests we uh have canariesrunning 24/7 in our gamma and prodenvironmentsuh we wait for a few hours to get thesignals from our environment and see ifthere's anything breaking uh with ourchanges if there is uh we look at whatthe issues are and then uh decide uh toroll back the change or uh we rollforward uh in our first production wavewe pick a cell from a small region wedeploy our changes we run our approvalsteps again and uh in this phase if wenotice that if uh any alarms are firinguh the system auto rolls back thechanges so this is addressing one of thefundamental problems with the blastradius uh so when we uh deploy a faultychange we do not want to impactworldwide we do not want to impact thewhole region only a subset of customersand users are impacted by this changebut uh majority of the time things arelooking bright and then we move on toour next phase uh in the next way we uhdeploy our change to the remaining cellsin that region uh we go through ourapproval process again and then uh inthe next big wave now we pick tworegions uh now that we have gainedconfidence in our first wave uh weslowly uh expand the scope of ourrelease uh we pick the cells in thoseregions in gamma environment we deployour changes we go through our approvalprocesses again and then uh notice thatwe pick a single cell in all theseregions in our first prod first prodwaveuh we wait we uh see if everything islooking great uh and then we move on tothe next uh cells in that region noticethat the subwaves and number of subwaysuh can increase depending on how manycells we have in the targeted regions umthe process rinse and repeats to uh nextwave of regions we exponentiallyincrease the number of regions that wedeploy our changes to and the overalldeployment for a single change uh takesuh in the order of daysso a quick word about testing as Imentioned we do our unit test acceptancetest and like canary tests as I'm suremost of you all do uh but we also do fuzfuzziness and correctness test what thatmeans is especially for our querycapabilities given from QL supports wideuh a range of queries if we are makingany changes on our query path we want tomake sure that we are not introducingregressions on the query logic in orderto test that what we do is uh wegenerate bunch of random queries andthen we when we deploy our changes toour GMA environment we uh look at theresults from Cortex and then we alsohave a vanilla Prometheus instancerunning in our GMA environment that wecompare the results to if things arelooking fine uh we uh wait a little bituh to gain our uh gain confidence andrelease our changes we also do faultinjection betweenum weight production waves uh what thatmeans is uh we basically test theresiliency mechanisms that we have inour system to test such as like uhagainst a impairmentso what happens inside a cell'sdeployment uh that's when we deploy ourcortex uh images to our Kubernetescluster so uh quick word about our uharchitecture is that even though uh asAlan mentioned uh we can toleratedisruption in a single a system incurs asmall uh downtime when uh two inesttorsare down from uh multiple a so how isthis relevant to our deployment processso we have a stateful set when uh we areuh for our injusters when we aredeploying cortex so pod disruptionbudget doesn't necessarily take zoneinto account when it's replacing pods soif we are taking two pod if we arereplacing two pods it's highly likelythat we are going uh we are going to beuh taking two pods from different as wedon't want to cause any uh impact to ourcustomers so we introduced zone awarecontrollers sonware controllers are opensourced and you can use that um in yourprojects if you have stateful sets andrelying on u multiple ACS uh how thatworks is that as you can see from the uhtop leftum graph uh we start taking uh replacingnot uh nodes uh in uh one a we startsmall uh we take one and then we taketwo and then we take four weexponentially grow and then once thata's deployment finishes we move on tothe nexta so this process uh speeds speeds upour deployment uh by a hugemargin so what happens after we deployall of this complexity uh the greatfailures can happen if you are operatingyour service uh that relies on tens ofthousands of hosts uh any number ofthose hosts uh can uh go down at anytime we do our health checks but thereare gray failures uh that uh cannot benecessarily detected through uh healthchecks so we do host health monitoringto be able to detect gray failures suchas network and disk failures uh we uh weuse multiple signals to determine if ahost needs replacement such as 5xx countor latency we replace the node even ifyou get a false negative signal to notdisrupt the service atall you can use this mechanism to payaway uh from an a in case of a failuresif you'dlike as we areuh looking at our deployments we alsonoticed that uh Carpenter project in forKubernetes is taking off and we are inthe process of moving toCarpenter uh Carpenter provides us abetter transparency and debugabilityduring deployments which helps uhspeeding up the deployments on ourobservations we also noticed that uhCarpenter provides faster scalingcharacteristics and in operating uhglobal scale uh you can have capacityconstraints on the hosts that you arerunning your service uh host typesspecifically so with carpenter we can uhpick multiple instance types to run ourservice and then uh we do not get any uhcapacityconstraints so what's next for cortex uhwe are uh uh as I mentioned before weare tolerant across sing uh single aoutages but however due to how we shardour data we incur a short small impactwhen two injusters are down uh from twodifferentACS because uh of how the series areassigned toinjesttors what we are thinking of isintroducing a new concept called uhpartitions instead of uh directlyassigning series into inesttors we aregoing to assign series to partitions andthe partitions will be assigned toinesttors so how does this work intheory um with the introduction ofpartitions we are more resilient towardsmultiple injesttors going down acrossmultiple ass notice that the customer orany time series is not going to have anyimpact even if more than three ingestersare down from three differentaes the only downside is we can onlyhave impact if two inesttors uh are downfrom the same partition but which isless likely to happen compared to ourprevious statethat being said one of the next thingswe are working on is we are exploringparket format for our long-term storageum working with the upstream communityas well as tonos maintainers this willkeep the same query performance but withfurther optimizations we can improvequery performance but the main focus ofthis uh effort is to be able to reducethe operational toil uh introduced bystore gateways and reduce theinfrastructure needed to runCortex so uh that's it for today ifyou're interested in learning more aboutour improvements and how to run uhCortex and what's next please feel freeto attend our uh the talks uh that ourmaintainers are going to give uh onFriday you can also find us in uh Cortexand Open Search booths as well as reachout to us uh in CNCF uh channels thankyou2025-04-15 22:03:37.365932 ,,��7�y#��%AzLHdgl2qxbgall right welcome everyone Thank you forjoining Marty talk So today we are goingto talk about memory and with uh whereall my memory is gone So today we aretrying to map the Kubernetes memorymatrix to actual physical resource Ihope you you will learnsomething So um I'm I'm May I'm workingas a software engineer at isalent atCisco Um I'm working on tetragonon whichis an ebpf based runtime securityproject It's and it's part of selium andthis is not a picture ofme So it it��E�x#��AAOqLpKJwKZlkwell uh welcome everyone uh today we aregoing to talk about uh operating globalscale Prometheus deployments onKubernetes uh I'm Orchin uh principalengineer at AWS uh focusing on opensource observability solutions and I'mAlan uh I'm also software engineer atAWS and a cortex maintainerall right we have uh a lot of challengesoperating at global scale uh and on thebottom right side uh you can see apostit that I prepared when I joined theteam two and a half years ago about thechallenges we had back then and apriority order we are not going to talkthrough every single one of the problemsand challenges but a subset of thechallenges are going to act as our roughagenda today we're going to touch baseon availability multi-tenency uh blastradius reduction operating at globalscale and Kubernetescomplexity before we dive in let'srewind a bit uh in about let's uh talkthrough uh talk about the ecosystem andhow the ecosystem has evolved uh so in2012 uh Prometheus was uh started uh andthen open sourced in uh 2015 uh Cortexproject a Prometheus compatible uhscalable uh open source project releasedshortly after and we released uh managePrometheus uh by Amazon um at 2020 wewanted to take away the operational uhburden of uh handling the infrastructurefrom the users uh we have an open-sourcefirst uh tenant uh so any improvementthat we do uh we contribute back to opensource community and you can see that uhfrom our commits uh over the past fewyears in the cortexrepository so before we launch theservice let's go through uh what wewanted to launch back in 2020 uh wewanted to provide a fully managedserverless Prometheus compatiblemonitoring service by AWS that isscalable highly available integratedwith AWS services secure supports IMauthorization as well as data encryptionand it can support data retention up tomultipleyears as I mentioned earlier we uh useCortex under the hood uh but uh we cantalk let's talk a little bit about uhPrometheus so Prometheus is anopen-source purpose-built uh time seriesdatabase that comes with a very powerfulquery language called PromQL uh itenables analytical query capabilities onyour time series data but uh alsointegrates really well with Kubernetesand it's widely adopted format andsupports wide range of exporters uh inthe community however uh it inherentlycomes with uh limitations the primaryone uh that is our concern is that it'sa single machine uh and it's deployed ina single machine it's not a distributedsystem it uh is a monolithic service souh depend�� it all started withsomething like that for me So um amemory dashboard So uh you might end upin the same situation and um I was quiteconfused that like uh what was thenumber I was actually looking at So hereyou have like an application running abunch of pods of this application inyour cluster and they are like consumingan amount of memory but what is thismemory about and how can I actuallyunderstand what is this about so if youare checking theuh this this dashboard in particular youmight see something like this and inyour case it might be the same So thememory metric that is actually usedbehind the scene is this thing called uhcontainer memory working set bite Soit's it's a pretty long name and um forme it wasn't very clear what it wasabout So let's try to understand wherethis metrics come from and how it's it'sbuilt So this is the first conclusionvery early conclusion I I I think Icould be uh fairly right to say that themain memory metrics that is tracked inthe communities world for pods iscontainer memory working set byesSo uh the thing I wanted to do next istrying to understand where it comes fromSo um the one one idea was just to readthe documentation In the documentationon the website you can see a bunch of uhum descript description of like theresource metrics pipeline and these kindof things But one uh good idea is was tolook at the Prometheus dashboard itselfSo I just wanted to show you how itworksum because primitives might be runninguh as part of your cluster you can justlike port forward the service theprimitive server and then you will haveaccess to the dashboard so I'll try tozoom in and from the dashboard you cansee uh a bunch of information and amongis like the target elf and uh here youcan see that promeuse is scrapping avarious number of endpoints uh here youhave like the API server the nodes and abunch of uh service endpointsSo by gathering all this informationplus uh a very nice blog post written bya guy named Mia Halbert uh I could kindof create this uh little diagram of uhthe Kubernetes metrics flow So I willtry to explain the different part inthis graph So on the on the left sideyou have this uh green boxes So thoseare just service that might be runningas part of your cluster of outside butthis is just what we are fetching forgraphana or cubectl top and you havelike this ble this big blue box which isthe abstraction of your nodes and onthis node you have like a bunch ofprocess running uh those are the purpleboxes you have cublet so this is the oneresponsible for starting the containersand managing the containers on your nodeyou have the container runtime thepromeus node not exporter so this thingis running as part of demon set in thecluster but it's running on the node andexposing matrix and as part of the uhcublet you can see you have like a bunchof endpoints so uh those are httpendpoints exposed uh via cublet exposingvarious metrics so in our case uh wewere looking at this graphana dashboardfor memory so we are directly reading uhinformation from the primitive serveryou can even like see when you are onyour Prometheus instance you can querythe container memory working set bydirectly from here and you can likegraph it and then like see uh thismetrics right so this is coming fromPrometheus of course and uh the nextthing was like where does Prometheusscraps this metrics from so there thereare like two main culprits here theprimitives not exporter and the Cadvisor endpoint so let's start with theprom primitive node exporter So thisthing is a demon set as I said it'srunning as part of your cluster and it'srunning on each nodes and because it'srunning as part of the of a pod it isexposed as a serviceSo if you want to take a look at thisthing you can port forward the serviceuh this one and with curl for exampleyou can query thematrix path and put it into less So ifwe do that we will see a bunch ofmetrics exposed here like a bunch of gorelated metrics but the vast majority ofthe metrics will start with this nodeprefix So this is not really what we arelooking for and if we check for anythingstarting wit�h container it won't bethere So this one is mostly aboutexposing u data about the node elf Soit's not really about the workloadsthemselves Let's get back to thepresentation All right So the nextculprit is the C advisor endpoint So umthe C advisor endpoint it's somethingthat is running as part of the cubletright is exposed part via the cublet Sothe nice thing is that we can actuallyread this directly by quering thecubernetes API Um we can use thiscommand So we can cube get and use thisvery nice proxy feature in in the APIserver So we can use like SL API/v1nodes In my case I have only one nodeIt's like a locally running cluster It'smini cube And then you have like thisproxy thingy and and you can pass anyendpoint you want to retrieve directlyon that nodes on the cublet node viathis proxy uh thing And if we check likethe the thing from the the metrics fromthis endpoint uh we can see that nowmost of those are starting with thecontainer prefix which is actually whatwe want and what we are looking forSo let's get back tothis So now we have like one moreconclusion Um the container memoryworking set bytes it it it comes from Cadvisor inside cublet Butum most of you in the room mightactually wonder like what what is Cadvisor what is it running inside ofcublet so let's let's try to understandthat furtherUm so uh C advisor it's the project onthe left Um this is basically uh autility for you to analyze andunderstand the resource usage of yourcontainers in a cluster or in a clusteron just like the on the node sorry anduh I wanted to mention runc here becausepretty much all the underlying technicalimplementation of C advisor is relyingon run lips container which is the gopure uh pure go implementation uh forinteracting with containers on um uh inthe cubernetes world it's used bycontainerd it's used by cryo etc etc soyeah and uh how how how does this worknow like how is my uh container memoryworking set bytes computed by the Cadvisor thingy so uh let's try to makesense of this diagram um it's a prettybig diagram I will try to walk youthrough it so it it all starts with thisget stat method that is called inside ofthe C advisor code right and Just afterthis thing it calls this croup managerget stats and it it calls this get statsthat is on the right So now we are likejumping into run cip containers code anduh we are executing in the package thatis called croup and you can see herethat we are trying to retrieve statsfrom this thing called croups uhapparently they come in two flavorsright they have like croups v1 andcroups v2 and depending on the versionuh lip container is doing like slightlydifferent things and you can see at thebottom that we have uh forexample different name for similarmetrics uh it's reading memory usage inbytes for croup v1 on one side and onthe other side it's reading memorycurrent and other statistics so allthese stats are collected thanks to theuh lib container croup package andreturn to the uh C advisor code and theneventually they reach this uh set memorystatsfunction and uh this is the actualfunction that will produce the metricswe have been looking for since thebeginning So let's let's take a look atthe code Um the the code like thesimplification looks like that Uh it'sit's fairly straightforward in somesense It's retrieving the usage we'vebeen seeing in the slide before So usageinvite for V1 and current for V2 Andthen it's retrieving something elsewhich is called total inactive file inthe case of V1 and inactive files in thecase of V2 And then it just does likethis substraction Very simple code likethe working set is basically the usageminus the inactive file thingy And wecan recap this with that So it's likekind of the max between zero and this Itcan be negative of course like we canhave like negative memory usageunfortunately and and for V1 we havelike memory usage in bytes minus thistotal inactive file and for V2 we havecurrent minus inactive file So rightgreat now we know that but um we knowthat say advisor computes this uh metricfrom this stat from this croup usingthis current minus inactive file butw�hat is these things about like what isthis uh current stat about what is thisinactive file about and what is a crouprightuh so here is a wall oftext please don't read that's just likeas an illustration purpose it's theLinux man page for croups But what Iwanted to uh explain here is thatcontainers are not like a C group Linuxuh concept right they are not like theOS concept And for creating containerson the OS we used various systems likeuh namespaces C groupoups and othersystems Actually name spaces on one sideare for like isolating the process withthe the rest of the computer like uh youcan create like a network C groupoup andthe process will have its own vision ofthe networking or you can create like aprocess croup and the process will thinkit's the only one running plus the otherprocess running inside its containerWell it's not true But on the other sideyou have like croups and cgroups aremostly to restrict the usage ofresources on the system So you can havelike CPU CRPS so to restrict the CPUusage So basically the idea is you youwant to see how much CPU you're usingand maybe you want to restrict like sayI don't want this process to consume100% of my CPU and then you have likethe memory uh croup So this is the onewe are going to be interested uh now andthe idea here is to retrieve a number ofinformation from the the memory usage onthe OS from this croup and also likeenforce uh memory limits uh maximum andminimum All right Um so Kubernetes theythey use Cgroups uh basically when it'sstarting a container is talking with thecublet which is talking to the containerruntime which is creating uh someclusterand they have this abstraction ofquality of service classes So you youmay have never seen this kind of qualityof service thing but you uh may haveinteracted with them without knowingbecause you have like three main serviceclasses guaranteed bustable and besteffort and those are actually related tothe way you set limits and requests onyour workloads So when you don't setanything like like most most of the timeit's it's just like best effort It'sgoing to be like in this category uhwhen you set like uh one request and alimit that is above the request or nolimits it it's going to be in bustableand if you set a limit that is equal tothe request it will be warranted SoKubernetes uses this concept using thememory limit and request to kind ofclassify the differentworkloads and this uh will make sensebecause it will uh Kubernetes will usethis concept of quality of serviceclasses to um tune certain settings uhin inside of the croups So this isanother concept So the out of memorykiller you may be familiar with this onebecause it may have killed yourcontainers in the past Uh again this islike a Linux concept This is like an OSconcept It's not aware of containers inthis case And uh what what happens onthe system when it cannot reclaim anymore memory it needs to pick uh one ofthe process to actually terminate thatprocess and get back some memory So thewhole uh challenge here is to actuallypick the right uh process So KubernetesKubernetes uses like this quality ofservice to choose like the score it'sgoing to assign on each um of thecontainers So the idea being here isthat like guaranteed will be a lowerscore than best effort and in betweenwill be burstable and um uh the ideabeing is that the out of memory killerwill target a best effort port before aburstable port and before a guaranteedport Um yeah I just wanted to make anotes aside as well to show you like thekind of quirks that can happen betweenthe Kubernetes world and the OS worldlike like for example again the out ofmemory killer is not really aware of thecontainers concept So when it's pickinga process to kill it doesn't know ifit's like uh the main process of acontainer or not So you can end up inthis weird situation where like the outof memory killer is like killing processtwo in your container and Kubernetesstill sees that or cublet still seesthat the process one is running and lyso it doesn't do anything So you can endup in this kind of weird situation Soyeah anyway what we �had here is that uhwe use these croups to kind of classifyour containers and to like tune thesekind of settings and how it's actuallydone in in reality the interface betweenuh the OS and uh uh the Kel and the userspace for C groupoups is using uh a filesystem and all the C groupoups they canbe represented in this uh file systemthingy here we'll use like the systemdterminology with the slice and the scopethingy and and what you see here is thatyou have like at the very top you havethe root um cgroups So these croups uhcontains all the process on yourcomputer So you will see the statisticsabout memory from all the processrunning on your computer and you canadjust uh settings on all the processand then uh for in the case ofKubernetes you will have like this cubepods thingy and and here you can see whyI try to explain this quality of serviceclasses is because like they willmanifest themselves as categories in thecubeot slice So you will have like thisguaranteed best effort and bustableslice and in each of these slice youwill retrieve all your uh pods So I I Ipicked the best effort in my case and II put like two pods and you have likepod on the left uh and pod on the rightand the pod on the left I added likemuch more details and you can see thecontainer like the the scope that isrepresenting the container which is atthe the very end of this graph So it'slike a leaf on this graph Uh I putcontainer ID or docker It depends on theuh container runtime that you use Thename will change but uh anyway if wezoom in this thing because we areinterested about memoryuh we will see that for memory So abunch of files um some of them are readonly some of them are write and read andonly write So the idea again is that uhthe kernel will expose uh somestatistics through these read only filesand uh you can also again set a memorymax a memory mean some kind ofboundaries to ask for the kernel toenforce those things for you and um thething we are interested about is thememory current So this is like the thingyou've been seeing in the code like justbefore uh it's just a number So this ispretty much uh the total of memory thatis used by these uh croups So all theprocess belonging to these croups andyou have this thing called memory statswhich contains a much more detailsbreakdown of all the memory usage So youhave the anonymous file kernel whichshould like um uh addition to this kindof total and then you have like a lot ofuh different statistics that you canuse Um so this is croup v2 by the way uhthey have two versions but I I I try tofocus on cigar v2 in the in the nextslidesum and yeah trying to make sense of thestatistics you've been seeing before isactually a bit complicated becausememory is is is a bit complicated likeit has a lot of terminology it is like avery complex uh subsystem into the Linuxkernel so today we are going to try tofocus on just like a few uh words sojust trying to explain a little bitabout like main memory virtual memoryand overcoming So main memory is couldbe referred as like your physical memorySo the actually fast data storage memoryinside of your computer or the randomaccess memory uh um device on yourcomputer and then you have like thevirtual memory So virtual memory is thiskind of abstraction that we built on topof the main memory the physical memoryUm it isuh theoretically it's like an infiniteamount of memory possibly it doesn'tneed to pair with the main memory and uhyeah again keep in mind that likevirtual memory is not real memory andthat that will explain like thedifficulty to map to the physical memoryand the overcomit concept So Linux likethe the kernel is uh overcommittingmemory uh which means that basically uhit will be able to accommodate for asmuch memory that you want to allocate Soif you want to allocate like 1 TB ofmemory and you only have like uh 8 GB ofphysical memory in your computer you canallocate that because Linux willovercommit So how does that work let'stry to seeSo I borrowed this diagram fromBrendan's Greg systems performance bookVery nice book Um and I will try toexplain uh a little bit furt�her thisconcept of virtual and main memory Umtry to bear with me So uh when you arewriting programs and like runningprograms you want to allocate uh objectsright So if you are using C it could bevery explicit like you just use malocand you allocate like a new object or ifyou are using like a more modern umlanguage like I don't know like go or Idon't like python maybe you will noteven realize that you are allocatingmemory you will just like create newobjects and they will be there they willbe allocated on the e for you by theruntime but you you might not evenrealize but anyway when you once youallocate that memoryuh it's actually taking uhum it's actually taking no space in themain memory initially when it'scompletely empty So you can allocatelike terabytes and terabytes of objectbut as soon as you will start to storein this memory So as soon as you willstart to use this memory behind yourback the OS what it will do is that itwill perform a lookup on this uh memoryl virtual memory it will talk to thisMMU thingy which is the memorymanagement unit and if the memory is notactually backed by some physical memoryit will do something that will call thepage fault and it will ask like the MMUto map this uh virtual memory into someactual uh physicalmemory So at this stage like the fifthuh point your memory will be likeallocated and mapped So it will actuallyat this moment it will like take somespace on the physical memory But just toshow you like the complexity of thislikeum uh I mean your your object initiallyit doesn't mean that your full objecthas been allocating allocated in thephysical memory If your object is hugeand the page size so the the minimumunits of memory that Linux handle is issmaller you could have just like try torewrite on like some part of your objectand only like a part of your objectcould reside in actual physical memorySo even even there it's actually trickyyou you've written to some object youdon't know if like everything hasimpacted your memor memory usage andeventually um if the object that you areusing um like the OS thinks that maybethis memory has not been accessed for awhile and it's like under memorypressure it might even decide to swapthis memory so to take the memory fromthe random access memory parts like thephysical memory and put it on some otherdevice like a slower device like a swapdevice like ayou know so from all the lifetime youyou don't really know when it's likeimpacting the phys the physical memoryIt's like it's always like anapproximation And just to add on top ofthat um on like accounting memory toprocess memory can be shared betweenprocess If you are using like sharedlibraries uh memory can be compressed Sowhat it means is that you will need atsome point to kind of create some kindof uristic around your memory and andand try to to stick with itRight so this is what we ended up withthe C advisor thingy Uh it was usingthis memory current thingy or usage inbite for v1 and it was substracting thisinactive file thing So as I say justbefore the total memory is likeseparated into like anonymous memory Sotypically a heap memory use in yourprogram like allocating objects uh onthe heap some files uh it could be filesthat you used for IO like you've beenlike uh opening a file at some point andreading from it and kernel which couldbe like um object related to yourprocess running on the OS like likekernel stack or other things But thething is like some parts of for examplethe file uh breakdown of this memory itit could be inactive So if for exampleyour process opens some file at the atthe beginning of its lifetime it readsfrom it and never touches that that fileagain you might not need that that thatlike file to be in cache forever Rightso you could just like reclaim thatmemory later on And this is kind of theidea of this number like trying to um umtake into account this memory that canbe reclaimed before it's actuallyreclaimed because eventually what willhappen is that when the out of memorykiller will tryto find more memory or before it'sactually called it will try to reclaimas much as memory as it can So it willtry to reclaim all the memory from thecached file that are no longer usedSo this is kind of the key key takeawayfour Uh so this container memory workingset bytes we have been looking since thebeginning It's actually the currentmemory use like the main memory metricfrom the croup minus the inactive fileAnd it's like a kind of a good enough uhuristic for the actual memory use andwhat the killer is after And I I hopenow you kind of understand why we needlike this kind of good enoughuristic Um so this is kind of theconclusion like going from the physicalto Prometheus again So we have like thismain memory it's impacted by the objectsthat have been like um allocated on theE the files that have been cached thekernel files the kernel uh structuresthat have been allocated or even likeum or even more Yeah you have like thisphysical memory on the side and the bestyou can do is using like this memorycurrent uh um stat from the croup Sothis is like an accounting of this thingSo again anonymous memory it can bethings on the heap depending on yourlanguage you might want to use uhdifferent tooling to understand like howcan I reduce like the heap usage on mything For go for example you can like uhuse like go the go me start thingy oryou can use like the you can trust likeflame graphs on how the memory is usedUh the file uh it will be like the filethat you interacted with So the IOyou've been doing or just like thebinary that has been started as aprocess that needs to be mapped inmemory before starting and the kernel itwill be um like as I said kernelstructure related to the execution ofyour process or in the case oftetragonon it could because we are usinglike this BPF thing So BPF like is akernel technology and BPF progs they usethis thing called BPF maps and the BPFmaps for c V2 are accounted like thememory impact of those are accounted inthe kernel box here So all of these likecompose your container then it's read byit's read by uh C advisor like libcontainer then C advisor that thatcomput thing and it's exposed by cubletand then you retrieve that fromprimitives So yeah in kind of aconclusion um again it's tracked by thisthing C advisor again this is a uristicand this is what you you're looking forUh I just wanted to mention a bunch morethings here Uh I found on thecommunities documentation uh some nicescripts So if you are in a situationwhere you can't really like dive intothe croups and read everything byyourself and do the math by yourself youcan use this little scripts that theymade uh to check like the main uh Cgroups and see like the working set byesthe inactive files andeverything And uh yeah shameless plug II did like a very small tool So this isvery limited It's using lip containerlike from Reny again And the the goalhere was just um to kind of analyze mymemory usage without having to start myfull stack of like a cubernetes clusterand everything So this little utilitywhat it does is that you can give it uhyour program its argument and it willstart your process uh as a child put itinto like a croup and uh monitor likethe the little the little the stats thatcome from this croup's memory stats andthe idea is to print it like regularly abit like like simulating what we coulddo with Prometheus and Graphfana tryingto see the evolution So this is justlike a very small utility but can behelpful if you don't need to like spawnyour own entire like cumatuscluster So that's pretty much it for meUh thank you for joiningUm sorry uh I I I put some QR code onthe on the thing You can like uh go tosee like the the slides I puted theslides online and uh and yeah thank youThank you for joining2025-04-15 22:03:37.843565� to a docker file andultimately reduce the number ofvulnerabilities that were found by opensource tools so they actually fooled thecontainer vulnerability scanners and uhwhich are in charge of detecting packageand vulnerabilities in the containerimage so in this work we wanted toanalyze how the landscape evolved overthe past two years and we proposed anacademic approach to the problem that wewill detail rightafter so I will first give a shortoverview of the tools that exist forcontainer vulnerability analysis andscanning so there are typically two mainsteps when we want to scan forvulnerabilities in a container image sofirst the tool will index the content ofthe container image so they will analyzethe container file systems and try tofind installed software so it can be theoperating system the operating systempackages the programming language thedependencies and libraries and finallythe binaries and at the end of thisprocess they will generate a softwarebill of materials or sbomb that is alist of components all of packages uhrunning into that container then thesecond step is to search forvulnerabilities in this software so theywill match uh those package against thebase of non vulnerabilities andtypically they leverage custom onlinesources and aggregate them togetherhere is an example to find installedsoftware using sift which is an opensource tool from narr we analyzed python3.10 image and output the result as thebomb uh for JSON uh so here is a list ofartifacts as you can see we have a listof packages uh each package is uniquelyidentified using annotation we have thecommon platform enumeration or CP and wehave the package URL or P so thosenotation enable to give a uniqueidentifier and to give details like thepackage name the organizationuh the version and even the author ordistribution so they um also provide thepath of the file that were used to findthe given software uh the layer ID wherethe file is and finally some metadatainformation on thepackage the second step is to search forvulnerability so here it uh we use gripewhich is another tool from ensure and itwill process the sbomb previouslygenerated so on the left you can see thelist of package that were previouslyidentified u with their version and thetype for each package you have one orseveral vulnerabilities uh mostlyexpressed as CVS with their severity andfixes if possible to upgrade to a newerversionso we can wonder which files areactually indexed to find those packageand actually there are different kindsof files corresponding to differenttypes of packages so first we have theoperating system and it is mostlyexplicitly written in a file like uh etcrelease uh it can be different filesdepending on the distribution and on theoperating system then we have thepackage installed via the packagemanager of US here is an example to findDPKG package for Debian uh concerningprogramming language we have thedependencies and the files uh installedas a package manager so requirements.txtfor Python package log.json for NodeJSand with the package manager we havedistin for Python and N modules forNodeJS and finally we can retrieve someinformation about thebinary so how these tools index thiscontent from the file system by defaultthey will analyze the squashedrepresentation of the container imagemeaning they will combine all of thelayers together to produce the finalfile system and then they will searchfor nonfindings uh containing packagerelated information just like the onethat we saw previously and they will useregex based uh search for licensepackage version author and soon so we can wonder uh what happen ifsome of these files are removed or evendo not exist one example could be then adeveloper did not use a package managerto install dependencies but itdownloaded software from source andcompile it locally so in this case wedon't have all of the files to do theanalysis we also have meaningful contentthat is not indexed and not used bycontainer vulnerability scanners so wecan think of the path uh of the filewhich are not fully analyzed we alsohave some content files which� is relatedto the package like the configurationfile the binaries but not actual packagemetadata so those file currently are notanalyzed we have the history or dockerfile commands where we can see forexample the URL that were used todownload software and finally we have uhintermediate layers that may containvery interesting information sotypically files that were previousremovedin next layer of thecontainer and from this observation wenow give a definition to the concept ofcontainer opuscationso application is the act ofintentionally or unintentionallyuh modifying or generating the contentof a container image so when designingthe docker file the docker file in sucha way that the installed software ispartially or even totally undetected bythe container vulnerabilityscanners and this can arise for severalreasons as we see developers may want toreduce or to minimize the size of thecontainer image uh so to do so they canremove installation folders or it can bealso the case when using custom softwareinstallation without using the packagemanager and finally using multi-stagebuilds so we have several intermediatebase image in the container when we copyonly some of the files and we can missimportant files to do the analysis andthose are actually best practices andgood guidelines when we write the dockerfile but at the same time they make thetools vulnerable to opuscation and thethe container imageopiscated so from this observation wederive a list of objectives that wewould like to cover during thispresentation and in this work we makethe assumption that the containeropuscation is only inintentional meaningit is not introduced by purpose and formaliciousintents and first we want to understandwhat are the main opuscation and howthey occur then we will study differentstate-of-the-art tools to see whetherthey are vulnerable to offiscation so welook both at cloud-based and open sourcetools we also analyze a real and popularcontainer image to see whether theyactually contain opuscation and maybe uhdeploy in production and finally wepropose counter measures and mitigationto avoid container opiscation whenbuilding the imagesso now I will give the floor to mycolleague Yakopo he will present you theobjectives and the work we've been doingto enter themhello again um so let's start ourjourney towards container opiscation umfirst objective is to uh derive ataxonomy of the different offiscationtechniques uh for that uh we uh exploreddifferent uh information sources uhstarting from uh research papers inacademia then white papers industrystandards and we even went uh into opensource software and looked at the sourcecode and how they they work we thenbuilt a taxonomy of the different typeof techniques that can be used toobuscate the content of a of a containerimage and here is the list of eighttechniques that we were able to todevise uh bear in mind that all of thisassumes that the creator of the dockerfile is has a non-malicious intent um sofirst tactic is the OS offuscationmeaning that uh users can deletemodify or somehow alter the content ofthe uh uh files containing operatingsystem information then we have OSpackage information uh where basicallyusers they alter the content of thedatabases or files that containinformation on the installeddependencies uh similarly we havedependency offiscation this time we talkabout uh programming language dependencyfiles let's say requirements.txt forPython or package.json for forNoode.jsum similarly we have package offuscationuh package obuscation happens when uh auser downloads dependencies and thenalters the content of the downloadeddependency let's say changes the contentof the node modules so that uh theversion information or authorinformation of the dependency is altereduh then we have URL autheis basically uhthe the user downloads software uh fromthe internet rather than using packagemanagers uh then we have alias and linkoffiscation uh in this case we usealiases or links to to hide the path thethe real path of some files that areused uh as index files from uh softwarecomposition analysis tools an�d finallywe have the pack ofcation uh basicallythe content of the image is compresseduh so this refers to uh multi-stagebuilds or tools that uh compress the thelayers into a single oneum so now that we have a taxonomy of thedifferent offuscation techniques uh thenext question is is obuscation detectedby state-of-the-arttools uh so we need to find to find outwhether uh this offuscation techniquescan be actually uh exploited by anattacker so for this we selected uh sixdifferent tools ranging from uhopen-source tools to cloud ones so herewe have uh CNCF projects like tree butalso open source uh tools like sift andgripe we also have docker scout uhartifact registry for Google defenderfor cloud in Microsoft and Amazoninspectorokay now we have tools we havetechniques how to test them uh we needto find uh we need to generate a dataset of images that are increasinglyobiscated so um we curated a data set uhto test the resilience of tools tooffiscation and you can find it herethere's a QR code so we started by abase image base Python 10 image that isnot obuscated and then we startedprogressively adding more and moreobiscation techniques first individualtechniques and then we combined multipletechniques in the same uhimage and uh here are the some of theresults so as you can see uh many toolsactually can I use this maybehello does it workanyway yes um so as you can see um manyof of the of the tools are vulnerable toindividual instances of offiscation butthen uh as more um techniques arecombined the tools become more and morevulnerable to them um so what what wedecided to do was that we decided tocheck for uh vulnerabilities andpackages detected for each tool andtechnique just to have a morecomprehensive uh uh overview uh so hereis a table that shows for each uhtechnique and uh for each tool thenumber of vulnerabilities and packagesthat are that are found um and we cansee also that the impact of obuscationincreases also also here with the numberof of techniques and we can see that formany tools and many techniques we havezero uh vulnerabilities and zeropackages identified meaning that wereduce quite a lot the the visibility ofthis uh software composition analysistools um but we also see something elseuh we see on on the right here you seenot not available not available uh thisis because uh some of the cloud toolsthey actually make assumptions on thecontent of containers so if containersdo not have a specific file or specificfiles then they will refuse to scan themfor vulnerabilitiesand many times it's very hard todistinguish a container that cannot bescanned from a container that has novulnerabilities in theUI um also food for thoughts as you cansee tools report different numbers ofvulnerabilities and packages andsometimes uh tools may have uh mayrecognize for different type ofoffiscation the same number of uhpackages but different vulnerabilitiesfor instance uh sift for the OSoffiscation finds nine vulnerabilitiesbut for the alias offiscation uh finds625um so to sum up the key findings arethat there's a significant impact ofobiscation across multiple uh uh toolsevery tool has at least one isvulnerable to at least one opuscationtechnique and that cloud scanners rejectat times images that missing that aremissing some specificfiles let's now go to our thirdobjective obuscation in the wild so nowwe know that obuscation exists tools aresomehow not resilient to it but ourcontainers in the wild using obuscationtechniques uh so for that we first uhdecided on a data set we had to to findcontainers to to scan um we selected sixdifferent data sets um because uh wewanted to cover as many use cases aspossible so uh are uh obuscationtechniques happening only in uh hobbyistuh container images or are theyavailable in uh in production or do dowe see cases in in productionum so we selected DockerHub officialimages DockerHub Bitnami images that arevery popular for hobbyist and in thecloud uh DockerHub verified uh opensource software and then the quay.ioregistry and the official Amazon ECRregistryum okay uh how to detect obuscation uhwe devised uh� a methodology toeffectively detect cases of obuscationin uh in cont in in containers uh so howhow this works is that we take thecontainer imageuh and we do two things so first weextract the lay the image layers and atthe same time uh using the OCI standardwe are able to uh reconstruct andextract the metadata information on theoriginal docker file uh with the uh withthe analyze lay with the with thecontainer image layers we we basicallygenerate a sort of a G repository bylooking at all the modifications andchanges on each individual file um andthen uh for the Docker file we insteadlook at instances where software isdownloaded from the internetuh if and at the end of the day if wesee uh software downloaded from theinternet in the docker file or we seethat fileswith a a known file name let's sayrequirement.txt changes or gets deletedthen this is a an hint for us that thecontainer was modified or wasobiscated um so we have a data set wehave a methodology let's test this 600containers and see what whathappens um okay this is a glo globalview across this the the different uhcontainers and these are all the thedifferent cases of offiscation that thatwe find so we see that more than 10% ofuh of the containers they they presentOS offiscation uh most of the times theymodify the OS information but sometimesthe OS information is also deleted uhthen we see a spike on the OS packageand that is because mainly most of thecontainers uh docker files they run aget update as a first command buthowever we cannot um really remove themfrom from this count because in manyother cases uh the upgrading thedependencies also comes with installingsoftware from source um then we can seethat there are many uh containers thatare downloaded from the internet like20% of them and uh uh also uh packageand dependencies areremoved then we uh computed this thesame results but we grouped them by dataset uh and those are the uh theobservation that that we see is thatoffuscation is present in all of thecontainers so all of the registries soit means that there is a small but notsignificant difference in the amount ofobuscation from hobbyist to productiongrade uh containers and also we see thatbit nami has a surprisingly low numberof uh OS and OS package obuscation andthis is because Bitnami pack packs allthe layers inside one using a toolcalledclaim so we basically don't have a viewof the of the changes in the in the filesystem uh let's now see a practicaloffiscation attack um this is an extractfrom a very popular uh docker file forposgrql as you can see there's a reallylarge rank command that first installsall the dependencies then download somesoftware from the internet then it alsoinstalls plugins for uh for posgress andthen builds the software and thendownloads everything so what it means isthat uh the corresponding layer willjust have a couple of binaries and oneof them will be a go binary which isokay you can uh actually get thedependencies but the other one is goingto be a C binary and it will not berecognized by any software compositionanalysis tool so how can an attackerleveragethis well basically uh the attacker willjust wait that somebody uh creates thiscontainer or just downloads thecontainer and then uh it will actuallycheck that it will uh exploit knownvulnerabilities on this image and one ofthem is the CV 202410979 which allows umuh an an unauthenticated uh orunprivileged attack to the to the uhpostgress SQL container so basically theexternal attacker will just gain accessuh leveraging this CV and no CI/CD toolwill report thisvulnerability good uh but can weactually oh sorry can we actuallymitigate instances of offiscation so weknow that offiscation exists we haveseen several cases of obiscation in opensource uh containers uh how can wemitigate it so we kind of borrowed someof the uh uh ideas from our firstmethodology where we were able toidentify cases of offiscation andbasically at the first fourth step we doan iterative analysis of all the layersand the reconstructed docker file and webasicallyuh examine each layer individually thenwe analyz�e package metadata and the uhthe actual downloaded packages so notonly the requirements.xt txt but we goin and we look at every uh downloadeddependency um then we also uh analyzebinaries like go binaries configurationfiles libraries and uh uh we also lookat the file paths so we we try to uhsomehow increase the coverage on the onthe container file system so we usecoverage as a metric to understand howwell we are able to reconstruct the uhcontent of the image and uh uh we builtan open source tool called ORCA which isavailable here and uh feel free to tocheck itout so those are the results uh of ourtool basically we are um this uh thismethodology is basically resilient tomany instances of obuscation way morethan the state than the state-of-the-artum but still you see that there arethere are some cases of obuscation thatcannot be reconstructed and that isbecause of the data loss that occurswhen you use multi-stage build or whereyou compress the content of of an imageuh we're getting close to the end sothese are the takeaways of thispresentation so container imageoffuscation is still very much a problemin2025 uh obscure images are spread acrossmultiple use cases from hob toproduction deployments uh we built atool to discover and mitigate many casesofobuscation but not all of them can bemitigated we have seen that multi-stagebuilds compressed images they actuallymake it very hard for any tool todiscover obuscationso we need to update the the bestpractices that we have because we'veseen um an unbalance between the imagesize so what people are trying to doreduce the image size or reduce uh theinstances of CVS that are are related toa given software but on the other sidewe have to think about the ease ofscanning we need to be able to providenew um techniques and new best practicesto make it easy and more transparent tobuild containers that are easy toscan um and uh with that uh we thank youand uh if you want to check out uh ourtool is is there and uh I think we havecouple of minutes for for questions[Applause]okay questionsuh I think I think we can use this um myquestion is while doing this researchhave you ever seen cases where forexample the the file like therequirements.txt txt is saying one thingbut then actually in while doing withthe other scanning techniques isactually something else so is like asort of obuscation but it's saying thatit has certain file certain packages butactually has othersso the question is have you ever seencases where the content of thedependency file is different from thedownloaded package uh actually we haveseen cases of that uh we have seenmostly cases whereum the package the the therequirements.xt is updated in asubsequent layer so the the user loadstwo differentrequirements.xt or cases where the umthere is no version pinning so oneversion is basically the um the versionuh on the requirements.xt XT may bevulnerable but in the end the versionthat is installed is not vulnerable orthe opposite yeah that being said thankyou hi there thank you great talk by theway i loved it um I have a point ofunderstanding because I'm not veryfamiliar with SIFT um what's thedifference between the columns of SIFTand SIFT all because there were verystark differencesyes I got Yeah I can understand uh sohere sift all is the so sift is the onlytool from the tools we analyze thatallow to scan every layer independentlyrather than scanning the squashrepresentation of all of the layers sothat's why it perform significantlybetter than other so this is a goodrecommendation that uh we can say to umincrease uh the the performance of thetools and the only difference is that orcan also uh analyze theURL downloaded compared to sift all okaythank you very muchhello um thank you for the talk so youhad the table with vulnerabilities andpackages identified for each tool andthere was quite a wide difference inresults across tools um firstly did youcheck kind of accuracy or or differencesacross these tools and quality and thensecondly how does Orca fit into thistable does it identify packages andvulnerabilities or does it fix the sbomband then you would feed that to trivialor or something like thatmaybe I I can so there are multiplequestions here first question is aboutumuh the difference in packages andvulnerabilities across tools and thenthe question the second question isabout Orca how it performs okay uhreally quickly because we're running outof time so um with with respect topackages uh we didn't do this work ofactually uh understanding which is thesource of truth so we which one of thetools uh has a good representation ofthe of the content that's actually whatwe want to do research next so we wouldlike to find a benchmark of containersfor which we know they installedsoftware and seeuh how it compares with the across thethe the other tools basically uh becausewe don't have a clear idea of a contwhat is the content of a container imagein the first place so unfortunately Ihave no good answer on that but uh forthe second question about orca uh our toour idea was mainly to uh work on sbombsso the output of Orca is an sbombuh that then can be ingested by othertools then of course there are problemsalso there because there'sincompatibility between uh uh bomb sbombformats but we made it sure that it wascompatible at least with the sift sothat you you can use it with the with seyes okay awesome thanks you're welcomegreat talk thanks uh my question is on adeveloper point of view so uh as adeveloper some some of these offiscationI say oh I'm doing it in my dockers andI can do it do you have set ofguidelines so I can me and my team canprevent um doing this obuscation likeunintentionally on our dockers imagesokayum so the question is um do you have aset of guidelines uh well uh we haveideas on what to do but this requires acommunity effort so I can tell you uhdon't uh use multi-stage builds andeveryone here would say oh my god whatis he saying well there's like a balancebetween easy to scan and best practiceswhen it comes to sides so uh the idea isbasically uh do not delete anythingleave everything as is uh and all andthis is kind of a initial step but thenof course you will end up with gigabytesand gigabytes of images souh it kind of depends uh for sureum the idea would be to use our tool tofind out if something is wrong and thenstart from there and modify your dockerfile habits but we hope that in the nearfuture we will be able to publish someuh real guidelines on how to uh make iteasier for tools to scan containersthank you thankshi there um one understanding questionabout the Vietnami images so they cameout with quite few obuscations and yousaid that it was because of thecompression that they use so did theyobuscate the obuscation with that isthat what the outcome of that isso if I understood correctly thequestion is uh why is so that bit namthat bit nami contain container haveless uh obuscation well the the problemis that uh during their cicd processthey compress the content of the imageso the image has only one layer and wecannot reconstruct what happened to theimage basically so we cannot say whetheror not they have they're more obiscatedthan others because they just compresseverything in a single layer so we couldonly get information from the originalDocker file and that's it great thankshi thank you for the talk um justthinking about not using multi-stagebuilds i feel like that's a pretty heavycost that a lot of people aren't goingto want to swallow so is there maybe amiddle ground where when like we couldstandardize a technique where whenimages are built a some sort of metadatais added to the image on the way like asto what was in the image would that belike a better approach or is there likea a middle ground we can aim foryeah I think it will be a good practicebecause it's important to reduce thesize of the image also to reduce thesurface of attack but still by providingimportant metadata information aboutwhat is actually present in the image sothe objective would be to good a goodbalance between uh the transparency ofthe containers and and the same wayreduce as much as we can the size andvulnerabilities in the image2025-04-15 22:03:38.656788 QQ��z#��SAG24upbAXVd8hello everyone so welcome to this talkwhich is about analyzing the resilienceof software composition analyszisagainst the container imageopuscation so today with me there isYakopo um hello everyone i'm Yakopo i'ma PhD candidate at AL University and atthe same time I'm working as aresearcher at the Senam Institute in inParis uh my research revolves aroundcontainer and network security and myname is Agad BL i'm a research engineerworking at Tales in France and I'mespecially working on the security orvirtualiz networkenvironments so very brief advertisementbefore we dive into the technicalcontent so this project was partiallyfunded by a project which is named SEfor Y for sec it is a European projectso I put the link and QR code here ifyou're interested in what we are doinguh please check it out we are alwayslooking forcollaborations so now diving into thecontent of this talk which is obuscationin container in image why should we carein the first place so two years ago atCubeCon 2023 a talk introduced theconcept of what we call maliciouscompliance and container opuscation sostarting from a base container thepresenters progressively introduced somemodification��upports block volume onlysupport for file volume is out of scopefor this design but we can consider thatin the futureto add CPD support in Kubernetes we alsointroduced two Kubernetes snapshotmetadata gRPC APIs get metadataallocated and get metadatadelta the backup software will becalling this Kubernetes snapshotmetadata APIs to retrieve snapshotmetadatathis design is very different from otherKubernetes APIs we have designed so farin six storage we did this way to avoidoverloading the Kubernetes API server inthe worst case scenario well every blockis changed we could potentially get fivegigabytes of metadata per one terabytesper one terabytes of volume data so thatwould definitely um overwhelm theKubernetes API server if we choose touse uh the custom resource to store themetadata we implemented this Kubernetessnapshot metadata in the snapshotmetadata sidecar a GCservice this GC service calls CSI driverto retrieve the snapshot metadatawe do have a new snapshot metadataservice CRD it contains information onhow to connect to the GPC service a CSIdriver creates this CR advertise thatthe gRPC service is deployed togetherwith the CSI driver and the backupsoftware will be fetching the CR andtries to connect to the snapshotmetadata sidecar the gRPC servicenow let me talk about the backupapproach with volumesnapshots in a typical backup a backupsoftware creates a volume snapshot froma PVC that is part of a application ittries to back up then it creates a PVCfrom that volume snapshot launches adata mover pod and attach the PVC to thepod and then the data more part will tryto copy data to the backup storagerepository the backup software managesthe life cycle of the volumesnapshots a persistent volume can be auh volume mode that is either block orfile system that specifies how thisvolume will be consumed by containerit's either a formatted file system or ablock device usually it is moreefficient to consume the volume throughthe raw block mode compared to throughthe file system in in this talk we'regoing to be focusing on the block modeuse caseso um in this talk we are going to focuson the snapshot metadata accessoptimization we are going to um talkabout how this works currently withoutthis feature without CBT uh a backupsoftware will have to call vendorspecific APIs to retrieve snapshotmetadata for a full backup up it needsto query the allocated blocks forincremental backup it needs to query thechange blocks so that is highlyinefficient with this feature we aregoing to have Kubernetes snapshotmetadata API it provides a vendoragnostic way to retrieve the snapshotmetadata so now the backup software cancall the scan API to query allocated orchangedblocks this is a optional CSI feature soa CSI driver can implement this it is aalpha feature as mentioned earliernow let me hand it over to car who willtalk about the CSI driver deploymentthanks Shing so as Shing introduced thistopic right there are two APIs really uhthere two gRPC APIs there's the CSI APIwhich act the CSI driver will implementuh and that API is the one that actuallyprovides the raw data but Kubernetesapplications don't talk the rightlanguage for a CSI driver also the CSIAPI is uh is only has one client it'snot designed for multiple clients theCSI the Kubernetes domain clients arenot the CSI driver clients instead wehave a a sidecar which intercepts and uhprotects the CSI driver from Kubernetesclients it performs um all theauthorization arbback checks and nametranslation required to take CSI to takeKubernetes objects into the CSI namespace so a CSI driver will deploy thissidecar uh over a unique socket the sameunique socket they use today right forthe other side cars which they deploy uhthe sidecar will implement theKubernetes version of this API it'salmost identical to the CSI version withthe exception that it takes aauthorization token security token andit takes names of Kubernetes objectsso this is an optional feature not everyCSI driver will have it right and so howdoes an application figure out thisexists so when the CSI driver installsthis they are required to create� a CRnamed for the C CSI driver that CRcontains the endpoint information ofthis of the Kubernetes service namelythe sidecar um typically a CSI driverwill install a service object and theservice object will have information uhyou know the CS the name space of theCSI drive other things can be used tocreate a DNS name the CR is alsorequired to have a CA certificate and anaudience string this is for mutualauthentication right as you'll see inthe next few slides and uh so securityuh authenticationso I'm going to go over the workflow ofa backupapplication so the first thing a backupapplication would do would be todetermine whether this feature isavailable and it's a very simple thingit just looks for the CR based on thename of the CSI driver of course it getsthat name of the CSI driver from itseither the PV or the or the volumesnapshot right any of these objectsonce that's done the backup applicationhas to ask the Kubernetes API for an Otoken and this O token has to have theaudience string specified from the CRand we'll get to the reasons in the nextfewslides now at this point with the datafrom the CR we have the endpoint rightand we have the audience string so thebackup application is now in a positionto make the a call to the sidecar usingthe Kubernetes snapshot metadata gRPCAPI this is not going to the KubernetesAPI serveranymore at this point the gRPC call isoneway authentication right the backupapplication trusts that the sidecar iswhat it is but the sidecar knows nothingabout the backup application however inevery one of these RPC calls is the isthe Kubernetes authentication stringwe've obtained from the previous step sothe first thing the snapshot the sidecardoes is contact the Kubernetes API tovalidate that authentication string sonow we've reached mutualauthentications uh at this point thesnapshot uh sidecaruh will also look up the the names ofthe snapshots which are part of the APIcalls and um validate access throughsubject access review APIs and also pullin the metadata of that those volumes sothat it can translate the call into thelanguage of the CSI driver namely pullout the CSI ids the volume the snapshotids and the volume ids uh at this pointit just makes it proxies the callthrough the CSI snapshot metadata gRPCcall to the CSI driver csi driver younotice has nothing to do with theKubernetes side of things it's totallyshielded from this uh CSI driver doesits things and data is streamed backthrough the same gRPC call totallybypassing the API server so the the onlyactual contact with the CSI with the APIserver is to read the CR to do thesubject access review and and Ocalls all right to make this work rightyou have to have adequate securitypolicy i'm not going to go into muchdetail in this slide there's a lot ofstuff there but the most importantthings to note is that use of the tokenrequest token review and subject accessreviews APIs which are not normallytouched by backup software so the CSIdriver has to have permissions forsubject access review and token reviewto validate and check permissions andthe backup application when it installsitself must make sure that it hasadequate permission to perform a tokenreview a token requestmost of the other permissions exist justbecause they're backup applications andCSIdrivers so the work we've done whichincludes developing the sidecar is allin the Kubernetes CSI external snapshotmetadata repository there's a lot ofstuff there the it's got the gRPCprotobuff specifications stubs and mocksfor the Kubernetes snapshot metadatametadata API the CSI metadata um the CSIgRPC spec is in the CSI repo right youcan reference it from this repo but youknow most most application developerswon't need that the sidecar containerlogic is there uh as well as the CRD forthis uh for the snapshot metadata CRDuh we have utility packages to help youas an if you are an applicationdeveloper right help you write and usethis stuff we You know it's it'scomplicated to deal with so manydifferent with this workflow but we havean iterator package that does all thesteps for you if you if you're using Goif yo�u're not using Go then theprotobuffs and all the supporting stuffare available for whatever language youwant and there are other additionalresources to support CSI driverdevelopment there are tools we've addedthere which are used by CSI driversdevelopers u there are other resourcesaccessible the the Kubernetes CSIdevelopment developer documentation hasbeen updated to include this feature andtalk about u the change block trackingfeature it also has links to pretty mucheverything we're going through in theslides okay we'll go through ademonstration allright um sorry where are we yepuh I'll just talk a little bit aboutwhat I'm going to demonstrate and thenwe'll go through a video to show this soum it's a little bit of an eye chartright but you know uh we're going to usethe CSI host path driver the CSIhostpath driver is not a quote unquoteproduct ready driver it's a developmenttest tool which CSI developers use torun unit tests and other things um thiswas the first vehicle we used toimplement this this spec and test it outso u the CSI hostspot driver has a modeof deployment right where you can impleyou can attach this site the thesnapshot metadata sidecar we developedum it has because it's used for lots oftesting this driver only with blockboard block volume mode supports thisfeature but you know it's it's a minorissue real CSI drivers will support thisfeature for volume or file system modeso we have a very simple app which isgoing to create a PVC and mount it in ablock boat we will add data to that uhPVC and take a snapshot right then we'lluse this C this uh pod over here uhcreate an application called the CSIclient which is one of the tools weprovide in the snapshot uh repo sorryexternal snapshot metadata repo um andthat tool provides mechanis themechanism to actually use this iteratorand we'll work on things it's in a podby itself because typically backupapplications will be in different namespaces right uh to work um after we dothat we'll look at the allocated blocksthen we'll go back to the app and youknow cause a change add some more blocksto the to the volume and then snapshotit again and then once again use thetool to see the change blocksokay let me get tothe Allright okay I will move this around tohasters we've seen this already so okayso the first thing we're going to do iscreate this applicationsoit's just asimple app so we we use a CSI host pathu storage class it's part of the hostCSI host path driver create a verysimple app which is in this case is justa busy box and you know create the podwith the raw PBC uh since it's ademonstration we're actually just goingto create a few blocks using DD right ijust used an odd odd number of blockaddresses just so that we can view afinite amount of data here so we take afirst snapshot the volume snapshot classis part of the CSI hostpod driver rightsee it's ready to go and now we'll viewthe allocated blocks in the system so wetalked about the the CR so I just choosethis point to you know look at theactual CR which is installed you noticein the spec section there is an addressthere is an audience string and there'sa CA certificate which was all whichwere all provided by this CSI driver oninstallation okay that's criticalinformation it gives the it's needed bythis tool and the backup application toconnect to thatsidecard all right let's continueall right so now we're going to actuallylaunch that tool um this this launcherright it's um I'll show you the the YAMLa little bit later right it runs in aseparate pod and in this case we'veinvoked it with the name of that um PVCright sorry the name of the volumesnapshot and um the name and the namespace of the volume snapshot this oneit's telling us there are five blocksthere are five blocks and these are thebite offsets of the blocks uh the fiveblocks present and uh these are theallocated blocks the volume just as weexpect so movingon let's modify the volume and create asecondsnapshot so I just chose to add a coupleof even numbered blocks and changed oneof the odd number blocksso once again we take asnapshot wait for it to beready �and now we can view the changeblocks so the tool is invoked in prettymuch similar manner except that thistime uh we have two snapshots specifiedthe new snapshot and the previoussnapshot and as you can see it's showingus the changes only the last one itdoesn't tell you that you know it itjust tells you what has changed betweenthe these two it's not doesn't give youany history which one came first or theor the others just just the way thisis so moving onuh just to confirm right we view theallocated blocks again because we havegrown the volume slightly so we run thesame command just with the singlesnapshot and you'll see it's got sevenblocks as we've defined allright so a little segue into the thistool and the codebase so this is theKubernetes CS uh external snapshotmetadata repo so there is the examplesuh subdirectory there in which you'llfind the snapshot metadata lister tooland this one um so this tool has gotuses this iterative function it also hasthe pod YAML which for the for what wejust launched for that CSX client theYAML is it takes when you first run runit the YAML takes a little bit of timeto start because it actually extractsthe whole tool out there and builds itin your cluster and then copies the dataand that's done in an init container andthen it's ready to go with the compiletool waiting for youso if we look at the um at the coderight it uses this iterator packagewhich we provided in this repo and umit's called get snapshot metadata is theis the entry pointand if you look at the the tool it'smainly b driven by the arguments in thedata structure so you know we havearguments for the API clients theemitter callbacks right name spaces thesnapshot names previous snapshot nameand there are other things which are notgoing to go in the demo including thingslike what's the starting offset say yourapp your backup failed in some period oftime and you want to restart you couldyou if you knew where you were if you'vedone some checkpointing you could skipthings again and restart from someparticular point so there there are alot of options out there which are kindof useful for backup vendors the data isactually returned through a callbackinterface called the emitter interfaceover here anduh we'll seethat the iterator itself will handleevery aspect of that workflow so youknow they really don't have to mess withanything other than create your CSIclients uhultimately the iterator calls these twofunctions get allocated blocks or getchange blocks i mean basically that'sthat's the the two thing the two choicesyou have when you when you create theiterator you specify one snapshot orspecifytwookay i thinkthat okay that's the end ofthis let's go back tothis okayall right we got to see the code so uhthese slides have already been uploadedto SCED uh you can just download themright now because you can take picturesbut I want you to just download the PDFand start clicking on these things godirectly to the primary resources godirectly to the repo uh and then quitefrankly if you're done with the codethen it's time to talk to management orother influencers or your other peerengineers to help implement and adoptthis um we've got a bunch of other timeswhen we've presented these materials sohere's a whole bunch of links things tosend people to read at night and soon so if you're a backup vendor pleasecome and join us and we'll help youadopt this thing if you're a storagevendor please come adopt this thing andwe'll help you if you are the rest ofthe Kubernetes community please talk toyour backup vendors talk to your storagevendors and say "I want my change blocktracking." Nowuh it's really that simple because weare already talking to a number ofvendors and projects but we always needyour support in doing so not only thatwe need yourfeedback so please get involved come andjoin us at the data protection workinggroup uh we meet bi-weekly uh in themorning Pacific time uh but if you wantto reach out asynchronously to us inSlack now uh we'll respond as quickly aspossible and if you want we can set upmeetings but it's a friendly group uhamazing people we've already identifieda number of the contributors that havealready helped out we want tocontri amongst our ranks as well sothanks for your time follow up with usin any way we'll go on to Q&Anow please come to the microphone if youhave questionsuh yes obviously a lot of vendorsalready have snapshotting functionalityor running on Amazon or something likethat um will the Kubernetes snapshotsshow up as snapshots on the underlyingvendor so in AWS as an EBS snapshot willyou see them in both interfaces or willthey just appear in the Kubernetes worldoh if you're talking of Kubernetessnapshots right this spec operates onKubernetes snapshots it's up to the CSIdriver which created those snapshots tosupport the CBT spec but we actuallygive it give them the ids they have putin the Kubernetes objects give them backto the CSI driverso you don't and much of it answers yourquestion so you don't know yet or you'reexpecting that they will or at the endof the day there's still volumesnapshots we don't touch those all we'redoing is doing the difference to showthrough metadata which blocks havechanged effectively that's all the APIeffectively does so same snapshots samevendors just a different way to nowaccess and find out which blocks I needto retrieve to get me up to date does itmake sense sir i think so thank you okayI can add one more thing to that it'sthe CSI driver that generates thedifference for example you mentioned theEBS direct API well that comes from EBSso if uh AWS EBS was going to extendtheir CSS driver to support this theywould use their API under the covers tosource the information we do not look atthe P the volumes ourselves the sidecartalks to the CSI driver which does theefficient thing for its storage maybethat's closer to what you're looking forand uh the AWS EBS team has participatedin the data protection working groupthey're thinking about this but pleaserequest it and they will uh probablyadvance their agenda fasternext question please okay so this CBTrequires CSI driver vendors to open aservice generate some CI certificates sodo you have any materials how togenerate them how to rotate them becausethis is new to CSI driver vendors uh nowe leave that to the CSI driver vendorof how they do it every platform seemsto have different ways of doing thingsright so we specify what the we well wewe give a specification for example eventhe arbback policies we mentioned we saywhat has to be set up we don't provideany YAML to do it we leave it to thebackup app and the CSA driver to do itbecause every cluster is manageddifferently so we cannot get into thatspace okay thanksso um currently I'm we are usingsnapshots for nothing really becausethere's so much distrust in having asnapshotso like what exactly are snapshots goodfor like usually you would uh use PGbackup or MySQL backup for anythingrelevant and then there's like the stuffthat shouldn't have to have backups tobegin with like badly writtenapplications which store their state insome ubious file thingyso there's an excellent white paperthere was a talk just before this whichtalked about the range of of u the wholespectrum of backups and what you expectvolume snapshots right give you uh crashconsistency at the lowest level right ifyou want to go higher level toapplication I think what Dave called itapplication consistency then you dosomething like PG dump or you know orgenerated stuff there in between there'squing the apps if your app has multiplevolumes how do you do a group snapshotyou know there's there's a wholespectrum Sperum of of requirementsdepending on how complex your yourapplication is and how sophisticatedyour backup software is so this is youknow addressing the base level how to beefficient between twosnapshots at the end of the day you needprimitives on many levels of the stackin order to assure data protection weyou need this option no matter what andthe operators hopefully can call down tothis if they need it there are many waysto solve this problem this is afoundational one that we've neededforever okay thank you everybody2025-04-15 22:03:39.235118 ""��A�{#��9AU9qwxp7Uv08hello everyone thank you for coming toour session on the change block trackingmy name is Shing Yang uh I work forVMware by Broadcom i'm also a co-chairof Kubernetes 6 storage and the dataprotection working groupuh Rob can you help us with that no allright we'll we'll talk over them hi I'mMark Lobby i'm uh the open sourceproduct manager at VH Casten and I'mjoined by one of my colleagues he'lltell us about the other colleague justnow hi I'm KL Banza i'm one of thetechnical staff at BH uh I'm one of theco-authors of the CBT cap in Kubernetesuh Prasad and Ivan are two others prasadwas going to join us today butunfortunately couldn't make it um I'malso one of the authors of the CSIedition for this cap rightall right i'll just try to talk overhim all right today we're going to goover change block tracking itsmotivation history architectureimplementation with respect toKubernetes we'll demonstrate a workflowand then leave you with resources andhow to get involved we want you to adoptit and we'll answer any Q&A at theend withthat let's get started so change blocktracking is what everybody expects whenit comes to Kubernetes and cloudnativestorage and applications however it is alacking feature when we compare totraditional storage how do you do thedifference between any two volumesnapshots well you need to take anotherfull backup but everybody else intraditional VMs and bare metal worldalready has this so in in order to comedown on what we measure RPO RTO for allbackup and recovery and disasterrecovery we need to be able to get tochange block tracking in order to makesure that if the difference between anytwo volume snapshots is nothing well Idon't want to take a whole anotherbackup a whole another consumption ofstorage network compute and memory no Iwant a zero delta zero metadata pointersetc right so that is what change blocktracking brings and we've brought thisforward because everybody is coming toKubernetes from the VM world right nowthey want change block tracking becausethey expect it and you needed to go to aproprietary vendor's driver in order toget change block tracking or explain whyit doesn't work with CSI we've changedthat and this is a critical unblockerfor for Kubernetes adoption we createdkept 3314 to do this and we have nowcompletedimplementation so the history is that westarted this well over two years ago atthe data protection working group uh thedesign has gone through three majoriterations and we've had lots of fundefending improving going backwardsimproving again and defending over threemajor revisions and it's taken SIGstorage SIG O SIG uh architecture theAPI reviewers CSI reviewers storagevendors backup vendors the communitycontributions it takes a village quitefrankly to get to this point it's notbeen fun but we've arrived and we'revery excited to have arrived with animplementation in the CSI specificationand now we're ready and finished withthe implementation finished with theendtoend testing this is now ready toship with Kubernetes 133 alpha APIs sowe'll talk more about how to adopt itnext but uh that's where we areso I'm going to hand over to Shane toexplain our architecture andimplementationthanks Markso to support CBT in Kubernetes uh andin CSI we added a new snapshot metadataservice in the CSI spec we also addedtwo new CSI RPCs get metadata allocatedretrieves the metadata of the uhallocated blocks for a single snapshotget metadata delta retrieves themetadata for the change blocks of twosnapshots for the same block volume nowthis feature s��ctsand I've been involved with the Dapperproject with the K8 project that I'mshowing today and also with Captainsomething that I'm not showing today butit's also like a a project that it'svery close to to my heart uh I've beenrunning a Blog for 15 years so if youare interested check it there like salaoor salao at Twitter is my handle butlet's get it started let's take a lookinto some open source projects so thefirst thing that I will do which issomething that I usually don't isshowing a project that called openfunction how many people here knowsabout open function I see Zero handsraising and this is there is a very goodreason for that uh open function is acncf project uh but it's a Chinese Lecncf project that basically means thatit was created in a Chinese communityand donated to the cncf and it's a veryvery interesting project to analyzethat's why I wanted to show it todaymostly because when we're talking aboutserverless some people will think aboutlike function based you knowapplications and platforms and openfunction gives you exactly that so let'stake a quick look at how this looks ininpractice let me see if you can see myscreen here I'm connected to a cluster Ibasically install open function into akubernetes cluster this is running ongke on Google Cloud so it's not in mylaptop meaning that if the Wi-Fi goeswrong at some point I will just not beable to do anything we will try 4G orsomething else but it's it's kind ofworking right uh when I just basicallylist all the namespaces you can see thatI have a bunch of things I've justinstalled open function which is just asimple project and I have a bunch ofdifferent tools installed there just tobe able to provide this like functionexperience the idea here is to enabledevelopers just to write functions inany language that they want and thenjust by creating asingle kubernetes resource to be able toget that function running into thiskubernetes cluster right but in order toachieve that they actually integrated abunch of tools together in a very veryparticular way the main interface thatyou will see as a user it's somethingcalled function of course and I don'tknow if you can see this like in theback but I will try to make it as big aspossible everything is in a repostothere that it's called kcd Spain 2024 inmy GitHub reposter I will share the linklater but as you can see here let me seeif I can yeah as you can see here thereis just a new kubernetes resource calledfunction that's the thing that youdefine now in order to actually deployfunction then uh basically you write thefunction code in this case I have a youknow a go function which is prettypretty simple it's just printing a helloworld or something like that and thenyou define two things here in the specuh definition in the spec you knowtemplate here you define the build pHhow are you going to build this codeinto a container and then you define howare you going to serve this code forpeople to access this function right asyou can see here the build side ofthings you need to specify a builder inthis case you need to know that it'sgoal and this Builder I think it's usinga cncf project that it's called buildpacks to actually do the build and asyou can see here you start seen thiskind of stuff like okay if you're inChina maybe you cannot access the youknow go modules so you need a proxy forthat kind of like interesting stuff thathappens in the other side of the worlduh but another important part here isnot only how are you going to build thethe container but also where is thesource code and here I'm pointing to aGitHub repos story that has the functionsource code in this case again writtenin go under this directory so no matterwhere the cluster is it's going to beable to pick up the source code of thefunction build it and deploy it into thecluster itself and it's going to thenserve it here we don't have muchparameters here in general it's just uhthe name of the image that will becreated On Demand by the build processand then the fact that we will uh youknow we can configure the image here inany way we want if we want to set u�penvironment variables or whatever wejust set it up there then the port thatit's being used by default so with thisthing um I will just apply this to tothecluster uh so basically not that one uhthis one functions go function functionsgo function. jaml so I'm just applyingthis resource to the cluster andbasically what I have now is you know afunction that is going to be built anddeploy inside this cluster by using abunch of different tools uh if I do Ican do get functions now because this isa resource right like as any otherkubernetes resource and I can see kindlike the state of the function itselfright so it basically says it's beenbuil it's building now uh and at somepoint it will tell me if it's beenserved if it's ready to receive requestor not and I will get some kind ofaddress here uh after all this processhappens I did build this function beforeso I was expecting this to be builtagain things that happen when you'redoing kind of like live demos right butat the end of the day you will have aURL there that you can basicallyinteract with just call it it will besome sort of a public URL that you canaccess uh if you have configured yourcluster correctly you will have a publicyou know URL or like an internal clusterURL that other functions can use to callthisfunction one thing that I wanted to showhere uh which I think is important ifyou go to their website like openfunction. deev you will find thisdiagram uh which for me basically tellsme a lot of things right it tells methat to build an experience like thiswhere you basically expose an interfacethe function interface in this case thefunction resource and then you will havedifferent options to do different thingsdepending on what you're trying to doright if you are trying to build thecontainer based on the source code youhave a bunch of cncf and open sourceprojects that will help you to do thatbuildpacks is very like it's it's a it'sa complicated project it's not a simpleproject but the main idea is that youdon't even need to specify in whichlanguage your function is or your codeis uh buildpack will have a detectionmechanism to say you know this is a Javafunction or this is a go function andthen we'll create containers based onthat which is a pretty interestingmechanism and then the other ones areare kind like more typical things thatyou will use just to create containersuh the main idea there is just not to beplaying with Docker fast right likewhich are good they are easy to createbut they are also very easy to makemistakes uh too and then you go throughyou know the the facee of okay I have acontainer now I need to serve thisfunction in a way that you know it'sserverless in a sense right like and themain idea here is that I will be able toupscale the function if it's getting alot of requests or I will be able todownscale it to if it's nobody's usingit right and for that you know you cansee there that they are using a bunch ofprojects and again it's not just usingthe projects but it's combining themtogether to achieve some use cases sofor some specific use cases you need tocombine them in in in a very specificway you can see K native there you cansee kada you can see Dapper and at thebottom they added also like a wasmruntime too for for wasm modules andstuff like that which is kind ofinteresting and then on this side of thethe screen you can see events which isagain it's it's a way that they createdin order to to connect functionstogether via messaging or Eventing rightit's kind of like an interesting thingtoo uh that they created and again thisis where it becomes all about like isthis something that you really need likedo you need the functionality in thisway or are you willing to create yourown glue between the projects and Definethe use cases that you are trying toTarget in your company let's see and Iwill just not wait for too long let'ssee if this isrunning I don't know demo gos are on myside today so it seems to be runninghere um I don't know if you can see downthere but like basically what I havehere is a URL that it's not a public URLthis is basically an inter�nal URL andthis is where open function you actuallyneed to start understanding a little bitabout the tools that it's using in orderto use it because when things go wrongabstractions start like falling apartbecause if I need to troubleshoot orfigure out what's the public IP for thatfunction I need to know how it's beingserved in a way and I will just run acommand because I know how this is beingserved served and I know where thepublic IP is uh but yeah at the end ofthe day you don't really want to do thisso you need to make sure that whateveryou expose here it can be used right andI can call this function Now using justjust sending an HTP request saying kcdsSpain for example right as you can seethe function will not respond prettyfast because it wasn't upscaled yet sothere wasn't an instance of thatfunction running and it took like asecond to spin up a new replica you knowprocess the request and then it's nowwaiting for most more request to happenright if I call it again it will be muchfaster right I can list the pots hereyou will see that now I have uh let mesee if I can do that uh you will seelike I have that b running there andagain in the true serverless fashion inin after 90 seconds or so which is thetheault time for this to be downscale ifI do not send more requests this will bedownscale to zero right and again as Imentioned before uh there are some toolsthat are basically implementing thisbehavior and uh youknow it's actually up to you to Defineif the way that open function is uhgluing things together works for you oror not right open function gives likefor me it's a great initiative againthere are some you know language isbarriers that we need to tackle becauseagain all all the documentation it's alittle bit strange being translated fromChinese to English there is tons ofpossibilities there to contribute uh tomake things easier for non-chinesespeakers and and for you know forSpanish speakers as well uh it providesthis abstraction based on a functionconcept which is great and I think thatthey did a good job there at gluing sometools together but then of course likewhen I see companies like largecompanies creating their ownabstractions they probably havesomething they need something similar tothat but not 100% right uh I think thathaving an interface called functionactually limits the kind of applicationsthat you can deploy because you thinkthat you can only do functions likelambdas or stuff like that but actuallyno matter what container you put inthere it was going to be served and andbuilt in the same way and then you needto figure it out you know like if thetools that open functions is using areyou comfortable with those tools do youknow about those tools or not because atsome point when things go wrong you willneed to troubleshoot them so you willneed to maintain all that stuff togetherand I see people you know using GitHubactions to build their containers andthen using GTH ups to do the deploymentof things so having this tool thatbasically you send a resource and thenyou just get something built in thecluster and then deploy in the clustermight not be the right thing for you forseveral reasons so you know it's it's atrade-off right like I mean at least forme just seeing how people is gluing tooltogether uh these tools together it'sit's really important because it showsme a set of use cases that are welltested and well adopted but then tochoose this for real scenarios then youjust you just need to figure it out ifthat's the good thing for you or not uhand in order to go and talk a little bitabout the tools that are being used byopen function we created an example hereand this is where we the dangerous partof the presentation we are going to useHTTP and not https so if you scan thatQR code you will be able to access kindlike an application and I would love tosee if we can uh if we can play togetherfor a bit so let me know if you canaccess you will need to accept that it'san HTTP endpoint and not https you willsee the warningthere please accept it if if I have twopeople pressing buttons I think thatthat's more� thanenough so I see phones app uh I willtake the QR code in asecond does it work somebody got itworking yes I get some thumbs up so letme switch here and switch to theapplication itself so the applicationthat you're are accessing basically it'sum it's just a simple thing it's asimple dashboard uh to vote and we havekind of like the real time eventspopping up every time that somebody uhsubmits a vote and then we have somecalculated views of who is winning catsor dogs we have dogs people here in theroom so I'm I'm super happy thisapplication is really simple uh when youvoted you had like a like a hash codegenerated so don't close that windowlook at that that's going insane that'sawesome so don't close the windowbecause with that you can win uh a bookat the end of the presentation great sothe application is working again thatapplication is running on Google Cloudthe architecture of that application issuper super super simple right so wehave a a boat service that's the onethat you're using from your phones ithas a UI but also basically what it doeson the back end it just sends data toRed's to store your boat like you knowyou get the Json payload with the boatthen you store it and then at the sametime we send a rabbit mq message to uhjust to a queue that it's being pickedup by the Dashboard that you see wellall the cats and dogs are popping upthere in the middle here we have aworker that doesn't have any userinterface and and it's doing kind oflike a chrome job right like every twoseconds is picking up data from BR andtranslating that into a view to postgreSQL and that's been read by the resultsUI that it's basically showing every twoseconds you know the the percentage ofthe BS pretty simple stuff and as youcan quickly see here there is no likethere is not I when I see thisarchitecture I don't think aboutfunctions in general because I have userinterfaces I don't want to be waitingyou know to upscale a function to beable to show a user interface thatshould be up at least with this onereplica running all the time I have someinfrastructure like R rabid mq postgressthat needs to be running somewhere and Ineed to manage and then I have you knowsome uh web sockets going on back andforth again another reason why afunction might not be the best thing ifI need to keep this by directionalconnections open just to sendnotifications so how difficult it couldbe to build an application like this uhit's actually it's it's a little bithard but like the main problem that Iface here is that okay we need to createcontainers for all these things we needto connect to all this infrastructure weneed to make sure that somebody createsenvironments where this infrastructureis is ready for me to to develop and allthat's stuff so let's talk a littleabout the projects that are being usedby open function and the reasons whythey chose to use those projects thefirst one k native how many people haveheard about K native before okay so wehave half half the room or a little bitless so K native is usually associatedto serverless and to you knowautoscaling basically the main idea withK native is to be able to go fromcontainer right like I have a containerand I want to serve it in a URL so ifyou give me a container I will just giveyou a URL where this container is beingserved and I will manage as a k projectI will manage the life cycle of thatcontainer in a way that I can downscaleit to zero if nobody's calling it orupscale it if I'm getting a lot ofdemand uh so letme this is something that happens withKeynotes sometimes that you cannot getout of the presentation mode when youare presenting the screen which is superinteresting it's it's really interestingand and then you said okay what do I doone one way of dealing with this is likemaybethat but not really so I'm justconnected again and just praying that itwillwork well and it doesn't uh react to myuh to my clicks either right like it'sjust going forward which is superinteresting uh this happened to mebefore and I don't knowwhy so let's let's let's play with thedemo Gods this is not even demo Godsthis is like� keynote Gods right likethis is isinsane all right all right uh thinkthink what do you do when nothingworks turn it off and turn it on againright I've done this before let you gooho let's see let's see so basically nowit's un blocking I'm blocking my myscreen right this is superfun yeah there there that's that's thething that happens so what I can do is Iactually can continue without slideswhich was my originalidea but that will not work let me trylet me try like hard hard reset we canwait right so K native again it'sassociated with that upscaling and outscaling applications there it goesoff I will unplug this and then see ifit works yep so it's bootstrapping againagain K native is upscaling down scal inapplications how does it work itbasically works by injecting a sidecarinto yourapplication uh and basically monitoringincoming request HTTP request so youhave a proxy in front of yourapplication basically looking for HTTPrequest and it will uh basically extractdata about those requests and inform theK native control plan the things thatyou install in your cluster about howmany requests are you getting and if youneed to if you need more replicas orless replicas uh to serve all thoserequests it's it's a very sophisticatedmechanism and it's a very mature projectthat uh it has been tested a lot andit's been it's been adopted by a lot ofcompanies and a lot of products uh whichuh makes me think that it's matureenough for people to use and now it'sreaching to a point where uh you knowit's uh it's been graduated in the cncfwhich basically means that there isenough adoption to justify the use ofthese tools so no more keynote for me uhand I will probably need to authenticateto all the services again I don't evenknow what that is that's probably slackuh but the good thing is that we get mycomputer back again so I can keepshowing demos and stuff let me seeif documentsmaybe nah allright yeah there you go notthere that's that's really good okaylet's try keynote again why not whynot keynote case this pain let's see allright I was there so kened how does itwork do deploy a service uh they have anew kind of resource I don't even wantto go in full screen anymore but let'stry why not so you have a a thing hereyou have a new type of service basicallywhere again the only thing that youdefine is two things the image name thatyou want to deploy and basically K8 willgive you an name basically and K8 willgive you a URL with using that name andit will run your image it deploys a sidecar that called Q proxy that basicallywill run very close to your applicationand it will intercept all the requestsbut it will also create a bunch of otheryou know resources behind your back likeit doesn't really matter but it willcreate that route resource thatbasically is in charge of having a URLAssociated and then routing all thetraffic to the proxy of the applicationuh the main idea with K native is thatit has a networking layer that will giveyou access to doing other things liketraffic splitting and a bunch of othercrazy things that you will do withsomething like a service mesh but butthe basic functionality is that it'sgoing to Route the request to yourapplication and the Q proxy willbasically inform the G control planeabout the requests if you have a lot ofpeople calling your application you knowthe Q proxy will tell the control planehey we need more replicas more replicaswill be created and as you can see therewill be because there are multiplereplicas that basically means multiple Qproxies so there is something loadbalancing on top of this as well ifnobody's calling your route basicallywhat happens is like your service getdownscale after 90 seconds or so andbasically there is a route now there isa URL with no replicas serving requestswhich is a pretty interesting scenarioand I think that this is the mostinteresting scenario of the entire kthing is the okay what what do you dowhen you don't have anything servingyour requests well you need a differentmechanism that it's called activatorthat basically is waiting for request inorder to buffer them and then wa�it fornew replicas to be kickstarted when youdo that you wait for a second that's whywe waited at at the beginning because wedidn't have any replicas of the functionthe activator kicks in creates a newreplica and then you know you have areplica serving your uh your traffic uhthe main idea from K just to uh go backin track and let's go faster to the nextone is go from container to URL and Ididn't show that because I wasrestarting my computer but if I do herethe the ctive services that I have thisis my function K native service I candescribe the K native service type thisis a new type that basically providesall these abstractions to go fromcontainers to uh toURL you can see here that there is abunch of status reports about you knowhow the function is being configured andall that stuff but the only importantbit here is that if I have an image Ican just provide you an URL right youcan add like environment variables andother stuff but at the end of the dayyou want to run this image and then youwant a URL back so you can go thiscontainer uh important uh thing tomention about K native and and the Knative serving apis is that it's beingused by uh you know Google Cloud runGoogle Cloud run provides this containeras a service mechanism where you canjust give them a container and they willrun it without using kubernetes oranything and they expose the same API ofK native serving which is prettyinteresting to see you know it's justthe standard interface for container youknow container as service applicationsbut again this is just about servingtraffic right like it's about justdeploying applications and then justmaking them AOS Scale based on demandand only based on HTTP in this caseright like this is just using HTTPmetrics to know when to upscale and downscale uh there is another project beingmentioned there in in open functionwhich is called kada which is basicallythe other side of things when you dowhen you want to do a synchronous youknow upscaling and down Scale based onfor example messages on our rabbit mqqright so if you have messages uh inKafka or something like that that it'smore like a synchronous in nature youcan be monitoring these resources takinga look at how many how many messages doI have in the queue and then based onthe messages that we have in the queuewe can upscale ordownscale uh and again this is aboutrunning workloads it's not about what dowe do inside of our workloads and that'swhat I want to talk a little bit aboutDapper which is the project that I'mcurrently working on that stands fordistributed application runtime this ismuch more closer to developers andusually I see people looking into thisproject when they are working in aplatform team and they want to enabledevelopment teams with like a set ofunified API that they can use tointeract with complexenvironments so uh Dapper uh it's aproject that I'm using in the in thecats and dogs example that I've showedyou before and for that I willswitch I will not switch to the othercluster because it's uh closed nowthat's good right things that happen letme see uh it should besomewhere somewhere here new windowuh and with my company profile so I canshow you all the secrets restore thereyou go probably I need to log in againokay so I have kind like the clusterwhere the open function stuff wasinstalled we have a bunch of projectsinstalled in there and I created adifferent cluster for serving theapplication that we usedbefore and uh there you go so I canconnect to a different cluster now andI'm using some of these uh uh some ofthese you know tools that I mentionedbefore so we have ketive Services that'show I list my ketive services for theapplication if it actually connects tothe internet there you go and you cansee that I have three services that arebeing exposed the three uis that we sawbefore the ones that you use to vote theresults and the dashboard that wasshowing the cuts and the dogs rightthat's that's the the side of things soI'm using K there to scale things up anddown and as you will see and as Imentioned before the vot service forexample because is um uh des�cribe it's ait's it's a UI I don't want to downscaleit right so one thing that you need tounderstand about kinetic is that you canset up boundaries right there is no needto downscale to Z if you don't need itin this case I want to maintain onereplica all the time so I can do thatand also what you can do is you can inin can you can Define you know um how isthat called like how many request persecond your application is designed tohandle right so you can say you knowthis container can handle 100 requestsper second if I'm getting more than 100requests per seconds I will need morereplicas so I will need to scale up soyou can fine tune when new replicas willbe created using that something like acommon misconception is is about likethinking that you know one pod onecontainer will handle a single requestat a time and this is not the case youknow the container can't handle multiplerequests at the same time and you canDefine when to scale up and when toscaledown the next thing that I wanted toshow is Dapper uh because we weretalking about now let's go and try toenable developers to do stuff uh insideyour clusters and the kind of stuff thatyou want to enable developers to do isto basically interact with uh you knowthe environment that they are runningwith and in this case Dapper basicallyprovides uh an obstruction uh that uhgives developers apis to do things thatdistributed applications wants to doright so if you go to the Dapper websitewhich isway right like with resilient with retrymechanism secet Breakers and all thesethings uh and then like stuff that it'svery common for applications to do andthat's why I was showing the applicationlike sending and consuming messages mostof the application will want to submitand consume events right so if insteadof pushing developers to learn aboutgafka or rabit mq I just give them anAPI that they can call well let's givethem an API so they can start you knowcreating their applications faster ifthey want to store a state in a in a Ststate store like a key value store orlike a database well let's give them adatabase and let's not push them tolearn about drivers or where thedatabase is and all that stuff I willjust give you an API you can startbuilding your applications you have thencommon stuff like reading configurationsor secrets from the environment thatit's very envirment specific so you needto use bolt you need to use a cloudprovider service or what do you need touse instead of pushing developers tolearn again let's give them an API andlet's lets them building applications soif you see uh the application that I wasshowing before like with the cats andthe dogs I'm using Dapper here tobasically store the bots in red uhwithout pushing the developer to knowabout r at all so what I have here forexample I have a go service right so Ihave all the all the application modulesin different languages using differentlanguages on purpose just to connect tothe same Dapper apis from differentlanguages and I have some for example ifI want to save the vote in a persistentstore that I don't even know where it isor what it is I will just use a clientin this case to connect to the apis sowhat it does basically is just send thisrequest to Dapper Dapper will know whereto store it based on the environmentI've been working on some moreIntegrations how many Java people do wehave here spring boot stuff yeah okay soa fair a fair amount of people willunderstand a little bit more this thisexample so I've been working on this youknow Springwood integration where I canshow you uh you know more like insteadof just using an API using common springinterfaces like this is a spring datakey value template that you can use andalso the messaging template which ispretty similar to the Kafka template orto the rabbit mq template this case uhagain I'm uh storing bats by justcalling the key value template updatemethod here I don't even care where thatboat is being stored I just want tostore it somewhere or if I send if Iwant to send a message I'm using youknow the same messaging template sendand then I'm just sending my vatsomewh�ere right and it's going to besent and somebody else can consume itfrom there interesting enough like nowbecause we don't have gafka r or rabbitmq dependencies in my application myapplication now can be moved acrossenvironments even if I'm using forexample manage service provided servicesfor example for Google popsa formessaging or or or Amazon sqs I can justchange the Dapper configuration behindthe Dapper apis just to connect to thoseservices without changing myapplication so how does it work I wantto show you quickly some diagrams ifthis works because I'm now super scaredthat this will not work so how does itwork so you deploy your application youcan be using K native or you can beusing just the normal kubernetesdeployment then you annotate you knowyour application with a specificannotation saying this application wantsto consume the Dapper apis and theDapper control plane will basicallyinject a side car that exposes theseapis to your application these apis areexposed via HTTP and grpc so yourapplication can basically be sendingHTTP request to the daer side car or useone of the sdks like I was showing withgo or with Java just to connect and dothat connection this is a VI directionalconnection that basically means that youcan send data or receive data fromDapper right in this case uh when youconnect to the Dapper side card Dapperwill read to some environmentconfigurations about how these apis youknow if I want to store state or sendmessages are implemented with realinfrastructure so what I have in myenvironment is basically configured thatfor messaging I want to use rabit mq andfor storing State I want to use use Rright so the D side card will connect tothis infrastructure and it will allow meto you know uh run my applications inthe cluster and because now I haveunified interfaces that basically meansthat again I can just move myinfrastructure to Google cloud managservices and my my application will notchange I just only need to change theDapper configuration for how these apisare implemented the same with AWS I canjust switch to AWS or I can switch tousher and I will have differentimplementations for all theseenvironments making my application youknow a little bit more stable andportable across environments it's notthat you're going to be moving yourapplications from one cloud cloudprovider to another but think about likerunning your application locally withlocal infrastructure and then movinginto to the cloud with manag Servicesthe API is the same and because we haveside cars every time that you have sidecars you can do a bunch of things as Imentioned before service to Serviceinvocations uh you can have you knowDiscovery but you can also have retriescircuit breakers and a bunch of otherthings like for example security you canadd mtls connection between the two sidecars and all the communications now willbe handled by that because you areobserving traffic between infrastructureand other services then you can get likea unified view of logs metrics and andtraces about how your applications areconsuming infrastructure how much timeinfrastructure is taking and exactlywhat your applications are doing againstthat infrastructure so you can troubleshoot easier because you have a singleview of what your applications aredoing from uh upper point of view againI showed you a bunch of things usuallythis project will not make any senseunless you go and give it a try uh ifyou take a look at the book it has abunch of reposto that are basicallyshowing all these projects in action soyou can run them in your own laptop andexperiment with an example applicationnot similar to the one that I'm showinghere but like it's close enough uh andagain this project makes sense when youtry to enable developers to just do morestuff without learning about complexenvironments or learning about cloudprovider stuffuh this does separate you know theinfrastructure from your applicationcode and I didn't show that but if I gohere to the application uh I can listall the daper components it's anotherkubernetes resourcecomponents another kubernetes resourcet�hat basically is defined there Imentioned before that I'm using R postSQL and rabit mq so I have two what arecalled State Stores configured to uhconnect to post SQL and to r i candescribe one of theseuh and you will see that it's very basicconfiguration uh which has like twothings that are important like typethat's the implementation type so if youwant to move this to Google you know inmemory database store in memory datastore or something like that you justneed to change this type and then justprovide the credentials here so it canconnect to to that service in yourGoogle Cloud account right uh and thenit has a bunch of other parameters butthose are actually not important this ishow you index data here maybe you wantto add some hints on how to index dataand I have the same for uh for rabid mqagain I have a rabid mq instance runninghere so if I I didn't show you that butlike I have a rabit mq instance a postinstance and a r instance running herein the cluster that's red uh I thinkthat's postgress and that's rabid mqright so the only thing that I'm doingis I'm telling thater hey you know thesecomponents are here just connect to thelocal instances and if actually I wantto start moving my the services toGoogle Cloud because my application isgoing running on Google Cloud I can justneed I just need to change thoseconfigurations there just for uhcompletion let me show you uh componentsthe Rabid mq1 so you see that it'sbasically connecting to thatinstance again the only thing that Iwanted to show here is the the typebasically that basically tells you youknow this pu up API of sending andconsuming messages is implemented byrabid mq from the application point ofview I don't care but from theinfrastructure point of view I need toconnect to Rabbit mq so I need to knowwhere you know the rabbit mq host nameis and the user and the password toconnect to it right that's just prettymuch what you need to know as like fromfrom an infrastructure side of thingsagain because it's a kubernetes resourceyou can start defining environments ingit right like you have all thesedefinitions on how to connect toinfrastructure defining git and then usesomething like Argo CD to replicate thisenvironment somewhereelse and I think that's kind of likewhat I wanted to share about Dapperagain trying to share some some kind ofthe main points and and some examplesquickly so you can go and research moreif you're interested in that spacespecifically uh to finally arrive tosomething that open function is notusing but I think that it's alsoimportant to mention because uh I'vebeen talking about clusters I've beentalking about infrastructure likedatabases message Brokers and also likedeploying the application uh but Ihaven't been talking about creating andprovisioning infrastructure if you'reworking with any kind of applicationlike the one that I'm showing that isrunning on a kubernetes cluster you willneed to to manage and create Cloudresources including kubernetes clustersright who created the cluster that I'musing well you probably will be tryingto automate that using a tool likecrossplane or terraform or pumi are thethe ones in the space I'm usuallyconcentrated on crossplane becausecrossplane extends kubernetes and Ithink that that's a very importantdifference compared with all the othercompetitors so for certain use casesthat makes a lot of sense so again uh wehad our application running in a clusterwe had G native and Dapper installedthere and we have a bunch ofinfrastructure so who created all thatstuff somebody needs to create andconfigure these environments in a waythat it's reproducible but also simpleto use so a platform engineer can usesomething like crossplay that issomething that you will install in yourclusters in order to provision Cloudresources or resources in general uhit's interesting because crossplane isnot is not using kubernetes as mostpeople will use it like most of us willuse it just to run workloads withcrossplane you basically installcrossplane in a cluster where you areprobably not going to run yourapplications you will have a manageme�ntcluster that is in charge ofprovisioning resources but not runningyour workloads your applications will benot running in that cluster at least init you know you shouldn't do that Ithink uh the main reason uh why is thatis because this cluster will havecredentials to connect to Cloudproviders and provision resources onyour behalf right so what I did for mycluster is basically I installedcrossplane in a cluster then I installedthe gcp provider so actually crossplanecan provision new resources for me andhow does it do that it basically extendkubernetes with custom types for cloudproviders right so you can havesomething for example like the clusterresource or the database resource or themessage broker resource for Google Cloudthen you create a jaml file thatbasically represents the resource thatyou want to create you send it tocrossplane crossplane will create thisresources for you and it will keepmonitoring these resources to see ifthey are created and running all thetime and it will report status back ifyou go and delete the resourcekubernetes will create it again in thesame way that it will create again a podif you delet it when you're running yourapplications so it's a it's a a littlebit different model and it also opensthe door for more like multicloud orcross Cloud uh kind of scenarios whereyou can have multiple providers it's notthat you will only have your Googlecloud provider you can have Google cloudAWS Asher and then send resourcesprocess and create stuff acrossdifferent accounts which is kind of likeinteresting uh I wanted to show youquickly I think again I don't know ifthis will work but let's give it a tryso I have cross plan installed here uhand I have the gcp uh provider installedyou can get theproviders and you will see that yeah Ihave a provider gcp SQL that basicallymeans that I have a bunch of resourcesrelated to databases that are installedhere I have here uh the directory herewith just a simple Jamal file whichbasically describe a database instancefor Google Cloud right so this resourcebasically allows me to say hey I want tocreate a new database in my Google Cloudaccount and I want to send theseparameters you know I want the postgressul5 to be created and I also want tohave a way to connect to it so I needthe URL for that uh database and theuser and password that I will need toconnect the kind of database how big itis is in which region all the stuff thatyou will need to Define when you arecreating a database right and because Ihave crossplane installed in my clusterbasically I can just apply this I thinkthat I need to create this is using someweird stuff for generating names so youcannot apply it you need to create so itwill just generate the random name andbasically what I do is just send thisrequest to the cluster and thenbasically crossplane will provision anew database in my Google Cloud accountin a way what crossplane is doing herehere uh and I know I think that Ijust I have the screensomewhere here so if I go to my GoogleCloudaccount uh and if if this is working uhwhich is my not but let's check so if Igo to Google cloudsqlwhat I'm doing with crossplaying isbasically yeah here there you go so umI'm just codifying a way to createresources into resources into kubernetesinto kubernetes resources so I create akubernetes resource uh which basicallyallows me to enter pretty much the sameinformation that I will be able to youknow fill here into a form in order tocreate this database but what crossplaneis doing is automating all the processof creating and monitoring theseresources which is pretty pretty usefulin general if you are building complexinfrastructure right when I'm creating adatabase usually I don't want thedatabase I want the database I want thenetwork I want probably some keys I wanta bunch of different things in order tobe able to access that probably I wantlike a like a role somewhere so I canhave access to write tables and do stufflike that so usually it's not that Iwant to just create a single thing Ijust want to create a group of thingsand they need to be uh linked togetherand th�at's where I think that crossplaneactually nailed it with their their uhdefinitions and their mechanisms uh andthey created something that it's calledcomposition so if you go to theirwebsite crossplane doio and look for forcompositions basically what they allowyou to do is to create you know thiscomposable resources when you can say Iwant a CL cluster a database and amessage broker and that will be myenvironment and now environment becomesa kubernetes resource that only exposesthe thing that I I'm interested indefining when I'm creating a newenvironment for example if I'm adeveloper and I want to test myapplication the one that we deployedmaybe I just want to say I just wantlike a development environment so I canjust send a single resource sayingdevelopment environment and you get allthe things like a cluster with K nativeinstall Dapper and R and rabit mq allrunning for you to deploy yourapplication I think that mechanisms likethis that allows you to createinterfaces and create abstractionswithout actually creating a controllerare extremely powerful and I think thatthat's the main reason why crossplane isso popular nowadays into the cncf spaceI strongly recommend you to check it outagain in the book there are like a bunchof tutorials where you just will go anduse crossplay in different ways tocreate abstractions and to deploy theseabstractions in in a local cluster thatuh will give you a different Insightthat when you see it in a presentationit kind of makes sense in a way maybeyou're using terraform maybe you'reusing some of the tools but when youactually have some hands-on experiencewith the tool you actually realize howpowerful this is and how other use casesyou can uh tackle with this and at theend of the day I've seen many manypeople using terraform for provisioningsome stuff and then going into crosslingfor doing the management of theseresources that are a little bit moremore sensitive or more uh where you needjust to fine tune uh some things thatyou can do with terraform but it's muchmore easy to managehere so crossplane again basic stuff youcreate infrastructure using kubernetesresources more Advance stuff you createhigher level abstractions that you canuse to expose your interfaces that aremore specific to your domain lookinginto compositions and compositionfunctions is a a key thing on theproject and they have been doing greatwith that and I've seen a lot ofcompanies just picking crossplay forthosecapabilities uh and the other thing thatyou can do with this is now because youhave some kind of like packagingmechanism to say all these jaml and allthese compositions and all theseabstractions can be packaged ascontainers so I can just packageeverything as an oci image share it withanother team that can install it intheir crossplay installation and startusing these resources make it even moreattractive for platform teams to shareknowledge and share these obstructionsacross different installations so Istrongly recommend to check that out I'mrunning out of time now uh but I wantedto mention this because I think thatthis is this is super super important ifyou're building platforms don't buildthe platforms on your own just gooutside and check what other companiesare doing I mentioned K serving here andI needed to mention this because uhagain it's it's a very mature projectthat does one thing and does it well dothe AOS scaling of HTTP uh basedworkloads it's been adopted by companieslike red in their open shift suite andBMW tansu they both bundle K in therebecause again it's it's mature and it'sbeing used a lot for those particularuse cases and as I mentioned beforeGoogle Cloud run runs the same apis forfor their containers as Service uhService uh where the main idea there isjust to simplify you know the experienceof you know running containers withoutneeding to know or needing to createclusters that basically cost a lot ofmoney and you need to maintain them socheck that out the case of that Imentioned uh you know the this unifiedinterfaces for interacting withinfrastructure and building distributedapplications I need to mention thatDapper was created by Microsoft in 2019and it's running as part of containerapps Asher container apps so if you'rerunning containers on Asher you alreadyhave the Dapper apis already there theyrunning Dapper in as a manag servicethere and again if you're are using aureyou just are half way to be able to usethis across across teams and in the thecompany I'm working on right now Iagreed we offer two products formanaging you know daa at scale whenyou're not running an Asher and you'rerunning daera yourself and uh Catalystwhich is basically okay a multi uh Cloudapproach for running these apis withoutpushing you to manage this theseservices and finally crossplane it'sagain it's uh cross plan is been uh youknow has been created by abbound thecompany which basically offers you knowmanaged services on top of crossplanethat you can run on their cloud or youcan run on your own accounts in your ownclouds which makes a lot of sense forfor for a lot of companies I've seenvery very large companies running theseServices inside their compute becausemost of the time you don't want todelegate you know this to an externalservice like for example terraformCloud again uh building platforms ishard I wanted to show open function atthe beginning because it's a verycomplicated project that does a lot foryou but when things go wrong you need tolearn about all the tools that they areusing I wanted to show like each of theindividual tools that I've seen beingused by platform teams when they aretrying to to simplify their operationsuh for development teams or for uh youknow operationteams uh and I keep saying this becauseI think that it's the most importanttopic you need to build your ownabstractions and you need to focusreally really hard on what to expose andwhat not to expose to your teams if youwant to simplify their their job youjust need to make sure that you havethat right uh and again platforms arenot about adopting technology just forthe sake of adopting technology just tryto measure you know try to measure ifyou are simplifying the life of users ornot and in order to just actually dothat you need to prioritize what are thechallenges that they are facing soadding crossplane because you likecrossplane will not make sense unlessthey are having kind like an issue withprovisioning infrastructure and againdon't do it alone just there are tons ofcompanies building uh products out thereuh and most of these are related to opensource projects so they are more thanhappy to engage with companiesimplementing these Solutions and helpthem to integrate you know uh withdifferent tools and the entire ecosystemso thank you very much uh there is adiscount code there if you're interestedin the book and I will do that nowbecause I have one more minute so let'sdo that so I have two copies of the bookand I if you voted at thebeginning and if the window it's ofcourse it's not open anymore becauseit's nothere but I'm using kubernetes so Ishould be able to get this sorted outright uh it should be here there's anURL somewhereuh and let's see if this works I justnot trusting my computer anymore fromnow on uh so if you still have the youcan see that that's working I know thatlike dogs win I think uh but drum rolllet's see if we can find a winner fromthe voters check your hash code ifsomebody has that hash code that I justclick and delet it there you go thathash code if somebody has that one youget abook do we have it okay yes okay we haveone winner that's the the cats winnerlet's do the dogs winner and my computerdoesn't have the mouse anymore there yougo dogs winner drumrolls you cannot win two books that'sthe only thing that I'm saying doessomebody has that one hey there you gocongratulations guys thank you very muchfor your attention and apologies for mycomputer freezing I cannot do anythingabout it right the clust are workingwhich makes me happy so thankyou wego any questions uh or things pleaseplease feel free to come and and askdirectly right I will be around todayand tomorrow so I think that's it thankyou there you go2025-04-15 22:17:42.107945 ��9�|#��)AhbV-C2YzaIcfolks thank you very much for joiningthis session about seress platform onkubernetes uh we will be uh doing abunch of things here uh talking aboutkubernetes mostly uh so let's start withthe basics kubernetes users how manypeople using kubernetes already allright that's that's good that's fairenough most of the time because this iskcd Spain uh it's you know it's aboutkubernetes so I will not be goingthrough the basics but if you have anyquestions please feel free to shout outand we can stop and spend a little bitmore time there I will be showing abunch of different projects and I haveonly 15 minutes uh so it's uh you knowit's going to be fast and the main ideahere is just to show things and then uhyou can go and research more you can askquestions or you can ask for moreclarifications but I wanted to show abunch of different things um importantuh when we talk about platformsbasically we are talking about combininga different uh set of projects to kindlike achieve something to enabledifferent teams and what we are going tobe talking specifically today is aboutserverless on top of kubernetes which isa specific thing that some people mightwant and some people basically they arenot interested so it's not for everyonebut I think that it has some advantagesthat you might consider at some point somy name is Maurice salatino I work for acompany it's called Uh diagrid I will betalking about the project that we'reworking on a little bit and also theproducts too not too much just tomention kind like the the ideas BehindThese and other companies and otherprojects I'm so sofware engineer I'm nota developer advocate so my presentationstends to be not funny and very technicaland and you know and Technical orientedso it's it will be just very dry way ofshowing different projects uh I'm also acncf Ambassador and I think that'sinteresting uh mostly because I've beenjust working with different projects andI've been talking about this for forsome time now and I author this thisbook I have two copies we will beplaying with the Wi-Fi in a bit uh justto see if you can get one of the twocopies that I have here uh for free butif you're interested in in platforms andkubernetes this this book basicallycovers like I think that like 17different projects in the cncf space andsome projects that are not in the cncftoo uh and as I mentioned before I'mcollaborating with different proje��ntle with thearea of the project this project is thispresentation is not designed to blameany open source project in the oppositeI'm a big fan of the Flyn bit CommunityI love the found the founders of calpiauh I am also very involved in the openTelemetry Community as well so uh so theidea here is more about helping thecommunity on picking the right agentsregard related to the need right umagain when I designed I delivered atalking cubec con uh and I did TheBenchmark with f2.0 and At cubec Con they released 3.0and Isay so I updated the the test and Irerun the test again uh so this thiscontent is uh updated related to theversion three that is currently outthere and also I uh for those who'susing Vector so here I'm not doinganything about Vector but on my YouTubechannel I dida video about vector and I included sometests to see how Vector behaves betweenthose two projectsandsurprisingly yeah Vector is not so fastas it's been announced on thewebsite all right so the idea here is Idon't know if you've been playing videogames when you were teenager or stillvideo okay so there's a game calledStreet observ Fighter it was verydifficult to get that game when I was akid but I had a chance to get it so uhthe idea here is we're going to pick twotwo players Two fighters and Iintentionally picked for you those twofighters so on the right corner we haveRyu from the fbit dojo so very wellknown and very popular in the loggingspace uh he's been trained by his bigbrother FL d uh built in Ruby did a lotof great things but then they decided tobuild a c agent very more efficient veryquick so a very good fighter and thenrecently in with 2.0 they introduce thesupport of metrics and the support oftraces so flim bid is extending andproviding it's not only a log agent forwater that was named in the past it'snot a Telemetry agent so this is goingto be uh the fighter from the fluentdojo and on the left corner we have KenKen come from the open t Community uh ithas lot of things and one of thecomponent that is is heavily used by ourby the industry is the open Telemetrycollector initially designed uh with fortraces in fact because traces were thefirst signal so they were providing mostof the feature related to traces so thenof course uh after the release of openTelemetry metrics The Collectorsupported metrics and now with thesupport of logs of course the supportlogs and in November or December weshould expect to have some features forprofiling because Prof profiling is justout of the door keep watching it becauseit's going to be amazing all right sowhat are you going to learn or get outof this session so before so we going todo some tests going to look at theresults but before we we compare theactual resource behavior of the agentit's really important to understand theexperience design so when you'redesigning those pipeline becauseremember those agents rely onconfiguration files and you will have togo through the process of building apipeline so there's obviously differentuh the design experience so we're goingto walk through the experience and thenwe're going to look specifically on theplugins that we may need or we mayusually need when we deal with Logs withmetrics and traces so we're going tocompare uh the each agents on each ofthose signals and once we have a clearunderstanding we can jump into theresults and figure out who is the bestfighter uh for the Telemetry aspect andyou will see that we will also give somefew uh few recommendations uh during thetalk are youready yes ah come on okay so roundonedesign all right round one design solike I mentioned the those two agentswhen you launch them either in the bareAl environments or incin is going to bethrough config map you're going to loada pipeline file and that pipeline fileis holding the entire logic of what theagent is going to actually achieve so inthis in the in the the those two agentsthe the terminology has is a bitdifferent so if you look at FL bit youwill first Define what you going toreceive and this is going to be namedinput plug-in and then of course thoseagents usually they are not the�y aredesigned for specific purposes they aregateways usually you it's you send logsor traces metrics whatever to thoseagents and then they pass it to anotherdestination but not only what we want todo usually you get some raw data whichis not indexed so we're going to parseand add or enrich the data to be able toadd more context to the Telemetry dataso then when it ends up in our back endthen the data data is well indexed byMagic so the also the other thing wewant to do is to drop things because ifit's empty records uh it doesn't M itdoesn't make sense to send an mty recordto any back end of of the world becauseat the end you're going to pay for thestorage of an Mt record so there are afew things we're going to do so that'swhy in both uh in fin bit when you aregoing to process the data they have twodistinct plugins one is called first theparser because it's initially for logsremember and then we're going tofilter and last once we have finishedour job we want to send it to our backend and it's going to be named outputplugins in The Collector the terminologyis a bit different it's receiversobviously I'm receiving something thenI'm going to process I'm going to haveprocessor plugins and last once I'vedone the job I can export with exportersand then there is a connector I willexplain later what's the connector oncewe go uh deeper on this ations so the sothat's why when we when you deal withthose agents there are always pluginsand those plugins will always end up inthose buckets so either input plugins uhfilter plugins processorswhatever in flb2.0 they introduced um a pretty amazingfeature but it was quite um surprisingfor those who's been designing flim bitpipelines they introduced this thenotion of processors so what does thatmeans it means that when I'm receivingsomething I at the same thread when I'mreceiving I can start doing things andthat's it's so processor doesn't meanprocessors uh with the open Telemetryworld it's that another concept is I'mreceiving and instead of waiting to thenext thread of the filter filter orparser to do things I'm going to utilizethe same thread and do someTransformations so then I'm moreefficient at the yet and that's why yousee here I can start input and when I'mreceiving already and the same threat Ican do already some filtering so at theend you're pre massaging the data andit's a way of of being uh more alignedwith the multi-thread uh architectureand and because it's C language basedit's superefficient so that's that's the majordifference and you'll see that uh whenit comes to metrics and traces uh thispiece in the middle doesn't make senseit's going to be mainly from theprocessorperspective so now if you look at thepipelineperspective uh and here this is reallyrelated to FL bit 3 because it was nottrue with FL bit 2 so if we just startwith a collector if I receive metricstraces the logs then I can process themI have all the right plugins to dowhatever I want with the data and then Ican export so that's for true for theopen collector INF FL bit when it comesto logs you have the old architecture Iwould say where I receive receiving somelogs and I will I can have someprocessors by the way but I will do someparsing some filtering and then I willgo to output and maybe on the output I'mgoing to do some final stuff with theprocessor but when it comes to metricsand TR and traces there is no filterthere's no um there's no filter andpartial plug-in of first partial ismainly for logs so at the end everythingwill be done by processors so when youreceive you do those things and then youpass it to the exporter and then beforeyou send it to I don't know de danttrace or NE Relic uh you do the finalmodification and you send it to to thosebackends so when designing thosepipelines obviously you need a logic soyou don't have the if then else approachum in those pipelines there is aspecific structure and fbit has sincedecades um this logic of tag and tag wasvery very powerful if you understood theconcept of tags so the let's take anexample so here on the left I'm going toreceive some data fromcuet and on t�he right I'm reading logsfrom kubet files so when I'm receivingthose logs so I know the source I'mgoing to put a tag on that which meanson the right on the logs I will havecubet tag and on the onthe uh on the yeah on the left you willhave a cuet tag and then when I do thepipeline I don't need to put any if thenelse when I do some transformation I sayoh this is going to match only the datathat has that tag so then you don't havethe if then else approach you just domatch kubernetes will go through here soit means that when you are receivingflim bit is adding an extra attribute inthe data and that the attribute we useto to figure out what you going to dowith the data how you going to modifythe data so at the end with a simplenotion of tag you can dosuper complex pipelines but in the verysmall files you don't need thousands oflines just with the tags you can doamazing stuff in the collector thisnotion of tag doesn't exist to be clearso there is you can do as many pipelineas youwant and the the major thing is that uhto me make this this approach of oh thisis a a log fly fromuh from I don't know Cube system or fromCube proxy and I will do it in I want todeal it differently the I don't have theoptions to do that so the only way to dothis is using a connector a connector inin The Collector pipeline acts like areceiver and an exporter so it's goingto I start with a receiver I do someprocessor and then I'm going to send itto the routing plug-in and the routingplugin will say okay if the content hasthis then I'm going to trigger thispipeline if the content has this I'mgoing to trigger this pipeline so it'slike a switch so then the other pipelinehas as an input plugin will have the uhrouting connector has a receiver and atthe end with the this notion of routingPro uh connector you can do amazingpipelines and very very very very verycomplex pipelines but if you just lookat the number of lines to do the samething that you did in in FL bits you endup with five or six times more linesjust to do the samething in the structure of the pipelineum the flim bit has this structurecalled I think it's tumml the format ofTL plugin so the structure is that youin um you you have like a j bracket hereupper bracket with input so you'd say ohthis is an input plugin and then in thename of the plugin in the in the insideof that that section you say name andhere you name which plugin you want touse so here I'm using tail I don't ifyou see it and then after after that I'musing a filter and it's going to be thecubar plugin so this is how youstructure and that P that is basicallyalmost an Heritage of D that was prettymuch the same structure and since flim b2 I was so ex so happy because theychanged it to that yaml and I know thatall the room here you are pretty muchyaml lover so I know that it theyfulfill your need you fulfill your yourneed and and you're very happy aboutthat so the difference is that here youhave a you don't you you you will youhave a structured file so it's differentand then here you can see that sincefrom bits here in the section you haveprocessors so this is where I'mreceiving uh in the tail plugging somestuff and in the same thread I'm I havesome specific processor and you cantrigger them on that logic so at the endif you look at that pipeline it's prettymuch sequential so you receive and thenI go to the next step and then I go tothe next step so it's a very sequentialapproach while The Collector thepipeline structure is a bit differentit's more like a declarative approach Ihave some input so I list them withouthaving any logic so I will use the filelog receiver I will use uh uh thePrometheus receiver I will use OTPreceiver so there's no connectionsbetween your declaration you just listand configure your plugins one by oneyou do the same thing for processors soyou configure them one by one and thenyou do the same thing for the exportersyou config one by one and so on and atthe bottom section you have theseservice sections and this is whereyou're going to actually Define yourpipelines and the logic so here in thispipeline you can I �don't if you see itwell but you have you start with memorylimiter in the processor then you go tocommu attribute processor and then thebatch processor and then we jump intothe exporters so it's basically thelogic is at the bottom and you can haveas many pipeline as you want by defaultyou have logs traces and metrics but youcan have a log slash I don't know atests lock slash whatever so you canhave as many locks pipeline as you wantthere's nolimitations so I would say that for thedesign perspective I I have a preferencefor flim bit because because it's morelightweight and because you're when youdesign it you already have this thislogic but again it doesn't matter uhboth full FS our needs we can achievevery complicated structure pipelines soI would say that in this round of thedesign uh there's no clear winner Ithink they're pretty much equal so let'sgo to the next round roundtwo oh come on login ah all right sologging so what do you expect when youdeal with logs well when I deal withlogs I need to receive the commonprotocols where we going to exchangelogs and for that usually they arepretty much standards so I will maybesend logs from TCP UDP CIS logs uh ofcourse open Telemetry because it's anopen standard I will probably have logsgoing through open Telemetry uh I canalso usually go send logs from a cfat qso I may I may also uh use the cfatplugin uh and also something else fluenthas been out there and they have theirown flu protocol called the fluent forwater so fluent is also going to be anan a very important plugins for any typeof log agents of the market and ofcourse reading a file I mean what do youexpect so from a bare metal environmentand from accumul environment and if youlook both plugins The Collector and FLbits of course obviously the names ofthe plugins are slightly different wedon't care but they have them of courseif you look technically on the open tercont Breo you can say that oh TheCollector has 20 three plugins wow yeahbut do I really need them if I have thestandard ones I don't care I mean ofcourse they have specific plugins forspecific Cloud providers but at the endif you use the Open Standards you don'tneed thoseplugs then when you deal with a blogs Imentioned we're going to modify andyou're going to treat the data so youneed to do specific actions so what typeof actions you you expect from an agentsdealing with logs of course I need to beable to enrich the logs and by doingthis I will probably going to need someparsers so be able to pars logs extractsome extra Fields added to the logcontent um if I'm using uh collectinglogs from kubernetes I may want to addthe cunes metadata because it's a sorich information I need to contextualizemy Logs with my traces and Metric so ifI need to do that um I need to dropbecause if there's logs that is empty ordoesn't make sense it's going to costfor nothing so I'm need to drop that andalso any API of the market maret whichmeans any open observ backend of themaret has an API to send data and theywill put a rate limit which makes senseso I don't I need to basically batch mymy content to make sure that I'm notgoing to reach that rate limit so if youlook at those the the needs that we needfor logs well they both has the samestuff uh they have kubar the in factthey have the ability to add resourcesuh on the logs uh they have the abilityto parse the logs um they have theability to batch uh of course theplugins are name a bit differently butwhat is important is on the parsing sidefln bit has been out there since manyyears it has tons of features forparsing I mean it's very very powerfulso regs Jon uh you you can build yourcustom parser you can build Ula scriptsthey introduced the wasm recently um andif you look at the open temperature sidethen they have built their own languagecalled OTL so open Telemetry transformlanguage which is interesting and reallygood um there's some downsides when youdesign but they will probably improve itin the future I'm I'm pretty sure aboutthat and how to parse it you will usetransform that relies on that languageso at the end it's the same �thing uh theonly thing I would say is that designingthe pipeline will Logs with thecollector especially if you have lot oftransformation toachieve yeah FL it will be easier willbe easier but anyway when you designthose pipelines there's no debuggingexperience as of now so you have to sendthe locks to STD out you do a smallthing thing you look at the and say oh Idid the transformation great I do a nexttransformation I do it so it's very aniterative process very you don't do itever you don't do it continuously onceyou have done a a rubbish pipeline thenyou're good but the first time you do itthe developing experience is pretty badso I may expect that in the future itwill be improved but yeah that's that'sfor true for any agents of the market soI think that's that's the thing so interms of design experience for logs Ithink flim bit I have a just for theparsing aspect is my preferences but atthe end when feature feature-wise Ithink both has the same features I wouldsaydepending on on how how much experienceyou have with one of the agents uh Icannot tell you that fit will be betterbecause probably you will prefer thecollector so round let's go through thenext round which is metric so roundthree Ah that's better all right sometrics what do you expect from ametric all right you don't know no weneed to receive metrics so they areobviously same thing to logs commonprotocol to exchange logs so collect dstatsd uh open Telemetry because it'sthe Open Standards Prometheus of coursewho if you not supporting Prometheusthen you're not supporting most of theproduct of the market uh and I may wantalso to collect metrics from a host fromfrom the host where the the agent isrunning so there are specific pluginsfor Linux Mac windows and so on the goodnews is that both uh agents has thoseplugins so no problems receiving thedata all right next what do you expectfrom the metric what do you want to dowith the metrix well if you're not awareusually people think that metric is thecheapest signals in the observativeMarket do youagree no and it the answer is no andthere's a reasonbehind a metric is structured withlabels labels are great because you knowwhen you send a metric usually you addlabels because you want to say oh I wantto look at the CPU splited by hosts sohost is adimension and so having the labels isawesome because at the end you can do alot of analytics you can do a lot ofdrill down but there are some people whothinks that oh I'm going to add theversion number I'm going to add uh thedate the hour uh I'm going to I meansometimes you you ask yourself why didyou create that label but never mind theproblem is that when you have thoselabels then the cost of the metriccompletely change first of all thenumber of labels is like you it'sexponential the metric cost is like yourone single data point in the time seriesdatabas will be M will be exponential 50times or if you have 50 dimensions andthe other aspect is the cardinality sodo you know what is cardinality meansit's basically the distribution so ifyou have a two two labels let's say Ihave the process ID and I have the nameof the program so name of the programokay maybe I have a five I mean 50programs so it's not a big distributionI'm good with it but the process ID Iwill probably have thousands of numbersbehind the process ID so it means thatthe distribution is just exploding andthe cost of storage will be exploding solook when you look at labels try tothink about oh um do I really need itwould I do some statistics in the backend would that be useful for my needs noso let's do something so that's why whenyou deal with metrics I want to doprecise precise actions I want to Firstenrich probably because in cuetes Idon't have the metadata with the Podlabel mpace I'm going to add extra labelyou say hey you said that it's expensiveyes I know it's expensive but if I don'thave the right dimensions I'm not ableto to contextualize my data then I needto drop because of course if I have somemetrics which is you know in Prometheusthere's always the underscore underscorego metrics at the beginning which �is thebasically the the consumptions of theexporterI I think I usually I don't I drop itbecause I don't need it so droppingthose metrics will be also a way ofsaving some some money here and thereand then like I said drop reducing thecality dropping labels this is a crucialfeature that you need and converting youmay say what what is conversion do youhave an idea why you need convertno in the market there's two format ofmetrics there is thecumulative and there's the deltacumulative is all the things that isrelated and the on the support ofPrometheus so that will mean Prometheussupports as a back end supportscumulative that means that there is uh Idon't remember honeycom I think is on onuh oncumulative and uh I don't remember theother ones but if you look at I think uhI think NE Relic is also cumulative butin uh in the other word there's thestatsd format statd is Delta and who issupporting stat Delta D Trace uh datadog uh light step um and the rest Idon't know but depending on the back endif you you're collecting some metricsyou cannot send it straightforward tothe back end because the back end says Idon't get it so you will have to convertit so why this is why conversion is likea crucial feature so if you look atthose two agents The Collector obviouslyhas everything check marks good to go FLfbit didn't have nothing until FL bit 3where they introduced first the metricselector so metric selector is a way ofoh I don't need that I'm going to dropit I'm going to exclude this so you havea better control to filter the data andthen they introduce labels which is noteven your documentation so you have tolook at the code to understand uh theywill probably introduce on three bitthree whatever um but labels is greatbecause you can basically drop yourlabels you can do already reducingcardinality you can draw already placewith the some labels and also you canrename that the feature of renaminglabels is not working yet so I expectthat it will work soon but it has thosefeatures but theconversion the conversion is missing sono way of converting so which means ifyou collect metrics with FL bit and youare going to send to the back end itwill only work if you scrap Prometheusmetrics withouta the support cumulativeso that's one thing you have to keep inmind the other thing is there's no wayof enriching so no way of adding kubetmetadata by default so if the metricdoesn't have it then youare yeah you will have to go through acollector so and then like I said noconversion which is I think it's the I'mI'm complaining with the the communityand say add the conversion pleasebecause otherwise you're you will peoplenever use your your the FL the metricsupport so I would say for metricsclearly for me there is no questions ifyou have to deal with metrics at Highscale uh with a lot of data go for thecollector go for the collector for thefeatures that you need for your metricsso next round roundfour ah come on um so what do you needas a plugin of course we need to receivetraces the open standard is openTelemetry it's a new format so it makessense so we will have to support it andthen you have CFA because you can sendthe Tracer through CFA through q andthen you have the ancestors thedinosaurs of traces so Zipkin opensensus there are still application thatare not open tempet Friendly who stillsend their traces in those formats soyou still need those supports so when itcomes to traces there was no questionwhen I started the Benchmark I mean openTerry has all the feature we needso uh they support everything noquestion about thatflit it's on your Petry and only opPetry on the HTTP protocol which meansif you're are having anything sendingthroughgrpc yeah you will have to find a way ofconverting the the the the the protocolon the way then what you expect from afeature so when you deal with the thethe traces you need of course to enrichyour traces with extra metadata and thenyou want to drop probably some some somesome traces that are unuseful and youwant to sample so sampling do I'm I'mpretty sure that you know what issampling yeah so sampling is veryimportan�t because at the end you neversend 100% because people look at theirtheir billing and they're afraid aboutthe the cost of the traces usually it'sh you you need a number of samples it'slike you have a you have a bag ofmarbles with 100 Marbles and then sayokay uh I'm going to try to make somestats so you you say I'm going to put10% so so by in 1,000 bag of 1,000 youpick I'm going to pick 10 and then youlook at the color oh so now I know thatthere is 15% ofred is this Statistics reallyRepresentative no so the sampling isalso a really difficult job where youhave to figure out how many samples do Ineed to make the right decisions do Ineed to focus only on the errors or do Ineed also to F so that's why you needsome plugins so that's why there is twotype of sampling there is theprobability uh I mean head sampling andtail sampling so there are plugins thatwould do head sampling so at the frontand tail sampling is that you keep theentire trace and then you say based onrules say oh if there's an error if thethe response time is that that muchwhatever I'm going to keep it the rest Iwill drop it so sampling is reallyimportant when you could deal withtraces and of course as expected TheCollector has everything in it willserve our needs so no questions aboutthat on the flit side because they arepretty new on that aspect and they'redealing the TR fa is more like a Jonobject like a log they can basicallyenrich data with content bonier andthat's the only thing they can do so fortraces no question I think we all agreeThe Collector is obviously the winner ithas everything that we need all right sonow let's jump into the interestingnumbers which is the performance soroundfive ah right all right so to achievethis I had to do some benchmarkso I decided to do okay so I need to doI'm going to do different pipelines ofcourse I'm going to do pipelines forflimbits and then when I looked at thecollector I sayoh I can do different things incollector I can do processing up frontbecause the F log receiver has already af log receiver if you are not aware it'sdonated by um uh by uh observe IQ thathas an agent called stanza and when theydonated stanza was a full agents packageso they add all the feature which meanswhen you when you're dealing with a filelog receiver you can already process thelogs so there's two way things so shouldI do the processing at the F logreceiver or should I do the processingafter when with the processors so I saidokay maybe I'm going to add extra testsand see what's the best so I did somethe the logic is simple I will do so Ihave all those pipelines so collectingLogs with only FL bit collecting logsonly with the collector uh andprocessing at the re receiver and thenafter and so on and I will do first logsthen I will logs and traces and then logtraces and metric and I will dodifferent tests to be able to achievethat so the pipeline because what we sawobviously there's a there's a you cannotcompare with the same feature so I hadto reduce the scope to compare what iscomparable so with the logs I say okayI'm going to receive the logs um Icollect the logs on cubaris and I'mgoing to reach the data I'm going to addfew extra resources I'm going to do someparsing very limited and then I'm goingto export with open Telemetry on thetraces um I will just enrich becausethey have the contentmodifier uh and and I will export itthat's the only thing I'm doing um andthen on the metric I am I'm going toscrap some prometheum some data and I'mgoing to reduce the C it because FL bit3 supports that and then I'm going toadd a few uh few label with the label umuh processor and Export it to to the tothe to open themetry at the end both Iwill I was trying to use the samefeatures and one thing I I've addedbecause we want to measure the behaviorof those plugins i' I enabled theTelemetry so both agents has of courseobserv data exposed in Prometheus formatso it will help us to figure out somenumbers and how itreact so to achieve this uh I had todesign two distinct architecture so hereis by the way the repo if you want toplay around on your environment I �had todesign two different architecture whyremember Flyn bit doesn't supportJPC so I had to be to find a trick tomodify on the Fly the protocol to HTP soin this environment I I'm using I havetwo two how do I going to do the loadtest um I'm not going to send some fakedata I'm going to read act actual logsfrom the the cluster and uh I'm going touse the otel demo which is a open termdemo application uh generating openterat metrics and traces and everythingand I'm going to use the Hipster shopand I will have actual reload tests andif I generate more load on against anapplication there will be more logsproduced and there will be more tracesproduced naturally so that's is the theconcept and then to be able to collectthe logs I will have a flame bitdeployed as a Damon set to read the logsfrom the node I will have ISO becausewill include Envoy everywhere and itwill add extra metrics and extra logsand one I and I've added intentionallybecause I Kepler Kepler if you do youknow the Keplerproject yeah if you I see if you had theB I will remind Kepler is an awesomeproject it's it's a deploy a Damon setwith EV BPF probes and it will producethe energy conceptions of your podssplitted by namespaces by tons ofDimensions so it's an amazing project soyou can estimate the conceptions and thethe carbon FR print and everything butit's the most expensive Explorer I'veseen Lots of city lots of things so Iwill want to figure out okay how thoseagents behaves with this exporter thatcould be very intensive for for theagent and of course I will have danttrays because um I have D tray and uh sothat's the flit architecture TheCollector has a slightly uh I oh yeahsorry I didn't mention I have a statefulset deployments of a collector becauseopen tempature Dem MO is sending thetraces to uh The Collector just to dothe switch of protocol to be able tosend it back to fbit that's their onlyreason why there is a collector here inthe uh collector of course simpler uh Iwill have a Damon Set uh to read thelogs and I have a stateful set which Ididn't add here to receive the metricsand tracing and of course the samething all right oh yeah before we startwith the numbers um the My the the theapproach was very simple I will have tworound of tests for each family of testsI will have a romp up test so two hoursI start with 50 users on eachapplication so hpop demo and every 30minutes I'm adding 50 users so at theend of the test I have 200 users on theAP shop 200 users on the open TR demo sothat's how I'm I'm having an increase ofsignals and then I will do another roundof tests which is 50 user constant dring50 20 during 24 hours just to see howthose agents behaves in memoryconsumptions and CPU so let's look atthe first test which is logs during twohours so you can see that I I have thelogs uh the number of signals coming inand then we can look at uh I'm not goingto compare the the behavior of thereceivers because the numbers will bethe same thing but I'm going to focus onthe on the resource consumptions so onthe bottom here left is the CPU here isthememory uh and the the blue line is theis the flim bit so it consume about 60core 60 M core about that uh TheCollector is about 80 80 90 coredepending on the on on the because it'sa diamond set and then you on the on theright side you have the memory so flamebreit is about 15 Megs consumptions andThe Collector was around 90 to 100 100Mi so already just by looking at saythis you say Okay collector consume morememory again it's a go go binary and theother one is a c so you may expect thatnow I intro I added the traces so youcan see that now we can see the the rampup with the traces here and if you lookat the resources well it's the flame bitis is consuming a bit more but still isconsuming less than the collector interm CPU and if you look at thememory collect the FL bit is superstable still about almost 20 mix and TheCollector is still at 100 very stable noimpact during that test and then Iintroduce themetric and then I looked at the metrix Isaid okay so everything is normal I lookat the col I look at F bits so here I'mabo�ut 100 150 M cor um The Collectorstarts to reach about 800 M cor at theend of test I say okay and then I lookat thememory 15 Megs 20 Megs for FLbit and The Collector starts to reach to2.5gigs and I sayokay I need to do a r i need to do asoak test there's something behind so Idid the soak test 24 hours 50 users 24hours and then I looked atnumbers so here on on the purple thepink and gray graph is the collector soat the end it reach about almost twocorescrashrestarts but the most interesting pieceis thememory so it starts reach about Iintentionally I didn't put M limitsbecause I don't want to LEAP come on eateat resources eat resources to see howit behaves and The Collector reach aboutalmost 10gigscrash and restarts I say that soundslikea that sounds sounds like a memor leaksomewhere so I say okay so beforealertingeveryone which I did I saidoh guys there's a memory Le on TheCollector and this is only I think it'sonly on the metric and and I willconfirm that with some fewtests so I sayokay let's figure out which signals iscausing this just to confirm so so I didthe same one test soak test with onlylogs test with only logs and traces andThe Collector is stable nothing doesn'tmove I say okay so there is somethingwith themetrics and to continue before we moveon I did some P profiles uh so theticket is still open uh and what wediscover is that it'sthe it's the Prometheus receiver withthe cumulative to Delta the problem withthe memor leak andalso you don't have the memor leak whenyou doEnvoy you don't have the memory leakwhen you have just open Terry you don'thave the memory Leak with you have thememory leak when you haveKepler so when it'sintensive cardinality for some reasonthe collector starts to have an an issueso there's still the issue is still openand also something uh I'm going to I'm IhopeI have submitted a talk for cucon NorthAmerica to do some recommendation and wetry to figure out um the the rightsettings from the prometheo scrapbecause you will see that it's it's itcaused a lot of impact on the on thecollector so it's still open we do we'restill doing some investigation about itbut still there's major problems so Isaid okay so I want to do I want tocontinue that Benchmark I know there's aproblem how can I de delegate the metriccollections on The Collector do you havean ideano noone the target allocator who have haveheard of the target allocatorhere I seesomeone so the target allocator is afeature introduced by the open Telemetryoperator and it's oh I think it's thebest feature that the collector has tobe honest and it's it's a quite a Chamit's only on on on the on the uhoperator but let me explain how it worksso usually when you do the scrap configin your collector it takes the scrapconfig and then the collector do thescrap config and do uh reach out to thevarious end points and do whatever whathe have what he has to achieve tocollect the metrics when you enable thethe the target allocator what it happensis that it takes the scrap config so youhave the scrap config and then it saysokay I'm going to send it to the anotherworkload called the Target allocator andthe target Target locator will look atthe scrap config and will create jobs inthe Prometheus format and at the endevery collector so if you have four orfive replica he will split equally thenumber of jobs per per collector so eachone metric will all go to the samecollector and then uh The Collector isnot doing a scrap anymore he says giveme my job and then he just have to uh tosend an HB call to get the metric so itat the end The Collector is have nothingto doanymore so I said okay maybe by doingthat Target Ator maybe I will maybe savethe problem that we've seen before solet's test that and all another greatfeature is that the target allocatorsupport the prome Prometheus crds whichmeans if you have service Monitor andpodmonitor I don't have to do metricpipeline anymore I just deploy a podmonitor I just deploy a service Monitorand then the collector by Magic willgrab it so that is so great imagine ifyou deploy every day there's guysdeploying Prometheus expor�ters and yousay hey could you update your pipeline Isay good now you don't have it you justsay hey deploy the service monitor youknow I have a great uh feature calledtargeter it will pick it by byautomatically so in this architecture Ihave a specific collector deployed instateful set because at that time Targetallocator was only support on statefulset and deployments not in Damon set soI was not able to use in Damon set so Ithat's why I had a specific uh collectorjust for metrics for the Targetallocator so let's look at the behaviorsame tests 24 4 hours but with a Targetallocator so now on the top right uhleft it's the CPU and here you rememberwe reach almost twoCES and here it's about 100 to 200cord so no explosions of garbagecollector in the go runtime anymore andthen on a memory size I have onecollector with four 400 Megs and theother one about 8 00 Max so no problemswith 10 gigs that we talking aboutbefore and then if you look at say okayno collector great but I guess the thetarget allocator is is going to suffer Hit's consuming 10 mcore and the memory 60 makes so nothingso at the end if I just sum up I goingto use three four collectors plus thetarget allocator I consume less than ifI was using on the on the collector sojust for resource perspective I'm savingmoney andenergy the last question is where shouldI process so I did some tests about thatat that at that time um there was nofeature introduced by the open terCommunity uh there's a feature I don'tknow if you know about it there is anexperimental feature that you you needto enable it with a feature flag whenyou launch The Collector and yourcollector will produce traces so thenyou know the time time spent on thepipeline so that's F fantastic but itdoesn't it didn't exist at that time soI had to put some time stamps and thendo the difference with the to measurethe time spent on each each each uh eachsteps um so here I uh I have onepipeline like I said where I'm doingsome processing and filtering with afile log receiver and the other one isdoing mainly everything on the transformuh processor so parsing enriching and soon so on the on the left it's the the Flog and on the right it's transform fromthe so maybe it's too small but what Iwas surprised I thought usually when theindustry says it says always do that upfront which makes sense and then here interms of resource usage I had aslight very slight U advantage usingtransform so I was surprised about thisum in the processing time the time spentin the in the pipeline at theend it was the same so not no Revolutionit was the same but so I I think there'sno major conclusion about this type oftest I will always say that if you havesome data try to filter and modify upfront uh it's doesn't make sense to havea big uh agents that do everythingbecause it will suffer so if you you canchain different collectors and try totrans do the transformation step by stepand then uh you will have first uh youwill sweat less when you have to uhmanage those agents so that would be myrecommendation so as a conclusion Iwould say that uh from metric I thinkcollectors I love it traces I love itlogs like I said I have a preference forFL for fln bit uh for the logs I wouldsay uh just for the parsing perspectiveI I love The Collector for that um andthen from the resources I mean there'snothing to compare a Flyn bit uh whenyou compare a cuh agent compared to a go agent what youexpect it's a c is going to be obviouslyless expensive in terms of CPU and andmemory so the the FL bit is is it's it'sjust tremendous I mean I don't know ifyou've been aware but Flyn bit wasinitially designed at the moment whereiot was coming out and they was thinkingoh iot is the new stuff so we're goingto build an agent for iot devices sothat's how it's been built and iot yeahit's still there but some for somereason there's another project called KUthat is more popular so now we we use itin in in in uh incommunties so small teacher for myYouTube channel so there is plenty ofcontent I have a a more Deep dive uhepisode about those Benchmark I have anepisode about more on the Targetallocator on the metric uh conversionsuh yeah plenty of that again like I saidit's out there uh it's not may notperfect for sure but I need feedback toimprove the content so if you love itlove it if you hate it just let me knowso then I can I can adjust thecontent all right do you have anyquestions if you are interested tofigure out how Vector behaves I can Ican answer that because I've did TheBenchmark with Vector as well as far asyou know[Music]uh so what I know is that fln bit is uhlike I said the when I did thatBenchmark with famebit 3 there wasnothing about the label processor and Idiscovered when I looked at the code andI said ah what is this and then I Istart to look at it and then uh try tolearn how to use it uh by retro engeringthe code U they are act working on fewuse cases and it's been it's been uh ayear that I'm I'm trying to force themto use the metricconversion the the the way fbit worksand I think it's like every open sourceproduct on the market they they arebuilding features that make sense forcustomers so if the customer doesn't askfor it they will never build it and fbithas been mainly used so far for logs andthey still out there for logs so themetric is quite new but if everyone herenow today let's do it together if weopen an issue together maybe it will bethere and for traces I told so they Ididn't mention but they have a a SQLprocessor I mean I love it it's like aif you modify Fields you had to createI'm modifying this field then I modifythis field then I modify this field nowyou do a query and it will do everythingin one line so I think that is great andI told themum that hey you can use if you aresupporting groupy and stuff you canstart to do analytics on traces just bydoing a SQL query so they are thinkingthinking about it I don't know how farthey will do it but I think they theywant to do provide more feature fortraces at the moment they they presentthemselves more like a Gateway like aproxy for traces more than somethingelse any other questionswell on this final that you said youwould encourage people to use I wouldsay that that's um I would say that forfor uh in an organization if you justdeal with logs there is no question useFL bit because it will be cheaper interms of resourceconsumptions and then if you start doingother signals it doesn't make sense tohave two skills with two technologies uhbecause the end you adding more pressureto the the person in charge of thosepipelines so I would say yeah go for thecollector but then you will have youhave to be aware of uh that metricsdepending on how you scrape it and alsoI didn't I didn't talk about tailsampling if you do tail uh tail samplingin terms of memory depending on the loadthe number of sampling that goes throughthe collector it's going to be anightmare um so there are a few thingsthat you have to be uh cautious but yeahif you just deal with logs I mean noquestions just use the flim bits if youstart with other signals then collectorwill be a goodchoice any otherquestions did anyone when uh did you beton the collector ornoyeah no no because um because ifI no so what I did is I did some uh rampup tests 50 user 50 I didn't pick uh andand if I have more use more load on theapp then uh the app is is designed tosay h I have a I have some uh some orderI have a user connected so the if I havemore users I have more log produced sothis is how I I didn't do pick testswhere I have suddenly a big spike Ididn't do that because yeah um I couldhave done that but again it takes timeto do that type of Pand yeah if you're interested vectorvector is uh the agent for datadog um it's designed inU inRuby uh and uh the results is that theconsumption is closer to the collector abitlighter um and it's only supportedmetrix in logs to be honest um and theuh the design experience I think isquite interesting to be honest but uhyeah still if still Vector if you haveto deal with high load of logs Vectorwon't won't be comparable to flit forsure all right if there's any otherquestions thanks thanks for your timethanks for being here2025-04-15 22:17:42.724361 ��L�}#��OAulzjbGIYkJgso how many of you knows observ ingeneral yeah so quite quite quite a bitso I uh when I whenI when I arrived here I said oh Ihave 50 minutes um and I didn't didn'trealize that so I've added a few slidesjust for the introduction so if you knowobserv I will skip it because I think itdoesn't matter for you uh if you're okaywith that or you want to see the thoseslides you're okay all right so thenwe're going to focus on the the actualcontent uh and uh we will try to havefun together so let me remove those twothreeslides allright all right are you ready for somefighting sure I didn't hear anythingsure we need energy here it's likeboxing you know so in the in the back ofthe room you can still have time toplace your bets uh there's two comp twodifferent Fighters here so I guess it'sif you want to win some money it's veryquick win just go there put your put abet on that and you'll see that uh it'sworth it so let me introduce myself myname is Henrik Reed I am a cloud nativeAdvocate at D race so I've been workingfor din about 3 yearsnow uh so but prior to dino I've beenworking as a performance engineer mostof my career uh yeah testing breakingtuning at the end you have fun you knowwhat is the best of just destroyingthings you not only destroy things butat least you you you try to helpprojects and that's why I'm still a bigperformance is uh still in my heart Ilove I loveperformance um so that's why I'm stillproducing some content for a channelcalled perf bites for performance ingeneral and there's almu here who ALS isper espanol as well there's a per biteFrance as well so we're trying to to uhincrease the brand outside of the the USum and so check it out it's just contentmethology about performance in generaland this content is basically aperformance Benchmark at the end um andthen since 3 years uh coming from theperformance world I said Okay so let'sjump into the observ world and I waslearning stuff and I say hey why don'twe create a YouTube channel so uh thisis why I created a second YouTubechannel called is it oh there's aspelling mistake uh is it observablewith a which is here is missing um soit's almost three years now that I'mproducing content there so check it outas well uh to improve the content youneed to have feedback so if you hate itlet me know at least I will try toimprove to so you can like it um allright so before we start thispresentation I want to beclear to prepare this talk no birds andno telescope has been harmed nobody hasbeen no violence has been placed onbirds I promise I was ge��e yearand the end of the year they update thelist and maybe uh the last one was fromuh 2019 so 2023 so the last version andthen moving them up and down or maybe inmixing them together just uh afteranalyzing what has been happening aroundin the market so uh the first uh fiveare mostly um except the fourth one aremostly about something that is alwayslike the first victim brokenAuthentication and broken authorizationI mean up and down the chain of from thefirst record hits your servers um untilthe data is out or the other way aroundsomeone is not validating who enters andwhat they cando and then you have the unrestrictedunrestricted resource consumption thatuh if you are may maybe more familiarwith the Doos attacks or service abuseor maybe like I'm I'm doing everythingand I'm not leaving anything for therest of theconsumers um then you have like um otherand not least important I mean therethey are in the top 10 but number 10 issomething that I think is also veryneglected when you are consuming thirdparty applications you have to mind whatyou're getting back even how youhandling your red directions and thathappens in the web uh application butalso with the apis maybe you get thisresponses and you need to carefully lookwhere they are trying to redirect youbecause an attack can come that way ormaybe in the content you get back fromyour request so I'm going to get to thisback to this later on in the examples sookay what happens when we don't uhreally uh look for API Security on thedesign phase uh the security architectmay call the thread modeling when youlay down your design of your product andyou go to every step of the design andtry to locate what the threads are andyou decide uh if it's worth to work on amitigation on the design phase before uhactually start start coding this thisproject so um okay let me present youwith the zero day Loop of hell where youwe have development teams that they aretrying to put the code out they arepushed by the business I need need tohit the market I need to do it now Icannot wait you know I I need to sell soyou can eat youdeveloper yes but I have the securityteam they're bashing me they need to putevery Security Control in between andand they won't let me deliver toproduction is if if this product is notmeeting the the our security poster uhcontrols so of course the security teamsare feeling this pressure that becominga b neck because they may delay uh uhreleasing something in the productionenvironment is they saying you're notpassing the security scans you knowpassing the security controls I don'twant you to go like this um and then ofcourse we are all pressure because everyother day a new security threat arises Imean the minute the Cod hits theproduction environment it be it's it'sbecoming obsolet and it's piling uptechnical dep the one that you also leftbehind you because you were moving fastto put the your code in production andthe new one you are acquiring since youare thereso yeah it's a headache but why am Itelling you this um let me introducemyself uh I'm barbari I'm from aliancetechnology IAM I work for a visional hubwe uh provide with all the it platformfor the Alliance Insurance Group I workfor Centercountriesum and I I help defining securityblueprints I Define standards I designand Implement securitymechanisms uh and I mean to enhanceproduct security I TR I try to look atwhatever could be improved and try to uhbuild a proc uh process for it and Ialso work a lot with third partyIntegrations because there are some youknow regular cases and then you havelike unicorn that you have to lookbetter before just doing the standardthing and and some of these also becamestandards then so uh and of course Ialways trying to raise awareness betweenmy colleagues I'm always eating theirbrains with security topics I think yeahbut they they still put up with me sothank you some of them are here uhactually uh I'm Al a proud Music Addictum I had a like even a band like agazillion years ago so if this talkingthing doesn't work I can sing you a songuh yes I'm also an amateur rock climberuh this is my� debugging place where I goto decompress I mean I'm going to dothis very same thing tomorrow morning Ineed it uh but I'm also a professionaloverthinker I mean this is what a partof what make me apply to this this jobaccidentally actually um and this verytalk kept me uh awake a lot of nightsthinking how I can put it better how canI explain it so you understand what I'mtrying to to send to you um okay soenough of me let's work about a conceptI like a lot uh to explain I mean I willdo it briefly I don't have 10 talks toexplain everything the difference inthat uhstrategy uh I like to think it as anonion uh when you have layers and layersand layers and layers of securitycontrols so you can slow down anattacker or if you're lucky you aregoing to prevent them from reaching yourdata which is the thing that you are uhactually the victim you know the firstvictim so this is overall you have likea lot of tools a lot of toys you canplay with but this is like um res uhsynthetic version of what uh defense inthatuh strategy would look like from the APIperspective okay on the top side youhave theconsumers you have the web applicationsthey can be yours or they can be fromthird party webapplications and in the left uh side ofthe web applications you have the CDN uhat some point you make control uh thenarrative if you are uploading thestatic content to there your first partapplication okay I know what I'muploading and I'm preventing Supplychange attack we are going to have atalk at 6 pm today in this very track umbut sometimes you don't control andmaybe that is poisoned uh code also it'snot only receiving some attacks uh fromuh people that intercepting the calls orwhatever then you have the internet ofthings which I already mentioned youhave third parties I put in acertificate because we like to domtls Communications when we are talkingserver to server this is the using theuse of client certificates and you arealso uh exposing a certificate from yours and with it's a mutual trust system itreinforces uh a lot uh this securityposture so then you have of course userdevices with mobile native applicationsyou have the application stores therelike uh the most known ones and againmaybe you can control the narrative ifit's one application that you areputting there or maybe not because thereis another mobile application like Ilove to listen to Spotify you know andSpotify May connect to otherapplications to share my songs Facebookor well Facebook not but Twitter or orInstagram you know if you have whateversocial media and they are going to toyou know but I don't control it I Ihaven't built the mobile phone for amobile application for Spotify then ofcourse you have this big friend herethat should be up and down the chain theWatchdog the observability tools youhave to use them carefully and you haveto really think I I'm sorry I had toquote you like uh you have to know whatto to what is observable what is notobservable you know and you you you needto have like xray of the request all theway down or all the way up if you arethe one that is the client in this caseOkay and then you have my friends theauthentication and authorization toolswhatever you may have you may have anidentity provider yours or you're usingan identity Provider from somebody elseor you may have an authorization serveror even Legacylibraries reaching a database and askingfor user permissions I don't knowwhatever flavor you have um then youhave the web application firewall I willbriefly talk about this because uh howmany people here know what a wis nice but I see some hands that needsome the tiny expl explanation so thatslide is not wasted so the APImanagement platform which is the friendI'm going to talk about you today whichhas a set of features that you can seethem separated but they put themtogether for you to have like um gutmode where you operate the a secondlayer a second very necessary granularlayer uh of the entrance uh inbound andoutboundconnections uh through apis specificallyso then you have the network firewallokay is is always there it's controllingtraffic and it's it'�s also doing theirnormal thing uh and then you have yourapplications you can have on premise youcan have Cloud you can have whatever youyou have your load balancers decidingwhat instances you are going to entereverything and then you have the thegold of your company because companiesare all about data most of them I meanat the end you are you are uhaccountable for what lies here becauseit's my data it's your data is anybody'sdata and you need to care about it solet's move on so what is a wuh too long didn't read um it's aspecialized firework for webapplications it's the first shieldbetween the internet and our uh realmyou know our service um it it's exposedon the outer edge of the network it'sexposed directly to the internet thoseare where all the requests are comingthrough and it's um in in theapplication layer uh in the OSI model isit will be layer seven and I'm not goingto enter to the OSI model but I Iplanted a cure for you if you want toextend the information and then I'mgoing to push the slides to the Internetso uh don't forget if I'm flying overthe slides uh okayso it protects web service filteringmonitoring and blocking malicious HTTPand https traffic traic and whenproperly configured uh it helps preventsSQL attack injections cross scriptingHTTP protocol violations and all that uhas a bonus the they also some of them ormost of them are including uh a top 10mitigations capabilities and I meanthere are Wass that do that with webapplication threats but they are like uhWAP which is like another version of theWs that also include API specific risksit's really nice to have one of thisbecause you have com both uh mechanismscombined combinedand you have more information if we wantto extend uh on both things I've beenexplaining here but I mean you say Ihave a w if it can do marvelous so whydo I need API platforms yeah Imean no it falls short in in some placesand you shouldn't uh um put that much uhresponsibility in a w when you aretalking about granularity you areexposing different apis and each APIhave their own security requirements Imean it's not the same if you areputting a API I don't know you are doinga campaign and you want to sell a lotthis API will need like a wide trafficuh rate limiting you know um you need toto have like at certain hours you needeverybody to be capable entering andthen you may have like um okay I'mselling tickets uh but I'm I'm going tohave a lot of consumers but I need torate limits which type of web web sitesare going to actually use my platform tobuy the tickets you know um and I needto limit that because maybe they aregoing to abuse and one platform is goingto get all the tickets and I'm and theother one is going to be left with knownso and different cases you know so whatI'm be telling you this rate limit uh isnot one siiz fit all you need to tocarefully Define what your API needs umauthorization and pre-authorization Imean certain levels ofpre-authorization and a lot ofauthentication uh processes are best uhdecided at API level and the APIplatforms can serve this uh capabilitiesfor you uh the payload also I mean WSalso examine payloads but they have likea limits and they're not going to knowif one API is expecting Json not XML andthe other one is going to expect a big JJson and the other one like smallerthing and you you don't you don't havethat capability and shouldn't uh burdenthe W with this because it it getsslower so umof course attribute based allow lists uhcan be managed in bulk I mean you'reblocking or allowing a certain IP butmaybe this IP you are allowing in it'sit's entitled to go to certain apis notall of them so that you can control withan API platform and then there is the uhsettings in the Ws that are the openclose settings when somebody goes wrongsome WS are configured to let all thetraffic flow so there goes your Shieldgoodbye if something is happening to theW and it decides to allow old traffic ormaybe if you have the otherconfiguration it closes down all thetraffic and then maybe you you wouldn'tneed this so you decide to configurelike having hybrid configurations �thispart of the network is going to remainopen and this one is going to close shutdown every connection but for this onethat you're letting the traffic go inyou need something else you you wouldn'twant to them to hit your endpoint yourserver and maybe uh I mean yeah how doyou control DDS you have other tools butI mean maybe having something in themiddle that still uh stops the second uhthe second hit you know so okay what areAPI managements have been blabbering alot of the benefits and now I have toexplain you what they are okay so Ifirst uh let me show you some examplesof products there's another one missinghere which is quadrant that I learnedabout like few days ago I will put themon the slides and I will upload them toum to the slide dechum of course uh there are a lot offeatures in the API PL platforms youhave like an API Gateway future tocentralize and Route and secures the APItra traffic but this API Gateway is likea very very very smart reverse proxy andyou can use a set of rules and protocalto protect here's when you come you cancustomize all these controls with a lotof rules you can apply and you're notgoing to burden it unless you are doingbusiness coder you shouldn't do that butyou can filter this uh in a um course ina very granular way so you decide whatgoes through and what not okay uh tohelp this you have also an interface fordevelopers to do service Discovery uhyou can document M your API you can puthow they work how to test them uh youcan also build products configurationsand assign consumers to your API you canuse I I I took a lot of Advantage it'snormally used for monetizing but you canalso apply a lot of security using thisspecific part because you can uh controlyour consumers how much are they goingto consume your apis and where they cango you can manage up to three levels ofpre-authorization before letting them goto the end point with their token orwhatever uh authentication uh piece youare um managing for them okay so um ofcourse you have uh the monitoring umfeature so you can plug yourobservability tools here and collect thetraffic analyze it and then takedecisions and of course detect anyanomalies if your API are being misusedor you fall shorton anycontrol um and you can control the lifecycle of an API this is a good place toalso have this governance appli when youknow when to Bion when to uh deprecatean API when to retire uh you can test ituh you can deploy it uh of course theall all of these also they have the CIbut you can also have an API to automateall theseoperations um umso how would I use this um platform andthis is where I'm trying to getcreative uh to to apply these securitycontrols first um I'm going to uh tell astory that all of you already knowbecause you're sitting here is whathappens at Techconferences okay so for please let merecover this list that I talk aboutabout in the beginning uh the OAS listand I'm going to specifically addressthree of of these threats which arefound very convenient first the brokenauthentication uh this is a no-braineruh you can control who is accessing yourapis using the platform and then you cango to three levels of pre-authorizationsI've been saying this for a while uh thebroken function level authorization isthe where can I go I mean uh whatresources can I consume okay and also umin which capability because if if youthink of uh restful you you have like uhI don't know a customer but what you cando to the customer can you get the datacan you create a customer can you modifya customer so you you can uh so much ofcontrol this this these three levels andthen uh you have the unsafe consumptionof apis I really like to put this intorelevance because as I told you you'renot only being consumed you also consumeother third party applications and youwould like to take care of that so inthis analogy we are all the same thirdparties so uh when I came yesterday uhto register I went to the front desk Isaid okay I'm barI I'm a speaker I I flash my my ID andthen they validated and they gave me mymy patch here and you did all the sameuh depending on which type of um �ofconsumer you are okay you are consumingThe Techconferenceand okay so they gave me my batch mybatch is uh in this case the cardboardis uh aniki it'sgreenish uh it it classifies me what Iam I they already know who am I becausethey called uh they the the front deskis the API gway okay they asked for myID and they did a checkup to theregistration features which it would bethe authenticationpart um and then they gave me theyassigned me a batch so now everybody'ssees me around and now I'm a speakerokay so the control room are thesecurity people that were at the door isindeed this very morning I forgot to putthis on and I was just margining and Isaid wait you know uh please identifyyourself show that you have a sessionhere yes okay I'm still authenticated II could go here for today that would bemy token maybe so um it I mean I'm I'mtalking about third parties here but ifyou are curious about single pageapplication security you can also useAPI manager for this this is a cure codeyou have um and you can I'm going toupload this uh so you can also readabout this it's very complex I wouldneed another talk for this uh so okaythe moving on we had the broken functionlevelauthorization uh once I have been idedI've been classified on what I am and Ihave an access Tok that I can use forfurthercallsuh okay uh the first level is the whatokay with this batch this this would bean API key which is not acredential is not a credentialactually um it's something that I can II uh the third parties have like this uhconsumer key in order to show insubsequent requests to theAPI um what what they are what type ofconsumer they are okay so this helpsdetermine a lot of different uhauthorization levelsonwards okay of course as I told they'renot credentials because they normally Fvalues and they're manly rotated butthey should be treated as credentialsdon't G give it away on the code youknow just put it somewhere safe if youare sharing them with the third partiesum then um I told you already that theyare useful for tracking uh them you canalso use for tracking them or around thego that's one way of tracking them andyou can apply security policies attachedto these API Keys okay so the next levelyou can control is where I can go as aspeaker I can go to the exhibition HallI can go to the conference room go Ididn't say what I can do I'm just goingI if if there is a worship uh room I cango there we have a speakers room I canalso go there but I cannot enter anyprivate area with only the stuff that isin red can go I cannot go there theywouldn't let me inso um then you haveum this this translated to the APImanagement world are the API proxieswhere you expose your backend Serviceswhich has a lot if we're talking rest wehave a lot of end points okay on yourAPI and you are exposing them and so youare telling me okay I can seepotentially the content of these apisand the resources that are in theseapis this is the second level ofpre-authorization I can manage uh youdefine an API product normally a lot ofthese platforms called it API productokay where you can combine different APIproxies backends that are part of thisproduct maybe you you want to put like aservice a full service that you can buya um a ticket and you have the profilethat you are building because you'reregistering as a user your paymentmethods whatever that's another API andthen you have uh the actual uhum selection of the tickets you arepurchasing and the cart sorry and thenthe you you actually go pay go and payand then they may be calling anotherthird party that is producing the thethe payment website you know so this canbe a full product okay configuration andof course um they can include also inputand outputvalidation uh you can validate someparameters you can validate heads whatthe the the request looks like so youcan decide uh in the API Pro proxy withthese security policies you can decideif if this consumer is going to passthrough or you're going to block themokay because it's not meeting the thecriteria the first is they need to havethe API ke to go there if I go to thisprivate room as I'm gr�een and not redthey're not going to let me in they'renot going to look anything else that'sthe first thing they're going to look umand the third level you can control andusing the API produ capability as wellis what I can do once I reach thoseproxies you can filter you can decideI'm exposing the whole API proxy and allthe resources and all the methods thatyou have here uh or not or because youare uh I'm a register customer but uhmaybe Ican search my data and update some somedata but not all of them but I won't beable to delete myself or whatever I Ihave to delegate that on a a differentthing so in this analogy of course whatI can do as a speaker I can give aconference or I can give a workship butI cannot manage a stand I'm not anexhibitor and I cannot manage theschedule I cannot manage the event Icannot manage the location and I can doall General access uh actions okay sothis is where you let me go potentiallybecause let's let's see that I can givea conference but which conference I meanthere are there are uh other speakershere in the room and they are not givingmy conference so they might want to uhensure that I'm entitled to keep thatthe slides are mine the conference ismine and not impersonating myself thatuh is better determined at applicationLevel when you reach that point youmight face that's that's anothercapability that you shouldn't delegateto API managers normally you can butit's it's it's putting a lot of Burdenyou don't need but you can in maybecritical uh aspects uh of courseprotective resources will require uhmaybe uh better reinforcement of theauthentication like re authenticating ifyou are like doing a transfer you'realready at your bank account and they'reasking for an additional uh password orwhatever method of I say you areentitled to do thesetransfers um so to sum up uh thispart so uh these three levels that weare controlling here with the APImanagement platform is what I am okaywhere I can go and what I can dopotentially I I like to to underlinethis because it doesn't mean that when Ireach the Endo I I I will be actuallycapable of doing so so I mean when Isaid you need to apply security up anddown the chain I said up and down thechain even in your in yourresources okay you have to apply zerotrust I mean don't trust uh yourselfwith so much as a password I mean youneed to uh monitor traffic you need toapply every kind of security control youare capable of applying just to ensurethat nobody unwanted will reach yourdata so the last one but not least isunsafe consumption ofAIS so you are here in my conference bynow you can figure out I talk toomuch uh yes I'm Argentinian so thatcomes with myDNA and you're doing okay you came hereit's 10 a in the morning you came withyour coffee your Provisions are yourcoffee your cant your whatever you wantto eat or nothing because there arepeople that don't like to have breakfastbut your brain is capable of ofabsorbing a certain amount ofinformation okay and you would like togo and see other talks today not onlymine but thank you for being here uh soeverything is okay but you came withouta plan you didn't realize I was talkingtoo much and at certain point I startmaking nonsense and I'm like startsecond guessing myself I'm very nervousI'm I don't know what to say you knowand and or maybe I just uh diver go umand I what what do I say like now youknow I'm searching for the word andyou're not getting anything I'm sayingnow you know so maybe you you trying tofollow me and you can't so your brain isgoes away your Provisions are drainedyou consume everything and maybe youcould get out of this St and say Ibetter go home and leave it for todaybut if you can with a plan which is theAPI platform here okay you say h I'mgoing to take decisions based on whatthe message she's delivering to me is ifI start making nonsense you said okay Ican start looking at my cell phone I canwalk away I can use my my laptop or Ican just stare at the nothingness andwait for the time to be over so I canget out of here so uh what how thistrans translates to the API managementword okay first thing when� you receive aresponse from an API you areconsuming uh you might want to validatethe expected format you might validvalidate that you don't have a you don'thave an uh code injection attack attemptI know the wff is there but maybe theWAFF has failed and it's letting trafficgo through so revalidate here and alsoAl revalidate at the end point as wellyou know in the resource you have to doit all over again uh redundancy isalways a good thing here it's good Imean code redundancy maybe not but inthis case security redundancy is thebest thing the best option and of courseyou can do like uh Json or XMLprotection if you are expecting one ofthese responses in your payload and youcan you here is where you come becomegranular what size what scheme how manydeath levels you can have here andwhatever is not fitting yourrequirements go byy okay of course youyou can uh have other controls youshould have other controls avoid blindlyfollowing red directions because youmight get a300 whatever message which is thatredirection message of some type I meansome attacker can be replacing the redDirection URL and are youokay thankyou uh they they they might be uh youknow impersonating and they may bechanging this redirection Ur and andthey are going to get you elsewhere andif you don't control thatyou I don't know where you're going toend up okay so you can have an allowlist of known locations if you find thatit is not in your allow list I don't I'mnot following this rate okay of coursetime out and reconnection you can alsocontrol uh because if if this call istaking part of a longer process youdon't want to delay the rest of thechain you might want to be a circuitbreaker for this and say okay I give theresponse back to my back end and Ideliver the message to the user whomeverthey might be okay and then the ratelimit controls of course they apply hereat some points because if you are payingfor these um API calls some apis aremonetizing so you need to be carefulbecause you don't want them to go tohave a like a huge bill waiting for youif you're not controlling your backingkeeps calling because I don't know I'mgoing to try like 15 times maybe itworks at the 501 time so I don't knowjust R limit that as well umokay so finaladvice uh some best practices to Avo abe risk or or whatever risk anyway ofcourse authentication and authorizationfirst please ensure you know who or whatis entering your premises and what theycan do and control them all the wayuntil they reach your data and thenapply API governance you have to knowwhat to you what are you exposing to theinternet who is consuming it what thelife cycle is when to retire somethingand what it needs when it needs arevision to see if it's still meetingthe appropriate security controls as Itold you the code gets rapidly obsoletoutside so maybe there new threats ariseand you need to have you know you needto know your landscape so you can applythe additional security measures if youneed uh thread modeling should havehappen on the design phase but I alsolike to think that if you co has stayedlike there for a while in the productionenvironment there's no harm inreassessing these products and see if weneed to rethink uh our security controlsokay uh infrastructure security ofcourse uh because some of these attackscan also uh mean elevation of privilegesso you also need to have yourinfrastructure like very lock down andas I said don't trust yourself with somuch as a password um and third partyrisk management because if you arenowadays almost all of us are uh usingAPS from other third parties othercompanies and other companies are usingus this is another thing that you haveto uh that's you you are entitled toassess their security poster and see ifthere are putting all the mechanisms inplace that to to see if they are alignedwith your principles and then it makesyou it helps you decide if you areengaging in businesses with them and youshould do this periodically as well tosee if they are still um I mean inindeed like this very last week I hearda lot of attacks that came through thirdparties providers you know uh you knowyou may might want to to take a look atthem keep an eye on them they are goingto keep an eye on you as well so that'snormal it's not that we don't trust eachother but we should not trust eachother um so of course if you have asecurity Champions program whatsoeverwhat measure or even talk about securitybetween you if you're in a small companythis is always a win I mean I reallyencourage you to to there are a lot ofof podcast and YouTube channels thatthere are a lot of talks nowadays in theconferences that are talking aboutsecurity I really really reallyrecommend youto just uh go there and look what's instore it's very interesting and and it'sveryfascinating and of course sometimes it'sstressful but it's fascinating so okayum if you do did all your homework youwill have them crying and they willdecide to go elsewhere because it willbe easier to hug somebody else than yourcompany so uh thankyou if I don'tknow I think I'm done my time but if youhave some questions now is a coffeebreak I'm better have short distances sowhatever uh I don't know if we have yesfive minutes okay if somebody has aquestion be gentle on me but here do youhave any recommendations for a tool thatautomatically check myapis maybe an online tool a what atool uh do you have a recommendation foran online tool or a normal tool thatchecks my apis forvulnerability vulnerabilities yes uh Imean in the in you have these uhscanners that are that you can they arelike uh this in intrusion detection andintrustion prevention uh tools are alsodetecting if some someone is coming youknow and maybe you also have like onapplication Level application Level youhave like uh a tool that is called raspuh that you can plug it in and it's alsonot only preventing H inbound but alsofrom your own systems if there is a Mismisuse you know and the Ws are alsoscanning all the time you know they havealso antib detection maybe there arelike there a lot of toys in the marketyou just have to see what you need Imean don't get crazy on buyingeverything just know what you need to toto actually observe and then decide onit uh I don't know if I answer yourquestion yesyeah hello hello do you hear me so thankyou first thank you for yourpresentation is very interesting andclear and my question is um regardingyour experience are you decide whichbecause I saw that there are a lot of uhAPI Gateway no so regarding yourexperience how you Choice which is thebest for your company in this case orfor if you have an application how youdecide which one you need to use yeah II see I well uh in case you're asking Imean we are a big company so we canhandle a big product such as RG onGoogle it's a very complex but a veryinteresting platform I still find thatthey have a room for improvement everyevery product has but I have been takinga look at maybe gravity or Kong you knowbut it depends on the budget so this iswhy I didn't want to I I I'm never I'mgoing not to try to address a certainproduct because it depends on yourbudget and maybe if you don't have uhthe budget to even have a a API platformyou can use this capability separatedand use some open source and freesoftware that is laying around them somaybe you should see what you need whatyour budget is and then you can ask fordeos and then you can ask for you knowprices and then decide which is best foryou but they're getting like really goodat the controls they make they might bedifferences between what they callbecause in in fpg they call this aproducts and API prodes and we haveshare flows with children like you groupa lot of security policies together andreuse them but maybe other platformsdon't have this but have othercapabilities I mean there are in graphqlI think gravity is good at controllinggraphql but if you're using graphql doit carefully because it's this isanother but I mean this these productshave different capabilities and arebetter at uh at some I'm and and a lotof them also don'tdon't uhH post vendor locking so you might wantto take a look at each one of them youknow yeah I suggestyes but okay so thank you2025-04-15 22:17:43.432720 hh��o�~#��ArOylXCxix1Iuh in this talk I'm going to explainwhat are the benefits of using APIplatforms and put them in front of yourapis so you can reinforce securityaround your backend and prevent aninvited guest to reaching yourdata so um let's go to the problem firstwe're going to talk about apis internalthere's there are benefits behind usingapis although they have been like aroundfor quite a long time it's has been likearound the last decade that companieshave been ramping up the use of thembecause there are business enablers youknow you can expose your backand in tinytidbits so you are now more visible yourproducts reach farther audiences aroundthe world and of course revenues uh canbe increased uh by by doing this soadditional benefits of apis they aresimplifying the communication andinstead of using monol liticsapplications you can break them downinto little pieces and expose them in avariety ofclients uh of course systems cancommunicate with each other with acommon contract a Common Language theydon't need to know what the softwarebehind is you have you define a protocoland uh some uh contract and they starttalking applications toApplications uh of course you have now adiverse arve of client applications thatare now accessing your back end you canuse uh web applications either if thereare first party or third party webapplications you can use Internet ofThings who doesn't have an Alexa at homeor maybe even toasters or Internet ofThings nowaday they can do they can alsointeract withapis um you can have also of coursemobile devices communicating through theapplications mobile native applicationsand then you have uh server to servercommunication which a lot of businessesuse which is third party to third partyit's just overall no human interactionis involved them and you can this thisthese communications come inbound theyconsume you and they can also go theother way around you are consumingothers this is very important and keepthis inmind um of course if you are migratingin between protocols uh can I ask who ismaybe using soap uh protocols here arethere any users still in the ground onehand two hands three hands yes I some ofthem it's a very antique protocol but Imean it's still been used and maybe youwant to migrate but you have like a tonof Legacy uh web services and you're notuh fast enough to move it to rest orwhatever you need do you have otherprotocols that grpc or graph CLEdepending on the use case you might wantto be using one or another uh maybe youcan put an abstraction layer using theAPI and do some protocol transformationso you you can move slower you know andwith the API platforms you have thesebenefits but of course nothing isperfect Isaid uh I diverse client applicationscan access your backends yes and this isthe ones that you think you control butthere are also uninvited guests that arecoming through most of them um Imean uh like maybe some 10 15 years agodoing one operation maybe involved a lotof human interaction and if you maybecome came from a single point like themonoliths a web application or maybe aphone call or an email now it'sapplications talking to applications andin the worst case scenario nobody'slooking you you have observability toolsbut not everyone using them uses themproperly or I mean we just had a talklike at nine about um observabilitywhich was very good umso uh let me tell you about some risksthat you can find specifically for APIsecurity I don't know if you know AAS isthe open web application securityproject they dedicate their lives likethere a big community that that they areraising in security awareness they evenhave a whole framework for mobilesecurity I really recommend you to lookat it and they have this this also a lotof projects but they have this list thatthey uh put together a ranking likeevery two or three years of uh threatsthat can that are apis or webapplications or mobile applications ornow they have the AI list as well um theones that have happened that sam��event based on CPU utilizationyeah it's been yeah yeah yeah no I I wasnot using uh but we've been usingcluster autoscaler for a long time butas some of you know uh you could haveactually configured Autos scaling grouptrigger event uh in AWS um I'm surethere there are some of that for uh forfor uh Azure and for for gcp and basedon this node resource consumption itwould scale Autos scaling group so itwould add more instances more desiredstate to that um then cluster out scaleruh they started developing it in2017 um I've been using using clusterscalar from the very beginning um and umafter that time other other Autosscalers started to pop up uh they'vebeen um providing different ways uh toscale up the cluster um to add spotinstances if you didn't want to danglewith Autos scaling group yourself andthere has been a lot of um uh differentimprovements for uh autoscaling groupsand cluster um cluster autoscalers soyou could have mixed instances profilefor autoscaling group if you if you likereally into it um and now in 2022 uh AWSwas like hey hold on let's let's dosomething better and they starteddeveloping Carpenter which is clusterautoscaler so again like what does it doum you have some pots which which cannotbe scheduled in existing cluster uh dueto some limitations um this will trackif uh we will talk about not pool andnot pool weight a bit later it willtrack if like which of the node pool itshould scale up then it will launch anode it will Mark all the Pod that theyshould be scheduled on the specific nodeand once the instance is up it will pullall the im images and start um pods foryou oh no yes please okay nice um okayso uh who is using Carpenterhere okay yes quite a quite a few peopleum and um I I want to point out herevery nice benefits um we've been seeinguh using Carpenter uh first and foremostis a nice to build platform because withAutos scaling group and and with clustera scaler you would need to create Autosscaling groups through terraform or orclick Ops or whatever you prefer youwould need to configure clusterautoscaler how to operate with this Agand this feedback loop on working withdevelopers who wants to use a specificinstance type for some reason was reallylong and this is nice because everythingyou manage inside the kubernetes youhave all observability in one place soyou don't need to go to Autos scalinggroup and see oh why it didn't scale upwhy didn't why didn't scale down whatwas happening um it's a bit more smartabout sizing notes for a pack of pots soit will actually uh choose moreappropriate instance types for um forcluster autoscaling um you can provide avery nice um like you don't need toprovide a specific instance type you canspec you can specify a family um or youcan specify some restrictions so forexample oh lunch me instances in thisnote pull only up until some number ofCPU available uh it also considers pricefor launching uh for launching notes sothis is great uh if you're into costoptimization um and this this also hastwo very nice features which is driftAmy detection um and uh and disruptiondisruption we will talk in a bit um andfor drift Amy detection uh in case youhave uh very strict security um uhsecurity consideration for yourorganization uh you want to be able tosay hey I have a new Army available thatinclude all of the patches and Carpenterwill see oh I have a new am amidavailable if you specify it and we willcover some of nopool uh examples in abit and it will uh launch this new nodesand it will help you to have a very gooduh security posture at least on the onthe Nodelevel uh so on the right um I'm not sureif you can see from from the back rowbut this like some examples of ofrequirements and this is restrictionsyou can provide for not poool what typeof instances it can launch not type butonly zones and and all that and it's notthe whole list but it gives like areally nice um fine grain control overwhat's going on uhokay okay so Carpenter disruption or umit used to be named consolidation Ithink consolidation is a much nicer uhnicer name what it does imagine you haveuh two instances inst�ance a instance Bum and let's say it has enoughallocatable uh space for launching uhfive pods on on the top and three podson the bottom Let's ignore all the demonset let's let's not get into the detailsand Carpenter can see if if you use thisfeature it can spot oh my instances areunderutilized so I can launch a newbigger one move all the pods from thetwo previous to a new one and I willsave uh either a space so I will haveless non-utilized space or it can saveyou money and it's really nice um withsome details so we will cover we willcover problems with this features we wehad uh at perfect scalelater okay so uh what is Carpenter weknow that this is cluster autoscaler interms of KU terminology it's op it's anoperator so what it operat operates onuh it operates on three crds which isNoe class node pool and no claim nodeclaim is more of a virtual uh thing crdyou don't really control so this is whatuh what U Carpenter operator uh use uhto store the state or where it needs toprogress in terms of specific instancewe don't have much control over it butit provides uh very good observabilityum and note class and not pool uh youcan think of not class um as a storageclass so this is like a a class that'skind of reflective of what we need acloud provider to do and note pool thisis like actual kind of instantiation ofa specific node class with somelimitation in terms of kubernetes and wewill cover this uh we will cover this inaminute okay so um uh so uh a note classand this is uh just an example uh andhighlights couple of things that uh wethink important uh also to set up uh souh Amy family uh you can uh use UbuntuAmazon Linux whatever you name it uh wehad very funny issues with okay notfunny but we had an issue with that uhso um uh you want to either specify anAmi family which can bring you kind ofrandom um well not random but verifiedAmy in your cluster or you can specifyspecific specific Amis if you like it uhif you uh bake your own Amis anybody'sbaking ownAmis yeah okay some yeah uh then you canspecify this Army here um blog devicemapping uh for example you want to havea root device encrypted I usually wantto bump a bit um volume size for a rootdevice not have a default I think it'slike 20g maybe not enough for pullingall uh fat images and store all the logsand we also love using gp3 uh so weoverride the default option of gp2 umand uh Some Ds is like a roll um andSecurity Group subn subnet selectorsalso here we use wild wild card uhselectors uh that why uh we don't haveto uh we can uh have more easydeployments of the same uh home chartacross all other provisioned regions andand clusters so this is really nice uhbut like the the top part is isimportant uh and I'll have a QR codewith some more sophisticated examples ofnot class uh that that we use okay sonot poool uh again this is working inconjunct conjunction with uh not classuh so um a not class represent like aspecific um class uh and here we can uhspecify a configuration of disruptionand different in cluster details andI'll go over this um because I thinkit's important to to know some of theoptions and uh to use them uh so firstand foremost uh Let's ignore disruptionfor for a minute uh you want to specifylimits so you don't over provision toomuch notes in case you have a rock um uhreplica Set uh for deployment and justscaling up cluster infinitely you don'twant to have that uh you want to specifyand limit limit the blast radius um uhwe always specify how much we reserveresources for CU blood and systembecause well I'm not sure if you did itwe had this when uh some poort use a lotof memory and and use a lot of CPU andthen you cannot even SSH into instancebecause it's getting throttled soheavily um and uh here we arereferencing uh no class and this isthrough no class ra uh and we of coursespecify requirements which uh which inthis specific example is a concreteinstance type but you're you're free touse a family or or provide some otherrestrictions and the last one is aweight uh and this is what um whatallows us to say um to Carpenter pleaseprefer to launch this node and this �is avery interesting setting because forexample you have reserved instances uhin in your cluster uh who has reservedinstances yeah saving plansokay also saving plans yeah so you cansay that please Carpenter prefer tolaunch a reserved instance for usbecause we want to utilize first what wepay for and only after spin up some ondemand and this is this this also coolif you for example have has uh if you ifyou have a not pool for spot and OnDemand you can control oh prefer tolaunch uh spot uh if it's if specificinstance time for of instance type forspot is not available spin up and ondemand instance for me and this this isreally nice and the top section is um ismore of um is is super cool I love Ilove disruption um so budget I'll coverin a bit uh so consolidation uh what itdoes uh it's you have two options youhave one empty and you have oneunderutilized so when empty only thewhen the node is empty only after thatCarpenter will consider shrinking it alltogether uh when under utilized you canset up uh when it's underutilized so forexample if your memory underutilizedunder 60% cluster scaler has the samething then it will um start thinking howto improve a situation in the clusterand uh you have expire after uh ifyou're uh having to like dynamicallyfixing security issues with your notesuh together with AMI family uh you canhave a thing where you have a TTL thenote will be running for let's say aweek and after this week it will beexpired so Carpenter will will mark thisnode as being trade it will launch a newnode which will have a new Amy and itwill migrate all the part from aprevious node to this to this new nodeand this is great to have this likedynamically automatically updated systemand then you just go into control planeupdate control plane and you're goldenthis really nice um used to spend a lotof time doing cluster upgrades so itdefinitely helps uh and then you have abudget so uh think of this as a Max andavailable uh in terms of deployment soyou say hey this amount of uh no uh thispercentage of notes can be uh you candisrupt them um and you're you're freeto go uh and this works together with podisruption budget uh so you want youwant to balance it out together withthatum okay we have uh cure code with morespecific examples and um using reseredinstances this specific article hasouted crds uh uh Carpenter is a veryactive project so uh you have always uhsome changes and they change it crd butit has a good uh good enough detail andfeel free to uh copy and paste and umyeah test it out yourselfum okay so now um something we we uh werun Carpenter for um 1 and a half uhyear now and we accumulated someoperational experience uh so uhCarpenter does come with nice metricsbut it doesn't come with oh here's whatto look for or this alerts you shouldset up um so we came up with three mainthing not pull limits if you uh specifynot pull limits and your not poool isreaching the limit Carpenter will not beable to launch uh new EC instances uh souh we have an alert on top of that andthat means we are almost there we go andincrease it or see what's taking thespace uh second one is cloud provider SSif if you have some um I don't know uhrolled out uh weird uh policy change forCarpenter roll uh or something off onthe cloud provider side we also havehave an alert for that and the last oneis a note not registered in cluster uhand I'll tell a story about that that uhwe used uh we used I think it was UbuntuAmi and they rolled out a change wherethey uh do not properly Escape uh stringand we specify our own parameters uh forkuet and the no just couldn't registerin the cluster uh it was a new Amyversion released uh so um uh we didn'thave a disruption we have alert poppedin almost immediately but that was ascary uh thing in case you you use thisautomatic Amy so if if you want to be onthe safe side maybe provide a specificAmy or we use Amazon Linux uh which uhwe find to be more stable uh forus uh now what do we monitor uh we ummonitor pod in notes and I'll have ademo after that so it's a bit more wordsand then we'll we'll check something uhwe uh monito�r idle uh space for noes andjust uh see oh um what's going on maybeyou can improve something uh we check atlittle memory in total CPU just to havea grasp of what's going on inside thecluster and we uh use spot instances whohas spotinstances um okay yeah so we use we usespot instances it's it's very good uh Ilove it um but we want to track um howmuch percentage we always aim for moreum okay so demo um and this is a QR codeuh we also have a dashboard I'll showyou a dashboard now released uh forgrafana uh feel free to go there it'sfor new version of Carpenter and it alsohas this alert alert exam alerts exampleuh that uh I describe uh so feel free togo there download um and yeah it's it'snice so this how it looks like uh wehave a it's uh one of our Dev cluster uhwe have here we can um check fordifferent notes um we have 74% of spotnotes which is which is good how muchpods we run uh we can see uh clustermemory all that uh here um and forexample distribution of the nodes and wecan see it's quite nicely if you're haveheavier on one of the no one of thezones uh you may have some resiliencyissues just depends on your uhrequirements and I I love this one um soit gives a specific node name like forme this would be uh very hard to buildunless you build your own exporter uh ontop of um on top of notes uh in Commcluster so this is a carpenter metricand we are able to see how much nodes wehave on specific uh how much puts wehave on a specific node what is the CPUutilization memory utilization and allthat and this this is really nice I justopen here and you can see what's uhwhat's going onum so uh yeah um um yeah what what elsedo we have hereoh error codes not claim not found Iknow probably you'll have to go andcheck what's going on here uh yeah so umso this is uh forobservability um okay okay now it's uhit's a couple stories we had and um whatwhat we learned from it uh so uh we usea separate e static note pool when youprovision e you can have it uh I do notwork closely with with Azure uh soCarpenter also has a provider for Azurenow uh if you do use uh please write tome I'll be very uh Curious to to hearyour take on that uh so uh we uhbootstrap our system critical componentsthat provide bootst that provide clusterautoscaling attaching volumes Etc on aseparate note not pool of eeks that isnot controlled by Carpenter uh if you ifyou have Carpenter running on the nodeof Carpenter and it kills it can killhimself which is not cool so um so useTain T Toleration and Affinity to to runthis on the specific note if you boosterup a cluster through Argo City orsomething uh we also prefer to to to runon this uh static node pool uhinitially okay so uh careful withdisruptions um disruption um so you canconfigure how often or how fast they runfor specific notes but let's say youspecified 5 minutes and you have veryDynamic cluster that had that has newpods coming into the cluster um and uhCarpenter will always uh be your friendbut it will be your friend in a very umuh in a very specific way where it willjust do what what you says him to do andit will consolidate and when itconsolidate uh it uh it gracefully shutdown the pots uh so some of the PO takelonger time to start up for examplepredus write Ahad log reading on the onthe on the bootstrap um and this can Umthis can affect uh affect your SL forspecific components in the cluster so uhif you consider migrating to Carpenteror or you already have some uh workloadsuh be careful uh use one emptyconsolidation uh we are happy with uhwith with what we have we don't need anyfurther uh thing uh so I suggest if youjust migrate it from a clusterautoscaler uh or uh setting up acarpenter uh disable it or set one emptyand move towards your underutilized goalvery slowly so for example you can Markspecific PS uh as do not disrupt andCarpenter will uh do not uh disrupt andnotes where the spots runs so this isone of uh the instruments that you haveand only after you do that you're likehuh this the spots can actually bedisrupted maybe we launch them on onthis specific note pool and for thespecific not pool we do have� disruptionuh consolidation uh configureoh oh this is a slack okay thanks umokay and um uh yeah um so um so we hadthis uh issue uh with TTL for the nodesnow it's way more configurable so uhspecify your um disruption budget uh butwe had this issue when uh it would bringup uh a new not pool uh and for this notpoool we would have a TTL uh and wedidn't have pdb set up for uh forspecific deployments um and um then acarpenter would just uh bring down uhand move all the pods for uh for all thenotes at the same time because uh theTTL is like 3 hours from a launch it waslaunched 3 hours ago let me go and andand move them all uh so um so be carefulwiththatmhm okay I don't know what oh yeah niceokay uh so uh also suggest you to addInterruption queue uh it has a littlemore uh TDS uh set up uh than justrunning um Carpenter uh so you will needto add sqs and configure all thepermissions to read from that I justsuggest you uh do it right off when youstart using Carpenter anybody use a spotInterruption Handler uh within your uhspot instances yeah yeah so uh so spotinstances they have a Interruptionsignal 2 minutes before spot will beinterrupted it will receive a signal andalso AWS can send this to in tosqs uh then what carpent we'll we'll dofrom here and I'll cover some of eventsthat can lead uh that that can go intoInterruption queue how Carpenter handlesit is like okay this SP incense will beterminated what are the pots I need tolaunch a spe like the same node um onthe side and then it moves uh pods tothe new uh to to the new node uh so um Ithink it's it's great uh because if uhif you use a spot uh spot terminationHandler demon said uh then uh it wouldjust gracefully shut down pods uh andtrain uh it would just drain and uh thenit would take a longer time to move theSPs to new node um and it's not onlyabout spot Interruption but you can forexample um if AWS cannot do a healthcheck for instances launched then uhthey will also be interrupted you canalso have uh termination terminatingevents or stopping events if you go ininto console and terminate instanceCarpenter will like oh I need to launcha new one so it will it it will uh helpuh to reduce uh launch time for forpods um okay note pools uh We've cover Ithink I've covered all of the optionsbut again restrict max memory Max CPUfor note pool so none of the not poolsgo Rogue um choose specific instancetype uh we actually had a lot ofdiscussion with uh with team on thistopic and it's um it's not too small nottoo big uh let's say if you have ummicros uh instance available in your inyour cluster just keep in mind that whenit starts up you will need to put allyour demon set into that for example Iknow what you use uh fluent beat uh uhwe use V ccni so V ccni and all all ofthe demon s pods and they will uh theywill leave their rentree uh so you willactually have way less allocatable spaceon this smaller type of notes but alsonot too big in case you have three I'sand you love and uh you allow likemassive instances instance type uh allyour uh pots will be running on thislet's say three notes across A's andthen what happens if this note getsterminated for some reason not cool allthe pods will be moved so you will havelonger time uh like you you will haveharder time to recover uh also choosefamily types uh we prefer not to useburstable for our in-clusteruh components um because it's a bitunpredictable uh and uh you it's likechoose family types you know yourworkloads uh better um also for not toosmall not too big uh if it's a smallerinstances they usually have less Networkbandwidth uh if you have Network heavyapplication probably want to go biggeruh always reserve space for system andCU blad uh it's not funny when uh whenyou have some problem on the specificnode and you cannot even just go in andcheck what's going on actively uh choosebig enough root volume uh we had someissues when the root volume wasoverfilled with uh with images and justincreased it so just go back uh for thatum and um uh if you have uh well it'smore a specific um recommendation for umspecific recommendation for for for usecase uh but uh let's say you have acluster uh where you have replicas thatneeds to replicate a lot of data amongthem maybe you don't want to you don'twant to have in zone traffic uh for thatreason um and because you pay for inonetraffic uh so uh you can wait note poolsyou can say please prefer to launch thenot pools on in this a and this willlaterally collocate um it will increaseprobability of collocation of the portsin this specific zone so you'll pay lessfor traffic um but maybe some of youhave thatum okay for uh for cloud provider umCarpenter is good but um at last for ushe just launch uh instances uh soInterruption queue important uh and uhall AWS accounts they have limits uh foruh what types uh and how how manyinstances of a type you can uh launch atthe account um so go in your console andcheck if maybe you have this limits andso you know this limits ahead of time umand the last one is a a bit small butimportant if you have a cedar wrenchblock if you're using a as cni and youhave have um small uh cedar block foryour VPC um make sure to monitor this IPaddresses so you don't run in thebecause Carpenter loue instances andsometimes it Lune a lot of that so allof a lot of demons s they will havetheir own cni attached to the instanceand you don't want to be in thesituation where you have where you'renot able to laune pod on the Note justbecause you're out of uh IP addressesand uh you have sparse uh allocation ofthis IP addresses usually in uh in Cloudproviders okay uh so uh I think wecovered a lot ofcarpenter um and um what we run onCarpenter are workloads uh so how tomake them run better with Carpenter isof course first provide pdb uh if youhave some um uh if you have workloadsthey should handle uh graceful shut downnicely uh uh if you know that it willtake longer make sure you you add it uhyou all always wants to specifyresources uh for workloads running thatway um that way uh scheduler will beable to provide this resources forworkloads and do not collocate them allwith burstable type on on the or uh uhbest effort type on the specific notesuh and you want to optimize thisresources and after you optimized it andyou run it for a little bit and you havea new version release maybe you want tooptimize it again uhso uh you always want to have just theright side we call it internally if youuh over over provision uh request uh youwill have waste of money if you underprovision for example a memory requestyou run into the issue with with higherchance of om if you do not Define themuh that Works badly with uh clusterautoscaler uh so it can lead to a lot ofvarious problems with that and you alsowant to specify limits for memory wedon't specify limits for CPU uh to nothave CPU throttling but you can also uhdisable CFS uh for specific notes andstill specify limits just will not playfor you um and that's what we do atperfect scale for workload and it worksvery nicely for Carpenter so uh feelfree to first go to our blog post wehave more on Carpenter and we havecouple on specific workloads we have a30 days uh free trial and if you haveany questions um feel free to pop by ourboost we are on the on the lover floorand um talk about Carpenter or workloador resource optimization that that thatthat you do in the cluster so thank youfor your attention and uh yeah maybe youhave any questions to meyessure ohokay so uh does car carpenter work withother clo providers uh azur or or GoogleCL it has it has provider for AWS whichis which is good uh and for Azure theyjust released it I think couple monthsago uh I didn't have a chance to use itfor gcp is not available now but this isvery active uh project and I'm sure theywill they they will release it soonersooner orlater is an open Community or it comesfrom from it's Community it's in a siksik kubernetes carpenter project it'sopen source it's backed by I thinkcouple develop Engineers working on thisproject at least who started coming fromuh from AWS uh but uh it's veryCommunity Driven thanks thank you thankyou thank you for the explanationyeah anyoneelse he thanks[Applause]2025-04-15 22:17:44.211850 � ��d�#��AQx_6ItsOrQsHola a todos gracias porvenir me llamo Joan soy platformengineering o plataforma devo Sr lo quele queráis llamar desde hace dos añostrabajo arreglando problemas en D TRcasi la mayoría del tiempo el resto deltiempo estoy generando esos problemasvale Y hoy os voy a vamos a charlar unpoco sobrecrossplay cómo lo hemos hemosimplementado un caso de uso que hemosimplementado nosotros en D trace Vale yviendo básicamente los conceptos básicosde crossplay en Qué es cómo funcionapara que se puede usar Pero antes deempezar con la historia Veo gente jovengente más de mi edad quién de vosotrosha nacido en los 80 o antes levantar lamano genial el resto Supongo que soismás jóvenes para los más jóvenes si noentendéis alguna referencias que haySupongo que sí pero si no me avisáis y ylo aclaramos luego valee ya antes de nada deciros que estahistoria es verídica todo lo que Vais aver es verdad he cambiado nombres ylocalizaciones para proteger laidentidad de de la gente vale pero perotodo todo lo que os voy a explicar escierto no me he inventado nada deacuerdo o casi nadae bueno y la historia Cómo empieza puescomo de costumbre ehempieza con alguien pidiendo algo eneste caso una usuaria que quería hacerun boarding de unaaplicación bueno Y nosotros somos unaempieza muy cool y obviamente esto lotenemos automatizado tenemos un templateque todo el mundo puede usar para hacerel onboarding de una nuevaaplicación y el template luce tal queasí pues nada lo típico coges eltemplate de donde sea de gi en este casorender lo que tengas que renderizar tecrea tu repositoriolo registra y ya puedes empezar adesarrollar tu código y en ese templateobviamente hay un peline para tú poderdesplegar esa aplicación y el pelineluce tal que así y o sorpresa cuandolanzas el peline por primera vez pueshay cosas que no funcionanvale Y en este caso Qué es lo que nofuncionaba en los pelines queteníamos Puesbásicamente teníamos problemas con losrepositorios vale donde los losrepositorios de artefactos donde tenemosque pushear Ł� �#��IAkeInQWSYWzgwho is s you here DAV Ops anybodyplatform engineering many names yeahokay nice um I'm Ian I'm SRE at perfectscale and uh today we are going to talkabout Carpenter and some of the issueswe had uh using Carpenter uh and what isCarpenter and how does it compare toother cluster autoscalers you may usenow or have been using in thepast yeah okay so cluster Autos scalingin nutal um on the left you have acluster State let's say you have adeployment it has different port antiAffinity all all that set up so you wantto run on the separate notes and youhave for some reason a cluster uh not acluster workload scaling event it can benew workload deployed you can have HPArunning uh you can have some Crown jobscat Auto scaling or you may have a Maxsearch for roll out so you always addcouple more replicas of workload whenyou're running and inell what clusterautoscaling does it launch a new nodefor for your pods and this this reallygood uh in terms of cloud uh because youpay only for uh instances you use wellwith some exceptions but generally it isthe case um and it's also good it's alsogood for on demand because you will havemore hardw Hardware space um left forsome otherworkloads yeah uh so um where it allstarted uh for kubernetes um any anybodyhas been using uh Autos scaling group uhscaling ��todos los artefactos quegeneramos durante el pipeline uno de losproblemasprincipales es que bueno pues cadapiline tenía que pushear el artefacto acada uno de los repositorios que quetuviéramos Pues si teníamos en azure oen Google o en Amazon y además por cadauno de los entornos pues hay que pusheareso con lo cual tenemos que pushear nveces el mismo artefacto a nrepositorios además obviamente pues noes muy seguro Porque el pan tiene quetener permisos para acceder a cada unode los cl providers y y poder lanzar esoy finalmente sobre todo y lo máspeliagudo es que los repositorios tienenque existir De antemano si no por muchoque tú tengas premios y demás no vas apoder subir ese artefacto en ningúnlado bueno pues obviamente para hacereso hasta ahoraH cada equipo de de desarrollo se teníaque buscar un poco la vida paracontactar con los equipos del plataformadecirle Oye tengo la nueva aplicaciónpor favor créame elrepositorio ydemás cómo arreglamos nosotros esto Cuáles el plan que tenemos para arreglarloestá implementado ya y esto lo vamos aarreglar poniendo un hardb no sé sisabéis lo que es hardb vale buenoes un registry Open source que entreotras cosas nos permite replicar lasimágenes y el plan es poner hardb ahípara que desde los pelines tengamos quehacer el Push a un único sitio pushe lascosas en a hardb Y desde ahí puesautomáticamente la replicar deos dondeconsideremos que tiene que estar eseartefacto en Google en Amazon en azuredonde consideremos dependiendo delproyecto que sea ahademás como solo tengo que pusar a unsitio Pues solo necesito credencialespara tener acceso a hardb y como voy acrear los proyectos yo porque vas a verlos proyectos que son cada pilan va atener las credenciales solo para hacerPush al repositorio concreto para suaplicación Bueno pues además de unmontón más de de cosas que nos ofrecehardb aut Box como son gestión deaccesos escanes de seguridad etcéteraetcéterabueno despegar Hardware enprincipio es sencillo es una aplicaciónnosotros la hacemos con helm y con argoCD desplegamos hardb y qué hacemos conlainfraestructura Cómo gestionamos lainfraestructuraBueno pues tenemos varias maneras dehacerlo una de ellas es terraform peroaquí el el chico este de mi equipo puesodia mucho muy fuerte terraform Ah o seano es que no nos guste terraform vale lohemos usado durante mucho tiempo perotienealgunas particularidades que a nosotrosprincipalmente pues no nos encajabanvaleAh quéhicimos buen estoy aquí para hablar decrossplay no es ninguna sorpresa puesalguien dijo Oye pues vamos a hacer estocon crossplay que lo he visto por ahí hevisto un vídeo de vctor farsi y pareceque Mola vamos aprobarlo bueno y como estamos un pocozumbados nos tiramos a la piscina y lohicimos además tiene un logo super molóna qui no le gustan los helados pues quépodría salir mal Vamos con todo Ah buenoy cres cospinAh Es un es un framework Open sce de lacnfc que básicamente lo que te permitees extender la funcionalidad dekubernetes Vale hasta ahora conkubernetes pues podías desplegar Ah tusdeployments Demon sets ah stateful setchtus aplicaciones al uso vale Y con Crosspen puedes desplegar recursos en enCloud usando la misma Api Ah y por quéeh Por qué nos encaja o por qué usamosnosotros crossin valee lo primero de todo por os he dicho nospermite gestionar cualquier cosa externaal clúster pero lo más importante esporque funciona o sea emurdoc está loco pero no es tonto o seavale Ah lo probamos vimos que funcionatiene una comida muy potente detrás hayun soporte grande y hace lo que prometeque que hace O sea no es no es hummo Ahlo segundo es que al extender la ap dekubernetes nos permite reutilizar losmismos pipelines que teníamos hastaahora para desplegar aplicaciones no sécuántos de vosotros de aquí usaterraform levantar la mano valequé tal la experiencia de automatizarlos despliegues de terraform en plangitops bienno Bueno pues es complicado al menostienes que montarte algo custom aoc paradesplegarlo o usar una aplicación deterceros para desplegar cosas conterraformcon con crossin com�o despliegas objetosde kubernetes puedes reaprovechar lospelines que tengas de desple tal cuallos tengas ya me da igual si lo hacescon helm si despegasmanifiestos directamente a pinch Si usascustom lo que quieras lo Reus y lopuedes usar talcual Bueno pues además con Cross nospermite reusar configuraciones vale Yalo veremos luego tiene algo que sellaman compositions que nos permitepatizon distribuirlas y reusarlos demanera que no tengamos Que picar todo elrato todo el mismo códigopodemos extender las funcionalidadestambién de crossplane con providers estoes lo mismo que terraform pues cadaProvider te permiteAh gestionar un Cloud Provider unaaplicación o lo quesea y una cosa super importante es quecomo estás desplegando objetos dekubernetes usando lápiz de kubernetes alfinal es muy sencillo medir y observarlo que está pasando ahí vuelvo al similconform Cómo de fácil es con terraformsaber qué has desplegado si ha ido biensi ha ido malAh si está actualizado si no estáactualizado lo que está desplegado buenopues tienes que ir mirando los planesque has hecho los deploy que has hechosi si ha habido algún fallo si no Ahbueno pu al menos es es durillo de hacercon Cross Pues eso ya te viene de saquey puedes monitorizar tus despliegues ola infraestructura o los recursos quetengas de la misma manera que monetizascualquier otra aplicación en en enkubernetese bueno veamos un poco cómo funcionavale tienes una capa de Core vale que esdigamos cuando tú instalas crossplaneahí te se levantan dos pots de Core decrosslin que eso te da lasfuncionalidades base luego tienesproviders Vale pues como el de aws valecada Provider viene con un subset decustom res definitions vale cada uno conlos suyos cada Provider tiene los suyoslo veremos luego vale Y luego tú comocliente como usuario pues generas tusclaims que es tu definición de lainfraestructura que quieres y luegoCross Perdón por detrás de ese claimgenera un manag resource que es elobjeto interno que usa crossin paragestionar ese recurso que eso al finalse termina transformando en un objeto enel Cloud valetotal Qué son los providers Porque aquíes donde está la chicha porque el Coreen sí Lo único que te ofrece es una capade extracción que teda los servicios básicos de de Cross peny te ofrece los los objetos básicos o deprimer nivel de crossplay vale siquieres hacer algo realmente tienes queInstalar providers los providers pues noes más que un operador es una pieza deSoftware que instalas ahí también Vjuntamente con con crossplay que lo quete permite es extender la funcionalidadde de la Api dekubernetes para poder gestionar apisexternasVale entonces con cada uno de losproviders que instalas al final lo queestás es agregando funcionalidad extravale le estás dando un superpoder nuevoa tu a tu claser de crossplane vale quele permite gestionar a Pues un CloudProvider nuevo ya sea Amazon Google oazure o cualquier otra cosa no tiene porqué ser un un Cloud Provider puede sercualquier tipo de servicio que esté enla nube que esté fuera del clas dekubernetes HM Bueno pues podéisgestionar prácticamente cualquier cosaDe hecho si os vais a la web de abbond yecháis un vistazo a a los providers quehais veréis que hay una cantidad grandey creciendo cada día más de providersque podéisusar Bueno pues nosotros para elproyecto este de Harbor usamoscrossplane entre otras cosas paradesplegar unards que necesitamos para hardb lo usamostambién para desplegar el elastic elredis que también necesitamos para hardblo usamos para desplegar los backet S3que necesitamos también para guardar losartefactos en hardb y también lo usamospara desplegar las políticas de yam enaos que nos den permisos para que hardbpueda hacer lo que tiene que hacervale Y también lo usamos o lo queríamosusar para configurar Harbor en Sí porquees muy bonito desplegar toda la infuracon hardb Pero luego se nos quedaba unpoco cojo el hecho de bueno una vez lotenemos tododesplegado cómo gestionamos laconfiguración el día a día el día dos deHarbor vale porque una cosa es despegarel servicio con �su infraestructura yotra cosa muy diferente es luegogestionar el día a día de ese servicioAh bueno pues cómo loharíamos Pues si tenemos providers parael resto habrá un Provider también parahardb No pues o sorpresa no hay Providerpara hardb vale Entonces qué hacemosCómo arreglamos esto Pues estamos muylocos y dijimos bueno no pasa nada Ah lopicamos nosotros valeAh este si no os habías dado cuenta esel forzudo del equipo y le va bien todoy es muy valiente sin de que sí Y ademáseste es super Happy y le va bientodo al final desarrollar unProvider Ya ahora os lo explicaré unpoco por encima puede asustar un poco alprincipio vale da un poco respeto sobretodo a los pelacables como nosotros quea lo mejor no tenemos mucha experienciaAhdesarrollando pero ah la verdad es quees ssencillobásicamente lo primero que tienes quesaber es si tu el servicio que quierespara que quieres desarrollar un Providersi tiene una Api si tiene una Api y laApi está medio bien ya está salvado valeSi no tien nai yo no sé hacerlo valePero supongo que se podrá hacer pero conuna Api ya tienes muchoganado luego hay un repo enghub en el gab de crossplane el proyectode crossplane donde tien un template túte bajas ese template ya correr valepero correr muy fuerte Lo único que hayque hacer es inicializar elproyecto Definir la Api de tu Provider osea cómo va a lucir el custom resdefinition de cara para el cliente Ahimplementar el controller que es dondeestá un poco la chicha la complicaciónVale y hacer el build el distribute deeso pero la gracia es que eltemplate esas tres cosas con a ver laApi de tu servicio pues lo tienes quecorrer un poco para conocerla y ver quéquieres hacer con ella haces El Clonpero a partir de ahí lo que esinicializar el proyecto generar los crdsy generar todo el boilet Plate de delProvider incluso toda la parte del builddistribute lo hace mágicamente con sus mfiles que tiene tiene ahí que losmantiene agt de outp y funcionan valeAsí que tú como desarrollador lo únicoque tienes que hacer es poner la lógicamínima para desde el Provider decrossplay poder gestionar esos servicioso esos recursos externos vale vía un Apiya os digo en principio Esto está adpara hacerlo en Go no tiene ningunadificultad lo hemos hecho nosotros estoyprácticamente seguro que cualquiera devosotros lo podríahacer Bueno muy bonito Ah solo hay quehacer el controllerAh bueno no os voy a hacer una demoporque soy nuevo en esto de las charlasy me da un poco de miedo vale pero os hetraido un poco de código para que loveáis Y cómo luce el Provider que hemoshecho nosotros Cómo luce al menos laparte de de vamos a llamarle Front loque ven los clientes las tripas No lasvoy a enseñar por por no aburriros peroasí luciría por ejemplo la definición deun proyecto no sé si se ve vale pero loveis porahíbueno tiene la estructura y esto luceigual que que un objeto de kubernetes aluso tú le defines ahí la Api el kind quees me lo he inventado el kind projectque lo definimosnosotros no sé qué ha pasado eh habéisAh vale han movido algo ellos valee un nombre vale Y luego lo que definesdonde está un poco la gracia es en todala parte del spect vale donde tú como Ahplatform engineering vale vas a definirqué quieres que tus clientes pongan ahípara generar tu proyecto Pues en estecaso nosotros pusimos Simplemente nosvalía con que nos dijeran si el proyectoera público Y si querían hicieran o noescan Devulnerabilidades qué más Vamos a veraquí robot account vale es otra de lascosas que hemos implementado porque noshacía falta para cada uno de losproyectos necesitamos que se cree unrobot acant asociado a ese proyecto quetenga solo los permisos necesarios parahacer Pull o Push para ese proyecto lomismito vale defines tu Api Y cómo va alucir Ah más cositas retention policePues lo mismo para cada proyecto tambiénnecesitamosdefinir cuánto tiempo Cuántas imágenes ocuántos tags vamos a guardar de cadaartefacto Pues también lo implementamosvale Y finalmente la Joya de la coronalasreplicaciones cómo hacemos lasreplicaciones para cada proye�cto Y dóndereplicamos cada proyecto vale también loimplementamos con crossplane de maneraque lo podamos definir como un objeto dekubernetes bueno genial pero resulta quenosotros somos una empresa multicloud ehtenemos presencia en aws en azure y enGoogle Además nos gusta tener las cosasdistribuidas tenemos cosas en América enAsia en Europa Ah y tenemos un chorro dede entornos desarrollo test cuaAh de loa de producción vale necesitamosque todas las imágenes se repliquen atodoentornos Vale pues con este ejemplotendríamos que definir por cada uno porcada proyecto45 reglas dereplicación vale Y esta es la cara quese le quedó a nuestra querida suariavale bueno son jels Sípero son un chorro de yels yo no sé sios gusta mucho o poco el tema de losjels pero para cada proyecto tener quedefinir 45 reglas de replicación meparece un poco horroroso valee entonces eh Cómo mejoramos estoAh pues bueno muy fácil ya os he dichoque crossplane tenía una cosa que sellama compositions vale que lo que tepermite espaquetizado para el usuario y no tengaque escribir é mismo o ella misma 45veces una regla de replicación por cadaproyecto con lo cual vamos a tener menoscódigo menos código menos trabajo parami developer con lo cual es bien vale ynada este se apunta bombardeo pues vamosparaallá entonces qué son las compositionspues os lo acabo de explicar vale es elequivalente a los módulos de terraformvale No sé si habéis usado terraform yhabéis hecho módulos pero es lo mismitoes todo tu código terraf quetenías ahí a lo loco y te haces unmódulo quete permite extraer toda la complejidadde lo que tienes por detrás vale Ah puesponendo una Api mucho más sencilla pordelante y lo más importante agrupas unmontón de objetos en uno solo vale Y porqué usarlas pues bueno pues lo que osdigo lo primero es extracción vale si yole digo a mis desarrolladores que porcada proyecto que tienen que quedesplegar yan crearme primero me creasel proyecto luego me creas el robotaccount luego me creas las reglas deretención para saber cuántos artefactosme tengo que quedar por cada por cadatag y luego me creas 45 reglas de dereplicación me hubieran mandado la pero seguro vale con lo cual Ahles pemos una capa de extracción paraque sea mucho más sencillo que lo puedanreutilizar que con el mismo set de datospuedan crear Ah todos los proyectos queque les haga falta Ahtambién podemos hacer un en forment Ahponiendo esa capa de estación pordelante para generar una consistenciapara que todos los proyectos se generenigual y no haya gente que me crea unproyecto público el otro me lo creaprivado el otro con scan y el otro sinescan el otro con una retención de unaño el otro sin retención el otro unaretención de un díavale bueno Y además podemos gestionarlas dependencias entre los objetos valePues si tengo un proyecto puedo generarese robot account que depende delproyecto y me aseguro que como lo estoyhaciendo yo por debajo que eso va aexistir y van a existir siempre los dosy hay una relación Ah y evito quealguien me cree robot accounts apuntandoa proyectos que no existen porejemplo total al final lo que haces esAh mejorar la experiencia deldesarrollador para que sea másproductivo para que no tenga queescribir esas 45 yels para para replicarun único artefactoAh por ejemplo pues el ejemplo que osestoy contando nosotros por cadaproyecto que tenemos en la empresa queson muchos vale necesitamos crear elproyecto en sí el robot account elreplication rule y el retentionpolice queesto lo he roto ahoraAh a ver si puedo venir por aquí puesson todo este montón de código vale porcada proyecto debería escribir todo estemontón de código ahora bien si me hagoun composition todo esto se transformasimplemente esteyamel y esto es otro objeto dekubernetes normal y corriente vale dondeyo le defino mi mi ap version con mikind que en este caso es un proyectogenérico Y en este caso pues decido quesolo me hace falta que me digas elnombre del proyecto con solo poner elnombre yo ya voy a generar todos losrecursos por detrás que que yo comoplatform engine� sé que te hacenfalta lo he dicho es un objeto documentnormal tiene su su app version y su kindque esto lo defines tú comocomo como ingeniero de plataforma túdecides Esto vale Y luego tambiéndefines es las especificaciones quequieres pedir en este caso puessimplemente el nombre vale parasimplificarlo al máximoAhbueno perfecto y Cómo podemose mejorar todavía más la experiencia deusuario Bueno pues como bonus le podemosmeter un idp por delante valeAh bueno genial pues vamos a hacer ahorasí vamos a hacer platform engineeringvale que es el nuevo de bobsya que no le gusta eso y qué es un idpAh no sé si aquí tenéis idps por ahímontados vale e Pero bueno no es más queuna herramienta que en general montanlos equipos de plataforma vale paraproveer de golden pad si esta palabratan fancy Es simplemente cosas probadasque nosotros como equipo de plataformagarantizamos que funcionan vale demanera que los equipos de desarrollo sepuedan proveer lo que ellos necesitanvaleAh pues es como la tienda El service túpones ahí tus servicios con las cosasque tú sabes que funcionan y la gentepuede ir y y hacerlo ellos mismosde la estantera lo que necesiten ymontarse lo que necesitenAh al final básicamente es como unpegamento tú tienes ahí todas todos tusTools y lo que único que haces Estásjuntando los puntos vale con con laslíneas para que hacer que todo encajetodo funcione sin tener que ah que losdesarrolladores e tengan que volverselocos Pues pensando a ver cómo generoesto o cómo genero lo otro o cómo hacíao cómo tengo o a quién tengo quepreguntar nada lo tienes todo ahí juntole das al botón y o tiras De Api y todomágicoe Bueno pues lo que es decía por qué esinteresante para nosotros usar un idpnosotros en concreto nos interesa muchoporque Ah el desarrollador puededescubrir lo que tenemos o sea no hacefalta que yo vaya haciendo publicidad deeh Y ahora puedes hacer esto Ahorapuedes hacer lo otro tienes ahí tienestu catálogo de servicios sabes quépuedes hacer eh con lo cual eres librede de usarlo si quieres ah suelen tenerinterada más documentación de manera queademás de saber qué puedes hacer tienesal ladito cómo hacerlo vale Así que sijuntamos una interfaz amigable y fácilde usar con una documentación que aPrior debería estar medio bienpues tenemos las claves del éxito valepara que todo el mundo pueda enprincipio valerse y hacer lo que tengaquehacer Bueno a nivel de empresa nosaseguramos que si la gente usa nuestroidp con esos Golden path que hemospuesto ahí de alguna manera estamoshaciendo un enforcement de buenasprácticas vale garantizamos que todo elmundo está haciendo lo que nosotros comoequipo de plataform forma Ah queremosque se haga y como nosotros queremos quesehaga también poniendo un idp es muysencillo monitorizar y ver quién estáhaciendo qué quién lo está usando Quiénno lo está usando si hay una feature quete has puesto ahí porque pensabas que loI a usar todo el mundo y luego resultaque nadie la usa otra feature que tepensabas que no iba a funcionar resultaque todo el mundo la USA entoncesponiendo una plataforma por delante deeste estilo es muy sencillo saber Ah quése está usando Y qué no se está usandoY por último lo más importante es queempoderas a los developers o al resto deequipos vale No solo los developerspueden ser otros equipos de plataformaque usan también las automatizacionesque tú has montado para poder hacer sutrabajo de manera más eficiente y mássencillo sobre todoen o haciéndose elservice bueno veamos unejemplo nosotros usamos Backstage valeest es un template es como luce MedBackstage Y en este caso es para hacerel onboarding de un proyecto en HarborVale pues simplemente nosotros lehacemos indicar el componente que seríael nombre de del servicio que quierohacer el onboarding le exigimos que mepongan un domain porque tenemos variosdominios en laempresa un de jira porque obviamenteusamos jira también y todo tien quetenertraqueado Entonces cuando alguienrellena esos Campos y le da al preview yluego le da al next sucede que segeneranrequest automá�ticas en ghub valehaciendo Push de la definición de losobjetos que queremos desplegar y ademásesto lo hemos hecho de manera que esas prequest se automgen hay unos checks unoschecks que si se pasa y está todo verdese emgea solo de modo que el usuarioAdemás le da el botón y ya se olvida detodo eso se emgea y luego por detrástenemos argo CD que despliega susobjetos enen kubernetes y obviamente luego enkubernetes tenemos nuestro Cross plintque lo que hace es ese objeto dedefinición de mi proyecto y me lodespliega finalmente en mihardb Bueno pues para nosotros es superimportante tener esto porque nos permiteAh simplificar muchísimo la vida de delos developers esto es simplemente uncaso de uso aislado Pero esto si lojuntamos con el peline que hemos vistoal principio vale se encadena todo ycuando tú haces el onboarding de unaaplicación ya no tienes ese problema deAy es que no estaba el repo creado esque no teníapermisos toda esta magia sucede juntaacoplada de manera que cuando tú hacesel boarding de unaaplicación Pues también podemosproveer los repositorios y los permisosy las reglas de replicación necesariaspara garantizar que luego eso va afuncionar sin tener que hacer ningúntipo de intervención ni ningú tipo deactivaciónmanual perfecto lo hemos clavadoahí qué más bueno No todo es es perfectohay cosasque cuando te pasas a crossplane Puestienes que tener en cuenta lo primeroque nos hemos dado cuenta nosotros es lalínea bueno la curva de aprendizaje alfinal es verdad que concrossplay al final simplemente estáshaciendo yels de acuerdo compramos yamelengine everyware Ah pero cómo funcionaCrosspen pues tiene su complicación vale anivel de plataforma de servicios queestán corriendo por debajo los equiposde plataforma los equipos de operacionestienen que entender cómo funcionacrossplane para instalarlo bien demanera adecuada y segura para que esofuncione sobre todo en torno Enterprisevale muchas veces los vídeos que vemospor ahí o los tutoriales se quedan muypor encima y es verdad que es muy fácilinstalar crossplane ahpara montar una poc para hacer una demoTodo queda fabuloso pero cuando te vas aun mundo más Enterprise donde tienes losequipos de seguridad y tienes quegarantizar que est funciona 24 por7 y esresSilence hay que dar un poco a la neuronapara que funcione y funcione bien y notengamosdisgustos lo segundo y también muyimportante es el de baging Ah cuandofunciona Cross plan y funciona bien esuna maravilla nada que decir ahoracuando no funcionabueno en las últimas versiones sí quehan puesto en el en el cli han puestocomandos para que puedas hacer tracingque puedes hacer de Back puedas inclusorenderizar las compositions para vercómo quedarían Pero sigue siendocomplicado en algunas ocasiones ver porun objeto no se ha desplegado no se hadesplegado bien no se está sincronizandoetcéteraetcétera gestionar lasreconciliaciones buenouna cosa muy interesante que tienecrossplane es que te hace la detecciónautomática de del drift change famosovale lo hace de manera automática Yademás lo arregla de manera automáticasi todo vabien Qué sucede imaginaros que por loquesea tenemos un objeto que lo borramossigue estando en el Cloud yo lo borro demi clase de kubernetes si lo quierovolver a generar igual que estáVale pues no es trivial a veces Ahgarantizar que ese objeto se vuelva agenerar igual de manera que no que nogenere un segundo objeto en Amazon porejemplo no tengo dos bases de datosiguales vale ya tenía una Pues si lavuelvo a generar de la misma maneragarantizar que se vuelva a generar o queno se vuelva a generar de hecho y quereuso la que tenía pues no es trivial Yhay que gestionarlo con un poco decariño y eso me lleva al borrado de delos objetos Ah por defecto la políticapor defecto de crosp es que si tú borrasun objeto de tu clase de kubernetescrossen va a ir y lo va a borrar de delCloud Bueno a nosotros nos ha dado algúndisgusto vale Y al final hemos terminadopor poner por defecto siempre que losobjetos de kubernetes estén en modoorfan de manera que si se borran en elclaser al �menos no se me borran en el enel Cloud Provider si los Quiero borrartengo que ir yo de manera intencionadacambiar la política de ese objeto yluego borrarlos vale otra cosa tambiénreferente a al borrado de de recursos esque cuando tú haces tu Cloud Provider eo tu Provider de Cross spr en este casopuede ser que tenga recursos que tenendependencias entre ellos vale con locual cuando tú desarrollas eso tienesque darle un poco a la neurona y decir Pues si tengo un robot a accountpor ejemplo que depende de un proyectono tiene sentido que tenga un robotaccount si me he calzado el proyectovale de alguna manera tienes quegestionar eso para que no te quedan Ah opara que no te queden objetos huérfanosO al menos que si te quedan eh puedaslevantar un evento en kubernetes demanera que eh el equipo de operacionespueda ver que tienes un evento con unerror de Oye Aquí tienes un objetoAh que me has borrado el padre y el hijoestá por ahí muerto de la riza ha algovale e qué más Ah complejidad vale Hel tema de los compositions es muypotente pero se te puede ir de madre muyfácil o sea en nada puedes terminar concompositions kilométricas de millones delíneas deyels Y si te pones a hacer functions yani te explico te puedes volver muy locoEs verdad que puedes hacer muchísimascosas con lasfunctions pero ya osdigo mantenerla simplicidad en las compositions es unreto es un reto y hay que hacerlo porquesi no el que tenga que mantener eso seva a querer pegar un tiroAhdependencias bueno pues aquí hemos vistoque tenemos muchas cosas en marchasimultáneamente por un lado tenemoscrossplay en sí el Core por otro ladotenemos n providers v pues tener enmarcha 1 2 3 o 25 providers en la mismainstalación corriendo simultáneamente Ahy luego tienes compositions que usanesos providers vale Y esos providers queademás Usan el Core de crossplane todoopuede estar en versiones diferentes Perotodas las versiones que tengas ahítienen que poder trabajar juntasAh pues si haces un composition que usaun objeto de men invento de azureAh que estaba soportado en la versión 3de del del Provider necesitas la versión3 o superior ese Provider instalado enel mismo clúster si no pues no vas ahacernada si tienes un composition quenecesita que tiene functions necesitasuna versión de crossplane Pues a partirde la 15 que te soporta functions sitienes una versión más anterior Puestampoco vas bien entonces gestionartodas esasdependencias de todos los objetos decrossplanees un reto vale tienes varias maneras osea o puede ser muy flexible y permitira todo el mundo que instale lo quequiera como quiera ya ya se apañar Opuedes irte al otro lado y decir No miralas instalaciones son estas lo tengotodo super probado y esta instalación vacon crossplay versión 15.1 con estosproviders en esas versiones Y con estascompositions en esas versiones que hemoscertificado que funciona esa es unaopción per eso requiere trabajo V laotra opción es decir bueno que cada unohaga lo que quiera y si algo falla puesya os buscáis la vida para arreglarlovaleyluego hacer el isolado de o sea ogarantizar o dicho de otra manera paraprobar crossplane o sea crossplanefunciona a nivel declaser a nivel de claser dekubernetes Y tú como cliente te estáproveyendo apis nuevas de kubernetes sitú quieres probar una versión nueva dede crossplane y ya tienes una versióncorriendo O sea no puedes tener dos a lavez no puedes tener tener dos versionesdel mismo Provider corriendo a la vez enel mismo claser no puedes tener dosversiones de bueno de la composion síque puedes tener versiones pero deproviders y de crossplay no puedes tenermás de una versión corriendosimultáneamente en el mismo clúster valeporque funciona a nivel de de claserscope Entonces siquieres gestionarAh esas dependencias pues o necesitarásun classter por cada versión de crossenque quieras probar o en nuestro caso loque hemos hecho ha sido instalarcrossplane Siempre sobre B cluster quenos permite aislar las instalaciones decrossplane de manera que esténautocontenidas a nivel De Api Vale ypuedo yo instalar todas las versio�nes declosen que quiera dentro del mismoclasster físico de kubernetes y puede irprobando probando desarrollando inclusoofreciendo diferentes versionesde de de crossplane a diferentes equiposen función del B claster al que estánasociadosUy ahora vale Y para terminar para hacerun poco de Dewup nada Cross pen nos permite gestionarno solo infraestructura sino que puedesgestionar cualquier cosa desdekubernetes vale lo que para nosotros ess s super interesante porquepodemos reusar los mismos pelines dedespliegue que teníamos hasta ahora paraaplicaciones los podemos reusar paragestionar toda la infraestructura quequeramos más todos los serviciosadicionales que queramos gestionar concon con crosslin O sea no solo estamoshaciendo ahora toda la infraestructuracon con Amazon También estamosgestionando hardb gestionamos ghub gamosbitbet vale cosas que antes hacíamos ocon bash o a mano o con o con terraformlo estamos pasando todo esto a acrossplayAh como os he enseñado Pues bueno puedesextender la funcionalidad con conproviders y si no existen te los puedespicar tú nosotros por ejemplo hemospicado el de El de hardb hemos picadotambién uno para ghub y hemos picadotambién uno para Backstage no losbusquéis en giub ni en ni en en abbporque no los hemos liberado Todavíaestán todavía en las Mentes pensantesviendo a ver qué Qué hacen y cómo lohacenvale permite obviamente reusar todo loque hayas hecho lo puedes paquez concompositions reusar distribuirlo yofrecérselo de manera mucho más sencillamucho más consumible a los equipos dedesarrollo eso habilita que los equiposcualquier desarrollador o los otrosequipos de plataforma les ofreces unaApi en forma de objeto de kubernetespara que ellos puedan de manera Autónomadesplegar lo que necesiten sin tener quevenir cada vez a pedir o tener que hacermagia arcana para desplegarinfraestructura yfinalmente bueno algo muy importante quehe pasado un poco de puntillas pero dadoque estamos desplegando toda lainfraestructura todos los objetos desdekubernetes y con kubernetes puedereaprovechar todas las herramientas demonitorización que tuvieras paramonitorizar también el estado de tuinfraestructura el estado de desplieguesestán o no están sincronizados Y si todoestá bien o no está bien Ah bueno Yfinalmente también es muy importanteque además de todas las ventajas quetenga también le deis una vuelta a a losinconvenientes que pueda tener valeporque no todo es color de rosa y haycosas que funcionan mejor y cosas quefuncionan peor porque es importantetener en cuentaAh bueno y nada más llegados a estemomento ya daros lasgraciasy sacad el móvil H escanear esto y porfavor me dais feedback porque es de misprimeritas charlas Y ahora yo creo quees buen momento por si alguien tienepreguntas o dudas quejas por ahí hay unamanolevantada I have the questions the Firstone is to Keep consistency How do youmanage the platform That you need to Runcl like the cluster vpc Where you areafterwards clplat O sea la pregunta es cómogestionamos la infraestructura queusamos para desplegar luego Cross planahí y gestionar el resto deinfraestructura si lo usamos si hacemosa mano o si usamos previamente terraformla verdad es queahora los clusters donde están dondeestá corriendo crossplanese desplegaron con terraform vale porqueson los clases donde tenemos lainfraestructura que ya teníamos antesque no hemosredespiertospen gestione ese claser Y desde ahí yaah desplegar el resto de infraestructurano sésienesentioBueno o sea la pregunta es si luegopuedo adquirir o sea luego puedo unclaser que L desplegado A mano conterraform luego lo puedo gestionar concrossplayvale si haces el Setup de maneracorrecta tú luego con Cross puedesgestionar objetos que ya existenBecausefree free access to All the amaz providso Any developer can deploy anything andthe cost can Go So How do youmanage Which developers can deploy forexample a database but they can deployor Something Like that bueno primero detodo Nosotros hemos desplegado Cross encles diferentes que donde está viviendoPor ejemplo la aplicación la aplicaciónse está ejecutando en unos clústers ycrossplay nosotros lo ejecutamos en unosclústers ah digamos de infraestructurade plataforma vale cada despliegue decrossplane o cada proveedor que usamosal final tiene unos unas boundaries anivel de seguridad Ah muy limitadas osea es ese proveedor de de de crosslinqué puedo hacer y además como tenemos unun B cluster de crosslin por cadaproyecto vale Además está limitado Ah noa nivel de provid sino además a nivel deproyecto tú qué necesitas como proyectodesplegar bases de datos Pues yo a eseproveedor de crossin le doy permisospara desplegar bases de datos no vas apoder desplegar ah no sé un redis o novas a poder desplegar un cosas Randomvale Te va a dar permisos específicospara lo que necesitas hacer para eseproyecto eso por un lado Ah entonces aese cluster no tiene acceso a todo elmundo Solo tienes acceso para desplegaro gente de equipo de plataforma quetiene tiene los permisos deadministrador o Mediante los pelines olos golden pats que nosotros hemosdefinido es vale puedes desplegar ahípero solo puedes desplegar Mediante lospans o los flujos que yo te hepreestablecido O sea al final solo vas apoder desplegar lo que yo te permitadesplegar Ahí Ah eso por un ladoestuvimos también dándole al coco a vercómo podíamos hacerlo de otra maneravale también podríamos eventualmente siyo despliego todo mediante compositionsmis compositions son de un Cent enconcreto en este caso por ejemplo unacomposition de generic project donde yosolo te permito desplegar ese proyectode esa manera podría limitar con conkiber no por ejemplo con cualquier otrosistema de seguridad Qué tipos deobjetos te permito desplegar esa es otrade las cosas que habíamos evaluado queal final lo hemos implementado porque esmucho más sencillo hacerlo de la otramanera and One and you limit for exampledeoyinstances up to a specific Ram orSomething Like that you can limit thekind of record inside the Amazon logicUh o sea lo puedes limitar a nivel decomposition por ejemplo vale tú en elcomposition Le puedes decir mira esta esuna composition para desplegar una basede datos y tengo un campo que yo te voya pedir donde tú me digas el tamaño dela base de datos y ahí tú le puedesmeter un Rango vale al final es un crdque tú le puedes decir esto va a ser uninteger y el integer puede ir de un a100 más de 100 no vale o me puedesDefinir la familia de la base de datos Ypuede ser una Ray Perdón eso es un enumy puede ser uno de estos de Estos tiposno te puede salir de ahí o sea la puedeslimitar por ahí o también lo limitamosnosotros enBackstage poniendo esos formularios unosCampos específicos donde tú nopermitas cosas muy locas y y pongas a lomejor siempre unos Campos muy limitadosy controlados para que el usuario tengaflexibilidad pero que tampoco se puedavolver loco aome Thankyou otramano mi mi pregunta era un poco sobreestas compositions de las que acabas deque acabas de mencionar e esto CómoCómo se definen se definen en Go sedefinen en algún otro tipo de lenguajeSon templates sonyels al final el objeto composition comotal es un custom res definition que teda crossplane Vale pues bueno tiene suestructura con su spect y tú ahíconstruyes tucomposition siguiendo una guía unaestructura que Cross está definiendoVale básicamente lo que cjes es por unlado le dices los inputs que necesitaesa composition vale o sea defines laestructura del crd final que va a tenerAh tu composition Cómo la va a consumirtu usuario la Api que tú vas a ofrecerde cara fuera y por otro lado definesQué haces con todo eso vale pues cadainput Para qué lo voy a usar dónde lovoy a poner Qué objetos por dentro micomposition va a gestionar y ahí puedeshacer transformaciones y hacer muchamagia Entonces como una suerte de JasonPatch algo algo de ese estilo no Síjusto o sea vale vale perfecto y ahí esdonde os decía que eso se puedecomplicar muchísimo y es muy fácil quese te vaya muy de las manos pero muyrápidoGracias más preguntas antes de irnos acomer nonada bueno pues nada pues muchasgracias an2025-04-15 22:17:44.887792�u raise yourhands oh if you okay cool um anyone usescommercial version of cico oh looks likesound came up um okay uh sounds good soum the numbers I'm showing here this isin relation to kind of like overall uhwho uses Calico uh at least people whowe know of there is a lot of uh a lot ofkubes environments that are built wherewe don't have visibility into so theydon't share the numbers with us but uhas a ballpark uh for us it's uh the uhmost adopted cni for kubernetes platformcni stands for container networkinginterface um for forkubernetes um so um closer to the topicin terms of uh compliance you can seesome of the acronyms in there referringto compliance standards like PCI sockusually you would hear something sock toum uh fed Ram heppa G gdpr that you allprobably familiar with there is a bunchof different compliance standards and uheach compliance standard has a has a setof controls that you need to implementand then once you implement thosecontrols you need to um prove that thosecontrols imple implemented correctlywhen the audit comes that they canverify that everything is uh up to thatum uh compliancestandard why it's necessary um typicallyif you are in business where there issome regulation that is man mandatedtypically by the government or some kindof authority but usually it's thegovernment uh in one way or another theycan work through a proxy but usually thekind of like enforcement comes from fromthat part and um once uh if you yourlet's say handle uh financialinformation there is credit card uh uhdata in there in healthcare industrythere is a lot of your uh healthinformation that is stored somewhere andthat needs to be secured in one way oranother or whenever it moves uh aroundthe systems that needs to be done inSecure way so you you would see anycompromises that happen uh a lot of uhgovernments nowadays implemented thepolicies where someone gets compromisedyou need to kind of like disclose thatthat uh that compromise happened whatwas compromised what we are doing aboutthat all of that comes from the uhregulating authorities kind of uhbuilding those laws and posing that likeat the um at the level um uh of uh likeimplementing the recommendations uh thatthat become in laws and you you have tocomply with those and there are twoaspects of compliance there isenforcement that's enforcement ofcontrols that I was talking about thateach compliance standard has hasdifferent types of controls that youneed to enforce and once you enforcethat in your uh system uh over timethere is uh uh there is an auditor thatshows up that needs to do do audit towhere you need to demonstrate that isworkingcorrectly uh some of the requirements uhso a lot of this talk is going to berelated um to cetes because kico kind oflike like the native component of kuesplatform um and so uh I'm coming fromthe perspective of over overall the kuesplatform running the applications inthere the N networking aspect of it howto implement the compliance controlsfrom uh for that platform since uh itit's quite different uh to traditionalapplication deployments when you usesome some sort of bare Metals static VMSthat have um information that doesn'tmutate that often so um some of theexamples of compliance is that yourkubernetes environment itself when youbuild it it needs to be compliant withcertain standards um if you've heardthere is uh CIS or CIS benchmarks that'sone of the uh examples that um thatworks for kind of analyzing your cunesnodes whether they are compliant withcertain standards or whether you have umsome uh some configuration in there thatneeds to be adjusted so that you canprove your compliance um often uh whatwe see especially at the smallercompanies that uh kubernetes platform byitself is flexible enough that it canallow you to run bunch of differenttypes of applications you can build evenenvironments inside of the same onecluster you you can kind of segment itthrough the concept of labels intodifferent environments for the end usersI mean like for developers they can usepart of it as a QA de for QA deploymentsuh some for uh for uh integration fo�rdevelopment and uh well typically youdon't share production platform but insome rare cases you can see even that umand so you need to to do some uh toimplement compliance controls to makesure that your applications do not mixand you can prove that that they cannotkind of like cross The Logicalboundaries uh of those different Creaenvironment or Dev environment and gotalk uh talk to each other um that thatalso applies to having something likemulti-tenancy when when you can have andtenant is kind of like a loose term ifyou will to for uh when used withCarbones because it can represent anapplication or it can rep represent ateam uh so it depends what what istypically meant uh by by tenant in therebut in general when when you have amulti-tenancy that you build you need toprove that tenants are isolated um anduh they cannot kind of talk to eachother uh any sensitive um data like piipersonally identifiable information thatis something uh for uh financial sectorlike especially companies that handlecredit card transactions they would workwith the data or anyone who allows uhwho does the charging through theirplatform they would handle that databecause they take information you put uhthe credit card number in the uh in theon the website when you when you'retrying to to go and pay for something soas that data moves through the systemswhatever components handle that theyneed to comply with certain standardsso um why traditional um methods do notreally work for caradus is because ofthe uh uh Dynamic nature of cinesplatform uh anything you deployed in theform of your application stack uh comprcan be comprised of multiple Services umand uh those Services can have uh fromone to X number of PODS deployed so itcan be a typically deployment with areplica set in there and so that meansthat uh the IPS that traditional systemsusually rely on to kind of provecompliance to collect the informationand then later use in the compliancereports uh that information cannot berelied on um to to provethat uh alsoum the uh some of the controls definedin traditional firewalls as an exampleuh can be using uh using fqdn forkubernetes uh that is not that simplewhen you're saying can can can umcertain application be allowed to use togo and talk to some fqdn um maybe thereis some SAS uh service that you'reyou're consuming the applicationconsumes or you talking into yourdatabases in your network just outsideof commun cluster that is exposedthrough some sort of uh domain name umhow do you prove that only yourapplication was able to use that so thatthat becomes different uh difficult inCades and specifically like some of thereasons for that is that uh your clusteruh when you build your crias clusternowadays a lot of um a lot of um uhplatform teams when they build thecluster they choose to use a private uhNetwork range so it's a private set ofips that is used for the pods um you canstill use the routable APS andespecially I talked to a number ofpeople uh in this conference a lot ofpeople seems to be using something likeeks or AKs eks for AWS service AKs forAzure service um and so those platformsallow you to configure the uh uh rangefor the pods or the network range uhthat is from the VPC meaning that it'sroutable IPS within that VPC but a lotof companies especially when buildinglarger clusters bigger platforms um theytend to use the private ranges and thereason for that is that uh the VPC thatyou use uh in the example of kind oflike this managed kuet Services eks theVPC that you use has a limited number ofips in there usually it's like sl6 butthey are shared between the pods and theVMS themselves and whatever else youdeploy in there every any service uh youdeploy within that VPC would take an IPaddress so at certain point you cannotscale more just because you hit thelimit and so the that's why a lot ofplatform teams whenever they decide thatI want to be able to scale to uh to kindof like factor in the future scale inthere so they they choose to use theprivate cider range for kues platform sothat the VPC IPS would be just used bythe AWSservices and uh so um whe�never timecomes uh to kind of prove compliance andtypically it's done on some scheduledbasis I don't know if there are dailycompliance standards uh but there aredefinitely like quarterly yearlycompliance uh uh standards when when youhave to kind of run those reportsdemonstrate uh the that you um youcompliant with with that standard if youare not usually you get some sort ofwarning from the regulating Authoritysaying you have X number of daystypically like 30 60 days uh depends onthe standard itself uh that you need umbring everything into compliance if youare not then you get penalties and kindof like those penalties keep increasinguntil you get a something like SE anddeceased letter meaning that you cannotoperate your business cannot um operateanymore until you fix uh all thoseissues and uh as an example PCI uh oneof the standards um uh for uh creditcard um for companies who handle creditcard Financial transactions uh so if uhif the company handles over 6 milliontransactions a year they have to do thePCI compliance standard uh in order tooperate in the uh I'm from us so that'sa in the US that's the ruleum anotherum uh another requirement is uh thatwhen you deploy your application stackum some of the components they touchthis uh uh an example for PCI they touchsensitive data so they have to complywith the standard some of the componentsthey do not have to so uh it's kind oflike mix and match the easiest way tokind of implement everything and sayingoh it's all part of my PCI standard uhor whatever whatever compliance you'reusing in there uh but uh the power ofkubernetes comes from the fact that youcan deploy bunch of different apps andyou can then um have them live in thatplatform they kind of move around freelyuh within the confines of kuat'sboundaries uh and you don't have to kindof like go and manage those individuallyuh but when you do that that means theyall co-hosting and kind of like shco-sharing the resources and they if youdon't have the proper uh controls set upthat they cannot kind of like jump andtalk to other services from otherapplications you are bridging your uhcompliance standards as an example inthere so micro segmentation is one ofthe ways to ensure that yourapplications can only talk to theservices to um uh to the services withinthe application stack or like within thekubernetes cluster uh that they onlyallow to talk to and everything elseshould not be allowed so that'ssomething that that is difficult toachieve with a traditional firewallsjust because if I take the example evenof the routable IPS when you create yourcommun cluster and give it routable IPSthat you can route through the firewallsthose IPS change frequently uh somethinghappened to your pod one of the PODS ofyour uh application service it turnedmeaning it restarted uh restarted onceit restarted it gets a new IP it stillcarries the same function and itoperates in the same way but now it's adifferent IP so like in your firewallthat is kind of built to use IP as anidentity it becomes difficult toimplement any policies in there becausethey need to be uh changing kind of likefollowing what pods are doing the lifecycle of thePod so um how cico uh addresses theseissues um we provide uh vulnerabilitymanagement um and uh yeah just to beclear uh for those who using K know whatK is some of the stuff I'm talking abouthere related to Calo as a whole not justthe open source because we have the opensource piece and then we have commercialpieces so uh some of the stuff likevulnerability management that's a partof the uh commercial piece so uh weprovide vulnerability management wherewe can scan images uh build somethingcalled software bill of materials or forsure as bomb um that gives you aninventory of all the packages and allthe binaries in your um container imageswe we map that we run the scan we getthat as bomb then we map everything thatwe found against the cves that's acommon vulnerabilities and and um whatthe E stands for exposures um so it's auh a database of vulnerabilities that wewe take and kind of like map all of thatso that we give you the� inventory ofwhat you have in the images mappedagainst uh against the knownvulnerabilities and uh you you can goand kind of see what of exposure at thatlevel you would have if you deploy thoseimages in your cluster but we also wemap that with the admission controllerand we uh um we also map uh where yourun images in your clusters you you canhave multiple many different clusters ifI have time in the end I I can show youlike quick uh peek into that if not uhwe have a booth outside you can stop byand uh I can show you show you how thatworks um we provide the workloadsegmentation through Calico policies umin uh in the commercial products we alsooffer the tools where you can we canprovide you policy recommendations so ifyou don't know how to build thosepolicies we just look at the trafficsince we sit at the networking layer inthe kuber cluster we see all the trafficin there so we uh we can recommend youpolicies based on actual traffic that wesee uh we provide encryption for uh datain motionum we uh we also Al uh the uh we providethe flexibility with the policies thatyou can build them according to whatevercompliance standard you need to followum we have the reporting tools where thedata that we collect you we provide somethe outof the Box reports but you're notbound to those because you can take thedata and you can run kind of like uhbuild that into whatever reports uh youwould need and the audit logs for uh forthe audit Trail so whatever policies youdeploy who changed when that policy waschanged or deployed um and what changeswere in there we have all all that auditlog to present to the Auditors and so todive kind kind of expand on some of thisso for vulnerability management what wemean by that is that um we provide theimage scanner and there are a fewdifferent flavors of how we do thescanning but that's one of the pieces tokind of like scan what you have whatinventory of binaries packages you havein images U map that to CVS we haveadmission controller that bundles withthat that uses that information todecide you you you deploy a policy foradmission controller to decide what isallowed to be deployed in the cluster uhif the image doesn't pass the scan atwhat level did it pass did it get awarning did it get a fail result andwhat the threshold to get that resultand then uh with admission controller uhpolicies you you can decide what to dowith that uh whether to fail likeprevent the deployment of the image uhaltogether and uh we give you a runtimeview where we take the uh the imagesthat you scanned um you not necessarilymay have deployed those to your clusterwe because we have different ways howyou can do the scanning you can scanagainst your registry uh image registryso you can just point the scanner thereand say hey can I get the scan for allthe images I have in there but maybeonly a handful of those are running inyour cluster so we provide the one timeview where we take the images that youscanned and we show you where theyrunning in your cluster and we show inthat context we show the informationwhat cves uh what packages you have inthere and what CVS you have in there soso that you can see your exposure tocertain vulnerabilities inthere um then uh one of the kind of likeuh I would say it's a subtype of micrsegmentation but we separate that uherress controls as its own way of uhsecuring uh your uh applications becauseuh when you run application incarbonates it does not necessarily meanthat it will only talk to othercomponents running cetes uh quite oftenyou need to go and talk to the servicesthat were not migrated to cetes orservices that cannot be migrated to intothe pl uh platform so you talk to themby IPS or fqdn and so you need to securethose Communications uh another way umuh another way to talk to the servicesoutside and especially within yournetwork um is to uh like talk into thedatabase but you have to Traverse thefirewall as the uh part of uh thecompliance that you need to uh to complywith or as a part of the uh rules thesecurity team set up for uh for allapplications with um again mapping thisback to the what I w�as talking aboutlike private IPS for the pods thatbecomes impossible um or not impossiblebut uh nearly impossible uh just becausethe IP is not known outside of cinescluster and if once the Pod says like Ineed to go to talk to the database thatlives in your network outside of thekubernetes uh once it get uh getsoutside that IP is no longer pods IP soyou cannot open create a policy in yourfirewall saying like use this IP for thePod to to allow the access to thedatabase usually that IP gets SED to theuh IP of the node where the Pod at thattime is and so then it can go get outand kind of like Traverse your networkand and and the packet can get back butthe problem with that is that the in thefirewall you can at that point you canonly program the IP of the node and ifyour nodes are big enough you might havein there hundreds or thousands of PODSrunning and so which pod was going andtalking to to what that becomesdifficult to tell and for compliancethat also also doesn't fly because uhyou cannot prove that your uh your thatcommunication was only from what what isallowed to go and talk to uh to theservice outside of commines so weprovide something called igas Gatewaywhere you can assign a static IP knownIP to and you can map it to a namespaceor down to a specific pod because ituses the same concept of labels and kindof like uh using labels to map labels tomap um which pods will carry that IPoutside of communties cluster and so inthat way you get in the stable IP so thePod restarts it changes its IPinternally inside of cin cluster but thetraffic living cin cluster still carriesthe same IP that you define in thereit's called Uh erass erassGateway um for micro segmentation um thepolicies they are declarative as prettymuch everything else in kues so youbuild your yaml or Json whatever youprefer to work with you deploy that as apart of your uh application stack usingcicd pipelines that's what majority ofour umuh customers do this screenshot it's aactual screenshot from a commercialproduct where we can show you how thepolicies visualized how you deploy thosepolicies and there there is some uhgovernance aspect of it and there iskind of like hierarchical nature of howyou build policies you can map uh whocan manage from the different teams whatkind of policies in there but it's allum with the concept that policies areapplied uh in a dynamic nature uhfollowing the concepts of kubernetes uhbecause uh the the way uh the way theywork you you run a cube apply command toapply any object like a deployment orpolicy yaml to the cluster caronus inestingests that information puts it in itsown store it CD database is used forcetus as the storage uh and thenwhenever pod gets created and the labelson the policy in policy selector thereis a field policy selector in there thatwhere you specify the labels where thatpolicy should be applied so when the Podgets created uh the cni uh suppliesprograms the data plane on each node forthe policies for that pod so thepolicies are not deployed automaticallyacross all of your nodes in the clusteruh because if you have a very largecluster that would be a huge undertakingit kind of follows the pods where thepods are going so if the Pod isrestarting and move into another nodethe policy information the policy uhrules get torn down on that node ifthere are no other pods that match thatlabel and recreate it on another note soit's kind of like a following followingthe uh life cycle of thePod and we provide the uh topographicalview of all the communications in thecluster so whichever pod service istalking to what in the cluster we uh wegraph that we put it in into thegraph and uh we can also uh so with uhwith Calico policies you can also applythat not only to um to the apps runningincubates you can also go beyond thatyou can apply Calico policies to thehosts so uh not only protecting theports of PODS but also uh caronus hostyou can even go beyond that you you caninstall Calico as an agent on nonbonushost like VMS bare metals and uh pairthat with your kubus environment andhave Calico program policies on bothendsyou can �do it from one end just in Catesand control erass to the outsideServices running on VMS or bare Metalsum there is an object in uh in Calocalled host end point that that's theway to create that and then you cancontrol erass traffic in that way but ifyou want to secure connection on bothends and you need it for um uh and youum you uh on the other end that's goingto be V VM or bare metal you can installCalico agent on that VM uh and you willbe able program policy on bothends uh for data in transit we use atool um or a mo uh now nowadays it's aLinux kernel mod module called wireguard uh to do the encryption there areseveral different ways how encryption isdone for caronus workloads in generalthere is ipac openvpn wir guard it's ait's a module that I think it was umincluded added to the kernelprobably two years ago or something likethat uh it's not that long time ago likeipac is very old technology so that'sbeen for a while in the uh in Linuxwiard is kind of newish uh there isanother way how to secure Communicationsthrough TLS at the service level um soif some of you used or heard of servicemesh that's uh a technology or uh stackthat can be installed in kues whichprovides you with some uh likevisibility in how Services talk to eachother and it provides the encryptionthrough TLS TLS implemented at theapplication layer so it's higher up inthe stack if you are familiar with OSImod uh model uh so application layer itmeans layer 7 versus wiard works atlayer three uh some differences in therethat uh for layer 7 uh to use TLScertificates you need something calledcertificate Authority uh if you are allon the web applications like applicationdevelopment side you typically don'tdeal with that that's set up by uh bythe platform teams or security teamstypically they kind of set up acertificate Authority and manage whogets which certificate that that's whatservice mashes do kind of like automatedthat for kues but that operates at theapplication layer uh which means youonly your application payload is securedand also it has kind of like layers uhum adds moreto uh to how it does the encryption andwiard works at layer three uh soeverything from layer three up getsencrypted so you won't see even portswho's talking to tohome uh for compliance uh reporting weuh I kind of mentioned that we have thecapability to generate compliancereports and here's a list of some ofthem like network access or inventoryCIS CIS benchmarks but you are not justbound to those because you can take ourdata and generate your reports uhyourselves depending on what uh kind ofwhat what standard you need to provecompliance for um and how you want tosee thatdata um some of the uh uh key takeawaysfor um uh for the compliance uh so umthe you you need to implement securitypolicies in order to uh prove uh or uhin order to enforce uh that your uhapplication components can only talk towhat they need to talk the what what isallowed and especially if it's like PCIor if it's um uh sock to uh anythingrelated on the networking level wherethe communication needs to be happenedand it needs to happen only withspecific uh only allowed uh endpointsyou need uh you need to implement uhNetwork policies for that you need to beable to automate all of this because uhyou will deploy or or you are deployinga lot of different applications in ontothe carbonates platform and when thereis a lot of them uh there is no way oryou won't have enough time to go andkind of like manually do everythingusually you would tie that up into cicdPipeline and ideally uh cicd uh toolwill would be the only tool that that isallowed to deploy something at leastinto production in a lower level thereare some um uh of course uh people cancan get around uh those deployments andsometimes they need to uh but forproduction uh the standard or therecommended practice is that you set upcicd Pipeline and only through thatpipeline you can deploy anything thatyou need to touch in the production uhclusters you only go through throughthatroute and and uh yeah the the last bitin here uh you need the ability to beable to generate repo�rts on demandwhenever you you need that so one one ofthe challenges for instance for Auditorsis that they can come and investigatekind of like look at your platform uhrun through their checklist for whatevercompliance stand standard needs to befollowed but then uh whenever they doverification that only in that timepoint in time they can say okay you'recompliant with this but let's say nextday you have a roll out of a newapplication or uh the same appapplication just redesigned and so thatthat compliance is now broken becauseyou need to go and kind of uh re uhreview everything uh again and make surethat that you're complying so you needsome tools to be able to comply uhcomply uh comply uh build those reportscompliance reports to to see whetheryou're compliant ornot um one of our customers used uh useduh our commercial product for the theINF Financialuh fintech uh SAS um um vendor who builtthe fintech platform and for fintech uhor for a lot of SAS PL platforms we wehave cico Cloud as our SAS offering weas well have to be compliant with thesock 2 standard so they used us to um tobuild their platform and to prove uh thecompliance uh so to compliance uh wehave a case study on this Mulliganfunding uh that's the comp who build itwith us uh so they they' use our productto um to achieve that compliance and youcan kind of uh go take a look at uh whatthey use specifically what they calledout um from from the product that helpedthem to to achievethat um so uh what Calico uh why Calicouh can be useful to um uh to uh to provecompliance to kind of implementcompliance controls and uh prove thecompliance uh it removes the securityblockers you can easily create policiesand I don't know how much time I havebut I think I will be able to uh to showyes it looks like I I can show some uhsome of the uh tools that we have andwhat I'm talking about is removing thesecurity blocker where you can craft thepolicies without even knowing how tobuild them uh without going into uhcomplicated details of how to build therules who needs to talk to whom we canjust take a look at the traffic and wecan offer you policies and you candeploy them in a manner where you cantest them first and then come in and saylike okay policy seems to be doing whatit's supposed to let me enforce thatpolicy um the enforcement bit of it isvery fast it applies in milliseconds uhcomparing to some uh some of theanecdotal stories that I heard from thecustomers they uh it takes them weeks toimplement a change in a policy I don'tknow why that happens but usually thatbecause uh the change has to berequested through ticketing system thenthere is a security team that is alwaysoverloaded and kind of like Worksthrough a queue of different requestsand by the time they get to that requestto implement something uh the changelike in firewalls it can take them daysto to get there and uh to demonstratethe proof uh to kind of like build uhbuild everything into the report so Ithink I believe this is my last slide soI canactually um um show you what kind ofdata we collect inum and uh as an example I I'll use uhcico Cloud just because it's an easysetupum for um for me to uh to show theClustersuh you can see I I have bunch ofdifferent clusters connected to theplatform so you can assemble likewhatever PL whatever catus version orflavor you're using you can connect itto to cico management plane and fromthere you'll have uh all these tools soum what we uh we sh here or what we doand probably uh could be useful to tomany of you uh is the ability for us tomap um what is running in your clusterhow those applications and theircomponents how they talk to each otherso at the higher level right here I'mshowing you view from the namespacepoint of view um actually letme switch to different screen so that Ican um also run a test um test script togenerate some trafficyeah so uh at the high level we map uhinfo about all the communications and weshow you from the namespace point ofview but you can double click into anyof the name spaces and get the view fromthe inside so what are the deploymentswhat are the compon�ents that are runninginside everything that you you see inhere you see at the bottom there aredifferent in here uh so um the flow logsthat's what we use to build uh thisgraph and that's what uh what we usewhen I'm referring to like how you wouldprove the compliance how we build thereports that's the information that weuse uh there is a lot of data in here soif I open up one of those flow logs youwould see there is your typical fiveTuple for Network flow log but it'smerried with the kubernetes context soyou have the information about uh Sourceuh name space the name of yourdeployment or service uh same for thedestination Source labels destinationlabels um you have the information aboutum the decision that was made that thattraffic was allowed or which policieswere evaluated for for thatcommunication so all this informationgets compiled because for every requestthat that's happening in the system wecollect that info we build flow logsinto that and so you have historicaldata whenever the compliance uh auditcomes up uh comes up you run the reportsagainst those flow locks you collect thedata and then you can present likehere's what happened at that time uhpoint in time here's what I had so youcan see we also can collect like uh theuh the amount of data moving uh in andout uh there is more info for like allthe in-cluster communication stuff so ifI select uh something like connection tothe card service here and show youdetails in here so you can see we cansee exactly which binary run inside ofthe container which ports I use uh or uhprocess ID was used uh the TCP levelmetrix if you have SES that have tomonitor some business critical Servicesuh this is typically TCP level uhmetrics that's what they are interestedin to make sure if Service uh uh startskind of like latency starts increasingand they need to go and uh look at thatuh all of this data you can generate setup your alerts for whatever you you needand you can I can watch uh watch thisdata or uh build the alerts that willuse this data to generate an alert inwhenever something happens based on thequery that you select for that or set upfor thatalert so um when I ran my um uh just a afew moments ago when I run my script uhI generated a bunch of connections sonow you can see some of the connectionsin here highlighted in red thatrepresent uh that communication happenedbut it was uh it was denied by somethingso we give you the breakdown of whichpolicies you can see all thatinformation that I showed you in the logwe kind of uh put it into some of thatuh info that is uh used quite often weput it into this fly out panel uh thattells youum what happened and we also canhighlight uh the policies that actuallydid the deny so you can see exactly whendeny happened uh communication didn'twork you you can troubleshoot this stuffreallyeasily within the policies whenever youdeploy them um so if you're using cicopen source again this is part of thecommercial the all the visualizationtools you you would deal or kind of likemanage everything through ccod commanduh everything you deal through the CLIcommands uh in uh in the commercial youwould see everything that you deployedin the cluster and we also added each ofthese columns that you see like securityplatform application default th thoseare called policy tiers so you candeploy policies into into those policytiers depending on what policy issupposed to do and you can map thosetiers in different ways like one of theways is to like team function so that'sexactly how I set it up for this exampleum you can have like security team comein and configure their policies insecurity tier and they would typicallyset up some guard rails overall likeoverarching guard rails for the clusterthen platform team would uh come in andconfigure the policies only for thecomponents that they are managing andthe U in my example the developers wouldcome in and deploy the policy inapplication tier and all of this twokubernetes uh look like nativekubernetes objects which which means youtake standard cetes RB back and you cancreate governance you can decide whichteams h�ave access to what tiers and whatlevel of access you you can make somepolicies read only like uh that that'show typically I do that saying fordevelopers they have full access toapplication tier and only maybe only toto the name spaces they need to workwith uh everything else just make itread only for them so that they can seethe policies in there and they becausethey can click and kind of like whoeverwants to look and understand how trafficgoes through those policies but theywould not be able to come and kind of Umess with those and um do something thatuh they don't intend to do because someof these policies you can see here itsays Global uh and here it says uhhipster shop so hipster shop is one ofmy Nam spaces for the demo app uh Globalis uh that represents a scope so thereare namespace policy and then there areGlobal policies Global policies don'tcare about namespace you create that andit applies across everything you canstill refine the policy scope for Globalpolicies with the label selectors but bydefault if you don't specify a selectorthat means you apply it everywhere sothose are if you don't know what you aredoing those are kind of danger Dangerouspolicies that's why I'm saying youcreate a leverage arback to set up umgovernance uh who has access to whatbecause you can limit and say heydevelopers you can only deploy namespacepolicies or even going further you cansay you can only deploy namespace stagedpolicies so that's what I'm I'm about toshow um one of those examples um we haveI mentioned a couple of times that wehave the ability to recommend you apolicy so in here you can say hey Calicouh for the last 15 minutes can you takea look into one of the Nam spaces andrecommend me a policy for that Nam spaceand cico goes looks at the traffic thathappened uh for last 15 minutes and saysoh here's the traffic that I'm seen uhand from here you can go and downloadthis as yaml uh or you can do one of theactions you can preview this stage it orenforce it I'll I'll show you preview onon an existing policy uh just because II already have enough traffic in thereum for stageing uh that's an ability foryou to deploy the policy in a permissivemode so you deploy the policy and youwould see how traffic goes through itbut if that policy all of a suddenstarts denying traffic it doesn't uhshowing you that it would deny trafficit doesn't actually deny anything itjust shows you uh for the testingpurpose here's what would happen if youwere to enforce thatpolicy and I'll show you this as anexampleuh because everything uh is declarativeI I can build and tear it down withinseconds um so that that's kind of whatI'm going to do I'm going to remove allall these policies and apply one of thestage policies to show you the kind ofthe effect of this so right now I don'thave any policies deployed and if I runmy test connection script uh which runsconnectivity test between uh serviceswithin uh Nam spaces then across Namspaces I'm going hdden some external afew external fqdn like twio and I thinkI hit apple.com or google.com somethinglike that um and um I also run uh a fewicmp packets to Google DNS and um level3 DNS just uh from a few differentlocations in there toshow uh what happens behind thescenes so you can see all theseconnections right now they I'm gettingthe response and there either 200 orthree 301 which which means it passed II got the response back everything goesthrough so so now um when you kind of uhlook into the kubernetes security uhinterest group and their recommendationthey're saying uh you need to deploydefault deny policy when you buildingyour security posture for for theapplications uh and from there like onceyou deploy uh as you deploy um defaultdeny it cuts out all the access for forthe apps and then you start deployingpolicies that only allow thecommunications that need to be allowedso umnow I'm going to deploy that defaultdeny uh so you'll see that Landing herein the uh default uh default tier sotypically default deny means it shouldbe the last policy within the certainscope either uh if you're deployingdefault deny for the namespace it's lastpolicy in the Nam space and by last Imean uh like the order of evaluationbecause you want to go Traverse thepolicies that allowing the traffic andthen last one is going to be defaultdeny that's for the traffic that is notallowed so you can see traffic startshitting it you can see this label stagedindicating that this is a permissivepolicy so if I rerun my testconnection uh you'll see that noneactually nothing is broken nothing isdenied and that's that's kind of thepurpose of these stage policies that youcan deploy them you can T Test them andonce you happy with what they do thenyou you can come and enforce that uh alot of our customers use um stage policyin the lower level environments inproduction it's always enforced uhbecause there everything is tested youyou want that to be enforced in thelower level environments they want thisto be uh staged so that they can seethat uh whenever something ismisconfigured the traffic hits it and ithighlights and shows you that but youdon't stop developers from doing theirwork if they forgot something and theydeployed the application and uh they itstart it's blocked they they cannot dowork they need to figure out what to dowith the policies this allows them tosee that something is misconfigured andthey can go figure it out but thatdoesn't stop them from continuing towork so you can see all the traffic uhstill went through so now if I deployback my policies that I justremoved and these are enforcing policiesso in here you'll see uh you can kind ofSay by uh policy type in here when itsays either Network policy or GlobalNetwork policy that means it's anenforcing policy if you see a keywordstaged as a part of that name then youknow that's your stagepolicy so you can see I'm adding thisback and so what uh you would see in amoment that traffic um that is currentlyshowing up on this default policy and wekind of refresh this every I think like10 or 15 seconds you'll see that overtime over next uh 30 seconds or maybe aminute uh this traffic will start comingdown and eventually drop off and that'syour indication typically that okay yourproceeding policy do the right filteringfor all your application Services talkto each other uh and that's why it'snothing uh hdden my stage default denyanymore that's your indication that yourfiltering is donecorrectly yeah you can see the this iscoming down right now and um in a fewmoments it will drop off but um also Imentioned the vulnerability managementis one of the aspects uh in uh in theplatform so uh image Assurance that's uhthat's that part that shows thevulnerability management I'm runningthis through my phone net uh internet soit's kind of slow to get the data umokay so I got it so you can see there isa huge list of images there is 932entries this is a shared cluster that myteam uses there is a bunch of differentclusters bunch of different images butin example if I click on the image itshows you all the packages uh that itfound in this image all the libraries sothis one happens to be quite big um andyou can see some of some of the packagesthey have a critical level of uhseverity for uh vulnerab vulnerabilitiesCVSs that's the score for cves for thethose vulnerabilities you can kind oflike go and read what what this is aboutbut what's nice about all of this thereis a view where we show you runningimages so if if I take that example ofhipstershop and take one of the images runningin that cluster so you can see these areall the Clusters where it's running andthese are the name spaces where thoseimages are running and here is thevulnerabilities and uh as uh thosepackages and vulnerabilities inside ofthem so that that's kind of the quickexample uh of uh what we can do at ahigh level there is a lot more to thisbut so all the data that I was showingwithin the flow logs you can kind ofslice and dice and shape that into youruh reports for compliance whatevercompliance standards you need to followand build the your continuous complianceusing that I believe that's it and I'mout of time right okay thank you[Applause]2025-04-15 22:17:45.423244 �Y g��W�#��eA9TU601q-aBEI don't really have a lot of content Idon't know if I have enough time toactually fill the 15 minutes but we'llsee um I'm happy to have any sort ofconversation about this topic um um I'mgoing to talk today about this thingcalled software supply chain securityyou probably have heard about it beforeand you probably have seen it floatingaround um and a lot of times when peopletalk about software SP chain securitythey either talk too much about thetheoretical side of it or they talk toomuch about the Practical side of it andI'm trying to do talk is kind ofbalanced the to so I'm going to actuallytalk or more tell yo₣4�#��AXgoGyTNheqEwelcome to this talk on Dapper inpractice first uh quick poll how many ofyou have heard of dapperbeforecool yeahnice uh my name is Marklefter my LinkedIn bio currently saysthat I help organizations become eventdriven so that means that I supportCloud native app development effortswith a specific emphasis on event drivenprinciples practices and patterns whichalso explains my interest in DapperDapper stands for distributedapplication runtime it's an open- sourcecncf proق�v�#��#ADBBW5Yrc0Zsmy name is Ivan shermock I'm a principalSolutions architect for taera igera is acompany um who created and maintainsopen source project Calico product uhthat is known for networking securityand in the commercial versions of it forobservability forcarbonates so in this talk I will coveruh talk a little bit about whatcompliance is necessary uh how toachieve compliance and why uhtraditional compliance standards do notor or techniques do not really work forcontainerized applications specificallyto kubernetes platform and how weaddress it using cico um the most of thetalk is kind of like generic not relatedif I have any time left in the end I canshow you uh to kind of like map out someinformation that I'm going to be sharingtoday to what it what it looks like inCalico and how we do things um but itdoesn't mean that's like the only way todothat um so if you work cico does anyoneknow what cico is can yo��ject aimed at developing as thename implies distributed applications sowe're going to explore what that meansand what daer is in this talk but Inamed it in practice for a reason andthat's because I want you to be awareand leave this talk being aware of themany practicalconsiderations that go into adoptingthis technology should you choose thatpath uh because it's not not a Panaceafor all your application needs as we'llsee so let's begin investigating Dapperby setting the stage a littlebitso we are working with distributed appapplications or want to developthem so we'll use the popular termloosely coupled uh application servicesthat are distinct processescommunicating synchronously orasynchronously we're assuming a cloudnative environment where we packagedeploy and run these applicationservices our platform of choice will bekubernetes that's not a bad assumptionto make because I believe upwards of 90%of current dupper deployments are onkubus you can run Dapper in VMS on PremI believe in serverless now but we'll beconfined to kubernetes for now now ourapplicationServices utilize directly or indirectlycertain commoncapabilities some of these are providedby the platform kubernetes itself likeservice Discovery scaling out some aremore of a cross cutting nature forAccess Control resiliencyuh observability you can find these in aservice mesh typically and then you havecapabilitiesclosely sort of related to yourapplication service that that they needand you utilize commonly vendor Cloudvendor services for that for messagingfor StateManagement uh there's a book that is acouple of years old still a very goodread that summarizes these capabilitiesin the form of microser Servicespatterns for communication State andworkflow again you use cloud vendorservices to implement thesecapabilities but everything beyond whatthe platform provides that's where youwill find Dapper entering the pictureand that's where we're going to focuson so how many of you usepubsub or have used if there is a singlehand that doesn't go up I'm going toquestion your uh not profession but Iassume every every everyone has at theleast heard of seen it maybe evenimplemented pubs up so let's start withmessaging and how you would Implementmessagingpubsub if you integrate directly with acloud vendorservice so hopefully you can see thatthis is a bit of JavaScript code innode.jsand this application service needs topublish a message to AWS SNS to acertain topicto be subscribed to by other applicationservices that use sqs to receivethem we are incorporating thisdependency a client Library directly inourcode we use a certain version of thisLibrary we import it into our code wecreate a client configure it with inthis case a specific region that we'rein running our application and then wepublish the message to a topic accordingto this SDKdeclaredAPI very common to see this but thereare a number of issues related to thissetup the most obvious one is of coursethat you are tying your application to aparticular client or driver version youneed to make sure that the SDK versionyou're using matches the functionalityor feature set of the specific resourceor Cloud vendor service you're usingotherwise you might get intotrouble also obviously you're tied to aspecific infrastructure in this case AWSSNS andsqs with a client Library you sort ofbuy into the way you configure yourclient the the resiliency features itsupports underneath retries for exampleand how you observe the publishing ofthe messaging messaging going on and ofcourse security aspects you needcredentials when you you tried this outlocally you have credentials set in acertain way when you run Productions youmight have something else inplaceso is this really what we as applicationdevelopers are looking for do we reallywant to embrace all of this complexityor is this what we are looking for in asense we simply want an abstraction thatsays publish a message to a certaintopic and all of the other things wejust saw maybe we can find a way to ifnot completely hide away at leastextract from our set of responsibilitieswhen we sho�uld focus on our codepublishing a certainmessage and this resolving these issuesis where you'll find Dapper entering thepicture and promising a setof uh supportive features that we canutilize to accomplishso here is an obligatory overviewpicture ofdapper on top you have application codeagain your application services thatmake up your distributedapplication you can write them in anylanguageframework as you seefit but in order to take advantages takeadvantage of the capabilities the Dapperprovides youcommunicate through HTTP and or grpcwith a separate runtime an actualseparate process which is the Dapperruntime and this runtime as you can seeoffers a set of buildingblocks four common capabilities that areclosely application related we just sawmessaging so there's a building blockiner that says you can utilize me forpups upyou can utilize service invocation ifyou need to speak or Target communicatewith a service uh from another serviceyou can use State Management if you needto store session informationor uh a lot of other things that yourapplication needs to write and read andthen you have more specialized buildingblocks like distributed locks secretsand configuration are pretty much whatevery distributed application needs weneed to get secrets and configurationinto our appsso this runtime encompasses all of thatand you can utilize again these buildingblocks through HTTP andgrpc and you can then deploy yourapplication Services utilizing thisDapper feature set potentially where youneed them to run on in a cloud nativeenvironment likekubernetes VMS on Prem and soforth sometimes I get the question oroften get the question how does Dapperoverlap with a servicemesh short answer does it overlap yescan they coexist also yes there's alittle bit more to go into when it comesto Dapper and service measures Iencourage you to go to that link on theDapper documentation it has a very goodexplanation of some of the aspects butthey go hand in hand if you want themto so these building blocks if we wereto seethem Illustrated in a different way wecan goto this image again for those in theback might be a little hard to see butthis isa um yeah map of commonpatterns that you'll find in a book byChris Chris Richardson and in this blogarticle they have put the Dapper logo ontop of the common microservices patternsthat Dapper supports for for example youwill find what we just talked aboutmessaging is a pattern thatmicroservices utilize thater supportsthat thater supportsconfigurationSecrets um some things it doesn'tsupport for example on the top there yousee eventsourcing we'll see how we can addsupport for event sourcing in a customway later on so if you have this map ofpattern for microservices you can easilysee how they map onto Dapper buildingblocks so if you building a distributedapplication with these Services you needto uh Embrace or Implement commonpatterns you can see how Dappa willsupport you in thateffort so let's look a little bitmore uh about building blocks sobuilding blocks are not actual things inthemselves they're just standardizedapis so Dapper formally declares thatthere is a building block calledpubsub and the API the way you interactwith it is also defined as are certainsemantics and guarantees so at leastonce delivery what the message format ishow you route these messages and soforth all of this is specified in thebuilding block but that's just theinterface if you will the actualimplementation of the building BL ofthis API is what components are socomponents are the actual things thatsupport a building block in all thesesemantics and they will handle theinterfacing with cloud services in thiscase we have a component that supportsPub sub via SNS andsqs so if we were to use this componentunderneath the building block that we asapplicationdevelopers utilizing our code that wouldbe the set setup we would getnote the topics and thecues should not be thought of as beingautomatically created by this componentthere is still a provisioning part thatyou need to take care of or ratherhopefully platform Engineers can helpyou with so th�ey don't get justinstantiated onAWS unfortunately actually in this caseit does but in for production you turnthat off so you need to manuallyexplicitly create these resources esthat the component will interact withthat supports your application codeso we have this and if we look a littlebit closer uh into the kubernetes sortof map with Dapper included we'll seethat in your pod where your applicationservice container runs you will haveDapper included as a side car injectedand again HTTP grpc communicationbetween the service and the side car andthe side in turn communicates withAWS via this component theimplementation of the building blog forexample Pub sub and so you have sort ofa control plane for Dapper as well inyour cluster because there are a numberof things that need to be controlledwhen you have a number of pods with sidecarsinjected again this all thater is allonly concerned with one cluster whathappens inside one cluster so there isno by default in interclustercommunication going on just to make anote ofthat also just as a note thesecomponents are not separate processesnot separate pods in your kubernetescluster they are actually built into theDapper binary bydefault sometimes that's hard for us asdevelopers if want to extend Dapper incertain ways so we'll see how we can getaround thatlater and if you want to manage multipleclusters with Dapper installed which yousometimes will have to do there aretools to help you with that thecommercial entity backing Dapper calleddiagrid offer a very good tool for thiscalledconductor which I can talk more about uhafter thetalk so let's go back to the originalexample and see what the daer equivalentlookslike on the right this is where you asapplication developers should beconcernedso we have extracted everything AWSrelated and now we're using just theDapper client the DapperSDK and only that with the pub subbuilding block topublish a message in this case just apayment ID to a certain topic confirmedto the pub subcomponent and that's it this is theabstraction we were lookingfor now the magic doesn't simply wireitselfup we need to do something else tocomplete this this implementation so youhave some manifest on the left which isthis component that you need to apply toyour kubernetes cluster so we canpublish to it and it can in turn publishto AWS but you see that we havespecified here the region inside of thismanifest that's a vitalconfiguration part that we are nowexcluded from our codeand there will be other things that gointo this component as well hopefullyagain us developers do not apply thisyourselves we'll see later duringprovisioning that this could be handledfor you if you request it by theplatformteam and you can see the actual publishcode gets translated into a simple restcall to our side car so you can see whatthat looks likebelow final thing that I want to mentionis that every application service inDeer gets an ID in this called examplecalled paymentprocessor we can use this to limit whatapplication service can do with regardsto components so here we're saying thispayment processor this applicationservice canpublish but only to a certaintopic now this is sort of an accesscontrol list right but the cool parthere is that thisaccess controlers you can switch out theactual component that we're usingAWS SNS for Kafka but the same exesscontrol list could be used for Kafka aswell we now have a portable wayindependent of the actual broker we'reusing the message broker and set AccessControl list for it so we can reusethose policies if we wantto just by switching out the componentimplementation suddenly we using cab butthe application ID is durable and we canuse it to continue limiting the scope ofpublishing for this particular paymentprocessor it's a small thing maybe inthe grand scheme of things but it's veryuseful when you uh get to actualdeployingdaper all right let's look at anotherpopular building block which is serviceinvocation to go through that quickly uhso you have one service called customerwanting to interact with another servicecalledcheckout usually if you wa�nted tocommunicate between customer checkoutcustomer would construct a URL HTTP andthe name of the checkout service to postsomething here we're using the name ofthe service checkout but we'reindirectly calling it via Dapper as youcan see in the first rest request therenow Dapper will use whatever DNScapability can find in the environmentit's deployed in for kubat is it'sstandard QBEDNS the side car will call the Target orremote side car perhaps encrypted usingmtls and then the request will be postedto check out and it will need to listenas you can see on a post SL orderendpoint to receive the request fromcustomer and if you want to we can turnon automatic observability we openTelemetry so we can get all kinds ofinformation metrics distributed tracesEtc and also here you can apply AccessControl policies resiliency policies ina standardized portable way so all ofthis if we summarize it we now get in asense complete application portabilitywe already had it for our compute partyou can put your or deploy yourDocker image into any kubernetesenvironment have it run thereindependent of the cloudvendor but you were not as um luckypreviously when it came to actuallyintegrating with specific services thata cloud vendor could offer now you havea portable way to do that as well soportability both compute your workloadsand the integration points and needsthat youhave which means that if you need to doany type of switching between betweenproducts underlying infrastructure youminimize the costs hopefully usingthater as we saw we can replace theversion and the actual underlying ummessage brok if you want to withoutaffecting our applicationcode there is some modification to thatstatement which I'll make later but whenyou're building out your architect younow have common building blocks that youcan use you don't have to go down tohopefully a muchcustom code that um is required if youhave this these building blocks toutilize over time again policy is thesame thing with just a way to Portportably construct Access Control liststhe same goes for resiliency policiesyour skills hopefully once you learndaer you can apply them in manydifferent projects and and Endeavors sothere is a sense of um reusability ofyour knowledge there as well minimizingyour mental switching cost perhaps andand I cannot emphasize this enough youdon't have to buy into Dappercompletely you choose the buildingblocks that make sense for you hopefullythey cover enough of your requirementsmoving forward so for example should youhave a legacy system a big monolithyou're trying to apply the Stranglerpattern to extract certain features intotheir own application Services eachservice can then be a at for Dapper andthe building blocks that you seefit very important Point okayso how do we actually work with Dapperinpractice as an example I want to use abike rental service and again apologiesif not entirely clear picture-wise forthose in the back but this is supposedto be a domain model using the eventmodeling technique for picking up a bikeat thestation the blue stickies are commandsthey express intent to change the stateof a system the yellow stickiesrepresent events or state Transitionsand the green ones are read models orviews of system State and this shows howyour system changes state over time asyour user progresses through thisparticular business flow of picking up abike so I'm going to go through them notall of them but you see that theseslices that make up our business floware good candidates for aligning withbuilding blocks in Dapper so we take thefirstone there you can imagine that I'mentering in my credential at the stationwhere there is a panel and itsends some information including maybemy credentials to an APIbackend and this API backend then sendsa request to bike Management in thisdomain model bike management conconstitutes a capability in our systemfor managing bikes it could then be oneor more application services at thispoint it's not really that important butwhat is is important is that our APIbackend sends a request request bike viaDapper service indication t�o bikemanagement to record the fact that abike has now been requested by thecustomer so we see one building blockservice invocation being used in thisslice and bike management need needs touse State Management to store this factthat a bike has now beenrequested so we see how they these twoblocks align with this slice next wehave two slices have to do with readingState again a selector that needs toselect a bike for me needs informationabout available bikes so you can useState Management to read that fromwherever store that has thisinformation and the bike request thatjust came inend Dapper will now combine StateManagement with pubsub to arrive at atransactional outbox so there is arequest being made it will be recordedbut at the same time anatomically we'llalso publish a message to theselector and the selector has now allthe information it needs to move on tothe fourth slice basically unlocking thebike and it does so there's again Pubsub involved to trigger aworkflow there's a little bit oftrickiness going on when I want to pickup the bike when it's been unlockedbecause I have a time period there's 60seconds duration before it gets relockedunless I pick it up this a bit hard todescribe maybe or just codestraightforwardly but Dapper providesanother building block which is reallyinteresting called Dapper workflow itcan use your standard building blocksand compose them together to implementthe workflow for when a bike has beenunlocked there's another building blockI also want to explicitly mention calledbindings so if a bike is unlocked howdoes that happen we have to tell thestation the station subsystem or coursesystem to release the bike from the slotand turn on a light that saysgreen so we can use bindings tointerface with external systems that'svery interesting which I'll go into moredetailshortly the rest hopefully is more orless self-explanatory the workflow endswhen I have hopefully picked up the bikeor if the time run timer runs out buthopefully you can see we can start toreason about our system architectureusing theer building blocks and maybe afew of our own that I haven't added umwith these slices so we can go directlyfrom the main understanding to the startof a systemarchitecture few words about the Dapperworkflow part you have a workflowengine that you can use to write workClos incode the durable execution part isinteresting because you should somethingcrash you have made some progress in theworkflow you can resume where you leftoff and you can use certain patternsstandard patterns in your workflow uh ina in an easy way so quickly thisunlockingpart or workflow that I showed in thethe main model if we start by saying tothe renter your bike has now beenunlocked that's an an activity and itwill use what's called an output bindingto tell me maybe through a pushnotification or show me in the panelthat your bike has now been requestedyou can find it at slot 16 this workflowwill now wait for the bike to be pickedup an external event and if it gets theevent in time it will then use StateManagement to record this fact in somestore if I'm unlucky and do not pick upmy bike the timer will fire and it willrelock the bike again at the station andwe end theworkflow and it Maps pretty well to whatyou might think you would see incode um you describe your workflow youractivities you compose them together andyou're good to go more or less veryexciting about using this workflowengine particularly inDapper and that's a standardmicroservice pattern so you might heardof the term Saga insteadum but that's something again thatdupper provides for you should you havea need for itokay sometimes there isn't a buildingblock for everything I mentioned thatthater might cover a lot of things butnot everything so sometimes you need tothink about going your own way to someextenthere I've had a need to work with eventstores so if I want to do what's calledevent sour sourcing I want to take thoseyellow stickies that you saw in the inthe domain model and I want to storethem explicitly as a source of Truth formy system I can utilize an event sto�rewhere I can store these events nativelyinsequence different from message Brokersor event Brokers event stores storeeventsspecifically there isn't a buildingblock for this to communicate with eventstores or do event sourcing how do weget a building block well there is aprocess for it but it takes a long timewe can't wait for that so we can usethese Bindings that building block tocreate our own communication path withan external system such as an eventstore and we can register this as apluggable component a separate processthat tells Dapper I exist andapplications can now use me to interactor interface with event storeso how does thislook normally in event stores you appendand read events directly we're going touse another pattern inmicroservices uh context called cqsmaybe you're familiar with it in thisevent model we'll now think of theserelease bike and bikereleased types of messages as actualthings that we will use in our system soyou see we have removed serviceinvocation that we pre use previouslyused and now when a bike gets picked upthe station will actually send a commandrelease bike to a Target command Handlersomewhere where the command will beprocessed an event will be generated thebike released event it will be stored inthe event store we'll build up theavailable bikes view from it that theselector canquery so commands and and events they'renot just things in our domain modelanymore they're actual things we use tocommunicate between different parts ofour system we're using the adaperbindings component so we're saying if acommand is sent out it willbe uh sent via Dapper to a specificcomponent what we Implement that willtalk to the event store that I've havechosen for this example that I've workedwith a lot called axon so axon is atechnology stack which incorporatessupport for event sourcing and this typeofmessaging so we will via Dapperinterface with axon the that event storeand it will route my command to acommand Handler that could be anotherDapper service for all we know but wecould implement it as such and again thecommand gets processed we apply andgenerate an event gets stored in ACCserver and we can query the resultingview available bikes through a querymessage so there are a bit of thingsgoing on here but the main part to takeaway from this slide is that Dappergives us the option of interfacing withexternal system that it doesn't havebuilding blocks for but that we canbuild our own componentsfor and this is how we would interactwith an event store which has been aneed ofmine so we will callthis component messagebus and looking a little bit moreclosely at that component that we'rebuildingourselves so in our app we wouldwrite binding. send we would say sendnow something to this component calledmessage bus there's a command that Iwant you to handle so we'll invoke anoperation that a command supports calledsend command and we'll simply providethe actual command that we want to sendDapper will will VI a grpc communicatewith our component again a separateprocess running incubators and thatcomponent knows how to speak to theevent store but we have to apply someyaml to make sure that thatcomponent actually getsinstantiated so hopefully you can seethat when Dapper doesn't cover yourneeds there is some things you need todo on your own that might be trivial allthe way to non-trivial depending on yourown background and Knowledge and SkillsEtc um but this would be I think a veryillustrative case in terms of the stepsinvolved for interfacing with anexternal system in this case an eventstore I want to just stick around alittle bitmore when it comes tomessaging messaging is not just puup subas we have seen there are differenttypes of messages events are one kindthere are commands and there are queriesas we have seennow if you're using messagingasynchronouscommunication for most of theinteraction if not all of yourinteraction in your system there aresome interesting properties that comeintoplayso when you're building a new system sayyou might not always want to godistributed all the way through theoutset you might sa�y let's build asimple version of our system first let'sbuild a monolith modular of course butwe will use messages to communicatebetween the different parts of thismodular monolith commands events andqueries as time passes you might seethat certain parts of this modol cannotbe extracted they need to be their ownapplication Services then you can simplyextract them why because messages arerouted to a destination the caller thesender of a message doesn't know whereit ends up when you publish a messagewhen you publish an event EV you don'treally know who the consumer is that'sthe whole point so using messaging youcan extractcomponents the way you need them to beextracted and scale up as needed so youcall this locationtransparency very nice feature of ofmessaging however Dapper is distributedit's in the very name it means thatevery type of communication goes out ofyour application process outside to thenetwork Travers is a networkboundary so it makes it a bit hard tobuild systems that are message drivenwith depper if you don't account for itI'm not going to spend too much time onthis but you might have to introducespecificabstractions along with Dapper to enablefor example locationtransparency um so in this case we haveto build what's called a command bus tosay sometimes the command Handler thatwill process our Command is the verysame process as the rest of our systembut sometimes it needs to go out to thenetwork we as center of thecommand don't really want to know whereit goes it just needs to be handled sowe need again something else on top ofthe upper to Route commands and ingeneral messages in an intelligentway and again this is not so much aboutextending Dapper but you will run intocases with Dapper where a building blockdoesn't cover all your bases so youmight have an application service thatuses the building block State Managementto write and read state in this case thecomponent would be a post SQL databaseonAWS but then suddenly you have a need toexecute raw SQL queries against thatdatabase the State Management buildingblock doesn't support that doesn't knowwhat SQL is and there is no SQL buildingblock I don't even know that's would bemeaningful but you need to do some rawSQL query what's youroption well it's will become adependency you you will have to addsomething else to your code that doesthis explicitly but at least you canisolate that part so you might introduceanother service hopefully you can limitthe SQL related functionality to justthat service you can use maybe duerservice ification between A andB uh to execute the query but again youhave the problem with a specific clientLibrary version Etc but maybe you canlimit the scope of that impact on yoursystem but dependencies of this kindwill be yeah unavoidable but you canwork around them hopefully to to a goodresult all of this to say extendingDapper these types ofdependencies a famous architect said thefollowing you can build abstractionlayers which is what Dapper is it's aset of standard apis right withsupporting componentimplementations you can't hide theunderlyingcharacteristics of your platform andyour infrastructure they will surfaceyou will have to account for them andthey might impact your applicationBehavior in subtle ways or not so subtlesometimesexample I can use stage management I canread and write state to Dynamo DB thereis a component for it I can also do itto a postc database those differ widelyin terms of scaling capacity of latencypricing right so consistencyguarantees so just the fact you're usinga building block C State Managementchoosing the actual component might havesome behavioral impact on yourapplication so if you're applicationteam you have a certain budget you canjust just relaxing with this abstractionlayer called Dapper doesn't fulfilleverything that you needed to do youwill notice the underlyingcharacteristics uh as you work withDapper you need to bring thatconsideration in into yourplanning so that's one of the verycritical part I want to convey to you inthistalk in some of the final pieces of thistalk I want to go into how Dapp�errelates to platformengineering remember just you're usingthis obstruction obstruction layercalled Dapper using supportingcomponents but the actual resources donot get automatically created for youyou need to provision your your DynamoDB table your SQL databaseEtc so what tools to use for that whodoes it if we start with tooling thereare a number of such tools to choosefrom I ama um effici andado of crossplay recenteffici a I like the way you can declareabstractions for application teams thatthey can apply to a cluster in this casea platform team together withapplication teams have defined what adatabase is so I can as an applicationdeveloper say I want a database I wantit to run in this case onAWS and it should be a small instancewhatever thatmeans that will be created provisionedfor me uh on theAWS so if I want to use the buildingblock for StateManagement and pogress is what I have inmind for a supporting componentthis database abstraction when I applyit would provision this database forme but there's a little piece missing soif I apply that yaml you just saw thecross plane plane will take care of thatfor me it will instantiate what's calleda composition we can leave that termaside for now all it does it it gatherstogether everything required on AWS interms of resources and it gives me thedatabaseit also creates credentials for me andit populates the Dapper component yamlwith it so the crossplane stuff in thecomposition together with the Dappercomponent thatmanifest will then be applied togetherand my app is now ready to Via Dapperand this Dapper component read and WRState uh on AWSagain who is responsible for what justnaming myapplication is it me as an applicationdeveloper decides that together with theplatform team naming of topics againthis is a wider issue that this talkcannot go into for obvious reasonsbecause we don't have enough time butthis is just a technical wiring of itall there is a negotiation that needs toTak place takes play takes place betweenthe application team and the platformteam but this is how we you would wireit up in in asense again this is still quite lowlevel but there are other tools thatraise the bar addition additionally soif you use something like critics youcan have onemanifest more or less that says not onlywill I installed Dapper for you in yourcluster I'll also add this crossplanething you speak ofI will deploy your application yourworkloads as well as the database youwant and when it's all appliedeventually it will be up and running foryou in the environment that youdesire so creatics gives youa capability to sort of describeeverything you need for your applicationfinally to run uh on a higher level thiscould then use for example Cross Lanebehind the scenes but I like that thereare tools now that go through all theway to give us a application developersa simple perhaps just one manifest torule them all a bit more detail intothis but hopefully you get the sense ofwhat we can accomplish using a tool suchasthis this is important if I choose toadd a building block to my applicationand I need an additional infrastructurepiece they need to be in sync otherwiseI'll probably endend up in a state thatI'm don't want to be inso I want to end with some notes aboutwhat developers actually think aboutthis and whether you should adopt Dapperhow you should think aboutit so if you just look on the left youhave building blocks and theirpopularity among developers who haveadoptedDapper there is hopefully not too muchsurprise in pubsub being very popularservice invocation and State Managementclose second andthird on the right you have why peoplelikeDapper the one that surprises me when Isaw this was the second reason no codechanges when swapping components forexample if we move from red to kka forPub sub now does this happenoften probably notis it nice that you're able to do thisthat's the whole promise of dapper butI'm struggling with whether this is anice to have or it actually wassomething that Iused um but again that's what Dapperoffers so I guess that's what peopletake to when they when they look atthaterso hopefully you get a sense of whatpeople actually experience and some ofthe outcomes whenadoptingDapperso we get with Dapper a platform andlanguage agnostic way of buildingdistributedapplication and AssociatedServices which means that we get true ina sense applicationportability for our compute and for ourintegrationneeds by its very nature we separateapplication code from underinfrastructure but that's not the wholestory as we just saw the provisioningpart the platform aspect StillRemains and Dapper is again not one bigbuffet from which you have have tosample everything you choose thebuildingblocks and the same mindset you canapply whether it's a migration scenarioor a green field pick the parts of yoursystem where a certain set of buildingblocks make sense and you can simply addmore later so here's a $64,000 Questionshould you adopt Apper ornot I can only give you like a signalthings to lookfor if you're say building out a newsystem and a large I would say verylarge part of your system and thebusiness logic you have toimplement could be done through buildingblocksalone that's a very good signit means that these are um quiteexpressive and Broad enough to to covermost of your needs and if you also weighit to the management investment you haveto do you have to manage your Dapperinstallation in kubernetes you have toinvest in PL platform capabilities Etcif you can keep that cost down or to areasonable degree compared to what yousave and how effective you can be withthater when you actually build yourapplications with these buildingblocks I think you might have a reasonto at least look into it moreclosely and if you're interested inhaving that discussion more in depth I'dbe happy to take your call so pleasereach out reach out to me if you'rehaving questions about adopting Dapperand you need more insights into thethings we just discussed thank you verymuch a question in the back weyeah we have some time for questionsright okay thank you Mark for the talkvery interesting uh I just had aquestion like what were like the most Idon't know interesting uh applicationsor sorry adoptions of thater productionuh that you have seen uh and like howwere thearchitectures Uyeah I'll I'll answer that in a more ona more General level but that cussacross the project I've seen seen it umapplied mostdevelopers and this is not a skill issuemost developers including myself we havea hard time building distributed systemsthat's just afact the patterns alone that you sawthey're they can take quite some time todig through and the knowledge and how toapply them and when it's not anon-trivial or this is not a trivialexerciseso for teams that are just being exposedto this maybe they are currently with alegacy system and they need to becomeproductive quiterapidly and have an easy path tounderstanding what it actually means tobuild a distributed system this is whereDappershines it takes away a lot of themagical mysteries of distributed systemsmake it little more easy to Crunch andactually Implement and put intoproductionand it especially takes away the NRdetails of configuration as we've seenso that's where I've seen thater havesuccess from only speaking now from anapplication developer point of view yousimply feel and you are actually moreproductive as adeveloper uh withDapper um and as you saw the most commonbuilding blocks those are also the onesthat I've seen utilized mostly um thoughthey cover many types of scenarios formessaging State Management and I'mbeginning to see more about the workflowpart aswell hopefully that answered yourquestion thankyou anyother musingsquestions again the promise I would saylargely is the efficiency gains asdevelopers and this is mostly where Iwant to part this talk for applicationdevelopers um and what you can get outof it so hopefully you maybe got someinspiration but requisite Insight inwhat it would take to to work withDapperyeah one more question maybe or if notagain reach out to me if you if you needto wantto I think we're good all right thankyou[Applause]2025-04-15 22:17:46.061062�u stories of whenthese things go wrong and when they gowrong what could happen and then we'llshow you like a quick demo for onepossible way of improving your softwareSPL and security and right off the batbefore I get started it's just a verylong marketing term that means securingsoftware from where you produce it towhere you run it right the supply chainis just a term that basically means forus the developer cicd right justmarketing for cicd right so but it alsogoes into uh making sure that you areactually using software that you trustbecause most developers just downloadbunch of random stuff from the internetso making sure you trust yourdependencies uh you um verify yourdependencies and then securing theentire process by which you get softwarefrom where you build it to where you runit uh my name is abdal I'm a developeradvocat Google I do this thing calledkubernetes podcast by Google and I alsowork on kubernetes mostlyum anything around containers and so Imean this is this is what people mean bysupply chain again for us it just meansdevelop like cicd continuous integrationcontinuous delivery you have some codethat you write you take a bunch ofdependencies you put them through abuild system you produce a package andthen you either uh Supply that packageto somebody else if you're building alibrary or you run that packagesomewhere right I think that this is thesimplified version of how supply chaincould look like for most companies mostdevelopers but one thing that a lot ofpeople actually don't realize is that alot of R dependencies are actually builtthe same way a lot of people today evenopen source projects are using the sameexact processes by which you buildsoftware and and run it to build theirown software except that they don't runit they just you know host it somewherethey provide you with anartifact and because of this cyclicdependency between you you build yoursoftware and how your dependencies arebuilt it makes this extremely complexand extremely vulnerable to a lot oftypes of attacks and I'm going to talkabout some very concrete examples justsome numbers for reference those areresearch that have been done by sonattype uh synopsis and Gartner um theseare quite old but just in 2021 therehave been a 650 per surge in open opensource supply chain type attacks um 81%of Commercial Code has some sort of osvulnerabilities and um by 2025 which isnext year about 45% of organizations aregoing to face some sort of attack of thetype supply chain right and I'm going togive you some really U concrete examplessome of them are listed here log log 4Jon remember that one right solar windsso let's talk about solar winds that's avery interesting one so solar winds is acompany that makes uh monitoringsoftware and one of their customers uhwas a or is a company called Continentalpipeline which is an Americanconglomerate that supplies the eastcoast of America with oilso they take oil from wherever theyproduce It in America and then they justuh supply to the east coast to all theplace where the oil is transformed andthen distributed and so solar wind wasattacked well was attacked somebody atsolar wind downloaded a package that wasvulnerable patched their own softwaredistributed the software to theircustomers their customer is one of the Muh Continental pipeline downloaded thesoftware installed it on their serverand then U the the vulnerability gotactivated and then the attackers did anentire take down so they basicallylocked all the computers all servers outright and the funny the interestingthing with these kind of attacks that weare seeing today is that they have realworld concrete effects on humans it usedto be that security is something that isfought on the internet bad people youknow good people trying to like fighteach other on the internet when solarwind happened there was two to threehours cues at gas stations in Americabecause people were afraid there wouldbe no more gas people were actually isscared they cannot buy petrol anymoreand so people were rushing to go to togas stations to stock on on on gas soit's like it's somethin�g that we willfeel as humans the old the oldest I'mgoing to talk about another story andthen come back to to to to one thathappened in Sweden uh two years ago thefirst time I heard about supply chaintype of attacks was I think you probablyhave have anybody heard about a companycalledMK you have probably at least seen itonce or twice yeah a few of you rightyes so it's a container company it's aDanish company and they basicallytransport containers and uh there isactually an article on the verge which Ihighly recommend you read it becausefour years ago they were attacked andthe way they were attacked is thatsomebody put a back door in a piece ofsoftware that was installed by a MKemployee in Ukraine and that back dooris a warm that basically just once itwas activated inside somebody's computerit just traveled inside the entirenetwork of M and one of the things thatM did not do at the time is that theydid not have Network segmentation sotheir prod and corporate Network were onthe same place right they had nofirewalls no limitations so the the wormspread across the entire network andjust locked everyone out of theircomputers everyone servers computerseveryone was locked out and this had hada lot of effects it was estimated thatthe amount of damage that was done bythis attack which lasted for two weekswas something around the line of5billion um there was one of the effectspotential effects so five billiondollars that M lost did this this numberdid not include how much their customerslost because there was customers thatshipped stuff in containers and thereare stuff that needs to be refrigeratedand they were sitting somewhere for twoweeks not being refrigerated and nottransported and things go bad especiallyif you are shipping fresh stuff rightmeat fruits vegetables so in one oftheir biggest uh one of their biggest umports in the US they have an automatedsystem that scans the license plate ofthe trucks and then print out a piece ofpaper that tell the truck where to gopark so that they can offload thecontainer that was not working and therewas about 3 kilometers of lines oftrucks outside the port just waitingright so these are actually effects thathappens in real life and the funny storybecause I think it's funny how did theymanage to recover they actually foundout that they had so they they useactive directory or they use activedirectory at the time and they had an atypical active directory farm right youhave a main server and then you havereplicas one of their data centers in inSouth Africa went down so gotdisconnected from the rest of thenetwork just before the attack happenedso they had a fresh copy of activedirectory on a server in SouthAfrica and they had to ship a hard drivefrom South Africa all the way to Denmarkthrough the UK to be able to restorebecause because when the warm spread inthe network even backups gotdeleted so like they lost everything sojust like concrete real life thingsanother one which happened in Swedenjust recently is uh we have a bigcompany a big Supermarket shade calledcoupe right and two years ago they wereattacked and they were down for umanywhere between six six uh 24 hours and6 daysand it was a supply chain type attackthrough a piece of software called CAwhich they use so it's a commercialsoftware so they don't build itthemselves they just buy it and so thatbasically locked them out of all thecashiers and the self checkout bootsacross the stores across the entirecountry the problem when this happenedis that in some parts of Sweden Coupe isthe only Supermarket that existthere like like in a lot of very remotelocations Coupe is actually known fornot operating very big stores theyoperate a lot of very small storesthat's why they have 700 of them uh 700is a lot Sweden is not a very bigcountry there's like 10 million peoplebut in some remote locations that's theonly Supermarket where you can buy foodand imagine you cannot buy food for sixdaysso why is this a problem well because webuild software in a very complex waythis these days right it's extremelycomplex and because it's complex thosesupply c�hain can be vulnerable to a lotof lot of types of attacks the mostcommon vulnerability is a vulnerablepackage whether it's intentional or notintentional actually actually I can giveyou another example very quickly therecent one do do you remember XZ the SSHone so XZ is is a compression librarythat is used in SSH so when you use SSHor specifically SCP to copy filesbetween servers they get they getcompressed and there was people thatintentionally added a back door to thelibrary called XZ which then is used byS and it was discovered by somebody onMicrosoft because they discovered thatonce they run SCP to run files SSHcommand consumes a lot of CPU and that'sbecause in in the back it just shippingwhatever you're copying across twoservers it's sending them somewhere elseright so vable package intentional ornot intentional right that that couldhappen I mean not intentional becauseall developers are good but also alldevelopers can be bad and people buildbad code right um you can have avulnerable package that doesn't lookvulnerable you scan it you look at ityou read it you use your scanningsoftware it looks all goodonce you put it in production it willdownload Trigger and malicious updatesso it has a built-in mechanism by whichit downloads a remote you know bashscript or something and then it becomesvulnerable but when you scan itinitially it's not and this is actuallyvery hard kind of thing to to to to toto to detect because a lot of times whenpeople Implement back doors in packagesthrough this way they will just base 64encode the URL so for you it just lookslike a random string and you would knowthat is an URL right so you have to takethat URL and Bash 64 base 64 decod it tofigure out where it is right um you canhave vulnerable source code right I meanagain bad developers exist everywhereand people write badcode or you can have clean code cleanpackages and then your pipelines getsbreached right and this happenedactually to a company called circleciThis is a manage C company theybasically got breached and they leakedtheir own AWS keys not their customerAWS keys their infrastructure AWS keysto GitHub right and took them two weeksto realize that so all of all thecombination of these two things can makeit a way that you end up with avulnerable softwareright so I talked about the XZ outbreakuh this actually was very fascinating Ihighly recommend you go look into theanalysis of what happened because thepeople who implemented this umvulnerability they were intentionallydoing itlike it was intentional it was not andand and it took them three years to beable to do it because they started bycreating GitHub profiles and contributedto XZ and they looked like legitimatecontributors and they they just keptcontributing contributing up to thepoint where they just added the the thevulnerability that they were planning toadd it took them three years right alsobecause the funny thing is when you whenyou are maintaining open source softwareyou probably don't know who you areworking with right like the people thatyou just basically saw a PO request andyou you just review PO requests but youdon't know the people behind it youdon't know them you probably have neverseen them and you probably would neverseen them in your life right and a lotof Open Source I think there was like athere was um I was at one of the Linuxfoundation events and there was um ananalysis they did on a lot of these uhopen source projects like kubernetes andyou know the Linux the Linux kernel andlot of git itself and lot of and it'slike something like 60% of of thecontributions are called drive bycontribution so a drive by contributionis a person that contributes only onceso they they would only open a POrequest once in the entire life of thatproject that's why they call it drivebybecause you just contribute once andnever do it againright um so yeah so because because yoursupply chain is so complex it can besubject to a lot of types of attacksright bad code compromised source codecompromised build system compromisedpackaging system um one very concreteexamples that happens a l�ot for us is uma lot of times people in produthey don't actually filter out egresstraffic so you have an app in productionwhich makes outbound calls to anexternal endpoint but you don't filterthat that traffic so you don't reallyknow what the app is doing right um oryou don't intentionally Do It um and youend up with basically uh a package thatonce it runs in your production it willdownload an extra update that will makeit vable right so you started withsomething that you scan that looks goodbut then once you're R into productionyou just download some random updatesfrom the internet right so what how dowe solve this right so there have beentwo basically type of efforts that thesecurity Community have been working onone of them is called zero trust and theother one is called shift left neitherof those are very specific to securityzero trust is a networking term whichbasically means in a microservicesenvironment zero trust means that everysingle microservice verifies everysingle call and you don't blindly testthat you don't blindly trust thatwhoever is making a call toward yourmicroservice is a legit actor right soyou have a way to verify that callwhether you authenticate typicallyauthenticate and shift left essentiallyis a word that we came come from devopswhich meant making which in the contextof software supply chain security itmeans making developers more aware ofwhat they're doing not necessarilyShifting the responsibility todevelopers but just making them aware ofwhat they're doing and I'm going to talka bit I'm going to show some concreteexamples about the zero trustpart so a few years ago there was thisuh project that got started this is acommunity project called Sig store soSig store is a combination of multiplecompanies uh all these companies arecontributing in in into Sig store themain uh driver is a a foundation calledcalled op ssf which stands for the openthe open source security Foundationwhich is a sub foundation of the Linuxfoundation and they are doing actuallysecurity for open source generallyspeaking right and then there is a bunchof other companies and so they do threemain things they build software to signand verify other software and then theyhave ways to monitor whether softwarehave been signed or not I'm going toshow you concrete examples so one of theFrameworks that Sig store uh works on isa framework calledSalsa um there is an a missing but westill pronounce it salsa uh and itstands for security levels of softwareartifacts and essentially salsa is achecklist type framework that has a setof requirements that if the way youmanage software and the way you buildand ship your software meets all theserequirements you can say that my supplychain is level one level two level threeor level four compliance the checklistlooks likethis um it's a very complex longchecklist this is the vers first versionthere was an update to salsa frameworkthat focused only on the build but wecan quickly look at Salsa 3 you see thecolumn that says salsa 3 and on the leftside it says for example that you haveuh version Source control verifiedhistory retained history for 18 monthsyou have a scripted build system youhave an eeral build system whichbasically means you don't run two buildsin the same environments blah blah blahblah so those are all requirements rightand if if your software meets thoserequirements or so sorry if your supplychain meets the requirements itbasically means you're compliant withone of the levels one two three or fourtwo there are two big projects that areactually uh already certified to besalsa 3 compliant and you probably knowthem promethus and Argo so prises andArgo have all been through thecertification process uh which there arecompanies that does this for money ofcourse you can do it yourself U one ofthem is chainu guard it's company thatdoes this I I don't I'm not affiliatewith the chainu guard I just think thatthey're doing cool stuff and so theother thing that they do is that theybuild also software and tools and I'mgoing to show you today cosign which isa tool to verify sign and verifycontai�ners right or oci imagestechnically um so any image which iscompliant with the oci format you canjust verify it with cosine but they havea bunch of other things one of myfavorites is a tool called recor sorecor is a public Ledger where you canfind traces that a piece of artifacthave been signed so you can actually gothere and search for a library and thenyou can see if that Library have beensigned who signed it when it was signedand all that stuff right just a publicLedger they also have like a a publickey infrastructure because in order tosign stuff you need certificates andkeys and stuff like that so they have abunch of infrastructure that they offerfor open source projects for free to beable to implement all these mechanismsbut the the whole point of the salsaframework is to do this or the sorry thewhole point of the six store frameworkis to do this on the left side you arethe developer you are building some sortof software or artifact you have theartifact before you publish it you gorequest acertificate so you can get certificatesfrom anywhere you can get certificatesfrom but they have their own servicecalled fuo so fio is just a certificateAuthority right you download acertificate from ficio you use thecertificate to sign your artifact whichis on the top right corner signedartifactthen you publish that into recor the thetransparency log here once you publishit any end user can basically go intorecord type the name of your library andthen verify if that Library have beensigned who signed it who issued thecertificate and all that stuff so thisis supply chain security according tosix store this is their version ofsupply chain security but you don't haveto do this because this is mostly foropen source stuff the idea is that youcan get inspired from all theseprocesses and Implement them evenprivately of course privately you're notgoing to use a public certificateAuthority you're not going to have apublic recor because your your yourlibraries are private recor is both atransparency log but also a piece ofsoftware so you can run it yourself youcan have an internal version of recorright and I know there is actually Mavenis working on uh a record specificinstance for Maven so very soon therewill be a maven record which you canlook for Java libraries and then you cansee who signed them how they were signedand all that stuffright uh we talked about salsa we talkedabout the levels very quickly the lastpiece of the puzzle for supply chainsecurity is this Tang called es bomb andwhich stands for software bill ofmaterial I have a very um and I I I willgo back to the previous thing because Iforgot to say something there but I havea very basic very dumb way to explainwhat es bomb is you can think about esbomb as being the recipe version of yourgrandmother favorite recipe whateverrecipe your grandmother used to cook foryou and left you with a recipe thatrecipe is a set of instruction C and theset of ingredients right for softwarethe ingredients are the libraries so sbomb is the list of ingredients the onlydifference between a recipe and softwareis that your grandmother favorite Panamaybe right that's that's yeah that'sbasically a list of ingredients that youcan just buy over and over again andreplicate the dish multiple times in thecontext of software each artifact willbe different because artifacts are notbuilt the same so each s b should beunique to that specific artifact and ifthe artifact changes the s bomb shouldchange makesense so going back here just reallyquickly this another another anotherdumb way of explaining all of this topeople is to say you know when you go onthe internet and download command lineand sometimes with that command linethere is a sha 256 file right that weall download and we all verify right weall do it right good okay uh so thatshot 200 56 is the signature of thatcommand line so the people who publishthe command line they will calculate thesignature and then they will publish thesignature in a text file calledsha256 and what you're supposed to doyou're supposed to download the the CLIdownload the shat �56 and use a commandline on your computer to verify if thesignature of the the command line youdownloaded matches the publish signatureright this is just a more fancy way todo exactly the same thing this usescertificates and public keys and privateKeys the Sha 256 example uses a verysimple uh uh check some command makesense so the whole point of this is wedon't have to do sha 256 anymore becauseit has its own set of problems then thisis just a more sophisticated way ofdoing it make sense goodso putting all these things togetherusing salsa or ES bomb together orseparately are supposed to becomplimentary you can use salsa itselfyou can use es bomb you can use both youcan use none of them es Bomb by the wayis going to be mandatory in Europestarting2025 because the European Union havepublished a law called Nest 2N2 which forces companies operating incertain domains to generate store andSupply s bomb if they're asked by TheRegulators and that's going to be that'sthe new version of gdpr right that theEU introduced and companies in whichsector you might ask Finance Healthcareenergy blah blah blah the the key onesmilitary of course you know the the likethe the the major sectors the sectorsthat if something goes wrong people lifegets impacted right so let me show you avery quick example of how this lookslike so I have a very simple um uh uhcall um app here uh which is written ingo so it's just a very simpleapplication uh which has a Docker fileand I'm going to use that to build myapplication okay I'm going to not dothat and then just go copy the commanddirectly to build thisapplicationso so it's a very simple hello world Idon't really care too much about what itdoes I just care about showing you anexample so I'm going to build it I'mgoing to wait for a secondhere and I'm going to show you the theexample I'm going to show is usingkubernetes and um if you don't knowanything about kubernetes just know thisvery very very basic simple factkubernetes is a super stupid system whenit comes into deployment stuff when youtell kubernetes please deploy this imagefor me as long as it can pull the imageit will download it it doesn't care thatimage could be coming from your ownartifact registry or it could be comingfrom dockerhub or it could be comingfrom I will hack you.comgood image trust me right it will itwill run it right so I build this imageso I'm a developer build the image sofar so goodright so if I do this my image is therenow what I'm going to do I'm going totag it because I will push it to to aremote container registry uh don't worrytoo much about what these two commandline means it's basically just taggedthe same image with two different tagsone called unsigned and one calledsigned and I will explain to you whylater I will will take my unsigned imageand I'm going to push it to a remotecontainer registry so Docker push willpush my command my my my my container toremote the container registry and itwill fetch the Sha the nend Sha thesignature it will fetch that right so ifI do echoon uh dollarthis uh then I have my uh uh IGN chatthere right cool so I'm going to go intoI have a kubernetes cluster and I'mgoing to deploy this this containerimage onit uh withoutuh uh without doing too much to it justlet me give me onesecond okay coolso again don't worry too much if youdon't understand what this command linemeans I can explain to you it basicallyjust creates an application insidekubernetes called unsigned Dello worldand it's pass an image and the image ismy unsigned image okayso this is just to demonstrate the factthe very simple fact that kubernetesworld as long as the image isdownloadable it's installable right sobasically what Idid is what most people do build animage push it somewhere deploy it notrust no verificationnothing of course this is stupid becauseno one can tell me no one actually canensure that that images coming from me Icould have downloaded from the internetI could have literally Docker pulled itfrom again I am a hacker. trust me thisis a good image right kind of kind ofthing and uh would run it to kubernetesand that� should be a problem but that'sactually a problem in the sense that Ijust run a bunch of untrusted code in acluster right so how do we avoid to dothese problems one of the ways you coulddo it is you could use a command linecalled cosine I'm going to show you soI'm taking my signed image this time andI'm going to push my signed image to mycontainer registry and fetch theSha and then I will take this cosine socosine is a command line tool that ispublic by six store it's Cloud agnosticin the sense that as long as yourcontainer image is oci compliant ocistands for open container initiative andthat's the spec that all Container toolsused to build images that's basicallythe standard spec uh that's actuallywhat allows you for example as adeveloper to use Docker to build animage and use podman to run it becausethey both just implement the oci specright so as long as your image is ocicompliance as long as it's in a registrythat cosign can talk to and as long asthe key is either coming from a KMS akey management system or it's a key thatyou generate yourself right in myparticular case you see on this linenumber 52 where it says D- key I ampassing a key which is in gcp KMS sothat's Google Cloud platform our versionof key management system which is just atool that generates and stores keysright so public and private key so thekey that you have there after the flagD- key that's just an asymmetric key apublic private key right so now what Idid is I will tell cosine to use thatkey as an input and then sign the imagewhich is the last line so cosine signblah blah blah blah blah blah so thiswill take thekeyuh what's going on hold on this is notgood uh ah I know why I need to I needtoauthenticate and that's not good in thesensethat ahsorry uh just bear with me onesecond I am hopefully not going to showyou mypassword because I want not to dothatuh let me go fch that hereand thenhopefully I don't end up exposing mypassword on therecording and then I'm not also supposedto show you this key but that'scompletely fine uh and then I need yeahthat's the issue with this thing is Ineed to authenticatetwice sorry I should I I I when I wasauthenticating I used the wrong uhproject to authenticate so that's why Ihad a permission problemall right so now it should justwork continue not to show anything Ishouldn't be showing all good okay boomand then boom all right cool we shouldbe good to go now so now if I try tosign my image it looks like it's workingso what's going to happen is if I golook at the container registry so thisis the Google container registry this iswhere I put put push that image this isthe sign image if I click on it I seethat there are actually two artifactsinside right right the bottom artifactwhich is 3 minutes ago that's the imageitself right the top artifact is theactual signature right and you can seethat I only have 89 vulnerabilities soit's not toobad so if I go into the top artifact andI go into the Manifest I will see thatthere is a signature do you see the lastline in the Json payload that's asignature that's the unique signature ofthat container image using the publickey coming from KMS right so nowtheoretically if I give this image toanyone and I get them my public key theycan just verify it right so this isexactly what I'm going to do now I'mjust using cosine again uh using thesame exact parameters except that as anin as a as a as a command now I saycosine verify using the same key sameimage I run it and this should say thecosign claim were validated thesignature was verified against thespecific public key so what I did rightnow I just established trust I built animage I signed it and then verified itof course in a real world scenario youwill be developer building and sign inand somebody down the line through thesupply chain will verify that imagewhether it's a person that want to trustor verify that that image is coming fromyou or a cicd system or you knowwhatever supply chain thing you aredoing make sense uh of course if you arecurious because I am a Linux geek uh ifthe signature was verified properly theexit code wil�l be zero right so that'sessentially how you would in your CIscripting say okay I run the verifycommand but how do I make sure thatverification worked well you just lookat the exit code if it didn't work itwill be exit code one make sense goodnow how do I take this further wellluckily those cosign people actuallythere is also another demo I could dobut I'm I'm not going to do it you coulduh generate an sbom uh so there thereare multiple ways you could generate sbombs Docker has a feature to generate sbomb there are a bunch of other tools Iuse this tool called sift syft so youcan generate an s bomb you can attach itand you can also sign the s bomb itselfif you want to take the verificationeven further you can sign the imageattach the s bomb and sign the s bomb soyou have a boat of ways to verify thatthe image was signed by the people whosaid they signed it and the sbob itselfwas also signed by the people who saidthey signed it right I'm not going toshow you that I'm going to skip throughthis so in kubernetes World there isthis concept called admission policiesor policy controllers and so in kues apolicy controller is a piece of softwarethat registers itself with the controlplane and tells the control planewhenever this action have to happenverify with me first and I will tell youif you're allowed to do it or not rightthis is called an admission controllerin kubernetes world so cosign theycreated an admission controller thatbasically allows you to deploy a policyin a cluster and say for for theseimages verify the signature against thiskey and if the signature is not verifiedthen that would not be allowed to deployso I already deployed the policycontroller I already created this policyand you can see in the policy I say Ihave I want to to apply the policy toall the images that's the globe starstar so all the images and then I wantto verify the signature against this keythis is the same exact key I've beenusing right and so now what I can do uhso for cosine policy controllers to workI need to also do an extra step which isI need to tag in namespace so I need toput a label on the namespace which sayspolicy. 6 st.dev include through andthat just allows you to basically in acluster to enforce policies differentlyso you can have one name space Dev whereyou don't enforce policies but you docare but you want your production namespace to have uh policies enforced rightso once I have labeled mynamespace then what I can do is I can goback to my initial example which is whenI deployed the unsigned image yourememberso I will take that signedimage which is here I'm going to copy itand I'm going to go and attempt todeploy it in this namespace so rememberthe namespace is the one where the thepolicy controller is deployed and is setto enforce policies right so if I takethis uh holdon uh no but I want the I need the shaoh yeah here we go so if I take thisunsigned image that I have here and tryto deployit uh that should just failsorry so this should just fail and itwill fail by telling me that nosignature have been matched because thecontainer image is not signed but now ifI want to deploy my actually signedimage then should just work because it'sbeen it's been uh signed and then thepolicy controller will verify thesignature against the key right makesense so it's just very simple way tobasically um build images sign themverify the trust verify that they havebeen signed and then also enforce thatyou don't allow stuff to be deployed inyour cluster if they haven't been signedso this stuff only works with kubernetesuh cosign works with anything Conorimage but there are equivalent toolsacross all type of different variousways deployments for VMS and for otherkind of environmentsand that's it that's all I have for thetalk I think I am at 3 five minutes moreor less so I'm happy to answer anyquestions thank you very[Applause]much yes well one ofyou uh normal uh use of signing and andbody in the science we used to sharewith their parties I mean you're wouldyou do it uh in your own systems as welljust in case if you were managing multicloud or stuff and sharing images fromone side to the other yeah you just Sharthe image in the public key yeah I meanin your own I mean environment I meanyou're not sharing with her part oh yeahyeah yeah of course of course I meanthis is this stuff like this stuff wouldapply to even in your environment evenif you have even if you're not likeworking publicly even privately becauseyou still want to make sure that theimages are coming from your actualdevelopers right someone yes becauselike I mean one very simple example umthat I think of the problem I have withthis example is I don't have any exampleto actually U uh demonstrate that thishappens in real life but it might um ifsomebody steals your developer laptopright if somebody steals a laptop from adeveloper in your company and somehowhave access to their password or maybecut their finger to unlock with theTouch ID on Mac um then they have accessto their internal tools and sinceeverybody these days works with Gitbasically all your intellectual propertycode is copied in all the developershard drives right so so um somebodycould potentially try to hurt you bypushing a bad change that could happen Imean theoretically it could happen Idon't know if it actually happensum yeah so so um so yeah I mean yes yesI would do it even internally yes yes toanswer your question yes yeah and thenif you're working in the public workingacross multiple companies justpublishing the public keys because thepublic keys are public and no one caresjust don't publish the private ones andthen that should just work yeah therewas anotherquestion cool so thanks uh thank you forthe talk I think this was very veryinteresting something that I that Inoticed is that this this is kind ofputting the burden of actually verifyingthis on me me being the developer or theSRE and we kind of know that doesn'twork particularly well like you know I'ma very lazy person I I will probably notdo this so is there any way or or anidea for eventually to be a way for asystem to know is this image supposed tobe verified or to or supposed to besigned let's say say ah and if it issupposed to be signed then I will checkand fail if itisn't H that's a very good questionbut well I think if you are buildingsuch a system which then you will haveto build this right uh one easy way youcould do that is labels right like youlabel the images and then based on thelabels you can verify whether somethingis supposed to be signed as I said ornot that could be one way to do it um Ithink most Cloud providers have somesort of this but fully managed so youdon't even have to care about it toomuch I know we have it on Google Cloud Iknow Azure have some stuff like thiswhere as but but the challenge of courseis that you have to use their productright if you're going to use somethinglike a third party software like Argo CDor Jenkin or something then you willhave to implement all these extra checksyourself unfortunately so the I thinkthat to answer your question in a morebroaderway the problem or the reason why thesethings doesn't exist out of the boxtoday is because everybody'srequirements are different right and uhdepending on your internal security teamrequirements this might or might not bea problem and so then it will be hardfor actually companies to provide like aprescriptive outof thee boox solutionthat does all these things out of thebox right um so that's why but I thinkit's a gap in the market I think there'sprobably space for this kind of stuffespecially with with this law that Italked about coming in Europe I thinkthat a lot of companies will be rushinginto trying to buy any availablesolution to manage s bomb s bomb is ahuge problem for a lot of companies umgenerating and storing s bomb is a hugeproblem for a lot of companies so butyeah I don't know if it do an yourquestion yeah I like the fact that uh itdoesn't exist is uh kind can answer thequestion how to build it is yeah it'sprobably a that will depend on whateverCH achieve yeah yeah thank you noworries otherquestions all right thank you very muchthank you for having me2025-04-15 22:17:46.580253�f you want all right so this iswhat we're going to be covering todaywe're going to betalking about my dislike for passwordswhich we already did we're going todiscuss some Alternatives we're going totalk briefly about webn and then we'regoing to talk aboutpasis uh before we begin I would like toask you if you know what pasis are so Ican get like a sense of theroom cool are you usingpasis awesome that's great I didn't seethat many hands up so hopefully at theend I change your mind but yeah goingback to the password so if you if wedon't use passwords what's thealternative and the first thing thatcomes to mind is justpassword less so we get rid of thepassword and we have a few ways toimplement passwordless you probablyalready used um something like using amagic link so you log into a serviceusing your email and then you get a linkand then you click on that link and youare logged in or for example using aonetime password code sent to your SMSor something like that those are someforms of passwordlessauthentication um but like a formal morefancy definition would be any form ofauthentication that doesn't require theuser to use a password when they l inand apart from those examples I thinkwe're going to be discussing a few otherexamples um there's also authenticatorapps like Google authenticator forexample that could be an example and Ido personally think that passwordless isbetter than using a password and I havea few reasons to believe that the firstone is thatit improves the user experience so as auser you only need to remember oh welllet's rephrase that you need to rememberless things right you need to eitherremember your email or your phone numberand then you can log in you don't haveto remember another password or likecreate a new one from your 10 year oldself password you know that kind ofstuff um it is more secure because thething is with the thing with passwordsis that they can be vulnerable as as Isaid we have this tendency to reusepasswords because we can only memorize acertain amount of passwords and so weoften reuse them or make little changesto themand sorry uh or we make little changesto them and we also tend to share themwith other people and if you you I'msure some of you have done it with somestreaming services I've done it it'sfine but you know uhum because of that they're more SECthey're they're less secure than usingpassword is because you don't haveanything to share in there and finallyusing passwordless reduces the cost ofownership and I wrote in there both intime and emotionally and let me explainwhy so managing passwords is expensivenot only you have to deal with umstoring p passws probably in yourdatabase which I hope you do not do inplain text uh you also have to ensurethat you have password reset mechanismsthat you uh that your database issecure but me personally in the pastevery time I've had to implement any anylike type of uh username and passwordauthentication system I always feel likeam I missing something like is this isthis enough like there's always thatthat like stress of what I'm am Imissing right uh and that's what I meanwithemotionally and I think that if you feltthis stress and you've dealt with theannoyances of passwords um like me youwere probably looking for a solution andyou were wondering okay but do how doespassword list evenwork and to start talking about that wefirst need to talk about authenticationfactors and you probably already knowthem uh with some other names but wehave mainly three authenticator factorswe have knowledge and that is provingyour ENT your identity with somethingthat you know and that can be a passwordor the answer to a securityquestion you can also prove youridentity with something that you haveand that is possession so for examplewith your device and finally we haveproving your identity with your or likewith inherence or something that you aresomething that is inherent to you andthat for example can be any type ofbiometric your fingerprintum your face if you if you have like useface ID andstuff and the thing with uh passwordlessis we want to get rid of the� knowledgeFactor right we don't want to we don'twant the user to use a password to usesomething that they know so we're onlyremaining with the possession factor orthe inherencefactor and the thing is that if we lookat this diagram here that I madethere is a way for you to increment howsure you are the user is who they saythey are depending on how you mixauthenticator authentication factors andauthenticators and I'll explain a b inauthenticators in just a minute so ifyou look at this image for example if wego all the way to the okay we went tothe right I wanted to go to the left butwe're going to go back so if we look atthis image on the right side and that'swhen the right animation will come in uhthat's where we want to be becausethat's when we have the the highestAssurance that's when we're most surethat the user is who they say there arewhy because we will be using somethingcalled PH2 webent which I will explainin a minute and we can also useBiometrics the thing with phot web andwhy they're so secure is because they'rebased on um public cryptography which isharder for ATT attacker to guess in inyour biome in in your using yourbiometric as well whilst if we go to theum left side we're talking aboutsecurity questions and passwords andthat is something that people know soit's very easier for an attacker toguess okay so ideally what we want to doin order to be completely sure that thisuser is who they say they are is we wantto increase the level of assurance thereare a few ways you can do that I broughtthree here that we can discussthe first one is using multifactorauthentication the second one is usingpublic key cryptography and the last oneis uh using authenticators that arefishingresistant so let's talk about multiactorauthentication if we go back to theimage we see that the authenticators onthe left side um they're highlyvulnerable like I said because theythey're based on an authenticator whichis your memory you're basically yourbrain is basically the authenticator sothe is a guest they usually only supportsingle Factor authentication so they'renot very secure we want to keepincreasing that so if we move forward tothe middle part then we're talking aboutokay if we use SMS and voice and emailand software onetime passwords in thiscase we want to use authenticators thatensure that you're using two Factorauthentication that means that you areproving you are who you say you are withtwo different uh methodsand then finally if we go to the lastside which is where we want to be weneed to use here authenticators thatenforce multiactor authentication thatis using two or more factors of umauthentication and we also want to makesure that they have impersonationresistance compromise resistance andauthentication intent and those are veryfancy words to a few things that I'mgoing to explain in just a minute sobasically we want to be on the rightside using authenticators that ensureMFA that's the first step let's say wegot that so the second thing would beokay um we need to use publiccryptographysorry and um you are probably veryfamiliar with public cryptography but ifyou are not you think for example whenyou use kit and you use your SSH casethat's kind of like a form of usingpublic Crypt graphy and it basicallyconsists on uh having a pair of keys soyou have a private key and a public keyand those keys complement each other sothe public key is obviously public foreveryone you can use it for anything theprivate key is only available to you andshould be should be kept completelyprivate so this is a little example umhere mati wants to send a message to gyand so he's encrypting the message usinghis public key and then goody isdecrypting it using the private key andthen when he wants to reply back hesigns the message with the private keyand then mati can verify it with thepublic key that's what we that theycomplement each other because you needto use them both in order to get thatmessage so now we know that what thehell is publicly cryptography and thatwe ideally want to use an authenticatorthat enforces multiactor authenticationand pub�lic cryptography so what aboutfishingresistance um to guarantee fishingresistance there must be proof ofidentity every step of theauthentication process and this is whereweb auen comes into play so ween uh isshort for web authentication API and isa browser API that allows servers toregister and authenticate users usingpublic cryptography in instead of apassword this is a w3c recommendationit's available um if you want to read itout you can you can Google I forgot toput the QR code but that'sfine and yeah so using webn willactually ensure that you're usingpublicly cryptographer that's great butit would also ensure that you're usingan authenticator that is like the oneswe want to use and I'm wondering ifanyone of you is familiar with any ofthese devicesyes I see a few hands these are calledsecurity keys for those of you uh wholike me a year ago had no idea of themand you have like different models andBrands uh one of the most popular onesis this one is the UB Key by ubo but youalso have like Google's Titan andwhatnot and these devices are calledroaming authenticators so they aredevices that you can connect to yourlaptop for example and then theinformation the public keeper is isalways availablethere then we also can have uh platformauthenticators and that would be forexample my Mac uh if I wanted to use itto authenticate somewhere this would bea platform authenticator and that'sbecause the authenticator is integratedinto thedevice and I want us to take a look nowto how the wean architecture kind oflooks like and it's a little bit likethis so you have mainly three componentsthat is the authenticate the client andthe Reliant party the authenticator asI've I think I've said that where like20 times already uh it could be one ofthe devices I showed in the picture itcould be your Mac and as we said it hasto be an authenticator that enables thatenforces multiactor authentication anduses publicy cryptography that is arequirement for this and this device isactually the device in charge ofcreating public key pairs for you toauthenticate later onthen we have the client which in webofan is usually a web browser but Ithink there are also someimplementations on Native uhplatforms and then we have the relyingparty and that would be the server towhich you're trying to authenticate toso let's say for example you want to loginto your Google account uh using webinthe Reliant party would be uh Google'sserver in thiscase and then then there's an extra umstep let's call it like that in this andthat is that there there has to be somesort of user interaction and this userinteraction is called proving intentthrough deliberate action and here we'rekind of like guaranteeing the efficientresistance so every time you want tocreate or you want to authenticate usingweb ofen you need to confirm or a testthat you actually want to do it so inthe case of using a u key every time youwant to use it you have to touch it orif you want to use for example your Maxuh authenticator to log into somewhereyou have to put your your fingerprint inyour touch ID or use put your face inyour phone if you're using an iPhone forexample so in web in each of these stepsthere has to be some sort ofverification so we can guarantee ficientresistanceright uh the communication between allthese parts of the architecture happensuh like so so the client communicateswith the Reliant party using the weanAPI because it's a JavaScript uh browserbased API so they can talk like that andthe client and the authenticatorcommunicate with each other through aprotocol called the client 2authenticatorprotocol these are two standards cab 2and webent and they comprise what isknown as phto 2 the phto 2 standards andphto 2 is a stand standard that wasdeveloped by the phto alliance and thephto alliance is um it's an organizationand their mission is basically to reducethe Reliance of users with passwords soI share a goal with them I don't knowabout you so I want to keep implementingthis stuff if you want to learn moreabout this stuff you can scan that QRcode uptheresorry uh but right now I'm just hap�py ifyou remember that there's somethingcalled f 2 and ween and they take partin the architecture of um of ween ofcourse all right so uh I mentioned webauan is an API and it has a bunch ofmethods but the two most important onesare the ones that allow you to createcredentials and that is you can uh useit you can call it using Navigator docredentials. create and we have the getmethod which is the one that you can useto authenticate yourself so once youcreate a credential you have to get itand verify that all the informationthere is true and that would be how youauthenticate it and you uh use it withNavigatorcredentials. uh normally at this point Ido a whole demo on how web auenauthentication look like but um there isanother talk talking a little bit aboutpys in the front end track uh 3:45tomorrow and I think it makes more senseif you want to go take a look and see itthere and then here we can focus on likeother stuff with pasis uh so if you wantto go feel free to go I have nothing todo with it I just think it's cool thatthere's two people talking about what wehave passwords and yeah feel free to goall right so we're increasing the levelof assurance again we have multifactorauthentication we have publiccryptography we're ensuring fishingfishingresistance and now we also talk aboutwebo thing kind of in a little bit butwe want to comprise it all because Ifeel like we we talked about it in likedifferent phases but we want to compriseit all in one thing and that one thingispasis in aasy is nothing more than aunique cryptographic key pair thatallows you to access online serviceswithout using passwords and it's basedon public on asymmetric public keycryptography um I like the name changebecause you know we were using randomstrings which are words for passwordsand now we're using cryptographicKeepers for pass keys so I think thatwas a a good use of words inthere and um I have a few like benefitsor reasons to use paskis and one of themis that they are intuitive hopefully youcan read that because I don't know if ifthe contrast is good enough uh they areintuitive they create an experience thatis familiar to end users because itfeels like you're unlocking in secretand I'll show you that in just a minutewhen I if my demo doesn't failuh they're secure they're fishingresistant why because they are using allof the other things we were talking uhjust a minute ago they use publiccryptography they enforce multiactorauthentication then another reason isthat they are non reusable there is noway you can use a pass key that wascreated for a website and use it inanother website because they are boundto The Domain they were created and thisis also another reason why they'refishing resistance so usually if youhave been a victim or of a fishingattack you get this email that looks somuch like the real stuff and then youhover uh on an on a link and you're likethat's not the right URL so with paskythat can never happen because if the URLdoesn't match then you just cannot useit and lastly they are convenient theyallow you to access an application onmultipledevices um and you can use biometrictricks or security keys and all of thatso technically speaking and this is whyI was talking about PH2 and we and allof that like if you if you look for theformal definition of aasy aasy is a pho2discoverable credential and it's a phtotwo so let let's split that into in pho2and discoverable credential right it'spH 2 because under the hood is usingwebo and it's using the pho twostandards and it's comprising all thatinformation that we just saw and thenthe discoverable part it'sabout so when you use a pasy the browseris able to detect which pasis werecreated for that particular website andthat is what tells you how discoverableaaski is for example if you were to useum wean in awebsite and try to log in and you neverget like the the credential suggested toyou that is not undiscoverablecredential a pasy would be because itwould always be suggested to you by yourbrowser and we have two types of pasiswe have sync paskis and that are thoseare the pasis that you can um s�ync to acloud-based service and they'reavailable in multiple devices forexample you can sync a pasy to youriCloud or to your Googleaccount and then we have device boundpass and those are pasis that are storedin a single device only and that'susually the device that they werecreated on uh I want to point out thatthe difference or like the main maindifference of the two uh paskis apartfrom being available in multiple devicesand stuff is of um where does theprivate key lives so if you remember apass key is a key pair and a use youknow public key cryptography pair so wehave a private key and a public key inthe sync Pass Key you have to sync theprivate key to the cloud in order for itto be available multipledevices in the device bound example theprivate key never leaves your device soif you're using for example a ubb keyyou are you know you know for a factthat your private key is in there andonly in there it's not sync H sync toanywhere uh so there's a lot of peoplethinking you know then um I want to havemy my private key always with me I don'twant to sync it anywhere and when we getto the challenge bar I'll say why andI'll I'll we'll understandwhy all right uh talked a lot about pysand stuff but let's uh let's see them inaction because I think that's uh moreinteresting than hearing me talkinghopefully um the demo works so let meseeif I can move this I uhuh okay it'srunning so that's good atleast okayso sorry so this is a small applicationthat I wrote using Ruby and railsbecause I like rails no the reasons youcan you can write it in whatevertechnology you want um and then what itdoes is it allows you to create a newpasy read those pasis delete them soit's like a pasis manager app I wouldsay um so I'm going to go ahead and tryto create oneum I'm obviously using not zero becausethey they pay my wage and also becauseit's cool so I'm going to try to createone new account uh let me think Carlaata.com am I typing that rightokay so I'm going to click continue andI'm being prompt or asked it's likeyou're going to create a pass key thisis all the benefits you have are yousure you want to create this passy soI'm going to click on create a pass keyand then I'm presented with a fewoptions so the first thing is I cancreate a passy in my iCloud keychainbecause I'm using a Mac I can use aphone a tablet or a security key or Ican save it in my Chrome profile for thesake of the demo I'm going to create iton my phone so at the same time I canshow you how the multi device experiencefeels like so I'm going to go ahead andclick on use phone tablet or securitykey where's my phone here so let's seeI'm going to scan this QR code if it'sreadable can I read it from hereyeah could also readit okay so it's connecting to my mydevice and what my device is saying isyou need to use face ID to actually usethat passy and this is that part of userinteraction of like intent throughdeliberate action that we need to dolike if I don't do anything for a whilethis will time out and you know it willfail so that's already stopping like anykind of automated attacks so I'm goingto click continue use myface and now I should go back to myapplication at this point I'm I'malready logged in just asking me toauthorize some stuff because I need toaccess myprofile and here I have my profile andif I go to my PAs keys I can see that Ihave created a snc pass key with thisinformation and that kind of stuff Icould also delete it but I'm not goingto do that right now because it willdefeat the purpose of this demo sothat's how creating a pass key lookslike what happens when I later on wantto use it so I can click on the continuewith a key and remember a while ago whenI was talking about discoverablecredentials and how the browser candiscover credentials in this case I'musing Chrome and so Chrome is able todetect the p is created for this domainassociated with me so I have that onewhich is not the one I want to use rightnow so I'm going to click on use adifferent Pass Key and I'm going to haveit on another device so once again I'mgoing to scan this QRcode click on it it's connecting� to mydeviceand if you're a normal person you know Imean you shouldn't have like that manypassys for one single website I have abunch because I've been testing this outuh so I'm going to select the one Icreated which was the the Barcelona oneclick continue use myface and I should be authenticated nowum this is a trickypart not a tricky part but just a aninteresting part is that with the ozeroimplementation um AAL will always askyou if you want to create the pass keyin the device you're using at the momentso my pass key is created in my phoneand it's seeing through my iCloudbecause I have an iPhone but air wantsto know if you want to create it in myMac in this moment I don't have myadcloud here so that's why it's askingme so I can just either create a new oneor continue with the one that I alreadyhave I'm going to continue with the oneI already have and if we go to my passKeys then that's just one single Passkeythere all rightum let me see if I can bring my Tobthere wego all right okayso um I don't remember ah yeah I hadvideos to support me in case I didn'thave Wi-Fi so I'm going to click nextand next because I already showed youthis and I also want to keep it realbecause I come here and I tell you allthe great things about Pas keys andpasswords suck but we are still facingsome challenges so let's go through themthe first one is uh the Opera system andbrowser support so using aaski depends alot on the or like the experience youget as a user depends a lot on theoperative system you are and the browserthat you're using because it's up to thebrowser and to the operative system toimplement the specifications like weband PH2 and all of that and so uh youcan have a fragmented um experience forexample I was using Chrome here but ifyou useSafari or even if you're on Linux youmight have a completely differentexperience in fact I don't think forexample in Ubuntu you don't you don'tget to use S pasis yet and that's anexample of how different The Experiencebetween browsers and operative systemscan be don't ask me about Windowsbecause I don't use Windows so I'm justkidding um there's also the cloud vendorReliance so for S pasis we rely oncompanies like Google and apple andMicrosoft to save the private key intheir Cloud securely so um when I wastalking about sing p keys that's what Iwas referring to like you know somepeople just prefer to have it with themin their pockets than to sync theirprivate keys to these type of uh placeslike Apple and Google and Microsoftso it's it's a trade of it it reallydepends on on your needs and what youfeel most com comfortablewith then we also have some Enterpriseuse cases so some Enterprise users mightwant more control and flexibility overthe things they can do withpasswords uh for example if you uh areon Mac like on your work laptop and yourcompany decidedto disable iCloud uh then you can'treally useing Pas keys so it kind oflike defeats the purpose a littlebit or if they disable Google Chrome Idon't know I'm making this of the GoogleChromeexample then you can only use devicebound pasy so that is another uhchallenge that we're facingand then the last one is for reset andrecovery and I often get a lot ofquestions about this one because uh whathappens if I lose access to my iCloudyou know what do I do with my passwordlike you're depending um with somethingbar like you're giving something veryimportant to something someone elseright so there's not really a defaultrecovery flow uh especially for devicebound pasis and applications still needto implement recovery and reset like onthe on there's not there's not like uh adefined specification on how to do thisyet so it's kind of like a do it how youwant to do it kind ofthing all right um other than that welljust recapping a little bit on theagenda I talked a lot about things aboutpasswords we saw some Alternatives wetalked about webn briefly and finally wetalked about pasis so I think we coveredall that and um where can you use pasisnow so obviously Google Apple Microsoftyou can use it obviously out Zero byOCTA um and then I personally haveenabled them in my GitHub account andthe experience has been nice Idiscovered recently that Uber has pasisas well which isinteresting uh kayak's experienceapparently is very nice with pasis andthen of course um most I think the mostof the password managers like onepassword bit word and and all of themare catching up to give support uh topass one password already has it but theother ones I I think they're stillcatching up uh but there are many otherwebsites really where you can find pasisum and adoption is growing every day sowhere can you learn more about pasis andeverything that I just spoke about so ifyou want to learn more about ween youcan go to this website called web. me uhyou can debug the whole webin processsee how it works what's you know how arethe components interacting with eachother what data are there are theysending to each other uh if you want tolearn more about pasis specifically youcan go learn pasis and um you can run ademo basically doing uh what I did withmy demo but very like explained andexplain you a step by step what is goingon so it's moreclearsorry okay um one last thing before Ifinish this talk is I don't know if younoticed but I'm wearing this electronicbatch with me um this is a Pimon Badger2040 and you can program it usingmicropython and it's really fun to haveit has absolutely nothing to do withpasket but I just think it's fun but uhwe have prepared a little codingchallenge so if anyone wants to likefeels like doing a coding challenge it'sa spring boot application but honestly Idon't think you need to know that muchspring boot to do it it's pretty muchdone and if you manage to complete it Iwill give you one of these uh I will bein the outs here by OAB Booth so feelfree to uh scan that QR code and thenyou can go ahead and try it out I see afew phones out so I'm goingto wait a little bit here while you scanit if you don't get to scan it now youcan also find me in the in the booth orthe stand and I'll be there you can askme about it uh I only have a few thoughso I would say it's like first comefirstserved all right oh and the only thingis that you need your laptop to do it soyou can do it today or tomorrow but Idon't know if I'll have some tomorrowand yeah um this been a ride it's beenreally a ride and I'm very thankful foryou all to be here and at least it'sbeen a ride where we didn't have to dealwith any passwords so thank you so muchno questions right I I'm just if thereare any questions uh I'm happy to takethem if not I can justleave noso you were talking about the about fastfast fast fast fast fast fastfast wow was it that long ago thisside um yeah so the authenticator thatthe the question was for those who uhdidn't listen is uh so how can weguarantee that even though we're in thisside like what happens if my device getsstolen or if I lose my Dev device howcan I guarantee that no one else getsaccess to my account right okay uh sothese devices ensure that you have touse multip Factor authentication andwhat that means is that you're not onlyusing that device but you also have toimplement another Factor apart from thatdevice for example um in My Demo I wasusing pass keys but then I could alwaysdefault to a password not that I want tobut I could always default to a passwordso even if I lose access to that passkey I can still go to my password Iwould recommend setting up some othertype of uh factors in place like forexample you can set you can use a pasywith an authenticator app for example soif you lose access to that pasy you canstill use your authenticator app andthen later on you can make sure uh todelete that pasy so no one else can useit in thefuture you're welcome any otherquestions okay if you can goI won't take it personal but I'll findyou I'm just kidding well if there areany questions or if uh you just want tochat um with me not on this stage whichuh it would be much better for me uh youcan find me at the out here by OCTA Stanor just walking around and I'm happy toanswer all your questions once againthank you very much and enjoy the restof the[Applause]conference for2025-04-15 22:17:47.243585 ��#��=A5woqh5aRlqwIthinkun Spanish I think goingEnglish let's begin we are going to talkabout platformengineering I am moras Gonzalez as youcan maybe heard I am a Spaniard who hasbeen living in France for over 20 yearsso my English accent combines the worsttraits of both Spanish and Frenchaccents sorry about thatpeople I am head of Def rail developerrelations at clever Cloud clever cloudis a cloud provider a French cloudprovider and the main difference withthe generic cloud provider is��|�#��/AtppmEJQ1t-Uhello everyone thank you so much forbeing heretoday um today uh before I start I wantto apologize in advance because I have alittle bit of a cough left overfrom uh being sick and so at some pointI might have to cough to the mic andyou're all going to have to hear it sosorry about that but apart from that Itoday I'm here to just make you think ofa world where you don't really have touse passwordsanymore and you might think that youknow for people in this industry like weare or even people who are in theidentity and access management industrypasswords are like you know easier andeverything is better and stuff and thereality is that it's not and I want totell you about the one of the manyexperiences I've had with passwords so Iwant you to please raise your hand ifyou've ever been to a furniture shop andyou're waiting to pay in this selfcheckoutcashier and you're trying to pay withyour credit card but you're you don'tremember your pin and so you try torecover your pin and then you need toremember your password to get into yourBanks application but then you don'tremember your Banks application'spassword and so you're just makingpeople wa there for like 10minutes awesome I was not expectingexpecting to like that this happened tomore people because it's very specificbut yes this happened to me I was in avery big furniture shop you probablyknow it I was trying to use my creditcard I couldn't I couldn't recover mypassword at the time because of reasonsI couldn't use my face to L in with theapp because you know sometimes it's likeno now we need your password and you'relike what's my password I don't know umand so that happened to me and I wasthere for like 15 minutes struggling topay people were annoyed behind me andyou know that that that's that's reallythe thing about passwords the thingabout password is that they suck they'rethe worst I don't know if I can getanyone to agree with me on that one orif you like password let's change thatdo you likepasswords you're no you taking the no Idon't believe you um and one of the manyproblems with passwords is that theyrely on your memory and yes I know forus in this industry we probably usepassword managers or we store passwordswith um like the Google passwordmanagers and all that kind of stuff oruse one password and stuff but thinkabout other people outside of theindustry like I I often think about mymom because I'm the password managerI've had messages from my mom sayingwhat's my Gmail password and I'm likeit's this one and she's like no it saysI changed it three months ago and I'mlike well you didn't update me so howcan Iknow um but yeah the thing aboutpassword is that they're easy to forforget they rely on your memory and theycan be vulnerable for attackers and theycan be accessed by unauthorizedpeople that was my little rant aboutpassword so let me introduce myself myname is Carla I'm a senior developerAdvocate at ozero byoa I've been workingas a software engineer for about 10years already I make YouTube videossometimes although it's been quite awhile I probably should update thatslide and you can find me as Carlastabile almost every where where it wasavailable but yeah going back to oh Ihad another disclaimer to make is thatthere are some things that I mightsimplify in order to explain them inhere so very happy to talk about thatlater i�� that weare doing a platform as a service we aredoing a cloud by Developers fordevelopers we we will talk about thatlater I am also organizer of severalconferences and communities in the westcoast ofFrance and we are here to talk aboutplatform engineering or maybe better weare here to talk aboutcomplexity we are going to travel in thetime to go back 30 years in the it andwe are going to see that we have beenadding lier over lier over layer ofcomplexity and the developer job hasbecome something completelydifferent I began coding in1994 I began withPascal I had done lot of basic beforebut my real beginning of code was Pascaland then I began to do Java and at thetime when I wanted to develop a newapplication it was rather simple simplenoteasy I had my code editor I wrote myJava classes I added my dependencies myjar and I launch Java C the compiler andI got some class files then Java and theclass name and my app wasrunning that's what like I learned tocode Java several years ago I began toto teach Java to a students and to astudents that wereswitching their life path they had aanother career before and they wanted tobecomedevelopers and I realized that for themto code Java was only the beginning ofthe problem because before being able tobe consider a Java developer Enterprisethey needed to learn those all of otherthings that didn't exist at my time andmost of them were indevelopment I mean if I were if I was astudent now I will be really reallyreallyscared for every problem there are 220Solutions and it is your responsibilityto choose the right onethey are lot of ways to deploy the sameapplication and yeah it's like that yousee it is wonderful you have really onetool or several tools for every needwell for me that's a problem platformengineering try to offer one way tosolve this problem one way to reducecomplexity but to empower developers toobecause historically the way to reducedcomplexity was simple there was somearchitecture or methodology team in thecompany that say you use that that athat so complexity was reduced but alsoall the freedom of thedeveloper it was replaced with processnow we are going to talk about platformengineer and it is an approach that itis still beginning and we hope it canmaybe hold the the promise and reducethe complexity while empoweringdevelopers but before that let me goback in time I see in the room I don'tfeel alone because in other conferencesmost developers are in between 20 and 30years and I am like the old grandpatelling silly stories but here there area lot of people like me Hey You Are MyPeople thanks you are going to rememberthese things let's go back in time weare in the '90s in a time almost forgotwhen internet was young where Windows 95was CET hey how many time you have spendlooking at thisscreen when big companies still useMainframe no that's true yet but evenmore than now let'ssay and in that time deploying thingswas easy easy if you have if you had aserver it was the rign of bare metalservers companies usually had a serverroom for most companies the server roomwas a small room behind the coffee roomyou know the where there was a singlerack and three or four servers and whenyou arrived to the top of capacity youadded another server and but it workedit was very different than now and alsothe the way as we organized work wasdifferent there wasn't any experience ofthe best way to organize it so mostcompanies use it the way they knew theindustrial modelhey it's not maybe the model the mostadapted to the it but it was the onlyone and of course we created silos andhierarchies and responsibility and weseparated the project owners from theDevelopers from the C admins from theOps from everybody else hey it was thetraditional way to dothings so we were trating developmentlike we trating factories after alldevelopers produce things things therethey work as in a factory let's try thesame thing a spoiler our cut developerwasn't happy but it was the only way andanother thing and it will be our latemotive and most of things were stilldoing in a kit bashed way with plain oldbash or things lik�ethat by the end of the 90s things beginto evolve things begin to evolve becausetechnology allow it to evolve so thefirst evolution come with thetooling do you remember when you Cod anapplication by the end of the ' 990s youhave your application in your editorthis one was from the beginningof 20 z102 about all the clips and youwent to the website of your libraries toget the last version of every one ofyourdependencies hey I have seen that is newapach common Sayo let's get it hey butit said it needs another library inversion3.0 I am going to the site of the otherlibrary to get it dependency and allthat for me the first big evolution waswhen the tooling began to change thatway of things first change S control doyou remember how you did Version Controlbefore CVS clear case or things likethat copy copy in a new folder andrename the folder with the dateyeah Source control even if at thebeginning it wasn't really user friendlywho he has has used clear case anybodynobodyCVS yeah do you remember when you triedto do a merch withCVS I still do that I stillremember another big change when youwere able to begin begin to managedependencies using a patch I with AuntorMAV hey you don't need to go to 30different website to get the 30different libraries in the last versionyou updated your configuration file yourXML for I for M and yourdependencies were update in a magic waywithout conflict do you rememberthat what about testingwe began to get things like Hudsonbefore it was renamed as yanin orunit that changed the way as we couldpeople began to think hey we could alsodo unit test or even doing the unit testbefore thecode or themonitoring naos CTI savix before thatmonitoring an app was spending the nightin F in face of a screen with a green orred light and if the light was red therewas a problem in the application youconnect to the server and you tried tounderstand what was happening and nowyou have metrics you have Graphics youhavetraces so the toolevolve so manypossibilities but in a a few years wemade our developers like our cutdeveloper to manage this complexitycertainly your job wasn't only to C butalso to manage all that well but it'sfor a good thing yeah fullyagree and of course we still ose bashbut let's go to the beginning of thisCentury do you remember extremeprogramming it seem one wonderful onpaper Simplicitycommunication feedback respect andcouragewho who could read that and not feelwow and even better two years after theailemanifesto individuals and interationsover process and tools do you can youimagine that we were going to getfree from all those process and tools wewere going to break the to break theworldall to break down silos it waswonderful well it was wonderful until wesomebody decided to codifyagility individuals and process andinteractions over process and toolslet's do a daily scram where everybodymust spend 10 minutes telling what theydid the president day and what they aregoing to say today let's do a springplanning where you decide how many notnot time because time is not good in ahow many effort Point are you going toto put in every issue for the next threeweeks really individuals and interationsover processandtools everybody loves yso yeah on paper it was wonderful and Iknow that you are laughing because youhave lived thisperiod you understand mypain in fact we were back to the oldindustrial model we tried to do wecodifiedeverything it was codified agility butdon't worry10 year after in2012 we haddevops ah devops was going to solve thebig problem of the wall betweendevelopers and operations makingeverybody work together the key word wastogether devs working with Ops Opsworking with devs do you agree I thinkwell most companies didthey decided that devops was a nicekeyword but they want to implement in itin a way that fitted with the culture sothey had lot of antipatterns devs who are part devops butthere is another separated Ops Team DevOps and some devops teamor Dev and Ops fully separated you havelot of other devops to toies or I shouldsay anti topologies in this wonderfulsite are you you are going that �you aregoing to read that remember yourexperience and you're going to cry likeI did yeahwonderful but I especially cry when Iwhen I try to look at the joboffers what the yeah I'm in Spain Ican say what the is a devopsengineer what the is a de se Opsengineer a de ml Ops a de AI data setOps all that I have seen that inLinkedIn that's the definition of acompany or what is a devopsengineer they need to know to knoweverything it is not a job it isfive Dev as theis Ops Dev everything upsfor the companies it's wonderful theypay one person and they have five jobsnice but for the poor developer who nowhas to knoweverything well we have to deal with thecult of agility that says that the realagility is not to move and do your dailyscram or your Sprint review beenplanning and all that and you have toknow five jobs in order to do only yourjob nice our sad developer cut is evensadder and of course if you look atunder the hood it was a still mostlybash but now three years after thingsare going to change this time it is surewe have thecloud cloud is wonderfulyou are going to use the infrastructureat this maximum capability you are goingto be able todeploy in the clouds and Las noeswonderfulwell for companies it has a good effectat the beginning at last you need it tolearn to automate things if you have acloud provider what you are doing isallowing your users tocreate Compuand storage andnetwork on a click without a humanintervention but at the bottom you havereal data centers with real servers withreal switches with realdis so the only way to do that was toautomate and for developers it was a bigchange too before when you had an whenyou wanted a new server you had to do apurchase order and wait to two days 3days two weeks and then you got yourserver now you do click click click andyour server is there it is wonderful noyes because it empowers developer or atleast it was the argument everybody usednow as developer you can on your ownchoose and deploy yourinfrastructure hey deployinginfrastructure is maybe not a developerjob no you are a super developer now youcan deploy yourinfrastructureokay and you have to learn new thingsbecause by default in the cloudeverything is cloudnative what's Cloud native everything isdistributed when I was young and I hasher at in the University one teachertold me distribut Computing is one ofthe most difficult problem in it sciencedon't do it unless you have to do itwell now 25 years after every problem isa distributed computer Pro Computingproblem because distributed is the newbydefault and who did all this automationdevelopers no she admin that learn tocode ina I don't knowwhere because now s admin they have tocode nobody has asked CIS admin if theywant to code most of them when theychoose his admin over development it wasbecause they didn't like to develop theylike to play with servers infrastructurebig big toys not to develop now theyneed to develop and not only that wehave new roles that appear do youremember the de asteris Ops now we havethings likeSr system relability engineers and likethat well in fact what's that CIS adminH code okay fancy name for CIS admin Hcode so our poor developer cutnow needs to to deal with thiscomplexity with the 250 products ofevery cloud provider and now it has theresponsibility to choose and deploy theinfrastructure and to deal withthat and if you look under the hood evenon the cloud provider there are manythings that are still doing in the planeall way but on thecloud okay no problem too muchcomplexity I have another solution weare going to add a new arra layer so weare going to solve the complexityproblem but by adding a new layer ofcomplexity like our friend likecontainers it was nice now you can havein your laptop the same things that isrunning on the cloud inproduction well first thing not truethat that isn't a productionWordPress you have two containers anddatabase in yourlaptop in production you have maybe thesame containers by in several instancewith redundancy with other things butdon't worry containers are the futurenice and as containers are the future wear�e going to take our old trusty app andwe are going to split it in 20 differentapps let's call that with a fancy namemicroservices of fact in fact it is nicefor engineering managers before you haveyour application with 40 developers andthe 40 developers needing to needed tocoordinate by themselves in order to beable to put in production hey I wait amoment I am working in this feature andokay I need to stop this one okayeverybody's okay we can release now youhave your 40 developers working in 20differentmicroservices from a project manager orengineering manager point of view it iswonderful the two developers in thelogin H micros service are going to befully independent of the two developersin the back coffee micros service it isnice but for people who has toadminister that before you had oneapplication to deploy one application tomaintainone application to saiz now you have 20and the 20 arespeaking one with the others every timefrom the saning point of view themicroservicearchitector created lot a lot ofcomplexity so we need get anothersolution what can we do to solve acomplexity problem well you know myanswer no in in it science you have acomplexity problem you solve it byadding another layer ofcomplexity so let's make CIS admin liveseasier we are going to create anorchestrator if you don't knowkubernetes the easy way the easiest wayto understand this is like an intern youknow the intern you get from third orfive or fourth yearof information matics or Telecomengineering hey welcome to the companyhere you have our instances and here youhave the instruction I want you todeploy these 20 microservices the frontend I want you to deploy it in twoinstances and the back in in three whereI don't know you have the infrastructurewhatever you want and I also want you toadd a database my SQL with high spay diswhere hey it is your job you are theintern and please you also sauze thatand you add some airbag framework forthe access and you monitor that if thereis any problem you repair it I I will beat the coffeeroom by and welcome to the companythat'skubernetes well that's the publicitythey do as of kubernetes because inorder to explain all that to our interwe need we need to learn how kuberneteswork what are the kubernetes object howdo you cleare all that and you mustwrite so some hundreds or thousand ofline of lines of jaml who here lovesJam hey you have courage you are my heroyou are the first person when I havegiven this talk who has dared to say heloves jaml reallywhocan I believe you but I cannot love alanguage where every space is importantand you need to write1,000 lines file of configuration butyeah it solves a problem it makes sisadmin life easier by adding yet a newlier ofcomplexity well but if you have lot ofkubernetes and things outside kubernetesare three Cloud providers we still havetoo muchcomplexity okay let's solve that let askanother layer of complexityinfrastructure as scode so now you aregoing to be able to describe with scodewith code in which cloud provider youare deploying with with Servicesniceso now our developer needs to be awareof cont ERSkubernetes po Services deployments anterraform I no an terraform this islight is old now they are both from thesame company so we can mix them maybeterraform and terrible which a good namefor the newtool but you understand and of coursethere is a still lot ofjamlso we are here in24 there are people who want to becomedevelopers and they need to need theyneed to learn all that before being ableto be an Enterprisedeveloper it's a problem of complexitythere are too many options I have donethings fast I don't have the time but Ihav been beginning to talk about the 30different services for the same needsometimes in the same cloudprovider anybody here really understandany of the services catalog of one ofthe three main cloudproviderreallyand everybody tries to say e it is niceyou now have the power to chooseeverything is self service well thepower to choose is reallythe power to have a big cognitiveload you as developer are responsible tochoose the right tooling the rightservice� to deploy things to when thingsaren't working to try to seewhy it is even wonderful if you get onehour a week to do codeno platformengineering tries to solve that but andthat's myown my own comment they try to do it notby adding complexity but I am sure thatif platform engineering becomesmainstream in two or three years I willbe able to do the same talk addingplatform engineering as a new complexitylayerthe only way it could change if ifeverybody here and elsewhere try to sayhey maybe we can do it differently foronce so Iam I am always Optimist I am like acarebear always smiling always tryingeverybody to behappy I think it can be done but to berealistic I am fairly sure that in threeyears I will have platform Eng to mycomplexity liarthing let's try todream the definition was done by Lucagalante one of the first PE person totry to theorize that it is Complicatedby the ideais okay we want two things we want toreduce the complexity for developers butwe want them to be able to keep theirfreedom to use thethe tools they need we don't want to sayto thedeveloper when they say hey I need to dothat that do no you can't we want to sayokay wonderful here you have a way easyto use and if you need other thingslet's work together in order to allowyou to do it and that's the idea of aninternal developerplatform heyyou are old like me you have seen thisidea hundreds oftimes well most of them in proprietaryway some company who said we are goingto create some developer portal or weare going to but usually it was a bigmix of things kitbashed together withsome proprietary things that the peoplewho coded it werewhere K quit the company three years agono we are talking here about trying todo it in the right way creating an inand maintaining an internal developerplatform based on Open Standards thatmake developers life easier byempowering them not cting the optionsbut reducing thecomplexity and in order to create thatwe have three interesting pillars todaythat are mature enough in the first timewe have the G of's philosophy we aregoing to talk about the we have manyways to create this platform and we knowthe things that work and the things thatdon't work and interesting enough itisn't enough to create some toing toingyou need to also adapt theorganization adapt your your teambecause a platform who doesn't life itisn't a platform so you need also to putsome people powerbehind let's go in the detailsgiops the idea issimple well if instead of adding everytime a new tool you try to leverage oneof the tooling that every people in itdevelopers C admin s Ops already knowgit git has become the standard thefactor for Source control for many goodreason and the idea is you treateverything asgood a you treat yourdeployment as good your threat yourinfrastructure as good your threat yourworkflows as Cod and you try to have asingle source of Truth the idea isreally this single source of Truthexample you have a problem inproduction your admin goes to the serverSE there is a problem in oneconfiguration fileit she or he pches it life and theproblem has disappeared that's the oldway to do thingsit it doesn't work very well becausenext week when you are going to redeployyour server because in a cloud world youare redeploying your servers every timenobody has thought about applying thisPats to the new server giops your admingoes see the problem and instead ofapplying the patch it put they put thepatch on git and they apply it to allthe servers and to all the fut serversof your companyif you begin to do it you can get a newway to collaborate between yourdifferent teams everybody is doingthings and things are writeen down in away that everybody can understand youhave visibility andauditability and you improve thesecurity and especially you are able todo things more relable and around asingle piece oftool for me gths change how many thingsare done but it is not a magic one it issimply a firststep the Second Step if you want to havean internal developer developmentplatform is to create your platformteams Imean this book team topologies it isdon't but a guy aPortu�guese person living in Madrid isreally really interesting you have thewebsite I am not going to resume that infive minutes the idea is if you want tocreate a platform to allow yourdevelopers have the right tooling at thehands to be able to do their workwithout cting day Freedom you need tocreate a living platform something thatneed to evolve because development worldis evolving every time something thatneed to be enounced because there arenew needs and things that wereing in theplatform two months ago and intraditional way of doing things you willsay you cannot do that if you want yourplatform to be used you need to say okayright now we are enable to do that butyou can use that that and we are tryingto add that in a proper way to theplatform so you need to have a topologya team topology that support that youhave your product team he calls them astream alite team you have your platformteam who builds the platform and youalso have two other kind of teams thecomplicated Su system team because thereare problems that aren that are specificthat most teams have and that ask for aspecific knowledge and right now in manycompanies every new project try to findthe same solution to that problem forexampleauthentication authentication is aproblem that is for all the applicationof thecompany it the is ask from a specificKnowledge and Skills it is the best ifyou have a team that deals with that sowe have three kind of teams for themoment theplatform the product teams and the teamswho deal with the specificproblems if you want it to work you alsoneed an enabled team an advocacyteam a facilitator team you can callthat as you want but the idea is a teamwho makes people work together discusshow they does things discuss thesolution you have go back to theplatform team to say hey the solutionthis team found it is better than thecurrent solution in the platform we cantry to add it people who are there toput some oil in the machine to be surethat everything isworking so for the moment we have anorganization we have a single source ofTruth we can try to build this platformfirst thing is to remember that yourobjective is to propose a golden pathyou are en forcing people to do thingsin your way but you are proposing a waythat works that it is easy to use andthat allows a good developer experienceyou aren't there to say You must usethat you're are going to say hey look atthat it has 80% of what you need in areally easy way touse and most people are going to take itand use it because they see the interestyou can you aren't there to force youare then tohelp and for that you need to have aclear Mission and role for this platformteam they don't need to be the oldmethodology team that get recycled toand name change you need to create tohave a team was the mission is to createand maintain this platform and it mustbe shown as aproduct Imean you are all developers you don'tcreate an inform an IT product that youdo a bridge for a bridge you do yourplan your build your bridge and then oneday the president arrive he CS theribbon and cars are over the bridge andyour work is over and you go to buildthe nextbridge in it you are creating productsin an iterative way for your internaldeveloper platform you need to have aproduct approach you need to say myplatform won't never be done it will bealive because development war ischanging all the time you need to makeyour platform more capable every momentwith the new trends new tooling neweverything your developersneed even if your companyis really combined by the platformengine approach you are having to have ayou AR going to have an infinite budgetso you need to focus in the commonproblems the problems that most teamhave and the problems when you can getsome quick wins especially at thebeginning hey I want to create aplatform engineering team for my companybut we are going to be only three personthree people how we begin well go to seeall the other teams and what is what arethe worst part of your job well I I matsure every time we begin a new project Imust choose the right storage solutionokay thatwell we need to� decide how we get ourdatabase in our cloud provider there aredifferent plans different product weonly need a SQL database for the projectokay when you begin to see that there isthere are problems that are common tomost of your company you solve thatproblem you get a quick win you go goback to the teams hey you can continuewith your pain point or you can try ourapproach and then you begin to havepeople in the company who root for youwho believe in your project if youcontinue to do that you are going tocreate a platform it won't be completeit won't be full scope but it will beenough to make people use it to gettraction and that's the best way tobegin and to grow yourplatform another thing is you mustremember and your hierarchy mustremember that when you are building andmaintaining a platform what you aredoing is to create glue the glue thatallows the company to continue to gofaster to be performant but you arencreating value directly and that's aproblem because for most company if yourteam isn't creating value isn't sellingto the customers you are a costcenter your company your hierarchy yourteam need to remember that what you aredoing is valuable leg Glo has a value Ilove this slide do you remember maybethe film especially if you are old likeme there was an earthquake the ri wasbroken the time was coming so whatSupermandid he was gluethere it didn't repair it anything theonly work was to be there to be surethat the train couldcontinue but it was really valuable ifnot the train will be down thebridge a platform is the same thing youaren't creating value directly but youare allowing your developers to be whymore performant and reducing the C islow and you are also avoiding lot ofproblems because if every developer hasto create his own solution there are lotof these solutions that are going to benot optimal that they are going to bebad and lot of others good but if yougave if you give them a solutiondirectly from a scratch you know thatthe solution are going to work so youare creating value but being glued thereand you are there to make your developerlife easier knowto you are going to try to create theperfect solution you are going to try tocreate your solution you are going tocreate good solutions for yourdevelopers so you are in there toreinvent the well you are there tocreate and maintain a platform thatsolve their problemof course bash will still be used but ifyou do it well well maybe the Sdeveloper cut won't be sad anymore as Itold before sometimes the only thing wehave is the hope that this time we won'tadd a new layer ofcomplexity I have five minutes to givetwo quickexamples the first one was aboutkubernetes because yeah it is Cassroom how you can begin to create aninternal developer platform inkubernetes my experience what I haveseen and known first thing don't try toput everything inguates there is no real need and it isusually a bad idea your database you canput it in outside providers your legacyapis and application don't try to shornthem in kubernetesbut you can try to operate them fromkubernetes and the key word there isoperators kubernetes operators arewonderful because they allow you toextend kubernetes and make it understandnew things spoiler those things don'tneed to be in kubernetes you can createkubernetes operators to manage externalresources so even if you are't puttingyour databases or your Legacy API inkubernetes you can control them fromkubernetes you can make kubernetes thecenter of your internal developerplatform there are lot of operators whoexist today lot of Open Source but herewhere in a Java like conferenc it isreally easy to create a new operatorespecially in Java with operator SDK butin most language you have an equivalentSDK that allows you to create operatorsin an easy way if you don't have alreadyone forthem but well now we have a lot ofoperators how can we avoid thecomplexity trap we create an operator tooperateoperators there are lot of solution oneI love isktic ktic is an open source frameworkfor kubernetes that allows you toinstall conf configure your differentoperators and offer them as resources toyour developers like that yourdevelopers doesn't need to understandwhat operator you are using to deal withwhich database in which provider no youare going to give them some customresources and that's all they need toknow platform engineering I need a myqlhere you have a customresource okay which provider don't worryyou have a MyQ you are you are adeveloper what you want is a my SQL nota my SQL in thisprovider and that's a good platformengineering approach and of course putthat with a good giops with cicd forexample Argo CD so combining Argo CDwith CICS with the right operator youare going to create a platform that foryour developers that are already usingkubernetesthe only thing they need is to use theyour custom resources for the databasethe external API whatever they want touse and it make thingseasier and you you are dealing you withthe complexity to make your developerslifeeasier but and I am finishing with thatyou don't always need a one kubernetesespecially if your company isn't workingwith kubernetes nowI told before that was my favoriteworkflow I really love thatwell today you can do it in a slightlysimilar way you write your code you putin it and you choose a get to Prosolution we call that platform as aservice before but it isn't simplyplatform as a service it is allowing thedeveloper to Center themsel in their jobto develop V and putting the cloud atthe level it should be we take the codeand we put it inproduction and all the details a dealall thecomplexity is managed by the cloudprovider there are some of them aoku wasthe pioner of that several years agothere are others hey it is the lastminute my favorite one is my company butas I said there are others let meexplain by example I want I choose whatkind of application I want todeploy I give it the git address of theapplication and I Define what parametersof scalability I want a small instance abig instance a small instance butallowing it to grow to three big one andI code an a pushon code the other workis the cloud provider part we take thecode Cod we build it in the same waythat you if it is Java you are going tohave your MAV file or your GA file weuse it to build we deploy and we keep itin operationalcondition so for me if today you are enusing kubernetes look at a g to ProSolution hours or another but have alook wor it better of course but have alook at any of them because you don'treally uh need oblig in an obligatoryway to go to kubernetestoday let's finish with some links myfavorite one is a link to the past butyou have some others there and you haveeverything we have seen I am going toput the slide on Twitter of course aboutDeVos engering DeVos topologies platformenging dream topologies critics aen andof course CL cloudand that's all for now thank you veryverymuch I don't know if we have time forone question or two so as nobody saysanything I think we can try to snakesneak a small questions some questionsor commentanything yes pleasefor me they are devops in the real senseof the word devops there are developersand Ops you need to have some teams thathave the two skill as I say I don'treally believe in the super developerwho is a super Ops I will try and when Itry to do it I try to build teams whohas the both competencies some peoplemore deaf some people more Ops andpeople who are able to understand howthe others are working the real meaningof devops forme any other question before I close my[Music]computer no one well if you have otherquestions I am at the clever Cloud boothin the main room so thank you again veryvery much and see you later there[Applause]2025-04-15 22:17:47.999367�and we work for little Plusmobile application maybe you have Tredto pay with them and if it has work weare them if it yeah if it didn't workour folks didn't do their workproperly let's just startno okay okay as you can see my list ofhour is not ours so today I'm aassistant if you have any question youcan contact directly with with him sostart the first thing that we I want totalk about is what is a deployment orwhat does a deployment means probablyevery know in this room I know what is adeployment if not if anyone is notfamiliar with the concept a deploymentis the process of making an applicationavailable for the use putting anapplication into into a server for reingrequest and user can interact with thisapplication so it involves the processof transferringfiles from one development environmentto productionenvironment so if we move to the nextslide I I go back I don't work for youyou don't pay my salaryokay that's okayokay uh I come back 20 25 years ago whenI started to work I started as 23 yearsold so I remember that I start I startedin a small company and we have uh someservers I had a server for Linuxapplications I had another for Windowsapplications another another fordatabases so when I needed to deploy anew feature into our pageI need to copy the file from my workstation to the server using this kind ofcommand FTP probably the most common themost common that we use so I I don'tknow probably there is uh some uh smallcompany that still continues using thismodel but obviously in a big company ina big product it's impossible to followthis this way so this way presented toproblems the first one is a manualprocess and the second one there is nota backups or maybe some external devicesbut in any case you have to copy thefile from your workstation to theexternaldeviceso so so some years after appears theconcept of cor repositories correpositories were were invented in orderto enhancecollaboration and manage CESeffectively so code repository solve uhseveral common problems the first onesisokay the first one is collaborationbecause uh enable teams in order to workin parallel and merch the change withoutany conflict the second one is uh aboutVersion Control because you have ahistory of every change that you havemade the first isbackups C repositories enabl a centralstorage and the last one isdocumentation my college charena is nothere no my colleague Chara always saythat the documentation is in the repoand not in the Confluencepages but if you look thispicture there is one problem the processis still continues manualso is themoment of introducing a new concept theNew Concept is Pine by so by is a seriesof automatic state for deployingapplicationautomatically and now you can see thispicture this is how westarted a little five years ago this isthe real world we still have someproducts that are deployed using thismodelthis way you are not doing in thecorrect way they w't War to work withus so wehave as rep boards as cor repository andtwo kind of py build Pines and releasepiland build piland build the code testit and produce artifacts these artifactsare taken by the release Pyon andrelease Pilon put this code intodifferent uh Environ ments in our uh Iweb servicesso and this way it worked very fine forus because weuse asur I web services a platform formanaging and deploying application inasur this is the first uh Pyon that wehave created five years ago we startedwith this kind of pilent using boxesfrom the as as devops portal so I itworked very fine when the uh productsand things were small but when productsand teams startedgrowing this kind of peline presented alead twoproblems any change that you makethrough the portal is recorded and theboxes are very rigid are less flexiblethan JL fileso the first thing that we did isgo fromboxes to p as code using Jamal filesJamal files solve those problems you canuse coderepository and the Jamal file supportmore features than the than theboxes so in this moment it was the firstscenario we have to we have to tohere I haven't still talking about ourourinfrastructure during theseyears our m�odel of infrastructure J lookat me was click Ops model click Opsclick Ops model that's impossible I knowI know that you are click Ops is atotally new term click Ops nice wecreated all the resources manuallythrough the portal So In This Momentas you know is the worst thing that youcan do create create resources manuallyso westarted to migrate our infrastructure atinfrastructure as code using ter forfilesokay and we decide deploy the code andeach related infrastructure at the sametime in the same peline and it work veryfine it worked very fine for us but wewe started to we started to see a newproblems pamps to took a a long time toto complete so we had five four five uhdifferent environments and it eachenvironment depended on the previous oneso imagine if you wanted to deploy a newsetting not code only onesetting you have to wait a long time toomuch the P lines Wenaile with is the time what is the timemore than 10 minutes in some cases wheredo we come from where are we going towhich is time what is the concept no I'mjust kidding again sorry I'm a bittroll so at the same time we began tohave a more and more and more requestmore and more traffic in ourproducts we started growing a number ofusers so uh our I web services fellshort uh we experimented problem withTCP connections and it was very expensexpensive scale the web appsso in this moment in this scenario onthe onehand low uh slow PS and in the otherhand at theology that didn't didn't fitour needs we started to think a newsolutionand the solution was oh surprisekubernetes oh surprise I want to saysomething about pass the torque to mycollege the first thing we are usingkubernetes now in a big part of ourproducts tools like Argo promethus K butone important thing that I want to sayis uh kubernetes now it's it's a higheverybody talk about kubernetes and itseems that everybody needs a kubernetescluster for running each application butit's not true it's not true in ourcase it solve a problem for us slowpeles andtechnology that fell short for us youhave to uh you have to you need a aknowledge about kubernetes because it'sreally complicated to operate it and youyou need a be a deep knowledge to besuccess wewanted we wanted to move to a GTOmodel but we didn't want to introduce alot of new features and new technologieskubernetes Argo CD Hardware at the sametime in ourteamsso this is my my I I pass yourso as you can see we started to deployinto our kubernetes in the same way isthe same way that you saw in theprevious slide two kind of peline thebuild Pine that create an image put thisimage into a containerregistry as container registry harddoesn't matter and a release peline thatpull this image and run this image intodifferent environments and differentkuescluster this is the first deploy that wedid in ourcluster some monthlater we passto is it my turn it's your turn nicehopefully they arebored I promise that now the thepresentation woo goes up no I'm justkidding a bit the point is that when westarted with kubernetes we doing in ababy steps we follow baby steps becauseit's true that kubernetes is quitedisruptive disruptive technology withina big organization because if you workon an startup for instance you only needone or two different folks who knowabout the technology but when you wantto introduce a a huge disruptive changewithin a Corporation because imagine wehave more than 1,000 million users onlyon loyalty program so the changes can'tbe done like oh let's try this fancytool we need to plan it and design anadoption planfor achieving the succeed becauseotherwise it will be a problem aproject problem and a pain on our net sofirst of all we startedkeeping the things as they are it iswhat it is do you want to have asurepipelines we will keep the pipelines foryou but instead of using FTP becauseprobably you thought oh peoplefrom spars they are doing quite clickbait where is the FTP the app servicesare deployed by using FTP and I promisethat the Aur publicity ends right nowit's just because we use Aur now it'sfor kubernetes so no more ads they don'tpay usso just following the �the point is weplan to keep the current behavior or thethe the the behavior that they had thedevelopers just grabing them using thetools that they know for instance we ohbefore starting with gitops we as asAlonso has said we introduce pipelinesfor achieving the the process butdespite the pipeline worked and theyworked quite well it wasn't enoughbecause there were other Challenge andother questions like but if we need todeploy another environment from scratchwhat happens if we need to use anotherregion in the cloud provider if we wantto go through gcp or aw there are therewere some questions in our mind that wetried to solve and gitops was thesolution for us gitops not click Opsplease put away click Ops from your mindand if someone say click Ops signal themPoint them and say oh that's guy iscrazy so using giops brings somebenefits for us for us because probablyif I ask here who know what's Argo whatis Argo or Flags or Flagger maybe somepeople knows know about one of themmaybe all of them but is there is isthere anyone here who doesn't know whatisg they are potentially candidates yetnice First Step congratulations you havepassed the first the first test no butthe point is that g is the universalpoint of Truth nowadays and it's true ifyou don't know how it works probably youare not in a quite or a good company sowhy don't get the the power of G in ourdayBasics gitops bring that to us justhaving the repository a distributedrepository free for us because we arealready using it and we need to startlooking for Solutions because okayhaving the code on GitHub on gitlabwherever it's quite easy but how can wesynchronize one code with thecluster and Argo came to usit's true that there were otheroptions Argo doesn't pay us either so noI'm just kidding again H the point isthat currently there are two big playerson the open source because for us insidea comp a corporation having some of oror a bit of Trust on the projects thatwe include is mandatory so we relay oncncf as the sour of true or as a truthfor us a confident provider for us so atthat moment there were just two projectflux and Argo CD and in our in ourspecific case Argo CD matched better butcoming back at this point in thisspecific moment you can replace the ArgoCD logo with the flax logo and theconcept is exactly the same we have adeveloper committing their changes intothe code and that tool in the middlewill synchronize the code with theClusters so we can remove the pipelineswhich is an advantage and you canimagine okay but pipelines work well wethought it's so until a developer bymistake destroy a cluster and we need totrigger almost 60 or 80 pipelines whichis not prepared for almost any CI systembecause usually your company could havefive in our case we could have 20concurrent uh pipelines if you suddenlyschedule 60 100pipelines the recovery plan will not bethe best so in this case this approachbring us the capability for havingmulticluster multi- region for us thethe the release process is just animplementation detail because we rely onthe best backup system that we havewhich isG okayand how it works in a more high levelbecause maybe Thom nono just opening a bit the window forchecking the whole picture because justtwo icons is quite easy right it worksER one developer this blackdeveloper quite the standard icon justin their daily float create PR blah blahblah please review this how to go soon Cplease oh amazing I will have it make itbetter nice and the code is merg so thensome CI process choose your choose yourpreferent choice AR aror workflowsJenkinsTravis a deop something someone has saida deop whatever you will generate theimage and once the the artifact isalready therethe things will happen just calling tokubernetes API for making the things formaking the thingshappen and how it looks because maybefrom the high level don't worry I have ademo at the end it's not it's not I'mnot just selling quite a smoke a bit asmoke P I have a demo not it's just apicture of from Maro CD page so sosorry but one of the one of the thingsthat we had to take into account forchoosing o�ne tool or another tool wasthe developer experience becauseprobably everyone here or majority ofthe of the people here are close to SrWorld maybe C admin someone could be themore close to developer environments butusually humans we as humans maybe not weas as s but we as humans usually prefera userfriendlyinterface or surprise and and that atthat moment fla didn't have a reallygood user interface and probably if ourcase would have been just for us for thesres instead of for extending the toolto the whole companyflax have been or a really good choicebut we need to include in the loop thedevelopment teams and honestly I lovethe colors it's better having colors andmoving icons than jaml jaml jaml nicethe typical Sr experience jaml jaml andeven morejaml okay so another we have this quiterep repeated never mind the point isthat another difference that Argo hadand in comparation with fla is that Argocould run in an in a centralized placethat can be managed by us we manage thethe as SRE team we manage the Aro CDinstallation and we are responsible ofhaving it run up and running but wedon't need to install anything withinthe teams clusters because just ingiving a a few more context in ourscenario we have multiple clusters andit's true that we could use flags butfla at that moment this this Choice wastaken two years ago more or less so thethe projects has grown a lot but Aroalso bring us the the option for havinga single control plane for multipleclusters with multiple projects managingby arbach which is nice because onceagain when I work for anstartup everything was well you youdon't have a we didn't have H securitydoesn't matter go to life we need to putthe product on the feat on productionbut it doesn't fit quite well when weare talking about a big Corb so thosekind of features like our back acentralized control plane uh movingicons H child colors are things thatimproveing the processbut despite I love the colors I am stillanSr the the my blood is red and I lovejaml that's the true so at the end ofthe day Argo still usesjaml there are a lot of different jamalsa lot of different configurations thatyou can that you can use but basicallyyou have one important and another oneimportant for us we I'm just keepingaside the project they are back blahblah blah a lot of boring stuff aboutsecurity but just from the from thedevelopment point of view how can wemake that mat because if you if youremember the picture wait I have thepointer in this picture we are saying Iwant this repository deployed into thiscluster Argo make your magic how doesArgo make the magic just configuring itwe are we are going to say Argo this ismy destination I want to deploy intothis cluster within this name spacequite easy the second part the secondpart what I I I have just defined thetarget which is the or the the origin Ican def Define The Source saying go tothis public repository take the rootpath and the head revision and that'sall things happens this is more thanenough another boring stuff like okayplease create the name space for me iftheir name space doesn't exist but justwith this single file Ihave connect the pieces I have one repoand I have one cluster and Aro willstart the monitoring of the repositoryand automatically synchronize or suggestsynchronization it can be in in Jolomode or more secure depending on thestage we didn't start with automaticsync obviouslybut it's more than enough with this Argowill make the hard work so imagine 20file 20 jaml files like this one insteadof 20 pipelines Define how using morejaml jaml jamljaml and do you want to say something nothe only difference between applicationand application set if you want todeploy into uh many cluster you you haveto use set because uh the application isapplication set is a custom uhresource CD sorry and is uh for managetemplating and the template is the theapplication this is the way that we areusing in in our cluster oneapplication only one application fordeploying into many cluster yeah becauseobviously we needed to go step by stepimagine yourself put your yourself in adeveloper suit and think about I am justdeploying with a bass script andsuddenly someone has introduced jamlconcept multi multi clustering conceptprobably some about of Helm Docker a lotof tools so we needed to go a step bystep and this clear this concept isquite clear but if you have what happensif you have two clusters I'm not talkingabout 20 clusters just two do we need toD to duplicate the application step theapplication jaml no the the the toolbring us another option that is thetemplating templating is known also asapplication set it's just a templatethat we can configure set some kind ofgenerators in our case a clustergenerators Sly the picture is quitesmall but imagine that there are aamazing jaml definition for definingclusters or or something so so we canjust Define a template and Argo willautomatically populate the field andgenerate application for us so we don'tneed to be aware or to manage file byfile application by application we cancreate automatautomati automatized process on top ofit so another time the same picture no Idelete I delete this I delete this uhthis is light anothertime I have to I have to talk about withyour I delet okay now okay okay okayokay this is the this is the final stepthat we are introducing our teams ourdon't try to fix it probably this annualreview for you will be terrible I'm notsure but I can imagine it okay okay okaywe have first of all when Alonso startwe were deploying using ft f f fprotocolfrom some kind of automat pipeline intoaure app Services now we are usingkubernetes Docker deployed using gitopsvia H Argo City so we are in the middleof the road nice the change has beenincredible our development teams aresuper happy the Rel this process thelife cycle of the new features has beenimproved a lot terribly so one only onething now uh our teams are capable to uhset up a cluster and running all theirapplication in 10 10minutes this is the type that the onepent uh took in the past so it's areally Improvement yeah yes giving somesome insight aboutthis about this stage we improve thedisaster recovery from 68 hours inaverage into 30 minutes obviouslydepending on the cloud providers if gcpdon't doesn't bring you in the computepower you are not a magician you need torun the things on top of some VMS orsome machines but assuming that thecloud provider the cloud providersbecause there are PL a lot of them doingcitt stuff if the cloud providers bringus the infrastructure we can deploy anew environment from scratch in lessthan 30 seconds which is nice imaginethat a single hour of out of out gaug inour core product can mean more than 10million euros per hour so 30 minutesinstead of eight hours it's a huge IDEsaving not reflecting on my salary Ihave to say not yet but it's a reallygood Improvement so just changing themindset from pipelines to gitops plusAro CD we have improved the disasterrecovery plan and also the scalabilitybecause if one zone is affected by anykind of disruption we can spin upanother cluster in another region inanother 30 minutes so we can manage thedisruption incredibly better thanoriginally but no deploying is not theall the work because kubernetes is anice tool how do we go about time whattime is it uh 10 minutes 10 minutesonly let's try to speed up a bit theprocess okay kubernetes is awesomeincredible is it has changed incrediblythe sector but kubernetes has some gapsor some limitations one of them is therolling Pro the the roll outprocess in kuber ntic you can say okayif I have I want to update the image doa step by step in a percentage or youonly leave once remove all the previousinstances and deploy all the newinstances at the same time it's therecreate approach probably you didn'tknow about it and move away from yourmind because it's quite R tip but it'snot enough because obviously we canDefine some kind of prop we can say nono but we can say let's let's check theready the the components the thirdparties during the startup nice but ifthe problem is not with the databaseconnection but with the database datahow can you handle it it's not doableyou are fued it's the sadly truth butthere are progressive roll out toolslike Argo roll out or the flag versionthat is Flagger because at the end ofthe day we I'm talking about our ownexperience but I want to help or I wantto share the know if you are using Flagstake a look to Flagger is another nicetool but with Argo rollout we have changed the roll out processfrom just a rolling update based on thepercentage to a real smart roll outprocess where we can deploy design thenor Define okay maybe this this warlo isbetter to having a blue green do youknow what is a blue greendeployment wow the second take the nottake the not okay okay okay andCanary take notes take notes we have tohire but the point is that kubernetesdoesn't bring it out of the box but arorout or flagdo it and that's a really a really goodImprovement because we have improved ourdisaster recovery but we need a way toimprove our own disaster Generationgeneration process and we introduce amechanism for develop for analyzing theprocess before going to life uh yeah youlet's go basically not investing a lotof time because I think that I'm whiteI'm going quite quite slowly H bluegreen I have the blue instance or thegreen instance I I I spin up the otherone I execute a set of a suite of test amanual test whatever and when somethingwhen everything iswrong I click the button I tap thebutton I change the DNS and go to theother version and with the canary h ijust move the loadpercent step by step with a percentageuh steps for checking and measuring howit works this is the both approach thatArgo CD or Argo roll out H support butit's not the only one that you can usebecause one thing and the the way thatwe use is when you enable the the canaryprocess you can set okay I want to havetwo instances before receiving thetraffic and I can Define different kindofanalysis before starting the the canaryprocess so you can just spin up the newversion using a blue green approachexecute your swe test to ensure thequality of the code to ensure that wedoesn't M that don't include more errorsthat we the that the amount we canhandle and in case of we have done notcontinuing I mean b basically stoppingthe release process if something goes Hwrong do during that step the roll outprocess is cancelled automatically andno traffic no users has been affectedand in that case we can just start theprogressive rollout last last Slide the last yes nice umdo you want to finish you no only wantsay that uh you need to install Aro rollout in your cluster instead of Argo thatyou can install Argo in the centralclusterbut Aro roll out need to run in in everyin each clust in every cluster so yeahtrying to summarize a bit the concept westarted using CLI CLI and pretty prettypretty Bas scripts I love them no we useBas script then we swap to prior topipelines then we we include or wechange to kubernetes we include Argo CDfor deployant we include Argo roll outfor doing the last mile H check forensuring the quality just before goinglive using the same data and the sameconfiguration that the productioninstances have andnow we are going to show you trying toto not leak in any kind of informationbut before going to the step justfor having the knowledge this is how itlooks it's another h from a storepicture but don'tworrywor becareful noproblem I leaked enough informationduring cucon so you have to to rememberthat this recorded the session so yeahcareful in cuonto no I'm not sharingnicethanks H oh my God it's terrible yeahyeahyeah oh notbetter is a mechanis to not leak in dataleaking it too small can you see thescreen is a big enoughokay basically when we want to use Argoroll out H process we have to use the anew crd a new custom resource definitionbecause they replace the concept ofdeployment with their own crd andmajority of the crd is almost the samethat a deploymentbecause you got to do what you got to dothe deployment process is quite easy thedeployment aspect is quite easy so theyfollow it but they include a few changesfor instance the steps which are whichare the steps that I want to executefirst of all in this case I want toexecute two instances please spin up twoinstances with the newversion then run an analysis thatbasically is a sap of test that we havethat is an acceptant test that we haverunning on k6 but you can execute Newmanyou can call to a third party you can dowhatever you want because basically thisruns a kubernetes job so if you can codeit you can use it it's simple and thenand only at that moment and only if thattest has passed we start the roll outprocess we start putting in productiontruly and actually the code becauseuntil thisstep the traffic is is root only to theprevious version so we are safe we areusing the same database the same configthe same cluster the same networkingevery everything is the same that itwill be inproduction without production traffic soif this step fails we are fine no userswill not will notice the the differenceand then we just start our Pro a canaryprocess moving the traffic from theprevious instance to the from theprevious version to the new one let mestart itagain Rtray is you at the environment no it'snot prodution environment no no you onlyleave once YOLO you only leaveonce let me try to refresh the skrein noit work okay so as expected two newinstances has beenspin up spoon up and okay you have toimagine this shape in Orange becauseit's making the or is signaling thecurrent state but it's just followingthe script that I have defined and it'sexecuting it so probably at this stepthe processfail I'm not sure but I hope so so if itfails we will see that automatically itwill be roll back so no users will beaffected we have introduced or we havetried to introduceum not the best code saying it polite Ihave to be friend ofdevelopers we have to or we can we haveprotected our customers and our productfrom mistakes not from Mal intendedactions obviously and when I make jokesabout development or C code we arehumans we can introdu introduce mistakeindeed I was almost near to Fu theproduction environment on an internaldemo because I commit the gr code to thegr place but in this case it has P ithas passed nice let's see the process ithas passed so it has moved the trafficand we are there if I go oh no it hasfailed nice better as expected I can goto the logs because Irant a kubernetes SP a kubernetes joband I can see the logsand there were a plenty of errors inthis demo in this example but it's asexpected I have seen or I can see theerrors and the process has been revertedand how can I know it because after afew seconds the last H version has beenrolled back I still have some ver someER pots with let me see the life withtwo hours since the creation date so theroll back has been reverted the UI isnot the best because it's the a betaversion that I am using for severalreasons but oh oh my God you can see mesay me J are you stupid I can see the AGplease no no worry basically the theport was created two hours ago and thisis the running the Running P so theusers haven't been affected and it's theimportant way it's the important pointin this in this demo we have changedfrom manuallyuploading binary codes using FTP into afull automated process that bring us asecurity network a security net forprotecting ourcustomers and that's all yesthat's do you have anyquestion yeah for sure goahead database the database is thesame it depends it depends because ifyou use H if you use some database forexample with schemeprobably you will have a aa a job for modifying but as you areusing kubernetes and you need to makethe the the databas the scheme changesER step by step you have to be able torevert them for not have for notbreaking inkubernetes the the code will be revertedI mean nor use or using Argo CD or notusing Argo using Argo roll outs or notusing in Aro roll out any scheme changethat you perform on the database can hasto be revertible or or you can be or youshould be abletoyeah no because we use schem lessdatabases it's an advantage that wehave any otherquestion so thanks for joining us if youhave if you don't have any otherquestion thanks for joining us andremember that we are looking for a newCollege[Applause]2025-04-15 22:17:48.631908 � ��#��AABMpGWL4sGvYsome Giggles that's good okay people areawake that's great um I'm Heather thernice to meet all of you I'm fromNashville Tennessee but I live herecurrently uh today we're going to gothrough an analogy of baking and devopscan everyone hear me okay for those whoare listenting perfect uh so I'mcurrently a developer advocate for ChefI've been doing this for about n monthsbefore that I was at startups I'veworked in Enterprises and startups as afull-time um as a full stack softwareengineer and before that I was a baker Iwas a professional Baker I baked Breadsand pastries I've run won awards for mybiscuits I've been inmagazines um I've worked in fine diningI worked at farmers markets and so thistalk was kind of something that I neverreally intended to do I never reallythought I would be talking about bakingat a Tech conference but the first day Iwas working at Chef was for it was achef com and they had this evening eventwith Ignite talks where if you'refamiliar maybe not it's a f minute talkwhere the slide Auto increments ev��V�#��cA79J5K6dxSlIuh I don't[Laughter]know I don't know let's you start it'sokay so we will do inEnglish so thank you thank you forcoming so there are many so many greattalks that happening now and thank youfor coming so a little insight aboutthis talk we are going to explain howhow was the process for going uh from uhis pipelines and asure appp web servicesto uh uh tools like kubernetes and anAro CD of course we are discussing abouthow was uh the problem that we hadduring this process and how uh our teamswork with this kind of Technologybecause you have to think that they usedto work with umasure ABP web services and as Pelin forfor many years for many time so but letme introducefirst so my name is Alonso I work aexpert ER in the little digital we doamazing and funny thing so one importantthing is that we are looking for uh godevelopers I'm Don developer if you areinterested working with us you cancontact with me wait wait wait waitwait IAL you are doing the thingstotally in the wrong way first of all weneed to grab them showing how we workand then we try to catch them but don'tdon't don't throw the hook before beforework any case if you you want you cancontact directly with me or my colleag nn we are we are kidding we are justkidding I'm Jorge probably someone herealready knows about about me I'mMicrosoft MVP I'm cncf Ambassador I'malso gam maintainer I don't know why orI never use this kind of publicity frommy project and I work as principal Sr onthe little digital half we are based onBarcelona �ery 15seconds and you have 20 slides and so Ithought why not talk about baking anddevops so that's kind of how this talkstarted and and so I'm really excited totalk about baking and devops it's kindof an overview more of a beginnerfriendly sort of look at devops it'smore of a way to think about devops Idon't know if you're going to learnanything crazy specific to your owninfrastructure setup but maybe it willgive you a curious way of thinking aboutthings um yeah okay well if you w tothis is all I'm going to say about Chefwhich is where I work now maybe you'verecognized the old logo which is thetiny one in the top it was acquired fouryears ago by progress which is asoftware company Chef is open sourcedevops tool for Automation andconfigurationmanagement um it uses cookbooks tofacilitate infrastructure andconfiguration as code and it easilyintegrates with a lot of differentplatforms it's used by meta IBM CapitalOne Carfax Target Panera how many of youhave heard of Chef or used Chef in thepast fourawesome so uh I like to know where talksare going so I figured I would give youa menu Dela menu of the day first we'regoing to kind of look at culture andAutomation and then we'll see thedifferences of out of the box versus Umixing from scratch devopsinfrastructure and then we'll kind of gointo an analogywe'll look at disaster recovery andplanning and then a few recipes forsuccess and if there's enough time we'lllook at a few state of devopsreports any questions before I getstarted comments are we breathing I usedto be a yoga teacher so I'm always likeheavily concerned about people and howwell they'rebreathing but I'm not going to make youbreathe um with me today but I do hopethat you're doing thatso before we get into it there um is alot going on with devops and it's funnybecause it's supposed to be this thingthat breaks down the silos betweendevelopment and operations but it cankind of be overwhelming especiallylooking at the landcape so the term wascoined around 20072 2008 and it was kindof a response to like waterfallmanagement for software and it alignsmore with like agilemethodologies because in the ' 80s likeit was always just kind of a onepersonshow writing the code building deployingmaintaining testing security it couldall be done by one person but ascomplexity grew and roles evolved andchanged so did the tooling and teams sothere's a ton of different tools um thatyou can use for going through yourdevops lifecycle so before we get too deep into theanalogy or some of the other stuff Iwanted to emphasize two very importantaspects uh philosophy of culture andautomation so baking is largelydependenton the mother which is a culture acolony of yeast and bacteria workingtogether to digest grains and turn itinto these tasty profiles that we havegives us flavor and for devops I knowthat we talk a lot about tooling but itis heavily dependent on culture it's aphilosophy of collaboration and andcommunity and so devops doesn't reallywork if you're not working together andcommunicating with yourteammatesuh yeah and secondly automation we mixby hand to learn the craft both withbaking and probably devops too and thenas we scale it's just not reasonable todo everything by hand so we automate weuse machines we're not rolling out doughby hand every day unless you're tryingto ruin your hands you use machines tolaminate your D to mix your cookiebatters so keep these aspects in mindwill continue to touch on themthroughout thetalk and I spent a lot of time makingthis little graphic I don't know if wehave box Mi mixes here in Spain can youbuy these here this is I feel like avery American shelf maybe you recognizeBetty Crocker and all the differentboxes if you been to America I think wehave these here I'm not sure Ipersonally don't use them because I'msuch a skilled Baker but um I pickedsome of my favorite devops tools howmany of you have used one of those toolsbefore okay great I I would hope sobecause I I tried to pick some of themore common pieces um I'm glad you guysare still all responding to me becausesometimes it's it's really awkward whenno one like they're just like staring orhalf listening and then no one raistheir hands so um great so mostcompanies they have some tooling it'srare that you find a company that'sdoing everything by like in-house byscratch I think maybe Netflix does a lotof it themselves but I know that I'msure that they also use external toolingso often um you need something to helpalong the way if you're going toscale even yeah even for Bakers who mixby hand even though I pride myself indoing everything by scratch it's justreally not reasonable although I wouldlove to it's just not reasonable to milkmy cows turn it into butter uh growgrain fields and Mill that into Wheatsum own chickens and create eggs so wereally have to depend on the communityto provide these elements workingtogether to create ourproduct are you noticing something aboutthe icons uphere yeah is there okayso I okay I know my slides are a littleword dense they're a little wordy and Iapologize for that but here is just anoverview of looking at some of thefactors of out of the boox toolingversus made from scratch and I thinkthere's a lot of things to consider whenlooking at it um there's a limitationstoo like out of the box it's reallygreat if you're short on time but it cancreate like vendor lockin there's a lotof cost consider ation to consider uhintegration challenges it can lackcustomization andflexibility and then uh from scratchlimitations although it's really youlearn a lot by building things fromscratch you really do but then itrequires a lot of development and timeresources a lot of Maintenance andupkeep there's a lot of initial costsand Innovation and keeping up withtechnology Trends is kind of difficultsometimes so we're gonna kind of look atthis uh analogy between devops andbaking I kind of simplified it into fourdifferent parts with devops divided intoplanning building deploying andmonitoring and then baking divided aslike prepping mixing baking andfeedback so the first part uh planningand prepping in Tech we ask a lot ofquestions questions I would hope and theanswer to most of those questions isthat it depends uh what are you buildingwhat resources do you have what does itlook like who is it for and it'simportant to have all these um questionsanswered in excruciating detail beforewe get going just to set ourselves upfor Success do we have enough resourcesare we wasting resources are we buildingfast enough before the butter meltsthese are really important questionsbecause timing is everything and asBakers we are hyper aware of the passingof time a few seconds can make a reallybig difference everything has to alignlike the seasonality of apples the uhpyo has to be chilled yet relaxed youneed a golden image of a perfectly bakedpie in order to replicate it a milliontimes a day so um in order to be veryefficient it's really necessary to beplanning and prepping everything alongon the way but as we know we go throughthe cycles and we iterate and we findthings to approveupon the next stage building and mixingby the time we're here hopefully we knowall the different pieces of the pie thatwe're going to put together and we haveenvisioned this final product that isperfect and ready to build to mix um butyou know things things can go wrong andsometimes we find that the machines andthe ingredients are not behaving exactlyhow we want to so that really puts a lotof importance on testing taste testtaste testing your product along the wayand also testing your application andbuild as you go through themotions and then assuming that you haveeverything mixed and built ready to goit's time for the deployment and bakingstage uh this one it's it's reallyimportant to have trustworthy tools youreally need to know your oven and thetemperature control the hot spots thetimers working um as you would expectall the resources were allocated and thetests are complete we have to reallyhave a lot of faith in our tooling theenvironments have to be ready configuredum tests are green no red errors noflames from your burnt bread even thoughthose make for good stories it's notideal so uh yeah we have to trust thosetools so that we're not opening the doorhalfway through the process and lettingout that heat and ruining your perfectlyelevated loaf ofbread monitoring and feedback this kindof closes out the cycle of devops andbaking as a loop we have to haveaccurate monitors logs snapshots intowhat's happening in case we need tointervene and stop a bigger disasterfrom happening um we need to have likespend time with our final product andunderstand understand how people engagewith what we've created reflect upon theinsights and the first impressions theinterpretations of what you're makinginteractions all this in order to umcontinuously influence the futureiterations and I thought this was areally cool uh graphic of devops toolinghow many of you have seen thisbefore okay awesome I'm glad it's like athing that we all know about but as waslike a a science nerd a stem child um II was obsessed with the periodic tableso this is a really cool graphic to kindof like see what is out there and withbaking and devops it's both a scienceand an art you have to abide by theroles of chemistry but also you kind ofneed that Innovation and creativity tocreate new flavorprofiles so another way of looking atdevops and baking to together is kind oflike a cake like as a whole it's onething but when you start looking at thetiers each base has an important umelement so the Bas is for devops it'scollaboration and communication theshared responsibility across all theroles the support that provides uh thatis provided to all the other layers thisis kind of like the methodologies agilecicd and then the second layer it addsstructure and height it's a frameworkfor rapid and reliable deliveryautomation tooling develop uh deploymentpipelines config man managementinfrastructure ascode and then the next layer it's maybeyou know leveraging cloud and hybridservices for flexibilitycontainerization to build um scalableand flexible applications use the cloudfor on demand resources and they help todeploy consistently acrossenvironments and then the fourth is aprot protective layer where Ser I meanSEC security is integrated across everyphase of devopsideally um automating security testingvulnerability scanning compliance checksthat can sometimes come later in theprocess and then the top tier here is amonitoring logging visibility into theapplication performance operationalmetricsum feedback loops and this allows you toquickly identify issues respond toincidents optimizeworkflows and then the decoration thatis I guess what makes your team specialum The Daily cadences theinteractions the genes qua of your teamand Company uh elements that reallyFoster transparency and trust a sharedunderstanding andAlignment so you can see like thelayered complexity and interdependencyeach one builds upon another to form awhole which is a robust and securesystem made ofcake uh before we move on is there anyquestions comments thoughts feedbackhopes wishesdreamsokay is everyone stillbreathing okay me too uh disasterrecovery so Disaster Recovery planningwho has had like a hugeoutage one person I I feel like there'sprobablymore security so what can causedisasters uh natural disasters of courseflooding fires tornadoestsunamis um any will of God or theuniverse Cosmos outages security risksand we see these days that there's a lotof really bad actors at play a lot of umhackers and malicious intent coming intoour systems especially now with the riseof AI and automation so we really I feellike sometimes it's hard to keep up withall the Bad actors out there so it'simportant to you know have your DisasterRecovery plan in place and there's kindof you know five five steps here thatI've outlined but of course it's it's alot more than this it's it's reallyhaving a plan in place to analyze likewhat systems are most at risk whichsystem systems need um backups andsnapshots maybe you have servers that umneed backup regions or you're streamingdata constantly if it's really sensitivethese These Are all uh differentapproaches and I think it really dependson what you're working with and how youknow how much you can afford to be downduring an outage or how much time ittakes to recovery if you can afford thatand you know who you're using are ifthey're going to be really upset ifAmazon goes down for half a second Ifeel like a lot of people would be upsetbut if you're just runting a website umfor your portfolio maybe it's fine ifit's down for a month and you're notlooking for a job so understanding whatthe company needs and what kind of riskyou can take on is super important whencreating this recoveryplan and then With Disaster Recoveryhandling I think that as developers andpeople in Tech we all kind of braceourselves for the inevitable failuresand outages as a baker I've definitelyburned bread I've tried to bakechocolate cakes that ended up turninginto brownies or um repurposing cookiesinto you know truffles so having thisway of taking a disaster and turning itinto another um opportunity to learn andgrow I is very valuable and a healthyaspect of cultureso when disaster happens reallyidentifying and notifying and creatingthat assessment is important and youknow as you go through your plan whichyou've well you know is well documentedand uh tested ideally you're testingevery year if not every quarter thenit's I think it's the most importantpart is doing um you know a blamelesspostmortem knowing what you can learnfrom your incidents and and taking thatto see how you can create a more robustsystem because at the end of the day youknow feeling blamed for a mistake is notgreat and that is a reflection of thesystem filling you and not your ownerror so here are some recipes forsuccess I just um picked some things andit looks like I need to adjust the fontsright therebut uh as to reiterate the concept ofhaving a really healthy culture insystem is um I think the foundation andthen being able to empower developersand encourage experimentation and havethat space to do so through Automationand having all the tedious tasks therepetition handled by um your scriptsyour machines your tooling it's it'sreally can create like longevity withinyour productokay so I'm like zipping through thisthat's great okay so for the um I wantedto include two really nice pieces ofliterature here this is um some summaryof the 2024 state of devops report thatwas published by puppet humanite CDfoundation in women in devops who herehas like seen the door report and paysattention Okay five people that's allawesome um so so in this report bypuppets they uh pointed out the driversof success are efficiency speed andsecurity and their goal with this reportwhen serving 40,000 people is to betterunderstand the people processes andtrends that shape devops every year theyconducted the survey and thought leadersum analyzed the data Incorporated Trendsand these are some of the key findingsfor um puppet they really focused on theevolution of platform engineering andthis is another talk that I'm givinglater in the year about devops versusplatformengineering spoiler alert uh platformengineering fits in nicely with devopsit's kind of like um two parts of awhole so having elements of platformengineering within your devopsphilosophy is can really speed up andcreate tighter feedbackloops um platform engineering can act asa barrier against the chaos of toolstasks and information by creating theselittle packets of reusableinfrastructure environments umprovisioning security pieces that thedevelopers can take and run with so thatthey're not creating more uh silos andbottlenecks which was the goal of devopsbut we find that with complexityincreasing we need more solutions to theproblems that devops is also creating soplatform engineering kind of createsthat solution to the additional silosthat are built by having devops um yeahso the rise of developer uh developersis supported by platform engineeringteam with the self-servicing tools theygive developers flexibility to work andthen the full potential of devops isunlocked with automation it streamlineskey processes eliminates repetitionaccelerates delivery and it happenssimultaneously with the expandingplatformfunctions and security is very it's abuilt into platform engineering andeveryone benefits when it is it empowerspeople to take responsibility from fromthe very beginning to ensurecomplianceokay and this was another report that Ilooked at the state of devops report for2023 um this one surveyed over 36 ,000people and it was presented by Dora andGoogle Cloud it explored the outcomes oforganizational performance teamperformance and employee well-being itlooked at software delivery performanceand operational performance and its keyfindings wereinteresting first one establishing ahealthy culture they found that teamswith generative cultures have 30% higherorganizationalperformance um built in and built withusers in mind the teams that focused onusers have 40% higher organizationalperformance and unlocking softwaredevelopment delivery performance withfaster code reviews um this was actuallysomething that I experienced when I wasadeveloper maybe five years ago thereally quick cycle of code reviews and Ithink that was when I first became veryinterested in devops without realizingwhat devops was to have the abilitywithin my small team it was a hugeEnterprise company but to be able to uhjust like Ping my teammate and jump on acall to review my code and get it mergedinto production like one hour later wasamazing to have these like bite-sizedlittle code changes I don't know aboutyou but like when I see a PR with like5,000 lines of code changes like doesthat make me want to do it and look atit and review it kind of noI don't I don't know I feel like that'suh the smaller in bite-sized you can umtake these tasks down the changes themore efficient your cover viws becomeand the less of a you know a waterfallmentality you have about pushing toproduction um see unlocking softwaredeveloping delivery performance withfaster Cod reviews okay amplifying thecapabilities with quality documentationthey found that trunk based developmentwhere like everyone is working on codein a single branch is estimated to havelike 13 times more impact on uhperformance when the quality ofdocumentation is higher than versuslower which I think maybe is intuitivebut I thinkalso as much as I want documentation tobe a first class citizen in softwaredevelopment it's often an afterthoughtand I wish it was easier to change thatI I'm still trying to think of ways tolikemake that change I I'm sure there'sorganizations out there who umprioritize documentation I was lookingat Material UI documentation and theyhave like a really clean and um upto-date and useful set of docs um I knowChef we're still rewriting ourdocumentation so I'm I'm like activelyalways thinking of ways to help even ourcompany to like write betterdocumentation umokay let's see here else uhyeah increasing infrastructureflexibility with Cloud uh using a publicCloud leads to 22% increase ininfrastructure flexibility which leadsto 30% higher organizational performanceversus theinflexible um balancing deliveryspeed and um user Focus I thinkthat that's a hard challengechallenge balancing balancing issomething that we practice and wepractice it because we need to be betterat it right so okay the last one I thinkis is very important and this one wasmaybe the one that I could resonate withthe most it was uh Distributing work fafairly and they found thatunderrepresented groups have higherrates of burnout and a lot of systematicand environmental factors cause of thisthe people who happen to take on morerepetitive work are likely to experiencethat burnout and underrepresented groupsare likely to take on that repetitivework so um these groups basically anyonewho's like not aman are more likely to be burnt out andtake on these tasks and I think it'sactually up to the majority to help toalleviate that alleviate that pain anddecrease that burnout I know that I'veexperienced a lot of thatmyself so that kind of um reiteratesthat need for culture collaboration andcommunication and examining what kind ofwork each person isdoing and that is all I have uh you knowdevops isdelicious thank you[Applause]Mercy but2025-04-15 22:17:49.328310 which would bereally expensive or active passive so inthe event that there was a problem inone cloud provider you can just redirecteverything to a second secondary cloudprovider uh or this kind of stuff umanother uh another research is forworkload optimization some clouds areknown for certain features like uhpeople think of machine learning theymight think of Google Cloud being thestrong stronger platform or maybe youknow other people might think isAWS um you could have like some of youryou can have the workloads that needhigher compute power based on whicheverclass provider at that moment you'vedetermined to have the best cost ratioto Performance and the power potentialright okay another reason is somecompanies are have their workload uhhave the entire company built around aparticular company for example you havea lot of Fortune 500 companies that areusing Microsoft for their identitymanagement and their workloads for umyou know they use the prodso it's natural that they would uhcontinue in Azure but they may also havesome of their other work coming togetherin a different cloudprovider uh in addition another reasoncan be geographic distribution so notall Cloud providers have multipleavailability zones in a given countryand having two clouds can uh having aworkload on two clouds can help you withissues like latency um and Storage orthere could be some kind of regulationsregarding uh which should bring us togovernance there might be some kind ofregulations that you need your data tobe stored in a particular country and ifyou're using one cloud provider if itdoesn't have a presence in that countryyou might have to have a workload in anadditional country um there can also beissues with like the service levelagreements will vary over the cloudservice providers and maybe for onecountry you need it to be higher than itis available in another or you know theyhave different compliance certificateswhen we think of the Cloud we tend tothink of like gcp azer and AWS butthey're actually multiple private cloudsthat don't always have the same kind ofcompliance certificates that we've cometo expect of the bigthree okayso any questions or comments orthoughts okay so I thought we would goover a couple of common uh multicloudpatterns that we can see um the firstone is just uh arbitrary you knowthere's no plan there's workloads inmultiple clouds I've seen this incompanies that where one one companymight be I was working at this startupthrough when I was at TH as no sorry atthorworks I was working for a companythat was an AWS shop and it waspurchased by another company a largercompany that was more into gcp so theythis is a scenario where there was justarbitrary it was just arbitrary therewas no particular uh Rhyme or Reasonthere was no strategy they had nogovernance over their multiple clouds umand the big issue for them when theywere trying to merge these kind ofworkloads was the traffic costs so ifyou have just an arbitrary thing youknow workloads here workloads thereyou're going to have a lot of trafficcosts as you're moving your data fromone provider so if you have an A if youhave like a website in one and you'restoring a data in another uh that canlead to a really highcost another one is parallel parallel islike you have a single application thatis split over multiple clouds uh it'sit's simp generally to ensure Highavailability um it's the same app overmultiple clouds um one of the problemswith uh parallel structure is reallydifficult to use it can be difficult touse the manage service of a given cloudprovider because they are spe you know alot of them are well integrated intothat system uh and so it might be harderto actually use like imagine you'reusing um S3 and in AWS and then you'retrying to uh split a workload that'scoming from azer to have data storedinto that S3 is not a managed service sothat's a bad examplesorryum and finally you have the issue ofstrategic right the Strategic is when acompany decides to make uh up toactually plan to engage with multipleclouds and so they will be caring foryou know they may decide to avoi d vendorlockin they're going to have a multi-cloud strategy um so they'll take thestrength of a given cloud serviceprovider uh and and whatever it'swhatever workloads would work for thatcloud provider uh they will you can alsodo this as a plan of Disaster Recoverythey will you can create in this kind ofcompany like one of the last companiesthat I was one of the last uh clientsthat I was working at they had amulticloud strategy and they had agovernance team that would cover uhwhich work what type of workloads wouldgo into which uh which cloud provider uhagain one of some of the drawbacks ofthat is U you can have high trafficcosts uh let's say and also cloudschange Cloud providers change so whatmight be really good might be a betterservice today and one of the cloudservice provider in a six months or yearthey can have the same the same um thesame service the same offerings the sameavailability we see this every year witha summit uh ad Summit they're bringingout new products that are matchingwhat's in or ingcp um another challenge of this one isthe issue of data consistency because ifyou are storing your data in one of thecloud service providers uh and andstoring the same data in another one Mmaintaining data consistency it might beokay if you have eventual consistencythat is you're replicating the data intoanother in the secondary place but ifyou're in like if if if you're in ahighly regulated industry like bankingyou kind of need you want yourtransactions to be identical I Ishouldn't be able if if I hit an inpointin one cloud service provider andwithdraw half my money I shouldn't beable to you know by luck go into anotherone and be able to withdraw the sameamount of moneyright uh with strategic you can usethings such as um the service catalogs Iknow that they have equivalent one in inAzure and uh in in AWS and these allowyou to have more governance so you canbuild security into uh whatever productsthat people are building so basicallyyou create like templates that peoplewould have to use that can have the Uthe security requirements built intowhatever service they'reusing um so I talk I mentioned brieflysome of the issues some of thechallenges of multic cloudum uh one of those is networking becausenetworking is one of the biggestchallenges even if you're in one Cloudbut when you have to uh cross betweenmultiple clouds uh that interintercommunication can be more of achallenge right and it could besomething as simple like the secur thesub the security group rules and AWS isyou can allow things they only U allowhave allow rules whereas if you go intoAzure you can have allow and deny so youcan be getting a timeout error and youdon't even know what's the source of itor even which of the cloud providers iscausing this so that's a challenge um asecond one is security the more cloudsyou lose the more you're going to haveto uh manage the security in theapplications and and this stuff is wellintegrated within each cloud provideryou know you have your um the differenttemplates for and different permissionsfor each of the cloud providers but onceyou add two or more Cloud providers it'sgoing to make it even more complex uhand it's it's difficult to interop tomanage these things together uhanother issue is costuh it's the cloud can be costly andwe've been told for since the beginningof uh the time beginning of time whenwhen we started entering the cloud uhthat it's a way to move your cost fromon Prim like you don't have to provisiona million servers because you can justscale up and down automatically as youneed to in the cloud right but it canthese kind of costs can sneak up on youwith multicloud because you have tomanage a lot more costs in a lot ofthings and the biggest one of thebiggest challenges is data transfercosts right as you're moving from one toanother if you're moving across regionsand understanding these policies thecost optimization policies in the givencloud is really difficult and at asecond Cloud it gets even more of achallenge let me drink some water myvoice is soundsorry and then finally I want to talk alittle bit about the skill set Gapbecause it's not enough to just havesomeone who's an expert in AWS and thenyou would need someone to be an expertin multiple clouds and then thereforethey would also need to be how do thesethings interact so that is a a majorchallenge that we can't ignore whenwe're thinking about uh goingmulticloud so so we've know why peoplego into why we could have multicloudand we talked a little bit about some ofthe challenges of multicloud but I'mgoing to pause here for a second to giveyou time to think about anyone hereworking in the multicloud environmentyeah one two okay which clouds are youworkingwith okay and there was someone else saydo you work together you know okaythat's rightokay and do you find it challenginghow kubernetes can help thank you I didnot pay him to be here to support thatokay okay so we're going to go over someof the issues right abstraction this isthe was that a question I thought no ohno sorry okay okay that it's fine sorryso kubernetes access acts as anabstraction layer between theapplication and the and uh theunderlying infrastructure this is theadvantage of right um You can deploy amanager application and you're man andyou're handling this with a bunch ofyaml files right the deployments theyaml the replications the scaling up allof this can happen basically um in codeor yo files um regardless of the cloudprovider the yamu file is going to beessentially the same you need to masteryamu and you're going to be using thatif you're with AWS azer or even some ofthe other Cloud providedright uh kubernetes is an open sourceplatform it is vendor neutral so if I'mgoing to build it on a VM or if I'mgoing to build it with one of themanaged service it isn't going to reallymatter I can even it can be on PR notthat we would recommend I'm not I haveno opinion of that but it's vendorneutral um open source and it'sflexible um you can use the manageservices like eks or gke or uh AKs todeploy kubernetes and when if you use amanag service they're going to handlethe control plane so that takes of someof the uh so it worked for thedevelopersteam uh it's portable at the platformlevel like uh you're working incontainers and containers are portableright so it's I don't want I wrote hereeasy to move and I'm not saying it'seasy to move but because you're havingall of your dependencies in one thingyou can um and you're managing withkubernetes uh it's easier than if youwere using an alternative so someone Imet someone today they were talking thatthey build microv VM so maybe I'm wrongabout that we'll check um and it also ifyou're using kubernetes you do not haveuh you not you don't have vender lock inbecause you can be deploying to any ofthe clouds uh and if you just changeyour mind later um you canmove another reason um the biggest one Ithink is the growing multicloud EchoSystem uh from things that autom maketest or so you can focus on theapplication logic if you look aroundaround the cloud native the cncf there'sso many tools that have been that havebeen growing in the last few years thatare designed to work with kubernetes andthey can be moved you can use them inany of the different cloudsum there's like tools that you canmanage and even ones that I'm going togo over a few in few minutes the onesthat can help you manage costs ones thatcan help with monitoring and all ofthese things uh in addition to ResourceManagement with kubernetes so it helpswith standardized cost managementbecause there are tools like op cost orcube cost that will help you with liveuh cost analysis in real time or nearreal time management of your cost to seewhat you're spending so then if yourcosts are going above then you can umwork on trying to identify where what'sthe major uh the major cost uh SC andand we know that it's scalablekubernetes is scalable you scale up andscale down which just a change in anumber of rep because and you can alsouh have that buil into your your cicdpipeline so looking at a couple of thetools um you have the Rancher kubernetesengine which allows it's multioud youcan have this to manage across yourclusters in different uh in differentsimultaneously in different uh platformit's uh a light well can be lightweightwithK3 um this K Casten K10 I haven'tactually used this but it's one of thetools that was recommended to me by acolleague in terms of life cyclemanagement and data management as wellasSecurity in addition for networking andservice management have project Calico Idon't know if you have time but youshould go downstairs they they are hereand um you can talk to them about thisto help with networking or ISO if youwant to create some kind of service meshthat can help with finding if you'reusing aMicrosoft microservice architecture aservice mesh will help you withDiscovery uh and trafficmanagement and then we'll talk a littlebit about gups for there was a talk onon Argo CD I don't know if you had achance to see it earlier today uh so youhave flux which you can use to managedeployments as well as Argo CD um asanother tool that you can use uh bothare both both of these came out of if Ibelieve they are C cncf graduate Pro soopen-source tools that can help with tokeep to work with kubernetes uh tosupport configuration management anddeploymentso let's say we wanted to go to so rightnow we're just talking in the theoryright so but the idea is that you couldhave a multicloud uh kubernetesdeployment of any type of applicationthat um that needs to be highlyavailable and resilient um but goinginto prod we need going to have tofollow some other things which aren'tthat different from General look how dowe create uh how do we createapplications that are reliable umresilient and highly available um andcost efficient because we have to thinkabout cost as developers even thoughthat's not something thatwe it's it isn't something we talk aboutthat much um so fups right if you'regoing to go you're going to need we'regoing to need to talk about what are thepricing models not just of each Cloudbut the cloud combined and this thetotal cost of operations right so it'snot just about like this service mightbe cheaper um today but it's also likewe need to track it to see if it's cheapin this moment when we have a contractwith this particular vendor uh would italso be cheap if we looked at the to ifit would also be cost effective to uhtravel of to have multi clouds and soconsistently monitoring the cost to knowwhat the cost what what we're spendingas well as if there's a way to getadditional uh contract and negotiatingwith a different uh cloud serviceprovider um monitoring and observabilitya holistic view of the entire projectthat you're working on not just within acoud provider but across the differentCloud providers this could be one of thebiggest challenges because again ifyou're communicating with differentCloud providers they're native tooltools for that provider might not be asefficient and there may be some delay soyou need to have U uh Tools in in placefor example Pras grafana that you cansee what's going on and that you cantrace if you're you know if you wantmaking an API call to a database you cantrace it to see where it if you know ifthere's an error if you're getting atimeout error to know to be able toidentify the point at which is failingso you can resolve it in that moment andwherever itis um to know what's going on uh inaddition it would be good to have uh inyour in your teams a run book forexample because it's it's one thing toknow if there's an alarm going off andwe've all had this experience an alarmof going off if we don't know how toidentify how to dig into the problem andthen resolve it then we just knowsomething's broken so it's important forteams to create run books so they can uhso each anyone who is on call or anyonewho's in that moment and having an alarmgo off should be able to identify theproblem and work towards uh fixing itsecurity Now security uh is one of thebiggest challenges would is uh one ofthe big challenges would a multicloudenvironment because like I said when youare in one Cloud it has a a set of bestpractices and they might have their ownidentity and access management but therewas a really design in gen eral to workwithin well within their services thatthey have in their given cloud and thenit can get a uh increased complexity ifyou're adding an additional cloudprovider which might have slightlydifferent uh differentuh structure and monitors for because II don't really know if like guard Dutycan be used with um with something orwhat what's the equivalent in any of thedifferent Cloud providers so you need tobe aware of that and mindful of it andtry to develop something to help youidentify any uh suspicious API calls orany kind of things that are happeningwithin theapplication uh Disaster Recovery this islike a two-parter right so it's not justhaving Disaster Recovery as an I canfail over into a different Cloud butit's also like what if that cloud failsit's like within the cloud or betweenthe cloud to having you know toidentifying what is you know what areyour your number like how much you knowthe RPO RTO you you still have to managethat even if you do have you know twoclouds you have you have the idea thatif this Cloud fails I will go to thisone because the time it takes is notjust to you know how much time do youneed to bring up the database again areyou going to have active active and thatthat could be cost prohibitive are wegoing to active passive but still youhave to look at the recovery time Pointwhat's what's acceptable for yourorganization uh network connectivity nowI am I will tell you this to menetworking between even within a singlecloud is really difficult is reallychallenging and if we're going acrossclouds I was trying to create this thisproject to demonstrate for you and Ijust kept getting time out because inpart my expertise in networking justisn't there yet and so I would say thatthis is something that if team is doingthis multicloud thing you need to makesure that you have this particular skillset really deeply embedded in theteam and and decide like the questionsthat you have to ask yourself is it safeto communicate across the publicinternet like was the quality does thedata need to move with um within somekind of network like you know a VP a VPCtunnel between the two the two differentclouds um to make sure that it doesn'tactually uh um cross the public internetand ensure that you know you have theproper encryptions and everything likethat inplace uh management complexity so thisisn't a technical aspect of it this is au it's a business aspect of it right soyou need the proper governance thatyou're going to havethe how do you manage the complexity ofbuilding into clouds it isn't simplythatum the that now working in one projectin one cloud the cognitive load tends tobe really high and if you add additionalan additional Cloud it's going to beevenhigher um one of the one of the ways youcan resolve this is make sure you set upa proper governance that you have rulesof what cloud what goes into what cloudand who's responsible for what and andhave people behind it um I'm not sayingthat we need to introduce another levelof of management but maybe each projectshould have a project manager and thatyou can help to um have a consistency ofwhat is the plan if you're going to bemultioud and building your team becauseif you are going to have a multicloudstrategy if you are part of anorganization that decides that you wantto be multicloud you need to ensure thatyou have a team that has the skill setsand the capabilities to do this and andthat means identifying people who aregood in like have deep knowledge ofthese kind of things and deep interestand and um and and provide continuoustraining so that everyone can learn andgrow and be able efficient in the workthey'redoing so to summarize um designing yourmulticloud would include understandingthe workloads which which should go inwhich Cloud uh what is the flow of dataacross the throughout the life cycle ofit uh do you have a stateful or astateless uh web applic uh applicationso whether or not you need to managestate in your clusters um and having auniform criteria across the across theuhorganization uh so I mentioned beforebuilding out your team identifying howdo you um Implement I am um I would saythat I think probably everyone here usesinfrastructures code so you know findingsomething that can automatically buildout your infrastructure as well asdeploy and being aware of finops andthe the costs um and ensuring that youhave monit monitoring andobservability and this includes runbooks and training so anyone on the teamthat something breaks they'll be able toidentify and problem shoot andtroubleshoot so best practices arerecommended practices right disasterrecovery and failure over strategy nomatter if you one cloud two clouds or uhsome variation of that uh automationautomating your deployment and scalinguh monitoring and logging across youryour clouds and invest in your peopleand practice practice practice like gamedays to practice for security incidentsuh security Champions that can advocatefor the best practices and threatmodeling and everything of thatside and thankyou are there any questions I know Italked really fast and I don't even knowof what I said sorryare you just waiting to go out for thenetworkingservice so I think knowledge she askedme what was my biggest challenge inmultic cloud is like what I was workingon before because I wanted to produce ademo it's just the lack of knowledgebecause I could I was fine with I have astrong knowledge AWS and kubernetes butin the moment I was trying to deploysomething across the clouds with azer Ijust didn't have the knowledge so I wasgetting a lot of timeout errors and Iwasn't able to figure out what the wherethat was where that wasfalling any morequestions and the takeaway is do you domultic Cloud well yes if you have thebillions of Euros that you need toinvest in it but if not if you don'tneed to no but if you do do it withkubernetes and how do you implementmultic Cloud capability on theapplication Level for example you havean application that needs like an snsqor something like that something thisMenor locket how do you manage theabstraction in between to because youcan have your platform and newed multiCloud but application also needs to bemultic Cloud if you use any managesolution for example well yeah I thinkit's a big bigger challenge if you'reusing a manage solution right becausethen you have the particulars in thegiven Cloud I mean you probably havemore experience with this what would youdo how would youhandle I thinkwell well we we saw this morning uh talkabout Dapper that is something thatprovides this kind of section you havelike a side card proxy that it's inwithin your application and you are likemeeting a standard request like to apoop to a database and this implementsthe section so you have a a topic inAmazon and a topic in Google cloud withits own implementation and for exampledaer is one solution that so you need totalk Local Host with daer with astandard API and daer will translate tothe current provider you need that's anexample of how to it that's good thanksfor the tip everyone to hear thatawesomeC thannothingelse so you mention uh kubernetesportability so if I cannot use akubernetes what do I have to do uh liketo I don't know my platform so that itcan be just as portable uh so you meanyou don't want to use kubernetes tomanage your containers would you let'ssay like I work at I know the containerlevel or I deploy to a VT machine whatdo you have to add to have like that uhconsistent uh recipe across Cloud hmwell I would in which way do you meanabout consistency like you mean likeshouldyou have the same capabilities likenetworking or like I don't know yeah fornetworking I would use one of the um Imean you could do like celium or somekind of open- source tool to handle tomanage your networking if I'munderstanding your question which ispossible that I'm not because I think Iused all my brain cells before you willuse like a you will need to use likethird party products or something yeahsomething like that probably yeah cuz Isaid networking is my weakest point I'mtrying to build the skills in that butum it's likemagic morequestions okay I think we can wbp it upthank you thank you verymuch e2025-04-15 22:17:49.933075 VV�� #��IAMgKNa---WQgI wasn't expecting so many people so I'ma bit nervous so I'm going to just talkfor a few seconds so that my voice stopsshaking um I PL prepared a slide to tellyou a little bit about me so that I canwork out my nerves with you um so myname is Lori King I am a sof developerand I can't move I do this um so Istarted out my professional career as ajournalist because that's what I wantedto do to change the world and then Irealized that there wasn't really muchchange in the world in journalism so Iswitched my career to uh sociology andstarted a doctorate program in sociologyum left that and got a job as aresearcher at a a nonprofit that helpedvictims ofviolence uh and the outcome of that isthat I realized I needed to learnSpanish so I came to Spain to uh learnSpanish uh in 2006 I'm not a slowlearner I met my spouse here and then Istayed uh and a couple of reallyexciting things happened I startedteaching because that's what Americansdo in Spain when we can't speak thelanguage so our skills don't matter andthen we uh and in there I started uhvolunteering and attended a programcalled codebar which is helping it helpspeople uh enter into learn how toprogram learn how to code uh and Iwanted to learn to code because I wasplanning going back to the States andgoing back to research and I thought ifI know python then I could do someanalysis that we were doing before withyou know programs like SPSS anddata just trying to remind myself tobreathe okay and then I heard about thisprogram called rails girl summer of codewhich I don't know if you've heard aboutit doesn't exist anymore what it did wasuh take uh underrepresented groups andpaired you with created pair and youspent three months working at an opensource project uh with some coaches andthis is where I met Alvaro he was one ofthe first people and at that time Ididn't know I thought a standup has todo with comedy and you know retro was astyle but I learned a lot uh thoughtWorks was my host company and after thatI entered thought works as a graduatedeveloper um and that was about five anda half years ago um my last project atthought Works my last technical projectended in January I was a data engineerwe were working at a largepharmaceutical uh building data productsuh in the data M frameworkand that's when I started to think aboutmulticloud because they had they hadmultiple clouds providers um and thisFebruary I started in a new role for ayear to assist with people uh upscalinginfrastructure within the company sothat's how I gothere still a little nervous but I'mgoing to work through this okay so Iwanted us to begin with this uh issue ofwhy multicloud right because and we'veall been hearing this we've been hearinghybrid Cloud multicloud so I wanted tobring up a couple of things of you knowwhy some companies are moving towardsmulticloud um so obviously with if youhave I mean most of youve probably heardabout that little outage at Google oneof the companies they accidentallydeleted an entire account worth $400billion uh so multicloud can help withthings like high availability rightobviously we want continuous uptime forour users so if you the idea is if youdistribute your workload among differentcloud service providers it's going toincrease your high high availability andyourresiliency um so if you have an outagein one it could fail over into anotheror it can help with your disasterrecovery options so you you can storeyour entire workload in a you knoweither do active active

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/luebken/playlist-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server