{"id":2344,"date":"2026-05-21T08:28:08","date_gmt":"2026-05-21T06:28:08","guid":{"rendered":"https:\/\/askem.eu\/?p=2344"},"modified":"2026-05-21T08:28:13","modified_gmt":"2026-05-21T06:28:13","slug":"sglang-servir-des-llm-open-source","status":"publish","type":"post","link":"https:\/\/askem.eu\/en\/2026\/05\/21\/sglang-servir-des-llm-open-source\/","title":{"rendered":"SGLang : servir des LLM open source"},"content":{"rendered":"<h2 class=\"wp-block-heading\">SGLang: serving open source LLMs faster than vLLM thanks to RadixAttention<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Once the question of <em>which model to serve<\/em> is settled, a more structural one follows: which <em>inference engine<\/em> to run it on. In the open source ecosystem, vLLM has established itself as the default choice. <a href=\"https:\/\/github.com\/sgl-project\/sglang\">SGLang<\/a>, a project born at Berkeley and backed by the LMSYS community, offers a credible alternative: an engine that is faster on agentic and RAG workloads, a Python syntax designed for structured generation, and an Apache 2.0 license.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">What SGLang changes in practice<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">SGLang does more than <em>serve<\/em> a model: it adds a layer on top of inference. Three technical building blocks explain the measured gains.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>RadixAttention.<\/strong> SGLang automatically indexes previously seen prompt prefixes in a radix tree and reuses their KV cache instead of recomputing it. On a typical agent workload (fixed system prompt, a conversation that keeps growing), the gain is immediate: between 3x and 6x on TTFT (time to first token), depending on the workload. 
The logic is close to what LMCache offers as a distributed cache, but here it is built into the engine.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Structured generation with a compiled grammar.<\/strong> SGLang ships a constraint engine (a compiled FSM, i.e. a finite-state machine) that drives decoding token by token to force JSON, SQL, a regex, or a custom grammar. Where Outlines adds an external layer, SGLang integrates the constraint into the inference loop: the cost is close to zero, and the output is guaranteed to be syntactically valid.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>The SGL Python frontend.<\/strong> The engine exposes a lightweight DSL for writing inference programs with branching, parallelization and forking of generations. It resembles a minimalist LangGraph, but co-designed with the runtime: the steps of a program share their prefix in the KV cache, so chaining calls does not mean re-processing the whole context each time.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Quick comparison with other engines<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">To place SGLang in the landscape of open source inference engines:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>vLLM<\/strong>: the general-purpose reference, with PagedAttention, a very complete Hugging Face ecosystem and mature multi-GPU support. It has the edge on batch and raw-throughput workloads.<\/li>\n\n\n\n<li><strong>SGLang<\/strong>: better on repetitive prompts, agents, structured output and multi-turn conversations. 
Adopted by xAI and several labs.<\/li>\n\n\n\n<li><strong>TGI<\/strong> (Text Generation Inference, Hugging Face): very well integrated into the HF ecosystem, simpler to deploy, slightly behind on raw performance since vLLM 0.6.<\/li>\n\n\n\n<li><strong>llama.cpp<\/strong>: not comparable; it is an engine for modest hardware, without dynamic batching designed for multi-user production.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Use cases where SGLang makes the difference<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">SGLang becomes particularly attractive when the workload shows one of these traits:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>a long, stable system prompt (agent instructions, persona, product context),<\/li>\n\n\n\n<li>multi-turn conversations that reuse the history,<\/li>\n\n\n\n<li>RAG where the same chunks often come back across requests,<\/li>\n\n\n\n<li>output that must be JSON, XML, or follow a domain-specific grammar,<\/li>\n\n\n\n<li>parallel generation (chain-of-thought with several branches, majority voting).<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">On chat workloads with little prefix reuse (highly diverse prompts), vLLM remains competitive. The more agentic or RAG-heavy the workload, the wider the gap in favor of SGLang.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Getting started, the short version<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Deployment is deliberately close to vLLM, which eases migration. 
An OpenAI-compatible server starts with a single command:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>pip install \"sglang&#91;all]\"\n\npython -m sglang.launch_server \\\n  --model-path Qwen\/Qwen2.5-7B-Instruct \\\n  --port 30000<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">No extra flag is needed for RadixAttention: prefix caching is on by default, and <code>--disable-radix-cache<\/code> turns it off. The server exposes an OpenAI-compatible <code>\/v1\/chat\/completions<\/code> API that can be used directly from LiteLLM, LangGraph, Pydantic AI, Smolagents or any standard OpenAI client. Nothing changes on the application side beyond pointing the client at the new base URL.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To take full advantage of structured generation, use the Python SDK:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import sglang as sgl\n\n@sgl.function\ndef extract_invoice(s, document):\n    # Each sgl.gen call is constrained by its regex during decoding.\n    s += \"Document: \" + document + \"\\n\"\n    s += \"Invoice number: \" + sgl.gen(\"num\", regex=r\"FA-\\d{6}\")\n    s += \"Total incl. VAT: \" + sgl.gen(\"ttc\", regex=r\"\\d+\\.\\d{2}\")\n\n# texte_facture holds the raw invoice text and must be defined beforehand.\nstate = extract_invoice.run(document=texte_facture, backend=sgl.RuntimeEndpoint(\"http:\/\/localhost:30000\"))\n# Captured fields are read by name: state&#91;\"num\"], state&#91;\"ttc\"]<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Compared with a free-form prompt followed by parsing, the output format is guaranteed, with no post-hoc syntax validation step, and latency is generally lower because the constraint keeps each generated field short. The guarantee covers the format only, not the correctness of the extracted values.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Integration into an open source stack<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">In a typical setup, SGLang replaces or complements vLLM without changing anything upstream. 
Several combinations are well proven:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>SGLang + LiteLLM<\/strong>: LiteLLM routes some requests to SGLang for structured generation and keeps vLLM for batch.<\/li>\n\n\n\n<li><strong>SGLang + Langfuse<\/strong>: standard observability via OpenTelemetry, with traces, costs and latencies per prefix.<\/li>\n\n\n\n<li><strong>SGLang + Open WebUI<\/strong>: a standard chat interface pointing at the SGLang server, with a user experience identical to a vLLM deployment.<\/li>\n\n\n\n<li><strong>SGLang + LMCache<\/strong>: if GPU memory is saturated, LMCache offloads the KV cache to RAM or disk, whereas RadixAttention organizes it in VRAM.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Limitations to be aware of<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">SGLang is still a younger project than vLLM. A few points to watch:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Multi-node support is less mature than in vLLM, especially for very large models (200B+).<\/li>\n\n\n\n<li>The documentation is decent but less extensive than that of vLLM or TGI.<\/li>\n\n\n\n<li>Not all models are optimized to the same level: Llama, Qwen, DeepSeek and Mixtral get first-class support, while others work but are less tuned.<\/li>\n\n\n\n<li>FSM-based structured generation adds an initial compilation step; for very complex grammars, that cost is cached afterwards but only pays off beyond a certain request volume.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">In summary<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">SGLang does not replace vLLM for everyone, but it becomes a serious default choice as soon as the workload turns agentic, structured or 
multi-turn. For an open source stack moving towards AI agents, it deserves a place alongside, or even instead of, the standard inference engine. And since everything is under Apache 2.0, experimenting comes with no commitment.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Useful links: <a href=\"https:\/\/github.com\/sgl-project\/sglang\" target=\"_blank\" rel=\"noreferrer noopener\">github.com\/sgl-project\/sglang<\/a>, documentation and benchmarks on the project&rsquo;s official site.<\/p>","protected":false},"excerpt":{"rendered":"<p>SGLang: serving open source LLMs faster than vLLM thanks to RadixAttention Once the question of which model to serve is settled, a more structural one follows: which inference engine to run it on. In the open source ecosystem, vLLM has established itself as the default choice. SGLang, a project born at Berkeley and backed by the LMSYS community, offers a credible alternative 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":2345,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ocean_post_layout":"","ocean_both_sidebars_style":"","ocean_both_sidebars_content_width":0,"ocean_both_sidebars_sidebars_width":0,"ocean_sidebar":"","ocean_second_sidebar":"","ocean_disable_margins":"enable","ocean_add_body_class":"","ocean_shortcode_before_top_bar":"","ocean_shortcode_after_top_bar":"","ocean_shortcode_before_header":"","ocean_shortcode_after_header":"","ocean_has_shortcode":"","ocean_shortcode_after_title":"","ocean_shortcode_before_footer_widgets":"","ocean_shortcode_after_footer_widgets":"","ocean_shortcode_before_footer_bottom":"","ocean_shortcode_after_footer_bottom":"","ocean_display_top_bar":"default","ocean_display_header":"default","ocean_header_style":"","ocean_center_header_left_menu":"","ocean_custom_header_template":"","ocean_custom_logo":0,"ocean_custom_retina_logo":0,"ocean_custom_logo_max_width":0,"ocean_custom_logo_tablet_max_width":0,"ocean_custom_logo_mobile_max_width":0,"ocean_custom_logo_max_height":0,"ocean_custom_logo_tablet_max_height":0,"ocean_custom_logo_mobile_max_height":0,"ocean_header_custom_menu":"","ocean_menu_typo_font_family":"","ocean_menu_typo_font_subset":"","ocean_menu_typo_font_size":0,"ocean_menu_typo_font_size_tablet":0,"ocean_menu_typo_font_size_mobile":0,"ocean_menu_typo_font_size_unit":"px","ocean_menu_typo_font_weight":"","ocean_menu_typo_font_weight_tablet":"","ocean_menu_typo_font_weight_mobile":"","ocean_menu_typo_transform":"","ocean_menu_typo_transform_tablet":"","ocean_menu_typo_transform_mobile":"","ocean_menu_typo_line_height":0,"ocean_menu_typo_line_height_tablet":0,"ocean_menu_typo_line_height_mobile":0,"ocean_menu_typo_line_height_unit":"","ocean_menu_typo_spacing":0,"ocean_menu_typo_spacing_tablet":0,"ocean_menu_typo_spacing_mobile":0,"ocean_menu_typo_spacing_unit":"","ocean_menu_link_color":"","ocean_menu_link_color_hove
r":"","ocean_menu_link_color_active":"","ocean_menu_link_background":"","ocean_menu_link_hover_background":"","ocean_menu_link_active_background":"","ocean_menu_social_links_bg":"","ocean_menu_social_hover_links_bg":"","ocean_menu_social_links_color":"","ocean_menu_social_hover_links_color":"","ocean_disable_title":"default","ocean_disable_heading":"default","ocean_post_title":"","ocean_post_subheading":"","ocean_post_title_style":"","ocean_post_title_background_color":"","ocean_post_title_background":0,"ocean_post_title_bg_image_position":"","ocean_post_title_bg_image_attachment":"","ocean_post_title_bg_image_repeat":"","ocean_post_title_bg_image_size":"","ocean_post_title_height":0,"ocean_post_title_bg_overlay":0.5,"ocean_post_title_bg_overlay_color":"","ocean_disable_breadcrumbs":"default","ocean_breadcrumbs_color":"","ocean_breadcrumbs_separator_color":"","ocean_breadcrumbs_links_color":"","ocean_breadcrumbs_links_hover_color":"","ocean_display_footer_widgets":"default","ocean_display_footer_bottom":"default","ocean_custom_footer_template":"","osh_disable_topbar_sticky":"default","osh_disable_header_sticky":"default","osh_sticky_header_style":"default","osh_sticky_header_effect":"","osh_custom_sticky_logo":0,"osh_custom_retina_sticky_logo":0,"osh_custom_sticky_logo_height":0,"osh_background_color":"","osh_links_color":"","osh_links_hover_color":"","osh_links_active_color":"","osh_links_bg_color":"","osh_links_hover_bg_color":"","osh_links_active_bg_color":"","osh_menu_social_links_color":"","osh_menu_social_hover_links_color":"","ocean_post_oembed":"","ocean_post_self_hosted_media":"","ocean_post_video_embed":"","ocean_link_format":"","ocean_link_format_target":"self","ocean_quote_format":"","ocean_quote_format_link":"post","ocean_gallery_link_images":"on","ocean_gallery_id":[],"footnotes":""},"categories":[16],"tags":[],"class_list":["post-2344","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","entry","has-media"
],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.6 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>SGLang : servir des LLM open source - askem<\/title>\n<meta name=\"description\" content=\"ASKEM BUREAU D&#039;\u00c9TUDES ET DE FORMATION NUM\u00c9RIQUE. Nous vous assistons dans la transformation num\u00e9rique de vos outils, services et organisations tout en pla\u00e7ant l\u2019humain au c\u0153ur de notre d\u00e9marche d\u2019accompagnement.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/askem.eu\/en\/2026\/05\/21\/sglang-servir-des-llm-open-source\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"SGLang : servir des LLM open source - askem\" \/>\n<meta property=\"og:description\" content=\"ASKEM BUREAU D&#039;\u00c9TUDES ET DE FORMATION NUM\u00c9RIQUE. 
Nous vous assistons dans la transformation num\u00e9rique de vos outils, services et organisations tout en pla\u00e7ant l\u2019humain au c\u0153ur de notre d\u00e9marche d\u2019accompagnement.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/askem.eu\/en\/2026\/05\/21\/sglang-servir-des-llm-open-source\/\" \/>\n<meta property=\"og:site_name\" content=\"askem\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/fb.me\/askem.eu\" \/>\n<meta property=\"article:published_time\" content=\"2026-05-21T06:28:08+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-05-21T06:28:13+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/mlpi0fxo3sth.i.optimole.com\/cb:3obA.c61\/w:auto\/h:auto\/q:mauto\/f:best\/https:\/\/askem.eu\/wp-content\/uploads\/2026\/05\/sujet-askem-2026-05-21.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1600\" \/>\n\t<meta property=\"og:image:height\" content=\"1000\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"askemadmin\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"askemadmin\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/askem.eu\\\/2026\\\/05\\\/21\\\/sglang-servir-des-llm-open-source\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/askem.eu\\\/2026\\\/05\\\/21\\\/sglang-servir-des-llm-open-source\\\/\"},\"author\":{\"name\":\"askemadmin\",\"@id\":\"https:\\\/\\\/askem.eu\\\/#\\\/schema\\\/person\\\/8bbee74ab9a977d56bf4826662e9d2e9\"},\"headline\":\"SGLang : servir des LLM open source\",\"datePublished\":\"2026-05-21T06:28:08+00:00\",\"dateModified\":\"2026-05-21T06:28:13+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/askem.eu\\\/2026\\\/05\\\/21\\\/sglang-servir-des-llm-open-source\\\/\"},\"wordCount\":956,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/askem.eu\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/askem.eu\\\/2026\\\/05\\\/21\\\/sglang-servir-des-llm-open-source\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\/\\/askem.eu\\/wp-content\\/uploads\\/2026\\/05\\/sujet-askem-2026-05-21.png\",\"articleSection\":[\"AI\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/askem.eu\\\/2026\\\/05\\\/21\\\/sglang-servir-des-llm-open-source\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/askem.eu\\\/2026\\\/05\\\/21\\\/sglang-servir-des-llm-open-source\\\/\",\"url\":\"https:\\\/\\\/askem.eu\\\/2026\\\/05\\\/21\\\/sglang-servir-des-llm-open-source\\\/\",\"name\":\"SGLang : servir des LLM open source - 
askem\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/askem.eu\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/askem.eu\\\/2026\\\/05\\\/21\\\/sglang-servir-des-llm-open-source\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/askem.eu\\\/2026\\\/05\\\/21\\\/sglang-servir-des-llm-open-source\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\/\\/askem.eu\\/wp-content\\/uploads\\/2026\\/05\\/sujet-askem-2026-05-21.png\",\"datePublished\":\"2026-05-21T06:28:08+00:00\",\"dateModified\":\"2026-05-21T06:28:13+00:00\",\"description\":\"ASKEM BUREAU D'\u00c9TUDES ET DE FORMATION NUM\u00c9RIQUE. Nous vous assistons dans la transformation num\u00e9rique de vos outils, services et organisations tout en pla\u00e7ant l\u2019humain au c\u0153ur de notre d\u00e9marche d\u2019accompagnement.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/askem.eu\\\/2026\\\/05\\\/21\\\/sglang-servir-des-llm-open-source\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/askem.eu\\\/2026\\\/05\\\/21\\\/sglang-servir-des-llm-open-source\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/askem.eu\\\/2026\\\/05\\\/21\\\/sglang-servir-des-llm-open-source\\\/#primaryimage\",\"url\":\"https:\\/\\/askem.eu\\/wp-content\\/uploads\\/2026\\/05\\/sujet-askem-2026-05-21.png\",\"contentUrl\":\"https:\\/\\/askem.eu\\/wp-content\\/uploads\\/2026\\/05\\/sujet-askem-2026-05-21.png\",\"width\":1600,\"height\":1000},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/askem.eu\\\/2026\\\/05\\\/21\\\/sglang-servir-des-llm-open-source\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Accueil\",\"item\":\"https:\\\/\\\/askem.eu\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"SGLang : servir des LLM open 
source\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/askem.eu\\\/#website\",\"url\":\"https:\\\/\\\/askem.eu\\\/\",\"name\":\"askem\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\\\/\\\/askem.eu\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/askem.eu\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/askem.eu\\\/#organization\",\"name\":\"Askem\",\"url\":\"https:\\\/\\\/askem.eu\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/askem.eu\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\/\\/mlpi0fxo3sth.i.optimole.com\\/cb:3obA.c61\\/w:760\\/h:480\\/q:mauto\\/f:best\\/https:\\/\\/askem.eu\\/wp-content\\/uploads\\/2020\\/10\\/logoGalaxieAskem3.png\",\"contentUrl\":\"https:\\/\\/mlpi0fxo3sth.i.optimole.com\\/cb:3obA.c61\\/w:760\\/h:480\\/q:mauto\\/f:best\\/https:\\/\\/askem.eu\\/wp-content\\/uploads\\/2020\\/10\\/logoGalaxieAskem3.png\",\"width\":760,\"height\":480,\"caption\":\"Askem\"},\"image\":{\"@id\":\"https:\\\/\\\/askem.eu\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/fb.me\\\/askem.eu\",\"https:\\\/\\\/linkedin.com\\\/company\\\/askem-eu\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/askem.eu\\\/#\\\/schema\\\/person\\\/8bbee74ab9a977d56bf4826662e9d2e9\",\"name\":\"askemadmin\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a202f744ee3a4b6fdbe2ceb57fd84c72559337791a276662270d8d2fb7842e3f?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a202f744ee3a4b6fdbe2ceb57fd84c72559337791a276662270d8d2fb7842e3f?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/a202f744ee3a4b6fdbe2ceb57fd84c72559337791
a276662270d8d2fb7842e3f?s=96&d=mm&r=g\",\"caption\":\"askemadmin\"},\"sameAs\":[\"https:\\\/\\\/askem.eu\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"SGLang : servir des LLM open source - askem","description":"ASKEM BUREAU D'\u00c9TUDES ET DE FORMATION NUM\u00c9RIQUE. Nous vous assistons dans la transformation num\u00e9rique de vos outils, services et organisations tout en pla\u00e7ant l\u2019humain au c\u0153ur de notre d\u00e9marche d\u2019accompagnement.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/askem.eu\/en\/2026\/05\/21\/sglang-servir-des-llm-open-source\/","og_locale":"en_US","og_type":"article","og_title":"SGLang : servir des LLM open source - askem","og_description":"ASKEM BUREAU D'\u00c9TUDES ET DE FORMATION NUM\u00c9RIQUE. Nous vous assistons dans la transformation num\u00e9rique de vos outils, services et organisations tout en pla\u00e7ant l\u2019humain au c\u0153ur de notre d\u00e9marche d\u2019accompagnement.","og_url":"https:\/\/askem.eu\/en\/2026\/05\/21\/sglang-servir-des-llm-open-source\/","og_site_name":"askem","article_publisher":"https:\/\/fb.me\/askem.eu","article_published_time":"2026-05-21T06:28:08+00:00","article_modified_time":"2026-05-21T06:28:13+00:00","og_image":[{"width":1600,"height":1000,"url":"https:\/\/mlpi0fxo3sth.i.optimole.com\/cb:3obA.c61\/w:auto\/h:auto\/q:mauto\/f:best\/https:\/\/askem.eu\/wp-content\/uploads\/2026\/05\/sujet-askem-2026-05-21.png","type":"image\/png"}],"author":"askemadmin","twitter_card":"summary_large_image","twitter_misc":{"Written by":"askemadmin","Est. 
reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/askem.eu\/2026\/05\/21\/sglang-servir-des-llm-open-source\/#article","isPartOf":{"@id":"https:\/\/askem.eu\/2026\/05\/21\/sglang-servir-des-llm-open-source\/"},"author":{"name":"askemadmin","@id":"https:\/\/askem.eu\/#\/schema\/person\/8bbee74ab9a977d56bf4826662e9d2e9"},"headline":"SGLang : servir des LLM open source","datePublished":"2026-05-21T06:28:08+00:00","dateModified":"2026-05-21T06:28:13+00:00","mainEntityOfPage":{"@id":"https:\/\/askem.eu\/2026\/05\/21\/sglang-servir-des-llm-open-source\/"},"wordCount":956,"commentCount":0,"publisher":{"@id":"https:\/\/askem.eu\/#organization"},"image":{"@id":"https:\/\/askem.eu\/2026\/05\/21\/sglang-servir-des-llm-open-source\/#primaryimage"},"thumbnailUrl":"https:\/\/mlpi0fxo3sth.i.optimole.com\/cb:3obA.c61\/w:auto\/h:auto\/q:mauto\/f:best\/https:\/\/askem.eu\/wp-content\/uploads\/2026\/05\/sujet-askem-2026-05-21.png","articleSection":["AI"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/askem.eu\/2026\/05\/21\/sglang-servir-des-llm-open-source\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/askem.eu\/2026\/05\/21\/sglang-servir-des-llm-open-source\/","url":"https:\/\/askem.eu\/2026\/05\/21\/sglang-servir-des-llm-open-source\/","name":"SGLang : servir des LLM open source - askem","isPartOf":{"@id":"https:\/\/askem.eu\/#website"},"primaryImageOfPage":{"@id":"https:\/\/askem.eu\/2026\/05\/21\/sglang-servir-des-llm-open-source\/#primaryimage"},"image":{"@id":"https:\/\/askem.eu\/2026\/05\/21\/sglang-servir-des-llm-open-source\/#primaryimage"},"thumbnailUrl":"https:\/\/mlpi0fxo3sth.i.optimole.com\/cb:3obA.c61\/w:auto\/h:auto\/q:mauto\/f:best\/https:\/\/askem.eu\/wp-content\/uploads\/2026\/05\/sujet-askem-2026-05-21.png","datePublished":"2026-05-21T06:28:08+00:00","dateModified":"2026-05-21T06:28:13+00:00","description":"ASKEM BUREAU D'\u00c9TUDES ET 
DE FORMATION NUM\u00c9RIQUE. Nous vous assistons dans la transformation num\u00e9rique de vos outils, services et organisations tout en pla\u00e7ant l\u2019humain au c\u0153ur de notre d\u00e9marche d\u2019accompagnement.","breadcrumb":{"@id":"https:\/\/askem.eu\/2026\/05\/21\/sglang-servir-des-llm-open-source\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/askem.eu\/2026\/05\/21\/sglang-servir-des-llm-open-source\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/askem.eu\/2026\/05\/21\/sglang-servir-des-llm-open-source\/#primaryimage","url":"https:\/\/mlpi0fxo3sth.i.optimole.com\/cb:3obA.c61\/w:auto\/h:auto\/q:mauto\/f:best\/https:\/\/askem.eu\/wp-content\/uploads\/2026\/05\/sujet-askem-2026-05-21.png","contentUrl":"https:\/\/mlpi0fxo3sth.i.optimole.com\/cb:3obA.c61\/w:auto\/h:auto\/q:mauto\/f:best\/https:\/\/askem.eu\/wp-content\/uploads\/2026\/05\/sujet-askem-2026-05-21.png","width":1600,"height":1000},{"@type":"BreadcrumbList","@id":"https:\/\/askem.eu\/2026\/05\/21\/sglang-servir-des-llm-open-source\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Accueil","item":"https:\/\/askem.eu\/"},{"@type":"ListItem","position":2,"name":"SGLang : servir des LLM open 
source"}]},{"@type":"WebSite","@id":"https:\/\/askem.eu\/#website","url":"https:\/\/askem.eu\/","name":"askem","description":"","publisher":{"@id":"https:\/\/askem.eu\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/askem.eu\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/askem.eu\/#organization","name":"Askem","url":"https:\/\/askem.eu\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/askem.eu\/#\/schema\/logo\/image\/","url":"https:\/\/mlpi0fxo3sth.i.optimole.com\/cb:3obA.c61\/w:760\/h:480\/q:mauto\/f:best\/https:\/\/askem.eu\/wp-content\/uploads\/2020\/10\/logoGalaxieAskem3.png","contentUrl":"https:\/\/mlpi0fxo3sth.i.optimole.com\/cb:3obA.c61\/w:760\/h:480\/q:mauto\/f:best\/https:\/\/askem.eu\/wp-content\/uploads\/2020\/10\/logoGalaxieAskem3.png","width":760,"height":480,"caption":"Askem"},"image":{"@id":"https:\/\/askem.eu\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/fb.me\/askem.eu","https:\/\/linkedin.com\/company\/askem-eu"]},{"@type":"Person","@id":"https:\/\/askem.eu\/#\/schema\/person\/8bbee74ab9a977d56bf4826662e9d2e9","name":"askemadmin","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/a202f744ee3a4b6fdbe2ceb57fd84c72559337791a276662270d8d2fb7842e3f?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/a202f744ee3a4b6fdbe2ceb57fd84c72559337791a276662270d8d2fb7842e3f?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/a202f744ee3a4b6fdbe2ceb57fd84c72559337791a276662270d8d2fb7842e3f?s=96&d=mm&r=g","caption":"askemadmin"},"sameAs":["https:\/\/askem.eu"]}]}},"_links":{"self":[{"href":"https:\/\/askem.eu\/en\/wp-json\/wp\/v2\/posts\/2344","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/askem.eu\/en\/wp-json\/wp\/v2\/posts"}],"about"
:[{"href":"https:\/\/askem.eu\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/askem.eu\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/askem.eu\/en\/wp-json\/wp\/v2\/comments?post=2344"}],"version-history":[{"count":1,"href":"https:\/\/askem.eu\/en\/wp-json\/wp\/v2\/posts\/2344\/revisions"}],"predecessor-version":[{"id":2346,"href":"https:\/\/askem.eu\/en\/wp-json\/wp\/v2\/posts\/2344\/revisions\/2346"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/askem.eu\/en\/wp-json\/wp\/v2\/media\/2345"}],"wp:attachment":[{"href":"https:\/\/askem.eu\/en\/wp-json\/wp\/v2\/media?parent=2344"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/askem.eu\/en\/wp-json\/wp\/v2\/categories?post=2344"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/askem.eu\/en\/wp-json\/wp\/v2\/tags?post=2344"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}