{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":634081686,"defaultBranch":"main","name":"mlc-llm","ownerLogin":"mlc-ai","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2023-04-29T01:59:25.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/106173866?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1725908714.0","currentOid":""},"activityList":{"items":[{"before":"97733af29252e43bd8bdf8a4cbac7f72916048d2","after":"d0c271763e1b4415ae2cea041bb694a4cff1e3ac","ref":"refs/heads/gh-pages","pushedAt":"2024-09-19T19:24:30.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Thu Sep 19 19:24:30 UTC 2024","shortMessageHtmlLink":"Build at Thu Sep 19 19:24:30 UTC 2024"}},{"before":"1828f9589688a36b8b967d256391dd94f9983128","after":"763a677d18466950e3f726d2960ec01e1d59e816","ref":"refs/heads/main","pushedAt":"2024-09-19T19:17:59.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"MasterJH5574","name":"Ruihang Lai","path":"/MasterJH5574","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/45167100?s=80&v=4"},"commit":{"message":"[Vision][Fix] Enable image processing kernel on non-CUDA backends (#2923)\n\nPrior to this PR, when compiling/running phi3.5-vision on non-CUDA\r\nbackend like Metal, we would run into the following issues:\r\n\r\n- Shape inference would exceed int32 (CUDA does not run into this as we\r\nuse int64 on CUDA), leading to error in runtime:\r\n```\r\nTVMError: Assert fail: (T.Div(new_h - 2147483185, 336) - -6391320) * 336 == T.Cast(\"int32\", resize2d1_var_lv4_shape[1]), Argument resize2d1.var_lv4.shape[1] has an unsatisfied constraint: new_h + T.Div((new_h + 336 - 1) // 336 * 336 - new_h, 2) + ((new_h + 336 - 1) // 336 * 336 - new_h - 
T.Div((new_h + 336 - 1) // 336 * 336 - new_h, 2)) == T.Cast(\"int32\", resize2d1_var_lv4_shape[1])\r\n``` \r\n- If naively keeping int64 on Metal, we run into:\r\n - `TVMError: Check failed: blockSize <= maxTotalThreadsPerThreadgroup (1024 vs. 896) :`\r\n - This is because when we use too many registers, number of available threads\r\nin a block decreases (to 896 here)\r\n\r\nThis PR fixes the issues above.\r\n\r\nBesides, we rename `std` to `stddev` to avoid reserved name issues on backends like WGSL.\r\n\r\nTested on Metal with:\r\n```\r\npython python/mlc_llm/testing/debug_chat.py \"List the objects you can identify in this image succinctly.\" --generate-len 256 --model dist/phi-3_5-vision-q4f16_1 --model-lib dist/libs/phi-3_5-vision-q4f16_1-metal.so --debug-dir debug/ --image-url https://www.islandvulnerability.org/borders/ai8699.jpg --disable-instrument\r\n```\r\n\r\n---------\r\n\r\nCo-authored-by: Ruihang Lai ","shortMessageHtmlLink":"[Vision][Fix] Enable image processing kernel on non-CUDA backends (#2923"}},{"before":"2223f4282686a7880ff760241566808dfae6a2f0","after":"97733af29252e43bd8bdf8a4cbac7f72916048d2","ref":"refs/heads/gh-pages","pushedAt":"2024-09-19T14:54:31.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Thu Sep 19 14:54:31 UTC 2024","shortMessageHtmlLink":"Build at Thu Sep 19 14:54:31 UTC 2024"}},{"before":"57f6d8c40d8fd9f4a96106ede9f6672c82ea624e","after":"1828f9589688a36b8b967d256391dd94f9983128","ref":"refs/heads/main","pushedAt":"2024-09-19T14:47:15.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"MasterJH5574","name":"Ruihang Lai","path":"/MasterJH5574","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/45167100?s=80&v=4"},"commit":{"message":"[Preset] Add Qwen2.5 3b to preset (#2919)\n\nAdd Qwen2.5 3B to 
preset","shortMessageHtmlLink":"[Preset] Add Qwen2.5 3b to preset (#2919)"}},{"before":"a1a3f21ef83a24028c6eeb5770be6221e2ee59eb","after":"2223f4282686a7880ff760241566808dfae6a2f0","ref":"refs/heads/gh-pages","pushedAt":"2024-09-19T00:19:36.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Thu Sep 19 00:19:35 UTC 2024","shortMessageHtmlLink":"Build at Thu Sep 19 00:19:35 UTC 2024"}},{"before":"de1bc368dcc54a7cc5b57304cb1c74b9a16bc680","after":"57f6d8c40d8fd9f4a96106ede9f6672c82ea624e","ref":"refs/heads/main","pushedAt":"2024-09-19T00:11:18.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"MasterJH5574","name":"Ruihang Lai","path":"/MasterJH5574","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/45167100?s=80&v=4"},"commit":{"message":"[Bench] Use OpenAI v1/completions as default (#2918)\n\nThis PR updates the benchmark to use OpenAI entrypoint `v1/completions`\r\nas the default to remove the impact of `v1/chat/completions` on\r\nconversation template and system prompt. The previous endpoint\r\nis renamed to `openai-chat`, and the naming here is aligned with vLLM.\r\n\r\nThis PR also fixes the TPOT calculation. 
Previously it used \"number of\r\noutput tokens - 1\" as the divisor, and this PR changes \"1\" to the number\r\nof tokens in the first output chunk.","shortMessageHtmlLink":"[Bench] Use OpenAI v1/completions as default (#2918)"}},{"before":"66fd62df1b68d1182623d7a934469ecbfa61422c","after":"de1bc368dcc54a7cc5b57304cb1c74b9a16bc680","ref":"refs/heads/main","pushedAt":"2024-09-18T20:58:01.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Auto updated submodule references","shortMessageHtmlLink":"Auto updated submodule references"}},{"before":"5bb91ecdb847676b2ae96d2c578c786874fe80bb","after":"a1a3f21ef83a24028c6eeb5770be6221e2ee59eb","ref":"refs/heads/gh-pages","pushedAt":"2024-09-18T19:55:25.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Wed Sep 18 19:55:25 UTC 2024","shortMessageHtmlLink":"Build at Wed Sep 18 19:55:25 UTC 2024"}},{"before":"1eabc65656d09e1d663f8c20777868b10024850c","after":"66fd62df1b68d1182623d7a934469ecbfa61422c","ref":"refs/heads/main","pushedAt":"2024-09-18T19:46:51.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"MasterJH5574","name":"Ruihang Lai","path":"/MasterJH5574","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/45167100?s=80&v=4"},"commit":{"message":"[Model] Update default prefill chunk size and max batch size (#2917)\n\nThis PR updates the default prefill chunk size from 2048 to 8192,\r\nand the default max batch size from 80 to 128.","shortMessageHtmlLink":"[Model] Update default prefill chunk size and max batch size 
(#2917)"}},{"before":"4264e11edf76766987b951b76682acc38ead7bc3","after":"5bb91ecdb847676b2ae96d2c578c786874fe80bb","ref":"refs/heads/gh-pages","pushedAt":"2024-09-18T13:52:35.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Wed Sep 18 13:52:34 UTC 2024","shortMessageHtmlLink":"Build at Wed Sep 18 13:52:34 UTC 2024"}},{"before":"571d3808251038ff11065a8665fa9bb152dd9828","after":"1eabc65656d09e1d663f8c20777868b10024850c","ref":"refs/heads/main","pushedAt":"2024-09-18T13:45:10.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"MasterJH5574","name":"Ruihang Lai","path":"/MasterJH5574","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/45167100?s=80&v=4"},"commit":{"message":"[Fix][Spec] Fix adaptive spec decoding (#2916)\n\nThis PR fixes the bug which happens when switching over from batch\r\ndecode mode to the spec decoding mode.","shortMessageHtmlLink":"[Fix][Spec] Fix adaptive spec decoding (#2916)"}},{"before":"a72af9b2094822e9eba6829b11d033d2028eb45b","after":"4264e11edf76766987b951b76682acc38ead7bc3","ref":"refs/heads/gh-pages","pushedAt":"2024-09-18T13:08:07.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Wed Sep 18 13:08:06 UTC 2024","shortMessageHtmlLink":"Build at Wed Sep 18 13:08:06 UTC 2024"}},{"before":"7b53664b14b4a44438e6df70d9d48000c3e0d384","after":"571d3808251038ff11065a8665fa9bb152dd9828","ref":"refs/heads/main","pushedAt":"2024-09-18T13:00:58.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"MasterJH5574","name":"Ruihang Lai","path":"/MasterJH5574","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/45167100?s=80&v=4"},"commit":{"message":"[Vision] 
Support image input in DebugChat (#2913)\n\nThis PR supports feeding an image url to DebugChat by generalizing\r\nits implementation for tokenization and embedding. Previous pure-text\r\nusage remains the same, and users can pass in `--image-url`\r\nfollowed by the image URL.\r\n\r\nWe also add `--disable-instrument` to disable dumping kernel\r\ninput/output details for faster generation when instrumenting is\r\nnot needed.","shortMessageHtmlLink":"[Vision] Support image input in DebugChat (#2913)"}},{"before":"7a276945ff16ab342bb293e039cbe8bb76bd0ffe","after":"a72af9b2094822e9eba6829b11d033d2028eb45b","ref":"refs/heads/gh-pages","pushedAt":"2024-09-17T20:24:39.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Tue Sep 17 20:24:39 UTC 2024","shortMessageHtmlLink":"Build at Tue Sep 17 20:24:39 UTC 2024"}},{"before":"2a32a3015de3fb9ed3b21cbed50a5141086b07c8","after":"7b53664b14b4a44438e6df70d9d48000c3e0d384","ref":"refs/heads/main","pushedAt":"2024-09-17T20:16:53.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"Align response/request format of logprobs for OpenAI completion API (#2912)\n\n* set correct top logprob limits\r\n\r\n* change logprobs interface for completion request\r\n\r\n* update logprobs response format for completion request\r\n\r\n* update API for correct logprobs format\r\n\r\n* check top_logprobs, update tests","shortMessageHtmlLink":"Align response/request format of logprobs for OpenAI completion API 
(#…"}},{"before":"202a45fdd6346b9843e20772171352705a65193e","after":"7a276945ff16ab342bb293e039cbe8bb76bd0ffe","ref":"refs/heads/gh-pages","pushedAt":"2024-09-17T03:04:15.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Tue Sep 17 03:04:14 UTC 2024","shortMessageHtmlLink":"Build at Tue Sep 17 03:04:14 UTC 2024"}},{"before":"52c96ac93b5940db89b40f874909a6e63506d0d5","after":"2a32a3015de3fb9ed3b21cbed50a5141086b07c8","ref":"refs/heads/main","pushedAt":"2024-09-17T02:57:36.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"MasterJH5574","name":"Ruihang Lai","path":"/MasterJH5574","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/45167100?s=80&v=4"},"commit":{"message":"[Bench] Print server metrics when enabled (#2909)\n\nThis PR tries to print the MLC server metrics after benchmark\r\nwhen the server is enabled with debug mode.","shortMessageHtmlLink":"[Bench] Print server metrics when enabled (#2909)"}},{"before":"a3a6f586ffa7c5ecd76b210bd4f445fa235e8aed","after":"202a45fdd6346b9843e20772171352705a65193e","ref":"refs/heads/gh-pages","pushedAt":"2024-09-16T11:34:47.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Mon Sep 16 11:34:47 UTC 2024","shortMessageHtmlLink":"Build at Mon Sep 16 11:34:47 UTC 2024"}},{"before":"36d0ed1031f6f0cd79820a8091c107c772d88810","after":"52c96ac93b5940db89b40f874909a6e63506d0d5","ref":"refs/heads/main","pushedAt":"2024-09-16T11:27:44.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"[Engine][Spec] 
Initial adaptive speculative decoding (#2906)\n\nThis PR introduces the initial support of adaptive speculative decoding,\nwhich is enabled when the `spec_draft_length` is 0 (its default\nvalue is also changed to 0 in this PR so that the adaptive mode\nis enabled by default).\n\nThe goal of adaptive speculative decoding is to dynamically adjust\nthe draft length depending on the engine's running state. For the\ninitial version, we use a fixed table for the draft length selection.\nWe will follow up with more powerful draft length selection.\n\nThis PR also fixes a few bugs after the introduction of tree draft.","shortMessageHtmlLink":"[Engine][Spec] Initial adaptive speculative decoding (#2906)"}},{"before":"f48429dc00796aedbff95a6c4e05a2c02feb2957","after":"a3a6f586ffa7c5ecd76b210bd4f445fa235e8aed","ref":"refs/heads/gh-pages","pushedAt":"2024-09-14T19:36:37.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Sat Sep 14 19:36:36 UTC 2024","shortMessageHtmlLink":"Build at Sat Sep 14 19:36:36 UTC 2024"}},{"before":"6277afbf112f0e01040e05bf06f03c0c79c298f6","after":"36d0ed1031f6f0cd79820a8091c107c772d88810","ref":"refs/heads/main","pushedAt":"2024-09-14T19:29:59.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"MasterJH5574","name":"Ruihang Lai","path":"/MasterJH5574","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/45167100?s=80&v=4"},"commit":{"message":"[Model] Add image preprocess for vision model (#2892)\n\nThis PR adds image preprocessing for vision models","shortMessageHtmlLink":"[Model] Add image preprocess for vision model 
(#2892)"}},{"before":"df939d9f6a7118a2c36da39d89bc45d544b535ce","after":"f48429dc00796aedbff95a6c4e05a2c02feb2957","ref":"refs/heads/gh-pages","pushedAt":"2024-09-14T13:37:47.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Sat Sep 14 13:37:47 UTC 2024","shortMessageHtmlLink":"Build at Sat Sep 14 13:37:47 UTC 2024"}},{"before":"8f6b8e1677b5b9b95067c8b3e31fd17fa5dd0063","after":"6277afbf112f0e01040e05bf06f03c0c79c298f6","ref":"refs/heads/main","pushedAt":"2024-09-14T13:31:04.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"[CI][Docs] Update nightly CPU package (#2903)\n\nThis PR updates the uses of the nightly CPU package","shortMessageHtmlLink":"[CI][Docs] Update nightly CPU package (#2903)"}},{"before":"9918c4bdd3598e44645c463879d58cd5c3d97aef","after":"8f6b8e1677b5b9b95067c8b3e31fd17fa5dd0063","ref":"refs/heads/main","pushedAt":"2024-09-14T13:21:40.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Auto updated submodule references","shortMessageHtmlLink":"Auto updated submodule references"}},{"before":"d24af7acd7f4d9742c7852c9fa672a002faa60e0","after":"df939d9f6a7118a2c36da39d89bc45d544b535ce","ref":"refs/heads/gh-pages","pushedAt":"2024-09-13T14:05:48.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Fri Sep 13 14:05:48 UTC 2024","shortMessageHtmlLink":"Build at Fri Sep 13 
14:05:48 UTC 2024"}},{"before":"2b7f128a9fb2b66577811e4795316d72df53ffb8","after":"9918c4bdd3598e44645c463879d58cd5c3d97aef","ref":"refs/heads/main","pushedAt":"2024-09-13T13:57:09.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"MasterJH5574","name":"Ruihang Lai","path":"/MasterJH5574","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/45167100?s=80&v=4"},"commit":{"message":"[Bench] Support multi round conversation (#2898)\n\nThis PR adds support for multi-round conversation when benchmarked\r\nwith the fixed concurrent request mode. When enabled, the chat history will\r\nbe logged and appended during the benchmark.","shortMessageHtmlLink":"[Bench] Support multi round conversation (#2898)"}},{"before":"18843e54e170d801b335ebd64c04a872d1403446","after":"d24af7acd7f4d9742c7852c9fa672a002faa60e0","ref":"refs/heads/gh-pages","pushedAt":"2024-09-11T22:24:54.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Wed Sep 11 22:24:53 UTC 2024","shortMessageHtmlLink":"Build at Wed Sep 11 22:24:53 UTC 2024"}},{"before":"c4a24316f4d97bdfccbc8263622da09601b6959b","after":"2b7f128a9fb2b66577811e4795316d72df53ffb8","ref":"refs/heads/main","pushedAt":"2024-09-11T22:17:26.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"tqchen","name":"Tianqi Chen","path":"/tqchen","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/2577440?s=80&v=4"},"commit":{"message":"Fix the dtype of batch_size to ensure Mixtral IR is well-formed (#2891)","shortMessageHtmlLink":"Fix the dtype of batch_size to ensure Mixtral IR is well-formed 
(#2891)"}},{"before":"77b20e166f785de010a1d0fdc74de49f7e05866a","after":"c4a24316f4d97bdfccbc8263622da09601b6959b","ref":"refs/heads/main","pushedAt":"2024-09-11T03:02:47.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Auto updated submodule references","shortMessageHtmlLink":"Auto updated submodule references"}},{"before":"27fba5cec14fdeed153efe87a1b1fc413c577def","after":"18843e54e170d801b335ebd64c04a872d1403446","ref":"refs/heads/gh-pages","pushedAt":"2024-09-09T16:39:53.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"github-actions[bot]","name":null,"path":"/apps/github-actions","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/15368?s=80&v=4"},"commit":{"message":"Build at Mon Sep 9 16:39:52 UTC 2024","shortMessageHtmlLink":"Build at Mon Sep 9 16:39:52 UTC 2024"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAEuz5PaAA","startCursor":null,"endCursor":null}},"title":"Activity · mlc-ai/mlc-llm"}