Problems encountered during offline deployment #723

jsxyhelu · 2025-09-08T06:45:29Z

During offline deployment, the following error occurs:
❌ An error occurred during execution: HTTPSConnectionPool(host='www.modelscope.cn', port=443): Max retries exceeded with url: /api/v1/models/sentence-transformers/all-MiniLM-L6-v2 (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7f1eb9104040>: Failed to resolve 'www.modelscope.cn' ([Errno -3] Temporary failure in name resolution)"))

Meanwhile, the files have been downloaded as follows:
jsxyhelu@jsxyhelu-vmwarevirtualplatform:~/.cache/modelscope/hub/models/sentence-transformers/all-MiniLM-L6-v2$ ls
1_Pooling sentence_bert_config.json
config.json special_tokens_map.json
config_sentence_transformers.json tokenizer_config.json
configuration.json tokenizer.json
data_config.json train_script.py
modules.json

Preliminary analysis shows that during execution, the command actively connects to www.modelscope.cn to verify the software version. Is there a way to cancel this verification and directly use the models in the .cache folder, so that the program can run normally in offline mode?

suluyana (Maintainer) · 2025-09-08T06:49:18Z

You could try replacing model_id with the local model path (model_local_path) in your offline deployment script or command.
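
One way to read this suggestion, as a minimal sketch rather than anything from ms-agent itself: map the model ID to its directory in the local ModelScope cache and pass that directory wherever the model ID was used. The helper name `resolve_model_source` is hypothetical, and the cache layout (`~/.cache/modelscope/hub/models/<model_id>`) is assumed from the directory listing in the question; it may differ between ModelScope versions.

```python
from pathlib import Path
from typing import Optional

# Assumed cache layout, taken from the listing in the question.
DEFAULT_CACHE_ROOT = Path.home() / '.cache' / 'modelscope' / 'hub' / 'models'


def resolve_model_source(model_id: str,
                         cache_root: Optional[Path] = None) -> str:
    """Return the local cache directory for model_id if it exists and is
    non-empty; otherwise return model_id unchanged (hypothetical helper)."""
    root = Path(cache_root) if cache_root is not None else DEFAULT_CACHE_ROOT
    local_dir = root.joinpath(*model_id.split('/'))
    if local_dir.is_dir() and any(local_dir.iterdir()):
        return str(local_dir)
    # Nothing cached: the caller still gets the hub ID, which needs network.
    return model_id
```

With the cache from the question in place, `resolve_model_source('sentence-transformers/all-MiniLM-L6-v2')` would return the local directory, which can then be passed in place of the model ID.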

jsxyhelu Sep 12, 2025
Author

ms_agent\tools\docling\chunker.py

    def __init__(self,
                 # embed_model_id: str = EMBED_MODEL_ID,
                 max_tokens: int = MAX_TOKENS):
        """
        Hybrid chunker that splits interleaved picture, table, and text into chunks.
        """

        # Local model path (specific to this machine; adjust as needed)
        embed_model_path = '/home/jsxyhelu/.cache/modelscope/hub/models/sentence-transformers/all-MiniLM-L6-v2'
        self.tokenizer: BaseTokenizer = HuggingFaceTokenizer(
            # Load the tokenizer from the local path rather than from the network
            tokenizer=AutoTokenizer.from_pretrained(
                embed_model_path, local_files_only=True),
            max_tokens=max_tokens,
        )

        self.chunker = HybridChunker(
            tokenizer=self.tokenizer,
            serializer_provider=ImgPlaceholderSerializerProvider(),
        )

        logger.info(
            # f'Hybrid chunker initialized with tokenizer {embed_model_id}, max_tokens={self.tokenizer.get_max_tokens()}'
            f'Hybrid chunker initialized with tokenizer {embed_model_path}, max_tokens={self.tokenizer.get_max_tokens()}'
        )
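
Since the change above points at a fixed directory and disables network access with `local_files_only=True`, it can help to confirm beforehand that the directory actually contains the tokenizer files. The following is only a sketch: the helper name `missing_tokenizer_files` is hypothetical, and the default file list is an assumption based on the listing in the question, not a definitive statement of what `AutoTokenizer` requires.

```python
from pathlib import Path
from typing import Iterable, List

# Assumption: these two files (both present in the listing above) are what a
# fast tokenizer needs; other tokenizers may need different files.
ASSUMED_TOKENIZER_FILES = ('tokenizer.json', 'tokenizer_config.json')


def missing_tokenizer_files(
        model_dir: str,
        required: Iterable[str] = ASSUMED_TOKENIZER_FILES) -> List[str]:
    """Return the names from `required` that are not regular files in model_dir."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]
```

An empty list means every expected file was found; a non-empty list names what to copy onto the offline machine before starting the program.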

In addition, there are two more places:
ms_agent\tools\docling\patches.py

def download_models_ms(
    local_dir=None,
    force: bool = False,
    progress: bool = False,
) -> Path:
    from modelscope import snapshot_download

    model_id: str = 'ms-agent/docling-models'
    logger.info(f'Downloading or reloading {model_id} from ModelScope Hub ...')
    download_path: str = snapshot_download(model_id=model_id)
    return Path(download_path)


def download_models_pic_classifier_ms(
    local_dir=None,
    force: bool = False,
    progress: bool = False,
) -> Path:
    from modelscope import snapshot_download

    model_id: str = 'ms-agent/DocumentFigureClassifier'
    logger.info(f'Downloading or reloading {model_id} from ModelScope Hub ...')
    download_path: str = snapshot_download(model_id=model_id)
    return Path(download_path)
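
The post does not show what was changed in these two functions. As a sketch of one possible approach, assuming the installed `modelscope` version's `snapshot_download` accepts a `local_files_only` argument, the cached copy could be requested without contacting the hub. The function name `load_cached_model_dir` and its `downloader` parameter are hypothetical; the parameter exists only so the function can be exercised without `modelscope` installed.

```python
from pathlib import Path
from typing import Callable, Optional


def load_cached_model_dir(
        model_id: str,
        downloader: Optional[Callable[..., str]] = None) -> Path:
    """Return the local directory for model_id using only the local cache."""
    if downloader is None:
        # Imported lazily so this module loads even without modelscope.
        from modelscope import snapshot_download
        downloader = snapshot_download
    # Assumption: local_files_only=True makes the call use the cached files
    # and fail instead of reaching out to the ModelScope hub.
    return Path(downloader(model_id=model_id, local_files_only=True))
```

In `patches.py`, the bodies of `download_models_ms` and `download_models_pic_classifier_ms` could then call `load_cached_model_dir` with their respective `model_id` values, provided both models were downloaded to the cache while the machine was still online.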

Answer selected by jsxyhelu
