A Practical Guide to Continual Pretraining of Multimodal Models: From FoMo-in-Flux to Real-World Applications
Main content:
2. Concept Frequency Ordering (concept-frequency) draws motivation from Udandarao et al. [181]: user requests for model improvement start from the least frequent concepts (as these constitute edge cases that are most likely to cause undesired performance drops) and incrementally extend to more frequent concepts, which are already well represented in the pretraining pool.
Implementation. We use the elastic-search index of the What's In My Big Data [43] tool to look up the frequency of occurrence of each class name in the C4 [145] dataset. We then order the classes such that the least frequent concepts (long-tail) come first and the most frequent ones (head concepts) come last.
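As an illustration, here is a minimal Python sketch of this frequency-based ordering, assuming the per-class C4 occurrence counts have already been retrieved; the `c4_counts` dictionary below is a hypothetical stand-in for the WIMBD elastic-search lookups:

```python
# Hypothetical per-class occurrence counts in C4, e.g. as retrieved via the
# WIMBD (What's In My Big Data) elastic-search index.
c4_counts = {"abacus": 1_200, "dog": 9_800_000, "theremin": 3_400}

# Order classes from least frequent (long-tail) to most frequent (head),
# so rare concepts appear first in the update stream.
concept_frequency_order = sorted(c4_counts, key=c4_counts.get)
print(concept_frequency_order)  # ['abacus', 'theremin', 'dog']
```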
3. Concept Similarity Ordering (similarity), inspired by Yıldız et al. [205], is based on the hypothesis that training on conceptually similar subsequent tasks minimizes catastrophic forgetting across tasks.
Implementation. To find a trajectory with the highest semantic similarity between subsequent concepts, we start from a similarity matrix containing the pairwise similarities between all class names (via CLIP ViT-L-14 text embeddings of templated text captions of the respective classes). Defining each class as a node in a graph, with edge weights given by the dissimilarity between classes, the problem reduces to finding the minimum spanning path. We use a simple greedy algorithm: pick a starting class, find its closest neighbour among the remaining classes, and repeat until all classes are exhausted. We repeat this procedure for every class as the starting point and pick the path with the smallest total weight across all starting classes.
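A small sketch of this greedy path construction, assuming the pairwise similarity matrix has already been computed from CLIP text embeddings; the toy matrix below is illustrative only:

```python
import numpy as np

def greedy_similarity_path(sim):
    """Greedy nearest-neighbour ordering over a pairwise similarity matrix:
    from each possible starting class, repeatedly hop to the most similar
    unvisited class; return the path with the smallest total dissimilarity
    (1 - similarity) over all starting classes."""
    n = sim.shape[0]
    best_path, best_cost = None, float("inf")
    for start in range(n):
        path, cost = [start], 0.0
        while len(path) < n:
            cur = path[-1]
            # Among unvisited classes, hop to the most similar neighbour.
            step, nxt = min((1.0 - sim[cur, j], j) for j in range(n) if j not in path)
            path.append(nxt)
            cost += step
        if cost < best_cost:  # keep the cheapest path across starting classes
            best_path, best_cost = path, cost
    return best_path

# Toy 4-class similarity matrix standing in for CLIP ViT-L-14 text-embedding
# similarities of templated captions ("a photo of a {class}").
sim = np.array([[1.0, 0.9, 0.2, 0.1],
                [0.9, 1.0, 0.3, 0.2],
                [0.2, 0.3, 1.0, 0.8],
                [0.1, 0.2, 0.8, 1.0]])
print(greedy_similarity_path(sim))  # e.g. [0, 1, 2, 3]
```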
4. Time-Incremental Ordering (time), inspired by [15, 74, 21, 136, 49], arranges concepts in chronological order.
Implementation. As we only have reliable time information at the dataset level (via the release dates of the corresponding publications or the official dataset upload dates), concepts are ordered dataset-wise [15]. These year-level groups are arranged from oldest to most recent, under the assumption that older datasets are more likely to be conceptually integrated within the pretraining data. Within each year, concepts are randomly ordered.
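For concreteness, a minimal sketch of this time-incremental grouping, with hypothetical concept-to-year assignments standing in for the actual dataset release dates:

```python
import random

# Hypothetical concept -> dataset-release-year map; in the benchmark these
# years come from publication or official upload dates of the source datasets.
concept_year = {"goldfish": 2009, "sedan": 2009, "selfie-stick": 2015, "drone": 2017}

rng = random.Random(0)

# Group concepts by year, visit years oldest-first, shuffle within each year.
by_year = {}
for concept, year in concept_year.items():
    by_year.setdefault(year, []).append(concept)

time_order = []
for year in sorted(by_year):
    group = by_year[year]
    rng.shuffle(group)  # concepts within a year are randomly ordered
    time_order.extend(group)
print(time_order)
```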
Alongside the above orderings, we compare against two baseline methods popular in continual learning, to better understand the trade-offs made by these data-centric orderings:
5. Dataset-Incremental Ordering (dataset) is motivated by [149, 112, 113, 191, 207], but extended to a larger sequence of datasets. To set up dataset, we simply sample datasets from Tab. 2 in random order to create a dataset-incremental concept sequence, which is then broken down into the desired number of tasks T.
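A rough sketch of this setup, with toy dataset and class names in place of the actual entries of Tab. 2:

```python
import random

# Toy stand-ins for the benchmark's source datasets and their class lists.
datasets = {
    "cars": ["sedan", "coupe"],
    "birds": ["finch", "jay"],
    "food": ["sushi", "taco"],
}

rng = random.Random(0)
order = list(datasets)
rng.shuffle(order)  # random dataset-level sequence

# Concatenate classes dataset by dataset, then split into T tasks.
sequence = [c for name in order for c in datasets[name]]
T = 3
chunk = -(-len(sequence) // T)  # ceiling division
tasks = [sequence[i * chunk:(i + 1) * chunk] for i in range(T)]
print(tasks)
```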
6. Random Ordering (random), a baseline class-incremental ordering widely used across continual learning setups [150, 201, 71, 137], mimics a scenario where user requests for model improvement are unstructured. For this ordering, we simply shuffle the class names at random.
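And the random baseline in sketch form, with hypothetical class names:

```python
import random

# Hypothetical class names; in the benchmark these span all source datasets.
class_names = ["sedan", "finch", "sushi", "coupe", "jay", "taco"]

rng = random.Random(0)
rng.shuffle(class_names)  # unstructured, class-incremental update stream
print(class_names)
```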



 


 
    