Add mid train dataset generation scripts #80
Open
JewelRoam wants to merge 5 commits into PaddlePaddle:develop from
Conversation
Move the 498-line monolithic bash script logic into tools/triton_kernel_extractor/, a structured Python module with clear separation of concerns (config, sample enumeration, multi-GPU compilation, speedup filtering, kernel extraction, cleanup). The bash entry script is reduced to a thin launcher that sets machine-specific paths and delegates to `python3 -m tools.triton_kernel_extractor`. CLI interface unchanged: `bash extract_triton_kernels.sh <source> [gpu_ids]`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
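A minimal sketch of how the module's entry point might parse the unchanged `<source> [gpu_ids]` CLI described above (function and argument names here are hypothetical, not taken from the PR):

```python
import argparse

def parse_args(argv=None):
    # Mirrors the unchanged launcher interface: <source> [gpu_ids].
    parser = argparse.ArgumentParser(prog="tools.triton_kernel_extractor")
    parser.add_argument("source", help="directory of samples to compile")
    parser.add_argument("gpu_ids", nargs="?", default="0",
                        help="comma-separated GPU ids, e.g. '0,1,2,3'")
    args = parser.parse_args(argv)
    # Expand the comma-separated string into a list for multi-GPU dispatch.
    args.gpu_list = [int(g) for g in args.gpu_ids.split(",")]
    return args
```

With this shape, `bash extract_triton_kernels.sh samples 0,1` would forward `samples 0,1` straight through to the Python module unchanged.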
Extend Step 4 of the extraction pipeline to locate and pair each Triton kernel with its PTX assembly from the inductor cache. When multiple autotuning candidates exist, the winning configuration is identified via the `triton_cache_hash` field of the `.best_config` file. Add package README documenting the full pipeline, PTX resolution algorithm, and output structure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add cache_analyzer.py: replaces analyze_inductor_cache.sh with a Python module that concatenates logs, computes speedup statistics, and generates distribution plots.
- Add 'analyze' subcommand to CLI with backward-compatible implicit 'extract' for the old --source-first invocation style.
- Add --enable-cache-analysis flag to run analysis after extraction pipeline.
- Harden kernel_extractor.py: guard file reads with try/except OSError, deduplicate kernel names across multiple output_code.py files per sample, remove dead KeyError from exception handler.
- Extract shared is_sample_dir() into config.py, remove duplicates from speedup_filter.py and cache_analyzer.py.
- Replace assert with explicit raise ValueError in pipeline.py for -O safety.
- Update README with simplified cache analysis section and CLI arguments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Thread the GraphNet inductor config template through the compilation pipeline: CLI flag → PipelineConfig → base64-encoded --config arg on test_compiler subprocess. The flag is off by default; the bash launcher enables it alongside --enable-cache-analysis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
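The base64 threading step above could be sketched like this; the serialization format and the test_compiler invocation are assumptions for illustration, not the PR's exact code:

```python
import base64
import json

def build_test_compiler_cmd(sample_path, config_template):
    # Base64-encode the config so it passes through argv and shell
    # quoting intact; the subprocess decodes it before applying.
    encoded = base64.b64encode(json.dumps(config_template).encode()).decode()
    # test_compiler entry point shown as a plain script path for illustration.
    return ["python3", "test_compiler.py", sample_path, "--config", encoded]
```

The encoding is lossless by construction: decoding `cmd[-1]` on the subprocess side recovers the exact template the CLI flag supplied.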
Generation pipeline
Usage:
# bash expand_graph_paths.sh