-
Notifications
You must be signed in to change notification settings - Fork 0
feat: search, load and validates manifests #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 52 commits
60320ba
c67c661
22ae256
6cb437f
00825de
442bcc6
e19f6f1
b6ecc27
1406b18
8aba035
70b2c5e
c8b7a39
1181871
31f1d3d
df49dfc
f4169b9
aade454
e0bd4a4
2e572fb
3c5de1c
7eff973
b7c7c66
c7196e9
7539db7
cd22c1f
27d8645
373f3c1
1414990
2ce5b21
17162f4
c21ef34
77f5771
40ef8e8
0a25844
71c4496
4696dcc
089109d
09faa6a
bd7a707
3ea70ce
3efb7d2
a421818
e88de2d
acac421
0262cb2
7445fcd
5fca0c5
fb6500b
d7fa155
43e8a71
46d2119
164c68b
15550ad
4ecb555
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -11,6 +11,7 @@ import ( | |
| "github.com/canonical/chisel/internal/cache" | ||
| "github.com/canonical/chisel/internal/setup" | ||
| "github.com/canonical/chisel/internal/slicer" | ||
| "github.com/canonical/chisel/public/manifest" | ||
| ) | ||
|
|
||
| var shortCutHelp = "Cut a tree with selected slices" | ||
|
|
@@ -73,6 +74,21 @@ func (cmd *cmdCut) Execute(args []string) error { | |
| } | ||
| } | ||
|
|
||
| mfest, err := slicer.Inspect(cmd.RootDir, release) | ||
| if err != nil { | ||
| return err | ||
| } | ||
| if mfest != nil { | ||
| mfest.IterateSlices("", func(slice *manifest.Slice) error { | ||
| sk, err := setup.ParseSliceKey(slice.Name) | ||
| if err != nil { | ||
| return err | ||
|
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [Note to reviewer]: At this point a valid manifest was extracted from the root dir but the dir is invalid (or we failed to validate it) so we return the error. |
||
| } | ||
| sliceKeys = append(sliceKeys, sk) | ||
| return nil | ||
| }) | ||
| } | ||
|
|
||
| selection, err := setup.Select(release, sliceKeys, cmd.Arch) | ||
| if err != nil { | ||
| return err | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -4,6 +4,7 @@ import ( | |
| "fmt" | ||
| "io" | ||
| "io/fs" | ||
| "maps" | ||
| "path/filepath" | ||
| "slices" | ||
| "sort" | ||
|
|
@@ -34,6 +35,19 @@ func FindPaths(slices []*setup.Slice) map[string][]*setup.Slice { | |
| return manifestSlices | ||
| } | ||
|
|
||
| // FindPathsInRelease finds all the paths marked with "generate:manifest" | ||
| // for the given release. | ||
| func FindPathsInRelease(r *setup.Release) []string { | ||
| allSlices := []*setup.Slice{} | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why not use the
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you elaborate? Do you mean use |
||
| for _, pkg := range r.Packages { | ||
| for _, slice := range pkg.Slices { | ||
| allSlices = append(allSlices, slice) | ||
| } | ||
| } | ||
| manifestMap := FindPaths(allSlices) | ||
|
upils marked this conversation as resolved.
Outdated
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a bit unfortunate that we have to iterate over all slices to collect them just to iterate over them again. Because the real business logic takes a path + info and returns the manifest path by validating then appending a filename, what about extracting only that bit into an auxiliary function instead? Then have both functions call the auxiliary one. Thoughts?
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. After implementing a naive version of
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Reworked in 4ecb555 |
||
| return slices.Sorted(maps.Keys(manifestMap)) | ||
| } | ||
|
|
||
| type WriteOptions struct { | ||
| PackageInfo []*archive.PackageInfo | ||
| Selection []*setup.Slice | ||
|
|
@@ -340,3 +354,12 @@ func Validate(mfest *manifest.Manifest) (err error) { | |
| } | ||
| return nil | ||
| } | ||
|
|
||
| // CompareSchemas compare two manifest schema strings | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How does it compare them? What is the return value? This looks a bit artificial at the moment, it is saying:
Which is a little bit weird for a comparator, right? Normally, either -1, 0 and 1 are returned based on a total order or
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is a little weird on purpose. My goal was to have a signature consistent with other comparison functions in the project and avoid changing it when actually supporting multiple schema versions. The end goal is to provide a total order, but since the schema format is not yet formally defined, I did not want to implement a complex comparison making assumptions (that could be wrong in the future). When we actually support multiple schema, the format will be properly defined and we should naturally change this implementation because it will look obviously wrong. For now it indeed feels a bit artificial but that may be a good thing(?). |
||
| func CompareSchemas(va, vb string) int { | ||
| if va == manifest.Schema && va == vb { | ||
| return 0 | ||
| } | ||
| return -1 | ||
| } | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,186 @@ | ||
| package slicer | ||
|
|
||
| import ( | ||
| "crypto/sha256" | ||
| "encoding/hex" | ||
| "fmt" | ||
| "io" | ||
| "io/fs" | ||
| "os" | ||
| "path/filepath" | ||
| "syscall" | ||
|
|
||
| "github.com/klauspost/compress/zstd" | ||
|
|
||
| "github.com/canonical/chisel/internal/manifestutil" | ||
| "github.com/canonical/chisel/public/manifest" | ||
| ) | ||
|
|
||
| type pathInfo struct { | ||
| mode string | ||
| size int64 | ||
| link string | ||
| hash string | ||
| } | ||
|
|
||
| func unixPerm(mode fs.FileMode) (perm uint32) { | ||
| perm = uint32(mode.Perm()) | ||
| if mode&fs.ModeSticky != 0 { | ||
| perm |= 0o1000 | ||
| } | ||
| return perm | ||
| } | ||
|
|
||
| // checkDir checks the content of the target directory matches with | ||
| // the manifest. | ||
| // This function works under the assumption the manifest is valid. | ||
| // Files not managed by chisel are ignored. | ||
| func checkDir(mfest *manifest.Manifest, rootDir string) error { | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think I am still not happy with having it here, but let's postpone that discussion. It is not blocking and it is not major (for now if we don't introduce a chisel validate).
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Good point, and this is in line with the
I agree. |
||
| singlePathsByFSInode := make(map[uint64]string) | ||
| fsInodeByManifestInode := make(map[uint64]uint64) | ||
| manifestInfos := make(map[string]*pathInfo) | ||
| err := mfest.IteratePaths("", func(path *manifest.Path) error { | ||
| pathHash := path.FinalSHA256 | ||
| if pathHash == "" { | ||
| pathHash = path.SHA256 | ||
| } | ||
| recordedPathInfo := &pathInfo{ | ||
| mode: path.Mode, | ||
| size: int64(path.Size), | ||
| link: path.Link, | ||
| hash: pathHash, | ||
| } | ||
|
|
||
| fsInfo := &pathInfo{} | ||
| fullPath := filepath.Join(rootDir, path.Path) | ||
| info, err := os.Lstat(fullPath) | ||
| if err != nil { | ||
| return err | ||
| } | ||
| mode := info.Mode() | ||
|
upils marked this conversation as resolved.
|
||
| fsInfo.mode = fmt.Sprintf("0%o", unixPerm(mode)) | ||
| ftype := mode & fs.ModeType | ||
| switch ftype { | ||
| case fs.ModeDir: | ||
| // Nothing to do | ||
| case fs.ModeSymlink: | ||
| fsInfo.link, err = os.Readlink(fullPath) | ||
| if err != nil { | ||
| return fmt.Errorf("cannot read symlink %q: %w", fullPath, err) | ||
| } | ||
| case 0: // Regular | ||
| h, err := contentHash(fullPath) | ||
| if err != nil { | ||
| return fmt.Errorf("cannot compute hash for %q: %w", fullPath, err) | ||
| } | ||
| fsInfo.hash = hex.EncodeToString(h) | ||
| fsInfo.size = info.Size() | ||
| default: | ||
| return fmt.Errorf("inconsistent content: %q has unrecognized type %s", fullPath, mode.String()) | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would remove the prefix
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See 15550ad |
||
| } | ||
|
|
||
| // Collect manifests for tailored checking later. | ||
| // Adjust observed hash and size to still compare in a generic way. | ||
| if filepath.Base(path.Path) == manifestutil.DefaultFilename && recordedPathInfo.size == 0 && recordedPathInfo.hash == "" { | ||
| mfestInfo := *fsInfo | ||
| manifestInfos[path.Path] = &mfestInfo | ||
| fsInfo.size = 0 | ||
| fsInfo.hash = "" | ||
| } | ||
|
|
||
| if recordedPathInfo.mode != fsInfo.mode { | ||
| return fmt.Errorf("inconsistent mode at %q: recorded %v, observed %v", path.Path, recordedPathInfo.mode, fsInfo.mode) | ||
| } | ||
| if recordedPathInfo.size != fsInfo.size { | ||
| return fmt.Errorf("inconsistent size at %q: recorded %v, observed %v", path.Path, recordedPathInfo.size, fsInfo.size) | ||
| } | ||
| if recordedPathInfo.link != fsInfo.link { | ||
| return fmt.Errorf("inconsistent link at %q: recorded %v, observed %v", path.Path, recordedPathInfo.link, fsInfo.link) | ||
| } | ||
| if recordedPathInfo.hash != fsInfo.hash { | ||
| return fmt.Errorf("inconsistent hash at %q: recorded %v, observed %v", path.Path, recordedPathInfo.hash, fsInfo.hash) | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Normally we use
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I used these terms at first but they felt opinionated. Since the manifest is found in the rootfs, it could be "wrong" too. But it depends on the abstraction we want to present to the user. I think it would be useful for them to be aware that the inspection relies on the manifest so they can more easily solve the problem if the inspection fails. If we decide that the usage of the manifest is in an implementation detail then I agree that WDYT? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
It is NOT an implementation detail. The fact we use the manifest has to be transparent IMO, how else would we do it. What can the user expect if not using the manifest |
||
| } | ||
| // Check hardlink | ||
| if ftype != fs.ModeDir { | ||
| stat, ok := info.Sys().(*syscall.Stat_t) | ||
| if !ok { | ||
| return fmt.Errorf("cannot get syscall stat info for %q", info.Name()) | ||
| } | ||
| inode := stat.Ino | ||
|
|
||
| if path.Inode == 0 { | ||
| // This path must not be linked to any other | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Make sure all comments are punctuated.
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See 15550ad |
||
| singlePath, ok := singlePathsByFSInode[inode] | ||
| if ok { | ||
| return fmt.Errorf("inconsistent content at %q: file hardlinked to %q", path.Path, singlePath) | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This error message is a bit cryptic, as you say above normally we prefer the structure of
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree. I still would like to display the other path already hardlinked to the inode to help the user solve the problem. See 15550ad |
||
| } | ||
| singlePathsByFSInode[inode] = path.Path | ||
| } else { | ||
| recordedInode, ok := fsInodeByManifestInode[path.Inode] | ||
| if !ok { | ||
| fsInodeByManifestInode[path.Inode] = inode | ||
| } else if recordedInode != inode { | ||
| return fmt.Errorf("inconsistent content at %q: file hardlinked to a different inode", path.Path) | ||
| } | ||
| } | ||
| } | ||
| return nil | ||
| }) | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| // Check manifests | ||
| // They must all be valid manifests. | ||
| // They must be consistent per schema version. | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see that you like this comment style :) but to be consistent with the rest of the code it is better to group sentences into paragraphs unless they are unrelated or need a paragraph break (which IMO is not the case here). This suggestion applies to all comments which follow the same style.
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See 15550ad for improved comments. |
||
| schemaManifestInfos := make(map[string]*pathInfo) | ||
| for path, info := range manifestInfos { | ||
| fullPath := filepath.Join(rootDir, path) | ||
| f, err := os.Open(fullPath) | ||
| if err != nil { | ||
| return err | ||
| } | ||
| defer f.Close() | ||
| r, err := zstd.NewReader(f) | ||
| if err != nil { | ||
| return err | ||
| } | ||
| defer r.Close() | ||
| mfest, err = manifest.Read(r) | ||
| if err != nil { | ||
| return err | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See my other comment, I think we need a named error here to know if we should skip. If we find a valid format then |
||
| } | ||
| err = manifestutil.Validate(mfest) | ||
| if err != nil { | ||
| return err | ||
| } | ||
| schema := mfest.Schema() | ||
| refInfo, ok := schemaManifestInfos[schema] | ||
| if !ok { | ||
| schemaManifestInfos[schema] = info | ||
| continue | ||
| } | ||
|
|
||
| if refInfo.size != info.size { | ||
| return fmt.Errorf("inconsistent manifest size for version %s at %q: recorded %v, observed %v", schema, path, refInfo.size, info.size) | ||
| } | ||
| if refInfo.hash != info.hash { | ||
| return fmt.Errorf("inconsistent manifest hash for version %s at %q: recorded %v, observed %v", schema, path, refInfo.hash, info.hash) | ||
| } | ||
|
upils marked this conversation as resolved.
|
||
| } | ||
| return nil | ||
| } | ||
|
|
||
| func contentHash(path string) ([]byte, error) { | ||
| f, err := os.Open(path) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| defer f.Close() | ||
|
|
||
| h := sha256.New() | ||
| if _, err := io.Copy(h, f); err != nil { | ||
| return nil, err | ||
| } | ||
| return h.Sum(nil), nil | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -3,10 +3,12 @@ package slicer | |
| import ( | ||
| "archive/tar" | ||
| "bytes" | ||
| "encoding/hex" | ||
| "fmt" | ||
| "io" | ||
| "io/fs" | ||
| "os" | ||
| "path" | ||
| "path/filepath" | ||
| "slices" | ||
| "sort" | ||
|
|
@@ -21,6 +23,7 @@ import ( | |
| "github.com/canonical/chisel/internal/manifestutil" | ||
| "github.com/canonical/chisel/internal/scripts" | ||
| "github.com/canonical/chisel/internal/setup" | ||
| "github.com/canonical/chisel/public/manifest" | ||
| ) | ||
|
|
||
| const manifestMode fs.FileMode = 0644 | ||
|
|
@@ -537,3 +540,93 @@ func selectPkgArchives(archives map[string]archive.Archive, selection *setup.Sel | |
| } | ||
| return pkgArchive, nil | ||
| } | ||
|
|
||
| // Inspect examines and validates the targetDir. | ||
| // Return the list of SliceKeys used to build the targetDir. | ||
| func Inspect(targetDir string, release *setup.Release) (*manifest.Manifest, error) { | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this name and comment is too generic, let's think of something better. What about One problem I see with this grouping is that when I get back
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
The "cut" concept is an external abstraction, unknown to the slicer. I was looking for a name at the same level of abstraction than
I agree that |
||
| var mfest *manifest.Manifest | ||
| manifestPaths := manifestutil.FindPathsInRelease(release) | ||
| if len(manifestPaths) > 0 { | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Prefer |
||
| logf("Inspecting root directory...") | ||
| var err error | ||
| mfest, err = selectValidManifest(targetDir, manifestPaths) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| if mfest != nil { | ||
| err = checkDir(mfest, targetDir) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| } | ||
| } | ||
| return mfest, nil | ||
| } | ||
|
|
||
| // selectValidManifest the first valid manifest found in a directory with the | ||
| // latest schema version. | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Improved in 15550ad. Let me know what you think. |
||
| // A manifest is considered valid if it is consistent with others present and | ||
| // of the same schema. | ||
| func selectValidManifest(targetDir string, manifestPaths []string) (*manifest.Manifest, error) { | ||
| targetDir = filepath.Clean(targetDir) | ||
| if !filepath.IsAbs(targetDir) { | ||
| dir, err := os.Getwd() | ||
| if err != nil { | ||
| return nil, fmt.Errorf("cannot obtain current directory: %w", err) | ||
| } | ||
| targetDir = filepath.Join(dir, targetDir) | ||
| } | ||
|
Comment on lines
+570
to
+577
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You are doing this every time for an internal function. If the function is internal then you can expect an absolute path and you can document it or check with |
||
|
|
||
| type manifestHash struct { | ||
| path string | ||
| hash string | ||
| } | ||
| var selected *manifest.Manifest | ||
| schemaManifest := make(map[string]manifestHash) | ||
| for _, mfestPath := range manifestPaths { | ||
| err := func() error { | ||
| mfestFullPath := path.Join(targetDir, mfestPath) | ||
| f, err := os.Open(mfestFullPath) | ||
| if err != nil { | ||
| if os.IsNotExist(err) { | ||
| return nil | ||
| } | ||
| return err | ||
| } | ||
| defer f.Close() | ||
| r, err := zstd.NewReader(f) | ||
| if err != nil { | ||
| return err | ||
| } | ||
| defer r.Close() | ||
| mfest, err := manifest.Read(r) | ||
| if err != nil { | ||
| return nil | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See my other other comment. Here it is even more important. And |
||
| } | ||
| err = manifestutil.Validate(mfest) | ||
| if err != nil { | ||
| return nil | ||
| } | ||
|
|
||
| if selected == nil || manifestutil.CompareSchemas(mfest.Schema(), selected.Schema()) > 0 { | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Compare is never > 0. In my opinion, in terms of logic the code should check all manifests of the same version have to be equal. You don't have to use the content has because you have both manifests in memory and you need to return selected anyways, so you can check whether they match or not (in a separate function please). |
||
| h, err := contentHash(mfestFullPath) | ||
| if err != nil { | ||
| return fmt.Errorf("cannot compute hash for %q: %w", mfestFullPath, err) | ||
| } | ||
| mfestHash := hex.EncodeToString(h) | ||
| refMfest, ok := schemaManifest[mfest.Schema()] | ||
| if !ok { | ||
| schemaManifest[mfest.Schema()] = manifestHash{mfestPath, mfestHash} | ||
| } else if refMfest.hash != mfestHash { | ||
| return fmt.Errorf("inconsistent manifests: %q and %q", refMfest.path, mfestPath) | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This will likely change, but this error message doesn't state that the contents of the manifests should be equal and they are not. |
||
| } | ||
| selected = mfest | ||
| } | ||
| return nil | ||
| }() | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| } | ||
| return selected, nil | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -106,3 +106,7 @@ func iteratePrefix[T prefixable](manifest *Manifest, prefix *T, onMatch func(*T) | |
| } | ||
| return nil | ||
| } | ||
|
|
||
| func (manifest *Manifest) Schema() string { | ||
|
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [Note to reviewer]: This is needed to compare manifests by schema versions but exposing it while the format of this value is not precisely defined might cause issues in the future. Should we hold on exposing and for now internally rely on the assumption that all valid manifests are necessarily using the |
||
| return manifest.db.Schema() | ||
| } | ||
|
upils marked this conversation as resolved.
Outdated
|
||
Uh oh!
There was an error while loading. Please reload this page.