SampleSelection¶
Selects samples from a training vector data set.
Description¶
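The application selects a set of samples from geometries intended for training (they should have a field giving the associated class).
First, the geometries must be analyzed by the PolygonClassStatistics application, which computes statistics about the geometries and summarizes them in an XML file. This XML file must then be given as input to this application (parameter instats).
The input support image and the input training vectors shall be given in parameters 'in' and 'vec' respectively. Only the sampling grid (origin, size, spacing) will be read from the input image. There are several strategies to select samples (parameter strategy):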
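- smallest (default): select the same number of samples in each class so that the smallest class is fully sampled.
- constant: select the same number of samples N in each class (with N less than or equal to the size of the smallest class).
- byclass: set the required number for each class manually, with an input CSV file (first column is the class name, second column is the required number of samples).
- percent: set a target global percentage of samples to use. Class proportions will be respected.
- total: set a target total number of samples to use. Class proportions will be respected.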
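The arithmetic behind the strategies listed above can be sketched in a few lines of Python. This is only an illustration of the description, not the code used by the application: the function name `samples_per_class`, the rounding (truncation toward zero), the interpretation of the percentage as a fraction between 0 and 1, and the clamping of byclass requests to the available count are all assumptions.

```python
def samples_per_class(counts, strategy="smallest", nb=None, percent=0.5,
                      total=1000, required=None):
    """Return the number of samples to draw from each class.

    counts maps a class name to the number of samples available for it
    (the kind of figure reported in the statistics XML file).
    """
    smallest = min(counts.values())
    if strategy == "smallest":
        # Every class gets as many samples as the smallest class holds.
        return {c: smallest for c in counts}
    if strategy == "constant":
        if nb is None or nb > smallest:
            raise ValueError("nb must be set and <= size of the smallest class")
        return {c: nb for c in counts}
    if strategy == "byclass":
        # required maps class name -> requested count (as read from the CSV).
        # Assumption: a request larger than the class is clamped to its size.
        return {c: min(required[c], n) for c, n in counts.items()}
    if strategy == "percent":
        # Assumption: percent is a fraction in [0, 1], truncated per class.
        return {c: int(n * percent) for c, n in counts.items()}
    if strategy == "total":
        # Split the requested total according to class proportions.
        available = sum(counts.values())
        return {c: int(total * n / available) for c, n in counts.items()}
    raise ValueError(f"unknown strategy: {strategy}")


counts = {"water": 200, "forest": 1000, "urban": 800}
print(samples_per_class(counts, "smallest"))
print(samples_per_class(counts, "total", total=500))
```

With the percent and total strategies, larger classes contribute more samples, which is what "class proportions will be respected" means here.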
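There is also a choice of the sampling type to perform:
- periodic: select samples uniformly distributed
- random: select samples randomly distributed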
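The difference between the two sampling types can be illustrated on a plain list of candidate positions. The sketch below is an assumption about how such samplers might look, not the application's implementation; `periodic_indices` and `random_indices` are hypothetical helper names.

```python
import random


def periodic_indices(n_available, n_required):
    """Pick n_required indices spread evenly over range(n_available)."""
    if not 0 <= n_required <= n_available:
        raise ValueError("n_required must be between 0 and n_available")
    if n_required == 0:
        return []
    step = n_available / n_required  # step >= 1, so indices are distinct
    return [int(i * step) for i in range(n_required)]


def random_indices(n_available, n_required, seed=None):
    """Shuffle all candidate indices and keep the first n_required."""
    if not 0 <= n_required <= n_available:
        raise ValueError("n_required must be between 0 and n_available")
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    indices = list(range(n_available))
    rng.shuffle(indices)
    return sorted(indices[:n_required])


print(periodic_indices(10, 5))
print(random_indices(10, 5, seed=42))
```

Passing a fixed seed plays the same role as the application's rand parameter: the random selection can be reproduced from one run to the next.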
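Once the strategy and type are selected, the application outputs sample positions (parameter out).
The other parameters to consider are:
- layer: index specifying from which layer to pick geometries.
- field: set the field name containing the class.
- mask: an optional raster mask can be used to discard samples.
- outrates: allows outputting a CSV file that summarizes the sampling rates for each class.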
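As with the PolygonClassStatistics application, different types of geometry are supported: polygons, lines, points. The behavior of this application is different for each type of geometry:
- polygon: select points whose center is inside the polygon
- lines: select points intersecting the line
- points: select the closest point to the provided point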
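To make the polygon case concrete, the sketch below lists the grid pixels whose center falls inside a polygon, using a standard ray-casting test. It is an illustration only and not the application's code: the helper names are hypothetical, and it assumes that the grid origin is the center of pixel (0, 0) and that the polygon is a simple ring of (x, y) vertices in the same coordinate system as the grid.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices; the ring is closed implicitly.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def candidate_pixels(origin, spacing, size, polygon):
    """List (col, row) of grid pixels whose center is inside the polygon."""
    ox, oy = origin
    sx, sy = spacing
    cols, rows = size
    selected = []
    for row in range(rows):
        for col in range(cols):
            # Assumption: origin is the center of pixel (0, 0).
            cx, cy = ox + col * sx, oy + row * sy
            if point_in_polygon(cx, cy, polygon):
                selected.append((col, row))
    return selected


square = [(0.5, 0.5), (2.5, 0.5), (2.5, 2.5), (0.5, 2.5)]
print(candidate_pixels((0.0, 0.0), (1.0, 1.0), (4, 4), square))
```

Only the origin, spacing and size of the grid are needed, which matches the statement that only the sampling grid is read from the input image.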
Parameters¶
InputImage -in image
Mandatory
Support image that will be classified
InputMask -mask image
Validity mask (only pixels corresponding to a mask value greater than 0 will be used for statistics)
Input vectors -vec vectorfile
Mandatory
Input geometries to analyse
Output vectors -out filename [dtype]
Mandatory
Output resampled geometries
Input Statistics -instats filename [dtype]
Mandatory
Input file storing statistics (XML format)
Output rates -outrates filename [dtype]
Output rates (CSV formatted)
Sampler type -sampler [periodic|random]
Default value: periodic
Type of sampling (periodic or random)
- Periodic sampler
Takes samples regularly spaced
- Random sampler
The positions to select are randomly shuffled.
Periodic sampler options¶
Jitter amplitude -sampler.periodic.jitter int
Default value: 0
Jitter amplitude added during sample selection (0 = no jitter)
Sampling strategy -strategy [byclass|constant|percent|total|smallest|all]
Default value: smallest
- byclass: Set samples count for each class
- constant: Set the same samples counts for all classes
- percent: Use a percentage of the samples available for each class
- total: Set the total number of samples to generate, and use class proportions.
- smallest: Set the same number of samples for all classes, with the smallest class fully sampled
- all: Use all samples
Set samples count for each class options¶
Number of samples by class -strategy.byclass.in filename [dtype]
Mandatory
Number of samples by class (CSV format with class name in the 1st column and required samples in the 2nd).
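A file in this two-column layout can be produced with Python's standard csv module. The sketch below is a minimal example under stated assumptions: the helper name `write_required_samples`, the file name and the class labels are hypothetical, and the comma delimiter and absence of a header row are assumptions that should be checked against what the application accepts.

```python
import csv


def write_required_samples(path, required):
    """Write one row per class: class name, then required sample count."""
    with open(path, "w", newline="") as f:
        # Assumption: comma-separated values, no header row.
        writer = csv.writer(f)
        for class_name, count in required.items():
            writer.writerow([class_name, int(count)])


# Hypothetical class labels and counts, for illustration only.
write_required_samples("required_samples.csv", {"1": 300, "2": 150})
```

The resulting file would then be passed through the strategy.byclass.in parameter together with the byclass strategy.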
Set the same samples counts for all classes options¶
Number of samples for all classes -strategy.constant.nb int
Mandatory
Number of samples for all classes
Use a percentage of the samples available for each class options¶
The percentage to use -strategy.percent.p float
Default value: 0.5
The percentage to use
Set the total number of samples to generate, and use class proportions options¶
The number of samples to generate -strategy.total.v int
Default value: 1000
The number of samples to generate
Field Name -field string
Name of the field carrying the class name in the input vectors.
Layer Index -layer int
Default value: 0
Layer index to read in the input vector file.
Elevation management¶
This group of parameters allows managing elevation values.
DEM directory -elev.dem directory
This parameter allows selecting a directory containing Digital Elevation Model files. Note that this directory should contain only DEM files. Unexpected behaviour might occur if other images are found in this directory. Input DEM tiles should be in a raster format supported by GDAL.
Geoid File -elev.geoid filename [dtype]
Use a geoid grid to get the height above the ellipsoid in case there is no DEM available, no coverage for some points or pixels with no_data in the DEM tiles. A version of the geoid can be found on the OTB website (egm96.grd and egm96.grd.hdr at https://gitlab.orfeo-toolbox.org/orfeotoolbox/otb/-/tree/master/Data/Input/DEM).
Default elevation -elev.default float
Default value: 0
This parameter allows setting the default height above ellipsoid when there is no DEM available, no coverage for some points or pixels with no_data in the DEM tiles, and no geoid file has been set. This is also used by some applications as an average elevation value.
Random seed -rand int
Set a specific random seed with integer value.
Available RAM (MB) -ram int
Default value: 256
Available memory for processing (in MB).
Examples¶
From the command-line:
otbcli_SampleSelection -in support_image.tif -vec variousVectors.sqlite -field label -instats apTvClPolygonClassStatisticsOut.xml -out resampledVectors.sqlite
From Python:
import otbApplication
app = otbApplication.Registry.CreateApplication("SampleSelection")
app.SetParameterString("in", "support_image.tif")
app.SetParameterString("vec", "variousVectors.sqlite")
app.SetParameterString("field", "label")
app.SetParameterString("instats", "apTvClPolygonClassStatisticsOut.xml")
app.SetParameterString("out", "resampledVectors.sqlite")
app.ExecuteAndWriteOutput()
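The examples above rely on the default strategy and sampler. As a variation, the sketch below assembles a command line that selects a constant number of samples per class with the random sampler and a fixed seed. It only builds the argument list from the parameter keys documented on this page and does not run the application; the helper name `sample_selection_command`, the value 500 and the file name rates.csv are hypothetical.

```python
import shlex


def sample_selection_command(image, vectors, field, stats, output,
                             strategy="smallest", sampler="periodic",
                             constant_nb=None, seed=None, outrates=None):
    """Build the otbcli_SampleSelection argument list (nothing is executed)."""
    args = ["otbcli_SampleSelection",
            "-in", image, "-vec", vectors, "-field", field,
            "-instats", stats, "-out", output,
            "-strategy", strategy, "-sampler", sampler]
    if strategy == "constant":
        if constant_nb is None:
            raise ValueError("the constant strategy needs constant_nb")
        args += ["-strategy.constant.nb", str(constant_nb)]
    if seed is not None:
        args += ["-rand", str(seed)]  # fixed seed for a repeatable selection
    if outrates is not None:
        args += ["-outrates", outrates]  # CSV summary of sampling rates
    return args


cmd = sample_selection_command(
    "support_image.tif", "variousVectors.sqlite", "label",
    "apTvClPolygonClassStatisticsOut.xml", "resampledVectors.sqlite",
    strategy="constant", sampler="random", constant_nb=500,
    seed=42, outrates="rates.csv")
print(shlex.join(cmd))  # shlex.join requires Python 3.8 or later
```

The list returned by the helper could be handed to subprocess.run on a machine where the OTB command-line tools are installed.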