r - How to programmatically back out, deduce, decompile, or reverse-engineer the algorithm used to construct a variable in a data set


I'm looking for an algorithm, program, or function that can deduce how a variable was created, as long as I supply the other variables. I think computer programmers would call this "decompiling" and architects would call it "reverse-engineering", but I guess I don't know what statisticians would call it, or whether there are accepted methods for it.

Let's say I've got a categorical column in a data.frame called newvar, and I don't know how it was constructed. I do know the variables that were used to create it, or at least I can provide an exhaustive set of variables that could have been used to create it, even if not all of them were used.

    # start with an example data set
    x <- mtcars

    # # # # # # # # # # # # # # # # # # # # # # # # #
    # pretend this block of code is a black box
    x <-
        transform(
            x ,
            newvar =
                ifelse( mpg > 24 , 1 ,
                ifelse( cyl == 6 , 9 ,
                ifelse( hp > 120 , 4 ,
                ifelse( mpg > 22 , 7 , 2 ) ) ) )
        )
    # end of unknown block of code
    # # # # # # # # # # # # # # # # # # # # # # # #

    # knowing that `mtcars` has only 11 columns to choose from
    names( x )

    # how were these 11 columns used to construct `newvar`?
    table( x$newvar )

    # here's a start..
    y <- data.frame( ftable( x[ , c( 'mpg' , 'cyl' , 'hp' , 'newvar' ) ] ) )
    # ..combinations that have any records
    y[ y[ , 5 ] != 0 , ]
    # but that's not enough to back out the construction

So I think I could work out the construction of newvar with linear regression or decision trees, but those would still require a bit of thinking and piecing together of coefficients to figure out what happened inside the black box.

Is there any algorithm available that guesses at the black box, so to speak? Thanks!!

In general, no. And even when applying a lot of knowledge about what is going on, still (probably) no. Let me show this with your example. Adding the knowledge that the "black box" output consists of discrete values and that it is derived from thresholds on other values, a classification tree should be able to recover the criteria. So:

    library("party")
    tmp <- ctree(factor(newvar) ~ ., data=x,
       controls=ctree_control(mincriterion=0, minsplit=2, minbucket=1))

I've set the control values to unreasonable values to force the algorithm to drive down until each bucket contains only a single value. And the result is not what you started with:

[Image not preserved: plot of the fitted tree]
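The same comparison can also be made in text rather than from a plot. The following is only a minimal sketch, assuming the `party` package is installed and that `x` is rebuilt exactly as in the question. The names `tree_fit`, `fitted_class` and `agreement` are made up for this illustration.

```r
library("party")

# rebuild the example data, including the "black box" column from the question
x <- transform(
    mtcars,
    newvar =
        ifelse(mpg > 24, 1,
        ifelse(cyl == 6, 9,
        ifelse(hp > 120, 4,
        ifelse(mpg > 22, 7, 2))))
)

# same call as above, with the deliberately extreme control settings
tree_fit <- ctree(factor(newvar) ~ ., data = x,
    controls = ctree_control(mincriterion = 0, minsplit = 2, minbucket = 1))

# text version of the tree: the split variables and cutoffs it chose
print(tree_fit)

# cross-tabulate the fitted classes against the true values of newvar
fitted_class <- predict(tree_fit)
agreement <- table(fitted = fitted_class, actual = x$newvar)
agreement
```

Even where the table shows the fitted classes agreeing with `newvar`, the splits shown by `print(tree_fit)` need not be the ones used inside the black box.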

So even for this simple example, and even after adding more knowledge about the transformation, it cannot be done, so there is not much hope of being able to do it in the general case.
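One way to see why this is hard, sketched under my own assumptions: more than one rule can produce exactly the same column on a given data set, so the data alone cannot tell them apart. The `alternative` rule below is hypothetical and written only for this illustration. It uses different cutoffs and swaps `hp` for `disp`.

```r
# the example data and the original "black box" rule from the question
x <- mtcars
original <- with(x,
    ifelse(mpg > 24, 1,
    ifelse(cyl == 6, 9,
    ifelse(hp > 120, 4,
    ifelse(mpg > 22, 7, 2)))))

# a different, hypothetical rule: other cutoffs, and `disp` instead of `hp`
alternative <- with(x,
    ifelse(mpg > 23, 1,
    ifelse(cyl == 6, 9,
    ifelse(disp > 200, 4,
    ifelse(mpg > 21.9, 7, 2)))))

# do the two rules agree on every row of this data set?
same_output <- all(original == alternative)
same_output

# cross-tabulation of the two results, for a closer look
table(original, alternative)
```

On the stock `mtcars` data the two rules agree on all 32 rows, so any algorithm that only sees the data has no basis for preferring one over the other.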

